% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/conformal_infer.R
\name{int_conformal_full}
\alias{int_conformal_full}
\alias{int_conformal_full.default}
\alias{int_conformal_full.workflow}
\title{Prediction intervals via conformal inference}
\usage{
int_conformal_full(object, ...)

\method{int_conformal_full}{default}(object, ...)

\method{int_conformal_full}{workflow}(object, train_data, ..., control = control_conformal_full())
}
\arguments{
\item{object}{A fitted \code{\link[workflows:workflow]{workflows::workflow()}} object.}

\item{...}{Not currently used.}

\item{train_data}{A data frame with the \emph{original training data} used to
create the fitted workflow (predictors and outcomes). If the workflow does
not contain these values, pass them here. If the workflow used a recipe, this
should be the data that were inputs to the recipe (and not the product of a
recipe).}

\item{control}{A control object from \code{\link[=control_conformal_full]{control_conformal_full()}} with the
numeric minutiae.}
}
\value{
An object of class \code{"int_conformal_full"} containing the information
to create intervals (which includes the training set data). The \code{predict()}
method is used to produce the intervals.
}
\description{
Nonparametric prediction intervals can be computed for fitted workflow
objects using the conformal inference method described by Lei \emph{et al} (2018).
}
\details{
This function implements what is usually called "full conformal inference"
(see Algorithm 1 in Lei \emph{et al} (2018)) since it uses the entire training
set to compute the intervals.

This function prepares the objects for the computations. The \code{\link[=predict]{predict()}}
method computes the intervals for new data.

For a given \code{new_data} observation, the predictors are appended to the original
training set. Then, different "trial" values of the outcome are substituted
in for that observation's outcome and the model is re-fit. From each model,
the residual associated with the trial value is compared to a quantile of the
distribution of the other residuals. Usually the absolute values of the
residuals are used. The trial value at which the residual first exceeds the
distributional quantile is taken as one of the interval bounds.
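
In symbols, one way to write this check is sketched below. It follows
Algorithm 1 in Lei \emph{et al} (2018); the notation is introduced here for
illustration only and is not taken from the code. Let \eqn{(x_i, y_i)},
\eqn{i = 1, \ldots, n}, be the training set, \eqn{x_{n+1}} the new predictors,
and \eqn{y} a trial value. With \eqn{\hat{\mu}_y} denoting the model re-fit on
the augmented data, the absolute residuals are

\deqn{R_{y,i} = |y_i - \hat{\mu}_y(x_i)|, \quad i = 1, \ldots, n, \qquad R_{y,n+1} = |y - \hat{\mu}_y(x_{n+1})|}{R[y,i] = |y[i] - mu_y(x[i])| for i = 1..n, and R[y,n+1] = |y - mu_y(x[n+1])|}

and, for a nominal coverage level of \eqn{1 - \alpha}, the trial value is
retained when

\deqn{\sum_{i=1}^{n+1} \mathbf{1}\left[R_{y,i} \le R_{y,n+1}\right] \le \lceil (1 - \alpha)(n + 1) \rceil}{sum_i I(R[y,i] <= R[y,n+1]) <= ceiling((1 - alpha) * (n + 1))}

The interval bounds are then taken to be the smallest and largest retained
trial values.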

The literature proposes using a grid search of trial values to find the two
points that correspond to the bounds of the prediction interval. To use this approach,
set \code{method = "grid"} in \code{\link[=control_conformal_full]{control_conformal_full()}}. However, the default method
\verb{"search"} uses two different one-dimensional iterative searches, one on each
side of the predicted value, to find the values that correspond to the interval bounds.
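
As a usage sketch (not taken from the package's own examples), the two
approaches could be selected as shown below. The example assumes that the
\pkg{parsnip} and \pkg{workflows} packages are installed and that
\pkg{probably} is the package providing these functions. The linear model and
the split of \code{mtcars} are arbitrary choices made for illustration.

\preformatted{
library(parsnip)
library(workflows)
library(probably)  # assumed to be the package providing int_conformal_full()

# Illustrative split of a built-in data set (an assumption, not from the docs)
train_data <- mtcars[1:25, ]
new_data   <- mtcars[26:32, ]

lm_wflow <-
  workflow() |>
  add_model(linear_reg()) |>
  add_formula(mpg ~ wt + hp)

lm_fit <- fit(lm_wflow, data = train_data)

# Default control settings
int_obj <- int_conformal_full(lm_fit, train_data)

# Grid search over trial values instead
grid_ctrl    <- control_conformal_full(method = "grid")
int_obj_grid <- int_conformal_full(lm_fit, train_data, control = grid_ctrl)

# The predict() method computes the intervals for new data
predict(int_obj, new_data)
predict(int_obj_grid, new_data)
}

With a training set this small, the two methods may give noticeably different
bounds, which is consistent with the trade-offs described next.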

For medium to large data sets, the iterative search method is likely to
generate slightly smaller intervals. For small training sets, grid search
is more likely to have somewhat smaller intervals (and will be more stable).
Otherwise, the iterative search method is more precise and several-fold
faster.

To determine a range of possible values for the interval bounds, which is used by both methods,
the initial set of training set residuals is modeled using a Gamma generalized
linear model with a log link (see the reference by Aitkin below). For a new
sample, the absolute size of the residual is estimated and a multiple of
this value is used as an initial guess of the search boundaries.
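
The idea can be sketched in base R as follows. This is an illustration under
stated assumptions and not the package's internal code: the \code{lm()} fit,
the \code{mtcars} split, and the \code{multiplier} value are all hypothetical
choices.

\preformatted{
# Initial model fit and absolute residuals on illustrative training data
train <- mtcars[1:25, ]
fit   <- lm(mpg ~ wt + hp, data = train)

var_data <- train[, c("wt", "hp")]
var_data$abs_resid <- abs(residuals(fit))

# Gamma GLM with a log link for the size of the absolute residuals
var_fit <- glm(abs_resid ~ wt + hp, data = var_data,
               family = Gamma(link = "log"))

# Estimated absolute residual size for a new sample
new_obs   <- mtcars[26, ]
est_resid <- predict(var_fit, newdata = new_obs, type = "response")

# A multiple of that estimate gives an initial guess of the search boundaries
multiplier   <- 10  # hypothetical value chosen for illustration
pred         <- predict(fit, newdata = new_obs)
search_range <- c(pred - multiplier * est_resid,
                  pred + multiplier * est_resid)
}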
}
\references{
Jing Lei, Max G'Sell, Alessandro Rinaldo, Ryan J. Tibshirani and Larry
Wasserman (2018) Distribution-Free Predictive Inference for Regression,
\emph{Journal of the American Statistical Association}, 113:523, 1094-1111

Murray Aitkin (1987) Modelling Variance Heterogeneity in Normal Regression
Using GLIM, \emph{Journal of the Royal Statistical Society Series C: Applied Statistics},
36:3, 332-339
}
\seealso{
\code{\link[=predict.int_conformal_full]{predict.int_conformal_full()}}
}
