% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/sample_verification.R
\name{sample_verification}
\alias{sample_verification}
\title{Add Sample Verification Column (Level-2)}
\usage{
sample_verification(
  FILENAME,
  data.in,
  exclusion.info,
  assay,
  output.res = FALSE,
  INPUT.DIR = NULL,
  OUTPUT.DIR = NULL,
  verbose = TRUE
)
}
\arguments{
\item{FILENAME}{(Character) A string used to identify the input level-1 file,
"<FILENAME>-<assay>-Level1.tsv".}

\item{data.in}{(Data Frame) A level-1 data frame from the format functions.}

\item{exclusion.info}{(Data Frame) A data frame containing the variables and 
values of the corresponding variables to exclude rows. 
See details for full explanation.}

\item{assay}{(Character) A string indicating which assay the input data come from. Valid 
input is one of the following: "Clint", "fup-UC", "fup-RED", or "Caco-2". 
This argument only needs to be specified when importing an input data set with \code{FILENAME} 
or exporting a data file.}

\item{output.res}{(Logical) When set to \code{TRUE}, the resulting
data frame (level-2) will be exported to the user's per-session temporary directory
or \code{OUTPUT.DIR} (if specified) as a .tsv file. 
(Defaults to \code{FALSE}.)}

\item{INPUT.DIR}{(Character) Path to the directory where the input level-1 file exists. 
If \code{NULL}, the function looks for the input level-1 file in the current working
directory. (Defaults to \code{NULL}.)}

\item{OUTPUT.DIR}{(Character) Path to the directory to save the output file. 
If \code{NULL}, the output file will be saved to \code{INPUT.DIR} if specified,
or otherwise to the user's per-session temporary directory. (Defaults to \code{NULL}.)}

\item{verbose}{(Logical) Indicates whether printed statements should be shown.
(Defaults to \code{TRUE}.)}
}
\value{
A level-2 data frame with a verification column.
}
\description{
This function takes in a level-1 data frame and an exclusion list and 
returns a level-2 data frame with a verification column. The
verification column contains either "Y", indicating the row is good for analysis,
or messages contained in the exclusion list for why the data rows are excluded. 
If an exclusion list is not provided, all rows are assumed to be good for use 
in further analyses and are verified with "Y".
}
\details{
\code{exclusion.info} should be a data frame with the following columns:
\tabular{rr}{
  Variables \tab level-1 variable(s) used to filter rows for exclusion\cr
  Values \tab Value(s) to exclude\cr
  Message \tab Simple explanation for the exclusion\cr
}
When filtering on multiple variable-value pairs, the character input for 
"Variables" and "Values" should each be separated by a vertical bar, "|",
and the variables and values should be listed in the same order so that each pair matches.
See the demonstration in Examples, Scenario 1. 
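
As a rough illustration of the expected format (a sketch only, not the function's
internal code; the compound and sample names are hypothetical), a "|"-separated
entry can be split and checked so that each variable lines up with one value:
\preformatted{
# Hypothetical single exclusion entry with two variable-value pairs
vars <- "Compound.Name|Lab.Sample.Name"
vals <- "Compound A|Sample 01"

# Split on the literal vertical bar (fixed = TRUE avoids regex alternation)
var_parts <- strsplit(vars, "|", fixed = TRUE)[[1]]
val_parts <- strsplit(vals, "|", fixed = TRUE)[[1]]

# Each variable should have exactly one corresponding value
stopifnot(length(var_parts) == length(val_parts))
pairs <- setNames(val_parts, var_parts)
}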

NOTE: Currently, if a variable of interest contains NAs, that variable cannot be
used to assign verification. In that case, either use alternative
variable-value pairs in place of the variable with missing values, or
(though less ideal) adjust the verification column manually.
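
One way to check for this before building \code{exclusion.info} is sketched below
with a hypothetical toy data frame (the column names are borrowed from the
Examples; the values are made up):
\preformatted{
# Hypothetical level-1-like data with a missing sample name
toy_level1 <- data.frame(
  Compound.Name = c("Compound A", "Compound A", "Compound B"),
  Lab.Sample.Name = c("Sample 01", NA, "Sample 03")
)

# Count missing values in each candidate variable
na_counts <- colSums(is.na(toy_level1))

# Only variables without NAs are candidates for exclusion.info
usable_vars <- names(na_counts)[na_counts == 0]
usable_vars
}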

If the output level-2 data frame is exported and neither \code{OUTPUT.DIR} nor 
\code{INPUT.DIR} is specified, it will be saved to the user's R session temporary directory. 
This temporary directory is a per-session directory whose path can be found
with the following code: \code{tempdir()}. For more details, see 
\url{https://www.collinberke.com/til/posts/2023-10-24-temp-directories/}.
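
For instance, a minimal sketch for locating any .tsv files in that directory
(the file pattern is an assumption about what you want to list):
\preformatted{
# Path of the per-session temporary directory
out_dir <- tempdir()

# List .tsv files currently stored there, with full paths
tsv_files <- list.files(out_dir, pattern = "[.]tsv$", full.names = TRUE)
tsv_files
}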

As a best practice, \code{INPUT.DIR} (when importing a .tsv file) and/or 
\code{OUTPUT.DIR} should be specified to simplify the process of importing 
and exporting files. This practice ensures that the exported files can easily 
be found and will not be exported to a temporary directory.
}
\examples{
level1 <- invitroTKstats::clint_L1

# Scenario 1: Pass in data.in and exclusion.info data frame from R session 

# Create an exclusion criteria data frame.
# Use the excluded samples found in invitroTKstats::clint_L2_heldout.
# If more than one variable is used to define a set of samples to be excluded,
# enter them as one string, separate the Variables with a vertical bar, "|",
# and do the same for Values. 

excluded_level2 <- invitroTKstats::clint_L2_heldout

exclusion_criteria <- data.frame(
  Variables = paste("Compound.Name", "Lab.Sample.Name", sep = "|"), 
  Values = paste(excluded_level2[, "Compound.Name"],
                 excluded_level2[, "Lab.Sample.Name"],
                 sep = "|"),
  Message = excluded_level2[, "Verified"]
)
  
# Run the verification function.
my.level2 <- sample_verification(data.in = level1,
                                 exclusion.info = exclusion_criteria,
                                 output.res = FALSE)
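
# Optional follow-up (a sketch): tabulate the verification column to see how
# many rows are verified ("Y") versus excluded with a message.
# The column name "Verified" is an assumption based on its use in
# clint_L2_heldout above; the check below skips the table if it is absent.
if (is.element("Verified", colnames(my.level2))) {
  print(table(my.level2[, "Verified"], useNA = "ifany"))
}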

# Scenario 2: Import a .tsv file as input data and do not pass in an exclusion.info data frame

\dontrun{
# Write the level-1 file to a folder.
# Replace <desired level-1 FOLDER> with the desired export folder location.
# The <desired level-1 FOLDER> must already exist.

write.table(level1,
            file = here::here("<desired level-1 FOLDER>/Smeltz-Clint-Level1.tsv"),
            sep = "\t",
            row.names = FALSE,
            quote = FALSE)

# Run the verification function.
# Specify the path to import level-1 data with INPUT.DIR.
# Replace <desired level-1 FOLDER> in INPUT.DIR with the folder location
# chosen above.
# If no exclusion.info data frame is used, all samples are labeled as verified.
# Note: with the default output.res = FALSE, no file is written. If
# output.res = TRUE, the level-2 file is exported to INPUT.DIR when
# OUTPUT.DIR is not specified.
my.level2 <- sample_verification(FILENAME = "Smeltz",
                                 assay = "Clint",
                                 INPUT.DIR = here::here("<desired level-1 FOLDER>"))
}

}
\author{
Zhihui (Grace) Zhao
}
