\encoding{UTF-8}
\name{get_comptab}
\alias{get_comptab}
\title{Calculate Compositional Differences}
\description{
  Compute differences of carbon oxidation state, stoichiometric hydration state and other compositional metrics between groups of up- and down-regulated proteins.
}

\usage{
  get_comptab(pdat, var1 = "ZC", var2 = "nH2O", plot.it = FALSE,
    mfun = "median", oldstyle = FALSE, basis = getOption("basis"))
}

\arguments{
  \item{pdat}{list, data object generated by a \code{\link{pdat_}} function}
  \item{var1}{character, name of the first variable (see Details)}
  \item{var2}{character, name of the second variable (see Details)}
  \item{plot.it}{logical, make a scatterplot?}
  \item{mfun}{character, either \samp{median} or \samp{mean}}
  \item{oldstyle}{logical, also calculate \code{\link{CLES}} and \emph{p}-values?}
  \item{basis}{character, keyword for basis species to use}
}

\details{
The available variables are:
\tabular{ll}{
  \samp{ZC}   \tab average oxidation state of carbon (\ZC; see \code{\link{ZCAA}}) \cr
  \samp{nH2O} \tab stoichiometric hydration state per residue (\nH2O; see \code{\link{H2OAA}}) \cr
  \samp{nO2} \tab stoichiometric oxidation state per residue (\nO2; see \code{\link{O2AA}}) \cr
  \samp{V0}   \tab standard molal volume per residue \cr
  \samp{nAA}  \tab protein length (number of amino acids) \cr
  \samp{GRAVY}\tab grand average of hydropathicity (see \code{\link{GRAVY}}) \cr
  \samp{pI}   \tab isoelectric point (see \code{\link{pI}}) \cr
  \samp{PS_TPPG17}   \tab phylostratum (see \code{\link{PS}}) \cr
  \samp{PS_LMM16}   \tab phylostratum (see \code{\link{PS}}) \cr
  \samp{MW}   \tab molecular weight per residue \cr
}
Differentially expressed proteins are identified by the value of \code{pdat$up2} (\code{TRUE} for up-regulated proteins and \code{FALSE} for down-regulated proteins).
The differences are calculated as (median for up-regulated proteins) - (median for down-regulated proteins); if \code{mfun} is \samp{mean}, means of the groups are used instead.
If \code{oldstyle} is \code{TRUE}, the function also calculates the common language effect size (\code{\link{CLES}}, in percent) and \emph{p}-value for each variable.

The \code{basis} argument is used to select the basis species, which are used for the calculation of \nH2O and \nO2.
By default, \code{getOption("basis")} selects the \samp{QEC} basis species (see \code{\link{metrics}}).

Volume is calculated using amino acid group additivity as described by Dick et al. (2006).

Phylostrata are not compositional metrics, but are retrieved by matching UniProt accession numbers in a data file (see \code{\link{PS}}).
Because phylostratum numbers are discrete values, mean values are calculated regardless of the value of \code{mfun}.

Set \code{plot.it} to \code{TRUE} to make a scatterplot.
Open red squares and filled blue circles stand for up-regulated and down-regulated proteins, respectively.
}

\value{
A data frame is returned invisibly. It contains the columns \samp{dataset}, \samp{description}, \samp{n1} (number of down-regulated proteins), and \samp{n2} (number of up-regulated proteins), followed by two sets of columns for the variables.
These are denoted generically as (\samp{var.mfun1}, \samp{var.mfun2}, \samp{var.diff}, \samp{var.CLES}, \samp{var.p.value}), where \samp{var} is replaced by the name of \code{var1} or \code{var2}, and \samp{mfun} is replaced by the value of \code{mfun}.
For example, \samp{ZC.median1} and \samp{ZC.median2} are the median \ZC of the down- and up-regulated proteins, respectively.
}

\examples{
pd <- pdat_colorectal("JKMF10")
# default variables: ZC and nH2O
get_comptab(pd, plot.it = TRUE)
# protein length and per-residue volume
get_comptab(pd, "nAA", "V0", plot.it = TRUE)
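# a further sketch, assuming the column naming described in the Value section:
# use means instead of medians and keep the invisibly returned data frame
ct <- get_comptab(pd, mfun = "mean")
# with the default variables, the differences (up-regulated minus
# down-regulated) should be in the columns ZC.diff and nH2O.diff
ct[, c("n1", "n2", "ZC.diff", "nH2O.diff")]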
}

\references{
Dick, J. M., LaRowe, D. E. and Helgeson, H. C. (2006) Temperature, pressure, and electrochemical constraints on protein speciation: Group additivity calculation of the standard molal thermodynamic properties of ionized unfolded proteins. \emph{Biogeosciences} \bold{3}, 311--336. \url{https://doi.org/10.5194/bg-3-311-2006}
}

\concept{Chemical composition}
