% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dataDiscretize.R
\name{dataDiscretize}
\alias{dataDiscretize}
\alias{bulkDiscretize}
\title{Discretize data}
\usage{
dataDiscretize(data, classBoundaries = NULL, classStates = NULL,
  method = "quantile")

bulkDiscretize(formattedLst, xy, inparallel = FALSE)
}
\arguments{
\item{data}{numeric vector. The continuous data to be discretized.}

\item{classBoundaries}{numeric vector or single integer. Interval boundaries to be used for data discretization.
The outer values (minimum and maximum) are required. \code{-Inf} and \code{Inf} are allowed, in which case
the data minimum and maximum are used to calculate the mid values of the outer classes. Alternatively, a single integer
giving the number of classes, to be split by quantiles (default) or by equal intervals.}

\item{classStates}{vector. The state labels to be assigned to the discretized data.}

\item{method}{character. What splitting method should be used? This argument is ignored if 
a vector of values is passed to \code{classBoundaries}.
\itemize{
\item{\code{quantile} splits data into quantiles (default). }
\item{\code{equal} splits data into equally sized intervals based on data minimum and maximum. }}}

\item{formattedLst}{A formatted list as returned by \code{\link{linkNode}} and \code{\link{linkMultiple}}.}

\item{xy}{matrix. A matrix of spatial coordinates; first column is x (longitude), second column is y (latitude) of locations (in rows).}

\item{inparallel}{logical or integer. Should the function use parallel processing facilities? Default is \code{FALSE}: a single process will be launched. If \code{TRUE}, all cores/processors but one will be used.
Alternatively, an integer can be provided to set the number of cores/processors to be used.}
}
\value{
\code{dataDiscretize} returns a named list of 4 vectors: 
\itemize{
\item{\code{$discreteData}}{the discretized data; labels are applied if the \code{classStates} argument is provided }
\item{\code{$classBoundaries}}{the class boundaries, i.e. values splitting the classes }
\item{\code{$midValues}}{the mid point for each class (the mean of its lower and upper boundaries) }
\item{\code{$classStates}}{the labels assigned to each class }
}
\code{bulkDiscretize} returns a matrix: each column is a node associated with input spatial data,
and the rows hold the discretized values at the coordinates specified by argument \code{xy}.
}
\description{
These functions discretize continuous input data into classes. Classes can be defined 
by the user or, if the user provides the number of expected classes, calculated 
from quantiles (default option) or by equal intervals.\cr
\code{dataDiscretize} processes a single variable at a time, provided as a vector.
\code{bulkDiscretize} discretizes multiple input rasters, optionally using parallel processing.
}
\details{
dataDiscretize
}
\examples{
s <- runif(30)

# Split by user-defined values. Values outside the boundaries are set to NA:
dataDiscretize(s, classBoundaries = c(0.2, 0.5, 0.8)) 

# Split by quantiles (default):
dataDiscretize(s, classStates = c('a', 'b', 'c'))

# Split by equal intervals:
dataDiscretize(s, classStates = c('a', 'b', 'c'), method = "equal")
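
# A minimal sketch, assuming (per the classBoundaries argument description) that
# a single integer is accepted as the number of classes. The value 4 is an
# arbitrary illustrative choice; classes are split by quantiles (default):
res <- dataDiscretize(s, classBoundaries = 4)
res$classBoundaries
res$midValues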

# When -Inf and Inf are provided as outer boundaries, $midValues of the outer classes
# are calculated from the data minimum and maximum (compare the two calls below):
dataDiscretize(s, classBoundaries=c(0, 0.5, 1), classStates=c("first", "second"))[c(2,3)]
dataDiscretize(s, classBoundaries=c(-Inf, 0.5, Inf), classStates=c("first", "second"))[c(2,3)]

## Discretize multiple spatial data by location
data(ConwyData)
list2env(ConwyData, environment())

network <- LandUseChange
spatialData <- c(ConwyLU, ConwySlope, ConwyStatus)

# Link multiple spatial data to the network nodes and discretize
spDataLst <- linkMultiple(spatialData, network, LUclasses, verbose = FALSE)
coord <- aoi(ConwyLU, xy=TRUE)
head( bulkDiscretize(spDataLst, coord) )
}
