% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/CNN.R
\name{CNN}
\alias{CNN}
\alias{CNN.default}
\alias{CNN.formula}
\title{Condensed Nearest Neighbors}
\usage{
\method{CNN}{formula}(formula, data, ...)

\method{CNN}{default}(x, classColumn = ncol(x), ...)
}
\arguments{
\item{formula}{A formula describing the classification variable and the attributes to be used.}

\item{data, x}{Data frame containing the training dataset to be filtered.}

\item{...}{Optional parameters to be passed to other methods.}

\item{classColumn}{Positive integer indicating the column that contains the
(factor of) classes. By default, the last column is used.}
}
\value{
An object of class \code{filter}, which is a list with seven components:
\itemize{
   \item \code{cleanData} is a data frame containing the filtered dataset.
   \item \code{remIdx} is a vector of integers indicating the indices of the
   removed instances (i.e. their row numbers with respect to the original data frame).
   \item \code{repIdx} is a vector of integers indicating the indices of the
   repaired/relabelled instances (i.e. their row numbers with respect to the original data frame).
   \item \code{repLab} is a factor containing the new labels for repaired instances.
   \item \code{parameters} is a list containing the argument values.
   \item \code{call} contains the original call to the filter.
   \item \code{extraInf} is a character string with additional
   information not covered by the previous items.
}
}
\description{
Similarity-based method designed to select the most relevant instances for
subsequent classification with a \emph{nearest neighbor} rule. For more
information, see the 'Details' and 'References' sections.
}
\details{
\code{CNN} searches for a 'consistent subset' of the provided dataset, i.e. a subset that suffices to
correctly classify the remaining instances with 1-NN. To do so, \code{CNN} stores the first instance and
makes a first sweep over the dataset, adding to the stored bag every instance that is misclassified by 1-NN when the stored bag is used as the training set.
The sweep is then repeated until all non-stored instances are correctly classified.

Although \code{CNN} is not strictly a label noise filter, it is included here for completeness, since
the origins of noise filters are connected with instance selection algorithms.
}
\examples{
# The following example is not run, to save time
\dontrun{
data(iris)
out <- CNN(iris)
print(out)
length(out$remIdx)
identical(out$cleanData, iris[setdiff(seq_len(nrow(iris)), out$remIdx), ])
}
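
# A minimal base-R sketch of the sweep procedure described in the Details
# section, assuming numeric attributes and Euclidean distance. The helper
# cnn_sketch is hypothetical and written only for illustration; it is not
# the implementation used by CNN and its output may differ from CNN.
cnn_sketch <- function(x, classColumn = ncol(x)) {
  attrs <- as.matrix(x[, -classColumn, drop = FALSE])
  labels <- as.character(x[[classColumn]])
  n <- nrow(attrs)
  # The stored bag starts with the first instance only
  stored <- c(TRUE, rep(FALSE, n - 1))
  repeat {
    added <- FALSE
    # One sweep over the instances that were not stored when the sweep began
    for (i in which(!stored)) {
      bag <- which(stored)
      # Squared Euclidean distances from instance i to every stored instance
      diffs <- sweep(attrs[bag, , drop = FALSE], 2, attrs[i, ])
      nearest <- bag[which.min(rowSums(diffs^2))]
      # Store the instance if 1-NN over the current bag misclassifies it
      if (labels[nearest] != labels[i]) {
        stored[i] <- TRUE
        added <- TRUE
      }
    }
    # Stop after a full sweep that adds nothing to the bag
    if (!added) break
  }
  which(stored)
}

# Row numbers of the instances kept by the sketch
kept <- cnn_sketch(iris)
length(kept)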
}
\references{
Hart P. (May, 1968): The condensed nearest neighbor rule, \emph{IEEE Trans. Inf.
Theory}, vol. 14, no. 3, pp. 515-516.
}
\seealso{
\code{\link{RNN}}
}

