% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/07-data_summarise.R
\name{dossier_summarize}
\alias{dossier_summarize}
\title{Generate a report and summary of a dossier (list of datasets)}
\usage{
dossier_summarize(
  dossier,
  group_by = NULL,
  taxonomy = NULL,
  valueType_guess = FALSE
)
}
\arguments{
\item{dossier}{A list of tibbles, each of which is a dataset.}

\item{group_by}{A character string naming one column of the dataset to use
as a grouping column. Visual elements will be grouped and displayed
by this column.}

\item{taxonomy}{A tibble identifying the scheme used for variables
classification.}

\item{valueType_guess}{Whether the output should include a more accurate
valueType that could be applied to the dataset. FALSE by default.}
}
\value{
A list of tibbles containing the report for each dataset in the dossier.
}
\description{
Assesses and summarizes the content and structure of a dossier
(list of datasets) and reports potential issues to facilitate the
assessment of input data. The report can be used to help assess data
structure, presence of fields, coherence across elements, and taxonomy or
data dictionary formats. The summary provides additional information about
variable distributions and descriptive statistics. This report is compatible
with Excel and can be exported as an Excel spreadsheet.
}
\details{
A dossier must be a named list containing at least one data frame or
data frame extension (e.g. a tibble), each of which is a dataset.
The name of each tibble will be used as the reference name of the dataset.

A taxonomy is a classification scheme that can be defined for variable
attributes. If defined, a taxonomy must be a data frame-like object. It must
be compatible with (and is generally extracted from) an Opal environment. To
work with certain functions, a valid taxonomy must contain at least the
columns 'taxonomy', 'vocabulary', and 'terms'. In addition, the taxonomy
may follow the Maelstrom Research taxonomy, in which case its content can be
evaluated against rules specific to Maelstrom Research, such as naming
convention restrictions, tagging elements, or scales. In this particular
case, the tibble must also contain 'vocabulary_short', 'taxonomy_scale',
'vocabulary_scale' and 'term_scale' to work with some specific functions.

The valueType is a property of a variable and is required in certain
functions to determine the handling of the variables. The valueType refers
to the OBiBa-internal type of a variable. It is specified in a data
dictionary in a column \code{valueType} and can be associated with variables as
attributes. Acceptable valueTypes include 'text', 'integer', 'decimal',
'boolean', 'datetime', and 'date'. The full list of OBiBa valueType
possibilities and their correspondence with R data types are available using
\link{valueType_list}.
}
\examples{
{

# load dplyr, which provides tibble() and glimpse()
library(dplyr)

###### Example 1: Combine functions and summarise datasets.
dossier <- list(iris = tibble())

dossier_summary <- dossier_summarize(dossier)
glimpse(dossier_summary)
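
###### Example 2 (illustrative sketch): check inputs before summarising.
# The checks below use base R only and are NOT part of the package API.
# They mirror the requirements stated in Details: a dossier is a named list
# of data frames, and a taxonomy has at least the columns 'taxonomy',
# 'vocabulary', and 'terms'. All object names and the taxonomy content
# are hypothetical and chosen for illustration.
dossier_check <- list(iris = iris, mtcars = mtcars)

# every element must be named and must be a data frame (or an extension)
is_named <- !is.null(names(dossier_check)) && all(names(dossier_check) != "")
all_data_frames <- all(vapply(dossier_check, is.data.frame, logical(1)))
is_named && all_data_frames

# a one-row placeholder taxonomy carrying the minimum required columns
taxonomy_check <- data.frame(
  taxonomy   = "hypothetical_taxonomy",
  vocabulary = "hypothetical_vocabulary",
  terms      = "hypothetical_term"
)
required_cols <- c("taxonomy", "vocabulary", "terms")
missing_cols <- setdiff(required_cols, names(taxonomy_check))
length(missing_cols) == 0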

}

}
