Here is a collection of tips and tricks to go further with desctable.
You can define labels for variables using the .labels argument in desc_table:
labels <- c(mpg = "Miles/(US) gallon",
cyl = "Number of cylinders",
disp = "Displacement (cu.in.)",
hp = "Gross horsepower",
drat = "Rear axle ratio",
wt = "Weight (1000 lbs)",
qsec = "1/4 mile time",
vs = "Engine",
am = "Transmission",
gear = "Number of forward gears",
CARBURATOR = "Number of carburetors")
mtcars %>%
desc_table(.labels = labels) %>%
desc_output("DT")

As you can see with CARBURATOR instead of carb, not all variables need to have a label, and unused labels are discarded.
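The matching behaviour described above can be pictured with a plain named-vector lookup. This is a minimal base R sketch of the idea, not desctable's actual implementation; the shortened `labels` vector and the `found` and `shown` names are illustrative assumptions.

```r
# Hypothetical label vector: "CARBURATOR" matches no column of mtcars
labels <- c(mpg = "Miles/(US) gallon",
            cyl = "Number of cylinders",
            CARBURATOR = "Number of carburetors")

vars <- names(mtcars)

# Look up each variable by name; variables without a label yield NA
found <- unname(labels[vars])

# Fall back on the variable name when no label was found
shown <- ifelse(is.na(found), vars, found)

# mpg gets its label, carb keeps its name, and the CARBURATOR label is never used
shown[vars %in% c("mpg", "carb")]
```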
desc_table chooses its own statistics this way:

- N = length and "%" = percent if there is at least a factor
- min, max, Q1, Q3, median, mean, sd, IQR if there is at least a numeric

You can define your own automatic statistic function using the .auto argument in desc_table.
This function should accept one argument, the table to choose statistics
for (in the case of a grouped dataframe the subtables will be passed to
the function). It should return a list of statistics.
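As a minimal sketch of such a function, assuming you only want counts plus mean and standard deviation, something like the following would satisfy the contract described above (one table in, a list of statistics out). The name `my_stats_auto` is hypothetical.

```r
# Hypothetical automatic function: mean and sd when a numeric column is present,
# counts only otherwise
my_stats_auto <- function(data) {
  has_numeric <- any(vapply(data, is.numeric, logical(1)))

  if (has_numeric)
    list(N = length, Mean = mean, SD = stats::sd)
  else
    list(N = length)
}

# Inspect which statistics would be chosen for two different tables
names(my_stats_auto(mtcars))
names(my_stats_auto(data.frame(g = factor(c("a", "b")))))
```

It would then be handed to desc_table through the .auto argument, for example `desc_table(.auto = my_stats_auto)`.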
Here is the code of stats_auto, the default value of
.auto
stats_auto <- function(data) {
data %>%
lapply(is.numeric) %>%
unlist() %>%
any -> numeric
data %>%
lapply(is.factor) %>%
unlist() %>%
any() -> fact
stats <- list("Min" = min,
"Q1" = ~quantile(., .25),
"Med" = stats::median,
"Mean" = mean,
"Q3" = ~quantile(., .75),
"Max" = max,
"sd" = stats::sd,
"IQR" = IQR)
if (fact & numeric)
c(list("N" = length,
"%" = percent),
stats)
else if (fact & !numeric)
list("N" = length,
"%" = percent)
else if (!fact & numeric)
stats
}

If you often reuse the same statistics for multiple tables and you don’t want to repeat yourself, you can splice a list into desc_table using the !!! operator from rlang:
stats = list(N = length,
Mean = mean,
SD = sd)
mtcars %>%
desc_table(!!!stats) %>%
desc_output("DT")

When splicing, all stats need to be explicitly named. In the list below, mean and sd are left unnamed:
stats2 = list(N = length,
mean,
sd)
mtcars %>%
desc_table(!!!stats2) %>%
desc_output("DT")

You can also define a “dumb” automatic function:
default_stats <- function(data)
{
list(N = length,
mean,
sd)
}

desc_tests chooses its own statistical tests this way:

- for a factor, fisher.test
- if fisher.test fails, fall back on chisq.test
- otherwise, wilcox.test if there are two groups
- kruskal.test if there are more than two groups

You can define your own automatic test function using the .auto argument in desc_tests.
This function should accept two arguments, the variable to compare and
the grouping variable, and return a statistical test that accepts a
formula argument and returns an object with a
p.value element.
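A minimal sketch of a function with this shape, assuming you prefer a t-test over the Wilcoxon test for two groups. The name `my_tests_auto` is hypothetical, and the `~chisq.test` branch assumes a formula-capable chisq.test like the one the default tests_auto relies on.

```r
# Hypothetical chooser: a t-test for two groups, Kruskal-Wallis otherwise.
# Factor variables get ~chisq.test, mirroring the fallback used by tests_auto.
my_tests_auto <- function(var, grp) {
  grp <- factor(grp)

  if (is.factor(var))
    ~chisq.test
  else if (nlevels(grp) == 2)
    ~t.test
  else
    ~kruskal.test
}

# The returned one-sided formula names the test; its right-hand side is the test symbol
chosen <- my_tests_auto(mtcars$mpg, mtcars$am)
test <- eval(chosen[[2]])

# The chosen test accepts a formula and returns an object with a p.value element
test(mpg ~ am, data = mtcars)$p.value
```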
Here is the code of tests_auto, the default value of
.auto
tests_auto <- function(var, grp) {
grp <- factor(grp)
if (nlevels(grp) < 2)
~no.test
else if (is.factor(var)) {
if (tryCatch(is.numeric(fisher.test(var ~ grp)$p.value), error = function(e) F))
~fisher.test
else
~chisq.test
} else if (nlevels(grp) == 2)
~wilcox.test
else
~kruskal.test
}

You can also provide a default statistical test using the .default argument:
mtcars %>%
group_by(am) %>%
desc_table(mean, sd) %>%
desc_tests(.default = ~t.test) %>%
desc_output("DT")

Note that, as with named tests, the test name must be prefixed with a tilde (~).
You can still choose individual tests when you define either a .auto or a .default test:
mtcars %>%
group_by(am) %>%
desc_table(mean, sd, median, IQR) %>%
desc_tests(.default = ~t.test, carb = ~wilcox.test) %>%
desc_output("DT")

Note that if a .default test is provided, .auto is ignored.
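The precedence described above (an individual test first, then .default, then .auto) can be summarised in a few lines of base R. This is a simplified illustration, not desctable's code: `pick_test` is a hypothetical helper, and `auto` is reduced to an already chosen test rather than a function of the variable and the grouping variable.

```r
# Hypothetical resolver showing the order in which a test is picked
pick_test <- function(var_name, explicit = list(), default = NULL, auto = NULL) {
  if (!is.null(explicit[[var_name]]))
    explicit[[var_name]]   # a test named for this variable wins
  else if (!is.null(default))
    default                # otherwise the default test, if one was given
  else
    auto                   # the automatic choice is only used as a last resort
}

explicit <- list(carb = ~wilcox.test)

pick_test("carb", explicit, default = ~t.test, auto = ~kruskal.test)
pick_test("mpg", explicit, default = ~t.test, auto = ~kruskal.test)
pick_test("mpg", explicit, auto = ~kruskal.test)
```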
You can set the number of significant digits to display with the digits argument. The p-values are truncated at 1E-digits (that is, 10^-digits).
iris %>%
group_by(Species) %>%
desc_table(mean, sd) %>%
desc_tests() %>%
desc_output("DT", digits = 10)

Any additional argument given to desc_output will be passed on to the output function:
iris %>%
group_by(Species) %>%
desc_table(mean, sd) %>%
desc_output("DT", filter = "top")
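This kind of forwarding is the usual R `...` idiom. The sketch below shows the mechanism in base R only, with `my_output` as a hypothetical wrapper and `format` standing in for the real output function; it makes no claim about how desc_output is written.

```r
# Hypothetical wrapper: anything passed in ... is handed unchanged to format()
my_output <- function(x, ...) {
  format(x, ...)
}

# digits is not an argument of my_output, yet it reaches format()
my_output(3.14159, digits = 3)
```

In the same way, filter = "top" in the example above is not interpreted by desc_output itself but is handed to the function that builds the output.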