Automated Statistical Test

Wouter Zeevat

Introduction

The automatedtests package automatically selects and runs the most appropriate statistical test for your data — no manual decision-making needed. The function works with both individual vectors or a data frame and provides the results in an easy-to-understand format, which includes the test used and all the relevant statistics.

Supported Tests

number test
1 One-proportion test
2 Chi-square goodness-of-fit test
3 One-sample Student’s t-test
4 One-sample Wilcoxon test
5 Multiple linear regression
6 Binary logistic regression
7 Multinomial logistic regression
8 Pearson correlation
9 Spearman’s rank correlation
10 Cochran’s Q test
11 McNemar’s test
12 Fisher’s exact test
13 Chi-square test of independence
14 Student’s t-test for independent samples
15 Welch’s t-test for independent samples
16 Mann-Whitney U test
17 Student’s

Usage of automatical_test()

The automatical_test() function can be used with both individual vectors or a data frame. It automatically selects the most suitable statistical test based on the data provided.

Example 1: Using Individual Vectors

In this example, we will use two vectors: Species and Sepal.Length from the iris dataset. We will use the automatical_test() function to automatically choose the best statistical test for these vectors.

# Load the package
library(automatedtests)

# Example 1: Using individual vectors from the iris dataset
test1 <- automatical_test(iris$Species, iris$Sepal.Length, identifiers = FALSE)

# View the result summary
print(test1$getResult())
## 
##  Kruskal-Wallis rank sum test
## 
## data:  data[[quan_index]] by data[[qual_index]]
## Kruskal-Wallis chi-squared = 96.937, df = 2, p-value < 2.2e-16

In this case, the function automatically selects the best statistical test based on the data’s distribution and other characteristics.

Example 2: Forcing a Paired Test

Here, we simulate a before-and-after scenario, where data is collected before and after an intervention. The automatical_test() function can be instructed to use paired tests by setting the paired argument to TRUE.

# Example 2: Forcing a paired test
before <- c(200, 220, 215, 205, 210)
after <- c(202, 225, 220, 210, 215)
paired_data <- data.frame(before, after)

# Perform the paired test
test2 <- automatical_test(before, after, paired = TRUE)

# View the result summary
print(test2$getResult())
## 
##  Paired t-test
## 
## data:  data[[1]] and data[[2]]
## t = -7.3333, df = 4, p-value = 0.001841
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -6.065867 -2.734133
## sample estimates:
## mean difference 
##            -4.4

By setting paired = TRUE, the function forces the use of a paired statistical test, even if identifiers are not provided.

Example 3: One-Sample Test with Custom Compare Value

You can override the default compare_to value to perform one-sample tests. For example, you can test whether the data differs significantly from a specified value.

# Example 3: One-sample test
test3 <- automatical_test(iris$Sepal.Length, compare_to = 5)

# View the result summary
print(test3$getResult()$p.value)
## [1] 1.297119e-19

In this case, compare_to = 5 specifies that we are performing a one-sample test where we compare the Sepal.Length to the value 5.

Conclusion

The automatical_test() function simplifies the process of selecting and running statistical tests. It automatically picks the most appropriate test based on the data’s structure and characteristics. You can fine-tune its behavior with options like compare_to, identifiers, and paired.

For more detailed information on the results of each test, you can use the getResult() method to retrieve a summary of the test performed.

See Also