Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

*The author of this computation has been verified*

R Software Module

rwasp_regression_trees1.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Tue, 13 Dec 2011 13:11:32 -0500

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2011/Dec/13/t1323799944apqsxhvxbuip3go.htm/, Retrieved Thu, 02 May 2024 16:19:05 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=154600, Retrieved Thu, 02 May 2024 16:19:05 +0000

IsPrivate?

No (this computation is public)

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 18:59:57] [b98453cac15ba1066b407e146608df68]
-   PD  [Recursive Partitioning (Regression Trees)] [WS 10 - PLC RP CAT] [2011-12-13 18:03:17] [8b13b85c94b9a060d82f72930775ea89]
-   P       [Recursive Partitioning (Regression Trees)] [WS 10 - 10 Fold C...] [2011-12-13 18:11:32] [e1c4030d3eb0ab0fcc7a7b48aeaac474] [Current]

Dataseries X:

2	13	12	30	33	13	16
1	8	8	32	35	11	15
2	14	12	30	35	12	13
2	14	11	33	25	13	14
1	13	11	36	39	12	17
1	16	13	37	37	12	13
1	14	11	31	31	13	12
1	13	10	36	28	12	9
2	15	7	40	38	15	25
1	13	10	31	32	11	13
2	16	12	24	32	13	10
1	20	15	46	46	12	13
1	17	12	40	40	12	9
1	15	15	27	33	12	14
2	16	12	32	25	15	26
1	16	10	41	37	13	12
1	12	10	28	33	13	11
2	9	8	34	33	12	19
2	15	11	31	35	11	12
2	17	14	38	39	11	9
1	12	12	37	36	13	15
1	10	11	34	37	10	15
2	11	6	33	43	12	23
2	16	12	38	27	13	20
1	16	14	27	31	10	0
2	15	11	36	33	12	15
1	13	8	37	35	13	8
2	14	12	35	36	12	12
1	19	15	44	39	11	11
1	16	13	41	31	11	18
2	17	14	29	34	14	19
1	10	12	31	29	12	13
1	15	7	32	37	14	22
1	14	11	35	30	12	12
1	14	7	36	32	12	15
2	16	12	28	31	13	16
1	17	12	34	34	15	16
1	15	12	36	30	12	13
2	17	13	33	33	16	11
2	14	15	35	37	10	16
1	10	9	34	33	13	14
2	14	9	38	28	12	11
2	16	11	35	32	12	20
2	18	14	40	40	16	16
1	15	12	35	39	12	12
1	16	15	32	28	16	17
1	16	12	33	33	13	11
1	10	6	31	36	10	12
2	8	5	32	35	14	14
1	17	13	35	34	13	13
1	14	11	32	35	12	14
1	12	11	26	30	13	19
2	10	6	38	35	16	17
1	14	12	45	37	12	11
1	12	10	36	40	12	12
1	16	6	37	34	13	12
1	16	12	33	37	13	14
1	15	14	35	38	11	15
2	11	6	32	27	14	18
1	16	11	32	27	16	16
2	8	6	32	27	16	16
1	17	14	33	39	14	19
1	16	12	37	37	14	17
1	15	12	40	32	14	15
2	8	8	35	27	14	13
1	13	10	30	35	10	16
1	14	11	36	40	13	17
1	13	7	34	32	14	16
1	16	12	34	36	17	13
2	12	9	37	35	12	15
1	19	13	34	31	12	16
1	19	14	37	34	12	10
1	12	6	43	36	15	19
1	14	12	39	40	10	11
2	15	6	29	33	13	17
1	13	14	41	38	12	19
2	16	12	32	33	13	15
2	10	10	34	35	14	15
1	15	10	34	30	12	17
1	16	12	35	31	13	13
1	15	11	41	42	14	17
2	11	10	32	33	10	12
2	9	7	39	35	12	27
1	16	12	33	33	13	12
1	12	12	30	31	10	15
2	14	12	32	36	13	18
1	14	10	41	32	13	19
1	13	10	24	43	12	21
2	15	12	35	33	12	13
2	17	12	39	34	15	16
2	14	12	32	36	12	13
2	9	9	28	33	16	20
2	11	11	31	32	15	17
1	9	10	36	36	10	10
2	7	5	39	39	13	18
1	13	10	33	30	0	11
2	15	10	36	34	10	18
1	12	12	31	34	12	14
2	15	11	33	36	14	11
2	14	9	33	31	12	14
1	15	15	33	27	13	12
2	9	9	39	28	14	22
1	16	12	35	37	11	12
1	16	16	37	36	11	12
1	14	10	29	31	12	15
2	14	14	34	31	9	13
2	13	10	35	31	13	13
1	14	11	36	34	13	16
2	16	12	29	36	12	12
1	16	14	35	30	14	16
1	13	10	35	37	12	15
2	12	9	36	29	10	19
2	16	12	38	37	11	15
1	16	11	36	38	14	13
1	16	12	37	38	12	9
2	10	7	32	33	13	14
2	14	16	34	34	13	14
2	12	11	29	32	9	12
2	12	12	38	36	13	17
1	12	9	34	30	11	11
1	12	9	33	34	12	17
1	19	15	42	42	13	15
2	14	10	32	24	12	15
1	13	11	31	29	12	11
1	17	14	34	32	11	14
2	16	12	39	31	12	14
1	15	12	38	37	12	14
1	12	12	36	34	13	14
1	8	11	32	35	14	13
1	10	9	37	34	13	14
1	16	11	36	33	12	10
2	10	6	34	31	15	17
2	16	12	34	32	13	11
1	10	12	34	37	14	13
1	18	14	38	39	12	14
1	12	8	33	31	11	14
2	16	15	5	0	12	18
2	10	9	28	30	11	18
2	15	9	33	30	14	18
1	17	11	41	43	13	14
2	16	12	30	31	12	12
2	14	10	31	33	14	16
2	12	11	34	31	13	17
2	11	10	33	38	11	13
2	15	12	37	32	16	16
1	7	11	34	38	13	15

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	3 seconds
R Server	'AstonUniversity' @ aston.wessa.net
R Framework error message	The field 'Names of X columns' contains a hard return which cannot be interpreted. Please, resubmit your request without hard returns in the 'Names of X columns'.

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=154600&T=0

10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	542	158	0.7743	68	22	0.7556
C2	157	454	0.743	13	46	0.7797
Overall	-	-	0.7597	-	-	0.7651
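
The CV column can be recomputed from the counts: each row's value is the diagonal count divided by the row total, and the Overall value is the sum of the diagonal divided by the grand total. A minimal sketch in R, assuming the counts are typed in by hand from the table above (the helper name `cv_rates` is hypothetical and not part of the software module):

```r
# Counts copied from the 10-Fold Cross Validation table
# (rows = actual class, columns = predicted class).
train <- matrix(c(542, 158,
                  157, 454), nrow = 2, byrow = TRUE,
                dimnames = list(actual = c("C1", "C2"),
                                predicted = c("C1", "C2")))
test <- matrix(c(68, 22,
                 13, 46), nrow = 2, byrow = TRUE,
               dimnames = list(actual = c("C1", "C2"),
                               predicted = c("C1", "C2")))

# Hypothetical helper: per-class and overall proportion of correct predictions.
cv_rates <- function(tab) {
  per_class <- diag(tab) / rowSums(tab)   # correct / row total, per actual class
  overall <- sum(diag(tab)) / sum(tab)    # all correct / grand total
  round(c(per_class, Overall = overall), 4)
}

print(cv_rates(train))
print(cv_rates(test))
```

The training counts sum to 1311 and the testing counts to 149, which together make 1460, or ten assignments of each of the 146 data rows.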

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=154600&T=1

Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	61	18
C2	17	50

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=154600&T=2

Figure 1 [image not reproduced]

Figure 2 [image not reproduced]

Figure 3 [image not reproduced]

Parameters (Session):

par1 = 2 ; par2 = quantiles ; par3 = 1 ; par4 = no ;

Parameters (R input):

par1 = 2 ; par2 = quantiles ; par3 = 2 ; par4 = yes ;
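
With par2 = quantiles and par3 = 2, the R code calls `cut2(x[,par1],g=par3)` from Hmisc, which splits column par1 into par3 quantile groups. A rough base-R sketch of the same idea, using the first ten values of the second data column as toy input. Here `cut()` with quantile breaks is a stand-in for `cut2` and may place boundary values differently; the names `v`, `g`, `breaks`, and `groups` are illustrative only:

```r
# First ten values of column 2 of Dataseries X (toy input for illustration).
v <- c(13, 8, 14, 14, 13, 16, 14, 13, 15, 13)
g <- 2  # number of quantile groups, playing the role of par3

# Break points at the 0%, 50% and 100% quantiles.
breaks <- quantile(v, probs = seq(0, 1, length.out = g + 1))

# include.lowest = TRUE keeps the minimum value inside the first interval.
groups <- cut(v, breaks = breaks, include.lowest = TRUE)
print(table(groups))
```

On the full 146-row column the module's own split gives the two classes whose sizes appear as the row totals of the confusion matrix (79 and 67).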

R code (references can be found in the software module):

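# y (the data) and par1..par4 (the parameters) are defined outside this listing
library(party)   # ctree()
library(Hmisc)   # cut2()
par1 <- as.numeric(par1)   # column index of the response
par3 <- as.numeric(par3)   # number of categories
x <- data.frame(t(y))
is.data.frame(x)
x <- x[!is.na(x[,par1]),]  # drop rows with a missing response
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]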
# Discretise the response (column par1) into par3 categories, as selected by par2
if (par2 == 'kmeans') {
  cl <- kmeans(x[,par1], par3)
  print(cl)
  # relabel the clusters in increasing order of their centers
  clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
  clm <- clm[sort.list(clm[,1]),]
  for (i in 1:par3) {
    cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
  }
  cl$cluster <- as.factor(cl$cluster)
  print(cl$cluster)
  x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
  x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
  hc <- hclust(dist(x[,par1])^2, 'cen')
  print(hc)
  memb <- cutree(hc, k = par3)
  # relabel the groups in increasing order of their means
  dum <- c(mean(x[memb==1,par1]))
  for (i in 2:par3) {
    dum <- c(dum, mean(x[memb==i,par1]))
  }
  hcm <- matrix(cbind(dum,1:par3),ncol=2)
  hcm <- hcm[sort.list(hcm[,1]),]
  for (i in 1:par3) {
    memb[memb==hcm[i,2]] <- paste('C',i,sep='')
  }
  memb <- as.factor(memb)
  print(memb)
  x[,par1] <- memb
}
if (par2=='equal') {
  ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
  x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
# par2 == 'none': tree on the untransformed response
if (par2 == 'none') {
  m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# framework file; the table.* helpers used below are not defined in this listing
load(file='createtable')
# otherwise: classification tree on the categorised response
if (par2 != 'none') {
  m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
  if (par4=='yes') {
    # header rows of the cross-validation table
    a<-table.start()
    a<-table.row.start(a)
    a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'',1,TRUE)
    a<-table.element(a,'Prediction (training)',par3+1,TRUE)
    a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'Actual',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    a<-table.row.end(a)
    # 10 repetitions of a random split: each row goes to training (ind==1)
    # with probability 0.9, otherwise to testing (ind==2)
    for (i in 1:10) {
      ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
      m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
      # pool predictions and actuals over all repetitions
      if (i==1) {
        m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
        m.ct.i.actu <- x[ind==1,par1]
        m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
        m.ct.x.actu <- x[ind==2,par1]
      } else {
        m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
        m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
        m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
        m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
      }
    }
    # training: per-class and overall proportion correct
    print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
      numer <- numer + m.ct.i.tab[i,i]
    }
    print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
    # testing: per-class and overall proportion correct
    print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
      numer <- numer + m.ct.x.tab[i,i]
    }
    print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
    # body rows of the cross-validation table
    for (i in 1:par3) {
      a<-table.row.start(a)
      a<-table.element(a,paste('C',i,sep=''),1,TRUE)
      for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
      a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
      for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
      a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
      a<-table.row.end(a)
    }
    a<-table.row.start(a)
    a<-table.element(a,'Overall',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.i.cp,4))
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.x.cp,4))
    a<-table.row.end(a)
    a<-table.end(a)
    table.save(a,file='mytable3.tab')
  }
}
m
# plot of the fitted tree
bitmap(file='test1.png')
plot(m)
dev.off()
# response by terminal node
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
  forec <- predict(m)
  result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
  colnames(result) <- c('Actuals','Forecasts','Residuals')
  print(result)
}
if (par2 != 'none') {
  print(cbind(as.factor(x[,par1]),predict(m)))
  # confusion matrix: actuals in rows, predictions in columns
  myt <- table(as.factor(x[,par1]),predict(m))
  print(myt)
}
# diagnostic plots (par2 == 'none') or confusion matrix plot (otherwise)
bitmap(file='test2.png')
if(par2=='none') {
  op <- par(mfrow=c(2,2))
  plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
  plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
  plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
  plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
  par(op)
}
if(par2!='none') {
  plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
  # goodness-of-fit table; detcoef holds the correlation of forecasts and actuals
  detcoef <- cor(result$Forecasts,result$Actuals)
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Goodness of Fit',2,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'Correlation',1,TRUE)
  a<-table.element(a,round(detcoef,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'R-squared',1,TRUE)
  a<-table.element(a,round(detcoef*detcoef,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'RMSE',1,TRUE)
  a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
  a<-table.row.end(a)
  a<-table.end(a)
  table.save(a,file='mytable1.tab')
  # table of actuals, forecasts and residuals, one row per observation
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'#',header=TRUE)
  a<-table.element(a,'Actuals',header=TRUE)
  a<-table.element(a,'Forecasts',header=TRUE)
  a<-table.element(a,'Residuals',header=TRUE)
  a<-table.row.end(a)
  for (i in 1:length(result$Actuals)) {
    a<-table.row.start(a)
    a<-table.element(a,i,header=TRUE)
    a<-table.element(a,result$Actuals[i])
    a<-table.element(a,result$Forecasts[i])
    a<-table.element(a,result$Residuals[i])
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
  # confusion matrix table built from myt
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'',1,TRUE)
  for (i in 1:par3) {
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
  }
  a<-table.row.end(a)
  for (i in 1:par3) {
    a<-table.row.start(a)
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
    for (j in 1:par3) {
      a<-table.element(a,myt[i,j])
    }
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable2.tab')
}
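
The cross-validation loop in the code above draws a fresh random split at each of its 10 iterations, sending every row to training or testing with probabilities 0.9 and 0.1. A given row can therefore land in the testing set several times or not at all. For comparison, here is a minimal sketch of a partition-based 10-fold assignment in base R, in which every row is held out exactly once. The names `n`, `k`, `folds`, `test_idx`, and `train_idx` and the seed are illustrative assumptions; this is not the module's code:

```r
set.seed(1)   # arbitrary seed, only to make the sketch repeatable
n <- 146      # number of rows in Dataseries X
k <- 10       # number of folds

# Shuffle the fold labels 1..k so that fold sizes differ by at most one.
folds <- sample(rep(1:k, length.out = n))

for (i in 1:k) {
  test_idx  <- which(folds == i)   # rows held out in fold i
  train_idx <- which(folds != i)   # rows used for fitting in fold i
  # A model would be fitted on train_idx and evaluated on test_idx here.
}

print(table(folds))
```

With 146 rows and 10 folds, six folds contain 15 rows and four contain 14, so the pooled testing table would count each row once rather than a random number of times.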
