Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*The author of this computation has been verified*

R Software Module

rwasp_regression_trees1.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Wed, 14 Dec 2011 08:57:35 -0500

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2011/Dec/14/t13238711260gow7dim51pmo4n.htm/, Retrieved Wed, 01 May 2024 18:30:37 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=154971, Retrieved Wed, 01 May 2024 18:30:37 +0000

QR Codes:

Paste this QR Code to cite your computation.

Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 19:50:12] [b98453cac15ba1066b407e146608df68]
- R PD    [Recursive Partitioning (Regression Trees)] [cross validation] [2011-12-14 13:57:35] [5363b79245edacd2d961915f77b3b63a] [Current]

Feedback Forum

Post a new message

Dataseries X:

Download CSV

Histogram

Boxplots

1	78	20	17	30	28
2	46	38	17	42	39
3	18	0	0	0	0
4	84	49	22	54	54
5	125	74	30	86	80
6	215	104	31	157	144
7	50	37	19	36	36
8	48	53	25	48	48
9	37	42	30	45	42
10	86	62	26	77	71
11	69	50	20	49	49
12	59	65	25	77	74
13	85	28	15	28	27
14	84	48	22	84	83
15	44	42	12	31	31
16	67	47	19	28	28
17	49	71	28	99	98
18	47	0	12	2	2
19	77	50	28	41	43
20	20	12	13	25	24
21	49	16	14	16	16
22	81	76	27	96	95
23	58	29	25	23	22
24	45	38	30	33	33
25	73	50	18	46	45
26	22	33	17	59	59
27	138	45	22	72	66
28	74	59	28	72	70
29	102	49	25	62	56
30	35	40	16	55	55
31	39	40	23	27	27
32	38	51	20	41	37
33	88	41	11	51	48
34	102	73	20	26	26
35	42	43	21	65	64
36	1	0	0	0	0
37	54	46	27	28	21
38	46	44	14	44	44
39	41	31	29	36	36
40	49	71	31	100	89
41	56	61	19	104	101
42	47	28	30	35	31
43	25	21	23	69	65
44	62	42	20	73	71
45	41	44	22	106	102
46	72	34	19	53	53
47	26	15	32	43	41
48	77	46	18	49	46
49	75	43	26	38	37
50	51	47	25	51	51
51	28	12	22	14	14
52	54	42	19	40	40
53	64	56	24	79	77
54	67	41	26	52	51
55	48	48	27	44	43
56	44	30	10	34	33
57	55	44	26	47	47
58	17	25	21	32	31
59	55	42	21	31	31
60	72	28	34	40	40
61	47	33	29	42	42
62	62	32	18	34	35
63	45	28	16	40	40
64	29	31	23	35	30
65	25	13	22	11	11
66	37	38	29	43	41
67	60	39	31	53	53
68	57	68	21	82	82
69	32	32	21	41	41
70	15	5	21	6	6
71	102	53	15	82	81
72	52	33	9	47	47
73	53	48	21	108	100
74	58	36	18	46	46
75	51	52	31	38	38
76	31	0	24	0	0
77	50	52	24	45	45
78	78	45	22	57	56
79	23	16	21	20	18
80	66	33	26	56	54
81	56	48	22	38	37
82	51	33	26	42	40
83	24	24	20	37	37
84	32	37	25	36	36
85	36	16	19	34	34
86	42	32	22	53	49
87	180	48	25	85	82
88	83	36	19	36	36
89	46	29	21	33	33
90	40	26	20	57	55
91	33	37	23	50	50
92	66	58	22	71	71
93	52	35	21	32	31
94	51	24	12	45	42
95	30	18	9	33	31
96	89	37	32	53	51
97	49	86	24	64	64
98	12	13	1	14	14
99	83	20	24	38	37
100	51	32	20	39	37
101	24	8	4	8	8
102	19	38	15	38	38
103	44	45	21	24	23
104	52	24	23	22	22
105	35	23	12	18	18
106	22	2	16	3	1
107	32	52	24	49	48
108	22	5	9	5	5
109	0	0	0	0	0
110	26	43	22	47	46
111	48	18	17	33	33
112	35	41	18	44	41
113	47	45	21	56	57
114	55	29	17	49	49
115	5	0	0	0	0
116	0	0	0	0	0
117	37	32	20	45	45
118	65	58	26	78	78
119	81	17	26	51	46
120	32	24	20	25	25
121	19	7	1	1	1
122	58	62	24	62	59
123	33	30	14	29	29
124	42	49	26	26	26
125	37	3	12	4	4
126	12	10	2	10	10
127	41	42	16	43	43
128	23	18	22	36	36
129	35	40	28	43	41
130	9	1	2	0	0
131	9	0	0	0	0
132	49	29	17	33	32
133	3	0	1	0	0
134	41	46	17	53	53
135	3	5	0	0	0
136	16	8	4	6	6
137	0	0	0	0	0
138	41	21	25	19	18
139	31	21	26	26	26
140	4	0	0	0	0
141	11	0	0	0	0
142	20	15	15	16	16
143	40	40	18	84	84
144	16	17	19	28	22

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	5 seconds
R Server	'George Udny Yule' @ yule.wessa.net

\begin{tabular}{lllllllll}
\hline
Summary of computational transaction \tabularnewline
Raw Input & view raw input (R code)  \tabularnewline
Raw Output & view raw output of R engine  \tabularnewline
Computing time & 5 seconds \tabularnewline
R Server & 'George Udny Yule' @ yule.wessa.net \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=154971&T=0

[TABLE]
[ROW][C]Summary of computational transaction[/C][/ROW]
[ROW][C]Raw Input[/C][C]view raw input (R code) [/C][/ROW]
[ROW][C]Raw Output[/C][C]view raw output of R engine [/C][/ROW]
[ROW][C]Computing time[/C][C]5 seconds[/C][/ROW]
[ROW][C]R Server[/C][C]'George Udny Yule' @ yule.wessa.net[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=154971&T=0

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=154971&T=0

As an alternative you can also use a QR Code:

The GUIDs for individual cells are displayed in the table below:

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	5 seconds
R Server	'George Udny Yule' @ yule.wessa.net

10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	648	0	1	92	0	1
C2	28	596	0.9551	2	74	0.9737
Overall	-	-	0.978	-	-	0.9881

\begin{tabular}{lllllllll}
\hline
10-Fold Cross Validation \tabularnewline
 & Prediction (training) & Prediction (testing) \tabularnewline
Actual & C1 & C2 & CV & C1 & C2 & CV \tabularnewline
C1 & 648 & 0 & 1 & 92 & 0 & 1 \tabularnewline
C2 & 28 & 596 & 0.9551 & 2 & 74 & 0.9737 \tabularnewline
Overall & - & - & 0.978 & - & - & 0.9881 \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=154971&T=1

[TABLE]
[ROW][C]10-Fold Cross Validation[/C][/ROW]
[ROW][C][/C][C]Prediction (training)[/C][C]Prediction (testing)[/C][/ROW]
[ROW][C]Actual[/C][C]C1[/C][C]C2[/C][C]CV[/C][C]C1[/C][C]C2[/C][C]CV[/C][/ROW]
[ROW][C]C1[/C][C]648[/C][C]0[/C][C]1[/C][C]92[/C][C]0[/C][C]1[/C][/ROW]
[ROW][C]C2[/C][C]28[/C][C]596[/C][C]0.9551[/C][C]2[/C][C]74[/C][C]0.9737[/C][/ROW]
[ROW][C]Overall[/C][C]-[/C][C]-[/C][C]0.978[/C][C]-[/C][C]-[/C][C]0.9881[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=154971&T=1

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=154971&T=1

As an alternative you can also use a QR Code:

The GUIDs for individual cells are displayed in the table below:

10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	648	0	1	92	0	1
C2	28	596	0.9551	2	74	0.9737
Overall	-	-	0.978	-	-	0.9881

Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	74	0
C2	3	67

\begin{tabular}{lllllllll}
\hline
Confusion Matrix (predicted in columns / actuals in rows) \tabularnewline
 & C1 & C2 \tabularnewline
C1 & 74 & 0 \tabularnewline
C2 & 3 & 67 \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=154971&T=2

[TABLE]
[ROW][C]Confusion Matrix (predicted in columns / actuals in rows)[/C][/ROW]
[ROW][C][/C][C]C1[/C][C]C2[/C][/ROW]
[ROW][C]C1[/C][C]74[/C][C]0[/C][/ROW]
[ROW][C]C2[/C][C]3[/C][C]67[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=154971&T=2

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=154971&T=2

As an alternative you can also use a QR Code:

The GUIDs for individual cells are displayed in the table below:

Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	74	0
C2	3	67

Figure 1

PNG link

Postscript link

PDF link

Figure 2

PNG link

Postscript link

PDF link

Figure 3

PNG link

Postscript link

PDF link

Parameters (Session):

par1 = kendall ;

Parameters (R input):

par1 = 5 ; par2 = quantiles ; par3 = 2 ; par4 = yes ;

R code (references can be found in the software module):

library(party)
library(Hmisc)
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
x <- data.frame(t(y))
is.data.frame(x)
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=T, prob=c(0.9,0.1))
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}

Free Statistics

Description of Statistical Computation

Tree of Dependent Computations

Dataset

Tables (Output of Computation)

Figures (Output of Computation)

Input Parameters & R Code