Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

*The author of this computation has been verified*

R Software Module

rwasp_regression_trees1.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Tue, 13 Dec 2011 15:45:50 -0500

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2011/Dec/13/t132380916227qvs26ujsf4ciu.htm/, Retrieved Thu, 02 May 2024 16:41:22 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=154713, Retrieved Thu, 02 May 2024 16:41:22 +0000

IsPrivate?

No (this computation is public)

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 19:50:12] [b98453cac15ba1066b407e146608df68]
- R PD  [Recursive Partitioning (Regression Trees)] [] [2011-12-12 15:39:51] [d623f9be707a26b8ffaece1fc4d5a7ee]
-   P       [Recursive Partitioning (Regression Trees)] [] [2011-12-13 20:45:50] [4cf172296f32adf71d8383c359dbb80f] [Current]

Dataseries X:

264530	34	124252
135248	30	98956
207253	42	98073
202898	35	106816
145249	26	41449
65295	31	76173
439387	29	177551
33186	18	22807
183696	30	126938
190673	29	61680
287239	42	72117
205260	50	79738
141987	33	57793
322679	49	91677
199717	40	64631
349227	52	106385
276709	33	161961
273576	35	112669
157448	25	114029
242782	43	124550
256814	40	105416
405942	37	72875
161189	25	81964
156389	46	104880
200181	41	76302
192645	35	96740
249893	38	93071
241171	36	78912
143182	28	35224
285266	37	90694
243048	40	125369
176062	42	80849
305210	48	104434
87995	33	65702
343613	39	108179
264159	37	63583
394976	41	95066
192718	32	62486
114673	17	31081
310108	39	94584
292891	36	87408
157518	38	68966
180362	36	88766
146175	35	57139
140319	45	90586
405267	38	109249
78800	26	33032
201970	45	96056
309762	46	146648
166270	41	80613
199186	34	87026
24188	4	5950
346142	41	131106
65029	18	32551
101097	14	31701
255082	37	91072
287314	53	159803
308944	36	143950
280943	37	112368
225816	36	82124
348955	46	144068
283283	28	162627
199642	42	55062
232791	38	95329
212262	33	105612
201345	28	62853
180424	31	125976
204450	40	79146
197813	32	108461
138731	25	99971
219074	43	77826
73566	23	22618
219392	42	84892
181728	38	92059
150006	34	77993
325723	40	104155
265348	36	109840
202410	37	238712
173420	34	67486
162366	37	68007
136341	25	48194
390163	45	134796
145905	26	38692
248834	40	93587
80953	8	56622
133301	27	15986
138630	32	113402
334082	37	97967
277542	57	74844
170849	41	136051
236398	37	50548
207178	38	112215
157125	28	59591
242395	36	59938
273632	33	137639
178489	32	143372
210247	34	138599
268066	35	174110
351056	58	135062
368833	30	175681
247842	45	130307
268118	37	139141
174311	36	44244
43287	19	43750
182915	23	48029
189021	35	95216
237531	36	92288
279589	36	94588
106655	23	197426
135798	41	151244
292930	40	139206
266805	42	106271
23623	1	1168
174970	36	71764
61857	11	25162
147760	42	45635
358662	34	101817
21054	0	855
230091	27	100174
31414	8	14116
284519	38	85008
209481	44	124254
161691	40	105793
137093	28	117129
38214	8	8773
166059	36	94747
319346	47	107549
186273	48	97392
374269	45	126893
275578	48	118850
371645	50	234853
179928	40	74783
94381	32	66089
269169	37	95684
382564	42	139537
118033	35	144253
370878	42	153824
147989	34	63995
236370	41	84891
193456	36	61263
189020	32	106221
344751	35	113587
224936	35	113864
173260	21	37238
291777	45	119906
130908	49	135096
209639	36	151611
262412	39	144645
1	0	0
14688	0	6023
98	0	0
455	0	0
0	0	0
0	0	0
195822	33	77457
347930	47	62464
0	0	0
203	0	0
7199	0	1644
46660	5	6179
17547	1	3926
107465	38	42087
969	0	0
179994	28	87656

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	4 seconds
R Server	'George Udny Yule' @ yule.wessa.net

10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	528	213	0.7126	56	23	0.7089
C2	63	686	0.9159	8	63	0.8873
Overall	-	-	0.8148	-	-	0.7933
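
The CV columns can be recomputed from the counts in the table: each row's CV value is the diagonal count divided by the row total, and the overall value is the sum of the diagonal divided by the grand total. The following is a minimal base R sketch of that arithmetic. The names `train`, `test`, `class_rates` and `overall_rate` are illustrative and do not appear in the original module.

```r
# Counts copied from the table above (actuals in rows, predictions in columns).
labs <- list(actual = c("C1", "C2"), predicted = c("C1", "C2"))
train <- matrix(c(528, 213,
                   63, 686), nrow = 2, byrow = TRUE, dimnames = labs)
test  <- matrix(c(56, 23,
                   8, 63), nrow = 2, byrow = TRUE, dimnames = labs)

# Share of each actual class that was predicted correctly (row-wise).
class_rates <- function(tab) round(diag(tab) / rowSums(tab), 4)

# Share of all cases that were predicted correctly.
overall_rate <- function(tab) round(sum(diag(tab)) / sum(tab), 4)

print(class_rates(train))   # 0.7126 0.9159
print(overall_rate(train))  # 0.8148
print(class_rates(test))    # 0.7089 0.8873
print(overall_rate(test))   # 0.7933
```

The training counts sum to 1490 and the testing counts to 150, which together give 1640, or ten times the 164 observations in the data series.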

Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	63	19
C2	11	71
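
This matrix can also be read by column, which answers a different question: of the cases predicted to be in a class, how many actually belong to it. The module does not report these column-wise shares or the overall accuracy for this table, so the sketch below is an addition and its variable names are illustrative.

```r
# Counts copied from the confusion matrix above
# (actuals in rows, predictions in columns).
conf <- matrix(c(63, 19,
                 11, 71), nrow = 2, byrow = TRUE,
               dimnames = list(actual = c("C1", "C2"),
                               predicted = c("C1", "C2")))

# Row totals: number of observations in each actual class.
class_sizes <- rowSums(conf)

# Column-wise share: correct predictions among all predictions of a class.
col_share <- round(diag(conf) / colSums(conf), 4)

# Overall share of correct predictions.
accuracy <- round(sum(diag(conf)) / sum(conf), 4)

print(class_sizes)  # 82 82
print(col_share)    # 0.8514 0.7889
print(accuracy)     # 0.8171
```

Both actual classes contain 82 of the 164 observations, which is what a two-group quantile split of the response would be expected to produce.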

Figures 1, 2 and 3 (images not reproduced here; each is available on the source page in PNG, Postscript and PDF form)

Parameters (Session):

par1 = 1 ; par2 = quantiles ; par3 = 2 ; par4 = yes ;

Parameters (R input):

par1 = 1 ; par2 = quantiles ; par3 = 2 ; par4 = yes ;
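
With par2 = quantiles and par3 = 2, the response in column par1 is converted into two classes before the tree is fitted. The module does this with `cut2` from Hmisc. The sketch below is a base R approximation of the same idea, not the module's code: it uses `quantile` and `cut`, which may place boundaries differently from `cut2`, and the labels C1 and C2 are an assumption chosen to match the output tables.

```r
# First six values of the first column of the data series above.
v <- c(264530, 135248, 207253, 202898, 145249, 65295)

g <- 2  # number of groups, as in par3 = 2

# Break points at the 0%, 50% and 100% quantiles (a median split when g = 2).
breaks <- quantile(v, probs = seq(0, 1, length.out = g + 1))

# include.lowest = TRUE keeps the minimum inside the first interval.
cls <- cut(v, breaks = breaks, include.lowest = TRUE,
           labels = paste("C", seq_len(g), sep = ""))

print(data.frame(value = v, class = cls))
print(table(cls))
```

For these six values the median is 174073.5, so the three smaller values fall in C1 and the three larger ones in C2.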

R code (references can be found in the software module):

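library(party)   # provides ctree() (conditional inference trees)
library(Hmisc)   # provides cut2() (quantile-based grouping)
# par1 = column index of the response; par3 = number of classes
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
# y is not defined in this listing; it is presumably supplied by the server.
# Transposing it puts one observation per row and one variable per column.
x <- data.frame(t(y))
is.data.frame(x)
# drop observations whose response is missing
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]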
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
# Note: each of the 10 iterations draws a fresh random split in which every row
# goes to the training set with probability 0.9 and to the test set otherwise.
# The rows are not partitioned into 10 disjoint folds.
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
# pool training-set and test-set predictions over all iterations
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
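
The table titled '10-Fold Cross Validation' is built by a loop that draws ten independent random splits, each sending a row to the test set with probability 0.1, so a given row may be tested several times or not at all. For comparison, the sketch below shows one common way to assign rows to ten disjoint folds so that every row is tested exactly once. It is an alternative scheme, not what the module does, and the names `fold` and `fold_sizes` and the seed are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, used only to make the assignment repeatable

n <- 164     # number of rows in the data series above
k <- 10      # number of folds

# Repeat the fold labels 1..k until n labels exist, then shuffle them.
fold <- sample(rep(seq_len(k), length.out = n))

# Every row carries exactly one fold label; fold sizes differ by at most one.
fold_sizes <- table(fold)
print(fold_sizes)

# In iteration i, rows with fold != i would form the training set
# and rows with fold == i the test set, for example:
i <- 1
train_rows <- which(fold != i)
test_rows  <- which(fold == i)
```

With 164 rows and ten folds, four folds contain 17 rows and six contain 16.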
