Free Statistics of Irreproducible Research!

Author's title

Author: *The author of this computation has been verified*
R Software Module: rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Sat, 17 Dec 2011 08:28:46 -0500
Cite this page as follows: Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2011/Dec/17/t1324128586wm0m831iuig8qv9.htm/, Retrieved Fri, 29 Mar 2024 07:50:00 +0000
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=156272, Retrieved Fri, 29 Mar 2024 07:50:00 +0000

Original text written by user:
IsPrivate? No (this computation is public)
User-defined keywords:
Estimated Impact: 70
Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)
-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 18:59:57] [b98453cac15ba1066b407e146608df68]
- R PD    [Recursive Partitioning (Regression Trees)] [Recursive partiti...] [2011-12-17 13:28:46] [e7912d585babb6fa20e6bf5178c462ce] [Current]
Dataseries X:
33907	71433	152	74272	99	765
35981	53655	99	78867	128	1371
36588	70556	92	80176	57	1880
16967	74702	138	36541	95	232
25333	61201	106	55107	205	230
21027	686	95	45527	51	828
21114	87586	145	46001	59	1833
28777	6615	181	62854	194	906
35612	89725	190	78112	27	1781
24183	40420	150	52653	9	1264
22262	49569	186	48467	24	1123
20637	13963	174	44873	189	1461
29948	62508	151	65605	37	820
22093	90901	112	48016	81	107
36997	89418	143	81110	72	1349
31089	83237	120	68019	81	870
19477	22183	169	42198	90	1471
31301	24346	135	68531	216	731
18497	74341	161	40071	216	1945
30142	24188	98	65849	13	521
21326	11781	142	46362	153	1920
16779	23072	190	36313	185	1924
38068	49119	169	83521	131	100
29707	67776	130	64932	136	34
35016	86910	160	76730	182	325
26131	69358	176	56982	139	1677
29251	16144	111	63793	42	1779
22855	77863	165	49740	213	477
31806	89070	117	69447	184	1007
34124	34790	122	74708	44	1527




Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	3 seconds
R Server	'Gwilym Jenkins' @ jenkins.wessa.net
Source: https://freestatistics.org/blog/index.php?pk=156272&T=0







Goodness of Fit
Correlation	0.8971
R-squared	0.8047
RMSE	2871.5964
Source: https://freestatistics.org/blog/index.php?pk=156272&T=1
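The three figures in the Goodness of Fit table follow from the R code of this computation, which takes the Pearson correlation between forecasts and actuals, squares it, and computes the root of the mean squared residual. Written out as a sketch, with the symbols introduced here for illustration only ($y_i$ the actuals, $\hat{y}_i$ the forecasts, $n = 30$ observations):

```latex
\begin{align*}
r &= \operatorname{cor}(\hat{y}, y) = 0.8971, \\
R^2 &= r^2 = 0.8047, \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} = 2871.5964 .
\end{align*}
```

Squaring the rounded value 0.8971 gives 0.8048; the reported 0.8047 results from squaring the unrounded correlation and rounding afterwards.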







Actuals, Predictions, and Residuals
#	Actuals	Forecasts	Residuals
1	33907	33019.625	887.375
2	35981	33019.625	2961.375
3	36588	33019.625	3568.375
4	16967	21334.3571428571	-4367.35714285714
5	25333	21334.3571428571	3998.64285714286
6	21027	21334.3571428571	-307.357142857141
7	21114	21334.3571428571	-220.357142857141
8	28777	33019.625	-4242.625
9	35612	33019.625	2592.375
10	24183	21334.3571428571	2848.64285714286
11	22262	21334.3571428571	927.642857142859
12	20637	21334.3571428571	-697.357142857141
13	29948	33019.625	-3071.625
14	22093	21334.3571428571	758.642857142859
15	36997	33019.625	3977.375
16	31089	33019.625	-1930.625
17	19477	21334.3571428571	-1857.35714285714
18	31301	33019.625	-1718.625
19	18497	21334.3571428571	-2837.35714285714
20	30142	33019.625	-2877.625
21	21326	21334.3571428571	-8.3571428571413
22	16779	21334.3571428571	-4555.35714285714
23	38068	33019.625	5048.375
24	29707	33019.625	-3312.625
25	35016	33019.625	1996.375
26	26131	21334.3571428571	4796.64285714286
27	29251	33019.625	-3768.625
28	22855	21334.3571428571	1520.64285714286
29	31806	33019.625	-1213.625
30	34124	33019.625	1104.375
Source: https://freestatistics.org/blog/index.php?pk=156272&T=2
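The Forecasts column takes only two distinct values, which suggests a tree with two terminal nodes whose predictions are the mean response within each node. The following is a minimal base-R sketch, not the module's own code, that checks this reading against the table. The index vector `high` is read off the Forecasts column (an assumption made for illustration, not the tree's actual split rule), and all variable names are hypothetical.

```r
# Actuals from the table (first column of Dataseries X)
actuals <- c(33907, 35981, 36588, 16967, 25333, 21027, 21114, 28777, 35612, 24183,
             22262, 20637, 29948, 22093, 36997, 31089, 19477, 31301, 18497, 30142,
             21326, 16779, 38068, 29707, 35016, 26131, 29251, 22855, 31806, 34124)

# Rows whose reported forecast is 33019.625 (read off the table, not derived from the tree)
high <- c(1, 2, 3, 8, 9, 13, 15, 16, 18, 20, 23, 24, 25, 27, 29, 30)
node <- ifelse(seq_along(actuals) %in% high, "high", "low")

# Predict each observation by the mean response of its group
forecasts <- ave(actuals, node)
residuals <- actuals - forecasts

# Same quantities as in the Goodness of Fit table
r    <- cor(forecasts, actuals)
rmse <- sqrt(mean(residuals^2))

print(format(unique(forecasts), nsmall = 3))
print(round(c(correlation = r, r.squared = r^2, rmse = rmse), 4))
```

Under these assumptions the two group means coincide with the two forecast values in the table, and the rounded correlation, R-squared and RMSE agree with the reported 0.8971, 0.8047 and 2871.5964.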



Parameters (Session):
par1 = 1 ; par2 = none ; par3 = 3 ; par4 = no ;
Parameters (R input):
par1 = 1 ; par2 = none ; par3 = 3 ; par4 = no ;
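Judging from the R code of this computation, par1 selects the response column, par2 the method used to discretize the response ('none', 'kmeans', 'quantiles', 'hclust' or 'equal'), par3 the number of classes, and par4 whether the cross-validation table is produced. The sketch below illustrates how par1 = 1 and par2 = none lead to a regression formula on the first column. It is a base-R illustration under stated assumptions: the matrix `y` here is a hypothetical stand-in for the engine-supplied data (assumed to hold one data series per row), and only the first three observations of Dataseries X are used to keep it short.

```r
# Hypothetical stand-in for the engine-supplied matrix 'y': one data series per row.
# matrix() fills column by column, so each group of six values is one observation.
y <- matrix(c(33907, 71433, 152, 74272,  99,  765,
              35981, 53655,  99, 78867, 128, 1371,
              36588, 70556,  92, 80176,  57, 1880),
            nrow = 6)

par1 <- as.numeric("1")   # response column, as in the Parameters line
par2 <- "none"            # no discretization: the response stays numeric

# Transpose so that observations are in rows; columns are named X1..X6 by default
x <- data.frame(t(y))

# Formula used for the tree when par2 is 'none': response ~ all other columns
f <- as.formula(paste(colnames(x)[par1], " ~ .", sep = ""))

print(colnames(x))
print(f)
```

With any other value of par2 the module wraps the response in as.factor() instead, which turns the regression tree into a classification tree.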
R code (references can be found in the software module):
library(party)
library(Hmisc)
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
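# 'y' is not defined in this listing; it is presumably supplied by the R engine with one data series per row
x <- data.frame(t(y))
is.data.frame(x)
# Drop observations whose response (column par1) is missing
x <- x[!is.na(x[,par1]),]
# k = number of variables, n = number of observations
k <- length(x[1,])
n <- length(x[,1])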
colnames(x)[par1]
x[,par1]
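# Optionally discretize the response into par3 classes (kmeans, quantiles, hclust or equal-width bins)
if (par2 == 'kmeans') {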
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
# Numeric response: fit a regression tree on all remaining columns
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# Presumably loads the table.* helper functions used below; they are not defined in this listing
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
for (i in 1:10) {
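# Each of the 10 repetitions draws an independent random split (about 90% training, 10% testing); the splits do not partition the data into folds
ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))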
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
# Despite its name, detcoef holds the Pearson correlation; it is squared below to give R-squared
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
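
The optional table titled '10-Fold Cross Validation' in the listing above is filled from ten independent random splits drawn with sample(2, nrow(x), ...) using probabilities 0.9 and 0.1, so a given observation can land in the test set several times or never. For comparison, a conventional 10-fold scheme assigns every observation to exactly one test fold. The following is a minimal base-R sketch of that alternative, not part of the module; `n`, `k`, `folds`, `test_idx` and `train_idx` are names chosen here for illustration.

```r
set.seed(1)                       # arbitrary seed, only for repeatability
n <- 30                           # number of observations in this computation
k <- 10                           # number of folds

# Shuffle the fold labels 1..k so that each label occurs n/k times
folds <- sample(rep(1:k, length.out = n))

for (i in 1:k) {
  test_idx  <- which(folds == i)  # observations held out in fold i
  train_idx <- which(folds != i)  # observations used for fitting in fold i
  # a model would be fitted on train_idx and evaluated on test_idx here
}

print(table(folds))               # number of observations per fold
```

With n = 30 and k = 10 each fold holds three observations, and every observation is tested exactly once.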