Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author: *Unverified author*

R Software Module: rwasp_regression_trees1.wasp

Title produced by software: Recursive Partitioning (Regression Trees)

Date of computation: Sun, 18 Dec 2011 04:43:44 -0500

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2011/Dec/18/t1324201447bjf3qiqqgiuagu1.htm/, Retrieved Sun, 05 May 2024 17:59:23 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=156649, Retrieved Sun, 05 May 2024 17:59:23 +0000

IsPrivate?

No (this computation is public)

Estimated Impact

119

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 20:13:50] [b98453cac15ba1066b407e146608df68]
- R PD  [Recursive Partitioning (Regression Trees)] [] [2011-12-12 15:35:06] [86f7284edee3dbb8ea5c7e2dec87d892]
-   PD      [Recursive Partitioning (Regression Trees)] [] [2011-12-18 09:43:44] [d41d8cd98f00b204e9800998ecf8427e] [Current]

Dataseries X:

0	210907	146283
0	120982	98364
1	176508	86146
0	385534	195663
0	149061	95757
1	165446	85584
1	237213	143983
1	133131	59238
1	324799	151511
0	230964	136368
1	236785	112642
0	135473	94728
0	215147	121527
1	344297	127766
0	153935	98958
1	174724	85646
1	174415	98579
0	225548	130767
1	223632	131741
1	124817	53907
0	210767	146761
0	170266	82036
0	294424	171975
1	325107	159676
1	7176	1929
0	106408	58391
0	96560	31580
1	265769	136815
0	149112	69107
1	175824	50495
0	152871	108016
1	111665	46341
1	362301	79336
0	183167	93176
1	168809	127969
1	24188	15049
1	329267	155135
1	218946	102996
1	244052	160604
1	341570	158051
0	103597	44547
1	256462	174141
0	235800	184301
1	196553	129847
1	174184	117286
0	143246	71180
1	187559	109377
0	187681	85298
1	73566	23824
0	167488	82981
0	143756	73815
0	243199	132190
1	182999	128754
1	152299	67808
1	346485	131722
1	193339	106175
1	122774	25157
0	130585	76669
1	112611	57283
1	286468	105805
1	148446	72413
0	182079	96971
1	140344	71299
1	220516	77494
1	243060	120336
1	162765	93913
1	232138	181248
0	265318	146123
1	85574	32036
0	310839	186646
0	225060	102255
1	232317	168237
0	144966	64219
1	164709	115338
1	220801	84845
0	99466	153197
1	92661	29877
1	133328	63506
1	61361	22445
1	100750	68370
0	102010	42071
1	101523	50517
1	243511	103950
1	22938	5841
1	152474	84396
0	99923	35753
1	132487	55515
0	317394	209056
1	21054	6622
1	209641	115814
0	22648	11609
0	31414	13155
1	46698	18274
1	131698	72875
1	244749	142775
0	128423	20112
0	97839	61023
1	272458	132432
1	108043	45109
0	328107	170875
1	351067	214921
0	158015	100226
1	229242	78876
1	84207	6940
0	120445	49025
0	324598	122037
0	131069	53782
0	204271	127748
0	116048	77395
1	250047	89324
1	299775	103300
0	195838	112283
1	173260	10901
0	254488	120691
1	92499	25899
0	224330	139296
0	135781	52678
1	74408	23853
0	81240	17306
1	181633	89455
1	271856	147866
1	95227	14336
0	98146	30059
0	59194	22097
1	139942	96841
0	118612	41907
1	72880	27080
1	65475	35885
1	71965	28313
0	135131	36134
0	108446	55764
1	181528	66956
1	134019	47487
0	121848	35619
0	81872	45608
0	58981	7721
0	53515	20634
1	56375	31931
1	65490	37754
1	76302	40557
1	104011	94238
0	98104	44197
1	30989	4103
0	135458	44144
1	63123	27640
1	74914	28990
0	31774	4694
1	81437	42648
1	65745	25836
1	56653	22779
1	158399	40820
1	73624	32378
1	91899	39613
1	139526	60865
0	51567	20107
0	102538	48231
1	86678	39725
1	150580	62991
1	99611	49363
0	99373	24552
0	86230	31493
0	30837	3439
1	31706	19555
1	89806	21228
0	64175	28893
0	59382	21425
0	119308	50276
0	76702	37643
1	19764	9927
0	84105	27184
1	64187	18475
1	72535	35873

Summary of computational transaction
Computing time	4 seconds
R Server	'Gertrude Mary Cox' @ cox.wessa.net

Source: https://freestatistics.org/blog/index.php?pk=156649&T=0

10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	1015	102	0.9087	115	8	0.935
C2	24	408	0.9444	3	45	0.9375
Overall	-	-	0.9187	-	-	0.9357

Source: https://freestatistics.org/blog/index.php?pk=156649&T=1
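
Each CV entry in the 10-Fold Cross Validation table is the share of correctly classified cases for that actual class, and the Overall row is the diagonal total divided by all cases. A minimal sketch in base R that recomputes these figures from the counts in the table (the helper name `class_accuracy` is an illustrative assumption and is not part of the software module):

```r
# Counts copied from the table (rows = actual class, columns = predicted class).
lab <- c("C1", "C2")
train <- matrix(c(1015, 102,
                    24, 408), nrow = 2, byrow = TRUE,
                dimnames = list(actual = lab, predicted = lab))
test <- matrix(c(115,  8,
                   3, 45), nrow = 2, byrow = TRUE,
               dimnames = list(actual = lab, predicted = lab))

# Hypothetical helper: per-class and overall share of correct predictions.
class_accuracy <- function(tab) {
  per_class <- diag(tab) / rowSums(tab)    # correct / all cases of that actual class
  overall   <- sum(diag(tab)) / sum(tab)   # all correct / all cases
  round(c(per_class, Overall = overall), 4)
}

print(class_accuracy(train))
print(class_accuracy(test))
```

The training counts sum to 1549 and the testing counts to 171, which together equal 1720, i.e. ten passes over the 172 rows of the dataset.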

Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	112	12
C2	1	47

Source: https://freestatistics.org/blog/index.php?pk=156649&T=2
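
The confusion matrix is built from the tree fitted on all 172 rows and evaluated on those same rows, so it describes in-sample fit. A short sketch, assuming only the four counts shown in the matrix, of how overall accuracy and per-class rates can be derived from it (the variable names are illustrative):

```r
# Counts copied from the confusion matrix (rows = actual, columns = predicted).
lab <- c("C1", "C2")
myt <- matrix(c(112, 12,
                  1, 47), nrow = 2, byrow = TRUE,
              dimnames = list(actual = lab, predicted = lab))

accuracy  <- sum(diag(myt)) / sum(myt)   # (112 + 47) / 172
recall    <- diag(myt) / rowSums(myt)    # share of each actual class that was found
precision <- diag(myt) / colSums(myt)    # share of each predicted class that was right

print(round(accuracy, 4))
print(round(rbind(recall, precision), 4))
```

Reading rows against columns this way shows which class absorbs most of the errors: 12 of the 13 misclassified cases are actual C1 cases predicted as C2.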

Figure 1

Figure 2

Figure 3

Parameters (Session):

par1 = 2 ; par2 = equal ; par3 = 2 ; par4 = yes ;

Parameters (R input):

par1 = 2 ; par2 = equal ; par3 = 2 ; par4 = yes ;
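
With par2 = equal and par3 = 2, the R code listed further down splits column par1 into par3 equal-width intervals using `cut()` and labels them C1, C2. A minimal sketch of that step, using three values taken from the second column of the dataset purely as an illustration (the vector name `values` is an assumption):

```r
par3 <- 2

# Three illustrative values from the second data column.
values <- c(7176, 210907, 385534)

# Equal-width binning: cut() divides the observed range into par3 intervals.
ed <- cut(values, par3, labels = paste('C', 1:par3, sep = ''))

print(data.frame(values, class = ed))
```

Because the intervals have equal width rather than equal counts, the resulting classes can be unbalanced; the quantiles option in the module uses `cut2()` instead when balanced groups are wanted.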

R code (references can be found in the software module):

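# party provides ctree(); Hmisc provides cut2() for quantile binning
library(party)
library(Hmisc)
# par1 = column index of the response, par3 = number of classes
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
# y is supplied by the server environment; transpose so each series is a column
x <- data.frame(t(y))
is.data.frame(x)
# drop rows with a missing response
x <- x[!is.na(x[,par1]),]
k <- ncol(x)
n <- nrow(x)
colnames(x)[par1]
x[,par1]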
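# Discretize the response into par3 classes labelled C1..C<par3>,
# using the method selected by par2.
if (par2 == 'kmeans') {
  cl <- kmeans(x[,par1], par3)
  print(cl)
  # pair each cluster center with its cluster id and sort by center,
  # so that C1 is the cluster with the smallest mean
  clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
  clm <- clm[sort.list(clm[,1]),]
  for (i in 1:par3) {
    cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
  }
  cl$cluster <- as.factor(cl$cluster)
  print(cl$cluster)
  x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
  # quantile groups of roughly equal size
  x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
  # centroid-linkage hierarchical clustering on squared distances
  hc <- hclust(dist(x[,par1])^2, 'cen')
  print(hc)
  memb <- cutree(hc, k = par3)
  # cluster means, used to order the labels from low to high
  dum <- c(mean(x[memb==1,par1]))
  for (i in 2:par3) {
    dum <- c(dum, mean(x[memb==i,par1]))
  }
  hcm <- matrix(cbind(dum,1:par3),ncol=2)
  hcm <- hcm[sort.list(hcm[,1]),]
  for (i in 1:par3) {
    memb[memb==hcm[i,2]] <- paste('C',i,sep='')
  }
  memb <- as.factor(memb)
  print(memb)
  x[,par1] <- memb
}
if (par2=='equal') {
  # par3 intervals of equal width over the observed range
  ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
  x[,par1] <- as.factor(ed)
}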
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
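# Ten repetitions of a random split: each row is assigned to the training set
# with probability 0.9 and to the test set with probability 0.1. The splits are
# drawn independently, so this is repeated hold-out validation rather than a
# partition of the rows into ten disjoint folds.
for (i in 1:10) {
  ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
  m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
  if (i==1) {
    # first repetition: initialise the pooled predictions and actuals
    m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
    m.ct.i.actu <- x[ind==1,par1]
    m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
    m.ct.x.actu <- x[ind==2,par1]
  } else {
    # later repetitions: append to the pooled vectors
    m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
    m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
    m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
    m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
  }
}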
# Training rows: pooled confusion table, per-class hit rates, overall hit rate
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
  print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
  numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
# Held-out rows: same quantities
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
  print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
  numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
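
The validation loop in the code above draws ten independent random 90/10 splits, so a given row may appear in several test sets or in none. For comparison, a partitioned 10-fold scheme places every row in exactly one test fold. The sketch below shows both assignments in base R; it is an alternative for illustration, not what the module does, and the seed and the variable names `fold`, `test_rows`, `train_rows`, and `test_sizes` are assumptions:

```r
set.seed(1)   # arbitrary seed, only to make the sketch repeatable
n <- 172      # number of rows in the dataset
k <- 10

# Scheme used by the module: independent assignment per row and per repetition.
ind <- sample(2, n, replace = TRUE, prob = c(0.9, 0.1))

# Alternative: shuffle fold labels so that each row belongs to exactly one fold.
fold <- sample(rep(1:k, length.out = n))

test_sizes <- integer(k)
for (i in 1:k) {
  test_rows  <- which(fold == i)   # rows held out in fold i
  train_rows <- which(fold != i)   # rows used for fitting in fold i
  # a model would be fitted on train_rows and evaluated on test_rows here
  test_sizes[i] <- length(test_rows)
}

print(table(fold))
print(sum(test_sizes))
```

With 172 rows and ten folds, the partition yields folds of 17 or 18 rows, and the test sets together cover each row once.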
