Free Statistics


Author's title:
Author: *The author of this computation has been verified*
R Software Module: rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Fri, 09 Dec 2011 14:38:35 -0500
Cite this page as follows: Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2011/Dec/09/t13234595801p9bx197b3dmi7p.htm/, Retrieved Thu, 02 May 2024 21:44:57 +0000
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=153442, Retrieved Thu, 02 May 2024 21:44:57 +0000

Original text written by user:
IsPrivate? No (this computation is public)
User-defined keywords:
Estimated Impact: 125
Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)
-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 18:59:57] [b98453cac15ba1066b407e146608df68]
- R PD  [Recursive Partitioning (Regression Trees)] [Workshop 10 Recur...] [2011-12-09 18:45:00] [de8512d9b386046939a89973b76869e3]
- R  D    [Recursive Partitioning (Regression Trees)] [Workshop 10 Recur...] [2011-12-09 19:18:30] [de8512d9b386046939a89973b76869e3]
- R         [Recursive Partitioning (Regression Trees)] [Workshop 10 Recur...] [2011-12-09 19:22:17] [de8512d9b386046939a89973b76869e3]
- R  D        [Recursive Partitioning (Regression Trees)] [Workshop 10 Recur...] [2011-12-09 19:32:47] [de8512d9b386046939a89973b76869e3]
- R               [Recursive Partitioning (Regression Trees)] [Workshop 10 Recur...] [2011-12-09 19:38:35] [5c44e6aad476a1bab98fc6774eca4c08] [Current]
-  MP               [Recursive Partitioning (Regression Trees)] [Paper SHW Recursi...] [2011-12-16 14:43:06] [74be16979710d4c4e7c6647856088456]
- RMP               [Multiple Regression] [Paper SHW MLR 2] [2011-12-16 14:52:16] [de8512d9b386046939a89973b76869e3]
- R P               [Recursive Partitioning (Regression Trees)] [Paper SHW Recursi...] [2011-12-16 14:58:20] [de8512d9b386046939a89973b76869e3]
Feedback Forum

Dataseries X:
869	58	28	103	84786	98364	120982
2172	108	30	103	101193	96933	179321
901	49	22	51	38361	79234	123185
463	0	26	70	68504	42551	52746
371	1	18	22	22807	6853	33170
1495	86	44	148	71701	75851	173326
2187	104	40	124	80444	93163	258873
1491	63	34	70	53855	96037	180083
1036	82	23	66	99645	94728	135473
1882	115	36	134	114789	105499	202925
1220	50	25	84	65553	98958	153935
1289	83	39	156	97500	77900	132943
1812	105	33	110	77873	178812	221698
1731	114	43	158	90183	163253	260561
807	38	30	109	61542	27032	84853
1940	71	32	92	55813	86572	215641
1499	59	28	70	55461	85371	167542
2747	106	30	93	70106	120642	269651
2099	34	39	31	71570	78348	116408
918	20	26	66	33032	56968	78800
3373	115	39	133	139077	161632	277965
1713	85	33	113	71595	87850	150629
1438	76	28	100	72260	127969	168809
496	8	4	7	5950	15049	24188
744	21	18	61	32551	25109	65029
1161	30	14	41	31701	45824	101097
2694	92	28	102	120733	162647	233328
1769	75	28	99	73107	60622	206161
3148	128	38	129	132068	179566	311473
2474	105	23	62	149193	184301	235800
2084	55	36	73	46821	75661	177939
1954	56	32	114	87011	96144	207176
1389	72	25	70	55183	117286	174184
2269	75	36	116	73511	109377	187559
1268	118	23	74	78664	73631	119016
1943	77	40	138	70054	86767	182192
1762	66	40	151	74011	93487	194979
1857	116	33	115	93133	94552	275541
1502	73	34	104	225920	128754	182999
1441	99	30	108	62133	66363	135649
1416	53	22	69	43836	61724	120221
1317	30	26	99	38692	68580	145790
870	49	8	27	56622	55792	80953
2008	75	45	93	67267	129484	241066
1885	68	33	69	41140	87831	204713
1369	81	28	99	138599	136048	182613
2845	130	24	85	162901	186646	310839
1391	39	32	50	37510	64219	144966
602	13	19	64	43750	19630	43287
1743	74	20	31	40652	76825	155754
2014	109	31	92	85872	109427	201940
2143	151	32	106	89275	118168	235454
1401	54	31	114	120662	146304	224549
530	23	11	30	25162	24610	61857
387	4	0	0	855	6622	21054
1742	62	24	60	97068	115814	209641
449	18	8	9	14116	13155	31414
1606	64	40	140	110681	68847	184510
568	16	8	21	8773	13983	38214
1459	48	35	124	83209	65176	151101
1955	130	38	120	103487	180759	250579
1002	59	31	114	71220	100226	158015
956	32	28	78	56926	54454	85439
3604	95	40	141	115168	170745	351619
1035	14	30	101	111194	6940	84207
1701	70	32	90	51633	86839	165543
1249	19	27	36	75345	44830	141722
3352	91	31	97	98952	103300	299775
1369	135	41	148	123969	58106	104389
2201	87	32	105	135400	122422	199476
207	4	0	0	6023	7953	14688
151	7	0	0	1644	4245	7199
474	12	5	13	6179	21509	46660
141	0	1	4	3926	7670	17547
1318	46	24	46	73224	53608	152601
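The data block above contains 75 observations on 7 tab-separated variables; the page does not record variable names. A minimal sketch for loading it locally, assuming it is saved as dataseries_x.txt and using placeholder column names V1..V7:

dat <- read.table('dataseries_x.txt', header=FALSE, sep='\t',
col.names=paste('V',1:7,sep=''))
dim(dat)      # 75 rows, 7 columns
summary(dat)  # quick numeric summaries of each column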




Summary of computational transaction
Raw Input: view raw input (R code)
Raw Output: view raw output of R engine
Computing time: 3 seconds
R Server: 'Gwilym Jenkins' @ jenkins.wessa.net

\begin{tabular}{lllllllll}
\hline
Summary of computational transaction \tabularnewline
Raw Input & view raw input (R code)  \tabularnewline
Raw Output & view raw output of R engine  \tabularnewline
Computing time & 3 seconds \tabularnewline
R Server & 'Gwilym Jenkins' @ jenkins.wessa.net \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=153442&T=0

[TABLE]
[ROW][C]Summary of computational transaction[/C][/ROW]
[ROW][C]Raw Input[/C][C]view raw input (R code) [/C][/ROW]
[ROW][C]Raw Output[/C][C]view raw output of R engine [/C][/ROW]
[ROW][C]Computing time[/C][C]3 seconds[/C][/ROW]
[ROW][C]R Server[/C][C]'Gwilym Jenkins' @ jenkins.wessa.net[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=153442&T=0

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=153442&T=0








Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	58	5
C2	2	10

\begin{tabular}{lllllllll}
\hline
Confusion Matrix (predicted in columns / actuals in rows) \tabularnewline
 & C1 & C2 \tabularnewline
C1 & 58 & 5 \tabularnewline
C2 & 2 & 10 \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=153442&T=1

[TABLE]
[ROW][C]Confusion Matrix (predicted in columns / actuals in rows)[/C][/ROW]
[ROW][C][/C][C]C1[/C][C]C2[/C][/ROW]
[ROW][C]C1[/C][C]58[/C][C]5[/C][/ROW]
[ROW][C]C2[/C][C]2[/C][C]10[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=153442&T=1

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=153442&T=1

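As a quick check on the confusion matrix above (an illustrative computation, not part of the original output): 58 + 10 of the 75 observations are classified correctly, so the apparent in-sample accuracy is 68/75, about 0.907.

cm <- matrix(c(58, 5,
2, 10), nrow=2, byrow=TRUE,
dimnames=list(actual=c('C1','C2'), predicted=c('C1','C2')))
sum(diag(cm)) / sum(cm)   # overall in-sample accuracy = 68/75
diag(cm) / rowSums(cm)    # per-class accuracy: 58/63 for C1, 10/12 for C2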



Parameters (Session):
par1 = pearson ; par2 = equal ; par3 = 2 ; par4 = no ;
Parameters (R input):
par1 = 5 ; par2 = equal ; par3 = 2 ; par4 = no ;
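Read against the code below, the R input parameters mean: par1 = 5 selects the fifth data column as the response, par2 = 'equal' discretizes that response into par3 = 2 equal-width classes labelled C1 and C2, and par4 = 'no' skips the optional 10-fold cross-validation table. A minimal sketch of the discretization step, assuming the data were loaded into a data frame dat as in the sketch after the data block above (object names are placeholders):

resp <- cut(dat[,5], breaks=2, labels=c('C1','C2'))  # equal-width binning of the response column
table(resp)                                          # class sizes before fitting the tree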
R code (references can be found in the software module):
library(party)
library(Hmisc)
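# Note: the table.start(), table.element(), table.row.start(), table.row.end(),
# table.end() and table.save() functions used below are not part of party or
# Hmisc; they appear to be server-side helpers loaded from the 'createtable' file.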
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
x <- data.frame(t(y))
is.data.frame(x)
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]
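# Discretize the response column x[,par1] according to par2: k-means clusters,
# quantile groups, hierarchical clusters, or equal-width intervals, in each
# case producing par3 classes labelled C1..Cpar3.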
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
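# Fit a conditional inference tree: a regression tree when the response is
# kept numeric (par2 == 'none'), otherwise a classification tree on the
# discretized response.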
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
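# Optional 10-fold cross-validation: refit the tree ten times on a random
# ~90% of the rows and accumulate training and hold-out confusion tables.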
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=T, prob=c(0.9,0.1))
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
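# Plot the fitted tree and the response distribution by terminal node.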
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
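# Collect actuals, predictions and residuals (regression case) or tabulate
# the confusion matrix (classification case).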
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
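# Diagnostic plots: kernel densities and actuals versus predictions for the
# regression case, or a mosaic plot of the confusion matrix otherwise.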
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
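# Regression case: goodness-of-fit summary (correlation, R-squared, RMSE)
# followed by a table of actuals, predictions and residuals.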
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
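# Classification case: write out the confusion matrix table.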
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
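The script expects a matrix y (one series per row) and the four par* values to be defined by the server before it runs. A minimal sketch of a local setup, assuming the data block above has been saved as dataseries_x.txt (object names are placeholders, and the table-writing branches would either need the server-side 'createtable' helpers or have to be skipped):

library(party)
library(Hmisc)
dat <- read.table('dataseries_x.txt', header=FALSE, sep='\t')
y <- t(as.matrix(dat))   # one series per row; the script transposes it back via data.frame(t(y))
par1 <- '5'              # response column (coerced to numeric by the script)
par2 <- 'equal'          # equal-width discretization of the response
par3 <- '2'              # number of classes
par4 <- 'no'             # skip the 10-fold cross-validation table
# ...then run the module code above, omitting load(file='createtable') and the
# table.start()/table.element()/table.row.*/table.end()/table.save() calls.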