Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*Unverified author*

R Software Module

rwasp_regression_trees1.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Fri, 23 Dec 2011 10:55:06 -0500

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2011/Dec/23/t1324655753m0jk43q9q3kudpv.htm/, Retrieved Mon, 29 Apr 2024 18:25:40 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=160529, Retrieved Mon, 29 Apr 2024 18:25:40 +0000

QR Codes:

Paste this QR Code to cite your computation.

Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 18:59:57] [b98453cac15ba1066b407e146608df68]
- R PD  [Recursive Partitioning (Regression Trees)] [Deel 3 - tutorial...] [2011-12-21 21:25:31] [43a132f5d1d3e2c258a569e3803c6f06]
-   P       [Recursive Partitioning (Regression Trees)] [Deel 3 recursive ...] [2011-12-23 15:55:06] [d41d8cd98f00b204e9800998ecf8427e] [Current]

Feedback Forum

Post a new message

Dataseries X:

Download CSV

Histogram

Boxplots

112285	56	144	94	146283	30
84786	56	103	103	98364	28
83123	54	98	93	86146	38
101193	89	135	103	96933	30
38361	40	61	51	79234	22
68504	25	39	70	42551	26
119182	92	150	91	195663	25
22807	18	5	22	6853	18
116174	44	84	93	95757	26
57635	33	80	60	85584	25
66198	84	130	123	143983	38
71701	88	82	148	75851	44
57793	55	60	90	59238	30
80444	60	131	124	93163	40
53855	66	84	70	96037	34
97668	154	140	168	151511	47
133824	53	151	115	136368	30
101481	119	91	71	112642	31
99645	41	138	66	94728	23
114789	61	150	134	105499	36
99052	58	124	117	121527	36
67654	75	119	108	127766	30
65553	33	73	84	98958	25
97500	40	110	156	77900	39
69112	92	123	120	85646	34
82753	100	90	114	98579	31
85323	112	116	94	130767	31
72654	73	113	120	131741	33
30727	40	56	81	53907	25
77873	45	115	110	178812	33
117478	60	119	133	146761	35
74007	62	129	122	82036	42
90183	75	127	158	163253	43
61542	31	27	109	27032	30
101494	77	175	124	171975	33
55813	46	64	92	86572	32
79215	99	96	126	159676	36
55461	66	84	70	85371	28
31081	30	41	37	58391	14
83122	146	126	120	136815	32
70106	67	105	93	120642	30
60578	56	80	95	69107	35
79892	58	73	90	108016	28
49810	34	57	80	46341	28
71570	61	40	31	78348	39
100708	119	68	110	79336	34
33032	42	21	66	56968	26
82875	66	127	138	93176	39
139077	89	154	133	161632	39
71595	44	116	113	87850	33
72260	66	102	100	127969	28
115762	259	148	140	155135	39
32551	17	21	61	25109	18
31701	64	35	41	45824	14
80670	41	112	96	102996	29
143558	68	137	164	160604	44
120733	132	230	102	162647	28
105195	105	181	124	174141	35
73107	71	71	99	60622	28
132068	112	147	129	179566	38
149193	94	190	62	184301	23
46821	82	64	73	75661	36
87011	70	105	114	96144	32
95260	57	107	99	129847	29
55183	53	94	70	117286	25
106671	103	116	104	71180	27
73511	121	106	116	109377	36
92945	62	143	91	85298	28
78664	52	81	74	73631	23
70054	52	89	138	86767	40
22618	32	26	67	23824	23
74011	62	84	151	93487	40
83737	45	113	72	82981	28
69094	46	120	120	73815	34
93133	63	110	115	94552	33
95536	75	134	105	132190	28
62133	46	96	108	66363	30
61370	53	78	98	67808	33
43836	37	51	69	61724	22
106117	90	121	111	131722	38
38692	63	38	99	68580	26
84651	78	145	71	106175	35
56622	25	59	27	55792	8
15986	45	27	69	25157	24
95364	46	91	107	76669	29
89691	144	68	107	105805	29
67267	82	58	93	129484	45
126846	91	150	129	72413	37
41140	71	74	69	87831	33
102860	63	181	118	96971	33
51715	53	65	73	71299	25
55801	62	97	119	77494	32
111813	63	121	104	120336	29
120293	32	99	107	93913	28
138599	39	152	99	136048	28
161647	62	188	90	181248	31
115929	117	138	197	146123	52
162901	92	254	85	186646	24
109825	93	87	139	102255	41
129838	54	178	106	168237	33
37510	144	51	50	64219	32
43750	14	49	64	19630	19
40652	61	73	31	76825	20
87771	109	176	63	115338	31
85872	38	94	92	109427	31
89275	73	120	106	118168	32
140867	72	58	93	68370	30
120662	50	98	114	146304	31
101338	71	130	110	103950	42
65567	65	101	83	84396	32
25162	25	31	30	24610	11
40735	41	120	98	55515	36
91413	86	195	82	209056	31
97068	42	89	60	115814	24
14116	19	39	9	13155	8
76643	95	106	115	142775	33
110681	49	83	140	68847	40
92696	64	37	120	20112	38
94785	38	77	66	61023	24
83209	32	56	124	65176	35
93815	65	132	152	132432	43
86687	52	144	139	112494	43
105547	65	153	144	170875	41
103487	83	143	120	180759	38
71220	29	79	114	100226	31
56926	33	39	78	54454	28
91721	247	95	119	78876	31
115168	139	169	141	170745	40
111194	29	12	101	6940	30
135777	110	134	133	122037	37
51513	67	69	83	53782	30
74163	42	119	116	127748	35
51633	65	119	90	86839	32
75345	94	75	36	44830	27
98952	95	103	97	103300	31
102372	67	197	98	112283	31
37238	63	16	78	10901	21
103772	83	140	117	120691	39
123969	45	89	148	58106	41
135400	70	125	105	122422	32
130115	83	167	132	139296	39
64466	70	96	73	89455	30
54990	103	151	86	147866	37
34777	34	23	48	14336	32
73224	48	61	46	53608	24

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	4 seconds
R Server	'Gwilym Jenkins' @ jenkins.wessa.net

\begin{tabular}{lllllllll}
\hline
Summary of computational transaction \tabularnewline
Raw Input & view raw input (R code)  \tabularnewline
Raw Output & view raw output of R engine  \tabularnewline
Computing time & 4 seconds \tabularnewline
R Server & 'Gwilym Jenkins' @ jenkins.wessa.net \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=160529&T=0

[TABLE]
[ROW][C]Summary of computational transaction[/C][/ROW]
[ROW][C]Raw Input[/C][C]view raw input (R code) [/C][/ROW]
[ROW][C]Raw Output[/C][C]view raw output of R engine [/C][/ROW]
[ROW][C]Computing time[/C][C]4 seconds[/C][/ROW]
[ROW][C]R Server[/C][C]'Gwilym Jenkins' @ jenkins.wessa.net[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=160529&T=0

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=160529&T=0

As an alternative you can also use a QR Code:

The GUIDs for individual cells are displayed in the table below:

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	4 seconds
R Server	'Gwilym Jenkins' @ jenkins.wessa.net

10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	565	94	0.8574	56	15	0.7887
C2	230	412	0.6417	34	44	0.5641
Overall	-	-	0.751	-	-	0.6711

\begin{tabular}{lllllllll}
\hline
10-Fold Cross Validation \tabularnewline
 & Prediction (training) & Prediction (testing) \tabularnewline
Actual & C1 & C2 & CV & C1 & C2 & CV \tabularnewline
C1 & 565 & 94 & 0.8574 & 56 & 15 & 0.7887 \tabularnewline
C2 & 230 & 412 & 0.6417 & 34 & 44 & 0.5641 \tabularnewline
Overall & - & - & 0.751 & - & - & 0.6711 \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=160529&T=1

[TABLE]
[ROW][C]10-Fold Cross Validation[/C][/ROW]
[ROW][C][/C][C]Prediction (training)[/C][C]Prediction (testing)[/C][/ROW]
[ROW][C]Actual[/C][C]C1[/C][C]C2[/C][C]CV[/C][C]C1[/C][C]C2[/C][C]CV[/C][/ROW]
[ROW][C]C1[/C][C]565[/C][C]94[/C][C]0.8574[/C][C]56[/C][C]15[/C][C]0.7887[/C][/ROW]
[ROW][C]C2[/C][C]230[/C][C]412[/C][C]0.6417[/C][C]34[/C][C]44[/C][C]0.5641[/C][/ROW]
[ROW][C]Overall[/C][C]-[/C][C]-[/C][C]0.751[/C][C]-[/C][C]-[/C][C]0.6711[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=160529&T=1

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=160529&T=1

As an alternative you can also use a QR Code:

The GUIDs for individual cells are displayed in the table below:

10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	565	94	0.8574	56	15	0.7887
C2	230	412	0.6417	34	44	0.5641
Overall	-	-	0.751	-	-	0.6711

Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	72	1
C2	36	36

\begin{tabular}{lllllllll}
\hline
Confusion Matrix (predicted in columns / actuals in rows) \tabularnewline
 & C1 & C2 \tabularnewline
C1 & 72 & 1 \tabularnewline
C2 & 36 & 36 \tabularnewline
\hline
\end{tabular}
%Source: https://freestatistics.org/blog/index.php?pk=160529&T=2

[TABLE]
[ROW][C]Confusion Matrix (predicted in columns / actuals in rows)[/C][/ROW]
[ROW][C][/C][C]C1[/C][C]C2[/C][/ROW]
[ROW][C]C1[/C][C]72[/C][C]1[/C][/ROW]
[ROW][C]C2[/C][C]36[/C][C]36[/C][/ROW]
[/TABLE]
Source: https://freestatistics.org/blog/index.php?pk=160529&T=2

Globally Unique Identifier (entire table): ba.freestatistics.org/blog/index.php?pk=160529&T=2

As an alternative you can also use a QR Code:

The GUIDs for individual cells are displayed in the table below:

Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	72	1
C2	36	36

Figure 1

PNG link

Postscript link

PDF link

Figure 2

PNG link

Postscript link

PDF link

Figure 3

PNG link

Postscript link

PDF link

Parameters (Session):

par1 = 1 ; par2 = quantiles ; par3 = 2 ; par4 = yes ;

Parameters (R input):

par1 = 1 ; par2 = quantiles ; par3 = 2 ; par4 = yes ;

R code (references can be found in the software module):

library(party)
library(Hmisc)
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
x <- data.frame(t(y))
is.data.frame(x)
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=T, prob=c(0.9,0.1))
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}

Free Statistics

Description of Statistical Computation

Tree of Dependent Computations

Dataset

Tables (Output of Computation)

Figures (Output of Computation)

Input Parameters & R Code