Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*The author of this computation has been verified*

R Software Module

rwasp_regression_trees1.wasp

Title produced by software

Recursive Partitioning (Regression Trees)

Date of computation

Mon, 10 Dec 2012 09:28:01 -0500

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2012/Dec/10/t13551498323su30o2qcfljj7p.htm/, Retrieved Thu, 25 Apr 2024 23:17:21 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=198166, Retrieved Thu, 25 Apr 2024 23:17:21 +0000

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

109

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 20:30:15] [b98453cac15ba1066b407e146608df68]
- R PD    [Recursive Partitioning (Regression Trees)] [WS 9 Recursive pa...] [2012-12-10 14:28:01] [885fe6c051c4f145d5c497ce1b2b5522] [Current]

Dataseries X:

18897	22424	19364	19434	22831	23072	37471	14690
17518	22125	18586	18389	22727	22551	36160	13824
8632	7653	8225	8405	8344	8695	9197	9477
832	554	822	854	830	935	1051	1150
3351	3357	3270	3346	3235	3329	3480	3447
8	8	3	4	5	5	4	4
1	1	1	1	1	1	1	2
7	10	11	9	10	9	10	9
217	222	204	205	191	197	196	191
911	947	918	939	937	967	1007	962
1932	1901	1862	1921	1823	1879	1982	2003
274	267	270	267	269	271	281	276
131	109	87	66	68	64	76	81
1708	1668	1738	1715	1726	1771	1861	2079
2609	1965	2308	2424	2486	2594	2729	2720
133	32	119	89	93	107	102	23
2476	1933	2189	2335	2393	2487	2627	2697
10	37	23	21	22	27	21	18
1510	1616	1378	1605	1534	1654	1421	1650
6427	7719	8279	6133	11706	9235	24339	1324
3812	5127	5890	4487	7888	6772	9522	3656
724	99	154	157	367	153	171	-211
1560	1996	1917	1223	2860	1964	13508	-2076
156	113	166	116	435	166	975	-178
3	3	2	2	15	13	27	13
172	380	151	148	141	167	137	120
65	1700	163	568	80	2043	768	338
593	2931	292	1348	774	631	122	698
281	469	227	309	266	267	292	321
1191	145	747	874	38	275	423	218
72	57	2	32	32	57	16	21
113	-6	27	120	32	184	860	604
19	86	2	18	3	3	12	246
97	11	27	120	32	185	860	381
18897	22424	19364	19434	22831	23072	37471	14690
16770	18775	17704	16289	21687	20252	35933	13873
6132	5145	5705	5818	5817	6171	6504	6749
648	299	484	535	511	548	638	758
1739	1710	1776	1910	1843	1990	2141	2097
160	167	176	193	183	202	163	179
621	570	592	743	655	735	851	835
804	821	842	831	847	869	886	881
3	3	3	3	3	3	3	3
150	149	163	140	156	182	238	199
549	528	558	354	470	438	450	453
95	80	132	-46	88	49	57	62
354	353	339	323	308	314	325	322
100	95	87	77	73	75	68	70
342	343	357	354	339	343	343	305
2854	2265	2530	2664	2653	2852	2932	3135
167	99	168	132	137	147	132	35
2687	2166	2362	2533	2516	2705	2799	3100
645	770	634	680	581	675	814	677
6113	7729	8065	5931	11602	9279	24726	1473
3567	4697	5792	4959	8473	6753	9199	3926
472	241	87	262	330	64	172	-142
1665	2360	1934	584	2229	1972	13856	-2338
328	318	154	6	256	181	1301	-241
0	1	0	0	12	10	21	10
81	112	99	120	302	300	177	259
1322	1286	1317	1325	1314	1322	1308	1370
154	143	156	152	144	151	151	171
1277	1448	1340	1689	1529	1544	1264	1656
1127	2253	486	694	699	1110	1165	1776
456	1356	63	2861	89	82	1019	926
224	200	149	91	165	216	94	131
1444	1990	1445	176	888	2517	413	-264
3	8	4	7	40	38	-64	-10
1444	2084	1443	187	850	2483	489	-231

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	6 seconds
R Server	'George Udny Yule' @ yule.wessa.net
R Framework error message	Warning: there are blank lines in the 'Data X' field. Please, use NA for missing data - blank lines are simply deleted and are NOT treated as missing values.

10-Fold Cross Validation
	Prediction (training)			Prediction (testing)
Actual	C1	C2	CV	C1	C2	CV
C1	559	17	0.9705	50	4	0.9259
C2	0	53	1	0	7	1
Overall	-	-	0.973	-	-	0.9344
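
The CV columns can be reproduced from the counts in the table: each class rate is the diagonal count divided by its row total, and the overall rate is the sum of the diagonal divided by all cases. The R sketch below redoes that arithmetic. The names `train`, `test`, `class_rate`, and `overall_rate` are introduced here for illustration and do not appear in the original computation.

```r
# Counts copied from the cross-validation table (rows = actual, columns = predicted).
classes <- c("C1", "C2")
train <- matrix(c(559, 17,
                    0, 53), nrow = 2, byrow = TRUE,
                dimnames = list(actual = classes, predicted = classes))
test <- matrix(c(50, 4,
                  0, 7), nrow = 2, byrow = TRUE,
               dimnames = list(actual = classes, predicted = classes))

# Per-class rate: correctly classified cases divided by the row total.
class_rate <- function(tab) diag(tab) / rowSums(tab)

# Overall rate: sum of the diagonal divided by the total number of cases.
overall_rate <- function(tab) sum(diag(tab)) / sum(tab)

round(class_rate(train), 4)    # training, per class
round(overall_rate(train), 4)  # training, overall
round(class_rate(test), 4)     # testing, per class
round(overall_rate(test), 4)   # testing, overall
```

The training counts sum to 629 and the testing counts to 61, or 690 = 10 × 69 in total, which is consistent with the results of ten splits of the 69 observations being pooled.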

Confusion Matrix (predicted in columns / actuals in rows)
	C1	C2
C1	62	1
C2	0	6
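
As a reading aid, the full-sample accuracy implied by this matrix can be set against a naive baseline that always predicts the larger class. The sketch below is illustrative only; `cm` and the derived names are not part of the original module.

```r
# Counts copied from the confusion matrix (rows = actual, columns = predicted).
classes <- c("C1", "C2")
cm <- matrix(c(62, 1,
                0, 6), nrow = 2, byrow = TRUE,
             dimnames = list(actual = classes, predicted = classes))

n_obs        <- sum(cm)                    # total number of observations
accuracy     <- sum(diag(cm)) / n_obs      # share of correctly classified cases
class_counts <- rowSums(cm)                # actual class sizes
baseline     <- max(class_counts) / n_obs  # accuracy of always predicting the larger class

round(c(accuracy = accuracy, baseline = baseline), 4)
```

Because 63 of the 69 observations fall in C1, the baseline is already high, so the tree's accuracy is best read relative to it.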

Figure 1

Figure 2

Figure 3

Parameters (Session):

par1 = 8 ; par2 = equal ; par3 = 2 ; par4 = yes ;

Parameters (R input):

par1 = 8 ; par2 = equal ; par3 = 2 ; par4 = yes ;
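
With par2 = equal and par3 = 2, the R code below splits the response (column par1 = 8) into two equal-width intervals with `cut`. The following sketch applies the same call to the eighth value of each data row listed under Dataseries X, assuming that each row is one observation. `resp` and `bins` are illustrative names, not part of the module.

```r
# Eighth value of each of the 69 data rows, in the order listed under Dataseries X.
resp <- c(14690, 13824, 9477, 1150, 3447, 4, 2, 9, 191, 962,
          2003, 276, 81, 2079, 2720, 23, 2697, 18, 1650, 1324,
          3656, -211, -2076, -178, 13, 120, 338, 698, 321, 218,
          21, 604, 246, 381, 14690, 13873, 6749, 758, 2097, 179,
          835, 881, 3, 199, 453, 62, 322, 70, 305, 3135,
          35, 3100, 677, 1473, 3926, -142, -2338, -241, 10, 259,
          1370, 171, 1656, 1776, 926, 131, -264, -10, -231)

par3 <- 2  # number of categories, as in the parameters above

# Same call as in the module: par3 equal-width intervals labelled C1..C<par3>.
bins <- cut(as.numeric(resp), par3, labels = paste('C', 1:par3, sep = ''))

table(bins)  # class sizes after binning
```

Under that assumption the two classes contain 63 and 6 observations, which matches the row totals of the confusion matrix.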

R code (references can be found in the software module):

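library(party)  # provides ctree() and where()
library(Hmisc)  # provides cut2()
# par1..par4 and the data matrix y are not defined in this script;
# presumably the R framework supplies them before the script runs.
par1 <- as.numeric(par1)  # column index of the response variable
par3 <- as.numeric(par3)  # number of response categories
x <- data.frame(t(y))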
is.data.frame(x)
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# Presumably loads the table.* helper functions (table.start, table.element, ...) used below.
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
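# Note: despite the table title, this loop draws 10 independent random splits
# (roughly 90% training, 10% testing) rather than 10 disjoint folds.
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])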
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
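
The validation loop in the code above draws ten independent random splits with `sample(2, nrow(x), ...)`, so an observation may land in the test set several times or never. For comparison, a partitioned 10-fold assignment, which is a different scheme from the one the module uses, can be sketched in base R as follows. The names `n`, `k`, and `fold` are illustrative, and the model-fitting step is only indicated in a comment.

```r
set.seed(1)  # arbitrary seed, only to make the sketch repeatable
n <- 69      # number of observations in this computation
k <- 10      # number of folds

# Assign each observation to exactly one fold, with fold sizes as equal as possible.
fold <- sample(rep(1:k, length.out = n))

for (i in 1:k) {
  test_idx  <- which(fold == i)  # held-out observations for this fold
  train_idx <- which(fold != i)  # remaining observations used for fitting
  # A model would be fitted on x[train_idx, ] and evaluated on x[test_idx, ] here.
}

table(fold)  # fold sizes: nine folds of 7 observations and one of 6
```

With this scheme every observation is tested exactly once, so the pooled test table would contain 69 cases instead of a random number near 69.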
