
Recursive Partitioning

*The author of this computation has been verified*
R Software Module: /rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Sat, 25 Dec 2010 09:46:58 +0000
 
Cite this page as follows:
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL http://www.freestatistics.org/blog/date/2010/Dec/25/t1293270479m63028km9fe0643.htm/, Retrieved Sat, 25 Dec 2010 10:48:00 +0100
 
BibTeX entries for LaTeX users:
@Manual{KEY,
    author = {{YOUR NAME}},
    publisher = {Office for Research Development and Education},
    title = {Statistical Computations at FreeStatistics.org, URL http://www.freestatistics.org/blog/date/2010/Dec/25/t1293270479m63028km9fe0643.htm/},
    year = {2010},
}
@Manual{R,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2010},
    note = {{ISBN} 3-900051-07-0},
    url = {http://www.R-project.org},
}
 
IsPrivate?
No (this computation is public)
 
Dataseries X:
13 13 14 13 3 12 12 8 13 5 15 10 12 16 6 12 9 7 12 6 10 10 10 11 5 12 12 7 12 3 15 13 16 18 8 9 12 11 11 4 12 12 14 14 4 11 6 6 9 4 11 5 16 14 6 11 12 11 12 6 15 11 16 11 5 7 14 12 12 4 11 14 7 13 6 11 12 13 11 4 10 12 11 12 6 14 11 15 16 6 10 11 7 9 4 6 7 9 11 4 11 9 7 13 2 15 11 14 15 7 11 11 15 10 5 12 12 7 11 4 14 12 15 13 6 15 11 17 16 6 9 11 15 15 7 13 8 14 14 5 13 9 14 14 6 16 12 8 14 4 13 10 8 8 4 12 10 14 13 7 14 12 14 15 7 11 8 8 13 4 9 12 11 11 4 16 11 16 15 6 12 12 10 15 6 10 7 8 9 5 13 11 14 13 6 16 11 16 16 7 14 12 13 13 6 15 9 5 11 3 5 15 8 12 3 8 11 10 12 4 11 11 8 12 6 16 11 13 14 7 17 11 15 14 5 9 15 6 8 4 9 11 12 13 5 13 12 16 16 6 10 12 5 13 6 6 9 15 11 6 12 12 12 14 5 8 12 8 13 4 14 13 13 13 5 12 11 14 13 5 11 9 12 12 4 16 9 16 16 6 8 11 10 15 2 15 11 15 15 8 7 12 8 12 3 16 12 16 14 6 14 9 19 12 6 16 11 14 15 6 9 9 6 12 5 14 12 13 13 5 11 12 15 12 6 13 12 7 12 5 15 12 13 13 6 5 14 4 5 2 15 11 14 13 5 13 12 13 13 5 11 etc...
 
Output produced by software:

Enter (or paste) a matrix (table) containing all data (time) series. Every column represents a different variable and must be delimited by a space or Tab. Every row represents a period in time (or category) and must be delimited by hard returns. The easiest way to enter data is to copy and paste a block of spreadsheet cells. Please do not use commas or spaces to separate groups of digits!


Summary of computational transaction
Raw Input: view raw input (R code)
Raw Output: view raw output of R engine
Computing time: 4 seconds
R Server: 'RServer@AstonUniversity' @ vre.aston.ac.uk


Goodness of Fit
Correlation: 0.655
R-squared: 0.429
RMSE: 2.6073
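
The three figures above follow from the actuals and forecasts in the way the R code on this page computes them: the correlation between forecasts and actuals, its square, and the root mean squared residual. Below is a minimal base-R sketch of that calculation. The helper name goodness_of_fit is hypothetical (it is not part of the module), and the four-value sample is taken from rows 1 to 4 of the Actuals/Forecasts table purely for illustration, so its output will not match the full-sample figures.

```r
# Hypothetical helper (not part of the original module): computes the three
# goodness-of-fit figures the same way the module's R code does.
goodness_of_fit <- function(actuals, forecasts) {
  residuals <- actuals - forecasts
  r <- cor(forecasts, actuals)            # Pearson correlation
  list(
    correlation = r,
    r_squared   = r * r,                  # squared correlation, as in the module
    rmse        = sqrt(mean(residuals^2)) # root mean squared residual
  )
}

# Illustrative input only: rows 1-4 of the Actuals/Forecasts table.
actuals   <- c(14, 8, 12, 7)
forecasts <- c(10, 12.2857142857143, 14.3529411764706, 12.2857142857143)
fit <- goodness_of_fit(actuals, forecasts)
print(fit)
```

Note that R-squared is defined here as the squared correlation between forecasts and actuals; in general this need not coincide with 1 minus the ratio of residual to total sum of squares.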


Actuals, Predictions, and Residuals
# Actuals Forecasts Residuals
1 14 10 4
2 8 12.2857142857143 -4.28571428571429
3 12 14.3529411764706 -2.35294117647059
4 7 12.2857142857143 -5.28571428571429
5 10 10.5555555555556 -0.555555555555555
6 7 10 -3
7 16 14.3529411764706 1.64705882352941
8 11 8.5 2.5
9 14 10 4
10 6 8.5 -2.5
11 16 10.5555555555556 5.44444444444444
12 11 10.5555555555556 0.444444444444445
13 16 14.3529411764706 1.64705882352941
14 12 8.5 3.5
15 7 10.5555555555556 -3.55555555555556
16 13 8.5 4.5
17 11 10.5555555555556 0.444444444444445
18 15 14.3529411764706 0.647058823529411
19 7 8.5 -1.5
20 9 8.5 0.5
21 7 8.5 -1.5
22 14 14.3529411764706 -0.352941176470589
23 15 10.5555555555556 4.44444444444444
24 7 10 -3
25 15 14.3529411764706 0.647058823529411
26 17 14.3529411764706 2.64705882352941
27 15 10.5555555555556 4.44444444444444
28 14 12.2857142857143 1.71428571428571
29 14 12.2857142857143 1.71428571428571
30 8 10 -2
31 8 10 -2
32 14 12.2857142857143 1.71428571428571
33 14 14.3529411764706 -0.352941176470589
34 8 8.5 -0.5
35 11 8.5 2.5
36 16 14.3529411764706 1.64705882352941
37 10 12.2857142857143 -2.28571428571429
38 8 10.5555555555556 -2.55555555555556
39 14 12.2857142857143 1.71428571428571
40 16 14.3529411764706 1.64705882352941
41 13 14.3529411764706 -1.35294117647059
42 5 10 -5
43 8 8.5 -0.5
44 10 8.5 1.5
45 8 10.5555555555556 -2.55555555555556
46 13 14.3529411764706 -1.35294117647059
47 15 14.3529411764706 0.647058823529411
48 6 8.5 -2.5
49 12 10.5555555555556 1.44444444444444
50 16 12.2857142857143 3.71428571428571
51 5 10.5555555555556 -5.55555555555556
52 15 10.5555555555556 4.44444444444444
53 12 12.2857142857143 -0.285714285714286
54 8 8.5 -0.5
55 13 14.3529411764706 -1.35294117647059
56 14 12.2857142857143 1.71428571428571
57 12 8.5 3.5
58 16 14.3529411764706 1.64705882352941
59 10 8.5 1.5
60 15 14.3529411764706 0.647058823529411
61 8 8.5 -0.5
62 16 14.3529411764706 1.64705882352941
63 19 14.3529411764706 4.64705882352941
64 14 14.3529411764706 -0.352941176470589
65 6 10.5555555555556 -4.55555555555556
66 13 14.3529411764706 -1.35294117647059
67 15 10.5555555555556 4.44444444444444
68 7 12.2857142857143 -5.28571428571429
69 13 14.3529411764706 -1.35294117647059
70 4 8.5 -4.5
71 14 14.3529411764706 -0.352941176470589
72 13 12.2857142857143 0.714285714285714
73 11 10.5555555555556 0.444444444444445
74 14 10.5555555555556 3.44444444444444
75 12 12.2857142857143 -0.285714285714286
76 15 12.2857142857143 2.71428571428571
77 14 12.2857142857143 1.71428571428571
78 13 12.2857142857143 0.714285714285714
79 8 10 -2
80 6 8.5 -2.5
81 7 8.5 -1.5
82 13 14.3529411764706 -1.35294117647059
83 13 14.3529411764706 -1.35294117647059
84 11 10.5555555555556 0.444444444444445
85 5 10 -5
86 12 12.2857142857143 -0.285714285714286
87 8 8.5 -0.5
88 11 12.2857142857143 -1.28571428571429
89 14 14.3529411764706 -0.352941176470589
90 9 8.5 0.5
91 10 14.3529411764706 -4.35294117647059
92 13 10.5555555555556 2.44444444444444
93 16 14.3529411764706 1.64705882352941
94 16 14.3529411764706 1.64705882352941
95 11 12.2857142857143 -1.28571428571429
96 8 8.5 -0.5
97 4 10.5555555555556 -6.55555555555556
98 7 8.5 -1.5
99 14 10.5555555555556 3.44444444444444
100 11 12.2857142857143 -1.28571428571429
101 17 14.3529411764706 2.64705882352941
102 15 14.3529411764706 0.647058823529411
103 17 14.3529411764706 2.64705882352941
104 5 8.5 -3.5
105 4 8.5 -4.5
106 10 14.3529411764706 -4.35294117647059
107 11 8.5 2.5
108 15 14.3529411764706 0.647058823529411
109 10 8.5 1.5
110 9 8.5 0.5
111 12 10.5555555555556 1.44444444444444
112 15 12.2857142857143 2.71428571428571
113 7 10.5555555555556 -3.55555555555556
114 13 14.3529411764706 -1.35294117647059
115 12 14.3529411764706 -2.35294117647059
116 14 14.3529411764706 -0.352941176470589
117 14 14.3529411764706 -0.352941176470589
118 8 10.5555555555556 -2.55555555555556
119 15 14.3529411764706 0.647058823529411
120 12 10 2
121 12 10 2
122 16 14.3529411764706 1.64705882352941
123 9 8.5 0.5
124 15 12.2857142857143 2.71428571428571
125 15 14.3529411764706 0.647058823529411
126 6 12.2857142857143 -6.28571428571429
127 14 14.3529411764706 -0.352941176470589
128 15 12.2857142857143 2.71428571428571
129 10 10.5555555555556 -0.555555555555555
130 6 8.5 -2.5
131 14 14.3529411764706 -0.352941176470589
132 12 14.3529411764706 -2.35294117647059
133 8 8.5 -0.5
134 11 12.2857142857143 -1.28571428571429
135 13 14.3529411764706 -1.35294117647059
136 9 10 -1
137 15 14.3529411764706 0.647058823529411
138 13 8.5 4.5
139 15 14.3529411764706 0.647058823529411
140 14 12.2857142857143 1.71428571428571
141 16 10 6
142 14 12.2857142857143 1.71428571428571
143 14 10 4
144 10 8.5 1.5
145 10 10 0
146 4 10.5555555555556 -6.55555555555556
147 8 10.5555555555556 -2.55555555555556
148 15 10.5555555555556 4.44444444444444
149 16 14.3529411764706 1.64705882352941
150 12 14.3529411764706 -2.35294117647059
151 12 12.2857142857143 -0.285714285714286
152 15 14.3529411764706 0.647058823529411
153 9 8.5 0.5
154 12 14.3529411764706 -2.35294117647059
155 14 14.3529411764706 -0.352941176470589
156 11 10 1
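
A regression tree predicts one constant per terminal node (for a numeric response, the mean response of the cases in that node), so the Forecasts column can take only as many distinct values as the tree has terminal nodes. In the table above there are five: 8.5, 10, 10.5555555555556, 12.2857142857143 and 14.3529411764706. The base-R sketch below checks this on rows 1 to 8 of the table; the variable names are illustrative and not part of the module.

```r
# Illustrative subset: rows 1-8 of the Actuals/Forecasts table.
actuals   <- c(14, 8, 12, 7, 10, 7, 16, 11)
forecasts <- c(10, 12.2857142857143, 14.3529411764706, 12.2857142857143,
               10.5555555555556, 10, 14.3529411764706, 8.5)

# Residuals are simply actual minus forecast.
residuals <- actuals - forecasts

# Each distinct forecast value corresponds to one terminal node of the tree.
node_values <- sort(unique(forecasts))
print(node_values)

# Number of cases (within this subset) assigned to each node value.
print(table(forecasts))
```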
 
Charts produced by software:
http://www.freestatistics.org/blog/date/2010/Dec/25/t1293270479m63028km9fe0643/2e0iw1293270411.png
http://www.freestatistics.org/blog/date/2010/Dec/25/t1293270479m63028km9fe0643/2e0iw1293270411.ps

http://www.freestatistics.org/blog/date/2010/Dec/25/t1293270479m63028km9fe0643/37ahz1293270411.png
http://www.freestatistics.org/blog/date/2010/Dec/25/t1293270479m63028km9fe0643/37ahz1293270411.ps

http://www.freestatistics.org/blog/date/2010/Dec/25/t1293270479m63028km9fe0643/4zjh21293270411.png
http://www.freestatistics.org/blog/date/2010/Dec/25/t1293270479m63028km9fe0643/4zjh21293270411.ps

 
Parameters (Session):
par1 = 3 ; par2 = none ; par3 = 4 ; par4 = no ;
 
Parameters (R input):
par1 = 3 ; par2 = none ; par3 = 4 ; par4 = no ;
 
R code (references can be found in the software module):
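library(party)   # ctree(), where()
library(Hmisc)   # cut2()
# par1..par4 and the data matrix y are not defined here; the server supplies them
par1 <- as.numeric(par1)   # column index of the response variable
par3 <- as.numeric(par3)   # number of categories (used only when par2 != 'none')
x <- data.frame(t(y))      # transpose y so that variables end up in columns
is.data.frame(x)
x <- x[!is.na(x[,par1]),]  # drop rows with a missing response
k <- length(x[1,])         # number of variables
n <- length(x[,1])         # number of observations
colnames(x)[par1]
x[,par1]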
# Optionally discretize the response into par3 categories
if (par2 == 'kmeans') {
  cl <- kmeans(x[,par1], par3)
  print(cl)
  # relabel clusters as C1..C<par3> in order of increasing cluster center
  clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
  clm <- clm[sort.list(clm[,1]),]
  for (i in 1:par3) {
    cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
  }
  cl$cluster <- as.factor(cl$cluster)
  print(cl$cluster)
  x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
  x[,par1] <- cut2(x[,par1],g=par3)   # quantile groups (Hmisc)
}
if (par2 == 'hclust') {
  hc <- hclust(dist(x[,par1])^2, 'cen')   # centroid linkage on squared distances
  print(hc)
  memb <- cutree(hc, k = par3)
  # mean response per cluster, used to order the labels C1..C<par3>
  dum <- c(mean(x[memb==1,par1]))
  for (i in 2:par3) {
    dum <- c(dum, mean(x[memb==i,par1]))
  }
  hcm <- matrix(cbind(dum,1:par3),ncol=2)
  hcm <- hcm[sort.list(hcm[,1]),]
  for (i in 1:par3) {
    memb[memb==hcm[i,2]] <- paste('C',i,sep='')
  }
  memb <- as.factor(memb)
  print(memb)
  x[,par1] <- memb
}
if (par2=='equal') {
  # par3 intervals of equal width
  ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
  x[,par1] <- as.factor(ed)
}
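table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
# Regression tree: numeric response explained by all other columns
if (par2 == 'none') {
  m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# createtable is a server-side file; it presumably defines the table.* helpers used below
load(file='createtable')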
# Classification tree: discretized response explained by all other columns
if (par2 != 'none') {
  m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
  if (par4=='yes') {
    a<-table.start()
    a<-table.row.start(a)
    a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'',1,TRUE)
    a<-table.element(a,'Prediction (training)',par3+1,TRUE)
    a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'Actual',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    a<-table.row.end(a)
    # Note: each of the 10 rounds draws an independent random split
    # (about 90% training, 10% testing) rather than using 10 disjoint folds
    for (i in 1:10) {
      ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
      m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
      if (i==1) {
        m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
        m.ct.i.actu <- x[ind==1,par1]
        m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
        m.ct.x.actu <- x[ind==2,par1]
      } else {
        # pool predictions and actuals over all rounds
        m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
        m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
        m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
        m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
      }
    }
    # training confusion table, per-class hit rate, overall hit rate
    print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
      numer <- numer + m.ct.i.tab[i,i]
    }
    print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
    # testing confusion table, per-class hit rate, overall hit rate
    print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
      numer <- numer + m.ct.x.tab[i,i]
    }
    print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
    for (i in 1:par3) {
      a<-table.row.start(a)
      a<-table.element(a,paste('C',i,sep=''),1,TRUE)
      for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
      a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
      for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
      a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
      a<-table.row.end(a)
    }
    a<-table.row.start(a)
    a<-table.element(a,'Overall',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.i.cp,4))
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.x.cp,4))
    a<-table.row.end(a)
    a<-table.end(a)
    table.save(a,file='mytable3.tab')
  }
}
m
# Chart 1: the fitted tree
bitmap(file='test1.png')
plot(m)
dev.off()
# Chart 2: response distribution per terminal node
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
  # regression case: fitted values and residuals
  forec <- predict(m)
  result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
  colnames(result) <- c('Actuals','Forecasts','Residuals')
  print(result)
}
if (par2 != 'none') {
  # classification case: confusion table of actual versus predicted class
  print(cbind(as.factor(x[,par1]),predict(m)))
  myt <- table(as.factor(x[,par1]),predict(m))
  print(myt)
}
# Chart 3: density plots (regression) or confusion matrix plot (classification)
bitmap(file='test2.png')
if(par2=='none') {
  op <- par(mfrow=c(2,2))
  plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
  plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
  plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
  plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
  par(op)
}
if(par2!='none') {
  plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
  # despite its name, detcoef holds the correlation; it is squared below for R-squared
  detcoef <- cor(result$Forecasts,result$Actuals)
  # Table: Goodness of Fit
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Goodness of Fit',2,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'Correlation',1,TRUE)
  a<-table.element(a,round(detcoef,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'R-squared',1,TRUE)
  a<-table.element(a,round(detcoef*detcoef,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'RMSE',1,TRUE)
  a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
  a<-table.row.end(a)
  a<-table.end(a)
  table.save(a,file='mytable1.tab')
  # Table: Actuals, Predictions, and Residuals (one row per observation)
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'#',header=TRUE)
  a<-table.element(a,'Actuals',header=TRUE)
  a<-table.element(a,'Forecasts',header=TRUE)
  a<-table.element(a,'Residuals',header=TRUE)
  a<-table.row.end(a)
  for (i in 1:length(result$Actuals)) {
    a<-table.row.start(a)
    a<-table.element(a,i,header=TRUE)
    a<-table.element(a,result$Actuals[i])
    a<-table.element(a,result$Forecasts[i])
    a<-table.element(a,result$Residuals[i])
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
  # Table: Confusion Matrix (classification case only)
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'',1,TRUE)
  for (i in 1:par3) {
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
  }
  a<-table.row.end(a)
  for (i in 1:par3) {
    a<-table.row.start(a)
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
    for (j in 1:par3) {
      a<-table.element(a,myt[i,j])
    }
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable2.tab')
}
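
The module above relies on objects that exist only on the server (the data matrix y, the par1 to par4 values, and the table.* helpers loaded from createtable), so it will not run unchanged elsewhere. The sketch below reproduces only the par2 == 'none' path (a regression tree with fitted values, residuals and the three goodness-of-fit figures). It assumes the party package is installed, and it uses R's built-in airquality data as a stand-in for Dataseries X; the dataset and the response column index are assumptions for illustration, not the data used on this page.

```r
library(party)  # provides ctree() and where()

# Stand-in data (assumption): complete cases of the built-in airquality set.
x <- airquality[complete.cases(airquality), ]
par1 <- 1  # column index of the response; here column 1 is Ozone

# Same formula construction as the module: response ~ all other columns.
f <- as.formula(paste(colnames(x)[par1], ' ~ .', sep = ''))
m <- ctree(f, data = x)

# Fitted values; as.numeric() flattens the result in case it is a one-column matrix.
forec <- as.numeric(predict(m))
result <- data.frame(Actuals   = x[, par1],
                     Forecasts = forec,
                     Residuals = x[, par1] - forec)

# Goodness-of-fit figures, computed as in the module.
r <- cor(result$Forecasts, result$Actuals)
cat('Correlation:', round(r, 4), '\n')
cat('R-squared:', round(r * r, 4), '\n')
cat('RMSE:', round(sqrt(mean(result$Residuals^2)), 4), '\n')

# Number of terminal nodes actually used by the fitted tree.
cat('Terminal nodes:', length(unique(where(m))), '\n')
```

To run the same steps on your own data, replace x with a data frame that has one variable per column and set par1 to the column index of the response.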
 





Copyright

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

Software written by Ed van Stee & Patrick Wessa

