
*The author of this computation has been verified*
R Software Module: /rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Fri, 10 Dec 2010 15:32:27 +0000
 
Cite this page as follows:
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL http://www.freestatistics.org/blog/date/2010/Dec/10/t1291996423is6mzj8uprkao20.htm/, Retrieved Fri, 10 Dec 2010 16:53:46 +0100
 
BibTeX entries for LaTeX users:
@Manual{KEY,
    author = {{YOUR NAME}},
    publisher = {Office for Research Development and Education},
    title = {Statistical Computations at FreeStatistics.org, URL http://www.freestatistics.org/blog/date/2010/Dec/10/t1291996423is6mzj8uprkao20.htm/},
    year = {2010},
}
@Manual{R,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2010},
    note = {{ISBN} 3-900051-07-0},
    url = {http://www.R-project.org},
}
 
IsPrivate?
No (this computation is public)
 
Dataseries X:
23 13 14 22 11 23 8 1 6 15
20 12 7 20 22 24 4 2 5 23
26 26 22 25 23 24 7 2 20 26
19 16 12 23 21 21 4 2 12 19
17 18 15 20 19 21 4 2 11 19
17 12 9 22 12 19 5 2 12 16
21 18 20 18 24 12 15 1 11 23
18 20 10 22 21 21 5 1 9 22
16 18 12 23 21 25 7 2 13 19
26 24 23 28 26 27 4 2 9 24
20 17 10 19 18 21 4 1 14 19
14 19 11 26 21 27 7 1 12 25
22 12 20 27 22 20 8 1 18 23
23 25 11 23 26 16 4 2 9 31
25 23 22 27 20 26 8 1 15 29
24 22 19 23 20 24 4 2 12 18
24 23 20 23 26 25 5 2 12 17
16 16 16 19 27 25 16 1 12 22
16 16 12 21 27 27 7 1 15 21
20 15 14 25 16 23 4 2 11 24
20 24 14 22 26 22 6 1 13 22
15 18 9 13 20 10 4 1 10 16
22 23 19 12 25 25 5 2 17 22
20 18 17 20 16 18 4 1 13 21
20 19 14 24 20 21 4 1 17 25
24 17 19 23 20 20 6 1 15 22
27 22 20 25 24 18 4 1 13 24
25 22 20 28 24 25 4 1 17 25
13 8 9 24 22 28 4 1 21 29
15 12 10 18 18 27 8 1 12 19
19 22 6 19 21 20 5 2 12 29
20 16 15 24 17 20 4 1 15 25
11 12 9 22 15 20 10 2 8 19
28 28 24 28 28 27 4 2 15 27
21 15 11 24 23 23 4 1 16 25
25 17 4 28 19 23 4 etc...
 
Output produced by software:

Enter (or paste) a matrix (table) containing all data (time) series. Every column represents a different variable and must be delimited by a space or Tab. Every row represents a period in time (or category) and must be delimited by hard returns. The easiest way to enter data is to copy and paste a block of spreadsheet cells. Please do not use commas or spaces to separate groups of digits!


Summary of computational transaction
Raw Input: view raw input (R code)
Raw Output: view raw output of R engine
Computing time: 7 seconds
R Server: 'George Udny Yule' @ 72.249.76.132


Goodness of Fit
Correlation: 0.5401
R-squared: 0.2917
RMSE: 3.6098
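
The three figures above are derived from the Actuals and Forecasts columns: the Pearson correlation between forecasts and actuals, its square, and the root mean squared residual. The following is a minimal sketch of that calculation. The function name `goodness_of_fit` is a hypothetical helper, and the six observations are only the first six rows of the Actuals/Forecasts table, so the results will not match the full-sample values.

```r
# Hypothetical helper: computes the same three statistics the module reports.
goodness_of_fit <- function(actuals, forecasts) {
  residuals <- actuals - forecasts
  r <- cor(forecasts, actuals)            # Pearson correlation
  list(correlation = r,
       r_squared   = r * r,               # reported as "R-squared"
       rmse        = sqrt(mean(residuals^2)))
}

# First six rows of the Actuals/Forecasts table (illustration only).
actuals   <- c(11, 22, 23, 21, 19, 12)
forecasts <- c(16.25, 16.25, 23.9666666666667,
               20.4361702127660, 20.4361702127660, 16.25)

fit <- goodness_of_fit(actuals, forecasts)
print(round(unlist(fit), 4))
```

On the full sample, the reported correlation of 0.5401 squares to roughly 0.2917, which is the R-squared value shown.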


Actuals, Predictions, and Residuals
#  Actuals  Forecasts  Residuals
1  11  16.25  -5.25
2  22  16.25  5.75
3  23  23.9666666666667  -0.966666666666665
4  21  20.4361702127660  0.563829787234042
5  19  20.4361702127660  -1.43617021276596
6  12  16.25  -4.25
7  24  20.4361702127660  3.56382978723404
8  21  20.4361702127660  0.563829787234042
9  21  20.4361702127660  0.563829787234042
10  26  23.9666666666667  2.03333333333333
11  18  20.4361702127660  -2.43617021276596
12  21  20.4361702127660  0.563829787234042
13  22  16.25  5.75
14  26  23.9666666666667  2.03333333333333
15  20  23.9666666666667  -3.96666666666667
16  20  20.4361702127660  -0.436170212765958
17  26  23.9666666666667  2.03333333333333
18  27  20.4361702127660  6.56382978723404
19  27  20.4361702127660  6.56382978723404
20  16  20.4361702127660  -4.43617021276596
21  26  23.9666666666667  2.03333333333333
22  20  20.4361702127660  -0.436170212765958
23  25  23.9666666666667  1.03333333333333
24  16  20.4361702127660  -4.43617021276596
25  20  20.4361702127660  -0.436170212765958
26  20  20.4361702127660  -0.436170212765958
27  24  20.4361702127660  3.56382978723404
28  24  20.4361702127660  3.56382978723404
29  22  16.25  5.75
30  18  16.25  1.75
31  21  20.4361702127660  0.563829787234042
32  17  20.4361702127660  -3.43617021276596
33  15  16.25  -1.25
34  28  23.9666666666667  4.03333333333333
35  23  20.4361702127660  2.56382978723404
36  19  20.4361702127660  -1.43617021276596
37  15  20.4361702127660  -5.43617021276596
38  26  23.9666666666667  2.03333333333333
39  20  23.9666666666667  -3.96666666666667
40  11  16.25  -5.25
41  17  20.4361702127660  -3.43617021276596
42  16  20.4361702127660  -4.43617021276596
43  21  20.4361702127660  0.563829787234042
44  18  20.4361702127660  -2.43617021276596
45  17  20.4361702127660  -3.43617021276596
46  21  20.4361702127660  0.563829787234042
47  18  20.4361702127660  -2.43617021276596
48  16  16.25  -0.25
49  13  16.25  -3.25
50  28  23.9666666666667  4.03333333333333
51  25  16.25  8.75
52  24  20.4361702127660  3.56382978723404
53  15  20.4361702127660  -5.43617021276596
54  21  20.4361702127660  0.563829787234042
55  11  16.25  -5.25
56  27  20.4361702127660  6.56382978723404
57  23  20.4361702127660  2.56382978723404
58  21  20.4361702127660  0.563829787234042
59  16  20.4361702127660  -4.43617021276596
60  20  16.25  3.75
61  21  23.9666666666667  -2.96666666666667
62  10  16.25  -6.25
63  18  23.9666666666667  -5.96666666666667
64  20  20.4361702127660  -0.436170212765958
65  21  23.9666666666667  -2.96666666666667
66  24  23.9666666666667  0.033333333333335
67  26  20.4361702127660  5.56382978723404
68  23  20.4361702127660  2.56382978723404
69  22  20.4361702127660  1.56382978723404
70  13  16.25  -3.25
71  27  23.9666666666667  3.03333333333333
72  24  20.4361702127660  3.56382978723404
73  19  20.4361702127660  -1.43617021276596
74  17  20.4361702127660  -3.43617021276596
75  16  20.4361702127660  -4.43617021276596
76  20  20.4361702127660  -0.436170212765958
77  8  16.25  -8.25
78  16  20.4361702127660  -4.43617021276596
79  17  20.4361702127660  -3.43617021276596
80  23  23.9666666666667  -0.966666666666665
81  18  20.4361702127660  -2.43617021276596
82  24  23.9666666666667  0.033333333333335
83  17  16.25  0.75
84  20  20.4361702127660  -0.436170212765958
85  22  20.4361702127660  1.56382978723404
86  22  20.4361702127660  1.56382978723404
87  20  20.4361702127660  -0.436170212765958
88  18  20.4361702127660  -2.43617021276596
89  21  20.4361702127660  0.563829787234042
90  23  20.4361702127660  2.56382978723404
91  28  23.9666666666667  4.03333333333333
92  19  20.4361702127660  -1.43617021276596
93  22  20.4361702127660  1.56382978723404
94  17  20.4361702127660  -3.43617021276596
95  25  23.9666666666667  1.03333333333333
96  22  20.4361702127660  1.56382978723404
97  21  20.4361702127660  0.563829787234042
98  15  20.4361702127660  -5.43617021276596
99  20  20.4361702127660  -0.436170212765958
100  25  20.4361702127660  4.56382978723404
101  21  20.4361702127660  0.563829787234042
102  24  23.9666666666667  0.033333333333335
103  23  20.4361702127660  2.56382978723404
104  22  23.9666666666667  -1.96666666666667
105  14  20.4361702127660  -6.43617021276596
106  11  20.4361702127660  -9.43617021276596
107  22  16.25  5.75
108  22  20.4361702127660  1.56382978723404
109  6  16.25  -10.25
110  15  20.4361702127660  -5.43617021276596
111  26  20.4361702127660  5.56382978723404
112  26  23.9666666666667  2.03333333333333
113  20  16.25  3.75
114  26  20.4361702127660  5.56382978723404
115  15  16.25  -1.25
116  25  20.4361702127660  4.56382978723404
117  22  23.9666666666667  -1.96666666666667
118  20  20.4361702127660  -0.436170212765958
119  18  16.25  1.75
120  23  16.25  6.75
121  22  20.4361702127660  1.56382978723404
122  23  20.4361702127660  2.56382978723404
123  17  20.4361702127660  -3.43617021276596
124  20  16.25  3.75
125  21  20.4361702127660  0.563829787234042
126  23  23.9666666666667  -0.966666666666665
127  25  23.9666666666667  1.03333333333333
128  25  20.4361702127660  4.56382978723404
129  21  20.4361702127660  0.563829787234042
130  22  20.4361702127660  1.56382978723404
131  18  20.4361702127660  -2.43617021276596
132  18  20.4361702127660  -2.43617021276596
133  18  23.9666666666667  -5.96666666666667
134  21  20.4361702127660  0.563829787234042
135  21  20.4361702127660  0.563829787234042
136  25  20.4361702127660  4.56382978723404
137  24  20.4361702127660  3.56382978723404
138  24  20.4361702127660  3.56382978723404
139  28  23.9666666666667  4.03333333333333
140  24  23.9666666666667  0.033333333333335
141  22  23.9666666666667  -1.96666666666667
142  22  20.4361702127660  1.56382978723404
143  20  20.4361702127660  -0.436170212765958
144  25  20.4361702127660  4.56382978723404
145  13  20.4361702127660  -7.43617021276596
146  21  20.4361702127660  0.563829787234042
147  23  20.4361702127660  2.56382978723404
148  18  20.4361702127660  -2.43617021276596
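
Only three distinct forecast values appear in the table (16.25, 20.4361702127660, and 23.9666666666667), which is what one would expect if the fitted tree has three terminal nodes and predicts the mean response within each node. As a check, the sketch below collects the 24 actuals whose forecast is 16.25 and confirms that their mean equals that forecast. The variable names are illustrative only.

```r
# Actuals of the 24 observations whose forecast is 16.25
# (rows 1, 2, 6, 13, 29, 30, 33, 40, 48, 49, 51, 55, 60, 62, 70, 77,
#  83, 107, 109, 113, 115, 119, 120, 124 of the table).
node_actuals <- c(11, 22, 12, 22, 22, 18, 15, 11, 16, 13, 25, 11,
                  20, 10, 13, 8, 17, 22, 6, 20, 15, 18, 23, 20)

# The node forecast is the mean of the actuals that fall into the node.
node_forecast <- mean(node_actuals)
print(node_forecast)                 # 16.25

# Residuals are deviations from the node mean, so they sum to zero.
node_residuals <- node_actuals - node_forecast
print(sum(node_residuals))           # 0
```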
 
Charts produced by software:
http://www.freestatistics.org/blog/date/2010/Dec/10/t1291996423is6mzj8uprkao20/2ax6r1291995139.png
http://www.freestatistics.org/blog/date/2010/Dec/10/t1291996423is6mzj8uprkao20/2ax6r1291995139.ps

http://www.freestatistics.org/blog/date/2010/Dec/10/t1291996423is6mzj8uprkao20/3ax6r1291995139.png
http://www.freestatistics.org/blog/date/2010/Dec/10/t1291996423is6mzj8uprkao20/3ax6r1291995139.ps

http://www.freestatistics.org/blog/date/2010/Dec/10/t1291996423is6mzj8uprkao20/4exne1291995139.png
http://www.freestatistics.org/blog/date/2010/Dec/10/t1291996423is6mzj8uprkao20/4exne1291995139.ps

 
Parameters (Session):
par1 = 2 ; par2 = Do not include Seasonal Dummies ; par3 = No Linear Trend ;
 
Parameters (R input):
par1 = 5 ; par2 = none ; par3 = 3 ; par4 = no ;
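
With par1 = 5 and par2 = none, the script treats the fifth column as a numeric response and regresses it on all remaining columns. The following is a minimal sketch of how that model formula is assembled. The column names V1 to V5 and the three-row data frame (the first five values of the first three data rows) are assumptions made for illustration, not the names used on the server.

```r
# Illustrative data frame: first five columns of the first three data rows.
# The column names are assumed; the real names come from the uploaded data.
x <- data.frame(V1 = c(23, 20, 26), V2 = c(13, 12, 26), V3 = c(14, 7, 22),
                V4 = c(22, 20, 25), V5 = c(11, 22, 23))

par1 <- as.numeric("5")  # the module converts par1 with as.numeric()

# Response is the par1-th column; "." stands for all other columns.
f <- as.formula(paste(colnames(x)[par1], " ~ .", sep = ""))
print(f)                 # V5 ~ .
```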
 
R code (references can be found in the software module):
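# party provides ctree(); Hmisc provides cut2()
library(party)
library(Hmisc)
par1 <- as.numeric(par1)  # column index of the response variable
par3 <- as.numeric(par3)  # number of categories used when discretizing
# y (the input data matrix) is defined outside this script
x <- data.frame(t(y))
is.data.frame(x)
x <- x[!is.na(x[,par1]),]  # drop rows with a missing response
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]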
# Optionally discretize the response into par3 categories
if (par2 == 'kmeans') {
  cl <- kmeans(x[,par1], par3)
  print(cl)
  clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
  clm <- clm[sort.list(clm[,1]),]
  for (i in 1:par3) {
    cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
  }
  cl$cluster <- as.factor(cl$cluster)
  print(cl$cluster)
  x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
  x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
  hc <- hclust(dist(x[,par1])^2, 'cen')
  print(hc)
  memb <- cutree(hc, k = par3)
  dum <- c(mean(x[memb==1,par1]))
  for (i in 2:par3) {
    dum <- c(dum, mean(x[memb==i,par1]))
  }
  hcm <- matrix(cbind(dum,1:par3),ncol=2)
  hcm <- hcm[sort.list(hcm[,1]),]
  for (i in 1:par3) {
    memb[memb==hcm[i,2]] <- paste('C',i,sep='')
  }
  memb <- as.factor(memb)
  print(memb)
  x[,par1] <- memb
}
if (par2=='equal') {
  ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
  x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
# Without discretization, fit a regression tree on the numeric response
if (par2 == 'none') {
  m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')  # presumably defines the table.* helpers used below
if (par2 != 'none') {
  m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
  if (par4=='yes') {
    a<-table.start()
    a<-table.row.start(a)
    a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'',1,TRUE)
    a<-table.element(a,'Prediction (training)',par3+1,TRUE)
    a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'Actual',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    a<-table.row.end(a)
    # Ten repeated random splits: each row goes to training (1) with
    # probability 0.9 and to testing (2) with probability 0.1
    for (i in 1:10) {
      ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
      m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
      if (i==1) {
        m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
        m.ct.i.actu <- x[ind==1,par1]
        m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
        m.ct.x.actu <- x[ind==2,par1]
      } else {
        m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
        m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
        m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
        m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
      }
    }
    # Per-class and overall hit rates on the training splits
    print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
      numer <- numer + m.ct.i.tab[i,i]
    }
    print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
    # Per-class and overall hit rates on the testing splits
    print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
      numer <- numer + m.ct.x.tab[i,i]
    }
    print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
    for (i in 1:par3) {
      a<-table.row.start(a)
      a<-table.element(a,paste('C',i,sep=''),1,TRUE)
      for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
      a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
      for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
      a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
      a<-table.row.end(a)
    }
    a<-table.row.start(a)
    a<-table.element(a,'Overall',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.i.cp,4))
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.x.cp,4))
    a<-table.row.end(a)
    a<-table.end(a)
    table.save(a,file='mytable3.tab')
  }
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
  forec <- predict(m)
  result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
  colnames(result) <- c('Actuals','Forecasts','Residuals')
  print(result)
}
if (par2 != 'none') {
  print(cbind(as.factor(x[,par1]),predict(m)))
  myt <- table(as.factor(x[,par1]),predict(m))
  print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
  op <- par(mfrow=c(2,2))
  plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
  plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
  plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
  plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
  par(op)
}
if(par2!='none') {
  plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
  # Despite its name, detcoef holds the correlation; it is squared below for R-squared
  detcoef <- cor(result$Forecasts,result$Actuals)
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Goodness of Fit',2,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'Correlation',1,TRUE)
  a<-table.element(a,round(detcoef,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'R-squared',1,TRUE)
  a<-table.element(a,round(detcoef*detcoef,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'RMSE',1,TRUE)
  a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
  a<-table.row.end(a)
  a<-table.end(a)
  table.save(a,file='mytable1.tab')
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'#',header=TRUE)
  a<-table.element(a,'Actuals',header=TRUE)
  a<-table.element(a,'Forecasts',header=TRUE)
  a<-table.element(a,'Residuals',header=TRUE)
  a<-table.row.end(a)
  for (i in 1:length(result$Actuals)) {
    a<-table.row.start(a)
    a<-table.element(a,i,header=TRUE)
    a<-table.element(a,result$Actuals[i])
    a<-table.element(a,result$Forecasts[i])
    a<-table.element(a,result$Residuals[i])
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'',1,TRUE)
  for (i in 1:par3) {
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
  }
  a<-table.row.end(a)
  for (i in 1:par3) {
    a<-table.row.start(a)
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
    for (j in 1:par3) {
      a<-table.element(a,myt[i,j])
    }
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable2.tab')
}
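
The block labelled '10-Fold Cross Validation' in the script above does not partition the rows into ten disjoint folds. In each of its ten iterations it draws a fresh random split in which every row goes to the training set with probability 0.9 and to the test set with probability 0.1. The sketch below contrasts one such split with a conventional fold assignment. The seed and the variable `folds` are assumptions for illustration, and the fold assignment is not part of the module.

```r
set.seed(1)   # assumed seed, only so that this sketch is reproducible
n <- 148      # number of observations in the Actuals/Forecasts table

# The module's approach (one of its ten iterations): an independent draw
# per row, so the test-set size varies and rows can repeat across iterations.
ind <- sample(2, n, replace = TRUE, prob = c(0.9, 0.1))
print(table(ind))

# A conventional alternative (not in the module): shuffle fold labels so
# that every row lands in exactly one of ten folds of near-equal size.
folds <- sample(rep(1:10, length.out = n))
print(table(folds))
```

With a fold assignment like this, each row would be used for testing exactly once, whereas under the module's scheme a row may appear in several test sets or in none.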
 





Copyright

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

Software written by Ed van Stee & Patrick Wessa


