"[SOLVED] using R in RapidMiner: how to use data from dataset in test in R"
Arjan
New Altair Community Member
Hello everybody,
I am new to the forum. The posts about installing R for RapidMiner have already been very helpful, so thank you for that.
I am trying a simple analysis with R in RM to get myself started. Now I have run into some problems. I hope someone here can point me in the right direction.
I made the following R-script to do a paired t-test in RM:
x1 <- c(1,2,5,7,9,0)
x2 <- c(2,3,4,3,6,4)
mytest.t <- t.test(x1, x2, paired=TRUE, alternative="less")
pvalue <- mytest.t$p.value
result <- as.data.frame(pvalue)
That works fine.
When I try to run the test on data from one of the databases, I get an error. In the script below, inputdata[2] and inputdata[6] are numeric columns from the dataset, which are otherwise accessible in the script (e.g. result <- as.data.frame(x1) returns the column as the result).
The script giving the error:
x1 <- inputdata[2]
x2 <- inputdata[6]
mytest.t <- t.test(x1, x2, paired=TRUE, alternative="less")
pvalue <- mytest.t$p.value
result <- as.data.frame(pvalue)
gives the following error:
Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): Error in `[.data.frame`(y, yok) : undefined columns selected
Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): In addition: Warning message:
Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): package 'mlr' is not available (for R version 2.15.2)
Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): Error: object 'result' not found
What am I doing wrong?
Thank you in advance for reading and for any help.
Best regards,
Arjan
Answers
Hello
Here's an example that might help.
http://rapidminernotes.blogspot.co.uk/2011/06/counting-clusters-part-r.html
Regards,
Andrew
Thank you. It took me a while to figure it out (and to find time to do that). The problem was solved by referring to the columns by their names (labels).
In the end this was my R-script:
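# Select the two columns by name; `$` returns plain vectors
var1 <- inputdata$positionInfoSpeed
var2 <- inputdata$positionInfoSpeed2
mytest.t <- t.test(var1, var2, paired=TRUE, alternative="less")
# Unlist each component of the test result and collect them in a data frame
result_table <- sapply(mytest.t,unlist)
result <- as.data.frame(result_table)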
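For anyone who hits the same error, here is a minimal sketch of what seems to go wrong in the first attempt. Indexing a data frame with single brackets, as in inputdata[2], returns a one-column data frame rather than a numeric vector. t.test() then fails when it tries to subset that data frame, which would be consistent with the `[.data.frame`(y, yok) message in the log. Selecting with `$` or with double brackets returns the vector itself. The data frame below is a made-up stand-in for RapidMiner's inputdata, and the column names id, speed1 and speed2 are assumptions for illustration only.

```r
# Hypothetical stand-in for the inputdata object that RapidMiner passes in;
# the column names and values are invented for this example.
inputdata <- data.frame(
  id     = 1:6,
  speed1 = c(1, 2, 5, 7, 9, 0),
  speed2 = c(2, 3, 4, 3, 6, 4)
)

# Single brackets keep the data.frame wrapper around the column.
is.data.frame(inputdata[2])    # TRUE

# Double brackets and `$` both return the underlying numeric vector.
is.numeric(inputdata[[2]])     # TRUE
is.numeric(inputdata$speed1)   # TRUE

# Pass plain vectors to t.test().
x1 <- inputdata[[2]]
x2 <- inputdata[[3]]
mytest.t <- t.test(x1, x2, paired = TRUE, alternative = "less")

# Build a one-row data frame from the scalar parts of the result.
result <- data.frame(
  statistic = unname(mytest.t$statistic),
  df        = unname(mytest.t$parameter),
  p.value   = mytest.t$p.value
)
```

Picking out the individual components by name is one alternative to sapply(mytest.t, unlist): it gives one row with a predictable set of columns, at the cost of leaving out the confidence interval and the descriptive fields.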
Thank you!