"[SOLVED] using R in RapidMiner: how to use data from dataset in test in R"

Arjan
Arjan New Altair Community Member
edited November 5 in Community Q&A
Hello everybody,

I am new to the forum. The posts about installing R for RapidMiner have already been very helpful, so thank you for that.

I am trying a simple analysis with R in RM to get myself started. Now I have run into some problems. I hope someone here can point me in the right direction.

I made the following R-script to do a paired t-test in RM:

x1 <- c(1,2,5,7,9,0)
x2 <- c(2,3,4,3,6,4)
mytest.t <- t.test(x1, x2, paired=T, alternative="less")
pvalue <- mytest.t$statistic
result <- as.data.frame(pvalue)

that works fine.

When I try to run the test on data from one of the databases, I get an error. In this script, inputdata[2] and inputdata[6] are numeric columns from the dataset. They are otherwise accessible in the script (e.g. result <- as.data.frame(x1) returns the column as the result).

The script giving the error:

x1 <- inputdata[2]
x2 <- inputdata[6]
mytest.t <- t.test(x1, x2, paired=T, alternative="less")
pvalue <- mytest.t$statistic
result <- as.data.frame(pvalue)

gives the following error:

Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): Error in `[.data.frame`(y, yok) : undefined columns selected
Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): In addition: Warning message:
Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): package 'mlr' is not available (for R version 2.15.2)
Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): Error: object 'result' not found

What am I doing wrong?

Thank you in advance for reading and for any help.

best regards,

Arjan

Answers

  • Arjan
    Arjan New Altair Community Member
    Thank you. It took me a while to figure it out (and to find the time to do so). The problem was in how I referred to the columns: I needed to use their labels correctly.

    In the end this was my R-script:

    var1 <- inputdata$positionInfoSpeed
    var2 <- inputdata$positionInfoSpeed2

    mytest.t <- t.test(var1, var2, paired=T, alternative="less")

    result_table <- sapply(mytest.t,unlist)

    result <- as.data.frame(result_table)

    thank you!
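
For anyone who runs into the same "undefined columns selected" error: the likely cause is that inputdata[2] returns a one-column data frame rather than a numeric vector, and t.test() then fails when it tries to subset it. Using [[ ]] or $ returns the vector itself. Below is a minimal sketch of that difference. The inputdata here is a made-up stand-in for the RapidMiner example set (only the two column names come from the script above; the id column and all values are assumptions). It also builds a one-row result with the p-value taken from $p.value, since $statistic holds the t value.

```r
# Hypothetical stand-in for the RapidMiner example set; only the two
# positionInfoSpeed column names come from the thread, the values are made up.
inputdata <- data.frame(
  id                 = 1:6,
  positionInfoSpeed  = c(1, 2, 5, 7, 9, 0),
  positionInfoSpeed2 = c(2, 3, 4, 3, 6, 4)
)

# Single brackets keep the data.frame wrapper; double brackets (or $)
# return the underlying numeric vector that t.test() expects.
class(inputdata[2])      # "data.frame"
class(inputdata[[2]])    # "numeric"

var1 <- inputdata[["positionInfoSpeed"]]
var2 <- inputdata$positionInfoSpeed2

mytest.t <- t.test(var1, var2, paired = TRUE, alternative = "less")

# One-row data frame with the t statistic, degrees of freedom and p-value.
result <- data.frame(
  statistic = unname(mytest.t$statistic),
  df        = unname(mytest.t$parameter),
  p.value   = mytest.t$p.value
)
```

Selecting by label, as in the final script, is also more robust than selecting by position, because it keeps working if the column order in the example set changes.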