piping in Execute R

MahdiP
MahdiP New Altair Community Member
edited November 5 in Community Q&A

Hello everybody,

I am trying to use the dplyr package for data manipulation in the Execute R operator in RapidMiner. Almost everything works fine except piping!

I have this chunk of code that does not produce any output data set; instead I get "Memory buffered file" as the output message. I checked the log file and everything seems to be working flawlessly! The code reads:

 

rm_main = function(data)
{
  library(datasets)
  library(dplyr)

  NewData <- (
    data %>%
      mutate(Product2 = Tot_Product * 2) %>%
      filter(Life_Phase != "Family") %>%
      group_by(City, P_PLUS, Life_Phase) %>%
      summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE), AVG_Test = mean(Product2, na.rm = TRUE)) %>%
      select(City, P_PLUS, Life_Phase, AVG_Prem, AVG_Test)
  )
  return(NewData)
}

This code actually works in RStudio.

I appreciate your help.

Mahdi

Best Answers

  • Thomas_Ott
    Thomas_Ott New Altair Community Member
    Answer ✓

    To build on what @mschmitz said: the Execute R operator allows you to pass R objects to another Execute R operator, but to output the results into RapidMiner you first have to convert them to a data frame inside your function.
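
As a rough sketch of that conversion step (not the poster's exact code), the function below ends with `as.data.frame()` so that a plain data frame is what gets handed back. The column names are borrowed from the question; the pipeline is shortened for illustration, and it assumes dplyr is installed in the R environment that RapidMiner calls.

```r
library(dplyr)

# Shortened version of the pipeline from the question (illustrative only).
rm_main <- function(data) {
  result <- data %>%
    filter(Life_Phase != "Family") %>%
    group_by(City, P_PLUS, Life_Phase) %>%
    summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE))

  # group_by()/summarise() return a tibble; convert it to a plain
  # data.frame before returning it to RapidMiner.
  as.data.frame(result)
}
```

Outside RapidMiner, the function can be tried on any data frame that has the `City`, `P_PLUS`, `Life_Phase` and `TOT_prem` columns.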

  • MahdiP
    MahdiP New Altair Community Member
    Answer ✓

    Thank you so much for such a quick reply!

    It sounds like the group_by step is where things go wrong: after it, the result is no longer a plain data frame. It can be solved simply by wrapping the final result in data.frame() before returning it to RapidMiner, like this:


    rm_main = function(data)
    {
      library(datasets)
      library(dplyr)
      NewData <-
        data.frame(
          data %>%
            mutate(Product2 = Tot_Product * 2) %>%
            filter(Life_Phase != "Family") %>%
            group_by(City, P_PLUS, Life_Phase) %>%
            summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE), AVG_Test = mean(Product2, na.rm = TRUE)) %>%
            select(City, P_PLUS, Life_Phase, AVG_Prem, AVG_Test)
        )
      return(NewData)
    }

     

    Now it outputs a data set to RapidMiner.

    Thank you so much again.

    Mahdi

Answers

  • MartinLiebig
    MartinLiebig
    Altair Employee

    Dear Mahdi,

     

    RapidMiner cannot interpret every type of object you can create in R. If you send back something it cannot interpret, it arrives in RapidMiner as a memory buffered file. That object cannot be used in RapidMiner itself, but it can be piped into another R operator and used there. This is very useful, e.g. for modelling.
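
To illustrate the modelling case, here is a minimal sketch in plain R. The two function names are hypothetical stand-ins for the scripts of two chained R operators (inside RapidMiner each script would define its own `rm_main`), and it uses R's built-in `cars` data set rather than anything from this thread. How the model object is wired between the operators is assumed, not shown.

```r
# Stand-in for the first R operator: returns a fitted model.
# An lm object is not a data.frame, so it would be passed on as an R object.
fit_step <- function(data) {
  lm(dist ~ speed, data = data)
}

# Stand-in for the second R operator: takes the model and new data,
# and returns a plain data.frame that can become a data set.
predict_step <- function(model, new_data) {
  data.frame(
    speed = new_data$speed,
    predicted_dist = as.numeric(predict(model, newdata = new_data))
  )
}

model <- fit_step(cars)
preds <- predict_step(model, data.frame(speed = c(10, 20)))
class(model)  # "lm"
class(preds)  # "data.frame"
```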

     

    In your case, NewData does not seem to be an R data frame.
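
One way to check this outside RapidMiner is to look at `class()` of the pipeline result. The sketch below uses a small made-up data frame (the values are invented; only the column names come from the question) and assumes dplyr is installed. It shows that the output of group_by() and summarise() carries tibble classes on top of "data.frame", which is presumably why it was not treated as a plain data frame.

```r
library(dplyr)

# Hypothetical toy input standing in for the real data set.
toy <- data.frame(
  City = c("A", "A", "B", "B"),
  Life_Phase = c("Single", "Single", "Single", "Couple"),
  TOT_prem = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

grouped <- toy %>%
  group_by(City, Life_Phase) %>%
  summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE))

class(toy)                    # "data.frame"
class(grouped)                # several classes, including "tbl_df"
inherits(grouped, "tbl_df")   # TRUE
```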

     

    ~Martin
