piping in Execute R

User: "MahdiP"
New Altair Community Member
Updated by Jocelyn

Hello everybody,

I am trying to use the dplyr package for data manipulation in the Execute R operator in RapidMiner. Almost everything works fine except for piping.

The chunk of code below does not produce an output dataset; instead I get "Memory buffered file" as the output message. I checked the log file and everything seems to run without errors. The code reads:

 

rm_main = function(data)
{
  library(datasets)
  library(dplyr)

  NewData <- (
    data %>%
      mutate(Product2 = Tot_Product * 2) %>%
      filter(Life_Phase != "Family") %>%
      group_by(City, P_PLUS, Life_Phase) %>%
      summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE), AVG_Test = mean(Product2, na.rm = TRUE)) %>%
      select(City, P_PLUS, Life_Phase, AVG_Prem, AVG_Test)
  )
  return(NewData)
}

This code actually works in RStudio.

I appreciate your help.

Mahdi

    User: "Thomas_Ott"
    New Altair Community Member
    Accepted Answer

    To build on what @mschmitz said: the Execute R operator lets you pass R objects to another Execute R operator, but to output the results to RapidMiner you first have to convert them to a data frame inside your function.
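
    As a rough illustration of why the conversion matters, the sketch below inspects the class of a piped dplyr result before and after conversion. It uses the built-in mtcars dataset purely as a stand-in for the real input, and the names summary_tbl and plain_df are made up for this example. It assumes dplyr is installed; that Execute R wants a plain data.frame is taken from the advice above, not verified here.

    ```r
    library(dplyr)

    # mtcars is a built-in dataset, used here only as a stand-in for the real input.
    # After group_by() + summarise() on two grouping columns, dplyr returns a
    # (still grouped) tibble rather than a plain data.frame.
    summary_tbl <- mtcars %>%
      group_by(cyl, gear) %>%
      summarise(avg_mpg = mean(mpg, na.rm = TRUE))

    # Inspect the class: it includes "tbl_df" in addition to "data.frame".
    print(class(summary_tbl))

    # Convert to a plain data.frame, dropping the tibble and grouping classes.
    plain_df <- as.data.frame(summary_tbl)
    print(class(plain_df))
    ```

    Calling class() on the object just before return() inside rm_main is a quick way to check what is actually being handed back to RapidMiner.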

    User: "MahdiP"
    New Altair Community Member
    OP
    Accepted Answer

    Thank you so much for such a quick reply!

    It sounds like the group_by step is where things go wrong: from that point on the result is no longer handed back as a data frame. It can be solved simply by wrapping the final result in data.frame() before returning it to RapidMiner, like this:


    rm_main = function(data)
    {
      library(datasets)
      library(dplyr)
      NewData <- data.frame(
        data %>%
          mutate(Product2 = Tot_Product * 2) %>%
          filter(Life_Phase != "Family") %>%
          group_by(City, P_PLUS, Life_Phase) %>%
          summarise(AVG_Prem = mean(TOT_prem, na.rm = TRUE), AVG_Test = mean(Product2, na.rm = TRUE)) %>%
          select(City, P_PLUS, Life_Phase, AVG_Prem, AVG_Test)
      )
      return(NewData)
    }

     

    Now it outputs a dataset to RapidMiner.

    Thank you so much again.

    Mahdi