"Rscript performance issue"
Hi all,
I'm using RStudio to write my R code and I want to use it in RapidMiner. The Rscript extension is great, but I noticed that code which executes in RStudio in less than 1 second takes up to 12 seconds in RapidMiner with the Rscript extension. For small datasets this is not really a problem, but for huge datasets I'm running into performance issues.
Have you faced similar issues? Is there any workaround to solve this problem?
A minimal example is:
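[tt]# rm_main is a mandatory function,
# the number of arguments has to be the number of input ports (can be none)
rm_main = function(data)
{
  # fields provides rdist.earth.vec
  library(fields)
  # Two-column coordinate matrix built from C_1 and C_2
  lotlan <- cbind(data$C_1, data$C_2)
  # xy1: every point; xy2: the following point, padded with NA for the last row
  xy1 <- lotlan
  xy2 <- rbind(lotlan[-1, , drop = FALSE], c(NA, NA))
  # Great-circle distance between each point and its successor
  GeoDistance <- rdist.earth.vec(xy1, xy2)
  data2 <- cbind(data, GeoDistance)
  return(data2)
}[/tt]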
where data$C_1 and data$C_2 represent geocoordinates.
Best regards,
creatX