"Clustering in Rapidminer and R extension"

nav New Altair Community Member
edited November 5 in Community Q&A
Hello,

I am not new to RapidMiner, but I am new to the R extension. I need to use R to calculate some statistics on the clustering output of a RapidMiner operator, but I don't know how to use RapidMiner's cluster set in the R script operator (I have already used the R export operator). In particular, what I want to do is, for example:

- run different clusterings with RapidMiner operators
- apply some statistics from an R script to the example set from the previous step, for example randIndex, silhouette, etc.

But how can I use the clustering example set in R?

Thanks in advance to all.

Answers

  • nav
    nav New Altair Community Member
    Is it possible that no one has ever needed to use R to calculate statistics from a cluster set generated by a RapidMiner operator?
  • Hello

    I wrote about this on my blog: http://rapidminernotes.blogspot.com/2011/06/counting-clusters-part-r.html

    The key part is the R code needed:

    library(mclust)
    library(profdpm)

    ## "data" is the name given to the input in the operator's inputs
    ## each column is referred to by name using $
    ## because the input is a data frame.
    ARI = adjustedRandIndex(data$cluster1, data$cluster2)
    ARI = as.data.frame(ARI)
    pci=as.data.frame(t(pci(data$cluster1, data$cluster2)))


    ## using the variable ARI sets the name of the returned data frame column
    ## and avoids having to use a rename process
    ## "x" is the output defined in the results
    ## it must be a data frame
    x = as.data.frame(cbind(ARI,pci))
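
    For the silhouette mentioned in the question, the same pattern might be applied as in the sketch below. This is only an illustration under assumptions: the column names a1, a2 and cluster are hypothetical, and the toy data frame stands in for the example set that RapidMiner would pass in as "data". It uses silhouette() from the cluster package and assumes the cluster labels arrive as nominal values such as "cluster_0".

    ```r
    library(cluster)  # provides silhouette()

    ## toy stand-in for the "data" input (hypothetical column names):
    ## two numeric attributes plus a nominal cluster label
    data <- data.frame(
      a1 = c(1.0, 1.2, 0.8, 5.0, 5.2, 4.8),
      a2 = c(1.0, 0.9, 1.1, 5.0, 5.1, 4.9),
      cluster = c("cluster_0", "cluster_0", "cluster_0",
                  "cluster_1", "cluster_1", "cluster_1")
    )

    ## silhouette() expects integer cluster codes and a dissimilarity object
    codes <- as.integer(factor(data$cluster))
    d <- dist(data[, c("a1", "a2")])
    sil <- silhouette(codes, d)

    ## average silhouette width, returned as a one-row data frame
    avgSilhouette <- mean(sil[, "sil_width"])
    x <- data.frame(avgSilhouette = avgSilhouette)
    ```

    In a real process the toy data frame would be dropped, and the attribute columns would be selected by whatever names the example set actually uses.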


    Andrew
  • nav
    nav New Altair Community Member
    Thank you very much... you gave me a very good starting point for my problem...