[SOLVED] evaluating clusters and graphing

MarcosRL
MarcosRL New Altair Community Member
edited November 5 in Community Q&A
Hi Marius, now that the clustering process is done, I need to apply metrics to assess the quality of the clusters obtained, such as F-Measure, Entropy, and Overall Similarity.
Are there operators that support these metrics?
After that, I also need to visualize the clusters graphically, or in whatever way works best.
Is there an operator to plot the clusters?
Regards

Answers

  • MariusHelf
    MariusHelf New Altair Community Member
    Hi,

    even though I answer most of the posts, this is still a public forum, and I have the faint hope that at some point in the future there will also be others, namely experienced users, who answer other people's questions, so no need to address me directly :)

    To answer your last question first: you don't need a special operator to plot any results - just connect the clustered dataset to the process output and use the Plot View or the Advanced Charts to visualize your data. There is a series of blog posts covering the most common standard plotters, and for the advanced charts there is a 60-page document available on our website.
    From the standard plotters, the most interesting one for you is probably the scatter plot.

    A list of all relevant blog posts is available at the bottom of this post: http://rapid-i.com/component/option,com_myblog/show,The-RapidMiner-Plotters-16-Bars.html/Itemid,172/lang,en/
    A blog introducing the new advanced plotters is located here: http://rapid-i.com/component/option,com_myblog/show,New-Plotters-for-RapidMiner.html/Itemid,172/lang,en/
    The documentation of the advanced charts can be downloaded here: http://docs.rapid-i.com/r/rm-charts-en



    Concerning the evaluation of clusters: there is the group Evaluation / Performance Measurement / Clustering, where you can find relevant operators. The F-Measure can be used only with supervised learning tasks, so it's not applicable to clustering, unless you know the "true" cluster of each example beforehand.
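
To illustrate why F-Measure and Entropy need the "true" cluster of each example, here is a minimal sketch in plain Python, outside of RapidMiner. It is not how any RapidMiner operator is implemented. The function names `cluster_entropy` and `cluster_f_measure` are hypothetical, and the definitions used (size-weighted entropy per cluster, and the best F value per true class weighted by class size) are one common convention assumed here, not something taken from this thread.

```python
from collections import Counter
from math import log2


def cluster_entropy(true_labels, cluster_ids):
    """Size-weighted average entropy of the true labels inside each cluster.

    0.0 means every cluster contains a single true class (assumed definition).
    """
    n = len(true_labels)
    members = {}
    for label, cid in zip(true_labels, cluster_ids):
        members.setdefault(cid, []).append(label)

    total = 0.0
    for labels in members.values():
        size = len(labels)
        # Entropy of the true-label distribution within this cluster.
        h = -sum((c / size) * log2(c / size) for c in Counter(labels).values())
        total += (size / n) * h
    return total


def cluster_f_measure(true_labels, cluster_ids):
    """For each true class take the best F value over all clusters,
    then average those values weighted by class size (assumed definition)."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    joint = Counter(zip(true_labels, cluster_ids))

    total = 0.0
    for label, class_size in class_sizes.items():
        best = 0.0
        for cid, cluster_size in cluster_sizes.items():
            overlap = joint[(label, cid)]
            if overlap == 0:
                continue
            precision = overlap / cluster_size
            recall = overlap / class_size
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (class_size / n) * best
    return total


if __name__ == "__main__":
    # Made-up toy data: the known "true" groups and two candidate clusterings.
    truth = ["a", "a", "b", "b"]
    print(cluster_entropy(truth, [0, 0, 1, 1]), cluster_f_measure(truth, [0, 0, 1, 1]))
    print(cluster_entropy(truth, [0, 1, 0, 1]), cluster_f_measure(truth, [0, 1, 0, 1]))
```

Both functions take the known labels as their first argument, so without such labels neither value can be computed. In that case only measures that look at the data and the cluster assignments alone remain available.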

    Best regards,
    Marius