"[Solved]Convert Numeric to Nominal after k-means clustering"

nachiket New Altair Community Member
edited November 5 in Community Q&A
I am new to RapidMiner and have an Excel dataset.
I want to apply k-means clustering to this dataset and then run Bayesian classification on the result.
I imported the Excel file (all fields except FID as text) and applied Nominal to Numerical so that k-means could run. Now I want the clusters with the original values from the input Excel file (not the numeric data), so that I can apply Bayes classification to them.
How can I do a Numeric to Nominal conversion on all of the fields?

Sample data (1100 rows)
FID  Geology                            Geomorphology                      Land use_land cover  Rainfall   SLOPE   Soil        zone
0    Fissile hornblende biotite gneiss  HIGHLY DISSECTED DIFLECTION SLOPE  FOREST               1200-1400  >60%    BROWN CLAY  High
1    Fissile hornblende biotite gneiss  HIGHLY DISSECTED DIFLECTION SLOPE  FOREST               1200-1400  30-60%  BROWN CLAY  Moderate

Answers

  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    I'm not sure I understood you correctly, but does the example process below help you? It uses the Multiply operator to create a second copy of your data, converts and clusters one copy, and then at the end joins the clustered result back to your original data via the generated id.

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.1.001-SNAPSHOT">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.1.001-SNAPSHOT" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="generate_nominal_data" compatibility="6.1.001-SNAPSHOT" expanded="true" height="60" name="Generate Nominal Data" width="90" x="45" y="75"/>
         <operator activated="true" class="generate_id" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Generate ID" width="90" x="179" y="75"/>
         <operator activated="true" class="multiply" compatibility="6.1.001-SNAPSHOT" expanded="true" height="94" name="Multiply" width="90" x="313" y="75"/>
         <operator activated="true" class="nominal_to_numerical" compatibility="6.1.001-SNAPSHOT" expanded="true" height="94" name="Nominal to Numerical" width="90" x="514" y="30">
           <list key="comparison_groups"/>
         </operator>
         <operator activated="true" class="k_means" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Clustering" width="90" x="647" y="30"/>
         <operator activated="true" class="select_attributes" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Select Attributes" width="90" x="781" y="30">
           <parameter key="attribute_filter_type" value="subset"/>
           <parameter key="attributes" value="cluster|id|label"/>
         </operator>
         <operator activated="true" class="join" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Join" width="90" x="916" y="75">
           <list key="key_attributes"/>
         </operator>
         <connect from_op="Generate Nominal Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
         <connect from_op="Generate ID" from_port="example set output" to_op="Multiply" to_port="input"/>
         <connect from_op="Multiply" from_port="output 1" to_op="Nominal to Numerical" to_port="example set input"/>
         <connect from_op="Multiply" from_port="output 2" to_op="Join" to_port="right"/>
         <connect from_op="Nominal to Numerical" from_port="example set output" to_op="Clustering" to_port="example set"/>
         <connect from_op="Clustering" from_port="clustered set" to_op="Select Attributes" to_port="example set input"/>
         <connect from_op="Select Attributes" from_port="example set output" to_op="Join" to_port="left"/>
         <connect from_op="Join" from_port="join" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>

    Regards,
    Marco
  • nachiket
    nachiket New Altair Community Member
    Yes, thank you very much. I got the clustered output with the original names. However, there are now extra attributes (cluster, and id at the end, which looks like a float number). Can you please tell me how I can use Naive Bayes on it?

    PS: I am actually trying to integrate Bayes classification with k-means clustering.
  • Fred12
    Fred12 New Altair Community Member

    I am interested in that topic, too!

    Did you work out how to apply NB with clustering? Can you show me how?
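
The approach discussed in this thread (keep an id, cluster a numeric copy, join the cluster column back onto the original nominal rows, then classify) can be sketched outside RapidMiner in plain Python. This is only a rough illustration under stated assumptions: the k-means step is not implemented and the cluster assignments are hard-coded placeholders, the rows with id 2 and 3 are made-up values, and the function names `train_naive_bayes` and `predict` are hypothetical. It also assumes the cluster column is what should serve as the label; in RapidMiner that would presumably mean giving the cluster attribute the label role before the Naive Bayes operator, which nobody in the thread confirmed.

```python
import math
from collections import Counter, defaultdict

# Original nominal rows keyed by id. Ids 0 and 1 reuse values from the sample
# data in the question; ids 2 and 3 are made-up values for illustration only.
rows = [
    {"id": 0, "Land use": "FOREST", "SLOPE": ">60%", "Soil": "BROWN CLAY"},
    {"id": 1, "Land use": "FOREST", "SLOPE": "30-60%", "Soil": "BROWN CLAY"},
    {"id": 2, "Land use": "AGRICULTURE", "SLOPE": "<30%", "Soil": "RED LOAM"},
    {"id": 3, "Land use": "AGRICULTURE", "SLOPE": "30-60%", "Soil": "RED LOAM"},
]

# Placeholder for the k-means output: id -> cluster name (assumed, not computed).
clusters = {0: "cluster_0", 1: "cluster_0", 2: "cluster_1", 3: "cluster_1"}

# Inner join on id: attach the cluster to the untouched nominal rows.
joined = [
    {**row, "cluster": clusters[row["id"]]}
    for row in rows
    if row["id"] in clusters
]

features = ["Land use", "SLOPE", "Soil"]  # id is a key, not a predictor


def train_naive_bayes(examples, features, label):
    """Count class frequencies and per-class value frequencies."""
    class_counts = Counter(ex[label] for ex in examples)
    value_counts = defaultdict(Counter)  # (feature, class) -> value counts
    domains = defaultdict(set)           # feature -> set of observed values
    for ex in examples:
        for f in features:
            value_counts[(f, ex[label])][ex[f]] += 1
            domains[f].add(ex[f])
    return class_counts, value_counts, domains


def predict(model, example, features):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for cls, n in class_counts.items():
        score = math.log(n / total)
        for f in features:
            count = value_counts[(f, cls)][example[f]]
            score += math.log((count + 1) / (n + len(domains[f])))
        if score > best_score:
            best, best_score = cls, score
    return best


model = train_naive_bayes(joined, features, label="cluster")
new_row = {"Land use": "FOREST", "SLOPE": ">60%", "Soil": "BROWN CLAY"}
print(predict(model, new_row, features))  # cluster_0
```

The id is used only as the join key and is left out of `features`, which mirrors the concern above about the extra id column: it identifies rows but should not be treated as a predictor.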