"Polynominal to binominal and aggregate behaviour"

frankstyle New Altair Community Member
edited November 5 in Community Q&A
Hi everybody, I'm pretty new to RapidMiner, but I think it is a great tool.
I don't know how to solve a couple of data conversion tasks, and I'm sure there are many experts here who can help me :-)

1) How can I create several binominal attributes from one polynominal attribute? I mean, I have this attribute:
color
white
orange
black
...and I want to convert it to 3 binominal attributes
white  orange  black
1      0       0
0      1       0
0      0       1
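
To make the target format concrete, here is a minimal Python sketch of the conversion, written outside RapidMiner purely as an illustration. The function name `dummy_code` is hypothetical and not part of any RapidMiner API; it only shows the one-column-per-value logic I have in mind.

```python
def dummy_code(values):
    """Return (categories, rows): one 0/1 column per distinct value.

    Categories are listed in the order in which they first appear.
    """
    categories = list(dict.fromkeys(values))  # distinct values, order preserved
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


colors = ["white", "orange", "black"]
columns, encoded = dummy_code(colors)

print(columns)
for row in encoded:
    print(row)
```

Each input value becomes a row with exactly one 1, placed in the column that matches it.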

2) Next, I love the Aggregate operator, but it seems to me that it doesn't work like SQL's GROUP BY: in RapidMiner the result loses every attribute that was neither aggregated nor grouped by. How can I get something like this (silly) SQL statement in RapidMiner?

SELECT name,birthday,email,SUM(monthlysalary) as totalsalary FROM mytable GROUP BY email

Thank you so much for your help!

Answers

  • fras New Altair Community Member
    I put everything into one process that illustrates the RapidMiner way of doing this kind of thing.
    If you would like to do further calculations with aggregated attributes named like "sum(...", you have to replace
    the brackets in their names first:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.0.003">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.0.003" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="generate_team_profit_data" compatibility="6.0.003" expanded="true" height="60" name="Generate Team Profit Data" width="90" x="45" y="75"/>
          <!-- Sum "average years of experience" per leader; attributes that are neither grouped nor aggregated are dropped here. -->
          <operator activated="true" class="aggregate" compatibility="6.0.003" expanded="true" height="76" name="Aggregate" width="90" x="179" y="75">
            <list key="aggregation_attributes">
              <parameter key="average years of experience" value="sum"/>
            </list>
            <parameter key="group_by_attributes" value="leader"/>
          </operator>
          <!-- Join the aggregated result back onto the original data on "leader" to recover the dropped attributes. -->
          <operator activated="true" class="join" compatibility="6.0.003" expanded="true" height="76" name="Join" width="90" x="313" y="75">
            <parameter key="use_id_attribute_as_key" value="false"/>
            <list key="key_attributes">
              <parameter key="leader" value="leader"/>
            </list>
          </operator>
          <!-- Convert the nominal attribute "leader" into numerical attributes; no coding type is set, so the operator default applies. -->
          <operator activated="true" class="nominal_to_numerical" compatibility="6.0.003" expanded="true" height="94" name="Nominal to Numerical" width="90" x="447" y="75">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="leader"/>
            <list key="comparison_groups"/>
          </operator>
          <connect from_op="Generate Team Profit Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_op="Join" to_port="left"/>
          <connect from_op="Aggregate" from_port="original" to_op="Join" to_port="right"/>
          <connect from_op="Join" from_port="join" to_op="Nominal to Numerical" to_port="example set input"/>
          <connect from_op="Nominal to Numerical" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
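
For readers who want to see the aggregate-then-join idea outside RapidMiner, here is a rough Python sketch of the same two steps. The sample rows, the function name `sum_and_join_back` and the output name `totalsalary` are assumptions made up for illustration; the column names are borrowed from the SQL statement in the question.

```python
from collections import defaultdict


def sum_and_join_back(rows, group_key, value_key, out_key):
    """Sum value_key per group, then attach the total to every original row."""
    # Step 1 (like Aggregate): one total per group, all other columns are lost.
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    # Step 2 (like Join on the group key): bring the other columns back.
    return [{**row, out_key: totals[row[group_key]]} for row in rows]


# Hypothetical sample data.
rows = [
    {"name": "Ann", "email": "ann@example.com", "monthlysalary": 1000},
    {"name": "Ann", "email": "ann@example.com", "monthlysalary": 1500},
    {"name": "Bob", "email": "bob@example.com", "monthlysalary": 2000},
]

joined = sum_and_join_back(rows, "email", "monthlysalary", "totalsalary")
for row in joined:
    print(row)
```

Note that joining back keeps one row per original example, so the total is repeated within each group instead of collapsing the group to a single row. The sketch also writes the total under a plain name rather than a bracketed one like "sum(...", which sidesteps the renaming step mentioned above.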

  • frankstyle New Altair Community Member
    Awesome. It works great!
    Thank you so much, fras