"Polynominal to binominal and aggregate behaviour"
frankstyle
New Altair Community Member
Hi everybody, I'm pretty new to RapidMiner, but feel it is a great tool.
I don't know how to solve a couple of data conversion tasks, but I'm sure there are many experts here who can help me :-)
1) How can I create several binominal attributes from one polynominal attribute? I mean, I have this attribute:
color
white
orange
black
...and I want to convert it to 3 binominal attributes
white orange black
1 0 0
0 1 0
0 0 1
2) Next, I love the Aggregate operator, but it seems to me that it doesn't work like the SQL one: in RapidMiner it drops all the attributes that are neither aggregated nor grouped by. How can I get something like this (silly) SQL statement in RapidMiner?
SELECT name,birthday,email,SUM(monthlysalary) as totalsalary FROM mytable GROUP BY email
Thank you so much for your help!
Answers
I put everything into one process that illustrates the RapidMiner way of doing this kind of thing.
If you would like to do further calculations with aggregated attributes like "sum(...", you have to replace
the brackets first:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.0.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.0.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_team_profit_data" compatibility="6.0.003" expanded="true" height="60" name="Generate Team Profit Data" width="90" x="45" y="75"/>
      <operator activated="true" class="aggregate" compatibility="6.0.003" expanded="true" height="76" name="Aggregate" width="90" x="179" y="75">
        <list key="aggregation_attributes">
          <parameter key="average years of experience" value="sum"/>
        </list>
        <parameter key="group_by_attributes" value="leader"/>
      </operator>
      <operator activated="true" class="join" compatibility="6.0.003" expanded="true" height="76" name="Join" width="90" x="313" y="75">
        <parameter key="use_id_attribute_as_key" value="false"/>
        <list key="key_attributes">
          <parameter key="leader" value="leader"/>
        </list>
      </operator>
      <operator activated="true" class="nominal_to_numerical" compatibility="6.0.003" expanded="true" height="94" name="Nominal to Numerical" width="90" x="447" y="75">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="leader"/>
        <list key="comparison_groups"/>
      </operator>
      <connect from_op="Generate Team Profit Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
      <connect from_op="Aggregate" from_port="example set output" to_op="Join" to_port="left"/>
      <connect from_op="Aggregate" from_port="original" to_op="Join" to_port="right"/>
      <connect from_op="Join" from_port="join" to_op="Nominal to Numerical" to_port="example set input"/>
      <connect from_op="Nominal to Numerical" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
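For anyone reading along without RapidMiner open, here is a rough plain-Python sketch of the idea behind the process above: compute the per-group sum, join it back onto the original rows so that no attribute is lost, then dummy-code a nominal attribute into 0/1 columns. This is not RapidMiner code. The sample rows, the function names `add_group_sum` and `dummy_code`, and the `color_white`-style column names are made up for illustration, and the attribute names come from the question rather than from the generated Team Profit data.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the example set from the question.
rows = [
    {"name": "Ann", "email": "ann@example.com", "color": "white", "monthlysalary": 1000},
    {"name": "Ann", "email": "ann@example.com", "color": "black", "monthlysalary": 1500},
    {"name": "Bob", "email": "bob@example.com", "color": "orange", "monthlysalary": 2000},
]


def add_group_sum(rows, group_key, value_key, out_key):
    """Aggregate + Join idea: sum value_key per group, then attach that sum
    to every original row, so the other attributes are kept."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return [{**row, out_key: totals[row[group_key]]} for row in rows]


def dummy_code(rows, key):
    """Dummy coding idea: replace the nominal attribute `key` with one
    0/1 column per distinct value (column naming here is an assumption)."""
    values = sorted({row[key] for row in rows})
    coded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != key}
        for value in values:
            new_row[f"{key}_{value}"] = 1 if row[key] == value else 0
        coded.append(new_row)
    return coded


joined = add_group_sum(rows, "email", "monthlysalary", "totalsalary")
result = dummy_code(joined, "color")
for row in result:
    print(row)
```

Every original row keeps its name and email and gains a totalsalary value for its group, which is roughly what the SQL statement in the question was after.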
Awesome. It works great!
Thank you so much, fras