[Solved] Ranking percentages by ID and label
Kate_Strydom
New Altair Community Member
Hi,
Can anyone assist me, please?
I need to create a new attribute in RapidMiner called "Rank" which ranks the percentages in descending order grouped by the unique ID and Category.
unique ID   Category        Percentage   Rank
001A        News            0.23         1
001A        Weather         0.15         2
001A        Sports          0.09         3
001B        Weather         0.64         1
001B        News            0.25         2
001B        Entertainment   0.02         3
Answers
Hi,
You can use Sort together with Generate ID to solve your problem. If you need to calculate the percentages first, you can use Aggregate.
Attached is a small process that uses the Iris dataset and ranks the examples by the attribute a1.
I hope this helps!
Best,
Martin
Edit: I overlooked that you want the ranking grouped by your unique IDs. The second process below is an example on the Sonar data that ranks within groups. The difference is basically the Loop Values operator.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.1.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.1.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="6.1.000" expanded="true" height="60" name="Retrieve Iris" width="90" x="45" y="30">
<parameter key="repository_entry" value="//Samples/data/Iris"/>
</operator>
<operator activated="true" class="sort" compatibility="6.1.000" expanded="true" height="76" name="Sort" width="90" x="179" y="30">
<parameter key="attribute_name" value="a1"/>
</operator>
<operator activated="true" class="generate_id" compatibility="6.1.000" expanded="true" height="76" name="Generate ID" width="90" x="313" y="30"/>
<operator activated="true" class="set_role" compatibility="6.1.000" expanded="true" height="76" name="Set Role" width="90" x="447" y="30">
<parameter key="attribute_name" value="id"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="rename" compatibility="6.1.000" expanded="true" height="76" name="Rename" width="90" x="581" y="30">
<parameter key="old_name" value="id"/>
<parameter key="new_name" value="Rank"/>
<list key="rename_additional_attributes"/>
</operator>
<connect from_op="Retrieve Iris" from_port="output" to_op="Sort" to_port="example set input"/>
<connect from_op="Sort" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
<connect from_op="Generate ID" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
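For readers who do not have RapidMiner at hand, the idea behind the Iris process above (Sort, then Generate ID, then rename the id to Rank) can be sketched in plain Python. This is only an illustration of the logic, not RapidMiner code: the helper name `add_rank` and the sample values are made up. As far as I can tell, the Sort operator above does not set a sorting direction and therefore sorts in increasing order, so the sketch defaults to ascending as well.

```python
# Toy stand-in for an example set; the values are invented for illustration.
rows = [
    {"label": "x", "a1": 5.1},
    {"label": "y", "a1": 4.7},
    {"label": "z", "a1": 6.3},
]

def add_rank(rows, key, descending=False):
    """Sort by `key`, then number the rows from 1 (Sort followed by Generate ID)."""
    ordered = sorted(rows, key=lambda r: r[key], reverse=descending)
    # Each row gets a new "Rank" field holding its 1-based position.
    return [{**r, "Rank": i} for i, r in enumerate(ordered, start=1)]

ranked = add_rank(rows, "a1")
for r in ranked:
    print(r["label"], r["a1"], r["Rank"])
```

Passing `descending=True` corresponds to setting the sorting direction to decreasing. Note that this numbers the whole table at once, with no grouping.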
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.1.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.1.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="6.1.000" expanded="true" height="60" name="Retrieve Sonar" width="90" x="112" y="30">
<parameter key="repository_entry" value="//Samples/data/Sonar"/>
</operator>
<operator activated="true" class="loop_values" compatibility="6.1.000" expanded="true" height="76" name="Loop Values" width="90" x="313" y="30">
<parameter key="attribute" value="class"/>
<process expanded="true">
<operator activated="true" class="filter_examples" compatibility="6.1.000" expanded="true" height="94" name="Filter Examples" width="90" x="45" y="30">
<parameter key="parameter_string" value="class=%{loop_value}"/>
<parameter key="parameter_expression" value="class==%{loop_value}"/>
<parameter key="condition_class" value="attribute_value_filter"/>
<list key="filters_list"/>
</operator>
<operator activated="true" class="sort" compatibility="6.1.000" expanded="true" height="76" name="Sort" width="90" x="179" y="30">
<parameter key="attribute_name" value="attribute_1"/>
<parameter key="sorting_direction" value="decreasing"/>
</operator>
<operator activated="true" class="generate_id" compatibility="6.1.000" expanded="true" height="76" name="Generate ID" width="90" x="313" y="30"/>
<operator activated="true" class="set_role" compatibility="6.1.000" expanded="true" height="76" name="Set Role" width="90" x="447" y="30">
<parameter key="attribute_name" value="id"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="rename" compatibility="6.1.000" expanded="true" height="76" name="Rename" width="90" x="581" y="30">
<parameter key="old_name" value="id"/>
<parameter key="new_name" value="Rank"/>
<list key="rename_additional_attributes"/>
</operator>
<connect from_port="example set" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Sort" to_port="example set input"/>
<connect from_op="Sort" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
<connect from_op="Generate ID" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_example set" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="append" compatibility="6.1.000" expanded="true" height="76" name="Append" width="90" x="447" y="30"/>
<connect from_op="Retrieve Sonar" from_port="output" to_op="Loop Values" to_port="example set"/>
<connect from_op="Loop Values" from_port="out 1" to_op="Append" to_port="example set 1"/>
<connect from_op="Append" from_port="merged set" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
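The grouped process above (Loop Values over the grouping attribute, Filter Examples on the current value, Sort decreasing, Generate ID, then Append) follows a pattern that can be sketched in plain Python. This is a minimal sketch of the logic only, not a RapidMiner API: the function name `rank_within_groups` is hypothetical, and the rows are the sample data from the question, deliberately shuffled so that the sort has something to do.

```python
from collections import defaultdict

# Sample rows from the question (unique ID, Category, Percentage), shuffled.
rows = [
    ("001A", "Sports", 0.09),
    ("001B", "News", 0.25),
    ("001A", "News", 0.23),
    ("001B", "Entertainment", 0.02),
    ("001A", "Weather", 0.15),
    ("001B", "Weather", 0.64),
]

def rank_within_groups(rows):
    """Rank rows by percentage, descending, separately for each unique ID."""
    # Split the rows per unique ID (the role of Loop Values + Filter Examples).
    groups = defaultdict(list)
    for uid, category, pct in rows:
        groups[uid].append((uid, category, pct))

    ranked = []
    for uid in groups:  # groups are visited in order of first appearance
        # Sort one group by percentage, largest first (Sort, decreasing).
        ordered = sorted(groups[uid], key=lambda r: r[2], reverse=True)
        # Number the rows from 1 within the group (Generate ID, renamed to Rank),
        # and collect all groups into one list (Append).
        for rank, (u, c, p) in enumerate(ordered, start=1):
            ranked.append((u, c, p, rank))
    return ranked

result = rank_within_groups(rows)
for row in result:
    print(*row)
```

The rank restarts at 1 for every unique ID because the numbering happens inside the per-group loop, which is what the loop in the Sonar process adds over the ungrouped Iris process.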
Hi Martin,
Thanks for the quick reply.
The problem is that it just replaces my unique ID attribute with the generated ID, and no grouping takes place.
So I have these operators:
Retrieve data - Set Role - Sort by percentage - Sort by unique ID - Generate ID.
I need Generate ID to also group, instead of replacing the attribute.
Thanks
Regards
Kate
Hi Kate,
I am not completely sure I understand your problem.
Am I right that the Generate ID operator "deletes" your UniqueID attribute? That can happen if the role of UniqueID is id. Simply change its role to something else beforehand. Then you can do the ranking. Afterwards, set the role of UniqueID back to id.
Best,
Martin
Hi Martin,
I will try this.
I am trying to implement the script you sent. I used an Execute Script operator, pasted the edited code you sent into it, and linked it to the results. It gives an error: undefined macro: loop_value.
I presume I need to insert the loop value operator between sort and generate ID operator?
Will let you know how it works.
Regards
Kate
There is no need to use Execute Script. The XML code is a complete process. You can import it using the import dialog or by pasting it into the XML view. Be sure not to overwrite your own process with this!
Sorry Martin,
It still does not work
I must be doing something wrong. I also cannot see what the code you sent is supposed to do because of the error.
Please help.
Regards
Kate
Maybe this thread helps you: http://rapid-i.com/rapidforum/index.php/topic,4654.0.html (especially the last two sections).
Hi Martin,
Thank you so very much. I managed to follow your instructions.
Sorry, I am so new that I did not see the red notice saying that your post had come through. Hence some of the communication is out of order.
I have successfully implemented the Sonar code as per your instructions. I have also managed to implement the code and the Loop Values operator in my own process. It is a very neat operator.
Your advice has been extremely helpful after I had spent so much time searching and trying to do this. I would never have succeeded without Loop Values and the filter macro. I wish I understood how to apply this in other areas of my RapidMiner processes.
Regards
Kate
Hi Kate,
Great! It is always a pleasure to help.
There is some documentation around, e.g. https://rapidminer.com/documentation/ . I personally like the book "Data Mining for the Masses", which is linked there. Furthermore, there are several YouTube channels. One of our consultants has his own YouTube channel; see: https://www.youtube.com/user/neuralmarkettrends1
But if you have further questions, feel free to ask on this board.
Best,
Martin