"Filter/select specific rows from set"
MRon
New Altair Community Member
Hello!
I select 500 rows from a DB. The set is simple: it has two "columns" (attributes?). Both are text fields; the first one has the label role and contains the names of car brands.
My question is: how do I remove/filter/delete the rows which appear in my set fewer than 10 times?
I would like to achieve this in RapidMiner directly, not on the DB level.
Cheers!
Answers
-
Hi,
unfortunately, this is currently not as easy as we would like.
However, it is possible.
I don't have your data, so I made an example process to illustrate how it can be done:
Note that you will need to adapt the process to your specific settings (changing a lot of parameters), but that shouldn't be too hard.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
<process expanded="true" height="346" width="949">
<operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
<parameter key="repository_entry" value="//Samples/data/Golf"/>
</operator>
<operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve (2)" width="90" x="45" y="165">
<parameter key="repository_entry" value="//Samples/data/Golf"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.1.008" expanded="true" height="76" name="Set Role (2)" width="90" x="447" y="165">
<parameter key="name" value="Outlook"/>
<parameter key="target_role" value="id"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="aggregate" compatibility="5.1.008" expanded="true" height="76" name="Aggregate" width="90" x="179" y="30">
<list key="aggregation_attributes">
<parameter key="Outlook" value="count"/>
</list>
<parameter key="group_by_attributes" value="|Outlook"/>
</operator>
<operator activated="true" class="filter_examples" compatibility="5.1.008" expanded="true" height="76" name="Filter Examples" width="90" x="313" y="30">
<parameter key="condition_class" value="attribute_value_filter"/>
<parameter key="parameter_string" value="count(Outlook)>4"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.1.008" expanded="true" height="76" name="Set Role" width="90" x="447" y="30">
<parameter key="name" value="Outlook"/>
<parameter key="target_role" value="id"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="join" compatibility="5.1.008" expanded="true" height="76" name="Join" width="90" x="648" y="120"/>
<connect from_op="Retrieve" from_port="output" to_op="Aggregate" to_port="example set input"/>
<connect from_op="Retrieve (2)" from_port="output" to_op="Set Role (2)" to_port="example set input"/>
<connect from_op="Set Role (2)" from_port="example set output" to_op="Join" to_port="right"/>
<connect from_op="Aggregate" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Join" to_port="left"/>
<connect from_op="Join" from_port="join" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
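For readers who find the logic easier to follow outside RapidMiner, the three steps of the process above (Aggregate to count, Filter Examples on the count, Join back to the original rows) can be sketched in plain Python. This is only an illustration, not part of the process: the function name `keep_frequent_rows`, its parameters, and the sample rows are made up for the example.

```python
from collections import Counter

def keep_frequent_rows(rows, key_index=0, min_count=10):
    """Keep only rows whose key value occurs at least min_count times.

    Hypothetical helper for illustration; not a RapidMiner API.
    """
    # Step 1 (like Aggregate): count how often each key value occurs.
    counts = Counter(row[key_index] for row in rows)
    # Step 2 (like Filter Examples): keep key values that are frequent enough.
    frequent = {value for value, n in counts.items() if n >= min_count}
    # Step 3 (like Join): keep the original rows whose key survived the filter.
    return [row for row in rows if row[key_index] in frequent]

# Made-up sample data: (brand, description) pairs.
rows = [("Audi", "a")] * 3 + [("Fiat", "b")] * 1 + [("Opel", "c")] * 2
result = keep_frequent_rows(rows, min_count=2)
print(len(result))  # 5: the single Fiat row is dropped
```

Note that the example process filters with `count(Outlook)>4`, i.e. it keeps values occurring at least 5 times; for the original question the threshold would be 10 occurrences.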
Regards,
Marco
-
Thank you for your answer!
Marco Boeck wrote:
"unfortunately, this is currently not as easy as we would like..."
This is no problem for me. I just thought that I had missed an operator which does this.
Marco Boeck wrote:
"I don't have your data, so I made an example process to illustrate how it can be done:"
Could you tell me what Select Attributes is for in your example? At first glance I get the same results without it.
Cheers!
-
Hi,
you're right, that operator is not needed. I forgot to remove it.
Regards,
Marco