"Filter/select specific rows from set"
MRon
New Altair Community Member
Hello!
I select 500 rows from a DB. The set is simple: it has two "columns" (attributes?). Both are text fields; the first one has the label role and contains the names of car brands.
My question is: how can I remove/filter/delete the rows whose brand appears in my set fewer than 10 times?
I would like to achieve this in RapidMiner directly, not at the DB level.
Cheers!
Answers
Hi,
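Unfortunately, this is currently not as easy as we would like.
However, it is possible.
I don't have your data, so I made an example process on the Golf sample data to illustrate how it can be done. It counts how often each Outlook value occurs (Aggregate), keeps only the values that occur more than 4 times (Filter Examples), and joins them back to the original rows (Join).
Note that you will need to adapt the process to your specific settings (changing a lot of parameters), but that shouldn't be too hard: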
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
    <process expanded="true" height="346" width="949">
      <operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="//Samples/data/Golf"/>
      </operator>
      <operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve (2)" width="90" x="45" y="165">
        <parameter key="repository_entry" value="//Samples/data/Golf"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.1.008" expanded="true" height="76" name="Set Role (2)" width="90" x="447" y="165">
        <parameter key="name" value="Outlook"/>
        <parameter key="target_role" value="id"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="aggregate" compatibility="5.1.008" expanded="true" height="76" name="Aggregate" width="90" x="179" y="30">
        <list key="aggregation_attributes">
          <parameter key="Outlook" value="count"/>
        </list>
        <parameter key="group_by_attributes" value="|Outlook"/>
      </operator>
      <operator activated="true" class="filter_examples" compatibility="5.1.008" expanded="true" height="76" name="Filter Examples" width="90" x="313" y="30">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="count(Outlook)>4"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.1.008" expanded="true" height="76" name="Set Role" width="90" x="447" y="30">
        <parameter key="name" value="Outlook"/>
        <parameter key="target_role" value="id"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="join" compatibility="5.1.008" expanded="true" height="76" name="Join" width="90" x="648" y="120"/>
      <connect from_op="Retrieve" from_port="output" to_op="Aggregate" to_port="example set input"/>
      <connect from_op="Retrieve (2)" from_port="output" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Join" to_port="right"/>
      <connect from_op="Aggregate" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Join" to_port="left"/>
      <connect from_op="Join" from_port="join" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
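For the question as asked (drop brands that occur fewer than 10 times), the two counting operators might be adapted roughly as sketched here. This is only a fragment, not a complete process, and it assumes the car-brand attribute is called Brand, which is a made-up name because the real attribute names aren't given in the thread:

```xml
<!-- Fragment only: these two operators would take the place of Aggregate and Filter Examples in the process above. -->
<!-- "Brand" is an assumed attribute name; replace it with the real name of the car-brand attribute. -->
<operator activated="true" class="aggregate" compatibility="5.1.008" expanded="true" height="76" name="Aggregate" width="90" x="179" y="30">
  <list key="aggregation_attributes">
    <parameter key="Brand" value="count"/>
  </list>
  <parameter key="group_by_attributes" value="|Brand"/>
</operator>
<!-- Keep only brands counted more than 9 times, i.e. at least 10 occurrences. -->
<operator activated="true" class="filter_examples" compatibility="5.1.008" expanded="true" height="76" name="Filter Examples" width="90" x="313" y="30">
  <parameter key="condition_class" value="attribute_value_filter"/>
  <parameter key="parameter_string" value="count(Brand)>9"/>
</operator>
```

The name parameter of both Set Role operators would need the same attribute name, so that Join can match the surviving brands against the original rows.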
Regards,
Marco
Thank you for your answer!
Marco Boeck wrote:
unfortunately, this is currently not as easy as we would like..
This is no problem for me. I just thought that I had missed an operator which does this.
Marco Boeck wrote:
I don't have your data, so I made an example process to illustrate how it can be done:
Could you tell me what Select Attributes is for in your example? At first glance I got the same results without it.
Cheers!
Hi,
You're right, that operator is not needed. I forgot to remove it.
Regards,
Marco