Multi-level sorting
Hey @jreinoso,
If you are familiar with Python scripting in RapidMiner, you can achieve this in a single line of code.
Let's say you have this dataset,

and use the pandas sort_values function:
data.sort_values(['Date','ColA', 'ColB'], ascending = [True, False, True], inplace = True)
This sorts by Date ascending, then by ColA descending, then by ColB ascending. You will get the following result.
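To try the call outside RapidMiner, here is a minimal standalone sketch. It reuses the rm_main function and the rows from the demo process; building the DataFrame by hand and converting the dates with pd.to_datetime are my assumptions, standing in for the Create ExampleSet and Nominal to Date operators.

```python
import pandas as pd

# Rows copied from the demo process (Date, ColA, ColB).
rows = [
    ("2020/05/05", 45, 20),
    ("2020/05/05", 415, 2),
    ("2020/05/05", 415, 0),
    ("2020/05/03", -5, 6),
    ("2020/05/08", 4, 8),
    ("2020/05/15", 32, 9),
    ("2020/05/08", 4, 8),
    ("2020/05/08", -9, 21),
    ("2020/05/08", 41, 8),
]

# Assumption: stand-in for the "Nominal to Date" operator.
data = pd.DataFrame(rows, columns=["Date", "ColA", "ColB"])
data["Date"] = pd.to_datetime(data["Date"], format="%Y/%m/%d")


def rm_main(data):
    # Date ascending, ColA descending, ColB ascending, sorted in place.
    data.sort_values(["Date", "ColA", "ColB"],
                     ascending=[True, False, True], inplace=True)
    return data


result = rm_main(data)
print(result.to_string(index=False))
```

The first row printed should be 2020-05-03 with ColA = -5, and within 2020-05-05 the two 415 rows come before the 45 row, ordered by ColB.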

Check out the demo XML.
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.7.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.7.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="utility:create_exampleset" compatibility="9.7.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="34">
        <parameter key="generator_type" value="comma separated text"/>
        <parameter key="number_of_examples" value="100"/>
        <parameter key="use_stepsize" value="false"/>
        <list key="function_descriptions"/>
        <parameter key="add_id_attribute" value="false"/>
        <list key="numeric_series_configuration"/>
        <list key="date_series_configuration"/>
        <list key="date_series_configuration (interval)"/>
        <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="input_csv_text" value="Date, ColA, ColB&#10;2020/05/05, 45, 20&#10;2020/05/05, 415, 2&#10;2020/05/05, 415, 0&#10;2020/05/03, -5, 6&#10;2020/05/08, 4, 8&#10;2020/05/15, 32, 9&#10;2020/05/08, 4, 8&#10;2020/05/08, -9, 21&#10;2020/05/08, 41, 8"/>
        <parameter key="column_separator" value=","/>
        <parameter key="parse_all_as_nominal" value="false"/>
        <parameter key="decimal_point_character" value="."/>
        <parameter key="trim_attribute_names" value="true"/>
      </operator>
      <operator activated="true" class="nominal_to_date" compatibility="9.7.000" expanded="true" height="82" name="Nominal to Date" width="90" x="246" y="34">
        <parameter key="attribute_name" value="Date"/>
        <parameter key="date_type" value="date"/>
        <parameter key="date_format" value="yyyy/MM/dd"/>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="locale" value="English (United States)"/>
        <parameter key="keep_old_attribute" value="false"/>
      </operator>
      <operator activated="true" class="python_scripting:execute_python" compatibility="9.6.000" expanded="true" height="103" name="Execute Python" width="90" x="447" y="34">
        <parameter key="script" value="import pandas&#10;&#10;def rm_main(data):&#10;    data.sort_values(['Date','ColA', 'ColB'], ascending = [True, False, True], inplace = True)&#10;    return data"/>
        <parameter key="notebook_cell_tag_filter" value=""/>
        <parameter key="use_default_python" value="true"/>
        <parameter key="package_manager" value="conda (anaconda)"/>
        <parameter key="use_macros" value="false"/>
      </operator>
      <connect from_op="Create ExampleSet" from_port="output" to_op="Nominal to Date" to_port="example set input"/>
      <connect from_op="Nominal to Date" from_port="example set output" to_op="Execute Python" to_port="input 1"/>
      <connect from_op="Execute Python" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Hi,
you could simply install the Jackhammer Extension from the Marketplace and use the Advanced Sort operator, which lets you specify multiple attributes. You don't need a license file for this particular operator.
Otherwise, chain multiple Sort operators in inverse order of their significance: sort by the least important attribute first, then the next, and so on. But that is much uglier and also slower than a single Sort (Advanced) operator from Jackhammer.
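To illustrate why the inverse order works, here is a rough sketch in plain Python rather than RapidMiner operators. It assumes each sort step is stable, meaning rows with equal keys keep their previous relative order; Python's sorted guarantees this. The rows and sort directions are borrowed from the pandas example earlier in this thread.

```python
from operator import itemgetter

# Rows as (Date, ColA, ColB); zero-padded date strings sort chronologically.
rows = [
    ("2020/05/05", 45, 20),
    ("2020/05/05", 415, 2),
    ("2020/05/05", 415, 0),
    ("2020/05/03", -5, 6),
    ("2020/05/08", 4, 8),
    ("2020/05/15", 32, 9),
    ("2020/05/08", 4, 8),
    ("2020/05/08", -9, 21),
    ("2020/05/08", 41, 8),
]

# Chain the sorts from least to most significant key.
by_col_b = sorted(rows, key=itemgetter(2))                    # ColB ascending
by_col_a = sorted(by_col_b, key=itemgetter(1), reverse=True)  # ColA descending
chained = sorted(by_col_a, key=itemgetter(0))                 # Date ascending

# Reference: one sort on a composite key (negating ColA for descending order).
single = sorted(rows, key=lambda r: (r[0], -r[1], r[2]))

print(chained == single)
for row in chained:
    print(row)
```

Each later sort only reorders rows that differ on its own key, so the ordering established by the earlier, less significant sorts survives among the ties.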
Greetings,
Sebastian