Global sum of a column
StanK
New Altair Community Member
Hi,
I am very disappointed with how hard it is to calculate a simple global sum in RapidMiner. If something this basic cannot be made easy, I see no point in continuing with more complicated things.
In particular, I just need to get each row's amount as a percentage of the column's total sum. This normally takes just seconds in Excel.
Neither Aggregate nor Pivot helped me: I don't need a "Count", I need a "Total Sum".
Answers
-
Hi @StanK,
I think what you want is very easy to build with about three operators. I think all of these operators are covered in the training and certification we offer free of charge on academy.rapidminer.com. Attached is a process which I think does exactly what you want.
Best,
Martin
<?xml version="1.0" encoding="UTF-8"?><process version="9.7.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="9.7.001" expanded="true" name="Process">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="send_mail" value="never"/>
<parameter key="notification_email" value=""/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<process expanded="true">
<operator activated="true" class="generate_data" compatibility="9.7.001" expanded="true" height="68" name="Generate Data" width="90" x="112" y="136">
<parameter key="target_function" value="random"/>
<parameter key="number_examples" value="100"/>
<parameter key="number_of_attributes" value="5"/>
<parameter key="attributes_lower_bound" value="0.0"/>
<parameter key="attributes_upper_bound" value="10.0"/>
<parameter key="gaussian_standard_deviation" value="10.0"/>
<parameter key="largest_radius" value="10.0"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<parameter key="datamanagement" value="double_array"/>
<parameter key="data_management" value="auto"/>
</operator>
<operator activated="true" class="aggregate" compatibility="9.7.001" expanded="true" height="82" name="Aggregate (2)" width="90" x="246" y="136">
<parameter key="use_default_aggregation" value="false"/>
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="attribute_value"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="time"/>
<parameter key="block_type" value="attribute_block"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_row_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="default_aggregation_function" value="average"/>
<list key="aggregation_attributes">
<parameter key="att1" value="sum"/>
</list>
<parameter key="group_by_attributes" value=""/>
<parameter key="count_all_combinations" value="false"/>
<parameter key="only_distinct" value="false"/>
<parameter key="ignore_missings" value="true"/>
</operator>
<operator activated="true" class="cartesian_product" compatibility="9.7.001" expanded="true" height="82" name="Cartesian" width="90" x="380" y="136">
<parameter key="remove_double_attributes" value="true"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="9.7.001" expanded="true" height="82" name="Generate Attributes" width="90" x="514" y="136">
<list key="function_descriptions">
<parameter key="fraction_att1" value="att1/[sum(att1)]"/>
</list>
<parameter key="keep_all" value="true"/>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Aggregate (2)" to_port="example set input"/>
<connect from_op="Aggregate (2)" from_port="example set output" to_op="Cartesian" to_port="left"/>
<connect from_op="Aggregate (2)" from_port="original" to_op="Cartesian" to_port="right"/>
<connect from_op="Cartesian" from_port="join" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
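For readers who want to check the logic outside Studio, here is a minimal plain-Python sketch of what the three operators in the process above appear to do: Aggregate computes one global sum of att1, Cartesian Product attaches that single value to every row, and Generate Attributes divides att1 by it. This is not RapidMiner code. The function name `add_fraction_of_total` and the sample values are hypothetical, and skipping missing values is an assumption based on `ignore_missings` being set to true in the Aggregate operator.

```python
# Plain-Python sketch of the three steps in the process above.
# Not RapidMiner code: the function name and sample values are hypothetical.

def add_fraction_of_total(rows, column="att1"):
    """Return copies of rows with the column total and each row's share of it."""
    # Step 1 (Aggregate): one global sum over the column, skipping missing values
    total = sum(row[column] for row in rows if row[column] is not None)
    if total == 0:
        raise ValueError("column total is zero; shares are undefined")
    result = []
    for row in rows:
        # Step 2 (Cartesian Product): attach the single total to every row
        joined = dict(row)
        joined[f"sum({column})"] = total
        # Step 3 (Generate Attributes): fraction = value / total
        value = row[column]
        joined[f"fraction_{column}"] = None if value is None else value / total
        result.append(joined)
    return result


rows = [{"att1": 2.0}, {"att1": 3.0}, {"att1": 5.0}]
for row in add_fraction_of_total(rows):
    print(row)
```

The result is a fraction between 0 and 1. To get the percentage asked for in the original question, multiply by 100; in the process this would presumably mean changing the expression to `att1/[sum(att1)]*100`.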
-
Thank you, Martin! I will look at it!
-
Hi Martin, thank you for your reply! Where exactly should I enter the code you sent me?
-
Hi @StanK,
Please see this thread for a guide: https://community.rapidminer.com/discussion/50470/import-xml-code-to-process
Best,
Martin
-
Hi,
For the last couple of versions, you have been able to copy the XML to your clipboard and then paste it into Studio by simply pressing the paste button in the top right corner of the Process panel.
Regards,
Marco