Weighted Sum of Different Attributes

Hi everybody, I'm new to RapidMiner and I'm doing a project with a medical database.
I have 17 attributes with values of 1 or 0. Each attribute carries a score: some 1, others 2, 3, or 6. I want to create a new attribute that contains the sum of the scores, depending on the values of the different attributes.
For example, I want to count the score of Attribute_1 through Attribute_17 only when the attribute is 1, and then add up those scores in one new attribute (Sum Score).
I know this must be an easy problem, but I can't seem to find the answer. I tried Generate Attributes with "if-then" logic, but I can't sum the scores of the different attributes; I only get the score of the last positive one.
Thank you in advance, I hope you can help me.
I want to make a new column that represents how many of the preceding attributes are positive.
For example (image not preserved): I want a column that counts how many of the 17 attributes are positive.
It works like an index: if an attribute is positive it has a score, and I want to sum the scores of the attributes that are positive.
It's a Charlson Comorbidity Index.
Hi again @emanuelmcruz,
To be sure I understand: what you want to obtain looks like this (example with 7 attributes):
[image not preserved]
Regards,
Lionel
Hi again @emanuelmcruz,
If the attributes can only take two values (0 or 1), you can use the Generate Aggregation operator:
<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_excel" compatibility="8.2.000" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
<parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Somme_Si\Somme_Si.xlsx"/>
<parameter key="imported_cell_range" value="A1:G3"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="Att1.true.integer.attribute"/>
<parameter key="1" value="Att2.true.integer.attribute"/>
<parameter key="2" value="Att3.true.integer.attribute"/>
<parameter key="3" value="Att4.true.integer.attribute"/>
<parameter key="4" value="Att5.true.integer.attribute"/>
<parameter key="5" value="Att6.true.integer.attribute"/>
<parameter key="6" value="Att7.true.integer.attribute"/>
</list>
</operator>
<operator activated="true" class="generate_aggregation" compatibility="8.2.000" expanded="true" height="82" name="Generate Aggregation" width="90" x="179" y="34">
<parameter key="attribute_name" value="sum"/>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Generate Aggregation" to_port="example set input"/>
<connect from_op="Generate Aggregation" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
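A plain sum over 0/1 columns, as in the process above, counts how many attributes are 1. For the weighted score from the original question, each 0/1 value would have to be multiplied by its score before summing. Here is a minimal sketch of that arithmetic in plain Python, outside RapidMiner. The attribute names are borrowed from the 7-attribute example, and the weights and the patient row are purely hypothetical placeholders (not the real Charlson weights):

```python
# Hypothetical weights per attribute (placeholders, NOT the official Charlson weights).
WEIGHTS = {
    "Att1": 1,
    "Att2": 1,
    "Att3": 2,
    "Att4": 2,
    "Att5": 3,
    "Att6": 6,
    "Att7": 6,
}


def positive_count(row):
    """Number of attributes flagged 1 (what a plain sum over 0/1 columns gives)."""
    return sum(row[name] for name in WEIGHTS)


def sum_score(row):
    """Weighted sum: an attribute contributes its weight only when its value is 1."""
    return sum(weight * row[name] for name, weight in WEIGHTS.items())


# One made-up patient row with 0/1 flags.
patient = {"Att1": 1, "Att2": 0, "Att3": 1, "Att4": 0, "Att5": 0, "Att6": 1, "Att7": 0}

print(positive_count(patient))  # 3 attributes are positive
print(sum_score(patient))       # 1 + 2 + 6 = 9
```

Because the values are only 0 or 1, multiplying by the weight replaces the if-then branches entirely. In RapidMiner the same idea would presumably be a single expression of the form Att1*1 + Att2*2 + ... in Generate Attributes; treat that as an assumption to check against your own attribute names and scores.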
I hope it helps,
Regards,
Lionel
Hi again @emanuelmcruz,
I am having difficulty understanding the content of your dataset:
I thought you had only 0 and 1 in your dataset, so all your values are positive?
So, to sum up, you have 17 attributes containing only 0 and 1, plus additional attribute(s) with negative values? Is that right?
Regards,
Lionel
Hi @emanuelmcruz,
Can you:
- share your dataset and
- based on an extract of your dataset, post an example of what you want to obtain.
Regards,
Lionel