How can I compute/derive additional attributes?
ChrisNelson
New Altair Community Member
I have a dataset with two columns, A and B. For each record, I want to compute C = (B-A)/B. Can I do that with RapidMiner transformations? Can you direct me to the right one?
I also want to compute the overall C. That is ((B1 + B2 + B3 ...) - (A1 + A2 + A3 ...)) / (B1 + B2 + B3 ...). Can I compute this aggregate function? How? (I think this is a special case of a weighted average which I think I've seen reference to but haven't found the method for computing.)
Thanks.
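The weighted-average hunch in the question can be checked numerically. The sketch below is plain Python, not RapidMiner, and the sample values for A and B are assumed purely for illustration. It compares the overall ratio computed from the column sums with the average of the per-record C values weighted by B.

```python
# Sample values are an assumption for illustration only.
A = [1.0, 2.0, 4.0]
B = [2.0, 3.0, 1.0]

# Per-record ratio C_i = (B_i - A_i) / B_i, as defined in the question.
C = [(b - a) / b for a, b in zip(A, B)]

# Overall ratio computed from the column sums.
overall = (sum(B) - sum(A)) / sum(B)

# Average of the per-record C values, weighted by B.
weighted = sum(b * c for b, c in zip(B, C)) / sum(B)

print(overall, weighted)
```

The two numbers agree because B_i * C_i = B_i - A_i, so the weighted sum collapses to sum(B) - sum(A).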
Answers
-
So, I found Generate Attributes as the means to create C = f(A,B). Still looking for a new summary function.
-
Hello
Use Generate Aggregation to create new attributes based on a function applied to attributes within a single example of an example set.
regards
Andrew
-
In 5.2.008 on Linux, the help for Generate Aggregation lists many more parameters than the form above it shows. Notably missing is "attribute". I see only:
* attribute name
* attribute filter type
* invert selection
* include special attributes
* aggregation function
* keep all
* ignore missings
Is this a bug or am I missing a step?
-
When I pick "single" or "subset" instead of "all" for the filter type, a new control appears that lets me pick attributes. But then it appears that Generate Aggregation works across a single row. What I need is something that works down the columns. Perhaps this is creating new metadata?
Given:
A  B
1  2
2  3
4  1
I can use Generate Attributes to compute C = (B-A)/A for each row:
A  B  C
1  2  1.0
2  3  0.5
4  1  -0.75
But I need to calculate ((B1 + B2 + B3) - (A1 + A2 + A3)) / (A1 + A2 + A3):
A  B  C
1  2  1.0
2  3  0.5
4  1  -0.75
7  6  -0.14
I guess I'm adding a new row, not a new column. How do I do that?
-
Hi,
Use Aggregate (not Generate Aggregation) to create column-wise sum aggregations. You will end up with a new example set, on which you can apply the formula (sum(B)-sum(A))/sum(A).
Please note that Generate Attributes cannot handle attributes with parentheses in their names, so you have to rename the aggregation attributes. Please have a look at the attached process.
Happy Mining!
~Marius
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
<process expanded="true" height="161" width="681">
<operator activated="true" class="generate_data" compatibility="5.3.000" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
<operator activated="true" class="aggregate" compatibility="5.3.000" expanded="true" height="76" name="Aggregate" width="90" x="179" y="30">
<list key="aggregation_attributes">
<parameter key="att1" value="sum"/>
<parameter key="att2" value="sum"/>
</list>
</operator>
<operator activated="true" class="rename" compatibility="5.3.000" expanded="true" height="76" name="Rename" width="90" x="313" y="30">
<parameter key="old_name" value="sum(att1)"/>
<parameter key="new_name" value="sum_att1"/>
<list key="rename_additional_attributes"/>
</operator>
<operator activated="true" class="rename" compatibility="5.3.000" expanded="true" height="76" name="Rename (2)" width="90" x="447" y="30">
<parameter key="old_name" value="sum(att2)"/>
<parameter key="new_name" value="sum_att2"/>
<list key="rename_additional_attributes"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.000" expanded="true" height="76" name="Generate Attributes" width="90" x="581" y="30">
<list key="function_descriptions">
<parameter key="value" value="(sum_att2-sum_att1) / sum_att1"/>
</list>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
<connect from_op="Aggregate" from_port="example set output" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Rename (2)" to_port="example set input"/>
<connect from_op="Rename (2)" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
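For anyone who wants to check the column-wise result outside RapidMiner, here is a minimal Python sketch. It uses the three (A, B) records from the question and the formula (sum(B)-sum(A))/sum(A). The variable names are illustrative and do not correspond to anything in the process above.

```python
# (A, B) records taken from the question in this thread.
rows = [(1, 2), (2, 3), (4, 1)]

# Row-wise ratio, the step Generate Attributes performs: C = (B - A) / A.
per_row = [(b - a) / a for a, b in rows]

# Column-wise sums, the step the Aggregate operator performs.
sum_a = sum(a for a, _ in rows)
sum_b = sum(b for _, b in rows)

# Apply the formula to the sums to get the overall ratio.
overall = (sum_b - sum_a) / sum_a

# For comparison: the plain mean of the per-row ratios is a different number.
mean_of_c = sum(per_row) / len(per_row)

print(per_row, round(overall, 2), mean_of_c)
```

The overall value rounds to -0.14, matching the extra row in the question. The plain mean of the per-row values is 0.25, which is why the sums have to be taken before the division.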