AttributeConstruction + average(X1)
Shubha
Hi,
I have two variables in my ExampleSet, 'X1' and 'average(X1)'. The variable 'average(X1)' was created by RM. Now I want to do an 'AttributeConstruction' based on 'average(X1)', for example (X1-average(X1))^2. But I get the error: 'Unrecognized Symbol "average" Syntax error (implicit multiplication not enabled)'.
How do I make this work?
Thanks, Shubha
Answers
RM doesn't like 'average(X)' as an attribute name; rename it to something else. See the process below, where the OperatorSelector switches between using 'average(X)' directly and renaming it first:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="number_examples" value="200"/>
        <parameter key="target_function" value="random"/>
    </operator>
    <operator name="ChangeAttributeName" class="ChangeAttributeName">
        <parameter key="new_name" value="X1"/>
        <parameter key="old_name" value="att1"/>
    </operator>
    <operator name="ChangeAttributeName (2)" class="ChangeAttributeName">
        <parameter key="new_name" value="average(X)"/>
        <parameter key="old_name" value="att2"/>
    </operator>
    <operator name="OperatorSelector" class="OperatorSelector" expanded="yes">
        <parameter key="select_which" value="2"/>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="AttributeConstruction" class="AttributeConstruction">
                <list key="function_descriptions">
                    <parameter key="Mmm" value="X1+average(X)"/>
                </list>
            </operator>
        </operator>
        <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
            <operator name="ChangeAttributeName (3)" class="ChangeAttributeName">
                <parameter key="new_name" value="average"/>
                <parameter key="old_name" value="average(X)"/>
            </operator>
            <operator name="AttributeConstruction (2)" class="AttributeConstruction">
                <list key="function_descriptions">
                    <parameter key="Mmm" value="X1+average"/>
                </list>
            </operator>
        </operator>
    </operator>
</operator>
Oh no... I have too many attributes like this: average(X1), average(X2), and so on.
I was wondering: if that is not a valid name for RM to operate on, why does the 'Aggregation' operator deliver the aggregate under exactly that name (the one with brackets, average(X1))?
Now this has to be done for all the attributes, and the attribute names can be anything.
Thanks, Shubha
Hi,
I agree, if RM makes an attribute, it should be usable. No doubt that will get changed, but in the meantime use a regex to remove the brackets in the attribute name, like this:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="number_examples" value="200"/>
        <parameter key="number_of_attributes" value="4"/>
        <parameter key="target_function" value="random"/>
    </operator>
    <operator name="ChangeAttributeName" class="ChangeAttributeName">
        <parameter key="new_name" value="X1"/>
        <parameter key="old_name" value="att1"/>
    </operator>
    <operator name="ChangeAttributeName (2)" class="ChangeAttributeName">
        <parameter key="new_name" value="X2"/>
        <parameter key="old_name" value="att2"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="at.*"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <parameter key="deliver_inner_results" value="true"/>
        <operator name="BinDiscretization" class="BinDiscretization">
            <parameter key="range_name_type" value="short"/>
        </operator>
    </operator>
    <operator name="Aggregation" class="Aggregation">
        <list key="aggregation_attributes">
            <parameter key="X1" value="average"/>
            <parameter key="X2" value="average"/>
        </list>
        <parameter key="group_by_attributes" value="at.*"/>
    </operator>
    <operator name="ChangeAttributeNamesReplace" class="ChangeAttributeNamesReplace">
        <parameter key="apply_on_special" value="false"/>
        <parameter key="attributes" value="av.*"/>
        <parameter key="replace_what" value="\(|\)"/>
    </operator>
    <operator name="AttributeConstruction" class="AttributeConstruction">
        <list key="function_descriptions">
            <parameter key="New_Att" value="averageX1+averageX2"/>
        </list>
    </operator>
</operator>
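Applied to the expression from the original question, the same renaming step might look like the fragment below. This is only a sketch, not a tested process: it assumes the incoming ExampleSet already contains both 'X1' and 'average(X1)', that ChangeAttributeNamesReplace simply deletes the matched characters when no replacement is given (as in the process above), and the name 'sq_dev_X1' is made up for illustration. The fragment is meant to be placed inside an existing process, not run on its own.

```xml
<!-- Sketch only: assumes the ExampleSet holds X1 and average(X1). -->
<!-- Step 1: strip the brackets, so average(X1) becomes averageX1. -->
<operator name="ChangeAttributeNamesReplace" class="ChangeAttributeNamesReplace">
    <parameter key="attributes" value="average.*"/>
    <parameter key="replace_what" value="\(|\)"/>
</operator>
<!-- Step 2: build the squared deviation; sq_dev_X1 is a hypothetical name. -->
<operator name="AttributeConstruction" class="AttributeConstruction">
    <list key="function_descriptions">
        <parameter key="sq_dev_X1" value="(X1-averageX1)^2"/>
    </list>
</operator>
```

Because the filter is a regex, one renaming operator should cover average(X1), average(X2) and so on; only the construction expressions have to be listed per attribute.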
Hi,
That's true. Unfortunately, the parentheses go back to the first version of RapidMiner in 2001, and we cannot simply change the output names without breaking compatibility. So we would have to write a parser for the processes etc., and that is not easily done.
Until then, however, there are two new helper operators to overcome those naming issues:
- ChangeAttributeNamesReplace: replaces characters in matching attribute names, e.g. all non-word characters by an underscore
- ChangeAttributeNames2Generic: replaces the matching attribute names by generic names
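As a rough illustration of the first helper, the fragment below replaces every non-word character in the matching attribute names by an underscore. Treat it as a sketch under assumptions: the 'replace_by' parameter key does not appear elsewhere in this thread and should be checked against the operator's parameter list in your RM version, and the filter 'average.*' is just an example.

```xml
<!-- Sketch only: the replace_by key is an assumption, verify it in your RM version. -->
<!-- Intended effect: every non-word character (\W) in matching names becomes "_". -->
<operator name="ChangeAttributeNamesReplace" class="ChangeAttributeNamesReplace">
    <parameter key="attributes" value="average.*"/>
    <parameter key="replace_what" value="\W"/>
    <parameter key="replace_by" value="_"/>
</operator>
```

If every match is replaced, 'average(X1)' would end up as 'average_X1_', which no longer contains brackets and so should be usable in an AttributeConstruction expression.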
Cheers,
Ingo