SOMDimensionalityReduction and split attribute set
brianbaker
New Altair Community Member
I want to create an SOM classification and keep the attributes moving through the process stream. When I create the SOM I get back the ID, label, and SOM dimensions only. How do I keep the attributes?
More generally, is there a way to duplicate the attributes, send them into two separate processes and join them back into one set? I've found the join operators, but I haven't found a split.
Thanks!
<operator name="som" class="SOMDimensionalityReduction" breakpoints="after">
    <parameter key="return_preprocessing_model" value="true"/>
    <parameter key="number_of_dimensions" value="1"/>
    <parameter key="net_size" value="20"/>
    <parameter key="training_rounds" value="60"/>
</operator>
Answers
-
Hi,
Just use the IOMultiplier operator to copy the ExampleSet before applying the SOM. After that, join the original and the new ExampleSet. The IDs are used to identify the examples that belong together.
Greetings,
Sebastian
-
Thank you, that is very helpful!
I can merge the SOM dimension field back in:
However, I'd like to be able to split the stream, apply different manipulations to each piece, and then merge them. When I try, the first data set always goes into the chain.
<operator name="build som" class="OperatorChain" expanded="yes">
    <operator name="IOMultiplier" class="IOMultiplier">
        <parameter key="io_object" value="ExampleSet"/>
        <parameter key="multiply_type" value="multiply_all"/>
    </operator>
    <operator name="SOMDimensionalityReduction" class="SOMDimensionalityReduction">
        <parameter key="number_of_dimensions" value="1"/>
        <parameter key="net_size" value="15"/>
    </operator>
    <operator name="ExampleSetJoin" class="ExampleSetJoin">
    </operator>
</operator>
Is there a way to reorder the data sets so that I can run each through a different set of operations?
<operator name="featureFix" class="OperatorChain" breakpoints="after" expanded="no">
    <description text="create derived features, transform nominal to numeric, and remove correlated ones"/>
    <operator name="Normalization" class="Normalization">
        <parameter key="return_preprocessing_model" value="true"/>
        <parameter key="create_view" value="true"/>
    </operator>
    <operator name="Nominal2Binominal" class="Nominal2Binominal">
    </operator>
    <operator name="FeatureNameFilter" class="FeatureNameFilter">
        <parameter key="skip_features_with_name" value="gender = F"/>
    </operator>
    <operator name="Nominal2Numerical" class="Nominal2Numerical">
    </operator>
    <operator name="AttributeConstruction" class="AttributeConstruction">
        <list key="function_descriptions">
            <parameter key="ageAdjPushup" value="pushupPre / age"/>
            <parameter key="ageAdjSitup" value="situpPre / age"/>
        </list>
    </operator>
    <operator name="RemoveCorrelatedFeatures" class="RemoveCorrelatedFeatures">
    </operator>
</operator>
<operator name="build som" class="OperatorChain" breakpoints="after" expanded="yes">
    <operator name="SOMDimensionalityReduction" class="SOMDimensionalityReduction" breakpoints="after">
        <parameter key="number_of_dimensions" value="1"/>
        <parameter key="net_size" value="15"/>
    </operator>
</operator>
<operator name="ExampleSetJoin" class="ExampleSetJoin" breakpoints="after">
</operator>
-
Hi,
Yes. You can use the IOSelector to push one of the data sets on top of the stack of objects.
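A minimal sketch of how that could look, reusing the operators from the process above. The chain name "split and merge", the reduced contents of the second branch, and the index given to select_which are assumptions for illustration only. The IOSelector parameter names (io_object, select_which) are given as I recall them from the RapidMiner 4.x operator reference, so please check them against your version, and use breakpoints to see which ExampleSet actually ends up in front.

```xml
<operator name="split and merge" class="OperatorChain" expanded="yes">
    <!-- Copy the incoming ExampleSet so that two copies are available. -->
    <operator name="IOMultiplier" class="IOMultiplier">
        <parameter key="io_object" value="ExampleSet"/>
        <parameter key="multiply_type" value="multiply_all"/>
    </operator>
    <!-- Branch 1: consumes the first ExampleSet it finds. -->
    <operator name="build som" class="OperatorChain" expanded="yes">
        <operator name="SOMDimensionalityReduction" class="SOMDimensionalityReduction">
            <parameter key="number_of_dimensions" value="1"/>
            <parameter key="net_size" value="15"/>
        </operator>
    </operator>
    <!-- Bring the untouched copy to the front.
         Assumption: it is the second ExampleSet at this point; adjust select_which if not. -->
    <operator name="IOSelector" class="IOSelector">
        <parameter key="io_object" value="ExampleSet"/>
        <parameter key="select_which" value="2"/>
    </operator>
    <!-- Branch 2: placeholder for the second set of manipulations. -->
    <operator name="featureFix" class="OperatorChain" expanded="yes">
        <operator name="Normalization" class="Normalization">
        </operator>
    </operator>
    <!-- Merge both branches again; examples are matched by their IDs. -->
    <operator name="ExampleSetJoin" class="ExampleSetJoin">
    </operator>
</operator>
```

Both branches need to leave the ID attribute untouched, otherwise the join has nothing to match on.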
Or simply wait for RapidMiner 5. It will make all of this unnecessary and much more intuitive because of the explicit flow layout.
Greetings,
Sebastian
-
Nice! When will it be released?
-
Hi,
We are going to publish the final version in mid-December, since it's definitely something you have to put under the Christmas tree.
Greetings,
Sebastian