Nesting ValueSubgroupIterators
keith
Is it possible to nest ValueSubgroupIterators inside of one another so you can cycle over the values of more than one nominal attribute? The operator info indicates that you can't combine multiple attributes within the definition of a single ValueSubgroupIterator, but it seemed like if you had the attributes separated and nested into their own nodes, it should work.
Here's an example of what I'm trying to do. I want to normalize the data once, before modeling each subgroup separately, and then apply both the preprocessing model and the regression model to a new example set. The problem is that the preprocessing model disappears once you reach the second nested ValueSubgroupIterator, so when I try to use ModelGrouper to create a combined model, it fails.
Thanks for any ideas,
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="sum"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="att1|att2"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <operator name="BinDiscretization" class="BinDiscretization">
            <parameter key="number_of_bins" value="3"/>
        </operator>
    </operator>
    <operator name="Normalization" class="Normalization">
        <parameter key="return_preprocessing_model" value="true"/>
    </operator>
    <operator name="ValueSubgroupIterator" class="ValueSubgroupIterator" expanded="yes">
        <list key="attributes">
            <parameter key="att1" value="all"/>
        </list>
        <operator name="ValueSubgroupIterator (2)" class="ValueSubgroupIterator" expanded="yes">
            <list key="attributes">
                <parameter key="att2" value="all"/>
            </list>
            <operator name="W-LinearRegression" class="W-LinearRegression">
                <parameter key="keep_example_set" value="true"/>
            </operator>
            <operator name="ModelGrouper" class="ModelGrouper">
            </operator>
            <operator name="ExampleSetGenerator (2)" class="ExampleSetGenerator">
                <parameter key="target_function" value="sum"/>
            </operator>
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
                <parameter key="keep_model" value="true"/>
            </operator>
        </operator>
    </operator>
</operator>
Keith
Answers
Hi Keith,
The problem is that models are not passed to the inner operators of a ValueSubgroupIterator. The only way to get the model inside is to write it to a file and reload it within the ValueSubgroupIterator.
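A rough sketch of the file-based workaround, built on Keith's process. The operator classes ModelWriter and ModelLoader and their model_file parameter are assumptions here and should be checked against the operator reference of your version; the file name normalization.mod is purely hypothetical.

```xml
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="sum"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="att1|att2"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <operator name="BinDiscretization" class="BinDiscretization">
            <parameter key="number_of_bins" value="3"/>
        </operator>
    </operator>
    <operator name="Normalization" class="Normalization">
        <parameter key="return_preprocessing_model" value="true"/>
    </operator>
    <!-- Assumed operator and parameter names: save the preprocessing model
         before entering the iterators. The file name is hypothetical. -->
    <operator name="ModelWriter" class="ModelWriter">
        <parameter key="model_file" value="normalization.mod"/>
    </operator>
    <operator name="ValueSubgroupIterator" class="ValueSubgroupIterator" expanded="yes">
        <list key="attributes">
            <parameter key="att1" value="all"/>
        </list>
        <operator name="ValueSubgroupIterator (2)" class="ValueSubgroupIterator" expanded="yes">
            <list key="attributes">
                <parameter key="att2" value="all"/>
            </list>
            <!-- Assumed operator and parameter names: reload the saved model
                 so it is available inside the innermost iterator. -->
            <operator name="ModelLoader" class="ModelLoader">
                <parameter key="model_file" value="normalization.mod"/>
            </operator>
            <operator name="W-LinearRegression" class="W-LinearRegression">
                <parameter key="keep_example_set" value="true"/>
            </operator>
            <operator name="ModelGrouper" class="ModelGrouper">
            </operator>
        </operator>
    </operator>
</operator>
```

The model is reloaded once per subgroup, which costs a little time but keeps the process simple. Whether ModelGrouper then combines the two models in the order you need is worth checking on a small example.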
Another possibility would be to do it like this, though whether it is appropriate depends on the task at hand:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="sum"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="att1|att2"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <parameter key="deliver_inner_results" value="true"/>
        <operator name="BinDiscretization" class="BinDiscretization">
            <parameter key="number_of_bins" value="3"/>
            <parameter key="range_name_type" value="short"/>
            <parameter key="use_long_range_names" value="false"/>
        </operator>
    </operator>
    <operator name="Normalization" class="Normalization">
    </operator>
    <operator name="ValueSubgroupIterator" class="ValueSubgroupIterator" expanded="yes">
        <list key="attributes">
            <parameter key="att1" value="all"/>
        </list>
        <operator name="ValueSubgroupIterator (2)" class="ValueSubgroupIterator" expanded="yes">
            <list key="attributes">
                <parameter key="att2" value="all"/>
            </list>
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="leave_one_out" value="true"/>
                <operator name="W-LinearRegression" class="W-LinearRegression">
                    <parameter key="keep_example_set" value="true"/>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                        <parameter key="keep_model" value="true"/>
                    </operator>
                    <operator name="RegressionPerformance" class="RegressionPerformance">
                        <parameter key="root_mean_squared_error" value="true"/>
                    </operator>
                </operator>
            </operator>
        </operator>
    </operator>
</operator>
Greetings,
Sebastian
Hi,
Just a small side note: in the upcoming release (4.3) you could also make use of the new operators "IOStorer" and "IORetrieval" instead of writing things to files. These operators store (and retrieve) arbitrary objects at arbitrary points of the process under a specified name. This was actually an idea Steffen gave us some time ago, and it really extends what is possible in RapidMiner processes. Version 4.3 will be released tomorrow, or during the weekend at the latest.
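As a rough idea of how that could look for the normalization model, here is a sketch only. The parameter keys name and io_object, the store name norm_model, and the class name IORetriever (the retrieval operator may be registered under a slightly different name than the one quoted above) are all assumptions to verify against the 4.3 operator reference.

```xml
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="sum"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="att1"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <operator name="BinDiscretization" class="BinDiscretization">
            <parameter key="number_of_bins" value="3"/>
        </operator>
    </operator>
    <operator name="Normalization" class="Normalization">
        <parameter key="return_preprocessing_model" value="true"/>
    </operator>
    <!-- Assumed parameter keys; "norm_model" is a hypothetical store name. -->
    <operator name="IOStorer" class="IOStorer">
        <parameter key="name" value="norm_model"/>
        <parameter key="io_object" value="Model"/>
    </operator>
    <operator name="ValueSubgroupIterator" class="ValueSubgroupIterator" expanded="yes">
        <list key="attributes">
            <parameter key="att1" value="all"/>
        </list>
        <!-- Assumed class name and parameter keys: fetch the stored model
             inside the iterator instead of reloading it from a file. -->
        <operator name="IORetriever" class="IORetriever">
            <parameter key="name" value="norm_model"/>
            <parameter key="io_object" value="Model"/>
        </operator>
        <operator name="W-LinearRegression" class="W-LinearRegression">
            <parameter key="keep_example_set" value="true"/>
        </operator>
        <operator name="ModelGrouper" class="ModelGrouper">
        </operator>
    </operator>
</operator>
```

The structure is the same as the file-based approach; only the write and reload steps are swapped for a named in-process store.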
Cheers,
Ingo