Fitting 'by groups'

stereotaxon New Altair Community Member
edited November 5 in Community Q&A
Hi,

I have grouped data and I'd like to fit a model to each group's data. In SAS I do this with a BY <grouping var> statement, and in SPSS with a SPLIT FILE statement. Is there a way to do this in RapidMiner? At the moment I'm writing out each subgroup's data, which is extremely slow and difficult to maintain.

Thanks for your help,

Mike

Answers

  • IngoRM
    IngoRM New Altair Community Member
    Hi Mike,

    yes, this is possible using the ExampleFilter operator. Say you have three groups, "A", "B", and "C", which are specified in the attribute (column) "groups". You can combine the filter operator with the ParameterIteration operator, but then you have to define the groups manually. If you use the latest CVS version, this can be done much more comfortably with the new ValueSubgroupIterator operator, as in the following example:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="DataCreation" class="OperatorChain" expanded="no">
            <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
                <parameter key="target_function" value="sum classification"/>
            </operator>
            <operator name="BinDiscretization" class="BinDiscretization">
                <parameter key="number_of_bins" value="3"/>
                <parameter key="use_long_range_names" value="false"/>
            </operator>
            <operator name="AttributeValueMapper" class="AttributeValueMapper">
                <parameter key="attributes" value="att1"/>
                <parameter key="replace_by" value="A"/>
                <parameter key="replace_what" value="range1"/>
            </operator>
            <operator name="AttributeValueMapper (2)" class="AttributeValueMapper">
                <parameter key="attributes" value="att1"/>
                <parameter key="replace_by" value="B"/>
                <parameter key="replace_what" value="range2"/>
            </operator>
            <operator name="AttributeValueMapper (3)" class="AttributeValueMapper">
                <parameter key="attributes" value="att1"/>
                <parameter key="replace_by" value="C"/>
                <parameter key="replace_what" value="range3"/>
            </operator>
            <operator name="ChangeAttributeName" class="ChangeAttributeName">
                <parameter key="new_name" value="groups"/>
                <parameter key="old_name" value="att1"/>
            </operator>
        </operator>
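        <!-- Applies the inner operator to each subgroup defined by the values of "groups" -->
        <operator name="ValueSubgroupIterator" class="ValueSubgroupIterator" expanded="yes">
            <list key="attributes">
                <parameter key="groups" value="all"/>
            </list>
            <!-- The model fitted to each subgroup -->
            <operator name="DecisionTree" class="DecisionTree">
            </operator>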
        </operator>
    </operator>
    The first operator chain is only used for data creation. Please note that this requires the latest CVS version; how to access it is described here: http://rapid-i.com/content/view/25/48/
    Alternatively, you can simply wait, since we will probably make a new release next week.

    Cheers,
    Ingo
  • stereotaxon
    stereotaxon New Altair Community Member
    Thanks for your help. I was able to get it working before you responded by doing something like what you suggest. I have a ton of groups, so I created a dummy grouping variable (1-n) and then used an IteratingOperatorChain and the {a} macro to select each distinct group and run the analysis repeatedly.
    -Mike
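
For readers who want to see the underlying idea outside RapidMiner, here is a minimal Python sketch of fitting "by groups": split the rows on the grouping value once, in memory, and fit one model per subgroup. It is only an illustration, not what ValueSubgroupIterator does internally. The names `fit_line` and `fit_by_group`, the `(group, x, y)` row layout, and the sample data are all made up for this example, and a simple least-squares line stands in for the decision tree used in the process above.

```python
from collections import defaultdict
from statistics import mean


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values per group")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return slope, y_bar - slope * x_bar


def fit_by_group(rows):
    """rows: iterable of (group, x, y). Returns {group: (slope, intercept)}."""
    groups = defaultdict(list)
    for group, x, y in rows:
        # Collect each group's examples in memory instead of writing them out.
        groups[group].append((x, y))
    # One independent model per distinct group value.
    return {group: fit_line(points) for group, points in groups.items()}


# Hypothetical sample data with two groups.
rows = [
    ("A", 0.0, 1.0), ("A", 1.0, 3.0), ("A", 2.0, 5.0),
    ("B", 0.0, 4.0), ("B", 2.0, 2.0), ("B", 4.0, 0.0),
]
models = fit_by_group(rows)
for group, (slope, intercept) in sorted(models.items()):
    print(group, slope, intercept)
```

Because the split happens in a single pass, the number of groups does not need to be known in advance, which matters when there are many of them.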