Iterating through example subsets

User: "darkobodnaruk"
New Altair Community Member
Updated by Jocelyn
Hi, is there a way to create a process to do the following:

- start with a dataset with 5,000 examples
- iterate through subsets of 1,000 examples (sequentially: examples 0-999, then 1000-1999, and so on) and run the same classification algorithm on each subset
- write the results and performance of each subset to a file
- (if possible, average performance over all subsets)
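
For illustration only, the steps listed above can be sketched outside RapidMiner in plain Python. This is a minimal sketch, not a RapidMiner process: the function names `chunk_ranges`, `evaluate_chunks`, and `results_to_csv` are hypothetical, and `evaluate` is a placeholder callback standing in for "train and score the classifier on this subset".

```python
import csv
import io
from statistics import mean


def chunk_ranges(n_examples, chunk_size):
    """Yield inclusive (first, last) index pairs for consecutive chunks."""
    for first in range(0, n_examples, chunk_size):
        last = min(first + chunk_size, n_examples) - 1
        yield first, last


def evaluate_chunks(examples, chunk_size, evaluate):
    """Score each consecutive chunk with `evaluate`; return rows and the mean score.

    `evaluate` is a hypothetical callback: it receives one subset and returns
    a single number (for example, accuracy).
    """
    rows = []
    for first, last in chunk_ranges(len(examples), chunk_size):
        subset = examples[first:last + 1]  # slice end is exclusive, hence +1
        rows.append((first, last, evaluate(subset)))
    average = mean(score for _, _, score in rows)
    return rows, average


def results_to_csv(rows, average):
    """Render the per-chunk scores plus the overall average as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["first_example", "last_example", "score"])
    writer.writerows(rows)
    writer.writerow(["average", "", average])
    return buffer.getvalue()
```

With 5,000 examples and a chunk size of 1,000, `chunk_ranges` yields exactly five ranges, from (0, 999) to (4000, 4999). The text returned by `results_to_csv` can then be written to a file with an ordinary `open(..., "w")`.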

I know about ExampleRangeFilter. I'm guessing it might have something to do with macros, where you can define variable parameters, but I don't know how to set up a loop/iteration.

I'm experimenting with ParameterIteration now, but if I want to vary two parameters, first_example (0, 1000, 2000...) and last_example (999, 1999...) for ExampleRangeFilter, I get 5x5=25 iterations instead of only 5...
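
The 25-versus-5 count can be reproduced in a few lines of Python. This sketch assumes that the two parameter lists are combined as a full grid (every `first_example` with every `last_example`), which is what the 25 iterations suggest; it does not call RapidMiner. Deriving `last_example` from `first_example` instead of varying both independently gives the five intended ranges.

```python
from itertools import product

CHUNK = 1000
firsts = list(range(0, 5000, CHUNK))       # 0, 1000, 2000, 3000, 4000
lasts = [f + CHUNK - 1 for f in firsts]    # 999, 1999, 2999, 3999, 4999

# Varying both parameters independently: every first with every last.
grid = list(product(firsts, lasts))

# Varying one parameter and deriving the other: each first with its own last.
paired = list(zip(firsts, lasts))

print(len(grid), len(paired))
```

Most of the grid combinations are not meaningful ranges at all (for example, first 4000 with last 999), so a single loop variable with a derived end index is the simpler design.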

regards,
darko

    User: "haddock"
    New Altair Community Member
    Hi,

    What you describe is called validation, like this...
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="random classification"/>
            <parameter key="number_examples" value="5000"/>
        </operator>
        <operator name="SlidingWindowValidation" class="SlidingWindowValidation" expanded="yes">
            <operator name="LibSVMLearner" class="LibSVMLearner">
                <list key="class_weights">
                </list>
            </operator>
            <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                <operator name="ModelApplier" class="ModelApplier">
                    <list key="application_parameters">
                    </list>
                </operator>
                <operator name="ClassificationPerformance" class="ClassificationPerformance" breakpoints="after">
                    <parameter key="accuracy" value="true"/>
                    <list key="class_weights">
                    </list>
                </operator>
            </operator>
        </operator>
    </operator>
    There are lots of examples under Help->RapidMiner Tutorial.

    User: "darkobodnaruk"
    New Altair Community Member
    OP
    Now that you put it like that, it IS validation. Not sure why I tried to make it more complicated. :)

    But I didn't know about SlidingWindowValidation before, thanks a lot!