Xvalidation
evgeny
New Altair Community Member
Hi,
I am a Rapid-I novice.
I want to train a model on one specific set of data and then test it on another specific set. From what I can see so far, there is a roundabout way of doing this:
Xvalidation with sampling_type = "linear sampling" and number_of_validations = 2
However, this requires that the training and test sets have the same number of examples and are in a particular order.
Is there a more general / sensible way of doing this? In particular, can I base the sampling on one of the data attributes?
Many thanks, Evgeny.
Answers
-
G'Day Evgeny,
Welcome to the world of countless combinations! Sure, linear sampling in a validation wrapper would work, with the limitations you spotted, but you can always go freestyle...
1. Filter your examples by attribute value to make a training set.
2. Add a learner to make a model, but do not keep the training examples.
3. Load your test set and apply the model.
4. Have a beer, and examine the results.
Actually, the XVal operators just bundle this up so you can test repeatedly, but they don't stop for the beer (though you can always insert a breakpoint for that ;D).
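If it helps to see the idea outside the GUI, the four steps above can be sketched in plain Python. This is only an illustration, not RapidMiner code: the toy grid data, the helper names (make_examples, train_stump, apply_stump), and the one-split "stump" learner are all assumptions made up for the sketch.

```python
def make_examples():
    """Build a small deterministic toy data set (hypothetical, for illustration only)."""
    rows = []
    for i in range(-5, 6):
        for j in range(-5, 6):
            att1, att2 = i / 5, j / 5
            # The label depends only on att2, so a single split can learn it.
            rows.append({"att1": att1, "att2": att2, "label": att2 > 0.1})
    return rows


def train_stump(rows, attribute):
    """Return the threshold on `attribute` that makes the fewest training errors.

    A one-split "stump" stands in for a real tree learner here.
    """
    best_threshold, best_errors = None, len(rows) + 1
    for candidate in sorted({r[attribute] for r in rows}):
        errors = sum((r[attribute] >= candidate) != r["label"] for r in rows)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


def apply_stump(threshold, rows, attribute):
    """Predict True for every row whose attribute value reaches the threshold."""
    return [r[attribute] >= threshold for r in rows]


examples = make_examples()

# 1. Filter by attribute value to make the training set.
train = [r for r in examples if r["att1"] > 0]
# 2. Learn a model from the training examples only.
threshold = train_stump(train, "att2")
# 3. Build the test set with a different filter and apply the model.
test = [r for r in examples if r["att1"] < 0]
predictions = apply_stump(threshold, test, "att2")
# 4. Examine the results.
accuracy = sum(p == r["label"] for p, r in zip(predictions, test)) / len(test)
print(f"train={len(train)} test={len(test)} threshold={threshold} accuracy={accuracy}")
```

The train and test filters are just two independent conditions on the same attribute, so the two sets need not be the same size or in any particular order. Rows with att1 exactly 0 simply end up in neither set.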
Good luck!
-
Thanks for the quick response. I don't suppose you could post an example of points 1-3, i.e. how one would do it in practice?
-
Just this once
<operator name="Root" class="Process" expanded="yes">
    <description text="Check comments tab!"/>
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="simple non linear classification"/>
    </operator>
    <operator name="Copy exampleset" class="IOMultiplier">
        <description text="The Learner will consume the examples, so keep a copy for later."/>
        <parameter key="io_object" value="ExampleSet"/>
    </operator>
    <operator name="1 Train on Att1 positive" class="ExampleFilter" breakpoints="after">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="att1>0"/>
    </operator>
    <operator name="2 Make model" class="ID3Numerical">
    </operator>
    <operator name="3 Test on Att1 negative" class="ExampleFilter" breakpoints="after">
        <description text="As the learner has consumed the original exampleset, the copy made in step two is on top of the stack."/>
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="att1&lt;0"/>
    </operator>
    <operator name="Get Beer" class="ModelApplier">
        <parameter key="keep_model" value="true"/>
        <list key="application_parameters">
        </list>
    </operator>
</operator>
-
Thanks - that's very helpful.