I have a data set that contains a numerical label, the predictions from two other models (not generated in RM), and several attributes (mostly numerical, but a few binomial or polynominal).
I suspect that certain values of attributes determine whether one model or the other is more accurate for a given instance, so I would like to create a weighted average of the two models' predictions, where the weights are based on the other attributes. The relationship between the attributes and the weights is what I want RM to learn.
In other words, I have data with attributes: actual, model1, model2, att1
and I want to model this as:
actual = f(att1) * model1 + (1 - f(att1)) * model2
I'm actually open to other approaches that can combine two (or more) predictions based on other attributes. This was just the simplest place to begin.
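To make the target model concrete, here is a rough sketch of it outside RM, in plain Python. It assumes the simplest possible form for the weight function, f(att1) = c0 + c1*att1, which is my assumption for illustration only. With that form, actual - model2 = f(att1) * (model1 - model2) is linear in c0 and c1, so ordinary least squares can fit it. The function names (make_data, fit_linear_gate, combine, rmse) are made up for this sketch, and the data generator only mimics the RM process further down.

```python
import math
import random


def make_data(n, seed=0):
    """Generate rows (actual, model1, model2, att1) that mimic the RM process:
    the true weight on model1 is sin(att1 * pi / 2), plus uniform noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        model1 = rng.uniform(0.0, 100.0)
        model2 = rng.uniform(0.0, 100.0)
        att1 = rng.random()
        w = math.sin(att1 * math.pi / 2)
        actual = w * model1 + (1 - w) * model2 + (rng.random() - 0.5) * 5
        rows.append((actual, model1, model2, att1))
    return rows


def fit_linear_gate(rows):
    """Least-squares fit of f(att1) = c0 + c1 * att1 (an assumed form) in
    actual - model2 = f(att1) * (model1 - model2)."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for actual, model1, model2, att1 in rows:
        d = model1 - model2
        x1, x2 = d, att1 * d          # regressors multiplying c0 and c1
        y = actual - model2           # regression target
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 * s12
    c0 = (b1 * s22 - s12 * b2) / det
    c1 = (s11 * b2 - s12 * b1) / det
    return c0, c1


def combine(model1, model2, att1, c0, c1):
    """Weighted average of the two predictions using the fitted weight."""
    w = c0 + c1 * att1
    return w * model1 + (1 - w) * model2


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    total = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return math.sqrt(total / len(targets))
```

A linear f is only a stand-in here; the generated data uses a sine, so a more flexible f (or a learner that handles several attributes) would presumably do better. That part is what I would like RM to learn.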
Two questions:
1) How can we get RM to learn a model of this type (assuming it's possible)?
2) Given that the values (model1, model2) are themselves predictions, is there any way to get RM to see them that way (which might open up Bagging or Boosting as ways to combine them)? There doesn't seem to be a way to define a basic learner that says "predict X to be the value of Y". DefaultLearner almost does this, but instead of the mean, median, or constant, I want it to be the value of another attribute.
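To illustrate what I mean in question 2, here is a small Python sketch of a "pass-through" learner. This is not an RM operator or API; the class name PassThroughLearner, its fit/predict methods, and the helper average_predictions are all hypothetical and exist only to show the idea of a learner whose prediction is copied from another attribute.

```python
class PassThroughLearner:
    """Hypothetical base learner: 'predict X to be the value of Y'."""

    def __init__(self, source_attribute):
        # Name of the attribute whose value is returned as the prediction.
        self.source_attribute = source_attribute

    def fit(self, examples, labels):
        # Nothing to learn: the prediction is copied from an attribute.
        return self

    def predict(self, examples):
        # Each example is assumed to be a dict of attribute name -> value.
        return [row[self.source_attribute] for row in examples]


def average_predictions(learners, examples):
    """Unweighted average of the learners' predictions, per example."""
    all_predictions = [learner.predict(examples) for learner in learners]
    return [sum(values) / len(values) for values in zip(*all_predictions)]
```

With something like this, model1 and model2 could each be wrapped as a learner and handed to whatever ensemble scheme expects learners rather than attributes.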
The following RM process creates some data that has the properties I described. The predictions from two models are given, but one model is more accurate, depending on the value of some third attribute.
Thanks,
Keith
<operator name="Root" class="Process" expanded="yes">
<operator name="ExampleSetGenerator" class="ExampleSetGenerator">
<parameter key="target_function" value="sum"/>
<parameter key="number_of_attributes" value="2"/>
<parameter key="attributes_lower_bound" value="0.0"/>
<parameter key="attributes_upper_bound" value="100.0"/>
</operator>
<operator name="Rename to Model1" class="ChangeAttributeName">
<parameter key="old_name" value="att1"/>
<parameter key="new_name" value="model1"/>
</operator>
<operator name="Rename to Model2" class="ChangeAttributeName">
<parameter key="old_name" value="att2"/>
<parameter key="new_name" value="model2"/>
</operator>
<operator name="AttributeConstruction" class="AttributeConstruction">
<list key="function_descriptions">
<parameter key="att1" value="rand()"/>
<parameter key="actual" value="sin(att1*pi/2)*model1 + (1-sin(att1*pi/2))*model2 + (rand()-0.5)*5"/>
<parameter key="diff1" value="abs(actual-model1)/abs(model1-model2)"/>
<parameter key="diff2" value="abs(actual-model2)/abs(model1-model2)"/>
</list>
</operator>
<operator name="ChangeAttributeRole" class="ChangeAttributeRole">
<parameter key="name" value="actual"/>
<parameter key="target_role" value="label"/>
</operator>
</operator>