"RM 4.3 Feature Generation problem"

keith
keith New Altair Community Member
edited November 5 in Community Q&A

I've installed the new RM 4.3 EE release, but the FeatureGeneration operator no longer recognizes generated features when they are used in later steps.

This example adds one to the first attribute (successfully) to generate a new "plusone" attribute.  Then, it tries to use "plusone" in a subsequent step, but RM returns an error that "plusone" doesn't exist.
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="attributes_lower_bound" value="0.0"/>
        <parameter key="attributes_upper_bound" value="1.0"/>
        <parameter key="target_function" value="random"/>
    </operator>
    <operator name="FeatureGeneration" class="FeatureGeneration">
        <list key="functions">
          <parameter key="plusone" value="+(att1,const[1]())"/>
          <parameter key="nextval" value="+(plusone,att2)"/>
        </list>
        <parameter key="keep_all" value="true"/>
    </operator>
</operator>

I then tried splitting the computation across two FeatureGeneration operators, but it yielded the same error:

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="attributes_lower_bound" value="0.0"/>
        <parameter key="attributes_upper_bound" value="1.0"/>
        <parameter key="target_function" value="random"/>
    </operator>
    <operator name="FeatureGeneration" class="FeatureGeneration">
        <list key="functions">
          <parameter key="plusone" value="+(att1,const[1]())"/>
        </list>
        <parameter key="keep_all" value="true"/>
    </operator>
    <operator name="FeatureGeneration (2)" class="FeatureGeneration">
        <list key="functions">
          <parameter key="nextval" value="+(plusone,att2)"/>
        </list>
        <parameter key="keep_all" value="true"/>
    </operator>
</operator>

This appears to be a regression from 4.2, as I had a working process in RM 4.2 that now fails.  Also note that I had to use the prefix notation to get the computation to work, not the infix notation that is supposed to be in RM 4.3.  Either my installation is messed up (perhaps from an incomplete uninstall?), or there's a bug in RM 4.3.

Thanks,
Keith

Answers

  • IngoRM
    IngoRM New Altair Community Member
    Hi Keith,

    You are right: we had to remove the ability to directly re-use freshly created attributes within the FeatureGeneration operator, since it unfortunately caused bugs in other settings. It was quite unpredictable (even for us, who always work on predictions ;) ) in which cases everything would work and in which it would not, so we decided to remove this feature.

    This was also supported by the decision to create a new operator, "AttributeConstruction", which should now be preferred over the FeatureGeneration operator for constructing new attributes. "AttributeConstruction" is also the operator that supports infix formulas and nicer constants (just "1" instead of "const[1]()"), as well as many new functions, including a very nice if-function, e.g.

    if (attribute1 > 5, sin(attribute2), cos(attribute3))

    which will create a new attribute with the value sin(attribute2) if the value of attribute1 is larger than 5 and cos(attribute3) otherwise. Even nominal values are supported in this if-statement, e.g.

    if (attribute1 == "dog", attribute2 + attribute3, 42)
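
To make the row-wise behaviour of the two if-formulas above concrete, here is a minimal Python sketch. It is not RapidMiner code: the helper `if_function` and the example rows are hypothetical and only mimic the semantics described above, namely that each example gets the second argument when the condition holds and the third otherwise.

```python
import math


def if_function(rows, condition, then_value, else_value):
    """Build one new attribute value per row, mimicking if(condition, a, b).

    Hypothetical helper for illustration only; it is not part of RapidMiner.
    """
    return [then_value(row) if condition(row) else else_value(row) for row in rows]


# Made-up example rows; the attribute names follow the formulas above.
rows = [
    {"attribute1": 7, "attribute2": 0.0, "attribute3": 0.0},
    {"attribute1": 2, "attribute2": 0.0, "attribute3": 0.0},
]

# if (attribute1 > 5, sin(attribute2), cos(attribute3))
numeric_result = if_function(
    rows,
    lambda r: r["attribute1"] > 5,
    lambda r: math.sin(r["attribute2"]),
    lambda r: math.cos(r["attribute3"]),
)

# Made-up rows with a nominal first attribute.
pets = [
    {"attribute1": "dog", "attribute2": 2, "attribute3": 3},
    {"attribute1": "cat", "attribute2": 2, "attribute3": 3},
]

# if (attribute1 == "dog", attribute2 + attribute3, 42)
nominal_result = if_function(
    pets,
    lambda r: r["attribute1"] == "dog",
    lambda r: r["attribute2"] + r["attribute3"],
    lambda r: 42,
)

print(numeric_result)  # [0.0, 1.0]
print(nominal_result)  # [5, 42]
```

In the first call only the first row has attribute1 larger than 5, so it takes the sine branch and the second row takes the cosine branch. In the second call only the "dog" row gets the sum; the other row falls back to the constant 42.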


    So why did we keep the old FeatureGeneration operator at all (it is not even marked as deprecated)? The reason is simple: it is faster on really large data sets, so we decided to keep it, but we unfortunately had to remove the ability to re-use just-created attributes.

    Hope that clarifies things about feature construction a bit.

    Cheers,
    Ingo
  • keith
    keith New Altair Community Member
    Thanks for the clarification, Ingo.  That makes sense, although it's unfortunate that backward compatibility couldn't be maintained.  I now have to rewrite several existing processes that no longer run properly under RM 4.3. :(
  • IngoRM
    IngoRM New Altair Community Member
    Hello Keith,

    Yes, that's a PITA, sorry about that. If you have not yet rewritten all of your processes, we could also try to include the old functionality of the FeatureGeneration operator under a new name, e.g. "FeatureGenerationDeprecated", and deliver it with the next EE update. Then it would simply be a matter of replacing every "FeatureGeneration" with "FeatureGenerationDeprecated", which might be easier. Of course, the deprecated operator would be removed in some future version, but it would give you more time to update your processes. Please let me know if this would be useful; I would then ask one of our developers to add this "new" (old) operator.

    Cheers,
    Ingo
  • keith
    keith New Altair Community Member
    Thanks for the offer, Ingo.  At the moment, I'll probably just bite the bullet and rewrite my processes now.  I'd much rather have you guys working on new features!  But if it's possible in the future to preserve some modicum of backward compatibility when making other changes, it would be appreciated.  :-)

    Keith
  • IngoRM
    IngoRM New Altair Community Member

    "But if it's possible in the future to preserve some modicum of backward compatibility when making other changes, it would be appreciated."

    We will do our best  :)

    Cheers,
    Ingo