Decision Tree Parser

Legacy User
Legacy User New Altair Community Member
edited November 5 in Community Q&A
Hi,

As far as I can see, RapidMiner 4.2 does not provide a module to dump the tree
representation (available in the "Text View") of decision trees and
random forests into equivalent programming language constructs. In particular,
I'm interested in a tree parser for C/C++.

I think this would be a very useful feature (coincidentally, I also saw a
related post a couple of days ago) when you generate decision
trees that you afterwards want to use in your application. This is common
practice in some domains, and I think people would be very happy about it,
in particular because R does not support this yet either.
Currently, you must do this translation by hand, which is quite cumbersome and error-prone.

I would highly appreciate this extension. :-)

Thank you.

Cheers,
Stephan

Answers

  • haddock
    haddock New Altair Community Member
    Hi,

    Could you not make use of the Tree2RuleConverter (section 5.4.65 in the manual)?

  • Legacy User
    Legacy User New Altair Community Member
    No idea whether Tree2RuleConverter can be used.  ;)
    The documentation is somewhat short at this point.

    Did you ever use it or do you have an idea how Tree2RuleConverter
    could be applied in particular for my problem (maybe you have an
    XML example)?
  • land
    land New Altair Community Member
    Hi Stephan,
    As far as I know, this feature does not exist yet. And I'm not quite sure whether it will be added in the future, since RapidMiner is built to be used as a library for exactly such things. You could easily translate the decision into an example, pass it to the TreeModel, and get the decision back. If you are an experienced C++ programmer, this should be done before breakfast.
    On the other hand, with that experience you should also be able to write a program that recursively translates RapidMiner's TreeModel into a C++ if/then program.
    The easiest solution, though it involves a little manual work, would be to rewrite the toString method in TreeModel, or rather in TreeNode. This method is already recursive and needs only small changes.

    If you want it entirely without manual interaction, you could reimplement
    public final void write(OutputStream out) throws IOException
    within TreeModel, so that it outputs an appropriate representation (the method is declared final, so this means changing the TreeModel source rather than subclassing).
    Then you could use IOObjectWriter to write the model to a file. If you use XML encoding without compression, this would be your desired result.
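
For illustration, here is a minimal C++ sketch of the recursive translation into an if/then program mentioned above. The `Node` struct and the functions `emit` and `generateClassifier` are hypothetical stand-ins, not RapidMiner's TreeModel or TreeNode API, and the sketch assumes the tree has already been read into memory by some other means and that each branch condition is stored as a ready-made C++ expression.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory tree; NOT RapidMiner's TreeModel/TreeNode classes.
struct Node {
    std::string condition;       // C++ test on the edge into this node; empty for the root
    std::string label;           // predicted class; only meaningful for leaves
    std::vector<Node> children;  // empty for leaves (vector of incomplete type: C++17)
};

// Recursively writes one node: a leaf becomes a return statement,
// an inner node becomes one if-block per outgoing branch.
void emit(const Node& node, int depth, std::ostream& out) {
    assert(depth >= 0);
    const std::string indent(static_cast<std::size_t>(depth) * 4, ' ');
    if (node.children.empty()) {
        out << indent << "return \"" << node.label << "\";\n";
        return;
    }
    for (const Node& child : node.children) {
        out << indent << "if (" << child.condition << ") {\n";
        emit(child, depth + 1, out);
        out << indent << "}\n";
    }
}

// Wraps the generated if-blocks in a function with the given signature.
// The generated function returns "" when no branch matches.
std::string generateClassifier(const Node& root, const std::string& signature) {
    std::ostringstream out;
    out << signature << " {\n";
    emit(root, 1, out);
    out << "    return \"\";  // no branch matched\n";
    out << "}\n";
    return out.str();
}
```

Calling `generateClassifier(root, "const char* classify(const std::string& outlook)")` returns the source text of a classifier function, which can then be written to a file and compiled into the application.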

    Greetings,
      Sebastian
  • Legacy User
    Legacy User New Altair Community Member
    Hi,

    I don't know exactly how I could use the Java RapidMiner library in my C++ project.
    My idea was to invoke RapidMiner (for example via "system") from my application
    with "java -jar rapidminer.jar myfile.XML". Given the XML file, RapidMiner
    should learn the random forest classifier and produce a file that the
    model is dumped to, so that I can parse it afterwards in my C++ application.
    Is this a good idea or do you know a better solution for using the Java library
    in my C++ program?
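
A minimal sketch of that invocation from C++, using the command line quoted above. The helper names `buildCommand` and `runRapidMiner` are made up for illustration, and whether RapidMiner accepts the process file as its only argument in this way is an assumption taken from this post, not something verified here.

```cpp
#include <cstdlib>
#include <string>

// Builds the command line quoted in the post. The jar name and argument
// layout are assumptions; adjust them to the actual installation.
// The quoting is naive: file names containing '"' are not handled.
std::string buildCommand(const std::string& processFile) {
    return "java -jar rapidminer.jar \"" + processFile + "\"";
}

// Runs the command and blocks until it finishes. The return value is
// std::system's implementation-defined status, not necessarily the exit code.
int runRapidMiner(const std::string& processFile) {
    return std::system(buildCommand(processFile).c_str());
}
```

After `runRapidMiner("myfile.XML")` returns, the application would open whatever output file the process definition writes.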

    Doing it the way I described above, I could use the IOObjectWriter or
    something similar to generate a file that the model is written to. However,
    with IOObjectWriter I can only choose XML or binary as the output format,
    both of which are hard to parse in my application. Is there a way to dump the random forest
    tree models into a file exactly in the form seen in the Text View mode, like
    this (taken from sample 01_DecisionTree.xml):

    Outlook = sunny
    |   Humidity <= 77.500: yes {no=0, yes=2}
    |   Humidity > 77.500: no {no=3, yes=0}
    Outlook = overcast: yes {no=0, yes=4}
    Outlook = rain
    |   Wind = false: yes {no=0, yes=3}
    |   Wind = true: no {no=2, yes=0}

    I don't want to extend the RapidMiner functions since my Java skills are not that good,
    so a pure C++ solution would suit me best.  ::)
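
As a rough idea of what the pure C++ side could look like, here is a sketch that parses a single tree in the Text View form shown above and classifies one example with it. The `Rule` struct and the functions `parseTextView`, `matches` and `classify` are hypothetical names. The sketch assumes that each nesting level is marked by "|   ", that attribute names and values contain no whitespace, and that leaves follow the pattern "condition: label {counts}". The layout of a random forest dump is not known here and is not handled.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One line of the dump: a test on an attribute and, for leaves, a class label.
struct Rule {
    int depth;              // number of leading "|   " markers
    std::string attribute;  // e.g. "Humidity"
    std::string op;         // "=", "<=", ">", ...
    std::string value;      // e.g. "77.500" or "sunny"
    std::string label;      // predicted class; empty for inner nodes
};

std::vector<Rule> parseTextView(std::istream& in) {
    std::vector<Rule> rules;
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos) continue;  // skip blank lines
        line.erase(0, start);                      // drop outer indentation
        Rule rule{};
        while (line.compare(0, 4, "|   ") == 0) {  // one marker per nesting level
            line.erase(0, 4);
            ++rule.depth;
        }
        std::string condition = line;
        const std::size_t colon = line.find(": ");
        if (colon != std::string::npos) {          // leaf: "cond: label {counts}"
            condition = line.substr(0, colon);
            std::istringstream rest(line.substr(colon + 2));
            rest >> rule.label;                    // first token after ": "
        }
        std::istringstream cond(condition);
        if (cond >> rule.attribute >> rule.op >> rule.value) rules.push_back(rule);
    }
    return rules;
}

// "=" compares strings; all other operators compare numerically.
// std::stod throws std::invalid_argument if a value is not numeric.
bool matches(const Rule& rule, const std::string& actual) {
    if (rule.op == "=") return actual == rule.value;
    const double lhs = std::stod(actual);
    const double rhs = std::stod(rule.value);
    if (rule.op == "<=") return lhs <= rhs;
    if (rule.op == ">") return lhs > rhs;
    if (rule.op == "<") return lhs < rhs;
    if (rule.op == ">=") return lhs >= rhs;
    return false;
}

// Walks the flat rule list top to bottom; returns "" if no leaf is reached.
std::string classify(const std::vector<Rule>& rules,
                     const std::map<std::string, std::string>& example) {
    int depth = 0;  // nesting level currently being searched
    for (const Rule& rule : rules) {
        if (rule.depth > depth) continue;  // inside a branch that did not match
        if (rule.depth < depth) break;     // no sibling at this level matched
        const auto it = example.find(rule.attribute);
        if (it == example.end() || !matches(rule, it->second)) continue;
        if (!rule.label.empty()) return rule.label;
        ++depth;                           // descend into the matching branch
    }
    return "";
}
```

Since `parseTextView` takes any `std::istream`, a file containing such a dump can be passed in through a `std::ifstream`.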
  • IngoRM
    IngoRM New Altair Community Member
    Hi,

    Stephan wrote:

    Is this a good idea or do you know a better solution for using the Java library in my C++ program?

    Maybe the information at this link helps:

    http://www.javaworld.com/javaworld/javatips/jw-javatip17.html

    Stephan wrote:

    Is there a way to dump the random forest tree models into a file exactly in the form seen in the Text View mode, like this (taken from sample 01_DecisionTree.xml):

    Yes. Please use the operator "ResultWriter", which writes the textual result to the specified file (or to the global result file).

    Cheers,
    Ingo
  • Legacy User
    Legacy User New Altair Community Member
    Hi,

    It seems to me that the operator "ResultWriter" can only be used for single decision trees
    but not for random forests, right?

    Regards,
    Stephan
  • TobiasMalbrecht
    TobiasMalbrecht New Altair Community Member
    Hi Stephan,
    Stephan wrote:

    it seems to me that the operator "ResultWriter" can only be used for single decision trees
    but not for random forests, right?

    You are right. The SimpleVoteModel, which holds the decision tree models, did not support text output. I have changed that in the newest developer version (the Zaniah branch). The change will of course be part of the next release.

    Regards,
    Tobias