Integration with Java application

slycom
slycom New Altair Community Member
edited November 5 in Community Q&A
Hi, I'm new to RapidMiner.

I'd like to integrate RapidMiner into my small application. I want to use RapidMiner as a library, not as a separate application. I read a wiki page about using RM as a library, but the information there did not help me.

How can I integrate RM? Is using a 'process' the only way for my application to communicate with RM?

best regards,
Sylvester

Answers

  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    I'm sorry, but I don't really get what you are trying to do. You can integrate RapidMiner as a library just like you would any other library: just add the jars to your project. If you want to execute a RapidMiner process, you will indeed need to use the Process class. However, you can also choose to implement something else and only use the algorithms.
    If you are looking for an easy way to execute a process you built via the RapidMiner GUI, you can see how it is done here: click.

    Regards,
    Marco
  • hattan
    hattan New Altair Community Member
    Hi Marco,
    can you please tell me where we can find the jar files?
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    you can find them where you'd normally find library files: in the "lib" folder ;)

    Regards,
    Marco
  • act
    act New Altair Community Member
    Hi,

    I am trying to integrate RM into my Java project, so I imported rapidminer.jar.
    But even for this first attempt, I found that I have to import a lot of other jars (the ones in the lib folder). Is that normal? Is there one big jar that contains all the needed jars?

    And can you estimate the total size of the libraries needed for a Java project? (rapidminer.jar is about 12 MB, so I guess it is far more than 12 MB...)


    Thx,
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    Yes, that is normal. No, there is no big "all-in-one" jar. If you need that, you can create it yourself.
    Final size? May I humbly suggest you add all the jars and check it yourself? ;)
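
Checking the size yourself can be sketched in a few lines using only the JDK. This is a minimal sketch, not part of RapidMiner: the class name JarSizes and the default folder name "lib" are assumptions for illustration, so pass the path of your own RapidMiner lib folder as the first argument.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class JarSizes {

    /** Sums the sizes in bytes of all *.jar files directly inside libDir. */
    public static long totalJarBytes(Path libDir) throws IOException {
        long total = 0;
        // The glob only matches file names ending in ".jar"; subfolders are not searched.
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                total += Files.size(jar);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // "lib" is only a placeholder default (assumption); pass the real folder as args[0].
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        if (!Files.isDirectory(libDir)) {
            System.out.println("Not a directory: " + libDir);
            return;
        }
        System.out.printf("%d bytes of jars in %s%n", totalJarBytes(libDir), libDir);
    }
}
```

Add the size of rapidminer.jar itself if it is stored outside that folder.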


    Regards,
    Marco
  • maltinho
    maltinho New Altair Community Member
    Hi,

    I would like to use RapidMiner from my app, too. As a first step, I want to use DBSCAN. I would like to generate the input for DBSCAN without generating an XML file. Is this possible?

    I saw a possibility to do that on
    http://rapid-i.com/wiki/index.php?title=Integrating_RapidMiner_into_your_application
    Is this still the current way to do that?

    If yes, I read:
    "The last thing which must be done is to produce a view on this example table. Such views are called ExampleSet in RapidMiner. The creation of these views is done by the method createCompleteExampleSet(label, null, null, null). The resulting example set can be encapsulated in a IOContainer and given to operators."

    But how can I do this? Can I use process.run(IOContainer)? Or has this changed since the introduction of the RM port concept?

    I think generating input dynamically is a basic step when using RM in your own application, isn't it? So where can I find documentation on this for the current version of RM?

    Thanks for helping!

    Maltinho
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    a lot of things are possible, sometimes you just need to be creative ;)
    If I understood you correctly, you want to create an ExampleSet with your custom data and then feed it to a process to do some work with it?
    If so, there are multiple ways to do it, I will start with one way to create an ExampleSet:

    List<Attribute> listOfAttributes = new ArrayList<Attribute>(); // add your attributes here
    MemoryExampleTable table = new MemoryExampleTable(listOfAttributes);
    double[] doubleArray = new double[numberOfColumns];
    // fill your data array, one entry per column index i
    // numerical data
    doubleArray[i] = Double.parseDouble(value);
    // nominal data (attribute is the Attribute of column i)
    doubleArray[i] = attribute.getMapping().mapString("Im a string!");
    // add the finished row to the table
    table.addDataRow(new DoubleArrayDataRow(doubleArray));
    // now create the exampleset
    ExampleSet exSet = table.createExampleSet();
    Now you have an ExampleSet filled with your data. Next you need a process to work with the data. You can either create the process manually in code (I would advise against that, as it is much more error-prone and tedious), or you can create a process beforehand via the RapidMiner GUI and then just load the XML and create a process from it (this is by far the easiest and most failsafe way to do it).
    After you have your process, you can just call

    IOContainer result = process.run(new IOContainer(new IOObject[] { exSet }));
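
Going back to the row-filling step above: every cell of a data row is a double, so a nominal value is stored as the index that its mapping assigns to the string. The sketch below illustrates that idea with plain JDK classes only. SimpleMapping and encodeRow are hypothetical stand-ins written for this example, not RapidMiner classes, and the real mapping may behave differently in detail.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NominalEncodingSketch {

    /** Hypothetical stand-in for a nominal mapping: each distinct string gets a stable index. */
    public static class SimpleMapping {
        private final Map<String, Integer> indexOf = new HashMap<>();
        private final List<String> values = new ArrayList<>();

        /** Returns the index of value, assigning the next free index the first time it is seen. */
        public int mapString(String value) {
            Integer idx = indexOf.get(value);
            if (idx == null) {
                idx = values.size();
                indexOf.put(value, idx);
                values.add(value);
            }
            return idx;
        }

        /** Reverse lookup: the string that was assigned the given index. */
        public String mapIndex(int index) {
            return values.get(index);
        }
    }

    /** Encodes one raw row; a column with a non-null mapping is treated as nominal. */
    public static double[] encodeRow(String[] raw, SimpleMapping[] mappings) {
        double[] row = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            row[i] = (mappings[i] == null)
                    ? Double.parseDouble(raw[i])      // numerical column
                    : mappings[i].mapString(raw[i]);  // nominal column, stored as its index
        }
        return row;
    }

    public static void main(String[] args) {
        SimpleMapping colour = new SimpleMapping();
        SimpleMapping[] mappings = { null, colour }; // column 0 numerical, column 1 nominal
        double[] row = encodeRow(new String[] { "1.5", "red" }, mappings);
        System.out.println(row[0] + " " + row[1] + " (" + colour.mapIndex((int) row[1]) + ")");
    }
}
```

The same mapping object has to be reused for every row of a column, otherwise identical strings could end up with different indices.
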
    Regards,
    Marco
  • maltinho
    maltinho New Altair Community Member
    Hi,

    first of all, thanks for the fast reply! Yes, that is what I want to do, and that is what I already did. I tried to build the process by hand... So if I initialize it with this process file:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.002">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.2.002" expanded="true" name="Process">
       <process expanded="true" height="116" width="279">
         <operator activated="true" class="dbscan" compatibility="5.2.002" expanded="true" height="76" name="Clustering" width="90" x="179" y="30"/>
         <connect from_port="input 1" to_op="Clustering" to_port="example set"/>
         <connect from_op="Clustering" from_port="cluster model" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="source_input 2" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
    it should work? Okay, thanks, I'll try it in the evening when I'm back home... I think the problem was that the input port of the operator was not connected to a port of the process... How can I do this by hand in the code?

    Using an XML file now, the line
    <connect from_port="input 1" to_op="Clustering" to_port="example set"/>
    should do the job, right?

    Regards,

    Maltinho
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    indeed, you can do that by inserting a line like that into the XML, or you could do it in code via

    process.getRootOperator().getSubprocess(0).getInnerSources().getPortByIndex(YourDesiredPortIndexWhere0IsTheFirstOne).connectTo(
          process.getOperator("YourOperatorName").getInputPorts().getPortByName("example set"));
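
Whichever way the connection is made, it can help to verify the wiring in the process XML before running it. The following is a minimal sketch that uses only the JDK's DOM parser, not the RapidMiner API. The class name ConnectionLister is hypothetical, and reading a connect element without from_op or to_op as a port of the enclosing process is an assumption based on the XML posted above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ConnectionLister {

    /** Returns one "from -> to" string per connect element found in the process XML. */
    public static List<String> listConnections(String processXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(processXml.getBytes(StandardCharsets.UTF_8)));
        NodeList connects = doc.getElementsByTagName("connect");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < connects.getLength(); i++) {
            Element c = (Element) connects.item(i);
            result.add(endpoint(c, "from_op", "from_port") + " -> "
                    + endpoint(c, "to_op", "to_port"));
        }
        return result;
    }

    // Assumption: a missing operator attribute means the port belongs to the enclosing process.
    private static String endpoint(Element c, String opAttr, String portAttr) {
        String op = c.getAttribute(opAttr); // empty string when the attribute is absent
        return (op.isEmpty() ? "<process>" : op) + "." + c.getAttribute(portAttr);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<process><operator class=\"process\" name=\"Process\"><process>"
                + "<connect from_port=\"input 1\" to_op=\"Clustering\" to_port=\"example set\"/>"
                + "<connect from_op=\"Clustering\" from_port=\"cluster model\" to_port=\"result 1\"/>"
                + "</process></operator></process>";
        for (String connection : listConnections(xml)) {
            System.out.println(connection);
        }
    }
}
```

If no listed connection starts at a process input port, the operator will not receive the ExampleSet passed to process.run.
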
    Regards,
    Marco