Integrating RapidMiner into your Java application

lexusboy
edited November 5 in Community Q&A
Hello All,

I am a student who has been working with RapidMiner for some time now. I love the tool and appreciate the effort that has gone, and is still going, into this product, but recently I have been having problems integrating RM into my Java application. I seem to have figured out the whole I/O port logic in RM 5.1, but the problem comes when I use an "XValidation" operator chain in my program. As far as I understand, you have to connect all the I/O ports so that data can flow between the operators, which means every operator's input port is connected to the preceding operator's output port. However, I don't know how to configure the input port of the neural net learner (the first inner operator), since there is no preceding operator port for it to connect to. So I keep getting the error "No data was delivered at port NeuralNet.training set." Any help is appreciated. Thanks!

The process goes something like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.001" expanded="true" name="Process">
    <process expanded="true" height="359" width="413">
      <operator activated="true" class="retrieve" compatibility="5.1.001" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
        <parameter key="repository_entry" value="//Samples/data/Iris"/>
      </operator>
      <operator activated="true" class="x_validation" compatibility="5.1.001" expanded="true" height="112" name="Validation" width="90" x="313" y="120">
        <description>A cross-validation evaluating a neural net model.</description>
        <process expanded="true" height="654" width="466">
          <operator activated="true" class="neural_net" compatibility="5.1.001" expanded="true" height="76" name="Neural Net" width="90" x="112" y="255">
            <list key="hidden_layers"/>
          </operator>
          <operator activated="true" class="write_model" compatibility="5.1.001" expanded="true" height="60" name="Write Model" width="90" x="313" y="300">
            <parameter key="model_file" value="D:\Bhavya\RapidMiner\test2.mod"/>
            <parameter key="output_type" value="XML"/>
          </operator>
          <connect from_port="training" to_op="Neural Net" to_port="training set"/>
          <connect from_op="Neural Net" from_port="model" to_op="Write Model" to_port="input"/>
          <connect from_op="Write Model" from_port="through" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true" height="654" width="466">
          <operator activated="true" class="apply_model" compatibility="5.1.001" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance" compatibility="5.1.001" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Retrieve" from_port="output" to_op="Validation" to_port="training"/>
      <connect from_op="Validation" from_port="model" to_port="result 1"/>
      <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="36"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

Answers

  • steffen
    Hello lexusboy,

    Sorry, I cannot reproduce your error. All in all, the process looks fine, and it ran without any problems for me. I am using RM 5.1.
    If it is your Java code, rather than the posted process, that does not work, you should post that code here. Otherwise it is hard for us to guess where the error is located ;).

    greetings,

    steffen

  • lexusboy
    Hello Steffen,

    Thanks for your reply. Yes, the process works fine from the RM GUI, but I can't seem to get it to work in my Java program. I should have posted the code as well, sorry about that. Here it is:
    package rapidminer_5x;

    import com.rapidminer.repository.RepositoryLocation;
    import com.rapidminer.tools.OperatorService;
    import com.rapidminer.ProcessContext;
    import com.rapidminer.RapidMiner;
    import com.rapidminer.RapidMiner.ExecutionMode;
    import com.rapidminer.Process;
    import com.rapidminer.operator.*;
    import com.rapidminer.operator.io.ModelWriter;
    import com.rapidminer.operator.io.RepositorySource;
    import com.rapidminer.operator.learner.functions.neuralnet.NeuralNetLearner;
    import com.rapidminer.operator.ports.*;
    import com.rapidminer.operator.validation.XValidation;

    public class Test {

        /**
         * @param argv
         */
        public static void main(String[] argv) {

            try {
                // Set the execution mode in which to run RM
                RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
                // Initialize Rapidminer
                RapidMiner.init();
            } catch (Exception e) {
                e.printStackTrace();
            }

            // Create a process
            Process process = new Process();

            try {

                /*
                 * Create a location entry for the repository that
                 * is to be used as input
                 */
                RepositoryLocation location = new RepositoryLocation(
                        "//Samples/data/Iris");

                String loc = process.makeRelativeRepositoryLocation(location);

                // create input operator
                Operator retrieve =
                        OperatorService.createOperator(RepositorySource.class);

                /*
                 * Set the parameter of the operator to the location of
                 * the repository entry
                 */
                retrieve.setParameter("repository_entry", loc);

                // Create the neural network operator
                NeuralNetLearner neuralNet =
                        OperatorService.createOperator(NeuralNetLearner.class);

                // Create the model writer
                Operator modelWriter =
                        OperatorService.createOperator(ModelWriter.class);

                // Create the XValidation operator chain
                XValidation xvalidation = OperatorService.createOperator(XValidation.class);

                xvalidation.setExpanded(true);

                // Set the parameters for model writer
                modelWriter.setParameter("model_file",
                        "D:/Bhavya/RapidMiner/test2.mod");

                modelWriter.setParameter("output_type", "XML");

                // Create the output port of the retrieve operator
                OutputPort retrieveOutput = retrieve
                        .getOutputPorts().getPortByName("output");

                // Create the input port of XValidation
                InputPort xvalidationInput = xvalidation.getInputPorts()
                        .getPortByName("training");

                // Create the input port of Neural Net
                InputPort neuralNetInput = neuralNet.getInputPorts()
                        .getPortByName("training set");

                // Create the output port of Neural Net
                OutputPort neuralNetOutput = neuralNet.getOutputPorts()
                        .getPortByName("model");

                // Create the input port of Model Writer
                InputPort modelWriterInput = modelWriter.getInputPorts()
                        .getPortByName("input");

                // Create the output port of Model Writer
                OutputPort modelWriterOutput = modelWriter.getOutputPorts()
                        .getPortByName("through");

                xvalidation.shouldAutoConnect(neuralNetInput);

                xvalidation.getSubprocess(0).addOperator(neuralNet);
                xvalidation.getSubprocess(0).addOperator(modelWriter);
                //ExecutionUnit exec =

                //xvalidation.getSubprocess(0).autoWireSingle(neuralNet, null, true, true);

                // Connect the output port of Retrieve to the input port
                // of xvalidation
                retrieveOutput.connectTo(xvalidationInput);
                //retrieveOutput.connectTo(neuralNetInput);
                neuralNetInput = neuralNet.getExampleSetInputPort();

                // add operator to process
                process.getRootOperator().getSubprocess(0)
                        .addOperator(retrieve);
                process.getRootOperator().getSubprocess(0)
                        .addOperator(xvalidation);

                // Connect the output of neural net to the input of model writer
                neuralNetOutput.connectTo(modelWriterInput);

                // Connect the input of xvalidation to the input of neural net
                // IOObject data = xvalidationInput.getAnyDataOrNull();
                //

                /*
                //IOOb
                    //xvalidation.execute();
                IOObject input = xvalidationInput.getData();
                neuralNetInput.receive(input);
                neuralNet.shouldAutoConnect(xvalidationInput);
                *
                *
                */

                // print process setup
                System.out.println(process.getRootOperator().createProcessTree(0));

                // perform process
                process.run();

                xvalidationInput.receive(retrieveOutput.getData());
                IOObject data = xvalidationInput.getData();
                neuralNetInput.receive(data);

            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
  • steffen
    Hello lexusboy

    Sorry for the late reply; this is not something one can check on the fly. Warning: I am only just getting familiar with RapidMiner again after a long break, so this code may be suboptimal.

    Here are some remarks:
    • I used ImprovedNeuralNetLearner.class as the learner, since the NeuralNetLearner in your code does not have sufficient capabilities for the Iris dataset.
    • I added the %{a} macro to the model writer's file name so that the model of every training step is written.
    • I moved some code blocks around for the sake of readability.
    As far as I can see, you forgot to link the ports of XValidation's subprocesses (their inner sources and sinks) to the input/output ports of the operators inside the corresponding subprocess. This is really the only intelligent modification I have made ;).
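
To make the point about the inner ports concrete, here is a minimal toy sketch in plain Java. It is not RapidMiner code: the classes `OutPort` and `InPort` and the method `runValidation` are hypothetical stand-ins invented for illustration, and the error text only imitates the message quoted in the question. The sketch assumes that a validation operator hands its input to an inner source port, and that the learner only sees the data if that inner source is connected to its input.

```java
public class PortWiringSketch {

    /** Toy output port (hypothetical): forwards data to the connected input port, if any. */
    static class OutPort {
        InPort target;

        void connectTo(InPort in) {
            this.target = in;
        }

        void deliver(Object data) {
            if (target != null) {
                target.data = data;
            }
        }
    }

    /** Toy input port (hypothetical): complains when nothing was delivered. */
    static class InPort {
        final String name;
        Object data;

        InPort(String name) {
            this.name = name;
        }

        Object getData() {
            if (data == null) {
                throw new IllegalStateException("No data was delivered at port " + name + ".");
            }
            return data;
        }
    }

    /** Runs a toy learner nested inside a toy validation operator. */
    static String runValidation(boolean wireInnerSource) {
        OutPort retrieveOutput = new OutPort();
        InPort validationInput = new InPort("Validation.training");
        OutPort innerSource = new OutPort(); // the subprocess-side "training" port
        InPort learnerInput = new InPort("NeuralNet.training set");

        // Outer connection: this is the part the original program already had.
        retrieveOutput.connectTo(validationInput);
        // Inner connection: this is the link that was missing.
        if (wireInnerSource) {
            innerSource.connectTo(learnerInput);
        }

        retrieveOutput.deliver("iris");
        // The validation operator passes its input on to the inner source.
        innerSource.deliver(validationInput.getData());

        try {
            return "trained on " + learnerInput.getData();
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runValidation(false));
        System.out.println(runValidation(true));
    }
}
```

With `wireInnerSource` set to `false`, the data reaches the validation operator but never the learner, which mirrors the reported error. The actual RapidMiner version of the fix is in the listing that follows.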

    package rapidminer_5x;

    import com.rapidminer.Process;
    import com.rapidminer.RapidMiner;
    import com.rapidminer.RapidMiner.ExecutionMode;
    import com.rapidminer.operator.ExecutionUnit;
    import com.rapidminer.operator.ModelApplier;
    import com.rapidminer.operator.Operator;
    import com.rapidminer.operator.io.ModelWriter;
    import com.rapidminer.operator.io.RepositorySource;
    import com.rapidminer.operator.learner.functions.neuralnet.ImprovedNeuralNetLearner;
    import com.rapidminer.operator.performance.PerformanceEvaluator;
    import com.rapidminer.operator.validation.XValidation;
    import com.rapidminer.repository.RepositoryLocation;
    import com.rapidminer.tools.OperatorService;

    public class Test {

        private static Operator createRetrievalOperator(Process parentProcess)
                throws Exception {
            RepositoryLocation location = new RepositoryLocation(
                    "//Samples/data/Iris");

            String loc = parentProcess.makeRelativeRepositoryLocation(location);

            // create input operator
            Operator retrieve = OperatorService
                    .createOperator(RepositorySource.class);

            /*
             * Set the parameter of the operator to the location of the repository
             * entry
             */
            retrieve.setParameter("repository_entry", loc);
            return retrieve;
        }

        private static Operator createModelWriter() throws Exception {
            // Create the model writer
            Operator modelWriter = OperatorService
                    .createOperator(ModelWriter.class);
            // Set the parameters for model writer
            modelWriter.setParameter("model_file", "c:/nn_run_%{a}.mode");

            modelWriter.setParameter("output_type", "XML");

            return modelWriter;
        }

        /**
         * Connect the output-port <code>fromPortName</code> from Operator
         * <code>from</code> with the input-port <code>toPortName</code> of Operator
         * <code>to</code>.
         */
        private static void connect(Operator from, String fromPortName,
                Operator to, String toPortName) {
            from.getOutputPorts().getPortByName(fromPortName).connectTo(
                    to.getInputPorts().getPortByName(toPortName));
        }

        /**
         * Connect the output-port <code>fromPortName</code> from Subprocess
         * <code>from</code> with the input-port <code>toPortName</code> of Operator
         * <code>to</code>.
         */
        private static void connect(ExecutionUnit from, String fromPortName,
                Operator to, String toPortName) {
            from.getInnerSources().getPortByName(fromPortName).connectTo(
                    to.getInputPorts().getPortByName(toPortName));
        }

        /**
         * Connect the output-port <code>fromPortName</code> from Operator
         * <code>from</code> with the input-port <code>toPortName</code> of
         * Subprocess <code>to</code>.
         */
        private static void connect(Operator from, String fromPortName,
                ExecutionUnit to, String toPortName) {
            from.getOutputPorts().getPortByName(fromPortName).connectTo(
                    to.getInnerSinks().getPortByName(toPortName));
        }

        public static void main(String[] argv) throws Exception {
            // init rapidminer
            RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
            RapidMiner.init();

            // Create a process
            final Process process = new Process();

            // all operators from "left to right"
            final Operator retrieve = createRetrievalOperator(process);

            final XValidation xvalidation = OperatorService
                    .createOperator(XValidation.class);
            xvalidation.setParameter(XValidation.PARAMETER_NUMBER_OF_VALIDATIONS,
                    Integer.valueOf(2).toString());

            final ImprovedNeuralNetLearner neuralNet = OperatorService
                    .createOperator(ImprovedNeuralNetLearner.class);

            final Operator modelWriter = createModelWriter();

            final Operator modelApplier = OperatorService
                    .createOperator(ModelApplier.class);

            final Operator performance = OperatorService
                    .createOperator(PerformanceEvaluator.class);

            // add operators to the main process and connect them
            process.getRootOperator().getSubprocess(0).addOperator(retrieve);
            process.getRootOperator().getSubprocess(0).addOperator(xvalidation);
            connect(retrieve, "output", xvalidation, "training");

            // xvalidation
            // training part of xvalidation
            xvalidation.getSubprocess(0).addOperator(neuralNet);
            xvalidation.getSubprocess(0).addOperator(modelWriter);

            // create connection within training process: from left to right ...
            connect(xvalidation.getSubprocess(0), "training", neuralNet,
                    "training set");
            connect(neuralNet, "model", modelWriter, "input");
            connect(modelWriter, "through", xvalidation.getSubprocess(0), "model");

            // testing part of xvalidation
            xvalidation.getSubprocess(1).addOperator(modelApplier);
            xvalidation.getSubprocess(1).addOperator(performance);

            // create connection within testing process: from left to right ...
            connect(xvalidation.getSubprocess(1), "model", modelApplier, "model");
            connect(xvalidation.getSubprocess(1), "test set", modelApplier,
                    "unlabelled data");
            connect(modelApplier, "labelled data", performance, "labelled data");
            connect(performance, "performance", xvalidation.getSubprocess(1),
                    "averagable 1");

            // print process setup
            System.out.println(process.getRootOperator().createProcessTree(0));

            // perform process
            process.run();
        }
    }
    I hope this also works for you.
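
A short note on the `%{a}` macro in the model writer's file name: as far as I know, RapidMiner replaces it with the number of times the operator has been applied, so each application writes to its own file instead of overwriting the previous one. The plain-Java sketch below only imitates that effect with a string replacement. The method `expandApplyCount` is a hypothetical helper made up for illustration, not part of RapidMiner, and the real expansion happens inside RapidMiner itself.

```java
public class MacroSketch {

    /** Hypothetical helper: replaces every literal "%{a}" with the given apply count. */
    static String expandApplyCount(String template, int applyCount) {
        return template.replace("%{a}", Integer.toString(applyCount));
    }

    public static void main(String[] args) {
        // Same file name template as in the model writer above.
        String template = "c:/nn_run_%{a}.mode";
        for (int run = 1; run <= 3; run++) {
            System.out.println(expandApplyCount(template, run));
        }
    }
}
```

Without the macro, every application of the model writer would target the same file name, which is why a fixed name like test2.mod ends up holding only the last model written.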

    greetings,

    steffen

  • lexusboy
    Hello Steffen,

    Thank you so much for your code; it really helped me out. I was stuck on that problem for days, and now it works. Thanks again :)