Passing a model to an XML process in Java

aws New Altair Community Member
edited November 5 in Community Q&A
Good afternoon,

I would like to write a Java program that loads a trained classification model and test data from disk and applies the model to the data. I am almost done; the following code works fine:

import java.io.*;
import java.util.*;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.example.*;
import com.rapidminer.example.set.*;
import com.rapidminer.example.table.*;
import com.rapidminer.operator.*;
import com.rapidminer.tools.*;

public class ApplyModelNew {

    public static void main(String[] args) {
        try {
            // set amount of log messages - not sure about the effect
            LogService.getGlobal().setVerbosityLevel(LogService.MINIMUM);

            // initialize - possible to switch off certain resources?
            RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
            RapidMiner.init();

            // class running process gets process as XML file
            Process process = new Process(new File("apply_model.rmp"));

            // data to predict is written to an ExampleSet (extends IOObject)
            // (in createData()) and stored in a list
            LinkedList<IOObject> linkedList = new LinkedList<IOObject>();
            linkedList.add(createData());

            // list is stored in IOContainer and given to starting process
            // return value is also an IOContainer
            IOContainer resultContainer = process.run(new IOContainer(linkedList));

            // take first IOObject of type SimpleExampleSet from container
            // (contains only one element anyway)
            // from the example set take first Example (= first row)
            Example resultExample = resultContainer.get(SimpleExampleSet.class)
                    .getExample(0);

            // print value that is contained in Example's column "prediction"
            System.out.println(resultExample.getPredictedLabel());

        } catch (IOException e) {
            System.out.println("Error: " + e);
        } catch (OperatorException e) {
            System.out.println("Error: " + e);
        } catch (XMLException e) {
            System.out.println("Error: " + e);
        }
    }

    // createData() is not shown
}
The createData() method reads the test data and returns an ExampleSet. The process I load is simply:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true" height="676" width="1169">
      <operator activated="true" class="read_model" compatibility="5.2.003" expanded="true" height="60" name="Read Model" width="90" x="246" y="30">
        <parameter key="model_file" value="svm.model"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.2.003" expanded="true" height="76" name="Apply Model" width="90" x="447" y="165">
        <list key="application_parameters"/>
        <parameter key="create_view" value="false"/>
      </operator>
      <connect from_port="input 1" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Read Model" from_port="output" to_op="Apply Model" to_port="model"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
However, as you may have noticed, the model is hardwired in the XML file through the Read Model operator. I would like to change this so that the model is passed in from the Java program. My new process looks as follows:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.017">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <parameter key="parallelize_main_process" value="false"/>
    <process expanded="true" height="676" width="1169">
      <operator activated="true" class="apply_model" compatibility="5.1.017" expanded="true" height="76" name="Apply Model" width="90" x="514" y="30">
        <list key="application_parameters"/>
        <parameter key="create_view" value="false"/>
      </operator>
      <connect from_port="input 1" to_op="Apply Model" to_port="model"/>
      <connect from_port="input 2" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="source_input 3" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
I am wondering how to change the Java program to pass the model to the process. Here's my best guess:

import java.io.*;
import java.util.*;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.example.*;
import com.rapidminer.example.set.*;
import com.rapidminer.example.table.*;
import com.rapidminer.operator.*;
import com.rapidminer.operator.io.ModelLoader;
import com.rapidminer.tools.*;

public class ApplyModelNew2 {

    public static void main(String[] args) {
        try {
            // set amount of log messages - not sure about the effect
            LogService.getGlobal().setVerbosityLevel(LogService.MINIMUM);

            // initialize - possible to switch off certain resources?
            RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
            RapidMiner.init();

            // --> Initialize a ModelLoader
            ModelLoader ml = new ModelLoader(new OperatorDescription(???));

            // class running process gets process as XML file
            Process process = new Process(new File("apply_model2.rmp"));

            // data to predict is written to an ExampleSet (extends IOObject)
            // (in createData()) and stored in a list
            LinkedList<IOObject> linkedList = new LinkedList<IOObject>();
            linkedList.add(ml.read()); // --> give model to process
            linkedList.add(createData());

            // list is stored in IOContainer and given to starting process
            // return value is also an IOContainer
            IOContainer resultContainer = process.run(new IOContainer(linkedList));

            // take first IOObject of type SimpleExampleSet from container
            // (contains only one element anyway)
            // from the example set take first Example (= first row)
            Example resultExample = resultContainer.get(SimpleExampleSet.class)
                    .getExample(0);

            // print value that is contained in Example's column "prediction"
            System.out.println(resultExample.getPredictedLabel());

        } catch (IOException e) {
            System.out.println("Error: " + e);
        } catch (OperatorException e) {
            System.out.println("Error: " + e);
        } catch (XMLException e) {
            System.out.println("Error: " + e);
        }
    }
}
The execution fails because I do not know how to initialize the OperatorDescription correctly. Logically, I would expect the ModelLoader to need the path and file name of my stored model, but I don't understand how to pass this information.

I am looking forward to your answers.

Greetings,
  Alex

Answers

  • Marco_Boeck New Altair Community Member
    Hi,

    A model is just an IOObject, so you don't need anything special such as ModelLoader. Just retrieve the model (as an IOObject) from the repository and pass it to the process.

    Regards,
    Marco
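
A minimal sketch of the approach described above, assuming the model was previously stored in a RapidMiner repository (for example with the Store operator) rather than only as svm.model on disk. The repository path //LocalRepository/models/svm and the class and helper names are hypothetical. RepositoryLocation, locateEntry(), IOObjectEntry and retrieveData(null) are the repository calls as understood for RapidMiner 5.x, so check them against the version in use. createData() stands in for the method from the question and is left unimplemented here.

```java
import java.io.File;
import java.util.LinkedList;
import java.util.List;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.IOObject;
import com.rapidminer.repository.Entry;
import com.rapidminer.repository.IOObjectEntry;
import com.rapidminer.repository.RepositoryLocation;

public class ApplyModelFromRepository {

    // Hypothetical repository path; replace it with wherever the model was stored.
    private static final String MODEL_LOCATION = "//LocalRepository/models/svm";

    // Looks up a repository entry and returns the IOObject stored there.
    static IOObject loadFromRepository(String path) throws Exception {
        RepositoryLocation location = new RepositoryLocation(path);
        Entry entry = location.locateEntry();
        if (!(entry instanceof IOObjectEntry)) {
            throw new IllegalArgumentException("No IOObject found at " + path);
        }
        // null means "no progress listener"
        return ((IOObjectEntry) entry).retrieveData(null);
    }

    public static void main(String[] args) throws Exception {
        RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
        RapidMiner.init();

        Process process = new Process(new File("apply_model2.rmp"));

        // Order matters: in apply_model2.rmp "input 1" is wired to the
        // model port and "input 2" to the unlabelled data port.
        List<IOObject> inputs = new LinkedList<IOObject>();
        inputs.add(loadFromRepository(MODEL_LOCATION));
        inputs.add(createData());

        IOContainer result = process.run(new IOContainer(inputs));

        // Ask for the ExampleSet interface rather than a concrete class.
        ExampleSet labelled = result.get(ExampleSet.class);
        System.out.println(labelled.getExample(0).getPredictedLabel());
    }

    // Placeholder for the createData() method from the question.
    static ExampleSet createData() {
        throw new UnsupportedOperationException("supply the test data here");
    }
}
```

The only changes relative to the question's second program are that the ModelLoader is gone and the model is added to the input list as a plain IOObject before the test data.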