Loading models in Java using RapidMiner 9.7
philipp_ginsel
New Altair Community Member
Hello,
We are using RapidMiner in our Java project (https://github.com/igorvatolkin/AMUSE). We are currently on version 5.3 and are trying to update to the latest version (9.7).
At one point in the project we load a previously trained ".mod" model file in order to classify new data with it. To do that we use the class "com.rapidminer.operator.io.ModelLoader". When I downloaded the new RapidMiner version I noticed that this class no longer exists. Is there a way to do the same thing using another class in RapidMiner 9.7? Unfortunately I was not able to find anything about this problem, and I am hoping that you can help me.
Our code looks like this (I excluded the parts that are specific to our project and have nothing to do with the problem):
import com.rapidminer.Process;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.ModelApplier;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.io.ModelLoader;
import com.rapidminer.operator.ports.InputPort;
import com.rapidminer.operator.ports.OutputPort;
import com.rapidminer.tools.OperatorService;
.
.
.
Process process = new Process();
// (1) Create ExampleSet
.
. //code that creates the ExampleSet exampleSet that should be classified
.
// (2) Load the model
Operator modelLoader = OperatorService.createOperator(ModelLoader.class);
modelLoader.setParameter(ModelLoader.PARAMETER_MODEL_FILE, pathToModelFile);
process.getRootOperator().getSubprocess(0).addOperator(modelLoader);
// (3) Apply the model
Operator modelApplier = OperatorService.createOperator(ModelApplier.class);
process.getRootOperator().getSubprocess(0).addOperator(modelApplier);
// (4) Connect the ports
InputPort modelApplierModelInputPort = modelApplier.getInputPorts().getPortByName("model");
InputPort modelApplierUnlabelledDataInputPort = modelApplier.getInputPorts().getPortByName("unlabelled data");
OutputPort modelLoaderOutputPort = modelLoader.getOutputPorts().getPortByName("output");
OutputPort processOutputPort = process.getRootOperator().getSubprocess(0).getInnerSources().getPortByIndex(0);
modelLoaderOutputPort.connectTo(modelApplierModelInputPort);
processOutputPort.connectTo(modelApplierUnlabelledDataInputPort);
// (5) Run the process
process.run(new IOContainer(exampleSet));
Thank you very much for your help.
Philipp
Best Answer
-
Hi Philipp!
ModelLoader was deprecated in 9.2. It is still available in the legacy extension, which is bundled with the RapidMiner installation but is not part of the open source core. Even in 5.3, that operator was already quite old.
Usually you would use the Retrieve operator instead (Java class: RepositorySource), but as the name implies, it requires a repository location to read the model from.
Another thing that changed is that the file ending for models in repositories is usually ".ioo". With 9.7 we updated the repository to better support more file types and how they are stored.
So I guess the "easy" way is to follow these steps:
1) Copy the ".mod" file and rename it to end with ".ioo"
2) Create and register a repository in code and use the Retrieve Operator
// create repository to load data
Repository repo = new LocalRepository("modelRepo", pathToModelFolder);
RepositoryManager.getInstance(null).addRepository(repo);
// create retrieve operator instead of ModelLoader
RepositorySource retrieveModel = OperatorService.createOperator(RepositorySource.class);
retrieveModel.setParameter(RepositorySource.PARAMETER_REPOSITORY_ENTRY, nameOfModelWithoutFileEnding);
process.getRootOperator().getSubprocess(0).addOperator(retrieveModel);
Alternatively, you can just create the repo and process in RapidMiner Studio, looking something like this:
[Screenshot: https://us.v-cdn.net/6030995/uploads/editor/7j/o6eauqwkzzvk.png]
xml:
<?xml version="1.0" encoding="UTF-8"?><process version="9.8.000-SNAPSHOT">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.8.000-SNAPSHOT" expanded="true" height="68" name="Retrieve" width="90" x="45" y="34">
        <parameter key="repository_entry" value="modelRepo/model"/>
      </operator>
      <operator activated="true" class="apply_model" expanded="true" height="82" name="Apply Model" width="90" x="313" y="34">
        <list key="application_parameters"/>
        <parameter key="create_view" value="false"/>
      </operator>
      <connect from_port="input 1" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Retrieve" from_port="output" to_op="Apply Model" to_port="model"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="50"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Then you can store the process somewhere and create the Process instance from the XML. You might still need to set up the repo programmatically, but you don't need to build the process programmatically.
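For step 1 (copying the ".mod" file under an ".ioo" name), a minimal sketch using only the Java standard library might look like the following. The class name ModelFileCopy and the method copyAsIoo are hypothetical helpers made up for illustration; they are not part of RapidMiner. The sketch only copies and renames the file and returns the name without the file ending, which is the kind of value passed as nameOfModelWithoutFileEnding in step 2. Whether the renamed file is then readable by the repository depends on the RapidMiner version, so treat it as a starting point rather than a verified recipe.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the RapidMiner API.
final class ModelFileCopy {

    /**
     * Copies a ".mod" model file into the given repository folder under the
     * same base name but with the ".ioo" ending, and returns the base name
     * (the file name without its ending).
     */
    public static String copyAsIoo(Path modFile, Path repoFolder) throws IOException {
        String fileName = modFile.getFileName().toString();

        // Strip the last file ending, if there is one.
        int dot = fileName.lastIndexOf('.');
        String baseName = dot > 0 ? fileName.substring(0, dot) : fileName;

        // Make sure the target folder exists, then copy (overwriting an old copy).
        Files.createDirectories(repoFolder);
        Path target = repoFolder.resolve(baseName + ".ioo");
        Files.copy(modFile, target, StandardCopyOption.REPLACE_EXISTING);

        return baseName;
    }
}
```

The original ".mod" file is left untouched, so the old 5.3 setup keeps working while the new one is being tested. The returned base name could be used as the repository entry name, assuming the folder passed here is the same one the repository is created on.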
I hope this helps!
Cheers
Jan
Answers
-
Thank you very much for your answer. That seems to be exactly what I was looking for. I'm sorry that it took so long for me to answer. I wanted to test the solution before answering. But unfortunately I'm still having some other problems (that have nothing to do with this particular question), which is why I couldn't really test it yet.
-
Hi Philipp,
Glad to hear that this already helped. Good luck with the rest of the project!
Cheers
Jan
-
Hello, I am getting this same error. How can I get rid of it? @jczogalla