Getting the output file from RM
Fireholder
New Altair Community Member
Hello everybody,
Does anyone know how I can save the output generated by my code? Below is a sample of a very simple process. I'm trying to learn how to write an RM process from scratch without using the GUI. In the sample you can see that I connected the result to a result file using the GUI options, so the output is saved in a Result.md file.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
  <context>
    <input>
      <location>//NewLocalRepository/Project/Golf</location>
    </input>
    <output>
      <location>result</location>
    </output>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
    <process expanded="true" height="440" width="625">
      <operator activated="true" class="retrieve" compatibility="5.1.006" expanded="true" height="60" name="Retrieve" width="90" x="64" y="183">
        <parameter key="repository_entry" value="Golf"/>
      </operator>
      <operator activated="true" class="naive_bayes" compatibility="5.1.006" expanded="true" height="76" name="Naive Bayes" width="90" x="249" y="164"/>
      <connect from_op="Retrieve" from_port="output" to_op="Naive Bayes" to_port="training set"/>
      <connect from_op="Naive Bayes" from_port="model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
And now this is the code I wrote for this process (actually, I found some snippets in this forum and used them for my process).
My question is: should I connect the "training set" port to the "result 1" port as in the XML file, and if so, how can I do it? (It's a very stupid question, but at least I'm learning ;D) And how can I save the output of this process somewhere so that I can use or read the data from that file?
import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.io.RepositorySource;
import com.rapidminer.operator.learner.bayes.NaiveBayes;
import com.rapidminer.operator.ports.InputPort;
import com.rapidminer.operator.ports.InputPorts;
import com.rapidminer.operator.ports.OutputPort;
import com.rapidminer.repository.RepositoryLocation;
import com.rapidminer.tools.OperatorService;

public class Test {

    private static Operator createRetrievalOperator(Process parentProcess)
            throws Exception {
        RepositoryLocation location = new RepositoryLocation(
                "//NewLocalRepository/Project/Golf");
        String loc = parentProcess.makeRelativeRepositoryLocation(location);
        // create input operator
        Operator retrieve = OperatorService
                .createOperator(RepositorySource.class);
        /*
         * Set the parameter of the operator to the location of the repository
         * entry
         */
        retrieve.setParameter("repository_entry", loc);
        return retrieve;
    }

    /**
     * Connect the output-port <code>fromPortName</code> from Operator
     * <code>from</code> with the input-port <code>toPortName</code> of Operator
     * <code>to</code>.
     */
    private static void connect(Operator from, String fromPortName,
            Operator to, String toPortName) {
        from.getOutputPorts().getPortByName(fromPortName).connectTo(
                to.getInputPorts().getPortByName(toPortName));
    }

    public static void main(String[] argv) throws Exception {
        // init rapidminer
        RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
        RapidMiner.init();
        // Create a process
        final Process process = new Process();
        // all operators from "left to right"
        final Operator retrieve = createRetrievalOperator(process);
        final NaiveBayes naivebayes = OperatorService.createOperator(NaiveBayes.class);
        // add operators to the main process and connect them
        process.getRootOperator().getSubprocess(0).addOperator(retrieve);
        process.getRootOperator().getSubprocess(0).addOperator(naivebayes);
        connect(retrieve, "output", naivebayes, "training set");
        // print process setup
        System.out.println(process.getRootOperator().createProcessTree(0));
        // perform process
        process.run();
    }
}
Thanks in advance.
best regards, Fire
Answers
Hello,
I'm sorry, I don't quite understand your question, but I hope my answer will help anyway.
First, if you already have an XML process file, you can simply create the exact process by calling new Process(String xml). Simply feed the contents of the XML file as a string to the Process constructor, and you're good to go.
IOContainer ioResult = process.run(ioInput);
This snippet will execute the process with the given IOContainer object. If your process does not have data coming from the input port, process.run() is fine.
However, notice the IOContainer ioResult = process.run(). The result(s) will be saved in this IOContainer object, which basically acts like a list.
So to get the first result, just call
if (ioResult.getElementAt(0) instanceof ExampleSet) {
    ExampleSet resultSet = (ExampleSet) ioResult.getElementAt(0);
}
The result does not always have to be an ExampleSet, but normally it is.
To store this ExampleSet, you can call:
RepositoryManager.getInstance(null).store(resultSet, location, null);
where location is a RepositoryLocation object.
I hope this helps.
Regards,
Marco
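The step of feeding the contents of the XML file to the constructor as a string could be sketched as follows. This is only a minimal sketch of the file-reading part using the plain JDK; the class name ProcessXmlReader and the default file name are made up for illustration, UTF-8 is assumed because the process file declares that encoding, and the actual new Process(xml) call is left as a comment because it needs the RapidMiner libraries.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of RapidMiner.
final class ProcessXmlReader {

    // Reads the whole file into one String, decoding it as UTF-8.
    static String readFileAsString(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumed default file name; pass your own .rmp file as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "FirstProcess.rmp");
        String xml = readFileAsString(file);
        System.out.println("Read " + xml.length() + " characters of process XML");
        // With the RapidMiner libraries on the classpath, this string would then
        // be handed to the constructor mentioned above: new Process(xml)
    }
}
```

Reading the bytes in one call avoids the manual buffer loop and makes the character encoding explicit.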
This is what I have done so far:
import java.io.*;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import com.rapidminer.repository.RepositoryLocation;
import com.rapidminer.repository.RepositoryManager;
import com.rapidminer.tools.OperatorService;
import com.rapidminer.Process;
import com.rapidminer.*;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.IOObject;
//import com.rapidminer.operator.ExecutionUnit;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.ExecutionMode.*;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.gui.*;
//import com.rapidminer.operator.OperatorException;
//import com.rapidminer.operator.IOContainer;
//import java.io.IOException;
import java.io.File;

public class Test2 {
    static String line;
    static String x = "/home/prakt/workspace-rapidminer/Project/FirstProcess.rmp";

    private static String readFileAsString(String filePath)
            throws java.io.IOException {
        StringBuffer fileData = new StringBuffer(1000);
        BufferedReader reader = new BufferedReader(
                new FileReader(filePath));
        char[] buf = new char[1024];
        int numRead = 0;
        while ((numRead = reader.read(buf)) != -1) {
            String readData = String.valueOf(buf, 0, numRead);
            fileData.append(readData);
            buf = new char[1024];
        }
        reader.close();
        return fileData.toString();
    }

    public static void main(String[] argv) throws Exception {
        // MUST BE INVOKED BEFORE ANYTHING ELSE !!!
        RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
        RapidMiner.init();
        Process process = new Process(readFileAsString(x));
        IOContainer ioResult = process.run();
        if (ioResult.getElementAt(0) instanceof ExampleSet) {
            ExampleSet resultSet = (ExampleSet) ioResult.getElementAt(0);
        }
        RepositoryLocation resloc = new RepositoryLocation(
                "//NewLocalRepository/Project/Res1");
        RepositoryManager.getInstance(null).store(resultSet, resloc, null);
        //process.run();
    }
}
However, the compiler cannot recognize resultSet as an IOObject and gives an error.
regards, Fire
Hello,
You are trying to use the local variable resultSet outside of the scope you defined it in. This is something you need to be wary of when working with Java: you can only access variables in their respective scope. A block like {...} also counts as a scope.
So to fix your problem, you need to define the variable outside of the if-statement:
ExampleSet resultSet = null;
if (ioResult.getElementAt(0) instanceof ExampleSet) {
    resultSet = (ExampleSet) ioResult.getElementAt(0);
}
RepositoryLocation resloc = new RepositoryLocation("//NewLocalRepository/Project/Res1");
RepositoryManager.getInstance(null).store(resultSet, resloc, null);
Alternatively, and probably better in this case, would be to move the code which depends on the if-statement anyway into it:
if (ioResult.getElementAt(0) instanceof ExampleSet) {
    ExampleSet resultSet = (ExampleSet) ioResult.getElementAt(0);
    RepositoryLocation resloc = new RepositoryLocation("//NewLocalRepository/Project/Res1");
    RepositoryManager.getInstance(null).store(resultSet, resloc, null);
}
Regards,
Marco
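The two variants described above can be tried without RapidMiner. The following is only a sketch of the scoping rule: String stands in for ExampleSet, a plain List stands in for the repository, and the class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; String plays the role of ExampleSet here.
final class ScopeDemo {

    // Variant 1: declare the variable before the if-block so it stays visible after it.
    static String declaredOutside(Object element) {
        String text = null;
        if (element instanceof String) {
            text = (String) element;
        }
        // 'text' is in scope here, but it is still null if the check failed.
        return text;
    }

    // Variant 2: do all work that needs the cast value inside the block.
    static void usedInside(Object element, List<String> store) {
        if (element instanceof String) {
            String text = (String) element;
            store.add(text);
        }
        // 'text' does not exist here; referring to it would not compile.
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<String>();
        usedInside("golf", store);
        usedInside(Integer.valueOf(42), store); // ignored: not a String
        System.out.println(declaredOutside("golf"));
        System.out.println(declaredOutside(Integer.valueOf(42)));
        System.out.println(store);
    }
}
```

With variant 1, the code after the block still runs when the check fails and then works with null, which is one reason the second variant is the safer choice here.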
Thank you, Marco, I hope it will work out. One more thing: how do I read the output? I mean, I want to use the data in the output file and present it the way the RM GUI does, as tables or maybe even as graphical models, but without using the RM GUI. It may sound odd, but I want to know at least how to get the data from the output file and view it in console mode.
best regards, Fire
Marco, I tried your suggestion and it worked, but I can't see the result file, I mean the output file. This is what I have:
2011-04-26 12:37:06 INFO: No filename given for result file, using stdout for logging results! (WrapperLoggingHandler.log())
2011-04-26 12:37:06 INFO: Process starts (Process.run())
2011-04-26 12:37:06 INFO: Process finished successfully after 0 s (Process.run())
regards, Fire.
Hi,
Now you have an ExampleSet object. You can use it to retrieve the actual data, e.g. the attributes and the values:
// note: resultSet must be an instance of ExampleSet
for (Example example : resultSet) {
    Iterator<Attribute> allAtts = resultSet.getAttributes().allAttributes();
    while (allAtts.hasNext()) {
        Attribute a = allAtts.next();
        if (a.isNumerical()) {
            double value = example.getValue(a);
        } else {
            String value = example.getValueAsString(a);
        }
    }
}
To find out what else you can do with it, check the methods on the ExampleSet object, the Attribute object and the Example object.
Regards,
Marco
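For viewing such values in console mode, one row could be turned into a line of text roughly like this. It is a sketch with stand-in types only: a Map plays the role of one example, the attribute names are borrowed from the Golf data, the values are purely illustrative, and ConsoleRowPrinter is an invented name rather than a RapidMiner class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; a Map stands in for one example (row) of the data.
final class ConsoleRowPrinter {

    // Formats one row as "name=value" pairs, treating numbers and text separately,
    // in the same spirit as the isNumerical() branch above.
    static String formatRow(Map<String, Object> row) {
        StringBuilder line = new StringBuilder();
        for (Map.Entry<String, Object> cell : row.entrySet()) {
            if (line.length() > 0) {
                line.append(", ");
            }
            Object value = cell.getValue();
            if (value instanceof Number) {
                double number = ((Number) value).doubleValue();
                line.append(cell.getKey()).append('=').append(number);
            } else {
                line.append(cell.getKey()).append('=').append(String.valueOf(value));
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the columns in insertion order.
        Map<String, Object> row = new LinkedHashMap<String, Object>();
        row.put("Outlook", "sunny");
        row.put("Temperature", 85.0);
        row.put("Play", "no");
        System.out.println(formatRow(row)); // Outlook=sunny, Temperature=85.0, Play=no
    }
}
```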
if (ioResult.getElementAt(0) instanceof ExampleSet) {
    ExampleSet resultSet = (ExampleSet) ioResult.getElementAt(0);
}
This code doesn't seem to work. As I understand it, ExampleSet is an interface which doesn't store any data, while ioResult is a container, and the compiler can't recognize any element of ioResult as an instance of ExampleSet, so the if statement doesn't work.
Hi,
I'm sorry, but that code works just fine for me. Of course, you need to make sure that you're actually getting an ExampleSet as a result from your process and that the index passed to ioResult.getElementAt(index) is correct.
What do you mean by "the compiler can't recognize it"? Does it give you an error when you try to compile the program, or is the element simply not an ExampleSet at runtime?
Please check what's in the ioResult object via the Eclipse debug view and post it. If the ioResult container is empty, make sure your process actually generates a result and that you're checking the correct element position (via ioResult.getElementAt(index)).
If all that fails, post your process XML and the results of your inquiry.
Regards,
Marco
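Besides the debugger, the contents of a result container can also be listed in code. The sketch below uses a plain List and two empty stand-in classes (FakeTable and FakeModel, both invented for this example) instead of the real IOContainer and its objects. It shows that an instanceof check on the wrong element is simply false at runtime and never a compile error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; a List stands in for the result container.
final class ResultInspector {

    // Invented stand-ins for a table-like result and a model-like result.
    static final class FakeTable { }
    static final class FakeModel { }

    // Returns one line per element: its index and its runtime class name.
    static List<String> describe(List<?> results) {
        List<String> lines = new ArrayList<String>();
        for (int i = 0; i < results.size(); i++) {
            Object element = results.get(i);
            lines.add(i + ": " + element.getClass().getSimpleName());
        }
        return lines;
    }

    // Returns the index of the first element of the wanted type, or -1 if there is none.
    static int indexOfFirst(List<?> results, Class<?> wanted) {
        for (int i = 0; i < results.size(); i++) {
            if (wanted.isInstance(results.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Object> results = Arrays.<Object>asList(new FakeModel(), new FakeTable());
        for (String line : describe(results)) {
            System.out.println(line);
        }
        // Index 0 holds a model here, so a check for FakeTable at index 0 would be false.
        System.out.println(indexOfFirst(results, FakeTable.class));
    }
}
```

Printing the class names first makes it clear which index, if any, holds the type the cast expects.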
I will try to explain everything as clearly as I can. ;) I added some code to print "Hello" inside the if statement you recommended. The word did not appear, although the compiler did not report any error. That means ioResult.getElementAt(0) is not an instance of ExampleSet, right? I also followed your suggestion to check the content of ioResult, but I'm not sure whether it contains a result that can be assigned to an ExampleSet. This is the content of the ioResult container:
<object-stream>
<com.rapidminer.operator.learner.bayes.SimpleDistributionModel id="1" serialization="custom">
<com.rapidminer.operator.AbstractIOObject>
<default>
<source>Naive Bayes</source>
</default>
</com.rapidminer.operator.AbstractIOObject>
<com.rapidminer.operator.ResultObjectAdapter>
<default>
<annotations id="2">
<keyValueMap id="3"/>
</annotations>
</default>
</com.rapidminer.operator.ResultObjectAdapter>
<com.rapidminer.operator.AbstractModel>
<default>
<headerExampleSet id="4" serialization="custom">
<com.rapidminer.operator.ResultObjectAdapter>
<default>
<annotations id="5">
<keyValueMap id="6"/>
</annotations>
</default>
</com.rapidminer.operator.ResultObjectAdapter>
<com.rapidminer.example.set.AbstractExampleSet>
<default>
<idMap id="7"/>
<statisticsMap id="8"/>
</default>
</com.rapidminer.example.set.AbstractExampleSet>
<com.rapidminer.example.set.HeaderExampleSet>
<default>
<attributes class="SimpleAttributes" id="9">
<attributes class="linked-list" id="10">
<AttributeRole id="11">
<special>false</special>
<attribute class="PolynominalAttribute" id="12" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="13">
<keyValueMap id="14"/>
</annotations>
<attributeDescription id="15">
<name>Outlook</name>
<valueType>1</valueType>
<blockType>1</blockType>
<defaultValue>0.0</defaultValue>
<index>0</index>
</attributeDescription>
<constructionDescription>Outlook</constructionDescription>
<statistics class="linked-list" id="16">
<NominalStatistics id="17">
<mode>-1</mode>
<maxCounter>0</maxCounter>
</NominalStatistics>
<UnknownStatistics id="18">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="19"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
<PolynominalAttribute>
<default>
<nominalMapping class="PolynominalMapping" id="20">
<symbolToIndexMap id="21">
<entry>
<string>sunny</string>
<int>2</int>
</entry>
<entry>
<string>overcast</string>
<int>1</int>
</entry>
<entry>
<string>rain</string>
<int>0</int>
</entry>
</symbolToIndexMap>
<indexToSymbolMap id="22">
<string>rain</string>
<string>overcast</string>
<string>sunny</string>
</indexToSymbolMap>
</nominalMapping>
</default>
</PolynominalAttribute>
</attribute>
</AttributeRole>
<AttributeRole id="23">
<special>false</special>
<attribute class="NumericalAttribute" id="24" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="25">
<keyValueMap id="26"/>
</annotations>
<attributeDescription id="27">
<name>Temperature</name>
<valueType>3</valueType>
<blockType>1</blockType>
<defaultValue>0.0</defaultValue>
<index>1</index>
</attributeDescription>
<constructionDescription>Temperature</constructionDescription>
<statistics class="linked-list" id="28">
<NumericalStatistics id="29">
<sum>0.0</sum>
<squaredSum>0.0</squaredSum>
<valueCounter>0</valueCounter>
</NumericalStatistics>
<WeightedNumericalStatistics id="30">
<sum>0.0</sum>
<squaredSum>0.0</squaredSum>
<totalWeight>0.0</totalWeight>
<count>0.0</count>
</WeightedNumericalStatistics>
<com.rapidminer.example.MinMaxStatistics id="31">
<minimum>Infinity</minimum>
<maximum>-Infinity</maximum>
</com.rapidminer.example.MinMaxStatistics>
<UnknownStatistics id="32">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="33"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
</attribute>
</AttributeRole>
<AttributeRole id="34">
<special>false</special>
<attribute class="NumericalAttribute" id="35" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="36">
<keyValueMap id="37"/>
</annotations>
<attributeDescription id="38">
<name>Humidity</name>
<valueType>3</valueType>
<blockType>1</blockType>
<defaultValue>0.0</defaultValue>
<index>2</index>
</attributeDescription>
<constructionDescription>Humidity</constructionDescription>
<statistics class="linked-list" id="39">
<NumericalStatistics id="40">
<sum>0.0</sum>
<squaredSum>0.0</squaredSum>
<valueCounter>0</valueCounter>
</NumericalStatistics>
<WeightedNumericalStatistics id="41">
<sum>0.0</sum>
<squaredSum>0.0</squaredSum>
<totalWeight>0.0</totalWeight>
<count>0.0</count>
</WeightedNumericalStatistics>
<com.rapidminer.example.MinMaxStatistics id="42">
<minimum>Infinity</minimum>
Continuation of the code from my previous post. I also have a question: what does the index in ioResult.getElementAt(index) stand for? As I understand it, it is the index of an IOObject in the IOContainer, right? How do I know which element can be assigned to an ExampleSet?
<maximum>-Infinity</maximum>
</com.rapidminer.example.MinMaxStatistics>
<UnknownStatistics id="43">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="44"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
</attribute>
</AttributeRole>
<AttributeRole id="45">
<special>false</special>
<attribute class="PolynominalAttribute" id="46" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="47">
<keyValueMap id="48"/>
</annotations>
<attributeDescription id="49">
<name>Wind</name>
<valueType>1</valueType>
<blockType>1</blockType>
<defaultValue>0.0</defaultValue>
<index>3</index>
</attributeDescription>
<constructionDescription>Wind</constructionDescription>
<statistics class="linked-list" id="50">
<NominalStatistics id="51">
<mode>-1</mode>
<maxCounter>0</maxCounter>
</NominalStatistics>
<UnknownStatistics id="52">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="53"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
<PolynominalAttribute>
<default>
<nominalMapping class="PolynominalMapping" id="54">
<symbolToIndexMap id="55">
<entry>
<string>false</string>
<int>1</int>
</entry>
<entry>
<string>true</string>
<int>0</int>
</entry>
</symbolToIndexMap>
<indexToSymbolMap id="56">
<string>true</string>
<string>false</string>
</indexToSymbolMap>
</nominalMapping>
</default>
</PolynominalAttribute>
</attribute>
</AttributeRole>
<AttributeRole id="57">
<special>true</special>
<specialName>label</specialName>
<attribute class="PolynominalAttribute" id="58" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="59">
<keyValueMap id="60"/>
</annotations>
<attributeDescription id="61">
<name>Play</name>
<valueType>1</valueType>
<blockType>1</blockType>
<defaultValue>0.0</defaultValue>
<index>4</index>
</attributeDescription>
<constructionDescription>Play</constructionDescription>
<statistics class="linked-list" id="62">
<NominalStatistics id="63">
<mode>-1</mode>
<maxCounter>0</maxCounter>
</NominalStatistics>
<UnknownStatistics id="64">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="65"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
<PolynominalAttribute>
<default>
<nominalMapping class="PolynominalMapping" id="66">
<symbolToIndexMap id="67">
<entry>
<string>yes</string>
<int>1</int>
</entry>
<entry>
<string>no</string>
<int>0</int>
</entry>
</symbolToIndexMap>
<indexToSymbolMap id="68">
<string>no</string>
<string>yes</string>
</indexToSymbolMap>
</nominalMapping>
</default>
</PolynominalAttribute>
</attribute>
</AttributeRole>
</attributes>
</attributes>
</default>
</com.rapidminer.example.set.HeaderExampleSet>
</headerExampleSet>
</default>
</com.rapidminer.operator.AbstractModel>
<com.rapidminer.operator.learner.bayes.SimpleDistributionModel>
<default>
<laplaceCorrectionEnabled>true</laplaceCorrectionEnabled>
<modelRecentlyUpdated>false</modelRecentlyUpdated>
<numberOfAttributes>4</numberOfAttributes>
<numberOfClasses>2</numberOfClasses>
<totalWeight>14.0</totalWeight>
<attributeNames id="69">
<string>Outlook</string>
<string>Temperature</string>
<string>Humidity</string>
<string>Wind</string>
</attributeNames>
<attributeValues id="70">
<string-array id="71">
<string>rain</string>
<string>overcast</string>
<string>sunny</string>
<string>unknown</string>
</string-array>
<null/>
<null/>
<string-array id="72">
<string>true</string>
<string>false</string>
<string>unknown</string>
</string-array>
</attributeValues>
<className>Play</className>
<classValues id="73">
<string>no</string>
<string>yes</string>
</classValues>
<classWeights id="74">
<double>5.0</double>
<double>9.0</double>
</classWeights>
<distributionProperties id="75">
<double-array-array id="76">
<double-array id="77">
<double>-0.9367692632176956</double>
<double>-4.30406509320417</double>
<double>-0.5428649775106072</double>
<double>-4.30406509320417</double>
</double-array>
<double-array id="78">
<double>-1.10633433476202</double>
<double>-0.8244831826210324</double>
<double>-1.5002386204691083</double>
<double>-4.867534450455582</double>
</double-array>
</double-array-array>
<double-array-array id="79">
<double-array id="80">
<double>74.6</double>
<double>7.8930349042684576</double>
<double>2.9849192461013776</double>
</double-array>
<double-array id="81">
<double>73.0</double>
<double>6.164414002968976</double>
<double>2.7377316130678655</double>
</double-array>
</double-array-array>
<double-array-array id="82">
<double-array id="83">
<double>84.0</double>
<double>9.617692030835672</double>
<double>3.182542855463862</double>
</double-array>
<double-array id="84">
<double>78.22222222222223</double>
<double>9.88405000212182</double>
<double>3.209860880213608</double>
</double-array>
</double-array-array>
<double-array-array id="85">
<double-array id="86">
<double>-0.5292593254548287</double>
<double>-0.923163611161917</double>
<double>-4.290459441148391</double>
</double-array>
<double-array id="87">
<double>-1.0986122886681096</double>
<double>-0.41716114787135555</double>
<double>-4.859812404361672</double>
</double-array>
</double-array-array>
</distributionProperties>
<nominal id="88">
<boolean>true</boolean>
<boolean>false</boolean>
<boolean>false</boolean>
<boolean>true</boolean>
</nominal>
<priors id="89">
<double>-1.0296194171811581</double>
<double>-0.4418327522790392</double>
</priors>
<weightSums id="90">
<double-array-array id="91">
<double-array id="92">
<double>2.0</double>
<double>0.0</double>
<double>3.0</double>
<double>0.0</double>
</double-array>
<double-array id="93">
<double>3.0</double>
<double>4.0</double>
<double>2.0</double>
<double>0.0</double>
</double-array>
</double-array-array>
<double-array-array id="94">
<double-array id="95">
<double>373.0</double>
<double>28075.0</double>
<double>0.0</double>
</double-array>
<double-array id="96">
<double>657.0</double>
<double>48265.0</double>
<double>0.0</double>
</double-array>
</double-array-array>
<double-array-array id="97">
<double-array id="98">
<double>420.0</double>
<double>35650.0</double>
<double>0.0</double>
</double-array>
<double-array id="99">
<double>704.0</double>
<double>55850.0</double>
<double>0.0</double>
</double-array>
</double-array-array>
<double-array-array id="100">
<double-array id="101">
<double>3.0</double>
<double>2.0</double>
<double>0.0</double>
</double-array>
<double-array id="102">
<double>3.0</double>
<double>6.0</double>
<double>0.0</double>
</double-array>
</double-array-array>
</weightSums>
</default>
</com.rapidminer.operator.learner.bayes.SimpleDistributionModel>
</com.rapidminer.operator.learner.bayes.SimpleDistributionModel>
</object-stream>
Marco, sorry for my stupidity. :) I figured out the problem. Thank you so much.
Regards, Fire