Running RM process (Correlation Matrix) from Java does not use data from example set
nadi
New Altair Community Member
Hi,
When I run my process from a Java app, I always get the same result. Passing a dataset has no effect. I followed the examples I found, but was not able to find a solution. Thanks in advance.
Here is the Java code, followed by the process code.
Java Code:
static public void runProcess() {
    RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
    RapidMiner.init();
    try {
        // load the process from the .rmp file
        Process myProcess = new Process(new File("C:\\RM_Repository\\DemoPreparationProject\\Processes\\CorrelationAnalysis_NB.rmp"));
        ExampleSet es = createExampleSet();
        // wrap the example set so it can be handed to the process input ports
        IOContainer ioInput = new IOContainer(new IOObject[] {es});
        // execute the process and get the resulting objects
        // just use myProcess.run() if you don't use the input ports for your process
        IOContainer ioResult = myProcess.run(ioInput);
        NumericalMatrix resultExample = ioResult.get(NumericalMatrix.class);
        // print the matrix row by row, each value with width 6 and 2 decimals
        for (int row = 0; row < resultExample.getNumberOfRows(); row++) {
            for (int col = 0; col < resultExample.getNumberOfColumns(); col++) {
                System.out.print(String.format("%6.2f \t", resultExample.getValue(row, col)));
            }
            System.out.println("");
        }
    } catch (MalformedRepositoryLocationException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (XMLException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (OperatorException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
static public ExampleSet createExampleSet() {
    // create attribute list
    List<Attribute> attributes = new LinkedList<Attribute>();
    attributes.add(AttributeFactory.createAttribute("A", Ontology.REAL));
    attributes.add(AttributeFactory.createAttribute("B", Ontology.REAL));
    attributes.add(AttributeFactory.createAttribute("C", Ontology.REAL));
    attributes.add(AttributeFactory.createAttribute("D", Ontology.REAL));
    // create table
    MemoryExampleTable table = new MemoryExampleTable(attributes);
    // getData() is not shown here; date[a][d] is read as the value of attribute a in row d
    double[][] date = getData();
    // fill table (here: only real values), one data row per iteration
    for (int d = 0; d < date[0].length; d++) {
        double[] data = new double[attributes.size()];
        for (int a = 0; a < attributes.size(); a++) {
            // copy the value of attribute a for row d
            data[a] = date[a][d];
        }
        // add data row
        table.addDataRow(new DoubleArrayDataRow(data));
    }
    // create example set
    ExampleSet exampleSet = table.createExampleSet();
    return exampleSet;
}
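Since getData() is not part of the post, here is a minimal sketch in plain Java of what it might look like and how its values map to data rows. It assumes, as the loop bound date[0].length suggests, that getData() returns one array per attribute (A, B, C, D). The class name RowBuilder, the helper toRows and the sample numbers are made up for illustration and are not from the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowBuilder {

    // Hypothetical stand-in for getData(): one array per attribute.
    static double[][] getData() {
        return new double[][] {
            {1.0, 2.0, 3.0},   // attribute A
            {2.0, 4.0, 6.5},   // attribute B
            {9.0, 7.0, 1.0},   // attribute C
            {0.5, 0.1, 0.9}    // attribute D
        };
    }

    // Converts columns[attribute][row] into one double[] per row,
    // which is the shape a DoubleArrayDataRow expects.
    static List<double[]> toRows(double[][] columns) {
        List<double[]> rows = new ArrayList<>();
        int rowCount = columns.length == 0 ? 0 : columns[0].length;
        for (int r = 0; r < rowCount; r++) {
            double[] row = new double[columns.length];
            for (int a = 0; a < columns.length; a++) {
                row[a] = columns[a][r];
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Print each row to check that it holds one value per attribute.
        for (double[] row : toRows(getData())) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each array returned by toRows could then be wrapped in a DoubleArrayDataRow and added to the table, as in createExampleSet().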
Process code:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.006">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="send_mail" value="never"/>
<parameter key="notification_email" value=""/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<parameter key="parallelize_main_process" value="false"/>
<process expanded="true" height="-20" width="-50">
<operator activated="true" class="read_csv" compatibility="5.2.006" expanded="true" height="60" name="Read CSV" width="90" x="37" y="53">
<parameter key="csv_file" value="C:\RM_Repository\DemoPreparationProject\Data\RM_data.csv"/>
<parameter key="column_separators" value=","/>
<parameter key="trim_lines" value="false"/>
<parameter key="use_quotes" value="true"/>
<parameter key="quotes_character" value="&quot;"/>
<parameter key="escape_character_for_quotes" value="\"/>
<parameter key="skip_comments" value="false"/>
<parameter key="comment_characters" value="#"/>
<parameter key="parse_numbers" value="true"/>
<parameter key="decimal_character" value="."/>
<parameter key="grouped_digits" value="false"/>
<parameter key="grouping_character" value=","/>
<parameter key="date_format" value=""/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<parameter key="time_zone" value="SYSTEM"/>
<parameter key="locale" value="English (United States)"/>
<parameter key="encoding" value="windows-1252"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="A.true.real.attribute"/>
<parameter key="1" value="B.true.real.attribute"/>
<parameter key="2" value="C.true.real.attribute"/>
<parameter key="3" value="D.true.real.attribute"/>
</list>
<parameter key="read_not_matching_values_as_missings" value="true"/>
<parameter key="datamanagement" value="double_array"/>
</operator>
<operator activated="true" class="correlation_matrix" compatibility="5.2.006" expanded="true" height="94" name="Correlation Matrix" width="90" x="334" y="66">
<parameter key="create_weights" value="false"/>
<parameter key="normalize_weights" value="true"/>
<parameter key="squared_correlation" value="false"/>
</operator>
<connect from_op="Read CSV" from_port="output" to_op="Correlation Matrix" to_port="example set"/>
<connect from_op="Correlation Matrix" from_port="matrix" to_port="result 1"/>
<connect from_op="Correlation Matrix" from_port="weights" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
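One way to tell whether the process really used the data passed from Java is to compute the expected correlations independently and compare them with the matrix the process returns. The following is a rough sketch in plain Java, assuming the Correlation Matrix operator computes Pearson's correlation coefficient. The class CorrelationCheck and its sample columns are hypothetical and only serve as a reference calculation.

```java
class CorrelationCheck {

    // Pearson correlation of two equally long columns.
    // Returns NaN if one of the columns is constant (zero variance).
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("columns must have equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    // Pairwise correlations for columns[attribute][row].
    static double[][] correlationMatrix(double[][] columns) {
        int k = columns.length;
        double[][] matrix = new double[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < k; j++) {
                matrix[i][j] = pearson(columns[i], columns[j]);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Made-up sample columns, not the poster's data.
        double[][] columns = {
            {1.0, 2.0, 3.0, 4.0},
            {2.0, 4.0, 6.0, 8.5},
            {9.0, 7.0, 4.0, 1.0}
        };
        for (double[] row : correlationMatrix(columns)) {
            for (double value : row) {
                System.out.print(String.format("%6.2f \t", value));
            }
            System.out.println();
        }
    }
}
```

If the matrix printed by the Java app stays the same while this reference calculation changes with the input, the process is not reading the supplied example set.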
Answers
Hi,
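I took a quick look at the Java code and it looks good.
The problem here is your process. It loads a static file ("Data\RM_data.csv") with Read CSV and calculates the correlation matrix from that file. The process input port is not connected to anything, so the example set you pass in from Java is ignored and the result is always the same.
If you want to use the data you provide when you run the process via Java, the process input port ("input 1") has to be connected to the Correlation Matrix operator, so it has to look like this: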
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.007">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.007" expanded="true" name="Process">
<process expanded="true" height="145" width="212">
<operator activated="true" class="correlation_matrix" compatibility="5.2.007" expanded="true" height="94" name="Correlation Matrix" width="90" x="334" y="66"/>
<connect from_port="input 1" to_op="Correlation Matrix" to_port="example set"/>
<connect from_op="Correlation Matrix" from_port="matrix" to_port="result 1"/>
<connect from_op="Correlation Matrix" from_port="weights" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
Best,
Nils
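To catch this kind of wiring problem before running a process, one could scan the process XML for a connection that starts at a process input port. The following is a minimal sketch using only the JDK's DOM parser. It assumes a flat process (no nested subprocesses) and treats any `<connect>` without a `from_op` attribute whose `from_port` starts with "input" as a process-level input. The class ProcessInputCheck and the method usesProcessInput are hypothetical helpers, not part of the RapidMiner API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ProcessInputCheck {

    // Returns true if some <connect> element starts at a process input port,
    // i.e. it has no from_op attribute and its from_port starts with "input".
    static boolean usesProcessInput(String processXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            processXml.getBytes(StandardCharsets.UTF_8)));
            NodeList connects = doc.getElementsByTagName("connect");
            for (int i = 0; i < connects.getLength(); i++) {
                Element connect = (Element) connects.item(i);
                if (!connect.hasAttribute("from_op")
                        && connect.getAttribute("from_port").startsWith("input")) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse process XML", e);
        }
    }

    public static void main(String[] args) {
        // Reduced to the <connect> elements; a real .rmp file has more content.
        String readsCsv = "<process>"
                + "<connect from_op=\"Read CSV\" from_port=\"output\""
                + " to_op=\"Correlation Matrix\" to_port=\"example set\"/>"
                + "</process>";
        String readsInput = "<process>"
                + "<connect from_port=\"input 1\""
                + " to_op=\"Correlation Matrix\" to_port=\"example set\"/>"
                + "</process>";
        System.out.println("original process uses input: " + usesProcessInput(readsCsv));
        System.out.println("fixed process uses input: " + usesProcessInput(readsInput));
    }
}
```

For a real check, the contents of the .rmp file would be read into a string and passed to usesProcessInput.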