[SOLVED] Model from Generate Script generates no values
aborg
Hi,
I create a PCAModel in Groovy. It seems to be OK in the model output (I see the same values as in the normal PCA operator's model). But when I try to apply that model to the same dataset, I get nothing but missing values. I guess I did something obviously wrong, but I do not see where the problem is. Do you have any idea?
Here is my process (sorry, it is intentionally large, for testing purposes):
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
<context>
<input>
<location>//Samples/data/Polynomial</location>
</input>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
<process expanded="true" height="251" width="681">
<operator activated="true" class="multiply" compatibility="5.2.008" expanded="true" height="112" name="Multiply" width="90" x="45" y="30"/>
<operator activated="true" class="execute_script" compatibility="5.2.008" expanded="true" height="76" name="Execute Script" width="90" x="246" y="30">
<parameter key="script" value="import com.rapidminer.operator.features.transformation.PCAModel; /*String macroName = &quot;temp_path&quot;; Attribute macroValueAttribute = AttributeFactory.createAttribute(&quot;macroValue&quot;, com.rapidminer.tools.Ontology.NOMINAL); String macroValue = operator.getProcess().macroHandler.getMacro(macroName); macroValueAttribute.setMapping( new com.rapidminer.example.table.PolynominalMapping( //Collections.singletonMap(Integer.valueOf(0), macroValue) [0:macroValue] )); ExampleTable table = new MemoryExampleTable(macroValueAttribute); table.addDataRow(new IntArrayDataRow(0)); ExampleSet ret = new SimpleExampleSet(table); //String macroValue = operator.getProcess().macroHandler.getMacro(macroName); return [ret] as ExampleSet[];*/ ExampleSet exampleSet = input[0]; int dim = exampleSet.getAttributes().size(); /*double[] eigenValues = new double[dim]; double[][] eigenVectors = new double[dim][dim]; Random r = new Random(2); for (int i = dim; i-->0;) { 	eigenValues = 1.0;//11.0 - i; 	for (int j = dim; j-->0;) 		eigenVectors = 0.0;//r.nextDouble() - 0.5; 	eigenVectors[dim - i - 1] = 1; }*/ Jama.Matrix m = com.rapidminer.tools.math.matrix.CovarianceMatrix. getCovarianceMatrix(exampleSet); double[][] v = m.eig().getV().getArray(); Model model = new PCAModel(exampleSet, /*eigenValues*/m.eig().getRealEigenvalues(), /*eigenVectors*/v); model.setParameter(&quot;keep_attribues&quot;, &quot;true&quot;); model.setParameter(&quot;dimensionality_reduction&quot;, &quot;none&quot;); model.setParameter(&quot;number_of_components&quot;, Integer.toString(dim)); model.setParameter(&quot;variance_threshold&quot;, &quot;1.0&quot;); model.setNumberOfComponents(dim); return [model] as Model[];"/>
</operator>
<operator activated="true" class="principal_component_analysis" compatibility="5.2.008" expanded="true" height="94" name="PCA" width="90" x="246" y="120"/>
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model (2)" width="90" x="380" y="165">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model" width="90" x="380" y="30">
<list key="application_parameters"/>
</operator>
<connect from_port="input 1" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Execute Script" to_port="input 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Multiply" from_port="output 3" to_op="PCA" to_port="example set input"/>
<connect from_op="Execute Script" from_port="output 1" to_op="Apply Model" to_port="model"/>
<connect from_op="PCA" from_port="example set output" to_port="result 2"/>
<connect from_op="PCA" from_port="original" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="PCA" from_port="preprocessing model" to_op="Apply Model (2)" to_port="model"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 3"/>
<connect from_op="Apply Model (2)" from_port="model" to_port="result 5"/>
<connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
<connect from_op="Apply Model" from_port="model" to_port="result 4"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
<portSpacing port="sink_result 5" spacing="0"/>
<portSpacing port="sink_result 6" spacing="0"/>
</process>
</operator>
</process>
Thanks, gabor
PS: RM 5.2 Community edition; I was not sure whether this is Development topic or not, so no hard feelings if this gets moved.
Answers
Never mind... After some debugging I found the solution. I thought the Tools#onlyNumericalAttributes and Tools#onlyNonMissingValues calls were just checking some invariants, but it turned out they also compute some statistics.
It is interesting that those stats (the attribute means) are not checked in the model constructor, just saved. (Maybe a check for NaNs would not take too long and could report errors. Some Javadoc would also help a bit.)
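To illustrate why missing statistics lead to nothing but missing values, here is a small standalone Java sketch. It does not use the RapidMiner API: the class `PcaMeansSketch` and both of its methods are made up for illustration, and it assumes that the model centers each row on the stored attribute means and that means which were never computed come out as NaN. `requireFiniteMeans` shows the kind of cheap constructor-time check mentioned in the answer.

```java
import java.util.Arrays;

public class PcaMeansSketch {

    /** Projects one row onto the components after subtracting the stored means. */
    public static double[] project(double[] row, double[] means, double[][] components) {
        double[] out = new double[components.length];
        for (int c = 0; c < components.length; c++) {
            double sum = 0.0;
            for (int a = 0; a < row.length; a++) {
                // A NaN mean makes every term, and therefore the whole sum, NaN.
                sum += (row[a] - means[a]) * components[c][a];
            }
            out[c] = sum;
        }
        return out;
    }

    /** Hypothetical check: reject means that were apparently never computed. */
    public static void requireFiniteMeans(double[] means) {
        for (int a = 0; a < means.length; a++) {
            if (Double.isNaN(means[a]) || Double.isInfinite(means[a])) {
                throw new IllegalArgumentException(
                        "Mean of attribute " + a + " is not finite; were the statistics computed?");
            }
        }
    }

    public static void main(String[] args) {
        double[] row = {1.0, 2.0};
        double[][] identity = {{1.0, 0.0}, {0.0, 1.0}};

        // Statistics never computed (assumed to show up as NaN means).
        double[] missingMeans = {Double.NaN, Double.NaN};
        System.out.println(Arrays.toString(project(row, missingMeans, identity))); // [NaN, NaN]

        // Statistics computed before the model was built.
        double[] realMeans = {0.5, 1.5};
        System.out.println(Arrays.toString(project(row, realMeans, identity))); // [0.5, 0.5]
    }
}
```

In the script itself, the fix would presumably be to call the two Tools methods on the example set before constructing the PCAModel, so that the statistics exist by the time the constructor saves them.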