Rapidminer changes my values...
Jorge
New Altair Community Member
Hi,
I'm working with RapidMiner 4.3 on a project that uses the following process, which trains on input.arff and applies the model to prediction.arff:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ArffExampleSource" class="ArffExampleSource" breakpoints="after">
        <parameter key="data_file" value="C:\Input.arff"/>
        <parameter key="id_attribute" value="1"/>
        <parameter key="label_attribute" value="example6"/>
    </operator>
    <operator name="InteractiveAttributeWeighting" class="InteractiveAttributeWeighting">
    </operator>
    <operator name="Learn" class="OperatorChain" expanded="yes">
        <operator name="W-NaiveBayesUpdateable" class="W-NaiveBayesUpdateable">
        </operator>
        <operator name="ModelWriter" class="ModelWriter">
            <parameter key="model_file" value="C:\model.mod"/>
            <parameter key="output_type" value="XML"/>
        </operator>
    </operator>
    <operator name="ArffExampleSource (2)" class="ArffExampleSource" breakpoints="after">
        <parameter key="data_file" value="C:\Prediction.arff"/>
        <parameter key="id_attribute" value="1"/>
        <parameter key="label_attribute" value="example6"/>
    </operator>
    <operator name="ModelLoader" class="ModelLoader">
        <parameter key="model_file" value="C:\model.mod"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
    </operator>
</operator>
This is input.arff:
@RELATION Input
@ATTRIBUTE Id numeric
@ATTRIBUTE example1 string
@ATTRIBUTE example2 string
@ATTRIBUTE example3 string
@ATTRIBUTE example4 string
@ATTRIBUTE example5 string
@ATTRIBUTE example6 string
@DATA
'1','ex1','hello4','hw1','false','1000k','slow'
'2','ex1','hello6','hw2','true','4000k','slow'
'3','ex1','hello2','hw3','false','500k','slow'
'4','ex1','hello3','hw3','true','2000k','slow'
'5','ex2','hello2','hw2','true','500k','slow'
'6','ex2','hello5','hw1','true','1000k','mid'
'7','ex2','hello2','hw3','false','4000k','fast'
'8','ex3','hello','hw1','true','2000k','mid'
'9','ex3','hello','hw2','true','4000k','fast'
'10','ex3','hello','hw3','false','2000k','slow'
'11','ex3','hello','hw1','false','500k','mid'
When I run the process and click "Data View" on the "Data Table" in the results, the values in the "example1" column differ from the example1 attribute in prediction.arff:
@RELATION Prediction
@ATTRIBUTE Id numeric
@ATTRIBUTE example1 string
@ATTRIBUTE example2 string
@ATTRIBUTE example3 string
@ATTRIBUTE example4 string
@ATTRIBUTE example5 string
@DATA
'100','ex1','hello','hw1','false','1000k'
'101','ex1','hello2','hw2','true','4000k'
'102','ex1','hello','hw2','true','4000k'
'103','ex1','hello2','hw3','true','500k'
'104','ex1','hello','hw2','true','2000k'
'105','ex1','hello2','hw1','true','4000k'
'106','ex2','hello3','hw1','false','500k'
'107','ex3','hello3','hw2','true','4000k'
'108','ex3','hello4','hw3','true','500k'
'109','ex3','hello5','hw3','false','500k'
'110','ex3','hello6','hw2','true','500k'
'111','ex3','hello2','hw1','false','500k'
'112','ex3','hello6','hw1','true','500k'
Can anyone help me?
Is this only a display error, or does it also affect the learning operator?
Thanks in advance.
Cheers,
Jorge
Answers
-
Hello Jorge
I got this warning message:
[Warning] W-NaiveBayesUpdateable: The internal nominal mappings are not the same between training and application for attribute 'example2'. This will probably lead to wrong results during model application.
RM stores an internal mapping for nominal values, and the models depend on it. As a workaround I suggest:
-> Load both files and add an attribute marking each example as train or prediction (AttributeConstruction and ChangeAttributeRole)
-> Merge them (ExampleSetMerge)
-> Save the result as an example set
Now you can run your posted process either by loading the set twice and applying ExampleFilter, or by using a combination of ExampleFilter and IOMultiplier.
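To make the mapping issue concrete, here is a small Python sketch. It is not RapidMiner code: it assumes, for illustration only, that each nominal value gets an index in order of first appearance, and build_mapping is a hypothetical helper. The value lists are the example2 columns from the two ARFF files above.

```python
def build_mapping(values):
    """Give each distinct nominal value an index in order of first appearance.

    Assumption for illustration: this mimics how a tool might index nominal
    values; it is not taken from RapidMiner's source.
    """
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


# example2 column of input.arff (rows 1..11)
train_example2 = ["hello4", "hello6", "hello2", "hello3", "hello2", "hello5",
                  "hello2", "hello", "hello", "hello", "hello"]
# example2 column of prediction.arff (rows 100..112)
pred_example2 = ["hello", "hello2", "hello", "hello2", "hello", "hello2",
                 "hello3", "hello3", "hello4", "hello5", "hello6", "hello2",
                 "hello6"]

train_map = build_mapping(train_example2)
pred_map = build_mapping(pred_example2)

# A model trained on train_map interprets indices with the training mapping.
# Decoding the prediction file's indices that way yields the wrong values.
index_to_train_value = {index: value for value, index in train_map.items()}
misread = [index_to_train_value[pred_map[value]] for value in pred_example2]

# Building one mapping over the merged data gives both parts the same indices.
shared_map = build_mapping(train_example2 + pred_example2)

print(train_map)
print(pred_map)
print(list(zip(pred_example2, misread)))
```

Under that assumption, the same set of values ends up with different indices in the two files, which is the situation the warning describes. Merging first means there is only one mapping to build.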
Hope this was helpful.
Regards,
Steffen
-
Thanks a lot, Steffen.
It works perfectly now.
-
Steffen, I have the same problem, but in RapidMiner 5.0. I apply a model obtained from the "optimize selection evolutionary" process and I get the same warning:
" WARNING: SimpleDistribution: The internal nominal mappings are not the same between training and application for attribute 'carrera'. This will probably lead to wrong results during model application."
The prediction results are also not the same as those from the split validation, which tells me that this warning really does lead to wrong results.
But I can't find the operators you mention:
RM stores a mapping for nominal values which somehow affects the models. I suggest as workaround:
-> Load both files, add an attribute marking it as train /prediction (AttributeConstruction and ChangeAttributeRole)
-> Merge (ExampleSetMerge)
-> save as exampleset
How can I do this in RapidMiner 5.0?
I tried it without the optimizer; my XML looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
<process expanded="true" height="528" width="619">
<operator activated="true" class="retrieve" compatibility="5.0.10" expanded="true" height="60" name="vm_socdem_e_Xchanged" width="90" x="45" y="120">
<parameter key="repository_entry" value="vm_socdem_e_Xchanged"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.0.10" expanded="true" height="76" name="SET ID" width="90" x="160" y="127">
<parameter key="name" value="cuenta"/>
<parameter key="target_role" value="id"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.0.10" expanded="true" height="76" name="Set Role" width="90" x="281" y="136">
<parameter key="name" value="aprob_c"/>
<parameter key="target_role" value="label"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.0.10" expanded="true" height="76" name="Select Attributes (2)" width="90" x="380" y="30">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="turno|raz_elec|no_unam|ingr_fi|esc_m|edad|carrera|a_ov|sost_ec|alg|geo_e|geo_a|qui|elec|X_sec|X_bach|bach|ENP|transp|dur_bach|trastes|refri|c_agua|tv_cable|horno_m|cel|inter|comp|auto_p|p_serv"/>
</operator>
<operator activated="true" class="naive_bayes" compatibility="5.0.10" expanded="true" height="76" name="Naive Bayes" width="90" x="447" y="165"/>
<operator activated="true" class="retrieve" compatibility="5.0.10" expanded="true" height="60" name="vm_socdem_e_Xchanged_prueba" width="90" x="45" y="300">
<parameter key="repository_entry" value="vm_socdem_e_Xchanged_prueba"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.0.10" expanded="true" height="76" name="vm_socdem_prueba" width="90" x="112" y="435">
<parameter key="name" value="cuenta"/>
<parameter key="target_role" value="id"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.0.10" expanded="true" height="76" name="Select Attributes" width="90" x="313" y="300">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="carrera|no_unam|edad|turno|raz_elec|ingr_fi|esc_m|a_ov|sost_ec|alg|geo_a|geo_e|elec|qui|X_sec|X_bach|bach|ENP|dur_bach|transp|refri|trastes|c_agua|cel|tv_cable|horno_m|comp|inter|auto_p|p_serv"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.0.10" expanded="true" height="76" name="Apply Model" width="90" x="492" y="288">
<list key="application_parameters"/>
<parameter key="create_view" value="true"/>
</operator>
<connect from_op="vm_socdem_e_Xchanged" from_port="output" to_op="SET ID" to_port="example set input"/>
<connect from_op="SET ID" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_op="Naive Bayes" to_port="training set"/>
<connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="vm_socdem_e_Xchanged_prueba" from_port="output" to_op="vm_socdem_prueba" to_port="example set input"/>
<connect from_op="vm_socdem_prueba" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
<connect from_op="Apply Model" from_port="model" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="216"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
-
Hi,
The merge operator is now called Append.
Greetings,
Sebastian
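For readers who want to see the shape of the workaround discussed in this thread (mark, merge, then split again), here is a rough Python sketch. It only mirrors the roles of the operators; tag, append and filter_examples are hypothetical helper names, not RapidMiner API calls, and the rows are shortened to two attributes taken from the ARFF files in the first post.

```python
def tag(rows, source):
    """Return copies of the rows with an extra 'source' marker attribute."""
    return [dict(row, source=source) for row in rows]


def append(*example_sets):
    """Concatenate several example sets into one, keeping row order."""
    merged = []
    for example_set in example_sets:
        merged.extend(example_set)
    return merged


def filter_examples(rows, source):
    """Keep only the rows whose marker attribute equals the given source."""
    return [row for row in rows if row["source"] == source]


# Shortened rows for illustration.
train = [{"Id": 1, "example2": "hello4"}, {"Id": 2, "example2": "hello6"}]
prediction = [{"Id": 100, "example2": "hello"}, {"Id": 101, "example2": "hello2"}]

# Mark each part, merge them into one set, then split by the marker again.
merged = append(tag(train, "train"), tag(prediction, "prediction"))
train_part = filter_examples(merged, "train")
prediction_part = filter_examples(merged, "prediction")

print([row["Id"] for row in train_part])
print([row["Id"] for row in prediction_part])
```

The point of going through one merged set is that both parts then come from the same stored data, so they should share the same nominal value handling when the learner and the model application read them.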