"Introducing Missing Values"
cherokee
New Altair Community Member
Hi!
How can I mark some value as missing?
I have some data where each (real-valued) feature of each instance has a value. Now I have to mark some values as missing. How can I do this in RM?
Best regards,
chero
Hi,
That depends on when you want to change the data: before importing it into RapidMiner, or inside a RapidMiner process?
Greetings,
Sebastian
Hi!
I want to change it inside a RM process.
Best regards,
chero
Hi,
I would use the Generate Attributes operator for this. You can define conditions there to decide when a value should become unknown. Unfortunately there is no way to write a missing value directly in an expression, but you can enter 0/0, which evaluates to a missing value. Here's a sample process:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
<context>
<input>
<location/>
</input>
<output>
<location/>
<location/>
</output>
<macros/>
</context>
<operator activated="true" class="process" expanded="true" name="Process">
<parameter key="parallelize_main_process" value="true"/>
<process expanded="true" height="595" width="366">
<operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="98" y="74"/>
<operator activated="true" class="generate_attributes" expanded="true" height="76" name="Generate Attributes" width="90" x="246" y="75">
<list key="function_descriptions">
<parameter key="att1_new" value="if (att1 > 3, 0/0, att1)"/>
</list>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Sebastian
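A short note on why the 0/0 trick works, as a sketch rather than a statement about RapidMiner internals: the Java snippet later in this thread marks a value as missing by setting it to Double.NaN, and in IEEE 754 floating-point arithmetic 0.0/0.0 evaluates to NaN. Assuming the expression is evaluated in floating point, the condition in the process above behaves roughly like the plain Java below. The class and method names are made up for illustration.

```java
final class NaNDemo {
    // Mirrors if (att1 > 3, 0/0, att1): values above the threshold become NaN.
    // Note the floating-point literals; integer 0 / 0 would throw ArithmeticException.
    static double markIfAbove(double value, double threshold) {
        return value > threshold ? 0.0 / 0.0 : value;
    }

    // NaN is not equal to anything, including itself, so test with Double.isNaN.
    static boolean isMissing(double value) {
        return Double.isNaN(value);
    }

    public static void main(String[] args) {
        System.out.println(isMissing(markIfAbove(5.0, 3.0)));
        System.out.println(isMissing(markIfAbove(2.0, 3.0)));
    }
}
```

A comparison such as `value == Double.NaN` is always false, which is why the check goes through `Double.isNaN`.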
Hi,
This is not a solution to my problem. I use a program to encode specific data. That program does not allow marking data as missing, and there is no value I could use as a missing-value marker. The resulting data is written in a specific format.
I wrote my own RM operator to import this data. Now I need to say something like: attribute 1 of example 13 is missing. This can NOT be encoded in the data itself.
The only solution I see is importing the data into RM and exporting it to CSV (or similar). In that file I could give those attributes specific values, which I could then recode in RM the way you suggested. Is there an easier way?
Best regards,
chero
Hello,
Since you have written your own input operator, setting missing values should be a piece of cake for you:
Example example = <get-example-from-anywhere>;
example.setValue(<attribute>, Double.NaN);
Did you try that? No guarantee that it works; I am currently not able to test it.
steffen
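For the case described above, where the missing positions are known separately from the data ("attribute 1 of example 13 is missing"), the same idea can be sketched without the RapidMiner API. The code below uses a plain double[][] as a stand-in for the imported table and a hypothetical Cell class for the positions; inside a real import operator, the assignment would instead be the example.setValue(<attribute>, Double.NaN) call shown above.

```java
import java.util.Arrays;
import java.util.List;

final class MissingValueMarker {
    /** Hypothetical holder for one missing position; both indices are zero-based. */
    static final class Cell {
        final int example;
        final int attribute;

        Cell(int example, int attribute) {
            this.example = example;
            this.attribute = attribute;
        }
    }

    /** Overwrites every listed position with NaN, the marker for "missing". */
    static void markMissing(double[][] data, List<Cell> missing) {
        for (Cell cell : missing) {
            data[cell.example][cell.attribute] = Double.NaN;
        }
    }

    /** Counts the values currently marked as missing. */
    static int countMissing(double[][] data) {
        int count = 0;
        for (double[] row : data) {
            for (double value : row) {
                if (Double.isNaN(value)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        double[][] data = { {1.0, 2.0}, {3.0, 4.0}, {5.0, 6.0} };
        // Mark the first attribute of the second example as missing.
        markMissing(data, Arrays.asList(new Cell(1, 0)));
        System.out.println(Arrays.deepToString(data));
        System.out.println(countMissing(data));
    }
}
```

Keeping the list of missing positions next to the data means the encoded file needs no reserved marker value, which was the constraint in the original question.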
Hi!
I haven't tried it yet, but it seems to be the easiest solution. I'll give it a try in the next few days.
Best regards,
chero