RapidMiner: Handling nominal missing attributes

sav New Altair Community Member
edited November 5 in Community Q&A
Hi fellas,

I'm a total noob in RapidMiner, I've just installed it on Mac.
My problem is sort of interesting for me. I'd appreciate any help. Here is my process:

image

In my process, when I apply the "Replace Missing Values" operator to a data set and run the process, only the missing numeric values are replaced by their average. The missing nominal (binominal and polynominal) values are not replaced and stay missing.

image

However (this is where it gets strange for me), when I hover over the output node of the "Replace Missing Values" operator in the diagram (process view), I see that all missing values are replaced.

image

I'd really like to know whether this is a bug or whether I'm making some ridiculous mistake.

Thanks a lot.
sav

Answers

  • earmijo
    earmijo New Altair Community Member
    From the operator's description:

    "For nominal attributes the mode is used for the average, i.e. the nominal value which occurs most often in the data. For nominal attributes and replacement type zero the first nominal value defined for this attribute is used. The replenishment "value" indicates that the user defined parameter should be used for the replacement."

    They get replaced by the mode. If you don't want this, apply the operator only to the numeric attributes by selecting that subset.

    Hope this helps,

    Ernesto
  • sav
    sav New Altair Community Member
    Hi,
    Thanks for the reply.

    The problem is that nominal values don't get replaced by anything when I look at the Results perspective, and when I export the results, the missing nominal values are still missing while the missing numeric values are replaced by the average. The interesting thing is that all missing values seem to be replaced when I roll over the result node in the process view (as shown in the pictures above). It's as if the "Replace Missing Values" operator ignores the nominal values in the results but not in the process view (where you design the process).

    What I want is to replace missing numeric values with their average (which happens) and missing nominal values with their mode (which doesn't happen). This is the output of values.preprocessing (an output of the "Replace Missing Values" operator):

    image

    I'd appreciate any help

    Thanks,
    Sav
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    The metadata preview (which you see in the tooltip when hovering over the node) is just that: a preview, which may or may not become reality after operator execution. Sometimes you simply can't know what will happen without actually executing the operator (and we obviously cannot do that for the preview), so the metadata preview may be wrong.
    That aside, there was indeed a bug involved here which prevented the replacement of nominal attributes. I have fixed the problem so your process should work once the next update gets released.

    Regards,
    Marco
  • sav
    sav New Altair Community Member
    Thanks Marco... you guys have created an awesome tool. You actually got me quite interested in Data Mining. It's a shame I found RapidMiner at the end of the semester.
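
For readers who want to see the replacement rule from the operator description spelled out, here is a minimal sketch in plain Python. It is not RapidMiner code and says nothing about how the operator is implemented; the function name replace_missing, the numeric_only flag, and the sample rows are all made up for illustration. It fills missing numeric values with the column average and missing nominal values with the mode, and the flag mimics earmijo's suggestion of treating only the numeric attributes.

```python
from collections import Counter
from statistics import mean


def replace_missing(rows, numeric_only=False):
    """Return a copy of rows (a list of dicts) with None values filled in.

    Numeric columns get the mean of the present values; nominal columns get
    the most frequent present value (the mode). Hypothetical helper, not a
    RapidMiner API.
    """
    columns = list(rows[0].keys())
    fill = {}
    for col in columns:
        present = [r[col] for r in rows if r[col] is not None]
        if not present:
            continue  # no observed values to derive a replacement from
        if all(isinstance(v, (int, float)) for v in present):
            fill[col] = mean(present)
        elif not numeric_only:
            # On ties, Counter keeps the value that was seen first.
            fill[col] = Counter(present).most_common(1)[0][0]
    return [
        {c: (fill[c] if r[c] is None and c in fill else r[c]) for c in columns}
        for r in rows
    ]


# Made-up sample data: one numeric and one nominal attribute.
rows = [
    {"age": 20.0, "color": "red"},
    {"age": None, "color": None},
    {"age": 40.0, "color": "red"},
]

filled = replace_missing(rows)
numeric_filled = replace_missing(rows, numeric_only=True)
```

With numeric_only=True the nominal gap is left untouched, which is the result sav saw in the Results view before the fix, while the default call gives what the operator description promises.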