"Replace (Dictionary): unwanted behaviour"

DekwoKybon
DekwoKybon New Altair Community Member
edited November 5 in Community Q&A

Hi all,

can somebody explain to me why the Replace (Dictionary) operator suffixes the replacements with indexes?

I want to map the following values, which are loaded from an Excel file. I checked that they are read in correctly:

 

from attribute    to attribute
/                 None
J1                Early
J2                Early
J1-2              Early
J3                Late
J2-3              Late
J4                Late
J3-4              Late
J5                Late
J4-5              Late
?                 Late


I selected "single attribute" as the attribute filter type and set it to the attribute containing the "from attribute" values above. The "from attribute" parameter is set to 'from attribute' and the "to attribute" parameter is set to 'to attribute'. Since this concerns a class label, I checked "include special attributes".

 

Now, RapidMiner creates the following values:
None, Early, Early-1, Early-2, Late, Late-4, Late-5, J2, J3, J5 (why doesn't it replace J2, J3 and J5?).

 

I expect this operator to produce only three values: None, Early and Late.
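
To make the expected behaviour concrete, here is a minimal Python sketch of a whole-value dictionary lookup using the table above. It is only an illustration of the result I am after, not RapidMiner code; the names `mapping` and `replace_exact` are made up for this example.

```python
# Dictionary taken from the "from attribute" / "to attribute" table in this post.
mapping = {
    "/": "None",
    "J1": "Early",
    "J2": "Early",
    "J1-2": "Early",
    "J3": "Late",
    "J2-3": "Late",
    "J4": "Late",
    "J3-4": "Late",
    "J5": "Late",
    "J4-5": "Late",
    "?": "Late",
}


def replace_exact(values, mapping):
    """Replace each value by whole-value lookup; unmapped values pass through unchanged."""
    return [mapping.get(value, value) for value in values]


labels = ["/", "J1", "J2", "J1-2", "J3", "J2-3", "J4", "J3-4", "J5", "J4-5", "?"]
result = replace_exact(labels, mapping)
distinct = sorted(set(result))
print(distinct)  # ['Early', 'Late', 'None']
```

Because every value is looked up as a whole, no suffixes such as "-1" or "-4" can appear in the result.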

If I wanted the suffixes I would have added them myself, but to my knowledge there is no way to stop it from doing this.

 

Is there an option that needs to be turned off? Or how can I achieve this replacement without resorting to manual entry in an operator such as Map?

 

Best regards,

Wouter

 

 

Best Answer

  • Edin_Klapic
    Edin_Klapic New Altair Community Member
    Answer ✓

    Hi Wouter,

     

    As you mentioned, the Map operator might be a better fit for your use case.

    The settings can be seen in the screenshot below.

    Some remarks:

    I chose to use regular expressions, which saves one line in the list of value mappings, but you can of course do it whichever way you prefer.

    The Replace Missing Values operator might be necessary if the "?" in your dataset is a missing value.
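
The idea behind the two remarks above can be sketched in Python roughly as follows. This is not the Map operator itself: the patterns in `RULES` are assumptions (the actual expressions are only visible in the screenshot), and `map_value` with its `missing_replacement` parameter is a hypothetical helper that stands in for a separate missing-value step.

```python
import re

# Assumed patterns, one rule per target value; each must match the WHOLE value.
RULES = [
    (re.compile(r"/"), "None"),
    (re.compile(r"J1|J2|J1-2"), "Early"),
    (re.compile(r"J[3-5]|J[2-4]-[3-5]"), "Late"),
]


def map_value(value, missing_replacement="Late"):
    """Map one label; treat None or "?" as a missing value and fill it separately."""
    if value is None or value == "?":
        return missing_replacement
    for pattern, target in RULES:
        # fullmatch avoids partial hits such as "J1" inside "J1-2".
        if pattern.fullmatch(value):
            return target
    return value  # leave unmatched values unchanged


labels = ["/", "J1", "J2", "J1-2", "J3", "J2-3", "J4", "J3-4", "J5", "J4-5", "?"]
mapped = [map_value(label) for label in labels]
```

Matching the whole value rather than a substring is the design choice that keeps the result down to the three intended classes.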

     

     

    [Screenshot: Map operator settings (image.png)]

     

     

    Best,

    Edin

     

     

Answers

  • DekwoKybon
    DekwoKybon New Altair Community Member

    Hi Edin,

     

    thanks for your reply. I accepted your answer as a solution since I see you are RMSTAFF :-)

    I do think, however, that the Replace (Dictionary) operator should at least have a simple extra checkbox that suppresses this behaviour.

     

    Maybe I can implement it myself, or is it possible to file this as a bug?

     

    Best regards,

    Wouter

  • Thomas_Ott
    Thomas_Ott New Altair Community Member

    We always encourage users to build their own extensions.  :)

     

    Have you checked out our extension template? https://docs.rapidminer.com/developers/