Convert each nominal value of a data set to a unique number

pariB New Altair Community Member
edited November 5 in Community Q&A
Hi,
I’m using neural networks, and my data looks like this:

attr1  , attr2 , attr3 , attr4
Summer , 2     , true  , good
winter , 10    , false , good
spring , 4     , true  , bad

Since neural networks need numerical data, I need to map each nominal value to a number. When I use the nominal2numeric operator, I get the following result:
attr1=summer  attr1=winter  attr1=spring  attr3=true  attr3=false  attr4=good  attr4=bad  attr2
1             0             0             1           0            1           0          2
0             1             0             0           1            1           0          10
0             0             1             1           0            0           1          4
But I need to assign exactly one number (not separate digits) to each nominal value, for example replace “summer” with “100”, “winter” with “010”, and so on.
I mean something like this:
attr1 , attr2 , attr3 , attr4
100   , 2     , 10    , 10
010   , 10    , 01    , 10
001   , 4     , 10    , 01

Is there any way to do this in RapidMiner?
Your help is appreciated.

Answers

  • Hello

    The "Map" operator would let you do this. The result would still be nominal, however, so you might then need to use "Guess Types" to convert it to a number. The leading zeros would disappear after that.

    regards

    Andrew
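
To illustrate the point about leading zeros, here is a minimal Python sketch. It is not RapidMiner code; the dictionary `season_codes` and the variable names are made up for illustration. It mimics mapping each season to a code string and then parsing those strings as numbers, which is roughly the effect a type conversion would have:

```python
# Hypothetical mapping, analogous to the replacement list a Map step would use.
season_codes = {"summer": "100", "winter": "010", "spring": "001"}

attr1 = ["Summer", "winter", "spring"]

# Step 1: map nominal values to code strings (the result is still text).
as_text = [season_codes[value.strip().lower()] for value in attr1]

# Step 2: parse the text as integers; leading zeros do not survive this.
as_numbers = [int(code) for code in as_text]

print(as_text)     # ['100', '010', '001']
print(as_numbers)  # [100, 10, 1]
```

So "010" and "001" end up as 10 and 1: the values stay distinct, but they no longer look like bit patterns.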
  • fras
    fras New Altair Community Member
    Hi,
    Perhaps you should exclude "attr1" from the transformation (set its role to "label"?).
    Then you apply the Map operator only to "attr1" to get the values/strings you need; it is not clear to me which data type you need.
    As you already mentioned, neural networks need numerical data, so nominal2numeric should do the job for attr2 and attr3.
    -frank
  • MariusHelf
    MariusHelf New Altair Community Member
    Hi,

    The output you posted in your first post is exactly what you want. If you map each nominal value to a unique integer, you imply an ordering of the values: if Summer is 1, Winter 2 and Autumn 3, that implies that Autumn is "more" than Summer and Winter, which obviously does not make sense.

    If you really want it and know what you are doing, you can use the operator Nominal to Numerical with coding_type "unique integers".

    As an alternative, I would suggest using an SVM instead of neural nets. SVMs are easier to optimize, easier to interpret, and often do the job as well as the neural net or even better. And they can handle nominal values.

    Best,
    Marius
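
The difference between the two codings discussed above can be sketched in plain Python. This is only an illustration under assumptions: the helper names `unique_integers` and `dummy_coding` are hypothetical, and the exact numbers RapidMiner assigns (for example whether it starts at 0 or 1) may differ.

```python
def unique_integers(values):
    """Assign 0, 1, 2, ... to distinct values in order of first appearance."""
    codes = {}
    for value in values:
        codes.setdefault(value, len(codes))
    return [codes[value] for value in values]


def dummy_coding(values):
    """Return one 0/1 column per distinct value (one-hot rows)."""
    categories = list(dict.fromkeys(values))  # distinct values, order kept
    return [[1 if value == c else 0 for c in categories] for value in values]


attr1 = ["summer", "winter", "spring"]

print(unique_integers(attr1))  # [0, 1, 2]
print(dummy_coding(attr1))     # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

With the integer coding, "spring" is numerically twice as far from "summer" as "winter" is, which is the artificial ordering mentioned above. With dummy coding, every pair of seasons differs in exactly two positions, so no season is "closer" to another.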