Why does the Replace operator accept only nominal values?
Christos_Karapapas
New Altair Community Member
The general idea is that I have trained a model and now want to apply it to a single example.
My original dataset has too many attributes to type all the values by hand, so I thought I would Store part of it during training, then Retrieve it and Filter by range 1-1 to get a single example.
Then I would replace the values of just a few of its attributes and look at the prediction.
So, when I use the Replace operator to replace the value of a nominal attribute everything is fine.
But when I use it to replace the value of an Integer attribute I get the following error "Wrong value type The attribute has value type Integer, should be Nominal".
Is there a way to avoid changing the type of the non-Nominal attributes to Nominal, just to perform a Replace operation and then changing them back to their original type?
The workaround I am trying is to convert from Numerical to Polynominal before the Replace and then from Nominal to Numerical after it, for those specific attributes only. However, Apply Model then fails with the error "The input ExampleSet does not match the training ExampleSet. Misfitting Attribute:myIntegerAttribute".
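To make the goal concrete, here is a minimal plain-Python sketch of the intended workflow. It is an analogy, not a RapidMiner process: the example is a hypothetical dict, `override` is a made-up helper, and only `myIntegerAttribute` comes from the error message above. The last lines show how a round trip through text can change a value's type. Whether that is exactly what triggers the "Misfitting Attribute" error is an assumption.

```python
# Hypothetical single example taken from the stored training data.
training_example = {"myIntegerAttribute": 3, "myNominalAttribute": "a"}


def override(example, new_values):
    """Return a copy of the example with some values replaced.

    Each new value must have the same type as the value it replaces,
    so the schema of the result still matches the training schema.
    """
    updated = dict(example)
    for name, value in new_values.items():
        if type(value) is not type(example[name]):
            raise TypeError(f"{name}: expected {type(example[name]).__name__}")
        updated[name] = value
    return updated


# Replace one integer value directly, with no type conversion.
modified = override(training_example, {"myIntegerAttribute": 42})

# A round trip through text and back to a number may not restore the type:
# here the integer comes back as a float.
roundtrip = float(str(modified["myIntegerAttribute"]))
type_changed = type(roundtrip) is not type(training_example["myIntegerAttribute"])
```

The point of the sketch is that writing the new value in place keeps the original type, while converting to text and back can silently change it.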
Best Answer
-
@chris_skg the Discretize operators take numerical attributes as input and produce nominal attributes as output. So you will not get an error saying that you have numerical inputs when nominal ones are required.
For example, in the Golf sample data set, Temperature and Humidity are numerical attributes:
Discretize by Binning turns them into discrete nominal attributes:
You can of course customize how the discretization happens in many ways.
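As a rough illustration of what equal-width binning does, here is a short plain-Python sketch. It is not the RapidMiner implementation: the function name `equal_width_bins`, the sample temperature values, and the simplified `range1`, `range2` labels are all assumptions made for the example.

```python
def equal_width_bins(values, number_of_bins):
    """Map each number to a nominal label using equal-width bins."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / number_of_bins
    labels = []
    for value in values:
        if width == 0:
            # All values are identical, so everything falls in the first bin.
            index = 0
        else:
            # The maximum value is clamped into the last bin.
            index = min(int((value - lowest) / width), number_of_bins - 1)
        labels.append(f"range{index + 1}")
    return labels


# Hypothetical temperature readings, binned into two nominal ranges.
temperature = [64, 70, 75, 85]
binned = equal_width_bins(temperature, 2)
# binned == ["range1", "range1", "range2", "range2"]
```

The output values are strings, so in this analogy a numerical attribute has become a nominal one with only two distinct values.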
Scott
Answers
-
Hi @chris_skg, OK, I think I understand what you're asking. The reason Replace only works on nominal/polynominal attributes is that replacing one number with another does not usually make much mathematical sense. If the attribute is the label/target, you have a regression problem. If it is not the label/target, the model will simply use the values as they are.
So with that, I would say that if you really want to reduce the number of unique values in a numerical attribute, the more typical method is to use binning. See the whole series of "Discretize" operators in the Cleansing -> Binning folder of the operator panel.
Does that help?
Scott
-
I am not sure I get it 100%, I need to study this method before I can tell you if it's useful for this particular case.
So, if I understand it correctly, I will use one of the Discretize operators to convert the non-Nominal values of an attribute to a specific range of nominal values in order to make my Replacements. But then, after the Replacement, won't I stumble once again upon the same error of "The input ExampleSet does not match the training ExampleSet. Misfitting Attribute..."?