Simple operator or method for combining nominal categories?
Is there an easy way to combine nominal categories based on frequency? For example, if I have a nominal attribute with 10 different possible values, I want to keep only the top 5 (by frequency) and put the rest into an "Other" category.
This is obviously possible using some manual recoding logic, but I feel like there is a better way that is slipping my mind. Is there some operator for this that I am forgetting? Discretize operators aren't ideal because they only work on numerical attributes, so using them would require recoding and would lose the underlying nominal values.
I have to do this for a large number of attributes/categories, so I am looking for a solution that doesn't require manual recoding of the categories.
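To make the desired behavior concrete, here is a rough sketch in plain Python, outside of any particular tool. The helper names `lump_rare` and `lump_rare_columns`, the list-of-dicts table layout, and the tie handling are all assumptions made for illustration, not an existing operator.

```python
from collections import Counter


def lump_rare(values, top_n=5, other="Other"):
    """Keep the top_n most frequent categories; map everything else to `other`.

    Hypothetical helper. Ties at the cutoff are broken by first appearance,
    which is how Counter.most_common orders equal counts.
    """
    counts = Counter(values)
    keep = {category for category, _ in counts.most_common(top_n)}
    return [v if v in keep else other for v in values]


def lump_rare_columns(rows, columns, top_n=5, other="Other"):
    """Apply lump_rare to several nominal columns of a list-of-dicts table.

    Returns new row dicts; the input rows are left unchanged.
    """
    result = [dict(row) for row in rows]
    for col in columns:
        lumped = lump_rare([row[col] for row in rows], top_n, other)
        for row, value in zip(result, lumped):
            row[col] = value
    return result
```

Looping over a list of column names is the part that matters to me: the same rule gets applied to every attribute without writing a recoding map per category.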
Thanks in advance!