Feature Selection with multidimensional attributes
Given a dataset with various attributes such as: {a1, a2, a3, b1, b2, c1, c2, c3, c4, d1}
The output of a standard feature selection method such as Forward Selection will be a subset of individual attributes such as: {a1, b1, c2, c3}
However, I want feature selection applied to grouped attributes: a (3 dimensions), b (2 dimensions), c (4 dimensions), d (1 dimension). That means the expected output should be a subset of {a, b, c, d}, for instance {a, c} = {a1, a2, a3, c1, c2, c3, c4}. A group is either kept with all of its dimensions or dropped entirely.
How can I do this kind of feature selection with RapidMiner?
Answers
-
I would investigate using some sort of loop operator, perhaps Loop Subsets, to iterate over the candidate combinations of groups.
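To make the looping idea concrete outside of RapidMiner, here is a minimal Python sketch (not a RapidMiner process) that enumerates every non-empty combination of groups and expands each one back into its attributes. The GROUPS mapping and the helper names group_subsets and expand are made up for illustration, using the attribute names from the question.

```python
from itertools import combinations

# Hypothetical grouping built from the attribute names in the question.
GROUPS = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2", "c3", "c4"],
    "d": ["d1"],
}

def group_subsets(groups):
    """Yield every non-empty subset of group names, smallest subsets first."""
    names = sorted(groups)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            yield combo

def expand(combo, groups):
    """Turn a tuple of group names into the flat list of attribute names."""
    return [attr for name in combo for attr in groups[name]]

all_subsets = list(group_subsets(GROUPS))
for combo in all_subsets:
    # In a real run, each expanded attribute list would be evaluated here.
    print(combo, expand(combo, GROUPS))
```

With 4 groups there are only 15 non-empty combinations, so an exhaustive loop is cheap here. The count doubles with every extra group, so this stops being practical once there are many groups.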
1 -
You can get your hands dirty inside the Forward Selection operator. For example, with the following rules:
if a1 is in the attribute set,
  then make sure that a1, ..., an are in the dataset
else
  make sure that a1, ..., an are NOT in the dataset
if b1 is in the attribute set,
  then make sure that b1, ..., bn are in the dataset
else
  make sure that b1, ..., bn are NOT in the dataset
etc.
You will be wasting a lot of computation, but it is a workaround that may work.
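A less wasteful variant of the same idea is to let the selection step add whole groups instead of single attributes. Below is a rough Python sketch of that, not the internals of the RapidMiner operator. The function forward_select_groups, the GROUPS mapping and the toy_score function are all assumptions made for illustration; in practice score would be something like cross-validated performance of your learner on the given attributes.

```python
# Hypothetical grouping built from the attribute names in the question.
GROUPS = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2", "c3", "c4"],
    "d": ["d1"],
}

def forward_select_groups(groups, score, min_gain=1e-9):
    """Greedy forward selection where each candidate is a whole group.

    groups: dict mapping group name -> list of attribute names
    score:  callable taking a list of attribute names and returning a
            number where higher is better
    """
    selected = []
    best = float("-inf")
    remaining = sorted(groups)
    while remaining:
        # Score each remaining group when added to the current selection.
        trials = []
        for name in remaining:
            attrs = [a for g in selected + [name] for a in groups[g]]
            trials.append((score(attrs), name))
        top_score, top_name = max(trials)
        if top_score - best <= min_gain:
            break  # no remaining group improves the score enough
        selected.append(top_name)
        remaining.remove(top_name)
        best = top_score
    return selected, best

def toy_score(attrs):
    """Made-up score: reward three 'useful' attributes, penalise size."""
    useful = {"a1", "a2", "c3"}
    return len(useful & set(attrs)) - 0.1 * len(attrs)

selected, best = forward_select_groups(GROUPS, toy_score)
print(selected, best)
```

Each round evaluates one candidate per remaining group, so with G groups there are at most G + (G - 1) + ... + 1 evaluations rather than one per individual attribute.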
2 -
What you are describing sounds a lot like the group lasso, which penalises the coefficients of each group jointly so that a group is either kept or dropped as a whole. I don't think it is available directly in RapidMiner, but you can get at it indirectly through the R scripting extension.
Check out the grpreg package in R. I'm sure there are other packages that will perform group lasso, but this is the one I know.
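To show why this kind of penalty selects whole groups, here is a small Python sketch of the block soft-thresholding step that group lasso solvers are typically built around. This is not the grpreg API and not a full solver; group_soft_threshold, the GROUPS mapping and the example coefficients are made up for illustration, and the sqrt(group size) weighting is one common convention that I am assuming here.

```python
import math

# Hypothetical grouping built from the attribute names in the question.
GROUPS = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
}

def group_soft_threshold(coefs, groups, lam):
    """Shrink each group's coefficient vector toward zero as a block.

    coefs:  dict mapping attribute name -> coefficient
    groups: dict mapping group name -> list of attribute names
    lam:    penalty strength (>= 0)
    A group whose Euclidean norm is at most lam * sqrt(group size)
    has all of its coefficients set to zero.
    """
    out = {}
    for name, attrs in groups.items():
        norm = math.sqrt(sum(coefs[a] ** 2 for a in attrs))
        threshold = lam * math.sqrt(len(attrs))  # assumed size weighting
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        for a in attrs:
            out[a] = scale * coefs[a]
    return out

# Made-up coefficients: group a is strong, group b is weak.
coefs = {"a1": 3.0, "a2": 4.0, "a3": 0.0, "b1": 0.1, "b2": -0.1}
shrunk = group_soft_threshold(coefs, GROUPS, lam=1.0)
kept = [g for g, attrs in GROUPS.items() if any(shrunk[a] != 0 for a in attrs)]
print(shrunk, kept)
```

The weak group b is zeroed in one step while group a is only shrunk, which is the all-or-nothing behaviour per group that the question asks for.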