Feature Selection with multidimensional attributes

levietduc
levietduc New Altair Community Member
edited November 2024 in Community Q&A

Given a dataset with various attributes, such as: {a1, a2, a3, b1, b2, c1, c2, c3, c4, d1}
The output of a standard Feature Selection such as Forward Selection will be a subset such as: {a1, b1, c2, c3}
However, I want Feature Selection applied to the grouped attributes a (3 dimensions), b (2 dimensions), c (4 dimensions), d (1 dimension). That means the expected output should be a subset of {a, b, c, d}, for instance {a, c} = {a1, a2, a3, c1, c2, c3, c4}.
How can I do such Feature Selection with RapidMiner?

Answers

  • Thomas_Ott
    Thomas_Ott New Altair Community Member

    I would investigate using some sort of loop, perhaps the Loop Subsets operator.
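
To picture the loop idea outside RapidMiner, one could enumerate every combination of attribute groups, expand each combination into its member attributes, and keep the best-scoring one. The following is a minimal Python sketch, not a RapidMiner process. The `GROUPS` mapping mirrors the attribute names in the question, and `score` is a hypothetical callable (for example, the cross-validated accuracy of a model trained on those attributes) that you would have to supply yourself.

```python
from itertools import combinations

# Assumed grouping, taken from the attribute names in the question.
GROUPS = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2", "c3", "c4"],
    "d": ["d1"],
}


def group_subsets(groups):
    """Yield every non-empty combination of group names."""
    names = sorted(groups)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            yield combo


def expand(combo, groups):
    """Flatten a tuple of group names into the attribute names it covers."""
    return [attr for name in combo for attr in groups[name]]


def best_group_subset(groups, score):
    """Return the group combination whose attributes score highest.

    `score` is a placeholder: it takes a list of attribute names and
    returns a number where larger is better.
    """
    return max(group_subsets(groups), key=lambda combo: score(expand(combo, groups)))
```

With k groups this evaluates 2^k - 1 candidates, so it is only practical when the number of groups is small, as it is here (15 candidates for 4 groups).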

  • SGolbert
    SGolbert New Altair Community Member

    You can get your hands dirty inside the forward selection operator. For example, with the following rules:

    if a1 is in the attribute set,
    then make sure that a1, ..., an are in the dataset
    else make sure that a1, ..., an are NOT in the dataset

    if b1 is in the attribute set,
    then make sure that b1, ..., bn are in the dataset
    else make sure that b1, ..., bn are NOT in the dataset

    etc.

    You will be wasting a lot of computation, but it is a workaround that may work.
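
The all-in-or-all-out rule above can also be applied directly, by running forward selection over whole groups instead of single attributes. The following Python sketch shows that greedy procedure under stated assumptions. It is not the RapidMiner Forward Selection operator; `GROUPS` mirrors the question, and `score` and `min_gain` are hypothetical names introduced for illustration (`score` takes a list of attribute names and returns a number where larger is better).

```python
# Assumed grouping, taken from the attribute names in the question.
GROUPS = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2", "c3", "c4"],
    "d": ["d1"],
}


def grouped_forward_selection(groups, score, min_gain=0.0):
    """Greedily add whole groups while the score keeps improving.

    Returns the list of selected group names and the final score.
    """
    selected = []
    best = float("-inf")
    remaining = sorted(groups)
    while remaining:
        # Score each candidate: the current selection plus one more group.
        trials = []
        for name in remaining:
            attrs = [a for g in selected + [name] for a in groups[g]]
            trials.append((score(attrs), name))
        top_score, top_name = max(trials)
        # Stop as soon as the best candidate no longer beats the current score.
        if top_score - best <= min_gain:
            break
        selected.append(top_name)
        remaining.remove(top_name)
        best = top_score
    return selected, best
```

Because every candidate is a whole group, each round scores at most one candidate per remaining group, which avoids the duplicate evaluations that come from testing members of the same group one at a time.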

  • earmijo
    earmijo New Altair Community Member

    What you are describing sounds a lot like grouped lasso. I don't think it is available directly in RapidMiner, but it is available indirectly through the R-script extension.

    Check out the grpreg library in R. I'm sure there are other libraries that will perform grouped lasso, but this is the one I know.

    https://cran.r-project.org/web/packages/grpreg/grpreg.pdf
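
To see why grouped lasso selects or drops whole groups, it helps to look at the block soft-thresholding step that its penalty induces: each group's coefficient vector is shrunk toward zero as a unit, and set exactly to zero when its norm falls below a threshold. The following is a minimal Python illustration of that single step, not the grpreg implementation and not a full solver. The function name and the dictionary layout are hypothetical, and scaling the threshold by the square root of the group size is an assumed (though common) convention.

```python
import math


def group_soft_threshold(coefs, groups, lam):
    """Shrink each group's coefficients as a block.

    coefs  : dict mapping attribute name -> coefficient
    groups : dict mapping group name -> list of attribute names
    lam    : penalty strength (assumed to be scaled by sqrt(group size))
    """
    out = {}
    for name, attrs in groups.items():
        # Euclidean norm of this group's coefficient vector.
        norm = math.sqrt(sum(coefs[a] ** 2 for a in attrs))
        threshold = lam * math.sqrt(len(attrs))
        # The whole group is zeroed when its norm is at or below the threshold.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        for a in attrs:
            out[a] = scale * coefs[a]
    return out


# Illustrative values only: group "a" has a large norm, group "b" a small one.
example = group_soft_threshold(
    {"a1": 3.0, "a2": 4.0, "a3": 0.0, "b1": 0.1, "b2": 0.1},
    {"a": ["a1", "a2", "a3"], "b": ["b1", "b2"]},
    lam=1.0,
)
```

In this example every coefficient in group b becomes zero, while group a is shrunk but kept, which corresponds to the {a, c}-style output asked for in the question.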