What do M5, greedy and T-test mean?

sriongatc New Altair Community Member
edited November 5 in Community Q&A
I am trying to train a model with Linear Regression. I would like to know what M5, greedy and T-test mean in the feature selection setting. Many thanks for considering my request. :'(

Answers

  • David_A
    David_A New Altair Community Member
    edited November 2021

    Those three are different strategies for reducing the number of features (or attributes or columns) that are considered in your model.
    In general, it's a good idea to have as few influencing factors as possible in your model, so that it is less susceptible to noise and errors. On the other hand, you don't want to lose potentially useful information, so there is always a trade-off in choosing the right number of features.

    M5, also called M5 Prime, selects the subset of attributes that improves the Akaike information criterion the most.
    T-test performs the statistical test of the same name to decide whether a feature has a significant influence on the target (label).
    Greedy is a stepwise backward elimination strategy: in each round, the attribute with the lowest contribution (again based on the Akaike information criterion) is deselected.
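
    To make the greedy idea concrete, here is a minimal sketch of dropping one attribute per round for as long as the Akaike information criterion improves. It is an illustration only, not the operator's actual implementation: the function names (aic_linear, greedy_aic_elimination), the AIC formula n*ln(RSS/n) + 2k (Gaussian errors, constants dropped) and the synthetic data are all my own assumptions.

```python
import numpy as np


def aic_linear(X, y):
    """AIC of a least-squares fit with intercept (assumed form: n*ln(RSS/n) + 2k)."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # intercept column first
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    k = design.shape[1]  # number of fitted coefficients
    return n * np.log(rss / n) + 2 * k


def greedy_aic_elimination(X, y):
    """Deselect one attribute per round while doing so lowers the AIC."""
    selected = list(range(X.shape[1]))
    best_aic = aic_linear(X[:, selected], y)
    while len(selected) > 1:
        # AIC of each model obtained by removing one remaining attribute
        candidates = [
            (aic_linear(X[:, [c for c in selected if c != col]], y), col)
            for col in selected
        ]
        cand_aic, drop = min(candidates)
        if cand_aic >= best_aic:
            break  # no single removal improves the criterion
        best_aic = cand_aic
        selected.remove(drop)
    return selected, best_aic


# Hypothetical data: only columns 0 and 2 influence the target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)

selected, aic = greedy_aic_elimination(X, y)
print("kept attribute indices:", selected)
```

    The two informative columns should survive. Pure-noise columns are usually removed, but AIC can occasionally keep one, which is part of why the choice of strategy is worth validating.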

    There's no golden rule for which selection strategy gives you the best results; that is best decided with an independent parameter optimization. But I highly recommend using some kind of feature selection for your regression model (especially if you have more than just a few attributes).
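
    As a rough sketch of how such a comparison could look outside the tool, the snippet below applies a t-test style selection and scores it against the full attribute set on held-out rows. Everything here is an assumption for illustration: the helper names (ols_fit, t_test_selection, holdout_rmse), the 0.05 significance level, the normal approximation to the t distribution, and the synthetic data. It is one plausible reading of a t-test selection, not the operator's code.

```python
import numpy as np
from statistics import NormalDist


def ols_fit(X, y):
    """Return coefficients (intercept first) and their standard errors."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = n - design.shape[1]  # residual degrees of freedom
    sigma2 = float(resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, np.sqrt(np.diag(cov))


def t_test_selection(X, y, alpha=0.05):
    """Keep attributes whose coefficient differs significantly from zero."""
    coef, se = ols_fit(X, y)
    t_stats = coef[1:] / se[1:]  # skip the intercept
    # Assumption: normal approximation to the t distribution (fine for large n)
    p_values = [2 * (1 - NormalDist().cdf(abs(t))) for t in t_stats]
    return [j for j, p in enumerate(p_values) if p < alpha]


def holdout_rmse(X_train, y_train, X_test, y_test, cols):
    """Fit on the training rows using `cols`, then score on the test rows."""
    coef, _ = ols_fit(X_train[:, cols], y_train)
    pred = coef[0] + X_test[:, cols] @ coef[1:]
    return float(np.sqrt(np.mean((y_test - pred) ** 2)))


# Hypothetical data: only columns 1 and 4 influence the target
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))
y = 2.0 * X[:, 1] + 1.5 * X[:, 4] + rng.normal(scale=1.0, size=300)
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

all_cols = list(range(X.shape[1]))
t_cols = t_test_selection(X_train, y_train)
rmse_all = holdout_rmse(X_train, y_train, X_test, y_test, all_cols)
rmse_t = holdout_rmse(X_train, y_train, X_test, y_test, t_cols)
print("all attributes   :", rmse_all)
print("t-test selection :", t_cols, rmse_t)
```

    Whichever variant gives the lower error on the held-out rows is the better choice for that data set; with real data, cross-validation would be a more robust version of the same comparison.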

    Best,
    David