How can I reduce the number of examples for one value of an attribute?

BrunoC New Altair Community Member
edited November 2024 in Community Q&A
I have a data set with an attribute that has 4 values. The attribute is polynominal, and it is my label. One of the values has far more examples than the others, and I want all the values to have roughly the same number of examples. I am trying to build a decision tree. I have 400 examples of one value, but I want to reduce that to 40, chosen randomly. Is there an option for this?

Best Answer

  • David_A
    David_A New Altair Community Member
    Answer ✓

    You can use the normal Sample operator and check the parameter balance data. Then you can specify how many examples per class you want in your sample. You can specify either an exact number (with absolute sampling) or a ratio (with relative sampling).
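
For readers who want to see the idea outside RapidMiner, balanced absolute sampling can be sketched in plain Python. This is only an illustration under stated assumptions, not how the Sample operator is implemented: the function name `balance_by_label`, the `label` key, and the class counts other than 400 are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_by_label(rows, label_key, max_per_class, seed=42):
    """Randomly keep at most max_per_class rows for each label value."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    # Group the rows by their label value.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    balanced = []
    for members in by_class.values():
        # Downsample only the classes that exceed the cap.
        if len(members) > max_per_class:
            members = rng.sample(members, max_per_class)
        balanced.extend(members)

    rng.shuffle(balanced)  # avoid leaving the rows sorted by class
    return balanced


# Hypothetical data: one class with 400 examples, three smaller ones.
rows = (
    [{"label": "a"}] * 400
    + [{"label": "b"}] * 45
    + [{"label": "c"}] * 40
    + [{"label": "d"}] * 35
)

balanced = balance_by_label(rows, "label", max_per_class=40)
counts = Counter(row["label"] for row in balanced)
print(counts)
```

Classes that already have fewer rows than the cap are left untouched, so the result is only approximately balanced.
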
    An alternative approach could be to group the other values, so you don't lose too many examples (going from 400 to only 40 examples can strongly reduce the performance of your model, especially when you want to use a Cross-Validation to test it). Take a look at the Replace Rare Values operator from the Operator Toolbox extension.
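
The grouping alternative can be sketched in the same way. This is a rough analogue of the idea, not the behavior or the parameters of the Replace Rare Values operator: the function name `replace_rare_values`, the `min_count` threshold, the replacement value `"other"`, and the class counts other than 400 are all assumptions made for illustration.

```python
from collections import Counter


def replace_rare_values(values, min_count, other="other"):
    """Replace every value that occurs fewer than min_count times with `other`."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]


# Hypothetical label column: one frequent value and three rare ones.
labels = ["a"] * 400 + ["b"] * 45 + ["c"] * 40 + ["d"] * 35

grouped = replace_rare_values(labels, min_count=100)
grouped_counts = Counter(grouped)
print(grouped_counts)
```

No examples are dropped here, but the three rare values are merged into one class, so a model trained on the grouped label can no longer tell them apart.
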

    Best,
    David
