"[SOLVED] Attribute weighting for unbalanced data"

makak
makak New Altair Community Member
edited November 5 in Community Q&A
Hi,

I am trying to experiment with various attribute weighting techniques on my dataset, which is quite unbalanced. I subsample the majority class when training the classifier. My question is: when I apply the "Weight by ..." operators, should I apply them to the original (unbalanced) dataset or to the balanced one? Intuitively I would say the balanced one, but I'm not sure.

Thank you.

Answers

  • MariusHelf
    MariusHelf New Altair Community Member
    Hi,

    Yes, in general you should use the balanced data set for any kind of data mining and data analysis, at least during training.

    Performance can also be measured on the original class distribution, depending on the desired output and on how you want to interpret the performance values.

    Best regards,
    Marius
  • makak
    makak New Altair Community Member
    Thank you.