Predictive model for rare occurrences

Casper72
Casper72 New Altair Community Member
edited November 2024 in Community Q&A
Hi fellow RapidMiners,

What kind of model would you suggest I look into for predicting a binary outcome with a very high class imbalance (97/3)? The problem at hand is medical readmission within 30 days of surgery. Any suggestions would be appreciated. I am currently planning to test the k-NN algorithm, looping through different values of k.

Best regards.

Answers

  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    Answer ✓
    Hi @Casper72,

    In your case, I advise you to preprocess your data by upsampling your dataset before modelling.
    For that, you can use the SMOTE Upsampling operator from the Operator Toolbox extension, available for free in the Marketplace.

    Regards,


    Lionel
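
To make the upsampling idea concrete, here is a minimal pure-Python sketch of the interpolation step behind SMOTE: pick a minority example, pick one of its nearest minority neighbours, and create a new point somewhere on the line between them. This is an illustration only, not the implementation used by the SMOTE Upsampling operator; the function name `smote_sketch`, its parameters, and the toy data are all hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Simplified illustration of the SMOTE idea (hypothetical helper, numeric
    features only); not the Operator Toolbox implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority examples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: three minority-class cases with two numeric features.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2)]
new_points = smote_sketch(minority, n_new=4, k=2)
```

Synthetic examples like these are normally added to the training data only, so that the validation or test data keeps the original 97/3 distribution.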

  • Casper72
    Casper72 New Altair Community Member
    Thank you Lionel,

    I will try using SMOTE. I have used it before with success, although with less imbalanced datasets (typically in the range of 30/70).
  • Telcontar120
    Telcontar120 New Altair Community Member
    Answer ✓
    You can try weighting instead to balance the classes (although not all ML algorithms support weighting).  This might give better results than upsampling with such a small minority class.
  • Casper72
    Casper72 New Altair Community Member
    @Telcontar120: Great idea! I will have to read up on weighting in RapidMiner, though. Thank you for your suggestion.
  • Telcontar120
    Telcontar120 New Altair Community Member
    Answer ✓
    Take a look at the tutorial for the Generate Weight (Stratification) operator; that should be the one you would use.
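
As a rough illustration of what stratified weighting does, the sketch below assigns each example a weight so that the weights of every class sum to the same total. It assumes this matches the behaviour of the Generate Weight (Stratification) operator; the function name `stratified_weights`, the `total_weight` parameter, and the "yes"/"no" labels are hypothetical, chosen to mirror the 97/3 split from the question.

```python
from collections import Counter


def stratified_weights(labels, total_weight=None):
    """Return one weight per example so that each class has the same weight sum.

    Hypothetical helper for illustration; total_weight defaults to the
    number of examples (an assumption, not a documented default).
    """
    counts = Counter(labels)
    if total_weight is None:
        total_weight = float(len(labels))
    per_class = total_weight / len(counts)  # equal share for every class
    return [per_class / counts[label] for label in labels]


# 97 non-readmissions and 3 readmissions, as in the question.
labels = ["no"] * 97 + ["yes"] * 3
weights = stratified_weights(labels)

# Each "yes" example now carries far more weight than each "no" example,
# while both classes contribute the same total weight to the learner.
```

Unlike upsampling, this creates no new rows; it only works with learners that take example weights into account.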