"Sampling highly unbalanced data"

User: "vdvaxel"
New Altair Community Member
Updated by Jocelyn

Hello guys,

 

I have a highly unbalanced data set that I'd like to use to build a model, and I have a question about where to place the Sample operator that balances the data. Should I put it before my X-Validation and then add another Apply Model and Performance after the X-Validation, so that the model is applied to the entire data set rather than just the sample (since the performance reported by the X-Validation would only reflect the sample)? Or should I put the Sample operator inside the training part of the X-Validation?
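
To make the second option concrete, here is a rough sketch in plain Python rather than RapidMiner. The function names, the downsampling strategy, and the toy data are all hypothetical and only illustrate where the sampling step sits: each fold's training part is balanced, while the test part is left as it is. The first option would instead balance the whole data set once, before any fold is created.

```python
import random

def downsample_majority(rows, rng):
    """Balance (features, label) rows by downsampling the larger class."""
    pos = [r for r in rows if r[1] == 1]
    neg = [r for r in rows if r[1] == 0]
    small, large = sorted((pos, neg), key=len)
    return small + rng.sample(large, len(small))

def cv_folds_sample_inside(rows, k, seed=0):
    """Yield (train, test) pairs where only the training part is balanced."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    for i in range(k):
        # Test part: every k-th row, left unbalanced like the full data.
        test = shuffled[i::k]
        # Training part: all remaining rows, balanced inside the fold.
        train = [r for j, r in enumerate(shuffled) if j % k != i]
        yield downsample_majority(train, rng), test

# Hypothetical toy data: 10 positive rows and 90 negative rows.
data = [((x,), 1) for x in range(10)] + [((x,), 0) for x in range(90)]
folds = list(cv_folds_sample_inside(data, k=5))
```

In this sketch a model would be trained on each balanced `train` list and scored on the matching untouched `test` list, so the performance estimate is computed on data with the original class imbalance.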

 

Thanks in advance!
