Automatic sampling_type in Split Operator
HeikoeWin786
New Altair Community Member
Hello all,
Just one quick question if anyone has any idea on this.
For the automatic sampling_type in the Split operator, the documentation says it uses stratified sampling if the label is nominal and shuffled sampling otherwise.
What if the label is polynominal? Will stratified sampling still be used?
I ask because I have imbalanced classes and want to split the data twice: split 1 is 75-25, and split 2 divides the 75% from split 1 into 75-25 again.
I will save the model from split 2 and use the 25% from split 1 as unseen data to test the model.
thanks and regards,
Heikoe
Best Answer
@HeikoeWin786
The Split Data operator can stratify a label with more than two classes, as long as the label is non-numeric, so a polynominal label is handled. I would also recommend running Cross Validation on the 75% from split 1, if possible, and then using the remaining 25% as your test (unseen) data. In that case you might not need a further split. Cross Validation can be compute- and time-intensive, but it is generally worth it.
There was a good thread on this in the past as well.
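To make the idea concrete outside the tool, here is a minimal sketch in plain Python of a stratified 75-25 split on a three-class label, followed by stratified folds on the 75% portion. It is not how the Split Data or Cross Validation operators are implemented internally; the function names `stratified_split` and `stratified_folds`, the toy label column, and the seed are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_ratio=0.75, seed=42):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Cut every class at the same ratio so proportions are preserved.
        cut = round(len(indices) * train_ratio)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


def stratified_folds(labels, indices, k=3, seed=42):
    """Deal the given row indices into k folds, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Round-robin assignment spreads each class evenly over the folds.
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    return folds


# Hypothetical imbalanced label column with three classes (polynominal).
labels = ["a"] * 80 + ["b"] * 16 + ["c"] * 4

# Split 1: 75% for model building, 25% held out as unseen test data.
build_idx, test_idx = stratified_split(labels, train_ratio=0.75)

# Instead of a second 75-25 split, cross-validate on the 75% portion.
folds = stratified_folds(labels, build_idx, k=3)

print(Counter(labels[i] for i in build_idx))
print(Counter(labels[i] for i in test_idx))
for fold in folds:
    print(Counter(labels[i] for i in fold))
```

With this toy data the 80/16/4 class ratio carries over to the 75% portion, the held-out 25%, and each fold, which is the property that matters for imbalanced classes. Class sizes that do not divide evenly will only be preserved approximately.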