119,390 instances and 32 attributes, and I want to use classification for a supervised learning problem. After I ran the process, someone told me that the data is unbalanced. In this case, would it be better to use downsampling, since the dataset is fairly large? Also, I cannot find SMOTE in RM.
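For anyone who wants to see what downsampling does outside of RM, here is a minimal sketch in plain Python. It assumes random undersampling, where every class is cut down to the size of the smallest one. The `downsample` function, the `label` key, and the toy rows are all hypothetical and only there for illustration; they are not the poster's data.

```python
import random
from collections import Counter, defaultdict


def downsample(rows, label_key, seed=0):
    """Randomly undersample every class to the size of the smallest class."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    # The smallest class sets the target size for all classes.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))

    # Shuffle so the classes are not stored in contiguous blocks.
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 10 "yes" rows and 90 "no" rows.
rows = [{"id": i, "label": "yes" if i < 10 else "no"} for i in range(100)]
balanced = downsample(rows, "label")
print(Counter(row["label"] for row in balanced))
```

Balancing is usually applied to the training portion only, so that the test data keeps the original class distribution and the performance estimate stays realistic.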
id | type  | wheels | engine
----------------------------
1  | moto  | 2      | yes
2  | moto  | 2      | yes
3  | bike  | 2      | no
4  | car   | 4      | yes
5  | ????? | 2      | yes
moto (2) bike (1) car (1)
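A rough sketch of the idea in Python, assuming the goal is to count the known types and to fill in the unknown type of row 5 from the rows that share its wheels and engine values. The `guess_type` helper is hypothetical and uses a simple majority vote over exact matches; it is not a specific RM operator.

```python
from collections import Counter

# Rows copied from the table; None stands for the unknown type.
rows = [
    {"id": 1, "type": "moto", "wheels": 2, "engine": "yes"},
    {"id": 2, "type": "moto", "wheels": 2, "engine": "yes"},
    {"id": 3, "type": "bike", "wheels": 2, "engine": "no"},
    {"id": 4, "type": "car", "wheels": 4, "engine": "yes"},
    {"id": 5, "type": None, "wheels": 2, "engine": "yes"},
]

known = [r for r in rows if r["type"] is not None]
missing = [r for r in rows if r["type"] is None]

# How often each type occurs among the labeled rows.
type_counts = Counter(r["type"] for r in known)


def guess_type(row, known_rows):
    """Return the majority type among known rows with the same wheels and engine."""
    matches = [
        r["type"]
        for r in known_rows
        if r["wheels"] == row["wheels"] and r["engine"] == row["engine"]
    ]
    if not matches:
        return None  # No exact match, so no guess is made.
    return Counter(matches).most_common(1)[0][0]


guesses = {r["id"]: guess_type(r, known) for r in missing}
print(type_counts)
print(guesses)
```

With only four labeled rows this is just a lookup; on real data a trained classifier would take the place of the exact-match vote.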
Survived, Age, Gender, Passenger Class