I have missing data (10K out of 40K examples) and need to use a Self-Organizing Map (SOM) as the clustering method

Missing values in 25% of your examples is a lot. If your data has only a few attributes, I suggest discarding all examples with missing values and building your clustering system on the complete ones first. 30K examples is still a lot of examples, so you may still struggle to build a SOM if you intend to use more than 2 dimensions. After that you could experiment with the missing values, e.g. by creating an imputation model, and then apply your trained clustering model to just those imputed examples.
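
To make the workflow concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy data, the function names, the 3x3 grid, and the use of column-mean imputation as a stand-in for whatever imputation model you end up building. It is not a tuned SOM implementation, just the order of steps: train on complete rows, impute the rest, then assign the imputed rows to the trained map.

```python
import math
import random

def split_complete(rows):
    """Separate rows without missing values (None) from rows with at least one."""
    complete = [r for r in rows if None not in r]
    incomplete = [r for r in rows if None in r]
    return complete, incomplete

def column_means(rows):
    """Per-attribute mean, computed on complete rows only."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def impute_mean(rows, means):
    """Placeholder imputation model: replace each None with the column mean."""
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def best_matching_unit(weights, x):
    """Index of the SOM node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda k: math.dist(weights[k], x))

def train_som(data, grid_w=3, grid_h=3, epochs=20, lr0=0.5, seed=0):
    """Train a small 2-D SOM with a Gaussian neighbourhood and linear decay."""
    rng = random.Random(seed)
    coords = [(i, j) for i in range(grid_h) for j in range(grid_w)]
    weights = [list(rng.choice(data)) for _ in coords]  # init from data points
    sigma0 = max(grid_w, grid_h) / 2
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays to ~0
        sigma = max(sigma0 * (1 - frac), 0.5)    # neighbourhood radius shrinks
        for x in rng.sample(data, len(data)):    # visit rows in random order
            b = best_matching_unit(weights, x)
            for k, c in enumerate(coords):
                d2 = (c[0] - coords[b][0]) ** 2 + (c[1] - coords[b][1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                weights[k] = [w + lr * h * (xi - w) for w, xi in zip(weights[k], x)]
    return weights

# Toy data (assumption): two blobs, 40 rows, 10 of them with a missing value.
rng = random.Random(1)
rows = [[rng.gauss(c, 0.3), rng.gauss(c, 0.3)] for c in (0.0, 5.0) for _ in range(20)]
for r in rows[::4]:
    r[1] = None

complete, incomplete = split_complete(rows)   # step 1: set incomplete rows aside
som = train_som(complete)                     # step 2: build the SOM on complete rows
imputed = impute_mean(incomplete, column_means(complete))   # step 3: impute
labels = [best_matching_unit(som, x) for x in imputed]      # step 4: assign to nodes
```

Each entry of `labels` is the index of the map node that an imputed example falls on. With real data you would likely swap the mean imputation for a proper model and scale the attributes before training.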