"K-Means Clustering with Mixed Attributes"
Hello Everyone,
I want to segment my customer base (13,000 customers) according to several attributes such as:
1. Total Deposits (numerical)
2. Total #Accounts (integer)
3. #Months Since Customer Acquisition (integer)
4. Has the client subscribed to Online Banking or not? (categorical, yes/no)
I want to see what is common among my customers by splitting them into clusters.
I have mixed attributes in my data set (numerical and categorical).
The questions I have are:
1. What is the best distance measure in this case?
2. Do I need to transform any attribute?
3. Do I need to normalize any attribute?
4. What is the best way to set up the model?
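To make question 1 concrete, here is a minimal sketch of one candidate I have come across for mixed data, Gower distance: each numeric difference is scaled by that attribute's range, each categorical attribute contributes 0 if the values match and 1 otherwise, and the parts are averaged. The field names and the three records are made up for illustration and are not my real data. As far as I understand, a distance like this would have to be paired with a method that accepts arbitrary distances (for example k-medoids or hierarchical clustering) rather than plain k-means, which relies on means and Euclidean distance.

```python
# Illustrative records only: field names and values are hypothetical.
customers = [
    {"deposits": 1200.0, "accounts": 1, "months": 6, "online": "no"},
    {"deposits": 58000.0, "accounts": 4, "months": 72, "online": "yes"},
    {"deposits": 1500.0, "accounts": 2, "months": 8, "online": "yes"},
]
NUMERIC = ["deposits", "accounts", "months"]
CATEGORICAL = ["online"]


def numeric_ranges(rows):
    """Return max - min for each numeric attribute across all rows."""
    return {k: max(r[k] for r in rows) - min(r[k] for r in rows) for k in NUMERIC}


def gower_distance(a, b, ranges):
    """Average of per-attribute dissimilarities, each in [0, 1]."""
    parts = []
    for k in NUMERIC:
        span = ranges[k]
        # Range-scaled absolute difference; a constant column contributes 0.
        parts.append(abs(a[k] - b[k]) / span if span else 0.0)
    for k in CATEGORICAL:
        # Simple matching: 0 if the categories agree, 1 otherwise.
        parts.append(0.0 if a[k] == b[k] else 1.0)
    return sum(parts) / len(parts)


ranges = numeric_ranges(customers)
d01 = gower_distance(customers[0], customers[1], ranges)
d02 = gower_distance(customers[0], customers[2], ranges)
print(d01, d02)
```

With range scaling, no attribute dominates just because of its units, which is also why I am asking about normalization in question 3.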
Any help would be appreciated.
Thank You