I've been given a dataset for an exercise in k-means clustering with 5 variables. Three of them are continuous (age of customer, number of items per transaction and dollar value of transaction) however the other two are not i.e. binomial (in store or online transaction as 1 or 0) and the other is polynomial ('Region' with values of 1,2,3 or 4). (Although they are both currently in the dataset as integers)
Am I correct in assuming that I should exclude the transaction type and region? My logic is that centroids produced are more or less garbage given that the transaction can't be half way between an online or in store transaction. Similarly, with geographical regions - and average value is meaningless.
Thanks in advance for any and all assistance. I've spent the last day and a half researching online and am none the wiser (with any certainty).