Cluster-Analysis with wholesale customer dataset

New Altair Community Member

Jan 18, 2018

Updated Nov 5, 2024 by Jocelyn

Hello everyone,

As a group of marketing students taking a course called "Marketing Analytics", we have been given the task of running a cluster analysis, using different clustering methods, on the following dataset:

https://archive.ics.uci.edu/ml/datasets/wholesale+customers

The exact description is the following:

"The data set refers to clients of a wholesale distributor. It includes the annual spending in monetary units (m.u.) on diverse product categories. Goal: Find Clusters of Customers"

For this, we are supposed to try out different clustering methods (besides k-means, our professor told us to try DBSCAN and hierarchical clustering).

So far we have done the following:

Added operator: Read CSV -> loaded the dataset

Added operator: Select Attributes -> filtered out the nominal attributes Channel & Region

Added operator: K-Means

First of all, we do not know how to find the optimal value of "k" in RapidMiner. How can we determine it, and how can we see the intra-cluster distance, and with it the "elbow" graph, in RapidMiner for this dataset? (I attached a graphic from a presentation I found.)
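To make this question concrete: outside RapidMiner, the elbow idea as we understand it can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not a RapidMiner process. The points are made-up toy data (not the wholesale dataset), the helper names (`sq_dist`, `farthest_first`, `kmeans_wcss`) are hypothetical, and the centroids are seeded with a deterministic farthest-first rule instead of the random initialization k-means normally uses. For each k the sketch runs k-means and records the within-cluster sum of squares (WCSS); the "elbow" is the k after which WCSS stops dropping sharply.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch, not standard k-means):
    start with the first point, then repeatedly add the point that is
    farthest from all seeds chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]


def kmeans_wcss(points, k, max_iter=100):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Made-up toy data: three tight groups around (0, 0), (10, 0) and (0, 10).
offsets = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]
points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0, 0), (10, 0), (0, 10)]
    for dx, dy in offsets
]

wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k, wcss in wcss_by_k.items():
    print(f"k={k}  WCSS={wcss:.2f}")
```

On this toy data the WCSS falls steeply up to k = 3 and only slightly afterwards, which is the kind of bend we would hope to find for the real data.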

Since we have more than two attributes (Milk, Frozen, Fresh, Delicatessen, Groceries, etc.), how can we visualize the clusters? And what kind of clusters can we expect to get out of this dataset?
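One common way to look at clusters in more than two dimensions is to project the data onto its first two principal components and draw a scatter plot of the result, colored by cluster label. The following is a minimal Python sketch of the projection step only, assuming made-up toy rows with three attributes (not the wholesale dataset) and a hypothetical helper `pca_2d` that approximates the components with simple power iteration. It is not how RapidMiner does it internally, and in practice the attributes would probably be normalized first.

```python
import math


def pca_2d(rows, iters=200):
    """Project rows onto their first two principal components.

    Components are approximated by power iteration on the covariance
    matrix, deflating after the first one. Assumes the data spans at
    least two dimensions."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]

    components = []
    for _ in range(2):
        v = [1.0] * d  # fixed start vector keeps the sketch deterministic
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient: variance captured along v
        eigval = sum(
            v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)
        )
        components.append(v)
        # Deflation: remove the direction just found before the next pass
        cov = [
            [cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
            for a in range(d)
        ]

    return [
        tuple(sum(x[j] * c[j] for j in range(d)) for c in components)
        for x in centered
    ]


# Made-up toy rows with three attributes: a "low" group and a "high" group.
rows = [
    (1, 2, 1), (2, 1, 2), (1, 1, 1),
    (10, 11, 10), (11, 10, 11), (10, 10, 12),
]

projected = pca_2d(rows)
for original, (pc1, pc2) in zip(rows, projected):
    print(original, "->", round(pc1, 2), round(pc2, 2))
```

The two output columns can then be used as the x and y axes of an ordinary scatter plot. On the toy rows the two groups end up on opposite sides of the first component.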

Also, how can we use DBSCAN clustering? If we just connect it to the Select Attributes operator and run it, we get only one cluster.
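Our guess is that the result depends heavily on the scale of the attributes and on the neighborhood radius. The following plain-Python sketch illustrates that guess; it is not a RapidMiner process and we have not verified that this is what happens in our case. The spending rows are made up (not the wholesale dataset), the helper names `min_max_scale` and `dbscan` are hypothetical, and the values `eps=1.0`, `eps=0.1` and `min_pts=3` are chosen only for illustration. Noise points get the label -1, which is a convention of this sketch.

```python
import math


def min_max_scale(rows):
    """Rescale every column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return [
        tuple((v - lo) / span for v, lo, span in zip(r, lows, spans))
        for r in rows
    ]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    unseen, noise = None, -1
    labels = [unseen] * len(points)

    def neighbors(i):
        # Includes the point itself, as in the usual DBSCAN definition.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unseen:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = noise  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not unseen:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Made-up annual spending rows (two attributes), plus one extreme customer.
rows = [
    (1000, 100), (1200, 150), (900, 120), (1100, 90),
    (20000, 3000), (21000, 3100), (19500, 2900), (20500, 3050),
    (60000, 200),
]

raw_labels = dbscan(rows, eps=1.0, min_pts=3)
scaled_labels = dbscan(min_max_scale(rows), eps=0.1, min_pts=3)
print("raw data:   ", raw_labels)
print("scaled data:", scaled_labels)
```

On the raw toy values every point is farther than 1.0 from every other point, so all of them are labeled as noise. After rescaling, the same algorithm finds two groups and flags the extreme customer as noise. If a tool collects all noise points under a single label, a result like the first one could look like "only one cluster".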

Our professor also told us to use some kind of loop. Is it also necessary to filter out outliers?

Please help, we are struggling a lot with this task. If someone is able to explain it, he or she can also contact me privately, and I would offer something for the effort.

Thanks a lot!!
