Workflow: Clustering data with the K-Means Clustering block
Ian Balanzá-Davis
Altair Employee
The K-Means Clustering block enables you to apply a K-Means clustering model to a dataset.
The following steps demonstrate how to use the K-Means Clustering block to split the input dataset lib_books.csv (observations describing books available from a lending library) into a specified number of clusters and assign each observation to one of them.
- Import the lib_books.csv dataset onto a Workflow canvas using the Text File Import block.
- Expand the Model Training group in the Workflow palette, then click and drag a K-Means Clustering block onto the Workflow canvas.
- Click the Output port of the lib_books dataset block and drag a connection towards the Input port of the K-Means Clustering block.
- Double-click the K-Means Clustering block to display the Configure K-Means Clustering dialog box.
- In the Configure K-Means Clustering dialog box:
  - In the Unselected Variables list, press and hold CTRL and select the NumberInStock and Price variables.
  - Click Select to move the specified variables to the Selected Variables list.
- Click OK to save the configuration and close the Configure K-Means Clustering dialog box.
The Output port of the K-Means Clustering block displays a green execution status, and the block produces the model results, K-Means Clustering Model. This output can be connected to a Score block to apply the clustering to a dataset.
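To make the idea behind the block concrete, the following is a minimal sketch of K-Means clustering on two numeric variables, written in plain Python. It is not the block's implementation: the function names (nearest, kmeans), the choice of two clusters, the random starting points and the sample values are all assumptions made for illustration, and the values are invented rather than taken from lib_books.csv. The last step assigns a new observation to its nearest cluster centre, which is roughly what scoring with a clustering model amounts to.

```python
import math
import random

# Hypothetical (NumberInStock, Price) pairs, invented for illustration.
# These are NOT values from lib_books.csv.
observations = [
    (12, 4.99), (15, 5.49), (14, 6.25),
    (2, 39.00), (1, 42.50), (3, 37.75),
]


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, max_iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    for _ in range(max_iterations):
        # Assignment step: attach each observation to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:  # centroids stopped moving: converged
            break
        centroids = updated
    # Recompute labels so they always match the returned centroids.
    return centroids, [nearest(p, centroids) for p in points]


centroids, labels = kmeans(observations, k=2)
print("Cluster per observation:", labels)

# "Scoring" a new, equally hypothetical observation: pick the nearest centroid.
new_label = nearest((13, 5.99), centroids)
print("Cluster for new observation:", new_label)
```

The sketch uses raw values, so the variable with the larger range (here Price) has more influence on the distances. Whether and how the K-Means Clustering block scales its inputs is not covered here.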