Workflow: Dividing a dataset using the Partition block

Ian Balanzá-Davis
Altair Employee
edited October 2022 in Altair RapidMiner

The Partition block enables you to split a dataset into multiple parts. For example, if you are training a model, you can use this block to split a dataset into training and testing datasets.

The following steps demonstrate how to use the Partition block to split the input dataset IRIS.csv into three equal parts:

  1. Import the IRIS.csv dataset onto a Workflow canvas using the Text File Import block.
  2. Expand the Data Preparation group in the Workflow palette, then click and drag a Partition block onto the Workflow canvas.
  3. Click the Output port of the IRIS dataset block and drag a connection towards the Input port of the Partition block.
  4. Double-click the Partition block to display the Configure Partition dialog box.
  5. In the Configure Partition dialog box:
    1. Click Add Partition to create a third partition from the input dataset.
  6. Click OK to save the configuration and close the Configure Partition dialog box.

A green execution status is displayed in the Output port of the Partition block and on the new Partition1, Partition2 and Partition3 datasets. Each output dataset contains one third of the input IRIS dataset.
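
For readers who want to see the idea outside the Workflow canvas, the following is a minimal Python sketch of an equal three-way split. It is not the Partition block's implementation. The function name partition_rows, the random shuffle and the seed value are assumptions made for illustration, and the rows are stand-in indices for the 150 observations of the standard Iris dataset rather than the contents of IRIS.csv.

```python
import random


def partition_rows(rows, n_parts=3, seed=0):
    """Shuffle rows and deal them into n_parts lists of near-equal size.

    Hypothetical helper for illustration only; whether the Partition
    block shuffles its input is an assumption, not documented behaviour.
    """
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    # Stride slicing places every row in exactly one partition
    return [shuffled[i::n_parts] for i in range(n_parts)]


# Stand-in for the 150 rows of the Iris dataset (row indices only)
rows = list(range(150))
partition1, partition2, partition3 = partition_rows(rows)
print(len(partition1), len(partition2), len(partition3))
```

With 150 input rows, each of the three lists holds 50 rows and no row appears in more than one list, which mirrors the three equal output datasets described above.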