Workflow: Combining multiple datasets with the Merge block

IanBD
Altair Employee
edited October 2022 in Altair RapidMiner

The Merge block enables you to combine multiple datasets into a single working dataset.

The following demonstrates how to use the Merge block to combine two datasets, Bananas.csv and Bananas2.csv. Each dataset contains observations that describe a banana, and both datasets use the same variables.

  1. Import the Bananas.csv and Bananas2.csv datasets into a Workflow using a Text File Import block for each dataset.
  2. Right-click the Bananas.csv dataset output, click Rename and enter Bananas.
  3. Right-click the Bananas2.csv dataset output, click Rename and enter Bananas2.
  4. Expand the Data Preparation group in the Workflow palette, then click and drag a Merge block onto the Workflow canvas.
  5. Click the Output port of the Bananas dataset block and drag a connection to the Input port of the Merge block. Repeat for the Bananas2 dataset.
  6. Double-click the Merge block to display the Configure Merge dialog box.
  7. In the Merge Operation drop-down list, select Concatenate.
  8. Click OK to save the configuration and close the Configure Merge dialog box.

A green execution status is displayed in the Output ports of the Merge block and the new Working Dataset. The new Working Dataset contains the observations from both input datasets.
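
For readers who want to check the result outside the Workflow, the effect of the steps above can be sketched in a few lines of Python. This is only an illustration, not how the Merge block is implemented: it assumes that Concatenate appends the observations of the second dataset beneath those of the first, and the column names and values (length_cm, weight_g) are hypothetical stand-ins for the contents of Bananas.csv and Bananas2.csv.

```python
import csv
import io

# Hypothetical stand-ins for Bananas.csv and Bananas2.csv; the column names
# and values are invented for illustration only.
BANANAS_CSV = "length_cm,weight_g\n18.0,120\n20.5,135\n"
BANANAS2_CSV = "length_cm,weight_g\n17.2,110\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, one dict per observation."""
    return list(csv.DictReader(io.StringIO(text)))


def concatenate(first, second):
    """Append the observations of `second` beneath those of `first`.

    Assumes both datasets use the same variables, as in the walkthrough.
    """
    if first and second and first[0].keys() != second[0].keys():
        raise ValueError("datasets must use the same variables")
    return first + second


bananas = read_rows(BANANAS_CSV)
bananas2 = read_rows(BANANAS2_CSV)
merged = concatenate(bananas, bananas2)

# The merged dataset holds every observation from both inputs.
print(len(merged))  # 3
```

Comparing the number of observations in the merged dataset with the sum of the two inputs is a quick way to confirm that nothing was dropped.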