Workflow: Sorting a dataset using the Python block

Ian Balanzá-Davis
Altair Employee
edited October 2022 in Altair RapidMiner

The Python block enables you to use Python language programs in a Workflow. To use the Python block, you must have a Python interpreter installed and configured.

The following example shows how to use the Python block to sort the lib_books.csv dataset, which contains observations describing books available from a lending library.

  1. Import the lib_books.csv dataset onto a Workflow using the Text File Import block.
  2. Expand the Code Blocks group in the Workflow palette, then click and drag a Python block onto the Workflow canvas.
  3. Click the lib_books.csv dataset block's Output port and drag a connection to the Input port of the Python block.
  4. Double-click the Python block to display the Python Editor view.
  5. In the Python Editor view:
    1. In the Outputs panel, click Add new output dataset.
    2. In the Python editor, enter the following:
      Output_1 = Input_1.sort_values(by=['Author'])

  6. Close the Python Editor view and save the configuration when prompted.

A green execution status is displayed in the Output port of the Python block and on the new Working Dataset. The Working Dataset contains all observations from the input lib_books dataset, sorted in ascending order by the values in the Author column.
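
To try the statement from step 5 outside a Workflow, the following is a minimal sketch that assumes Input_1 behaves like a pandas DataFrame (sort_values is a pandas method). The sample rows and the Title column are made up for illustration and are not taken from lib_books.csv; the two variations are optional extras, not part of the original steps.

```python
import pandas as pd

# Hypothetical stand-in for the lib_books.csv dataset. Only the Author column
# is named in the post; the Title column and all values are assumptions.
Input_1 = pd.DataFrame(
    {
        "Title": ["Emma", "Dracula", "Ulysses", "Persuasion"],
        "Author": ["Austen", "Stoker", "Joyce", "Austen"],
    }
)

# Same statement as in step 5: sort rows by the Author column, ascending.
# sort_values returns a new DataFrame and leaves Input_1 unchanged.
Output_1 = Input_1.sort_values(by=["Author"])

# Variation: sort by Author, then by Title within each author.
by_author_title = Input_1.sort_values(by=["Author", "Title"])

# Variation: descending order, with row labels renumbered from 0.
descending = Input_1.sort_values(by=["Author"], ascending=False).reset_index(drop=True)

print(Output_1)
```

By default sort_values keeps each row's original index label, so adding reset_index(drop=True) may be useful if downstream steps expect consecutive row numbers.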