Workflow: Selecting dataset variables using the R block

Ian Balanzá-Davis
Altair Employee
edited October 2022 in Altair RapidMiner

The R block enables you to run R language programs in a Workflow. To use the R block, you must have an R interpreter installed and configured.

The following example shows how to use the R block to restrict an input dataset, lib_books.csv, to its Author variable. The dataset contains observations that describe books available from a lending library.

  1. Import the lib_books.csv dataset onto a Workflow using a Text File Import block.
  2. Expand the Code Blocks group in the Workflow palette, then click and drag an R block onto the Workflow canvas.
  3. Click the Output port of the lib_books dataset block and drag a connection to the Input port of the R block.
  4. Double-click on the R block to display the R editor view.
  5. In the R editor view:
    1. In the Outputs panel, click Add new output dataset.
    2. In the R editor, enter the following:
      Output_1 <- data.frame(Author=Input_1$Author)

  6. Close the R editor view and save the configuration when prompted.

When the block has run, a green execution status is displayed on the Output port of the R block and on the new Working Dataset. The new dataset contains only the Author column, with one row for each observation in the input lib_books.csv dataset.
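
To check the statement from step 5 before entering it in the R block, you can try it in a plain R session. The following is a minimal sketch, not part of the Workflow itself: it assumes Input_1 behaves like an ordinary R data frame, and the stand-in data, including the Title and Year columns and the name selected, is made up for illustration.

```r
# Stand-in for the dataset the Workflow supplies as Input_1.
# The Title and Year columns and all values are hypothetical.
Input_1 <- data.frame(
  Title  = c("Book A", "Book B", "Book C"),
  Author = c("Author X", "Author Y", "Author X"),
  Year   = c(1999, 2005, 2012),
  stringsAsFactors = FALSE
)

# Same statement as entered in the R editor: keep only the Author column.
Output_1 <- data.frame(Author = Input_1$Author)

# Inspect the result: one column, same number of rows as Input_1.
str(Output_1)

# Variant (assumption: more than one column is wanted): select columns by
# name. drop = FALSE keeps the result a data frame even for a single column.
selected <- Input_1[, c("Author", "Title"), drop = FALSE]
```

Selecting by name with drop = FALSE is one way to keep several variables at once; the single-column form in step 5 works equally well when only one variable is needed.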