Workflow: Predicting a binary variable with the Decision Tree block


The Decision Tree block enables you to apply a Decision Tree predictive model to an input dataset.

The following example demonstrates how to use the Decision Tree block to predict the dependent variable Score from the other, independent variables in the basketball_shots.csv dataset. Each observation in the dataset describes a basketball shot in a professional game and the person taking the shot. To build the model:

  1. Import the basketball_shots.csv dataset onto a Workflow canvas using the Text File Import block.
  2. Expand the Model Training group in the Workflow palette, then click and drag a Decision Tree block onto the Workflow canvas.
  3. Click the Output port of the basketball_shots dataset block and drag a connection towards the Input port of the Decision Tree block.
  4. Double-click the Decision Tree block to display the Decision Tree Editor view and the Decision Tree Preferences dialog box.
  5. In the Decision Tree Preferences dialog box:
    1. In the Dependent variable drop-down list, select Score.
    2. In the Target Category drop-down list, select 1 (one).
    3. In the Unselected Independent Variables list, press and hold CTRL and select angle, distance_feet, height, position, and weight.
    4. Click Select to move the specified variables to the Selected Independent Variables list.
  6. Click OK to save the configuration and close the Decision Tree Preferences dialog box.
  7. Right-click the 0:Score node and select Grow (C4.5) to train the model.
  8. Close the Decision Tree Editor view and save the configuration when prompted.
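
The Grow (C4.5) option in step 7 refers to the C4.5 algorithm, which ranks candidate splits by gain ratio (information gain divided by split information). The following is a minimal sketch of that calculation for one categorical variable. It is not the block's implementation: the function names are hypothetical and the rows are made-up toy values, not taken from basketball_shots.csv.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio obtained by splitting rows on a categorical attribute."""
    total = len(rows)
    # Group the target labels by the value of the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Information gain: parent entropy minus weighted entropy of the groups.
    parent = entropy([row[target] for row in rows])
    children = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = parent - children
    # Split information penalises attributes that have many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Made-up toy rows, for illustration only.
shots = [
    {"position": "guard", "Score": 1},
    {"position": "guard", "Score": 1},
    {"position": "center", "Score": 0},
    {"position": "center", "Score": 0},
    {"position": "forward", "Score": 1},
    {"position": "forward", "Score": 0},
]

print(gain_ratio(shots, "position", "Score"))
```

In the full algorithm this score is computed for every candidate variable at a node, and the tree is grown on the best-scoring split. Continuous variables such as distance_feet would additionally need a threshold search, which this sketch omits.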

When the model has been trained and the configuration saved, a green execution status is displayed in the Output port of the Decision Tree block. The output of the Decision Tree block can then be used with a Score block to make predictions on a dataset.
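
Conceptually, scoring with a trained decision tree means passing each new observation down the tree until it reaches a leaf, then reporting the category of that leaf. The sketch below illustrates the idea only. It does not show how the Score block works internally, and the tree structure, thresholds, and rows are invented for the example.

```python
def predict(node, row):
    """Follow the splits for one row until a leaf label is reached."""
    while isinstance(node, dict):
        if row[node["variable"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node


# Hypothetical tree: variables come from the example, thresholds are invented.
tree = {
    "variable": "distance_feet",
    "threshold": 10.0,
    "left": 1,
    "right": {"variable": "angle", "threshold": 30.0, "left": 1, "right": 0},
}

# Made-up observations to score.
new_shots = [
    {"distance_feet": 4.0, "angle": 50.0},
    {"distance_feet": 22.0, "angle": 15.0},
    {"distance_feet": 22.0, "angle": 45.0},
]

predictions = [predict(tree, shot) for shot in new_shots]
print(predictions)  # [1, 1, 0]
```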