Workflow: Replacing missing values with the Impute block
Ian Balanzá-Davis
Altair Employee
The Impute block enables you to replace missing values in a dataset variable based on other values for that variable.
In this example, the Impute block replaces the missing values in the Price variable of the input dataset lib_books.csv, which contains observations describing books available from a lending library. The replacement values are based on the distribution of the non-missing values:
- Import the lib_books.csv dataset onto a Workflow canvas using the Text File Import block.
- Expand the Data Preparation group in the Workflow palette, then click and drag an Impute block onto the Workflow canvas.
- Click the Output port of the lib_books dataset block and drag a connection towards the Input port of the Impute block.
- Double-click the Impute block to display the Configure Impute dialog box.
- In the Configure Impute dialog box:
  - In the Variable drop-down list, select Price.
  - In the Method drop-down list, select Distribution.
  - Click OK to save the configuration and close the Configure Impute dialog box.
A green execution status is displayed in the Output ports of the Impute block and of the new Working Dataset. The Working Dataset contains the input lib_books dataset, with the missing values in the Price variable replaced by imputed values.
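For readers who want to see the idea outside the Workflow canvas, the following is a minimal Python sketch of one plausible reading of distribution-based imputation: each missing value is replaced by a random draw from the observed, non-missing values of the same variable. It is not the Impute block's actual implementation, which this article does not describe. The function name impute_from_distribution, the seed parameter, and the sample prices are hypothetical and are not taken from lib_books.csv.

```python
import random


def impute_from_distribution(values, seed=0):
    """Replace None entries with random draws from the non-missing entries.

    Assumption: "Distribution" is modelled here as sampling from the
    empirical distribution of the observed values. The real Impute block
    may use a different procedure.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no non-missing values to sample from")
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    return [v if v is not None else rng.choice(observed) for v in values]


# Hypothetical Price values; None marks a missing value.
prices = [12.99, None, 7.50, 24.00, None, 7.50]
imputed = impute_from_distribution(prices)
print(imputed)
```

Because draws come from the observed values, values that occur more often (7.50 in the sample above) are more likely to be chosen, so the imputed variable roughly preserves the shape of the original distribution. Non-missing values are left unchanged.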