Workflow customisation: Creating new custom variables in the Mutate block

IanBD
Altair Employee
edited October 2022 in Altair RapidMiner

If you find that the default functionality provided in the Workflow Mutate block is not quite what you need, you can create and use custom SAS language functions with the FCMP procedure.

User-defined functions saved in a SAS language catalog can be used in a workflow. Do not rely on a function that is only defined in a SAS language block as part of the workflow: running a workflow uses multiple instances of an Altair SLC engine, so a function defined that way is not guaranteed to be available to all parts of the workflow.

Your function can be invoked by every instance of the engine if you save the catalog in the SASUSER library and then add that location to the CMPLIB system option for the active engine.
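
To confirm that an engine instance has picked up the setting, you can print the option value to the log. This is a minimal sketch, assuming the catalog is saved as SASUSER.MYLIB as in the example that follows, and assuming the OPTIONS procedure is available on your engine:

```sas
/* Print the current value of the CMPLIB system option to the log.   */
/* After the engine is configured and restarted, the value is        */
/* expected to include SASUSER.MYLIB (the name used in this article). */
proc options option=cmplib;
run;
```

An OPTIONS statement such as `options cmplib=sasuser.mylib;` in a SAS language block sets the same option, but presumably only for the engine instance that runs that block, which is why this article sets it in the engine startup options instead.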

This example uses the exam_results.csv dataset from the supplied samples and adds a new variable to it using the Mutate block. A function is created that calculates how effectively students used their study time and returns a result between 0 and 100 inclusive. The higher the score, the more effective their use of time.

To invoke a user-defined function in a workflow:

  1. Define the following howEffective function in a SAS language block. This block does not need to be attached to any other block in the Workflow:
    proc fcmp outlib = sasuser.mylib.testfn;
      function howEffective(timespent, result);
        /* Raw score, with guards for zero inputs */
        if result = 0 then nscore = 1;
        else if timespent = 0 then nscore = 0;
        else nscore = 1 / (1 + result / timespent);
        /* Clamp and scale to the range 0 to 100 */
        if nscore GE 1 then return(100);
        else if nscore LE 0 then return(0);
        else return(nscore*100);
      endsub;
    run;
  2. Specify the catalog to the active engine:
    1. In the Workflow Link Explorer, right-click the active engine (highlighted in bold) and click Properties.
    2. Expand the Startup group and click Set System Options.
    3. Click Add. In the System Option dialog box, enter CMPLIB in the Name field and enter SASUSER.MYLIB in the Value field.
    4. Click OK to close the System Option dialog box.
    5. Click OK to close the Properties dialog box.
    6. Restart the engine for the change to take effect.
  3. From the supplied samples, drag exam_results.csv from the data folder to the Workflow canvas. See the knowledge base article Importing the Altair SLC samples if you need to import the samples into your workspace.
  4. The two variables of interest in the dataset are HoursStudy and MockResult. To create a Working Dataset containing only these variables:
    1. Double-click the exam_results.csv block.
    2. Select the Column Selection tab.
    3. In the Selected Columns list, select Pass? and click Deselect.
    4. Click OK to close the Configure Text File Import dialog box.
  5. Expand the Data Preparation group and drag the Mutate block to the Workflow canvas:
    1. Drag a connection from the Working Dataset to the Mutate block.
    2. Double-click the Mutate block to open the block editor.
    3. In the Mutated Variables list, select NewVariable and enter Effectiveness.
    4. In Expression, enter the following:
      howEffective(hoursstudy, mockresult)
    5. Close the Mutate block editor and save when prompted.
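
Before relying on the Mutate block output, it can help to check the function on a few known inputs. The following is a minimal sketch, assuming steps 1 and 2 above are complete so that CMPLIB points at SASUSER.MYLIB. The variable names equal, noresult and notime are hypothetical and used only for illustration. The values in the comments follow from the function definition in step 1:

```sas
/* Assumes howEffective is stored in SASUSER.MYLIB and CMPLIB is set. */
data _null_;
  equal    = howEffective(10, 10);  /* 1 / (1 + 10/10) = 0.5, scaled to 50 */
  noresult = howEffective(10, 0);   /* result = 0 branch, returns 100      */
  notime   = howEffective(0, 50);   /* timespent = 0 branch, returns 0     */
  put equal= noresult= notime=;     /* write the three values to the log   */
run;
```

If the log reports that howEffective is unknown, the most likely cause is that the CMPLIB option is not set for the engine instance that ran the program.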

The Working Dataset created by the Mutate block contains three variables (HoursStudy, MockResult and Effectiveness) and can be linked to other blocks in your workflow. You can also use your user-defined functions in the SQL block, and both user-defined functions and CALL routines in SAS language programs in the SAS language block. For more information about using the FCMP procedure, see the Altair SLC Reference for Language Elements.
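
As an illustration of calling the function outside the Mutate block, the sketch below computes the same Effectiveness variable in SQL and in a DATA step. It assumes that CMPLIB already includes SASUSER.MYLIB. The dataset names work.exam_results, work.scored_sql and work.scored_ds are hypothetical; in a workflow, the input would be whatever dataset is connected to the block. The SQL is shown here inside PROC SQL as it would appear in a SAS language program:

```sas
/* Hypothetical input: WORK.EXAM_RESULTS holds HoursStudy and MockResult. */
/* Assumes CMPLIB already includes SASUSER.MYLIB.                         */

/* SQL: compute the score as a derived column. */
proc sql;
  create table work.scored_sql as
    select HoursStudy,
           MockResult,
           howEffective(HoursStudy, MockResult) as Effectiveness
    from work.exam_results;
quit;

/* DATA step: the same calculation as an assignment statement. */
data work.scored_ds;
  set work.exam_results;
  Effectiveness = howEffective(HoursStudy, MockResult);
run;
```

Both forms should produce the same three variables as the Mutate block example, because they call the same stored function with the same arguments.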