What does 'Instantiate Variable' mean and why would I use it?

Mahshid
Altair Employee
edited May 2021 in Community Q&A

When adding a new variable in the 'Variable Transformations' node, some of you may have noticed the option to 'Instantiate' a variable. I quite often get asked what that means and why you would or wouldn't use it, so I thought I'd clear it up!

'Instantiate Variable' essentially means that you're creating the variable as stored values and cutting the link to the original data. This allows you to delete the original columns without losing the new variable. The trade-off is that you lose the data lineage, and with it the ability to score new, untransformed data without first applying the transformation yourself.

Let me break this down into parts.

1. Create a new 'Variable Transformations' node with some data and click 'Add New Variable'. In this case, we're creating a new variable, 'My_Variable', based on two existing fields, [IndivID] and [HHoldID]:

image

Using variable transformation functionality to instantiate a new variable.



2. Leaving 'Force data to be instantiated' unchecked, create this variable and then try to delete the two source variables it is based on:

image

3. Note the error saying that you have a dependency on those two variables:

image

4. Selecting 'Force data to be instantiated' will stop this error appearing (I'd post a pic, but there's nothing really interesting to take a screenshot of for a successful delete - just imagine clicking 'Run' and no error popping up). 
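If it helps to think about steps 1 to 4 in code, here is a minimal Python sketch of the idea. It is not how the product implements anything: the `Table` class, its methods, and the `combine` expression are all hypothetical names made up for illustration. A linked variable keeps its expression plus the names of the columns it reads, so deleting a source column is refused; an instantiated variable stores plain values and has no dependency left.

```python
# Conceptual sketch only: Table, add_variable and combine are hypothetical,
# not part of any product API.
class Table:
    def __init__(self, columns):
        self.columns = dict(columns)   # column name -> list of stored values
        self.linked = {}               # variable name -> (expression, source names)

    def _evaluate(self, func, sources):
        rows = zip(*(self.columns[s] for s in sources))
        return [func(*row) for row in rows]

    def add_variable(self, name, func, sources, instantiate=False):
        if instantiate:
            # Compute once and store the values; the link to the sources is cut.
            self.columns[name] = self._evaluate(func, sources)
        else:
            # Keep only the expression and the columns it depends on.
            self.linked[name] = (func, tuple(sources))

    def values(self, name):
        if name in self.linked:
            func, sources = self.linked[name]
            return self._evaluate(func, sources)
        return self.columns[name]

    def delete(self, name):
        dependents = [v for v, (_, src) in self.linked.items() if name in src]
        if dependents:
            raise ValueError(f"cannot delete {name!r}: used by {dependents}")
        del self.columns[name]


def combine(indiv, hhold):
    # Made-up expression; the post does not say how My_Variable is calculated.
    return f"{hhold}-{indiv}"


def try_delete(instantiate):
    table = Table({"IndivID": [1, 2, 3], "HHoldID": [10, 10, 20]})
    table.add_variable("My_Variable", combine, ["IndivID", "HHoldID"],
                       instantiate=instantiate)
    try:
        table.delete("HHoldID")
    except ValueError as exc:
        return f"error: {exc}"
    return table.values("My_Variable")


print(try_delete(instantiate=False))  # dependency error, as in step 3
print(try_delete(instantiate=True))   # delete succeeds, values survive
```

With instantiation off, the delete is rejected because 'My_Variable' still reads from the source column. With it on, the delete goes through and the stored values are still there.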

So what are the problems with doing this?

Firstly, you lose the ability to see any data lineage in the model parameters. Here I create two variables from 'Education', 'EducationON' (instantiation on) and 'EducationOFF' (instantiation off), and build a tree.

image

This is what's shown in the parameter expressions tab. You can only see the expression for 'EducationOFF'; the 'EducationON' variable is treated as a brand-new variable with no expression behind it:

image

Next, I'll try to score this model against the original data, before any transformations were applied:

image

At the mapping step, EducationOFF maps itself back to the original 'Education' variable, but EducationON has nothing to map to. To score with it, I would need to apply the transformation to the data manually before this step:

image
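The scoring problem can be sketched the same way. Again, this is only a rough Python illustration under my own assumptions: the `group_education` rule, the `model_inputs` structure, and `prepare_inputs` are hypothetical and not taken from the product. The point is that a linked input carries its expression, so it can be rebuilt from raw data, while an instantiated input carries nothing and has to be supplied by hand.

```python
# Conceptual sketch only: the grouping rule and data structures are hypothetical.
def group_education(education):
    # Made-up transformation standing in for whatever the real expression is.
    degrees = {"Bachelors", "Masters", "Doctorate"}
    return "Degree" if education in degrees else "No degree"


# What the model "remembers" about each input variable.
model_inputs = {
    # Linked variable: expression and source column are kept (lineage intact).
    "EducationOFF": {"expression": group_education, "source": "Education"},
    # Instantiated variable: lineage was cut, so nothing is known about it.
    "EducationON": {"expression": None, "source": None},
}


def prepare_inputs(raw_row, inputs):
    """Build model inputs from a raw row; return (prepared, unmapped names)."""
    prepared, unmapped = {}, []
    for name, info in inputs.items():
        if name in raw_row:
            prepared[name] = raw_row[name]            # already present in the data
        elif info["expression"] is not None:
            prepared[name] = info["expression"](raw_row[info["source"]])
        else:
            unmapped.append(name)                     # nothing to map it to
    return prepared, unmapped


# Scoring against non-transformed data: only EducationOFF can be rebuilt.
raw_row = {"Education": "Masters"}
prepared_raw, unmapped_raw = prepare_inputs(raw_row, model_inputs)
print(prepared_raw, unmapped_raw)

# Manual workaround: apply the transformation yourself before scoring.
fixed_row = dict(raw_row, EducationON=group_education(raw_row["Education"]))
prepared_fixed, unmapped_fixed = prepare_inputs(fixed_row, model_inputs)
print(prepared_fixed, unmapped_fixed)
```

The first call leaves 'EducationON' unmapped. Only after the transformation is applied by hand does every input have a value.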

So, to summarize:
1. Instantiation treats the result as a new, standalone variable
2. You can delete any of the source data used in the calculation
3. You won't see the variable's expression in the model
4. You won't be able to score against non-transformed data without transforming it manually first

Hope this makes it clearer! Post below if you have further questions!!



------------------------------
Alex Gobolos
Sales Engineer
Datawatch Corporation
Toronto ON
------------------------------
