Macros in Cost Matrix (Performance)

FBT
FBT New Altair Community Member
edited November 2024 in Community Q&A

Hello community,

I am trying to optimize a model based on the costs of wrong and correct classifications. However, instead of assigning fixed values in the cost matrix, I would like to use macros and loop over my example set to set the required cost values. To give a bit more color: imagine you are classifying customer churn and want to assign a different cost value to each customer, e.g. the customer's revenue over the past 6 months.

Building the logic is not a big problem (at least I believe it isn't). However, the "Performance (Costs)" operator apparently does not accept macros as input values. Is there anything I can do about it, or is there a workaround?

 

This would be a short sample process based on the Titanic data:

<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve Titanic" width="90" x="45" y="34">
<parameter key="repository_entry" value="//Samples/data/Titanic"/>
</operator>
<operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
<parameter key="attribute_name" value="Survived"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="7.6.001" expanded="true" height="82" name="Generate Attributes" width="90" x="313" y="34">
<list key="function_descriptions">
<parameter key="Scoring" value="(if(Age &lt; 5, 1.1, if(Age &gt; 5 &amp;&amp; Age &lt; 25, 1.2, 1.3)))-1"/>
</list>
</operator>
<operator activated="true" class="optimize_parameters_grid" compatibility="7.6.001" expanded="true" height="103" name="Optimize Parameters (Grid)" width="90" x="514" y="34">
<list key="parameters"/>
<process expanded="true">
<operator activated="true" class="split_validation" compatibility="7.6.001" expanded="true" height="124" name="Validation" width="90" x="246" y="34">
<parameter key="sampling_type" value="stratified sampling"/>
<process expanded="true">
<operator activated="true" class="naive_bayes" compatibility="7.6.001" expanded="true" height="82" name="Naive Bayes" width="90" x="112" y="30"/>
<connect from_port="training" to_op="Naive Bayes" to_port="training set"/>
<connect from_op="Naive Bayes" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="loop_examples" compatibility="7.6.001" expanded="true" height="103" name="Loop Examples" width="90" x="179" y="34">
<process expanded="true">
<operator activated="true" class="extract_macro" compatibility="7.6.001" expanded="true" height="68" name="Extract Macro" width="90" x="112" y="34">
<parameter key="macro" value="Scoring"/>
<parameter key="macro_type" value="data_value"/>
<parameter key="attribute_name" value="Scoring"/>
<parameter key="example_index" value="%{example}"/>
<list key="additional_macros"/>
</operator>
<operator activated="true" class="performance_costs" compatibility="7.6.001" expanded="true" height="82" name="Performance" width="90" x="313" y="34">
<parameter key="cost_matrix" value="[0.0 1.0;1.0 0.0]"/>
<enumeration key="class_order_definition">
<parameter key="class_name" value="Yes"/>
<parameter key="class_name" value="No"/>
</enumeration>
</operator>
<connect from_port="example set" to_op="Extract Macro" to_port="example set"/>
<connect from_op="Extract Macro" from_port="example set" to_op="Performance" to_port="example set"/>
<connect from_op="Performance" from_port="performance" to_port="output 1"/>
<portSpacing port="source_example set" spacing="0"/>
<portSpacing port="sink_example set" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="average" compatibility="7.6.001" expanded="true" height="82" name="Average" width="90" x="313" y="34"/>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Loop Examples" to_port="example set"/>
<connect from_op="Loop Examples" from_port="output 1" to_op="Average" to_port="averagable 1"/>
<connect from_op="Average" from_port="average" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<connect from_port="input 1" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve Titanic" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

 

Alternatively, does anybody have an example process for the operator "Performance (User-Based)"? It does not have a tutorial process and I am having a hard time figuring out how exactly it works.

 

 

Best Answer

  • MartinLiebig
    MartinLiebig
    Altair Employee
    Answer ✓

    Hi,

     

    You can use Generate Attributes and Extract Performance to get a similar result. Just build a "cost" attribute that takes the value %{churnChurn} for an example where both the label and the prediction are churn, and so on for the other combinations. Afterwards, extract the average of this attribute as the performance.

    This also gets you halfway to a customer-based performance (e.g. one based on each customer's Customer Lifetime Value).

     

    Cheers,

    Martin
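
To make the arithmetic behind this suggestion concrete, here is a minimal sketch in plain Python (not a RapidMiner process). It assumes one cost factor per label/prediction combination, standing in for macros such as %{churnChurn}, and scales it by a per-customer revenue value. The class names, the factor values, and the `revenue` field are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical cost factors per (label, prediction) pair.
# They play the role of macros such as %{churnChurn}; the values are made up.
COST_FACTORS = {
    ("churn", "churn"): 0.0,        # churner correctly predicted
    ("churn", "no churn"): 1.0,     # missed churner
    ("no churn", "churn"): 0.5,     # false alarm
    ("no churn", "no churn"): 0.0,  # loyal customer correctly predicted
}


def example_cost(label, prediction, revenue):
    """Cost of one example: outcome factor scaled by that customer's revenue."""
    return COST_FACTORS[(label, prediction)] * revenue


def average_cost(examples):
    """Average of the per-example cost, i.e. the value used as performance."""
    return mean(
        example_cost(e["label"], e["prediction"], e["revenue"]) for e in examples
    )


# Toy scored data: label, prediction, and a hypothetical revenue per customer.
examples = [
    {"label": "churn", "prediction": "churn", "revenue": 500.0},
    {"label": "churn", "prediction": "no churn", "revenue": 200.0},
    {"label": "no churn", "prediction": "churn", "revenue": 1000.0},
    {"label": "no churn", "prediction": "no churn", "revenue": 300.0},
]

print(average_cost(examples))
```

In a process, the per-example cost would presumably be the generated attribute and the mean would be what gets extracted as the performance value; the sketch only mirrors that calculation.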

Answers

  • Telcontar120
    Telcontar120 New Altair Community Member

    I've been thinking about this, and conceptually I don't believe the performance cost operator can utilize different values for different cases, unless you are literally building a separate model for each case (like inside a Loop Examples, for instance). Since the thing that is being minimized is the misclassification cost across all observations based on different models, if it needed to have a different calculation for each observation, then there would potentially be a different model required.  

     

    Having said that, I am not sure why it wouldn't accept a macro to set the values in the cost matrix---that's a question for the developers, I think.
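
For comparison, this is roughly what a fixed cost matrix computes, sketched in plain Python under stated assumptions: the matrix from the sample process, `[0.0 1.0;1.0 0.0]`, with class order Yes, No, is applied identically to every example and the result is averaged. The row/column orientation used here (true class by predicted class) is an assumption; it does not matter for this symmetric matrix. The function name and the toy data are hypothetical.

```python
# Class order and cost matrix taken from the sample process above.
CLASSES = ["Yes", "No"]
COST_MATRIX = [
    [0.0, 1.0],
    [1.0, 0.0],
]


def average_matrix_cost(pairs):
    """Average cost over (true, predicted) pairs using one fixed matrix.

    Assumes rows are indexed by the true class and columns by the predicted
    class; every example is charged from the same matrix.
    """
    index = {name: i for i, name in enumerate(CLASSES)}
    total = sum(COST_MATRIX[index[true]][index[pred]] for true, pred in pairs)
    return total / len(pairs)


# Toy predictions: two correct, two wrong.
pairs = [("Yes", "Yes"), ("Yes", "No"), ("No", "Yes"), ("No", "No")]
print(average_matrix_cost(pairs))
```

Because the same matrix entry is charged for every example with a given outcome, per-customer values such as revenue cannot enter this calculation without changing the matrix between examples.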

     

     

  • FBT
    FBT New Altair Community Member

    Thanks @mschmitz! That is exactly what I was looking for and I officially found my new favorite performance operator. :-)

     

    Thanks also to @Telcontar120; you are probably right that my initial workaround proposal is flawed and would not work as intended. Luckily, RM apparently has a great solution for every possible problem.
