different validation performance parameters in LOG?

Fred12 New Altair Community Member
edited November 5 in Community Q&A

hi, 

I have several performance parameters for validation to choose in the log operator, see screenshot:

 

val.png

 

Can someone explain to me what the difference is between the different performance values? I have only 1 Performance operator in my design.

 

And can someone please tell me whether I should use the normal Performance operator for k-NN, or some cluster performance operator? Which is better?

I would like to see possible cluster outliers and be able to tag my data points with the label class color, but my dataset has 20+ attributes. Is it still possible to visualize k-NN somehow?

 

Answers

  • MartinLiebig
    Altair Employee

    Dear Fred,

     

    I think there are various things mixed in one question.

     

    The various performance values in X-Val

    I think these are only placeholders in case you use more than one criterion.

     

    Placement of the Log operator

    Please make sure that the Log operator comes AFTER the operator it should log, in your case the X-Val. If you put it inside, it cannot access the latest result of the X-Val.

     

    Performance

    You use k-NN to classify, so you should use one of the Performance operators for classification. Which measure to use is of course driven by your problem. Using a clustering measure does not make sense if you do classification.
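    As an illustration of what classification measures such as accuracy and kappa compute, here is a minimal Python sketch. It is not RapidMiner code; the label lists are made-up toy values and the function names are only for this example.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance agreement."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that truth and prediction match by accident,
    # given how often each class occurs on both sides.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Only one class on both sides; kappa is formally undefined here,
        # this sketch simply reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical toy labels, not taken from the thread.
y_true = ["a", "a", "b", "b", "a", "b"]
y_pred = ["a", "b", "b", "b", "a", "a"]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```

    Both numbers need a true label and a predicted label per example, which is what a classifier like k-NN produces. Cluster measures answer a different question.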

     

    Visualizing 20 Dimensions

    It is simply not possible to look at 20 dimensions at once. You would need to reduce the dimensions with techniques like PCA, SOM or t-SNE.
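    To make the PCA idea concrete, here is a small sketch in plain Python that projects 20 attributes down to 2 so they can go into a scatter plot colored by label. It uses power iteration on the covariance matrix, which is just one simple way to get the leading components; the random data and all names are assumptions for illustration. In RapidMiner you would use the PCA operator instead.

```python
import random

def pca_2d(rows, iterations=200, seed=0):
    """Project rows onto their first two principal components (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(2):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iterations):
            # Multiply by the covariance matrix ...
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            # ... remove the directions that were already found ...
            for comp in components:
                dot = sum(a * b for a, b in zip(w, comp))
                w = [a - dot * b for a, b in zip(w, comp)]
            # ... and rescale to unit length.
            norm = sum(a * a for a in w) ** 0.5
            v = [a / norm for a in w]
        components.append(v)
    projected = [[sum(a * b for a, b in zip(c, comp)) for comp in components]
                 for c in centered]
    return projected, components

# Hypothetical data: 60 examples with 20 attributes, two of them with large spread.
data_rng = random.Random(1)
scales = [10.0, 5.0] + [1.0] * 18
rows = [[data_rng.gauss(0, s) for s in scales] for _ in range(60)]

projected, components = pca_2d(rows)
print(projected[0])  # the 2-D coordinates of the first example
```

    Each example ends up with two coordinates. Plotting those and coloring the points by class label gives the kind of picture asked for, with the caveat that two components only show part of the variance.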

     

    ~Martin

  • land
    land New Altair Community Member

    Regarding the different performance values:

     

    performance is the value of the main criterion, which you select in the performance operator inside the X-Validation.

    deviation is the standard deviation of this main criterion.

     

    performance1 to performance3 refer to the first three performance criteria selected in the Performance operator. So if you check accuracy and error in Performance (Classification), performance1 refers to accuracy and performance2 to the error, as accuracy is the first checked criterion and error the second in the list.
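    The mapping described above can be sketched in Python as follows. The per-fold numbers are invented, and two details are assumptions of this sketch rather than confirmed X-Validation behavior: that the value is a plain average over folds, and that the deviation is the population standard deviation.

```python
import statistics

# Hypothetical per-fold results of a 5-fold cross-validation (toy numbers).
fold_results = [
    {"accuracy": 0.80, "classification_error": 0.20, "kappa": 0.60},
    {"accuracy": 0.75, "classification_error": 0.25, "kappa": 0.50},
    {"accuracy": 0.85, "classification_error": 0.15, "kappa": 0.70},
    {"accuracy": 0.80, "classification_error": 0.20, "kappa": 0.60},
    {"accuracy": 0.80, "classification_error": 0.20, "kappa": 0.60},
]

# Criteria in the order in which they are checked in the Performance operator.
criteria = ["accuracy", "classification_error", "kappa"]
main_criterion = "accuracy"

def summarize(fold_results, criteria, main_criterion):
    """Build one log row: performance, deviation, performance1..performance3."""
    averages = {c: statistics.mean([r[c] for r in fold_results]) for c in criteria}
    row = {
        "performance": averages[main_criterion],
        # Assumption: population standard deviation over the folds.
        "deviation": statistics.pstdev([r[main_criterion] for r in fold_results]),
    }
    # Only the first three checked criteria get a numbered slot.
    for i, c in enumerate(criteria[:3], start=1):
        row[f"performance{i}"] = averages[c]
    return row

row = summarize(fold_results, criteria, main_criterion)
print(row)
```

    With accuracy as both the main criterion and the first checked one, performance and performance1 carry the same number, and performance2 is the error.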

     

    This is a major pain point in any training course I have given so far, so can't hurt to be precise here :)

     

    Greetings,

      Sebastian

  • Fred12
    Fred12 New Altair Community Member

    ok thanks, that helped a bit, but I am still confused..

     

    I am using an Optimize Parameters (Grid) operator, inside it a Backward Elimination, and inside that an X-Validation with W-REPTree for a numeric dataset:

    test.PNG

     

    Where should I use the Log operator now? I used one after the X-Validation, another one after the Backward Elimination, and another one after the Optimize Grid operator...

    Secondly, I still don't really understand the result of the Log operator regarding things like performance1, performance2, etc., because those are not the same as accuracy, classification error and so on:

     Unbenannt2.PNG

    My log(3) operator, the one after the Backward Elimination, puts out different results than the one after the X-Val, of course:

    Unbenannt3.PNG

    But how does that work? After which loop will an entry be made in the log(3) operator?

  • MartinLiebig
    Altair Employee

    Hi Fred,

     

    It always depends on what you want to do. In your case, you would like to log the performance that the optimization is working on, so you log the performance returned by Backward Elimination. This is the one to log.
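    To see after which loop each Log writes an entry, the nesting can be sketched as plain Python loops. This is only a toy model: it assumes that a Log adds one row every time it is executed, the scoring function is an invented stand-in for a cross-validation, and the feature and parameter names are hypothetical.

```python
log_after_xval = []      # like a Log placed right after the X-Validation
log_after_backward = []  # like a Log placed right after Backward Elimination
log_after_grid = []      # like a Log placed right after the grid optimization

def cross_validate(params, features):
    """Hypothetical stand-in for X-Validation; returns a toy score."""
    useful = sum(1 for f in features if f != "noise")
    noise = len(features) - useful
    score = 0.7 + 0.05 * useful - 0.03 * noise - 0.01 * abs(params["M"] - 2)
    # The innermost log gets a row for every single validation run.
    log_after_xval.append({"M": params["M"], "features": tuple(features),
                           "performance": score})
    return score

def backward_elimination(params, features):
    """Greedy removal of one feature at a time while the score improves."""
    best_features = list(features)
    best_score = cross_validate(params, best_features)
    while len(best_features) > 1:
        candidates = [[x for x in best_features if x != f] for f in best_features]
        scored = [(cross_validate(params, c), c) for c in candidates]
        top_score, top_features = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break
        best_score, best_features = top_score, top_features
    # This log gets one row per finished elimination, i.e. per grid point.
    log_after_backward.append({"M": params["M"], "performance": best_score})
    return best_score

best = None
for m in [1, 2, 3]:  # hypothetical grid over one parameter
    score = backward_elimination({"M": m}, ["f1", "f2", "noise"])
    if best is None or score > best["performance"]:
        best = {"M": m, "performance": score}
# The outermost log gets a single row once the whole grid is done.
log_after_grid.append(best)

print(len(log_after_xval), len(log_after_backward), len(log_after_grid))
```

    In this model the log after Backward Elimination holds one row per parameter combination, which is the table the optimization actually compares.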

     

    Be careful with overtraining!

     

    ~Martin

  • Fred12
    Fred12 New Altair Community Member

    ok, but I want to test the 3 parameters M, V, N in REPTree against each other, because I want to achieve a high accuracy in X-Validation...

     

    I am now a bit confused about which of the logged values (performance, accuracy or kappa) I should use to see the best performance.

    Unbenannt.PNG

    The first line has an accuracy of 82.8%, but performance is only 77.6%. What is performance now? I thought that's the main criterion, which is accuracy?

    And performance1 is 77.6%. Shouldn't that be the same as accuracy, because that's the first criterion to choose in the Performance (Classification) operator?

     

  • MartinLiebig
    Altair Employee

    Which accuracy did you log there? Backward Elimination?

     

    ~Martin

  • Fred12
    Fred12 New Altair Community Member

    yes, log(3) is backward elimination, log(2) is x-validation

  • asem_k
    asem_k New Altair Community Member

    What if someone wants to log more than 3 performance values, i.e., has checked more than 3 metrics and wants to log all of them, not only the first 3?

  • land
    land New Altair Community Member

    In that (very rare) case you can still use Performance to Data to transform the performance into a data set and handle it yourself. You could attach the current parameter settings using the param function in Generate Attributes and collect all the data sets in one of the usual ways.

    We usually use the Indexed Collections of our Jackhammer extension, which not only collect the objects but also index them with an arbitrary number of attribute/value pairs, so that you can access a specific object later by providing its index values. But it is also good to have a match between parameters -> performance.
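    The idea of turning each performance into a data row and attaching the parameter settings can be sketched in Python like this. It is not RapidMiner code; the evaluate function returns invented toy numbers, and the parameter values and metric names are assumptions for illustration.

```python
import csv
import io
from itertools import product

def evaluate(m, n):
    """Hypothetical stand-in for a cross-validation; toy numbers, not real results."""
    accuracy = 0.70 + 0.02 * m + 0.01 * n
    return {
        "accuracy": accuracy,
        "classification_error": 1 - accuracy,
        "kappa": 2 * accuracy - 1,
        "weighted_mean_recall": accuracy - 0.05,
    }

rows = []
for m, n in product([1, 2], [3, 5]):  # hypothetical parameter grid
    metrics = evaluate(m, n)
    # One row per parameter combination: the settings first, then every metric.
    rows.append({"M": m, "N": n, **metrics})

# Collect all rows into one table, here written as CSV text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
table = buffer.getvalue()
print(table)
```

    Because each row is an ordinary record, the number of metrics is not limited to three, and every value stays matched to the parameters that produced it.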

     

    Greetings,

     Sebastian