"Anomaly Detection: Annotate outlier Graph points with RowID of datapoints?"

Fred12
Fred12 New Altair Community Member
edited November 5 in Community Q&A

I am using the Global Outlier Score with k-NN. Is it somehow possible to annotate the outliers in the graphs (e.g. the "top 10", or those identified by RowID) with their respective RowID, so I can see directly which points are outliers? E.g. in addition to a color gradient.

 

Furthermore, can I somehow use Optimize Parameters together with k-NN GOS or Local Outlier Factor, to identify different outliers based on different parameters?

The thing is, operators like Optimize Parameters expect a performance vector, and that's not provided by outlier detection...

Best Answer

  • JEdward
    JEdward New Altair Community Member
    Answer ✓
    What I normally do in that case is add another column to my dataset: a binominal outlier = true / false. I then use this as the colour in the scatterplot to highlight my outliers. By creating one such column per technique, I can also see how different outlier detection methods compare visually.




    Using Advanced Charts it's possible to change the shape of the scatterplot dot (I'm sure that's the right term :D), but personally I think this is a bit fiddly, so I use a pre-prepared bit of Python/R/JavaScript to write the visualization to disk.
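A minimal Python sketch of the approach described above, for illustration only: it builds the true / false outlier column from a score column, then draws a scatterplot coloured by that flag and labels only the flagged points with their RowID. The data values, the helper names `flag_outliers` and `plot_with_row_ids`, and the output file name are all hypothetical, and matplotlib is assumed to be installed for the plotting step. This is not the script the answer refers to.

```python
def flag_outliers(scores, n):
    """Return one True/False flag per row, True for the n highest scores."""
    top_rows = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:n])
    return [i in top_rows for i in range(len(scores))]


def plot_with_row_ids(xs, ys, row_ids, flags, path):
    """Scatterplot coloured by the flag; flagged points are labelled with their row id."""
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    colours = ["red" if flag else "steelblue" for flag in flags]
    ax.scatter(xs, ys, c=colours)
    for x, y, row_id, flag in zip(xs, ys, row_ids, flags):
        if flag:
            # Put the row id next to the point, offset by a few pixels.
            ax.annotate(str(row_id), (x, y), xytext=(5, 5), textcoords="offset points")
    fig.savefig(path)
    plt.close(fig)


# Hypothetical example data: two attributes, a row id and an outlier score per row.
xs = [1.0, 1.1, 0.9, 1.2, 8.0, -5.0]
ys = [1.0, 0.9, 1.2, 1.1, 8.0, 6.0]
row_ids = [101, 102, 103, 104, 105, 106]
scores = [0.20, 0.15, 0.30, 0.25, 9.70, 7.60]

flags = flag_outliers(scores, 2)  # the extra "outlier = true / false" column
labelled = [row_id for row_id, flag in zip(row_ids, flags) if flag]
print("Rows labelled as outliers:", labelled)

# Uncomment to write the annotated chart to disk (requires matplotlib):
# plot_with_row_ids(xs, ys, row_ids, flags, "outliers.png")
```

Repeating `flag_outliers` with the scores from a second detection method would give a second flag column, which can be plotted the same way for a side-by-side comparison.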

Answers

  • MartinLiebig
    MartinLiebig
    Altair Employee

    Dear Fred,

     

    you can use Loop Parameters (maybe with Select Subprocess) to get the results for various methods.

     

    Of course you can use Filter Examples (or maybe Filter Example Range, after sorting by the outlier score) to find the 10 most outlierish examples.
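Outside the process designer, the same two ideas (loop over parameter values, then keep the highest-scoring rows) can be sketched in plain Python. This is an illustration only: it assumes the outlier score of a row is its mean Euclidean distance to its k nearest neighbours, which may differ from what the operator computes, and the data and the function names `knn_outlier_scores` and `top_outliers` are hypothetical. No performance vector is involved; the output is simply the top row IDs for each k, to be compared by eye.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the mean Euclidean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def top_outliers(row_ids, scores, n):
    """Row ids of the n highest-scoring rows, most outlying first."""
    ranked = sorted(zip(row_ids, scores), key=lambda pair: pair[1], reverse=True)
    return [row_id for row_id, _ in ranked[:n]]


# Hypothetical toy data: a tight cluster plus two isolated points (rows 7 and 8).
points = [
    (1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1),
    (0.8, 0.9), (1.0, 1.3), (8.0, 8.0), (-5.0, 6.0),
]
row_ids = list(range(1, len(points) + 1))

# "Loop parameters": recompute the scores for several k and keep the top 2 each time.
top_by_k = {k: top_outliers(row_ids, knn_outlier_scores(points, k), 2) for k in (1, 2, 3)}
for k, ids in top_by_k.items():
    print(f"k={k}: top outliers by row id = {ids}")
```

If the top lists stay the same across k, the outliers are stable with respect to that parameter; rows that enter or leave the list as k changes are the ones worth inspecting.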

     

    ~Martin

  • Fred12
    Fred12 New Altair Community Member

    OK, but I want to "merge" the top X outliers with the "normal" scatterplot graphs from my results. Is that possible?

    And is it not possible to annotate points in graphs by selecting RowID?
