Anomaly Detection: Annotate outlier points in graphs with the RowID of the data points?
I am using the Global Outlier Score (GOS) with k-NN. Is it possible to annotate the outliers in the graphs (e.g. the top 10, or those identified by RowID) with their respective RowID, so that I can see directly which points are outliers? For example, in addition to a color gradient.
Furthermore, can I use Optimize Parameters together with the k-NN GOS or the Local Outlier Factor, to identify different outliers based on different parameters?
The problem is that operators like Optimize Parameters expect a performance vector, and outlier detection does not provide one.
Best Answer
-
What I normally do in that case is add another column to my dataset, a binominal attribute outlier = true / false. I then use this as the colour in the scatterplot to highlight my outliers. By creating one such column per technique, I can also compare visually how different outlier detection methods behave.
Using Advanced Charts it's possible to change the shape of the scatterplot dot (I'm sure that's the right term), but personally I think this is a bit fiddly, so I use a pre-prepared bit of Python/R/JavaScript to write the visualization to disk.
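To make the "extra column plus external script" idea concrete, here is a minimal Python sketch. It is not the script referred to above. It assumes the k-NN global score is the average Euclidean distance to the k nearest neighbours, that the data has two plotted dimensions, and that matplotlib is installed for the plotting step. All function names and the output file name are made up for illustration.

```python
import math


def knn_global_outlier_scores(points, k=3):
    """Average Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_top_outliers(row_ids, scores, top_n=10):
    """Return {row_id: bool}; True marks the top_n highest-scoring rows."""
    ranked = sorted(zip(row_ids, scores), key=lambda pair: pair[1], reverse=True)
    top = {rid for rid, _ in ranked[:top_n]}
    return {rid: rid in top for rid in row_ids}


def save_annotated_scatter(row_ids, points, flags, path="outliers.png"):
    """Colour flagged rows red, label them with their row ID, write a PNG."""
    # Imported here so the scoring functions work without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    colors = ["red" if flags[rid] else "steelblue" for rid in row_ids]
    ax.scatter([p[0] for p in points], [p[1] for p in points], c=colors)
    for rid, (x, y) in zip(row_ids, points):
        if flags[rid]:
            ax.annotate(str(rid), (x, y), textcoords="offset points", xytext=(5, 5))
    fig.savefig(path)
    plt.close(fig)
```

Calling `flag_top_outliers` gives the true / false column, and `save_annotated_scatter` both colours the flagged points and writes their row ID next to them, which covers the annotation part of the question.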
Answers
-
Dear Fred,
You can use Loop Parameters (maybe with Select Subprocess) to get the results for various methods.
Of course, you can use Filter Examples (or maybe Filter Example Range) to find the 10 most outlying examples.
~Martin
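Outside of RapidMiner, the same "loop over parameters, then keep the top 10" idea could look roughly like the Python sketch below. It is only an illustration under assumptions: the score is taken to be the average distance to the k nearest neighbours, and the function names are hypothetical. Since there is no performance vector to optimize, it simply compares which row IDs stay in the top list across all tested values of k.

```python
import math


def knn_scores(points, k):
    """Average Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def top_outliers_per_k(row_ids, points, k_values, top_n=10):
    """For each k, return the row IDs of the top_n highest-scoring points."""
    result = {}
    for k in k_values:
        scores = knn_scores(points, k)
        ranked = sorted(zip(row_ids, scores), key=lambda pair: pair[1], reverse=True)
        result[k] = [rid for rid, _ in ranked[:top_n]]
    return result


def stable_outliers(top_by_k):
    """Row IDs that appear in the top list for every tested k."""
    sets = [set(ids) for ids in top_by_k.values()]
    return set.intersection(*sets) if sets else set()
```

Row IDs returned by `stable_outliers` are flagged regardless of the parameter choice, which is one way to compare settings without a performance measure.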
-
OK, but I want to "merge" the top X outliers with the "normal" scatterplot graphs from my results. Is that possible?
And is it not possible to annotate points in graphs by selecting the RowID?