Help interpreting outliers/anomalies when using Isolation Forest operator

New Altair Community Member

Jan 28, 2022

Updated Nov 5, 2024 by Jocelyn

Hi. I'm really liking the Isolation Forest operator in the Anomaly Detection Extension. With Trees = 100, Leaf Size = 2, and average path as the score calculation, I get a result where the first 5 outliers match exactly those from an R script that uses the Mahalanobis distance function. That is great for comparison. But is there a calculation or rule of thumb that you suggest for the Trees parameter, or for the cutoff score? Using my R script for comparison, I can easily match the 5 lowest scores. Score-wise, is there a point, or a way to calculate one, where the outliers/anomalies end and the remaining examples are not outliers? Thanks for any help.

MartinLiebig

Altair Employee

Accepted Answer

Jan 28, 2022

Hi,

Great to hear that we produce the same output as R. I am the author of the operator, and I have only compared it to sklearn.

I think there is generally no real way to find the right parameters or cutoff for the anomaly_score. If you have a list of known anomalies, you may be able to calculate recall and precision on that set. But that's rather rare.
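If you do happen to have such a list, the check could look roughly like the sketch below. It is plain Python, not anything from the extension. The row ids, scores, and cutoffs are made up for illustration, and it assumes that a lower score means more anomalous, as with the average path score mentioned in the question.

```python
def precision_recall(scores, known_anomalies, cutoff):
    """Flag rows whose score is at or below `cutoff` and compare with known anomalies.

    scores: dict mapping a row id to its anomaly score
            (assumption: lower = more anomalous, e.g. an average path length).
    known_anomalies: set of row ids already confirmed as anomalies.
    """
    flagged = {row for row, score in scores.items() if score <= cutoff}
    true_positives = len(flagged & known_anomalies)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(known_anomalies) if known_anomalies else 0.0
    return precision, recall


# Hypothetical scores and labels, for illustration only.
scores = {"a": 2.1, "b": 2.4, "c": 5.8, "d": 6.0, "e": 3.0, "f": 6.3}
known = {"a", "b", "c"}

# Try a few candidate cutoffs and see how precision and recall trade off.
for cutoff in (2.5, 3.5, 6.0):
    p, r = precision_recall(scores, known, cutoff)
    print(f"cutoff={cutoff}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data the tightest cutoff flags only true anomalies but misses one, while the loosest cutoff finds all three at the cost of two false alarms. Which trade-off is acceptable depends on the use case.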

For trees: I would suspect that more is better, but at some point the score should converge, and more trees only add computation time.
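One way to see that convergence is to score the same points with a growing number of trees and watch the average path settle. The sketch below is a stripped-down, pure-Python illustration of the isolation idea, not the operator's implementation. The function names, the sample size of 64, the depth cap of 12, the leaf size of 1, and the synthetic data are all assumptions chosen for the example.

```python
import random


def path_length(point, data, rng, depth=0, max_depth=12):
    """Number of random axis-aligned splits needed to isolate `point` within `data`."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:
        # Simplification: stop if the chosen dimension cannot be split.
        return depth
    split = rng.uniform(lo, hi)
    # Follow only the branch that the point falls into.
    same_side = [row for row in data if (row[dim] < split) == (point[dim] < split)]
    return path_length(point, same_side, rng, depth + 1, max_depth)


def average_path(point, data, n_trees, sample_size=64, seed=0):
    """Average path length of `point` over `n_trees` random subsamples of `data`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        total += path_length(point, sample, rng)
    return total / n_trees


# Synthetic data: 200 Gaussian points plus one obvious outlier (illustration only).
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
inlier = (0.0, 0.0)
data.append(outlier)

for n_trees in (10, 50, 100, 500):
    out_score = average_path(outlier, data, n_trees)
    in_score = average_path(inlier, data, n_trees)
    print(f"trees={n_trees}: outlier={out_score:.2f}, inlier={in_score:.2f}")
```

If the scores for a few representative rows barely move between two tree counts, the smaller count is probably enough for that data set.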

Best,

Martin
