Fine Tuning Outlier & Novelty Detection

Angkit Choudhury_21636
Altair Employee

Maximizing predictive power of anomaly detection models with hyperparameter tuning.

In a typical end-to-end data analytics workflow, hyperparameter tuning plays a major role in getting the desired performance and predictions out of a machine learning or deep learning model. To use a car analogy: if we consider a particular ML model as a vehicle, its corresponding hyperparameters are like the different buttons we have in our cars. So in today's blog, let's take a look at the hyperparameters associated with the anomaly detection models in signalAI and, expanding on that, discuss the ones that typically have the greatest impact when fine-tuning these models. As with our cars, some buttons are more important to learn than others :).

Starting with Isolation Forest (IF): to fine-tune it to the problem at hand, we have a number of hyperparameters, shown in the panel below. As a rule of thumb, the attributes called "Estimator" and "Contamination" are typically the most influential. By tuning "Estimator", we change the number of trees fitted under the hood of IF depending on the complexity of the problem at hand: the higher the number of estimators, the more complex the fitted model. "Contamination" is the number by which we tell Isolation Forest roughly what percentage or portion of the input data points are outliers. It's a rough estimate or guess. In some use cases, such as Renishaw, we might get lucky and know the contamination factor beforehand, but if we don't, the default value of 0.1 (10%) is a good place to start.


[Image: Isolation Forest hyperparameter panel in signalAI]


In the context of Local Outlier Factor (LOF), as shown below, LOF is set up for novelty detection by default (Outlier vs Novelty). It can also be used for outlier detection by setting the hyperparameter "Novelty" to "False", so this is a very important parameter to set before we start our analysis. When we use LOF for novelty detection, one useful trick is to use a very small number (for example 0.001 or lower) for "Contamination" while fitting it on the input data, as for novelty detection the input data should ideally be free of anomalies during "training". For outlier detection, "Contamination" means exactly the same thing here as in Isolation Forest. On top of this, the parameter "Neighbors" also generally plays a significant role: a higher value of "Neighbors" typically produces better results at the expense of increased fitting time.


[Image: Local Outlier Factor hyperparameter panel in signalAI]
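As a sketch of the two modes, again assuming a scikit-learn backend where "Neighbors", "Novelty" and "Contamination" correspond to `n_neighbors`, `novelty` and `contamination` (note that scikit-learn's own default is `novelty=False`):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(0)
clean = rng.normal(size=(300, 2))                 # anomaly-free "training" data
new_points = np.array([[0.0, 0.0], [8.0, 8.0]])   # one normal point, one novelty

# Novelty detection: fit on clean data with a very small contamination,
# then score points the model has never seen
novelty_lof = LocalOutlierFactor(n_neighbors=35, novelty=True, contamination=0.001)
novelty_lof.fit(clean)
print(novelty_lof.predict(new_points))  # -1 marks a novelty

# Outlier detection: novelty=False; labels come straight from fit_predict
dirty = np.vstack([clean, [[8.0, 8.0]]])
outlier_lof = LocalOutlierFactor(n_neighbors=35, novelty=False, contamination=0.01)
labels = outlier_lof.fit_predict(dirty)  # -1 marks an outlier in the input itself
```

In novelty mode there is no `fit_predict`; the model is fitted on clean data once and then queried on new points, which is exactly why the tiny contamination value makes sense there.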


Now for One-Class SVM (OSVM): it is typically sensitive to outliers and hence not very good for outlier detection, but it can still be used for flagging outliers by fine-tuning the hyperparameter called "Nu", which handles outliers by preventing overfitting. Intuitively, "Nu" is roughly equivalent to the "Contamination" hyperparameter we discussed above. Other than "Nu", the parameter "Kernel" is also an important one, as it controls what type of kernel we use under the hood. In use cases where the input dataset is separable with a linear decision boundary, we can try the kernel choice of "linear", as it is faster to fit than the other kernels. Typically, though, as shown below, the default kernel choice of "rbf" is a good place to start, as it can handle data points whether they are linearly separable or not.


[Image: One-Class SVM hyperparameter panel in signalAI]
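A minimal sketch with scikit-learn's `OneClassSVM` (again an assumption about the backend; "Nu" maps to `nu` and "Kernel" to `kernel`), using a dense cluster plus a few isolated far-away points:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(7)
normal = rng.normal(size=(200, 2))               # one dense, roughly spherical cluster
outliers = np.array([[8.0, 8.0], [-8.0, 8.0],    # a handful of isolated, far-away points
                     [8.0, -8.0], [-8.0, -8.0], [0.0, 9.0]])
X = np.vstack([normal, outliers])

# nu upper-bounds the fraction of training points treated as errors,
# which is why it behaves much like "Contamination" above
model = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale")
labels = model.fit_predict(X)  # -1 = outlier, 1 = inlier
print("points flagged as outliers:", int((labels == -1).sum()))
```

Swapping `kernel="rbf"` for `kernel="linear"` in the constructor is all it takes to try the faster linear boundary when the data supports it.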


Sometimes hyperparameter tuning might feel a bit overwhelming and tedious. But if you have read this far, the next time you get your hands dirty fitting these anomaly detection models to some data, hopefully you will be able to reach optimal performance in fewer iterations and save some of your valuable time!


Thanks for the read! Happy Learning!