What Benefits does Normalisation offer?
Madcap
New Altair Community Member
Hi, I understand that when normalising my data it puts values into a specific range.
I know that this can help for machine learning purposes but I'm unclear on how?
Would someone mind clearing this up for me?
Thanks again
-Madcap
Best Answers
Hello @Madcap
Normalizing puts values into a specific range, true. More precisely, it rescales all of a predictor's values into the same range, for example 0 to 1. Many ML and statistical models assume that their inputs are on comparable scales. The main use of normalization is when we have predictors (attributes) whose scales (ranges) vary a lot. For example, if one attribute has values between 0 and 10 and another has values between 1000 and 10000, the algorithm may treat the attribute with the larger values (1000 to 10000) as the more important predictor, which might not be true in reality. For this reason, we normalize so that all attributes are on a comparable scale during training and are prioritized according to their actual statistical significance. This also supports stable convergence of the algorithm.
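A minimal sketch of this in plain Python (the attribute values below are made up for illustration):

# Min-max normalization: rescale an attribute to the range [0, 1].
def min_max_normalize(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Two hypothetical attributes with very different ranges.
small_range = [1, 3, 5, 7, 10]            # roughly 0 to 10
large_range = [1000, 2500, 4000, 10000]   # 1000 to 10000

print(min_max_normalize(small_range))  # [0.0, 0.222..., 0.444..., 0.666..., 1.0]
print(min_max_normalize(large_range))  # [0.0, 0.166..., 0.333..., 1.0]
# Both attributes now live on the same 0-1 scale, so neither one
# dominates simply because of its original units.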
Just to add to the great explanation from @varunm1: Normalization is especially important for all distance-based learners like k-NN. Without normalization, attributes with a very large range would simply overwhelm attributes with smaller ranges, not because they are actually more important as predictors, but simply because they have a bigger range. For other learning schemes, e.g. Decision Tree, this does not matter, and in fact I would recommend against normalization in most cases, since it changes the range of your input data and reduces the understandability of the model for somebody who is familiar with the application domain.
Hope this helps,
Ingo
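To make this concrete, here is a small sketch (with made-up numbers) of how an unnormalized large-range attribute dominates the Euclidean distance that k-NN relies on:

import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two hypothetical examples: (small-range attribute, large-range attribute).
p = [2.0, 1000.0]
q = [9.0, 1200.0]

# Raw distance: the second attribute contributes 200^2 versus 7^2,
# so it almost entirely determines the result.
print(euclidean(p, q))  # ~200.12

# After min-max scaling (assuming ranges 0-10 and 1000-10000),
# both attributes contribute on a comparable scale.
p_scaled = [2.0 / 10, (1000.0 - 1000) / 9000]
q_scaled = [9.0 / 10, (1200.0 - 1000) / 9000]
print(euclidean(p_scaled, q_scaled))  # ~0.70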
Answers
Thanks @IngoRM, that is helpful. I had been creating my decision trees, rule models, etc. with normalised data, mainly because the tutorials I had followed did it that way. I definitely understand the readability aspect, as I found myself trying to work out what a standardised value actually represented.
Thanks again
-Madcap
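For reference, a standardised z-score can be mapped back to the original units whenever the mean and standard deviation used for scaling are known. A minimal sketch with made-up numbers:

# Hypothetical scaling parameters recorded when the attribute was
# standardised via z = (x - mean) / std.
mean, std = 45000.0, 12000.0

def to_original_units(z):
    # Invert the z-score: x = z * std + mean.
    return z * std + mean

print(to_original_units(1.5))    # 63000.0 -> 1.5 std devs above the mean
print(to_original_units(-0.25))  # 42000.0 -> slightly below the mean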
Just a clarifying note: normalization doesn't actually change the distribution of the underlying variable itself, only its range. In spite of the name, normalization doesn't magically transform the underlying data into a "normal" distribution. So you still might need outlier detection and removal techniques, depending on the actual data you are using.
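A quick sketch of this point: min-max scaling is a linear transformation, so a skewed sample stays just as skewed afterwards (the values are made up):

# A right-skewed sample with one extreme outlier.
data = [1, 2, 2, 3, 3, 4, 100]

lo, hi = min(data), max(data)
scaled = [(v - lo) / (hi - lo) for v in data]

print(scaled)
# [0.0, 0.0101..., 0.0101..., 0.0202..., 0.0202..., 0.0303..., 1.0]
# The outlier is still an outlier and the distribution keeps its
# shape; only the range is now [0, 1].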