Normalisation vs Standardisation

Nikhil Verma
4 min read · Jul 8, 2021

Any machine learning algorithm generally involves a few components: an optimisation procedure, a cost function, a modelling technique and, most importantly, a dataset to learn from. It is often said that an ML algorithm performs only as well as the data it is fed.

Most of the time spent in the "knowledge discovery from data" pipeline goes into data collection, cleaning and pre-processing. Preprocessing can involve several techniques, such as data transformation, handling redundant data or detecting outliers. Anomalies like these can cause our model to underperform, so there are a couple of questions we should answer before applying a model to any dataset (a quick sketch of how to check them follows the list):

  1. Can the model handle missing values?
  2. What will be the effect of outliers in the dataset?
  3. Is feature scaling required before training the ML model?
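
These checks are easy to automate before any modelling starts. Below is a minimal sketch using pandas; the DataFrame, column names and values are made up here purely for illustration.

```python
import pandas as pd

# Hypothetical flat-listing data; in practice this would be read from a file.
df = pd.DataFrame({
    "rooms": [2, 3, None, 4],
    "area":  [450.0, 700.0, 150.0, 9800.0],  # 9800 looks like an outlier
})

# 1. Can the model handle missing values? Count them per column first.
print(df.isna().sum())

# 2. What is the effect of outliers? Flag values far outside the IQR.
q1, q3 = df["area"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["area"] < q1 - 1.5 * iqr) | (df["area"] > q3 + 1.5 * iqr)]
print(outliers)

# 3. Is feature scaling required? Compare the ranges of the features.
print(df.describe().loc[["min", "max"]])
```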

The point of concern for this post is feature scaling of the dataset.

Feature Scaling

Let's take one of the simplest use cases of ML, linear regression, to predict

Price of flat ~ f(#rooms, #toilets, area of flat, …)

We can note that the feature #rooms will fall in the range (0, 10], while a feature like area of flat will fall in the range [100, 1000], and when creating a model using LR…
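
The post is cut off above, but the two scaling techniques named in the title can be sketched on exactly this kind of data. A minimal example, assuming scikit-learn's MinMaxScaler and StandardScaler; the feature values are invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical flat-pricing features: [#rooms, area of flat]
# #rooms lies roughly in (0, 10], while area lies roughly in [100, 1000],
# so the two columns are on very different scales.
X = np.array([
    [2, 450.0],
    [3, 700.0],
    [1, 150.0],
    [4, 980.0],
])

# Normalisation (min-max scaling): rescales each feature to [0, 1]
#   x' = (x - min) / (max - min)
X_norm = MinMaxScaler().fit_transform(X)

# Standardisation (z-score scaling): zero mean, unit variance per feature
#   x' = (x - mean) / std
X_std = StandardScaler().fit_transform(X)

print("Normalised:\n", X_norm)
print("Standardised:\n", X_std)
```

After either transform, #rooms and area of flat contribute on comparable scales, so a distance- or gradient-based model is not dominated by the feature with the larger raw range.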
