Outlier detection techniques

Nikhil Verma
5 min read · Jul 14, 2021

Patterns! Patterns! Patterns!

That’s what we care about when running a data mining pipeline, which helps find patterns in the collected dataset. But are all patterns interesting? Well, not really. Interesting patterns are the ones that exhibit all or some of the properties below:

  • Easily Understood
  • Valid on new data
  • Potentially useful
  • Novel

But finding patterns is not always a cakewalk. We often encounter data points that lie visibly far from the rest of the collected data, and such points fall under the category of outliers.

An outlier is a data point that lies far from most other observations in a dataset, outside the overall distribution of the data. Outlier detection matters in many applications, including novelty detection and anomaly detection.

In this post we are going to discuss various techniques to detect outliers in data:

  • Z-score
  • Interquartile Range (IQR)
  • Elliptic Envelope | Robust Covariance
  • One-class SVM
  • Isolation forest
  • Local Outlier Factor

Z-score
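
As a rough illustration of the idea, a z-score measures how many standard deviations a point sits from the mean, and points beyond some cutoff are flagged as outliers. The sketch below is a minimal example under stated assumptions: the function name `zscore_outliers`, the sample data, and the cutoff of 3 (a common rule of thumb, not something prescribed here) are all hypothetical choices made for illustration.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds `threshold`.

    The threshold of 3.0 is an assumed rule of thumb, not a fixed rule.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs((v - mean) / std) > threshold]


# Hypothetical sample: fifteen readings near 10-13 and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 10, 11, 13, 12, 95]
outliers = zscore_outliers(data)
print(outliers)
```

Note that the mean and standard deviation are themselves pulled toward extreme values, so on very small samples a single outlier can inflate the standard deviation enough to hide itself below the cutoff.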


Written by Nikhil Verma

Knowledge shared is knowledge squared | My Portfolio https://lihkinverma.github.io/portfolio/ | My blogs are living documents, updated as I receive comments
