Kurtosis and Skewness

The normal distribution, with its bell-shaped curve, is one of the most commonly encountered distributions in nature.

But real data rarely follows a perfect normal distribution. Outliers can distort the curve. The horizontal distortion (asymmetry) of a normal distribution curve is captured by the Skewness measure, and the vertical distortion, which in practice reflects how heavy the tails are, is captured by the Kurtosis measure.

Kurtosis

Kurtosis describes the shape of a probability distribution, in particular how heavy its tails are compared with those of a normal distribution. The kurtosis of any univariate normal distribution is 3.
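As a rough illustration of the value 3 quoted above, here is a minimal sketch that computes kurtosis as the fourth central moment divided by the squared variance, using only the Python standard library. The helper name kurtosis, the seed, and the sample size are assumptions made for this example.

```python
import random
import statistics


def kurtosis(values):
    """Population kurtosis: fourth central moment divided by variance squared."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2


# Fixed seed and sample size are arbitrary choices for reproducibility.
rng = random.Random(0)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
normal_kurtosis = kurtosis(normal_sample)

# A large normal sample should give a value close to 3.
print(f"kurtosis of a normal sample: {normal_kurtosis:.2f}")
```

Note that some libraries report excess kurtosis (kurtosis minus 3) instead. For example, scipy.stats.kurtosis does so by default, and passing fisher=False gives the convention used here.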

Distributions with kurtosis less than 3 are said to be platykurtic. This means the distribution produces fewer and less extreme outliers than the normal distribution does. An outlier is a data point that differs significantly from other observations. A platykurtic distribution has thinner tails than a normal distribution, resulting in fewer extreme positive or negative events.

Distributions with kurtosis greater than 3 are said to be leptokurtic. Leptokurtic distributions have fatter tails, which increases the chances of rare, extreme positive or negative events. They produce more outliers than the normal distribution. Examples of leptokurtic distributions are the Laplace distribution, Student’s t-distribution, and the exponential distribution.
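The platykurtic and leptokurtic cases can be compared with a small simulation. The sketch below is only an illustration under stated assumptions: the uniform distribution stands in for a platykurtic example (it is not named in the text above), the Laplace sample is built as the difference of two exponential draws, and the helper name, seed, and sample size are arbitrary.

```python
import random
import statistics


def kurtosis(values):
    """Population kurtosis: fourth central moment divided by variance squared."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2


rng = random.Random(42)  # arbitrary seed for reproducibility
n = 100_000

uniform_sample = [rng.uniform(-1.0, 1.0) for _ in range(n)]
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
# The difference of two independent Exp(1) draws follows a Laplace distribution.
laplace_sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]

results = {
    "uniform (platykurtic)": kurtosis(uniform_sample),
    "normal": kurtosis(normal_sample),
    "laplace (leptokurtic)": kurtosis(laplace_sample),
}

for name, value in results.items():
    print(f"{name:>22}: {value:.2f}")
```

The theoretical values are 1.8 for the uniform distribution, 3 for the normal, and 6 for the Laplace, so the sample estimates should fall in that order.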

Skewness

Skewness measures the lack of symmetry of a curve. The bell curve of the normal distribution is symmetrical about a vertical line through its mean. A curve is positively skewed when its tail is stretched out to the right and negatively skewed when its tail is stretched out to the left. Positively skewed distributions have longer and fatter tails on the right, while negatively skewed distributions have longer and fatter tails on the left.

Positively Skewed Data: typically Mean > Median

Negatively Skewed Data: typically Mean < Median

For example, the income distribution in India is positively skewed. A stock with negative skewness generates frequent small gains and a few large losses over the period considered. Conversely, a stock with positive skewness generates frequent small losses and a few large gains.
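The sign of the skewness and the mean and median relationship described above can be checked on simulated data. This is a minimal sketch, assuming a log-normal sample as a stand-in for right-skewed data; the helper name skewness, the seed, and the distribution parameters are illustrative choices, not taken from any real dataset.

```python
import random
import statistics


def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(7)  # arbitrary seed for reproducibility

# Log-normal draws have a long right tail (positive skew).
right_skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(50_000)]
# Mirroring the data flips the tail to the left (negative skew).
left_skewed = [-x for x in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(
        f"{name}: skewness={skewness(data):.2f}, "
        f"mean={statistics.fmean(data):.2f}, median={statistics.median(data):.2f}"
    )
```

For the right-skewed sample the skewness is positive and the mean exceeds the median; for the mirrored sample both relationships flip.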

Why is it used in Machine Learning?

Skewness tells us how the data is distributed and in which direction the outliers, the observations that lie at an abnormal distance from the other data points, are located. It is very important during Exploratory Data Analysis, feature extraction, and feature selection. To deal with skewness, we use transform techniques such as the Power Transform, Log Transform, and Exponential Transform to bring positively and negatively skewed distributions closer to a normal distribution.
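As an illustration of the log transform mentioned above, the following sketch applies it to a simulated right-skewed feature and compares the skewness before and after. The feature name incomes, the log-normal parameters, and the seed are hypothetical. The log transform requires strictly positive values, which holds for this simulated data.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(1)  # arbitrary seed for reproducibility

# Hypothetical right-skewed feature, simulated from a log-normal distribution.
incomes = [rng.lognormvariate(10.0, 0.8) for _ in range(50_000)]

# Log transform: compresses the long right tail (values must be > 0).
log_incomes = [math.log(x) for x in incomes]

skew_before = skewness(incomes)
skew_after = skewness(log_incomes)
print(f"skewness before log transform: {skew_before:.2f}")
print(f"skewness after log transform:  {skew_after:.2f}")
```

Because the simulated data is log-normal, the transformed values are approximately normal and the skewness drops close to zero. Real features will usually improve less cleanly, so it is worth checking the skewness again after transforming.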