Measuring Asymmetry with Skewness

Iliya Valchanov 19 Apr 2023 4 min read

Skewness is a statistical term worth adding to your vocabulary. It is the most commonly used measure of asymmetry in a data set. To understand it fully, however, you need basic knowledge of the measures of central tendency: the mean, the median and the mode.

Now, let’s focus on skewness. The formula used to calculate it is shown below.

Sample skewness formula
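
The figure with the formula is not reproduced in this text. As a reference point, and as an assumption rather than the exact expression from the figure, one widely used moment-based definition of sample skewness is:

```latex
g_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{3}}
           {\left(\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}\right)^{3/2}}
```

Here n is the number of observations and x̄ is the sample mean. Other conventions, such as dividing by n - 1 in the denominator or applying a small-sample correction, rescale the value but do not change its sign.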

Most likely, you will use software that performs the calculation for you. So, if you're looking for a shortcut to speed up your work, you can rely on our Skewness Calculator.

In this tutorial, however, we will not get into the computation, but rather the meaning of skewness.
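
For readers who are curious what such software does under the hood, here is a minimal sketch in plain Python. It assumes the moment-based definition (third central moment divided by the second central moment raised to the power 1.5); the function name `sample_skewness` is our own, and statistical packages may apply a different normalization.

```python
from statistics import fmean


def sample_skewness(values):
    """Moment-based sample skewness: m3 / m2 ** 1.5 (an assumed convention)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    center = fmean(values)
    # Second and third central moments, both divided by n.
    m2 = sum((x - center) ** 2 for x in values) / n
    m3 = sum((x - center) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Illustrative inputs only, not the data sets from the article.
print(sample_skewness([1, 2, 3, 4, 5]))   # symmetric data
print(sample_skewness([1, 1, 1, 10]))     # one large outlier on the right
print(sample_skewness([1, 10, 10, 10]))   # one small outlier on the left
```

A positive result points to a right skew, a negative result to a left skew, and a result of zero to a symmetrical data set.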

What's the Meaning of Skewness?

Skewness indicates whether the observations in a data set are concentrated on one side. The concept can be confusing at first, so an example will be useful.

In the picture below, we have three data sets and their respective frequency distributions. We have also calculated the means, medians and modes.

Means, medians and modes of the three data sets

Positive Skew

The first data set has a mean of 2.79 and a median of 2. Hence, the mean is bigger than the median. We say that this is a positive or right skew. From the graph below, you can clearly see that the data points are concentrated on the left side.

Positive skew

Important: The direction of the skew is counterintuitive. It does not depend on which side the peak of the distribution leans toward, but on which side its tail extends to. So, right skewness means that the outliers are to the right.

Outliers to the right, skewness

It is interesting to see the measures of central tendency incorporated in the graph. As a rule of thumb, you can tell that we have right skewness when:

  • the mean is bigger than the median
  • the mode, which is the highest point of the graph, sits on the left, with the tail stretching out to its right.
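
To make these checks concrete, here is a short sketch using Python's built-in `statistics` module. The data set is hypothetical and chosen only to be right-skewed; it is not the one from the picture.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, a few are large.
data = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7]

data_mean = mean(data)      # 3
data_median = median(data)  # 2.5
data_mode = mode(data)      # 2

print(f"mean={data_mean}, median={data_median}, mode={data_mode}")

# Rule of thumb for a right skew: mode < median < mean.
is_right_skewed = data_mode < data_median < data_mean
print("right skew suggested:", is_right_skewed)
```

The few large values pull the mean to the right, while the median and the mode stay close to where most observations sit.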

Measures of central tendency

Zero Skew

In the second graph, we have plotted a data set whose mean, median and mode are all equal. The frequency distribution is completely symmetrical, and we call this a zero or no skew.

Zero or no skew

Most often, you will hear people say that the distribution is symmetrical.
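
A quick way to see why symmetry means zero skew is to look at the cubed deviations from the mean: in a symmetrical data set they cancel out. The sketch below uses a hypothetical data set, not the one from the graph.

```python
from statistics import mean, median, mode

# Hypothetical symmetrical data: values mirror each other around 3.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]

center = mean(symmetric)
print(center, median(symmetric), mode(symmetric))  # all three coincide

# Third central moment: the average cubed deviation from the mean.
# Negative and positive cubes cancel exactly for symmetrical data.
third_moment = sum((x - center) ** 3 for x in symmetric) / len(symmetric)
print(third_moment)
```

Because the numerator of the skewness formula is this third central moment, the skewness itself is zero as well.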

Negative Skew

For the third data set, we have a mean of 4.9, a median of 5 and a mode of 6. As the mean is lower than the median, we say that there is a negative or left skew.

Negative or left skew

Once again, the highest point is defined by the mode. And as you might guess, it is called a left skew, because the outliers are to the left.
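
Left skew is the mirror image of right skew. As a small illustration, the sketch below reflects a hypothetical right-skewed data set and shows that the sign of the third central moment flips. The helper name `third_central_moment` and the data are our own assumptions, not part of the original example.

```python
from statistics import fmean, median


def third_central_moment(values):
    """Average cubed deviation from the mean; its sign gives the skew direction."""
    center = fmean(values)
    return sum((x - center) ** 3 for x in values) / len(values)


# Hypothetical right-skewed data and its mirror image.
right = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7]
left = [max(right) + min(right) - x for x in right]

print(third_central_moment(right))  # positive: tail to the right
print(third_central_moment(left))   # negative: tail to the left

# In the mirrored data the mean falls below the median.
print(fmean(left), median(left))
```

The outliers that sat on the right now sit on the left, so they drag the mean below the median.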

Why Skewness Is Important

Skewness tells us a lot about where the data is situated.

Where the data is situated

In fact, the mean, median and mode should be used together to get a good understanding of the dataset. Measures of asymmetry like skewness are the link between central tendency measures and probability theory. This ultimately allows us to get a more complete understanding of the data we are working with.

This is how we measure asymmetry. But can we measure something else, such as variability? Find out in our next tutorial.

***

Next Tutorial: Measures of Variability: Variance, Standard Deviation and Coefficient of Variation

Iliya Valchanov

Co-founder of 365 Data Science

Iliya is a finance graduate with a strong quantitative background who chose the exciting path of a startup entrepreneur. He demonstrated a formidable affinity for numbers during his childhood, winning more than 90 national and international awards and competitions through the years. Iliya started teaching at university, helping other students learn statistics and econometrics. Inspired by his first happy students, he co-founded 365 Data Science to continue spreading knowledge. He authored several of the program’s online courses in mathematics, statistics, machine learning, and deep learning.