Skewness Example

Iliya Valchanov 26 Apr 2023 2 min read

The most commonly used tool to measure asymmetry is skewness.

Almost always, you will use software or a skewness calculator that performs the computation for you, so in this lesson, we will not get into the computation, but rather the meaning of skewness.
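For readers curious what such software does under the hood, here is a minimal sketch of one common definition, the moment-based coefficient (third central moment divided by the second central moment raised to the power 1.5). The article does not say which formula it has in mind, so both the definition and the `sample` values are assumptions made for illustration.

```python
from statistics import fmean


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (one common definition, assumed here)."""
    n = len(values)
    mean = fmean(values)
    # Second and third central moments (population form, dividing by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Hypothetical sample, not taken from the article: most values are small,
# with one large value stretching the tail to the right.
sample = [1, 1, 2, 2, 2, 3, 4, 7]
print(skewness(sample))  # a positive number indicates right skew
```

A positive result points to a right skew, a negative one to a left skew, and a value near zero to a roughly symmetrical distribution. Statistical packages may apply a small-sample correction, so their output can differ slightly from this sketch.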

So, skewness indicates whether the observations in a data set are concentrated on one side.

Skewness can be confusing at the beginning, so an example is in order.

Remember frequency distribution tables from previous articles? Here we have three data sets and their respective frequency distributions. We have also calculated the means, medians, and modes.

The first data set has a mean of 2.79 and a median of 2, so the mean is bigger than the median. We say that this is a positive or right skew. From the graph, you can clearly see that the data points are concentrated on the left side. Note that the direction of the skew is counterintuitive. It does not depend on the side the line leans toward, but on the side its tail points to. So, right skewness means that the outliers are to the right.

It is interesting to see the measures of central tendency incorporated in the graph.

When we have right skewness, the mean is bigger than the median, and the mode is the value at the highest point of the graph.
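The relationship between the three measures under right skewness can be checked with a short sketch using Python's standard `statistics` module. The values in `right_skewed` are hypothetical and are not the article's first data set, which is not reproduced in the text.

```python
from statistics import fmean, median, mode

# Hypothetical right-skewed data: most observations are low,
# and a single large value (9) forms the tail on the right.
right_skewed = [1, 1, 2, 2, 2, 2, 3, 4, 5, 9]

data_mean = fmean(right_skewed)
data_median = median(right_skewed)
data_mode = mode(right_skewed)

print("mean:", data_mean)
print("median:", data_median)
print("mode:", data_mode)

# The tail pulls the mean above the median, as described for a right skew.
print("mean > median:", data_mean > data_median)
```

Removing the value 9 from the list brings the mean much closer to the median, which shows how strongly a few outliers in the tail influence the mean.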

In the second graph, we have plotted a data set that has an equal mean, median and mode. The frequency of occurrence is completely symmetrical and we call this a zero or no skew. Most often, you will hear people say that the distribution is symmetrical.

For the third data set, we have a mean of 4.9, a median of 5 and a mode of 6. As the mean is lower than the median, we say that there is a negative or left skew. Once again, the highest point is defined by the mode. Why is it called a left skew, again? That’s right, because the outliers are to the left.
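The mean-versus-median comparison used across the three cases can be written as a small helper. This is only a sketch of the rule of thumb from this example, not a rigorous test for skewness. The function name `skew_direction` and both data sets are hypothetical: `left_skewed` was constructed to reproduce the summary figures quoted above (mean 4.9, median 5, mode 6) and is not the article's actual data.

```python
from math import isclose
from statistics import fmean, median, mode


def skew_direction(values):
    """Label the skew by comparing mean and median (rule of thumb only)."""
    data_mean = fmean(values)
    data_median = median(values)
    if isclose(data_mean, data_median):
        return "zero (symmetrical)"
    return "positive (right)" if data_mean > data_median else "negative (left)"


# Hypothetical data built to match the quoted figures: mean 4.9, median 5, mode 6.
left_skewed = [3, 3, 4, 4, 5, 5, 6, 6, 6, 7]
# Hypothetical symmetrical data with equal mean, median, and mode.
symmetrical = [1, 2, 2, 3, 3, 3, 4, 4, 5]

for name, data in [("left_skewed", left_skewed), ("symmetrical", symmetrical)]:
    print(name, fmean(data), median(data), mode(data), skew_direction(data))
```

Comparing the mean with the median is a quick check that works for simple, single-peaked data like this. For irregular distributions it can disagree with a computed skewness coefficient.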

Alright. So, why is skewness important? Skewness tells us a lot about where the data is concentrated. As we mentioned in our previous lesson, the mean, median and mode should be used together to get a good understanding of the dataset. Measures of asymmetry like skewness are the link between central tendency measures and probability theory, which ultimately gives us a more complete understanding of the data we are working with.

Hope that our skewness example came in handy. Curious to learn more? Jump to our statistics tutorials or check out our Statistics course.


Iliya Valchanov

Co-founder of 365 Data Science

Iliya is a finance graduate with a strong quantitative background who chose the exciting path of a startup entrepreneur. He demonstrated a formidable affinity for numbers during his childhood, winning more than 90 national and international awards and competitions through the years. Iliya started teaching at university, helping other students learn statistics and econometrics. Inspired by his first happy students, he co-founded 365 Data Science to continue spreading knowledge. He authored several of the program’s online courses in mathematics, statistics, machine learning, and deep learning.
