The Normal Distribution Definition
You have surely seen a normal distribution before, as it is the most common one. The statistical term for it is the Gaussian distribution, but many people call it the Bell Curve because it is shaped like a bell. Here is a simple normal distribution definition: it is a symmetrical, bell-shaped distribution whose mean, median and mode are equal. If you remember the lesson about skewness, you will recognize that it has no skew! It is perfectly centered around its mean.
Alright. So, it is denoted in this way: X ~ N(μ, σ²). X is the variable, the tilde sign reads "is distributed as", N stands for normal, and in brackets we have the mean μ and the variance σ² of the distribution. On the graph, you can notice that the highest point is located at the mean, because it coincides with the mode. The spread of the graph is determined by the standard deviation σ, which is the square root of the variance.
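If you would like to check these properties yourself, here is a minimal sketch using NormalDist from Python's standard statistics module. The mean of 10 and standard deviation of 2 are hypothetical values picked only for illustration; they do not come from the lesson.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration.
mu, sigma = 10.0, 2.0
dist = NormalDist(mu=mu, sigma=sigma)

# The notation lists the variance, which is the standard deviation squared.
variance = dist.variance

# Mean, median and mode coincide for a normal distribution.
center = (dist.mean, dist.median, dist.mode)

# The density is highest at the mean and identical at equal distances on either side.
peak = dist.pdf(mu)
left = dist.pdf(mu - 3.0)
right = dist.pdf(mu + 3.0)

print("variance:", variance)
print("mean, median, mode:", center)
print("density at the mean:", peak)
print("density 3 units left and right of the mean:", left, right)
```

Note that NormalDist takes the standard deviation as its second argument, while the notation in brackets uses the variance, so it is worth keeping an eye on which one you are passing.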
Now, you are equipped with the normal distribution definition, but let’s try to understand the normal distribution a little bit better.
Let’s look at this approximately normally distributed histogram. There is a concentration of the observations around the mean, which makes sense as it is equal to the mode. Moreover, it is symmetrical on both sides of the mean.
We used 80 observations to create this histogram. Its mean is 743 and its standard deviation is 140.
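We do not have the original 80 observations, so the sketch below simulates a comparable sample instead. It assumes the data are drawn from a normal distribution with a mean of 743 and a standard deviation of 140; the seed and the bin width of 100 are arbitrary choices made for illustration. Because the sample is random, its mean and standard deviation will be close to, but not exactly, 743 and 140.

```python
from collections import Counter
from statistics import NormalDist, mean, stdev

# Simulate 80 observations; the seed is arbitrary and only makes the run repeatable.
observations = NormalDist(mu=743, sigma=140).samples(80, seed=1)

sample_mean = mean(observations)
sample_sd = stdev(observations)

# Group the observations into bins 100 units wide, keyed by each bin's lower edge.
bin_width = 100
bins = Counter(int(x // bin_width) * bin_width for x in observations)

# Print a rough text histogram: one '#' per observation in the bin.
for lower_edge in sorted(bins):
    print(f"{lower_edge:>5} | {'#' * bins[lower_edge]}")

print(f"sample mean = {sample_mean:.1f}, sample standard deviation = {sample_sd:.1f}")
```

With only 80 observations the histogram will look approximately, not perfectly, bell-shaped, which is the same situation as in the lesson.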
But what if the mean is smaller or bigger?
Let’s first zoom out a bit by adding the origin of the graph. The origin is the zero point. Adding it to any graph gives perspective. Keeping the standard deviation fixed, or in statistical jargon, controlling for the standard deviation, a lower mean would result in the same shape of the distribution, but shifted to the left. In the same way, a bigger mean would move the graph to the right. In our example, this results in two new distributions: one with a mean of 470 and a standard deviation of 140, and one with a mean of 960 and a standard deviation of 140.
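To see that changing the mean only moves the curve, here is a small sketch that compares the three distributions from the example. It assumes each one is exactly normal and measures two things that describe the shape: the height of the peak and the width of the interval holding the middle 95% of the distribution.

```python
from statistics import NormalDist

sigma = 140                 # standard deviation held fixed
means = [470, 743, 960]     # the three means from the example

peak_heights = []
central_widths = []
for mu in means:
    dist = NormalDist(mu=mu, sigma=sigma)
    # Height of the curve at its own mean (the highest point).
    peak_heights.append(dist.pdf(mu))
    # Width of the interval that holds the middle 95% of the distribution.
    central_widths.append(dist.inv_cdf(0.975) - dist.inv_cdf(0.025))

for mu, peak, width in zip(means, peak_heights, central_widths):
    print(f"mean = {mu}: peak height = {peak:.5f}, middle-95% width = {width:.1f}")
```

The peak height and the width come out the same for all three means, so the only thing that changes is where the curve sits on the horizontal axis.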
Alright, let’s do the opposite. Controlling for the mean, we can change the standard deviation and see what happens. This time the graph is not moving but is rather reshaping. A lower standard deviation results in a lower dispersion, so more data in the middle and thinner tails. On the other hand, a higher standard deviation will cause the graph to flatten out, with fewer points in the middle and more toward the ends, or in statistical jargon, fatter tails.
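The reshaping can be put into numbers with a short sketch. The mean stays at 743, and we compare three standard deviations: 140 is the value from the example, while 70 and 210 are illustrative values assumed here. The cut-offs of 100 and 300 units from the mean are also arbitrary choices.

```python
from statistics import NormalDist

mu = 743                    # mean held fixed
sigmas = [70, 140, 210]     # 140 is from the example; 70 and 210 are illustrative

results = []
for sigma in sigmas:
    dist = NormalDist(mu=mu, sigma=sigma)
    # Height of the curve at the mean.
    peak = dist.pdf(mu)
    # Probability of landing within 100 units of the mean.
    middle = dist.cdf(mu + 100) - dist.cdf(mu - 100)
    # Probability of landing more than 300 units away from the mean (both tails).
    tails = dist.cdf(mu - 300) + (1 - dist.cdf(mu + 300))
    results.append((sigma, peak, middle, tails))
    print(f"sd = {sigma}: peak = {peak:.5f}, within 100 = {middle:.3f}, beyond 300 = {tails:.5f}")
```

As the standard deviation grows, the peak gets lower, the share of the distribution near the mean shrinks, and the share far from the mean grows.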