Why do we use n-1 for samples instead of n?

Whenever you calculate a sample statistic using another statistic that was itself estimated from the same sample, you lose one degree of freedom.

The sample mean is an estimate of the population mean. To determine the sample variance, you need to use the sample mean. So the sample variance is a second sample statistic, calculated using the first one (the sample mean). Therefore, you lose one degree of freedom when you calculate the sample variance. Hence the formula divides by (n-1) and not n.
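
To see what that lost degree of freedom does in practice, here is a minimal sketch. It assumes a made-up population (the six faces of a die) and lists every possible sample of size 2 drawn with replacement. The population, the sample size and the helper names are illustrative assumptions, not part of the original explanation.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, chosen only for illustration: faces of a die.
population = [1, 2, 3, 4, 5, 6]
n = 2  # sample size (assumed)

def mean(values):
    # Exact arithmetic, so the comparison below is not blurred by rounding.
    return Fraction(sum(values)) / len(values)

def sum_sq_dev(values):
    # Sum of squared deviations from the mean of `values` itself.
    m = mean(values)
    return sum((Fraction(v) - m) ** 2 for v in values)

# True population variance: divide by N, because the population mean is known.
pop_var = sum_sq_dev(population) / len(population)

# Every equally likely sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Average of the two candidate formulas over all possible samples.
avg_div_n = mean([sum_sq_dev(s) / n for s in samples])
avg_div_n_minus_1 = mean([sum_sq_dev(s) / (n - 1) for s in samples])

print("population variance:    ", pop_var)
print("average when using n:   ", avg_div_n)
print("average when using n-1: ", avg_div_n_minus_1)
```

In this toy setup, dividing by n comes out too small on average, while dividing by n-1 matches the population variance on average.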

That’s statistical talk. Another way to explain this: assume you have four numbers (a, b, c and d) that must add up to a total of m. You are free to choose the first three numbers at random, but the fourth must be chosen so that it makes the total equal to m. Therefore you have 3 degrees of freedom.
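
The four-numbers idea can be sketched in a few lines. The total m and the three freely chosen values below are arbitrary assumptions for illustration. The second half shows the same constraint appearing in a sample: deviations from the sample mean always sum to zero, so the last deviation is fixed by the others.

```python
m = 24                # hypothetical required total
a, b, c = 2, 9, 4     # three numbers chosen freely
d = m - (a + b + c)   # the fourth is forced by the total

sample = [a, b, c, d]
print("forced fourth number:", d)

# Same constraint inside a sample: once the sample mean is fixed,
# the deviations from it must sum to zero.
sample_mean = sum(sample) / len(sample)
deviations = [x - sample_mean for x in sample]

# The last deviation can be recovered from the first three.
last_deviation = -sum(deviations[:-1])
print("deviations:", deviations)
print("last deviation implied by the others:", last_deviation)
```

Only three of the four deviations carry independent information, which is why the sum of squared deviations is divided by 3 (that is, n-1) here.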

Pretty soon you will encounter the pooled variance of two samples, where the degrees of freedom become n1+n2-2; you can imagine why once you take a look at the formula for pooled variance.
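
As a rough sketch of that formula, the snippet below uses the usual textbook definition of pooled variance, ((n1-1)*s1^2 + (n2-1)*s2^2) / (n1+n2-2). The function name and the two small data sets are made up for illustration. Each sample uses its own sample mean, so each gives up one degree of freedom, which is where the -2 comes from.

```python
from statistics import variance  # sample variance, divides by n-1

def pooled_variance(sample1, sample2):
    # Hypothetical helper: both samples need at least two values.
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)  # uses the mean of sample1: n1 - 1 degrees of freedom
    s2_sq = variance(sample2)  # uses the mean of sample2: n2 - 1 degrees of freedom
    # Weight each variance by its degrees of freedom, then divide by the total.
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Made-up data for illustration.
group_a = [2, 4, 6]
group_b = [1, 3, 5, 7]

result = pooled_variance(group_a, group_b)
print("pooled variance:", result)
```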

