
# N-1 for samples


Why do we use n-1 for samples instead of n?

365 Team

Hello again, Anis!

Whenever you calculate a sample statistic that relies on another statistic estimated from the same sample, you lose one degree of freedom.

The sample mean is an estimate of the population mean. To determine the sample variance, you need the sample mean, because the deviations are measured from it. The sample variance is therefore a second sample statistic calculated from the first one (the sample mean), so you lose one degree of freedom when you calculate it. Hence the formula divides by (n-1) and not n.
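If you want to see this effect in numbers, here is a small simulation sketch. It is only an illustration: the population (normal, mean 50, standard deviation 10), the sample size of 5, and the helper name `variance_with_divisor` are all made up for this example and are not from the course.

```python
import random
import statistics


def variance_with_divisor(sample, lost_df):
    """Sum of squared deviations from the sample mean, divided by (n - lost_df)."""
    n = len(sample)
    mean = sum(sample) / n
    squared_deviations = sum((x - mean) ** 2 for x in sample)
    return squared_deviations / (n - lost_df)


random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population; its variance is the "true" value we try to estimate.
population = [random.gauss(50, 10) for _ in range(100_000)]
true_variance = statistics.pvariance(population)

n = 5
trials = 20_000
avg_div_n = 0.0          # average estimate when dividing by n
avg_div_n_minus_1 = 0.0  # average estimate when dividing by n - 1

for _ in range(trials):
    sample = random.sample(population, n)
    avg_div_n += variance_with_divisor(sample, 0) / trials
    avg_div_n_minus_1 += variance_with_divisor(sample, 1) / trials

print("population variance:      ", round(true_variance, 2))
print("average when dividing by n:  ", round(avg_div_n, 2))
print("average when dividing by n-1:", round(avg_div_n_minus_1, 2))
```

Averaged over many samples, dividing by n should come out noticeably below the population variance, while dividing by n-1 should land close to it.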

That's the statistical talk. Another way to explain it: assume you have four numbers (a, b, c and d) that must add up to a total of m. You are free to choose the first three numbers at random, but the fourth must be chosen so that it makes the total equal to m. Your degrees of freedom are therefore 3.
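The same constraint shows up in the variance calculation: the deviations from the sample mean always add up to zero, so once you know all but one of them, the last is fixed. A tiny sketch with made-up numbers:

```python
# Hypothetical sample of four values, chosen only for illustration.
sample = [4.0, 7.0, 9.0, 12.0]

mean = sum(sample) / len(sample)
deviations = [x - mean for x in sample]

# The deviations must sum to zero, so the last one is determined by the others.
last_from_others = -sum(deviations[:-1])

print("deviations:", deviations)
print("sum of deviations:", sum(deviations))
print("last deviation recovered from the first three:", last_from_others)
```

Only three of the four deviations are free to vary, which is why the sum of squared deviations is divided by 3 (that is, n-1) here.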

Pretty soon you will encounter the pooled variance of two samples, where the degrees of freedom become n1+n2-2; you can imagine why once you take a look at the formula for pooled variance.
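For reference, here is a minimal sketch of the usual textbook pooled variance, assuming that is the version the course uses. Each sample's variance is weighted by its own degrees of freedom (n1-1 and n2-1), and the weights add up to n1+n2-2. The function name `pooled_variance` and the two samples are hypothetical.

```python
import statistics


def pooled_variance(sample_1, sample_2):
    """Weighted average of two sample variances, weighted by degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    var_1 = statistics.variance(sample_1)  # divides by n1 - 1
    var_2 = statistics.variance(sample_2)  # divides by n2 - 1
    # One degree of freedom is lost per sample mean, hence n1 + n2 - 2.
    return ((n1 - 1) * var_1 + (n2 - 1) * var_2) / (n1 + n2 - 2)


# Made-up samples for illustration only.
a = [1, 2, 3]
b = [2, 4, 6, 8]
print(pooled_variance(a, b))
```

Two sample means are estimated along the way, one per sample, so two degrees of freedom are lost in total.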

Best,
365 Team
