Do we use the z-score or the t-score when we have a large sample with unknown population variance?

Question

Answer 1

Hi Harold, Thanks for reaching out. When the population variance is unknown, we normally use the T-statistic. However, beyond roughly 100 degrees of freedom (that is, when the sample is bigger than approximately 100), the T-table and the Z-table practically coincide, so it doesn't really matter which one you use. The T distribution accounts for the extra uncertainty that comes from estimating the variance from the sample, which is why it has heavier tails than the Normal distribution when the sample is small. As the sample grows, that extra uncertainty shrinks, so it is no surprise that with more than 100 observations the T distribution is almost indistinguishable from the Normal distribution. Best, The 365 Team
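To make the "tables coincide" point concrete, here is a minimal sketch that computes two-sided 95% critical values for a few degrees of freedom and compares them with the Z value. It uses only the Python standard library; the helper names `t_pdf`, `t_cdf` and `t_critical` are made up for this illustration, and the quantile is found by simple numerical integration plus bisection rather than by any library routine.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """CDF via Simpson's rule on [0, |x|], using symmetry around zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Two-sided 95% interval -> upper 97.5% quantile.
z = NormalDist().inv_cdf(0.975)
for df in (5, 30, 120, 1000):
    t = t_critical(0.975, df)
    print(f"df={df:>4}  t={t:.3f}  z={z:.3f}  difference={t - z:.3f}")
```

Standard tables list roughly 2.571, 2.042 and 1.980 for df = 5, 30 and 120, against 1.960 for Z, so the difference is already down to about 0.02 at df = 120.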

Answer 2

Hi Harold.
The T distribution is the recommended choice when the population variance is unknown. I won't get into the 'why' behind it because it is too math-heavy to quote here, but if you fancy a look you can check this explanation out.
As for large data sets:
[Image: Z table vs. T table]
If your sample is really large, and so the degrees of freedom (df) are large too, the Z table and the T table will be practically identical. As you can see, for smaller values of df the curve is much flatter, with a lower peak and heavier tails, whereas as df approaches infinity the curve resembles the normal (Gaussian) distribution more and more. The T distribution was designed specifically for normally distributed data with small samples, usually n < 30, where the variance has to be estimated from the sample. But we don't even have to go all the way to infinity: right around df = 120 the t distribution curve and the z distribution curve are nearly indistinguishable.
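If you want to check the shape of the curves numerically, the rough sketch below compares the t density with the standard normal density on a grid. The function names (`t_pdf`, `normal_pdf`, `max_gap`) and the grid range of -6 to 6 are choices made for this illustration only.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def max_gap(df, lo=-6.0, hi=6.0, points=1201):
    """Largest absolute difference between the two densities on a grid."""
    step = (hi - lo) / (points - 1)
    xs = (lo + i * step for i in range(points))
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in xs)


print(f"normal peak height: {normal_pdf(0.0):.4f}")
for df in (2, 5, 30, 120):
    print(f"df={df:>3}  t peak height={t_pdf(0.0, df):.4f}  "
          f"max gap vs normal={max_gap(df):.4f}")
```

The peak of the t curve sits below the normal peak for every df and climbs toward it as df grows, while the largest gap between the two curves shrinks toward zero.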
Hope this helps!
