Resolved: Population variance known vs. unknown
How do we know whether the population variance is known or unknown without being told? So far, the lesson titles have told us which case applies. Excel has functions for both sample variance and population variance, so with either function we should be able to find the variance, right? If that's true, then logically the variance will always be known when using those functions. We would just have to know whether the data were sample or population data in order to use the correct variance function.
In both the z-score and t-score lessons we calculated the Mean, St Dev, and SE. The z-score lesson had a larger dataset, but so far that seems to be the biggest difference between the two lessons. I understand the z-score lesson needed a known population variance so you could teach it, but I don't know what that variance is. Since the t-score lesson also had the Mean, St Dev, and SE, the information required seems identical, yet the t-score lesson has an unknown population variance.
The distinction between knowing and not knowing the population variance is often more of a practical matter, reflecting how the data was collected and what is known about the underlying population, rather than a mathematical distinction that can be determined from the data itself. Here's an attempt to clarify:
Known Population Variance (z-score):
-Sometimes, the population variance might be known from prior research, theoretical considerations, or because you have data from the entire population.
-If the population variance is known, you can use the standard normal (z) distribution to conduct hypothesis tests or create confidence intervals.
-In practice, truly knowing the population variance is rare. The examples you see in textbooks or lessons are often contrived to teach the principles.
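To make the known-variance case concrete, here is a minimal Python sketch of a z-based confidence interval. The sample values and the population standard deviation of 15 are made up for illustration; the key point is that `population_sd` is supplied from outside the sample, not calculated from it. The function name `z_confidence_interval` is my own, not something from the lessons.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_confidence_interval(sample, population_sd, confidence=0.95):
    """Confidence interval for the mean when the population SD is given."""
    n = len(sample)
    x_bar = mean(sample)
    # The standard error uses the KNOWN population SD, not the sample SD.
    se = population_sd / sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return x_bar - z_crit * se, x_bar + z_crit * se


# Hypothetical data: ten scores from a test whose SD is assumed known (15).
sample = [102, 98, 105, 110, 95, 101, 99, 104, 97, 103]
low, high = z_confidence_interval(sample, population_sd=15)
print(f"95% z-interval: ({low:.2f}, {high:.2f})")
```

Notice that nothing in the sample itself tells you the value 15. It has to come from prior research or from knowledge of the whole population.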
Unknown Population Variance (t-score):
-More commonly, you don't know the population variance and must estimate it from a sample.
-In these cases, you use the t-distribution, which accounts for the additional uncertainty arising from estimating the population variance from the sample.
-This is more typical in real-world scenarios where you are working with a sample of data and you don't have access to information about the entire population.
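For comparison, here is a rough sketch of the unknown-variance case on the same made-up sample. The standard deviation is now estimated from the data with `statistics.stdev`, which uses the n-1 denominator like Excel's STDEV.S. Python's standard library has no t-distribution, so the critical values below are copied from a standard t-table (two-sided, 95%) and cover only a few degrees of freedom; the table dictionary and the function name are my own assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values from a standard t-table, keyed by
# degrees of freedom (n - 1). Only a few entries, for illustration.
T_CRIT_95 = {4: 2.776, 9: 2.262, 29: 2.045}


def t_confidence_interval_95(sample):
    """95% confidence interval for the mean when the SD is estimated."""
    n = len(sample)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError(f"no table entry for {df} degrees of freedom")
    x_bar = mean(sample)
    # Sample SD (n - 1 denominator): an ESTIMATE of the population SD.
    s = stdev(sample)
    se = s / sqrt(n)
    t_crit = T_CRIT_95[df]
    return x_bar - t_crit * se, x_bar + t_crit * se


# Same hypothetical data as a sample of ten, with no outside knowledge of the SD.
sample = [102, 98, 105, 110, 95, 101, 99, 104, 97, 103]
t_low, t_high = t_confidence_interval_95(sample)
print(f"95% t-interval: ({t_low:.2f}, {t_high:.2f})")
```

The Mean, St Dev, and SE steps look just like the z-score lesson. What changes is where the standard deviation comes from and which distribution supplies the critical value.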
So, to me, it's always more practical to use the t-score in real-life situations, even though the Mean, St Dev, and SE calculations are the same and only the critical value changes [unless you know for sure that you have a known population variance associated with your data].
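As a quick illustration of how much the choice matters, this small sketch compares the 95% z critical value with a few t critical values copied from a standard t-table. The helper name `widening_percent` is hypothetical, and the table has only three entries.

```python
from statistics import NormalDist

# Two-sided 95% z critical value (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)

# Two-sided 95% t critical values from a standard t-table, keyed by df.
T_CRIT_95 = {4: 2.776, 9: 2.262, 29: 2.045}


def widening_percent(t_crit):
    """How much wider a t-interval is than a z-interval with the same SE."""
    return (t_crit / z_crit - 1) * 100


for df, t_crit in sorted(T_CRIT_95.items()):
    print(f"n = {df + 1:2d}: t = {t_crit:.3f}, z = {z_crit:.3f}, "
          f"t-interval is {widening_percent(t_crit):.1f}% wider")
```

The gap shrinks as the sample grows, so with large samples the two approaches give nearly the same answer, and using the t-score costs very little.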