The F-distribution looks right skewed. Am I reading that right? Also, if it is right skewed, could you further explain how, if at all, skewness affects regression and hypothesis tests? Thanks.

Hi Chitra,

The F-statistic in a regression is a comparison between our regression model and **a model that has no independent variables**.

Imagine a regression with no independent variables (so you don’t have Xs): just Y = some constant, say Y = 5. Individual predictions may occasionally land close to the actual values, but such a model explains none of the variability in Y. If you actually run a regression like this, the only coefficient will be the intercept, and it will be equal to **the mean**!

So Y = the mean.

If you check the formula for SSR (the sum of squares due to regression), you will realize that if the prediction is always the mean,

then SSR = 0 (no explained variability).
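Here is a minimal sketch of that intercept-only model, using made-up numbers:

```python
# Sketch: an "intercept-only" regression on toy data.
# With no Xs, the least-squares prediction is simply the mean of Y,
# so the explained sum of squares (SSR) is exactly 0.
y = [3.0, 5.0, 4.0, 6.0, 2.0]

y_mean = sum(y) / len(y)          # the only "coefficient": intercept = mean
predictions = [y_mean] * len(y)   # every prediction is the mean

# SSR = sum of squared differences between predictions and the mean
ssr = sum((p - y_mean) ** 2 for p in predictions)
print(y_mean)  # 4.0
print(ssr)     # 0.0 -- no explained variability
```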

***

Now, the F-stat is a ratio.

More precisely, it is the ratio of the variability **our model** explains (SSR, per parameter) to the variability it leaves unexplained (SSE, per remaining degree of freedom). **The model with no Xs** explains nothing at all (SSR = 0), so it is the benchmark.

So what the F-stat shows us is:

*How much better is our current model than a model that has no explanatory power whatsoever?*

We would usually see F-stats above 50, or 300, or 2,000. And those are all normal.

However, an F-stat of 2 would imply that our model is only about twice as good as saying: the answer is always the mean. We want something much more dramatic than 2, right?

And that’s what the F-stat captures.
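That ratio can be sketched in plain Python on toy data (the numbers and the single X variable are made up purely for illustration):

```python
# Sketch: building the F-statistic for a simple regression by hand.
# F = (SSR / k) / (SSE / (n - k - 1)): explained variability per
# parameter, relative to unexplained variability per remaining
# degree of freedom.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.0, 6.2, 7.9, 10.1]

n, k = len(y), 1                  # n observations, k = 1 independent variable
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for one X
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained

f_stat = (ssr / k) / (sse / (n - k - 1))
print(round(f_stat, 1))  # a very large F: the model vastly outperforms
                         # "always predict the mean"
```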

***

The F-stat follows the F-distribution, which is:

1) always non-negative (for regressions, it is the ratio of two **sums of squares**, so it is 0 or above)

2) right skewed: it is bounded below by 0 but unbounded above, and most of its mass sits near small values, so the tail stretches to the right

**Why does the F-stat follow the F-distribution (and is therefore right-skewed, too)?**

Under the null hypothesis, the F-stat comes out small most of the time (2, 3, 4, 5, 6), meaning our model is not much better than the one with no explanatory power. Very large values are possible but rare, and that imbalance is exactly the long right tail.
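You can see that pile-up of small values by simulation. Under the null, an F-stat is a ratio of two scaled sums of squared standard normals (chi-squares); the degrees of freedom below are arbitrary choices for illustration:

```python
# Sketch: simulating F(3, 30) draws from scratch to show the right skew.
# Most draws land below ~3, while a thin tail stretches far to the right.
import random

random.seed(42)

def f_draw(d1=3, d2=30):
    """One draw from an F(d1, d2) distribution, built from normals."""
    chi1 = sum(random.gauss(0, 1) ** 2 for _ in range(d1))
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(d2))
    return (chi1 / d1) / (chi2 / d2)

draws = [f_draw() for _ in range(10_000)]
below_3 = sum(d < 3 for d in draws) / len(draws)
print(f"share of draws below 3: {below_3:.2f}")  # the bulk of the mass
print(f"largest draw: {max(draws):.1f}")         # the long right tail
```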

Usually, there is a critical value, which says: okay, the cut-off line is, say, 3. If the F-stat is above it, the model has some merit. The exact cut-off depends on the significance level and the degrees of freedom.

Of course, there is an F-table you can consult for that!
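If you prefer code to a printed table, SciPy can look the critical value up for you (the significance level and degrees of freedom below are illustrative, not from the course):

```python
# Sketch: reading the "F-table" programmatically with SciPy.
# f.ppf gives the critical value for a chosen significance level
# and degrees of freedom.
from scipy.stats import f

alpha = 0.05      # 5% significance level
d1, d2 = 3, 30    # numerator / denominator degrees of freedom

critical = f.ppf(1 - alpha, d1, d2)
print(round(critical, 2))  # ~2.92: F-stats above this reject the null
```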

Hope this helps!

Best,

The 365 Team