P-value calculation with two opposite Null hypothesis

Super Learner

I’ve asked this question before, but I didn’t get an answer, so I’ll try to clarify my problem. In the hypothesis testing section, in the video I linked below, there is an example with a 40% email open rate. We want to know whether our competitor’s open rate is higher than ours, so we set up the following hypotheses: H0: Mean(or) <= 40% and H1: Mean(or) > 40%. We then get a T-score of -0.53 and a critical value of 1.83, so we accept H0. So far, so good.
What if we wanted to know whether our competitor’s open rate is lower than ours? Then H0: Mean(or) >= 40% and H1: Mean(or) < 40%. H0 is the opposite, but we have the exact same dataset: the same sample mean, standard deviation, standard error, T-score and critical value. Again we accept H0. This is a paradox to me.
In both cases we accept H0, in spite of the fact that the two null hypotheses are the exact opposite of each other. So my question is the following: what am I missing here? How does the calculation change if we ask the opposite question, and therefore have the opposite H0?
Thank you in advance :).

1 Answer

365 Team

Hi Gergely,
In both cases we say we fail to reject the null.
This may sound strange to you, however, it means that “whatever we were trying to test – we failed”. The evidence is not strong enough for us to claim that one hypothesis is correct and the other is wrong.
The reason is that the data are consistent with an open rate of exactly 40% (or very close to it). As such, it is very hard for us to “prove” that it is higher or lower than 40%. Statistically speaking, this sample does not allow us to make either claim.
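This often happens when the variance is large or the sample size is small.
As for the calculation: the T-score stays the same, but the rejection region flips. In the first test we reject H0 only if the T-score is above 1.83; in the second test we reject H0 only if it is below -1.83. A T-score of -0.53 satisfies neither condition, so in both cases we fail to reject.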
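To make the two calculations concrete, here is a minimal Python sketch that computes the p-value for each version of the test. It is an illustration under stated assumptions, not the course's own code: the degrees of freedom (9) and the significance level (0.05) are not given in the question and are inferred from the critical value of 1.83, which matches a one-sided t critical value for those settings. The helper names `t_pdf` and `t_cdf` are made up for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry around zero."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

t_score = -0.53   # from the question
critical = 1.83   # from the question
df = 9            # ASSUMPTION: inferred from the critical value, not stated
alpha = 0.05      # ASSUMPTION: inferred from the critical value, not stated

# Test 1: H0: mean <= 40%, H1: mean > 40%  (right-tailed)
p_right = 1 - t_cdf(t_score, df)
reject_right = t_score > critical

# Test 2: H0: mean >= 40%, H1: mean < 40%  (left-tailed)
p_left = t_cdf(t_score, df)
reject_left = t_score < -critical

print(f"right-tailed: p = {p_right:.3f}, reject H0: {reject_right}")
print(f"left-tailed:  p = {p_left:.3f}, reject H0: {reject_left}")
```

Under these assumptions the two p-values add up to 1 (about 0.7 for the right-tailed test and about 0.3 for the left-tailed one), and both are well above 0.05. So the calculation does change when the question is reversed, but the sample is not extreme enough in either direction to reject the null.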
Hope this helps!