Properly including p values with sklearn using real estate data produces unexpected results
Hello,
I am doing the multiple linear regression exercise in the Machine Learning with Python module: https://learn.365datascience.com/courses/machine-learning-in-python/multiple-linear-regression
The exercise uses the sklearn package and the provided Real Estate dataset, which contains price, size and year variables. As well as the univariate p values, I thought I would also calculate the p values properly with sklearn, copying the example given earlier in the course (which used the SATS dataset): https://learn.365datascience.com/courses/machine-learning-in-python/a-note-on-calculation-of-p-values-with-sklearn
Unfortunately, the "proper" p values come out as zero, while the univariate ones come out as I would expect:
reg_with_pvalues.p gives
array([0., 0.])
but the univariate p values come out as
p_values.round(3)
array([0. , 0.357])
The other "proper" regression values (intercept, coefficients and R-squared) come out as expected and match the given solution, but the p values do not.
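To make the comparison concrete, here is a minimal sketch of one thing that may be worth checking. It is not the course code: the data are synthetic, `coef_p_values` is a hypothetical helper, and it assumes the "proper" calculation derives standard errors from inv(X.T @ X). Under that assumption, leaving the intercept column out of the design matrix shrinks the standard errors when a feature such as year has a large mean, which pushes the p values toward zero. Whether that explains the zeros here is an assumption, not something confirmed.

```python
import numpy as np
from scipy import stats
from sklearn.feature_selection import f_regression
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the real estate data; all values are made up.
rng = np.random.default_rng(0)
n = 100
size = rng.uniform(400, 1500, n)
year = rng.choice([2006, 2009, 2012, 2015, 2018], n).astype(float)
price = 100_000 + 220 * size + rng.normal(0, 50_000, n)  # year has no true effect

X = np.column_stack([size, year])
y = price
reg = LinearRegression().fit(X, y)


def coef_p_values(X, y, reg, include_intercept):
    """Two-sided t-test p-values for the slope coefficients of a fitted model."""
    n_obs, n_feat = X.shape
    dof = n_obs - n_feat - 1  # the intercept is estimated as well
    resid = y - reg.predict(X)
    sigma2 = resid @ resid / dof  # residual variance estimate
    # Design matrix with or without a leading column of ones.
    design = np.column_stack([np.ones(n_obs), X]) if include_intercept else X
    cov = sigma2 * np.linalg.inv(design.T @ design)
    se = np.sqrt(np.diag(cov))[-n_feat:]  # keep only the slope entries
    t = reg.coef_ / se
    return 2 * stats.t.sf(np.abs(t), dof)


p_no_intercept = coef_p_values(X, y, reg, include_intercept=False)
p_with_intercept = coef_p_values(X, y, reg, include_intercept=True)
p_univariate = f_regression(X, y)[1]  # one simple regression per feature

print("no intercept column:  ", p_no_intercept.round(3))
print("with intercept column:", p_with_intercept.round(3))
print("univariate:           ", p_univariate.round(3))
```

If the p values change noticeably between the first two lines on the real data, the design matrix is the likely place to look. Standardizing the features before fitting is another check, since centered features make the two variants agree much more closely.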
Could you give me some advice on troubleshooting this issue?
Thanks!