The lecture example included the following code:
# Add a constant. Essentially, we are adding a new column (equal in length to x), which consists only of 1s
x = sm.add_constant(x1)
# Fit the model, according to the OLS (ordinary least squares) method with a dependent variable y and an independent x
results = sm.OLS(y,x).fit()
# Print a nice summary of the regression. That’s one of the strong points of statsmodels -> the summaries
results.summary()
I don’t understand what sm.add_constant does. Also, why are we fitting x instead of x1 in statsmodels?
Hi Kam,
Good question! It is explained in the subsequent lectures!
If you still have this question after watching them through the end of the section, please come back here and we will dive deeper into it 🙂
Best,
The 365 Team
Hi, I finished the linear regression section and am now working on the multiple regression section. I sort of understand that sm.add_constant adds beta 0, yet I still don’t understand why the syntax is sm.OLS(y,x).fit(), that is, why we are fitting x instead of x1. If we assigned our data to x1, don’t we want statsmodels to compute the regression with all the data? All I can see in x in the provided notebook is the ones. Can you elaborate a bit more?
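Hi Kam, in statsmodels, x = sm.add_constant(x1) means that we are *ADDING A COLUMN OF 1S* to x1. Therefore, x contains both x1 and a column of 1s, so none of your data is lost. sm.OLS does not include an intercept on its own, which is why you want to fit x rather than x1. The 1s here are the ‘data’ associated with the constant (intercept) of the regression: the coefficient estimated for that column is beta 0.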
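To make the role of the column of 1s concrete, here is a minimal sketch in plain Python. It does not call statsmodels; the helper names (add_ones_column, fit_with_constant, fit_without_constant) and the toy data are made up for illustration, and the fit is done by solving the least-squares equations by hand. It only mimics the idea behind sm.add_constant and sm.OLS for a single regressor.

```python
# Illustration only: hypothetical helpers, not the statsmodels API.

def add_ones_column(values):
    """Turn each observation x_i into a row [1.0, x_i]."""
    return [[1.0, float(v)] for v in values]

def fit_with_constant(y, rows):
    """Least squares on rows [1, x_i] via the 2x2 normal equations."""
    s00 = sum(r[0] * r[0] for r in rows)  # sum of 1*1 = n
    s01 = sum(r[0] * r[1] for r in rows)  # sum of x
    s11 = sum(r[1] * r[1] for r in rows)  # sum of x^2
    t0 = sum(r[0] * yi for r, yi in zip(rows, y))  # sum of y
    t1 = sum(r[1] * yi for r, yi in zip(rows, y))  # sum of x*y
    det = s00 * s11 - s01 * s01
    beta0 = (s11 * t0 - s01 * t1) / det  # coefficient of the 1s column
    beta1 = (s00 * t1 - s01 * t0) / det  # coefficient of x1
    return beta0, beta1

def fit_without_constant(y, values):
    """Least squares with no 1s column: the line is forced through the origin."""
    return sum(v * yi for v, yi in zip(values, y)) / sum(v * v for v in values)

# Toy data generated from y = 3 + 2 * x1 (made up for this example)
x1 = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 7.0, 9.0, 11.0]

x = add_ones_column(x1)  # each row keeps x1's value next to a 1
beta0, beta1 = fit_with_constant(y, x)
slope_only = fit_without_constant(y, x1)

print(x)             # rows of the form [1.0, x_i]
print(beta0, beta1)  # intercept and slope recovered from the data
print(slope_only)    # slope when no intercept is allowed
```

With the 1s column, the fit recovers both the intercept and the slope used to generate the data. Without it, there is no intercept to estimate, so the slope is distorted to compensate. That is the reason for fitting x rather than x1.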