Question on simple linear regression example.


The lecture example uses the following code:
    # Add a constant. Essentially, we are adding a new column (equal in length to x), which consists only of 1s
    x = sm.add_constant(x1)
    # Fit the model, according to the OLS (ordinary least squares) method with a dependent variable y and an independent x
    results = sm.OLS(y,x).fit()
    # Print a nice summary of the regression. That’s one of the strong points of statsmodels -> the summaries
    results.summary()
I don’t understand what sm.add_constant does. Why are we fitting x instead of x1 in statsmodels?
 

1 Answer

365 Team

Hi Kam,
Good question! It is explained in the subsequent lectures!
If you still have this question after watching them through to the end of the section, please come back here and we will dive deeper into it 🙂
Best,
The 365 Team

Hi, I finished the linear regression section and am now working on the multiple regression section. I sort of understand that sm.add_constant adds beta 0, yet I still don’t understand why the syntax is sm.OLS(y,x).fit(), where we fit x instead of x1. If we assigned our data to x1, don’t we want statsmodels to compute the regression with all the data? All I can see in x, in the notebook provided, is the ones. Can you elaborate a bit more?

6 months

Hi Kam, in statsmodels, x = sm.add_constant(x1) means that we are *ADDING A COLUMN OF 1S* to x1. Therefore, x contains both x1 and a column of 1s, and this is the one you want to fit. The 1s are the ‘data’ associated with the constant (intercept) of the regression. sm.OLS does not add an intercept on its own, so fitting x1 alone would estimate a model with no constant.

6 months