The 365 Data Science team is proud to invite you to our own community forum: a well-built system for your queries and questions, and a chance to show your knowledge and help others on their path to becoming Data Science specialists.

How to scatter plot the real estate document with 2 independent variables (size, year)

0
Votes
8
Answer

From Machine Learning – Multiple Linear Regression – Using Python 3, how can I create a scatter plot given 2 independent variables (year, size)?  And what would the linear regression formula be?

8 Answers

Top Answer

365 Team

Hi Casey,
You know that if you have a single predictor, then you’d have a 2D plot, right? There’s 1 independent X and 1 dependent Y (say ‘size’ and ‘price’ in that case).
To plot 2 independent variables, you will need yet another dimension, so you will have a 3D plot. To achieve that you can use the mplot3d toolkit (which ships with matplotlib). Here’s the documentation: https://matplotlib.org/mpl_toolkits/mplot3d/
So, assuming that your 2 independent variables are ‘size’ and ‘year’, while the dependent is ‘price’, for the specific exercise you are referring to, you can use code along the lines of:

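import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d, Axes3D  # registers the "3d" projection on older matplotlib versions
fig = plt.figure()
ax = plt.axes(projection="3d")
# x = size, y = year, z = price
ax.scatter3D(data['size'], data['year'], data['price'], cmap='hsv')
plt.show()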

This results in a 3D scatter plot:

Now what if you want to plot a regression line?
First, it is important to note that the regression is a line only in 2D. With 2 independent variables, in 3D, it is a plane.
To find that plane with statsmodels (and within the given example), we can write:

results.params[0] + results.params[1]*data['size'] + results.params[2]*data['year']

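Note that here, results.params contains the coefficients of the regression. There are 3 of them: the constant, the coefficient for size, and the coefficient for year. So the fitted regression is the expression above. Finally, we can incorporate that into the graph using a 3D line. (plot3D joins the fitted values in row order, so the result is a line lying in the regression plane rather than a filled surface.)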

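import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d, Axes3D  # registers the "3d" projection on older matplotlib versions
fig = plt.figure()
ax = plt.axes(projection="3d")
x_line = data['size']
y_line = data['year']
# fitted price: constant + coef. for size * size + coef. for year * year
z_line = results.params[0] + results.params[1]*data['size'] + results.params[2]*data['year']
# fitted values, joined by a line
ax.plot3D(x_line, y_line, z_line, 'orange')
# observed data points
ax.scatter3D(data['size'], data['year'], data['price'], cmap='hsv')
plt.show()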

And what you’ll see is:

Best,
The 365 Team  
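
The code above draws the fitted values with plot3D, which joins them with a line. If a filled plane is wanted instead, one option is matplotlib's plot_surface over a grid of size and year values. The following is only a minimal sketch under that assumption: the helper names plane_grid and plot_plane are hypothetical and not part of the course code, and b0, b1, b2 stand for the constant and the coefficients for size and year.

```python
import numpy as np
import matplotlib.pyplot as plt


def plane_grid(b0, b1, b2, x, y, n=20):
    """Evaluate z = b0 + b1*x + b2*y on an n-by-n grid spanning the range of x and y."""
    xs = np.linspace(np.min(x), np.max(x), n)
    ys = np.linspace(np.min(y), np.max(y), n)
    X, Y = np.meshgrid(xs, ys)
    return X, Y, b0 + b1 * X + b2 * Y


def plot_plane(ax, b0, b1, b2, x, y):
    """Draw the regression plane as a semi-transparent surface on a 3D axes."""
    X, Y, Z = plane_grid(b0, b1, b2, x, y)
    ax.plot_surface(X, Y, Z, color="orange", alpha=0.4)


# Possible usage with the `data` and `results` objects from the answer above:
# ax = plt.axes(projection="3d")
# ax.scatter3D(data['size'], data['year'], data['price'])
# b0, b1, b2 = results.params
# plot_plane(ax, b0, b1, b2, data['size'], data['year'])
# plt.show()
```

The semi-transparent surface keeps the scatter points visible on both sides of the plane.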

You didn’t define ‘results’

2 months

I see. Earlier homework assignments required defining ‘results’

2 months

I start a new document when my code gets overwhelming, so I started a new one today and needed to copy and paste some of the code into it.

2 months

Glad that you figured it out!

2 months

0
Votes

Ok, I am almost there, thank you, but the following code displayed the same looking plane with no dots.  What’s a scatter plot with no dots?!  Any ideas on why the dots are missing?

fig = plt.figure()
ax = plt.axes(projection="3d")
x_line = data['size']
y_line = data['year']
z_line = results.params[0] + results.params[1]*data['size'] + results.params[2]*data['year']
ax.plot3D(x_line, y_line, z_line, 'orange')
plt.show()

 

365 Team
0
Votes

Hi Casey,
The scatter itself is created with the code:

ax.scatter3D(data['size'], data['year'], data['price'], cmap='hsv')

I believe you are missing this bit. This scatter represents the points. The rest of the code (the one you’ve implemented) draws the regression line. To get only the scatter, please refer to the first code cell in the original answer:

fig = plt.figure()
ax = plt.axes(projection="3d")
ax.scatter3D(data['size'], data['year'], data['price'], cmap='hsv')
plt.show()
Hope this helps,
The 365 Team

0
Votes

I had to add this

from mpl_toolkits.mplot3d import axes3d, Axes3D

 

Oh.. completely forgot to reference this in the original answer. It has now been added!

2 months

0
Votes

So I tried this because the code following the # leads to an IndexError: Index is out of bounds. Can you explain why that is? Also, the code below yields a different plane than yours, though I don’t know how to post it.

fig = plt.figure()
ax = plt.axes(projection="3d")
x_line = data['size']
y_line = data['year']
z_line = data['price']
#z_line = results.params[0] + results.params[1]*data['size'] + results.params[2]*data['year']
ax.plot3D(x_line, y_line, z_line, 'orange')
ax.scatter3D(data['size'], data['year'], data['price'], cmap='hsv')
plt.show()


Hi Casey, could you please post a screenshot of what you see? You can upload it to imgur: https://imgur.com/ and then share the link here. My assumption is that you should restart the Kernel and Run All cells again.

2 months
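
One possible cause of the IndexError, offered only as a guess because the cell that produced `results` is not shown: if `results` came from an earlier model fitted on ‘size’ alone, it holds just 2 coefficients, so results.params[2] has nothing to return. A small check along these lines could confirm or rule that out. The helper name missing_coefficients and the example values are hypothetical.

```python
import pandas as pd


def missing_coefficients(params, expected=("const", "size", "year")):
    """Return the expected coefficient names that are absent from `params`."""
    return [name for name in expected if name not in params.index]


# Hypothetical params from a model fitted on 'size' only (made-up values)
simple_params = pd.Series({"const": 100000.0, "size": 220.0})
print(missing_coefficients(simple_params))  # ['year']
```

Calling missing_coefficients(results.params) in the notebook and getting an empty list would mean all 3 coefficients are present and the cause lies elsewhere.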

0
Votes

I don’t know, I’m having trouble capturing a screenshot of my work

0
Votes

I posted it to imgur.com but I don’t know how you’ll find it so I embedded it

[Embedded image: Multiple Linear Regression Scatter Plot]

0
Votes

Here’s the imgur link:
https://imgur.com/gallery/hCsr3eF

Hi Casey! Unfortunately, this screenshot does not show precisely the code we are interested in. Could you please show us the cell it relates to?

2 months

I thought you were interested in seeing the difference in our scatter plots and the screenshot depicts a different plot than yours. Can you clarify what the problem is?

2 months