Logistic Regression on purchase incidence vs average candy price

I am going through the Customer Analytics section. In this section, a logistic regression model is fitted with price as X and purchase incidence as Y.

The code provided by the course is as follows:
“”
from sklearn.linear_model import LogisticRegression
# We create a logistic regression model using scikit-learn, then fit it with X (price) and Y (incidence).
model_purchase = LogisticRegression(solver='sag')
model_purchase.fit(X, Y)
“”
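I tried a scatter plot of X against Y to see what happens. My code, continuing from the code above, is:
“”
import matplotlib.pyplot as plt
# Scatter plot of average price (X) against purchase incidence (Y).
f = plt.figure(figsize=(16, 8))
ax = f.add_subplot(111)
ax.scatter(X, Y)
“”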
However, the plot does not look like the sigmoid curve we usually associate with logistic regression (it looks like a pair of parallel lines).
I'm confused. Does my scatter plot make sense here? Should we expect to see a sigmoid-like scatter plot?

 

Many thanks!!

1 Answer

365 Team
Hi MinliYu, 
thanks for reaching out!
A scatter plot of X and Y shows each observation as a point with coordinates (X, Y) on a 2D plane. In our case:
Y is the incidence column, which is binary (it contains only 0s and 1s).
X is the average price.
So the pair of parallel lines you're seeing are the 0s and 1s of the Y variable: every point sits at either Y = 0 or Y = 1 and is spread along the X axis according to its price. A scatter plot of X and Y should look exactly like that.
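To see why the raw points can only form two rows, here is a minimal sketch on made-up data. The course dataset is not reproduced here, so `price`, `incidence`, and the assumed relationship between them are purely illustrative. It also averages the 0/1 outcome within price bins, which is one simple way to see a trend that the raw scatter hides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 1,000 shopping trips with an average price between 1.0 and 3.0.
price = rng.uniform(1.0, 3.0, size=1000)
# Assumed purchase probability that falls as price rises (illustrative only).
true_prob = 1.0 / (1.0 + np.exp(-(4.0 - 2.0 * price)))
incidence = (rng.uniform(size=price.size) < true_prob).astype(int)

# The outcome takes only two values, so a scatter plot of (price, incidence)
# can only place points on the rows y = 0 and y = 1.
distinct_values = np.unique(incidence)

# Averaging the 0/1 outcome inside five equal-width price bins gives an
# empirical purchase rate per bin.
bin_edges = np.linspace(1.0, 3.0, 6)
bin_index = np.digitize(price, bin_edges[1:-1])  # bin labels 0..4
bin_rates = np.array([incidence[bin_index == b].mean() for b in range(5)])

print(distinct_values)
print(bin_rates.round(2))
```

On this synthetic data the binned rates fall as price rises, even though each individual point is still just a 0 or a 1.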

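A sigmoid function, on the other hand, is something different. The sigmoid (logistic) function is what logistic regression uses to turn a linear combination of the inputs into an estimated class probability. The S-shaped curve therefore appears when you plot the model's predicted probabilities against X, not when you plot the raw 0/1 outcomes.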
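As a rough illustration of where the sigmoid enters, the sketch below fits scikit-learn's `LogisticRegression` on synthetic data and checks that `predict_proba` matches the sigmoid applied to the fitted intercept and coefficient. The data, the names `price`, `incidence`, and `price_grid`, and the use of the default solver (rather than the course's `'sag'`) are assumptions made for this example only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical data standing in for the course's price and incidence columns.
price = rng.uniform(1.0, 3.0, size=1000)
true_prob = 1.0 / (1.0 + np.exp(-(4.0 - 2.0 * price)))
incidence = (rng.uniform(size=price.size) < true_prob).astype(int)

# scikit-learn expects a 2D feature array, so reshape to one column.
X = price.reshape(-1, 1)
model = LogisticRegression()  # default solver, an assumption for this sketch
model.fit(X, incidence)

# Predicted probability of a purchase (class 1) over an evenly spaced price grid.
price_grid = np.linspace(1.0, 3.0, 50).reshape(-1, 1)
predicted = model.predict_proba(price_grid)[:, 1]

# The same probabilities computed by hand: sigmoid of the linear predictor.
z = model.intercept_[0] + model.coef_[0, 0] * price_grid[:, 0]
manual = 1.0 / (1.0 + np.exp(-z))

print(np.allclose(predicted, manual))
```

Drawing `predicted` against `price_grid` as a line (for example with matplotlib's `ax.plot`) on top of the original scatter would show the sigmoid-shaped curve running between the two rows of points.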
 
Best, 
Eli