Last answered:

21 Nov 2022

Posted on:

14 Nov 2022

# A Question About The stat_density2d() Function

We don't have any data in hdi-cpi.csv about population.  So how did we get
a density function - stat_density2d() -  in the graph?  What am I missing?

This is the code:
sp <- ggplot(hdi, aes(CPI.2015, HDI.2015))

sp + geom_point(aes(color = Region), shape = 21,
                fill = "white", size = 3, stroke = 2) +
  theme_light() +
  labs(x = "Corruption Perception Index, 2015",
       y = "Human Development Index, 2015",
       title = "Corruption And Human Development") +
  stat_density2d()

Thank you

3 answers (0 marked as helpful)
Instructor
Posted on:

14 Nov 2022

Hi T. Serghides,
thanks for reaching out and sorry for the inconvenience! You should be able to download the data set in question from this lecture: https://learn.365datascience.com/courses/introduction-to-r-programming/intro-to-ggplot2/
Let me know if you have any further issues.

Best,
365 Eli

Posted on:

19 Nov 2022

Elitsa, thank you for the reply.

>  You should be able to download the data set in question from this lecture...

I did download the data, and I can replicate the scatter plot with the contour lines (the density plot) using the density function stat_density2d() as follows:

library(ggplot2)

hdi <- read.csv("hdi-cpi.csv")

sp <- ggplot(hdi, aes(CPI.2015, HDI.2015))

sp + geom_point(aes(color = Region), shape = 21,
                fill = "white", size = 3, stroke = 2) +
  theme_light() +
  labs(x = "Corruption Perception Index, 2015",
       y = "Human Development Index, 2015",
       title = "Corruption And Human Development") +
  stat_density2d()

My question is this: Where is the data, in the csv file (hdi-cpi.csv), for the contour lines? What is the name of the variable (the column) for the contour lines? This, I don't understand.

Thank you again for the reply.

Instructor
Posted on:

21 Nov 2022

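Hi T. Serghides,
sorry for the confusion. The contour lines do not come from a column in hdi-cpi.csv. `stat_density2d()` is a ggplot2 function that estimates the two-dimensional density of the plotted points from the x and y variables already mapped in `aes()`, here `CPI.2015` and `HDI.2015`, and draws that estimate as contour lines. "Density" here means how tightly the data points cluster in a region of the plot, not population density. You can add it to any scatter plot you create with ggplot2. As mentioned in the video, that's a good option when you have so many data points that the scatter plot looks overcrowded. In that case, plotting the density of the data distribution gives a clearer overview.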
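To make this concrete, here is a minimal sketch. It uses a small made-up data frame called `toy` in place of hdi-cpi.csv (the values are random, and only the column names are borrowed from the thread). To my understanding, ggplot2 relies on `MASS::kde2d()` for this kind of estimate, so the sketch calls it directly to show that the density is computed from the two plotted columns alone.

```r
library(ggplot2)

# Made-up stand-in for hdi-cpi.csv (an assumption for illustration only):
# two numeric columns and no density column.
set.seed(1)
toy <- data.frame(CPI.2015 = runif(100, 10, 90))
toy$HDI.2015 <- 0.3 + 0.006 * toy$CPI.2015 + rnorm(100, sd = 0.05)

# Kernel density estimate computed directly from the x and y values
# on a 50 x 50 grid; dens$z holds the estimated density at each grid cell.
dens <- MASS::kde2d(toy$CPI.2015, toy$HDI.2015, n = 50)
dim(dens$z)

# The same idea inside a plot: the contours are computed when the plot is built.
p <- ggplot(toy, aes(CPI.2015, HDI.2015)) +
  geom_point() +
  stat_density2d()

# layer_data() returns the values ggplot2 computed for a layer;
# layer 2 is the contour layer added by stat_density2d().
contour_data <- layer_data(p, 2)
head(contour_data)
```

The rows in `contour_data` are the contour line coordinates that ggplot2 derived from the points, which is why no extra column is needed in the csv file.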
Hope this helps!

Best,
365 Eli