**Why GGPlot2 Scatter Plot?**

Data visualization is one of the most important steps in data analysis. It helps us gain insights that would be hard to extract from the data as pure numbers. One of the major advantages of visualizing data is that we can relay our findings to an audience, regardless of its members' technical expertise.

As for which environment we choose for data visualization, there is no shortage of available options: from high-level software such as Tableau and Power BI to the data visualization capabilities of data science programming languages such as Python and R.

Though overshadowed in recent years by Python, R remains one of the most important tools in data science-related fields, and especially in statistics-related professions. That is why we chose to dedicate this tutorial on charts to none other than the R programming language.

## And if you're using R for data visualization, you can't go far without the ggplot2 package.

ggplot2 is a data visualization package, created by Hadley Wickham in 2005. It relies on a concept known as the grammar of graphics, which is a set of rules for dividing each plot into components or layers.

### Moreover, ggplot2 is a high-level visualization library and is one of the most popular packages in R. It's considered a staple for any data scientist working in R.

But hold on, Python users, don't go anywhere! Due to its overall popularity, there is even a ggplot package available in Python. Here, it's important to note that ggplot is closely related to pandas. So, you might wish to store your data as a pandas data frame when using ggplot in Python.

This also means that once you've become a master of ggplot, you'll be able to conjure up plots in both R and Python, killing two birds with one stone. So, let's dive straight into ggplot and introduce the grammar of graphics!

## How To Make a GGPlot2 Scatter Plot in R: What is the Grammar of Graphics?

The "grammar of graphics" is the basis for how each ggplot is created. You can think of it as a way of dividing each plot into layers, where each layer is responsible for a specific element of the chart. There are seven layers we can use when creating a ggplot.

So, below, I'll give you an overview of each of the seven layers that constitute the "grammar of graphics". Then, to put what you've learned into practice, I'll show you how to create your very own scatter plot using ggplot2.

First, to the ggplot2 layers: The first three layers are mandatory, while the remaining four are optional.

## How To Make a GGPlot2 Scatter Plot in R: Mandatory Layers

### GGPlot2 Data Layer

Let's start with the first layer, also known as **the data layer**. Creating a chart, naturally, means we require some data. Otherwise, our chart wouldn't have anything to display. Therefore, it is only logical that data is the first and most important layer.

### GGPlot2 Aesthetics Layer

Next, we must decide how the data will be visually organized onto different axes. This is where the second layer comes in: it is called **the aesthetics** layer. Here we specify the mapping to the x and y axes.

### GGPlot2 Geometry Layer

The last required layer is **the geometry layer**. Through geometry, we specify what shape our data will take. In other words, will the data be drawn as bars, lines, points, and so on?

These **three** layers alone are all it takes to create a chart in R. The remaining four layers are optional. However, if you'd like to be proficient at visualizing data in R, it pays off to master them, too!

#### An important distinction between the mandatory and optional layers concerns the order in which they appear.

For the first three layers, we must start with data, continue with aesthetics and, as a third step, determine the geometries.
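As a rough illustration of the three mandatory layers in that order, here is a minimal sketch. It assumes the ggplot2 package is installed and uses the `mtcars` data set that ships with R purely as a stand-in; the column choices are arbitrary.

```r
library(ggplot2)

# mtcars is a built-in R data set, used here only as placeholder data.
p <- ggplot(data = mtcars,                     # 1. data layer
            mapping = aes(x = wt, y = mpg)) +  # 2. aesthetics layer
  geom_point()                                 # 3. geometry layer

print(p)  # draws the scatter plot
```

Each layer is joined to the previous one with `+`, which is how ggplot2 composes a plot from its parts.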

The remaining four layers don't need to appear in strict order. In fact, we don't have to include any of them in our plot, hence the optional part.

In this tutorial, we list the optional layers in a specific order simply for convenience. But it is **not** necessarily the order in which they should appear in a ggplot.

## How To Make a GGPlot2 Scatter Plot in R: Optional Layers

### GGPlot2 Facets Layer

With that in mind, let's continue with the fourth layer: the "Facets" layer. The "Facets" layer enables us to split our visualization into subplots, according to a categorical variable or variables. Each subplot then corresponds to one category (or combination of categories) of those variables. For instance, say we have an engine type variable in our data, containing diesel, petrol, and electric engine cars. Using the "Facets" layer would allow us to divide our data set into three plots, each containing one of the three categories:

- diesel,
- petrol,
- and electric cars.
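The engine type example could be sketched roughly as follows. The `cars_df` data frame, its column names, and all of its values are hypothetical and invented for illustration only; `facet_wrap()` is the ggplot2 function assumed here for splitting the plot.

```r
library(ggplot2)

# Hypothetical data: every value below is made up for illustration.
cars_df <- data.frame(
  engine_type = rep(c("diesel", "petrol", "electric"), each = 3),
  horsepower  = c(110, 150, 190, 95, 130, 200, 120, 204, 300),
  price       = c(22, 28, 35, 18, 24, 38, 30, 42, 60)  # in thousands
)

p_facets <- ggplot(cars_df, aes(x = horsepower, y = price)) +
  geom_point() +
  facet_wrap(~ engine_type)  # one subplot per engine type

print(p_facets)
```

With three distinct values in `engine_type`, the result is three side-by-side panels that share the same axes.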

### GGPlot2 Statistics Layer

The fifth layer is "Statistics". This layer represents the statistical transformations we might perform on our data. It can be used for various purposes, such as determining the number of bins when plotting a histogram or fitting a smoothed line, such as a regression line, through the data.
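Both uses mentioned above can be sketched in a few lines. This is a minimal example under the assumption that the built-in `mtcars` data set is an acceptable placeholder; the choice of a linear model and of 10 bins is arbitrary.

```r
library(ggplot2)

# Scatter plot with a fitted linear regression line on top.
p_smooth <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Histogram where the number of bins is set explicitly.
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)

print(p_smooth)
print(p_hist)
```

In both cases ggplot2 computes new values from the raw data (fitted values, bin counts) before drawing, which is what the statistics layer refers to.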

### GGPlot2 Coordinates Layer

The sixth layer is the "Coordinates" layer. The coordinates layer, as the name suggests, is connected to the coordinates or boundaries of our graphs. This layer can be used to zoom in or out of a plot. Apart from this, it can be used to perform transformations on the coordinate system, for instance, to switch to polar coordinates.
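A short sketch of both ideas, again assuming `mtcars` as placeholder data. The axis limits are arbitrary values chosen for illustration.

```r
library(ggplot2)

p_base <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Zoom in on a region of the plot without discarding any data.
p_zoom <- p_base +
  coord_cartesian(xlim = c(2, 4), ylim = c(15, 30))

# Switch a bar chart to polar coordinates.
p_polar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  coord_polar()

print(p_zoom)
print(p_polar)
```

`coord_cartesian()` only changes the visible window, so points outside the limits are still part of the plot data.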

### GGPlot2 Themes Layer

Finally, we arrive at the seventh layer: "Themes". The "Themes" layer has a sole purpose: to polish the appearance of our plot. This is the place that controls the overall style of our graph. For instance, we can choose from darker or lighter themes. Not only that, with the aid of the "Themes" layer we can use predefined templates or recreate the visuals and style of well-known publications such as the BBC or the Economist, among others.
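As a minimal sketch of how themes are applied, the example below uses a few of the predefined themes that come with ggplot2, with `mtcars` as placeholder data. The title text and the styling tweak are arbitrary.

```r
library(ggplot2)

p_base <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

p_light <- p_base + theme_minimal()  # a lighter predefined theme
p_dark  <- p_base + theme_dark()     # a darker predefined theme

# Individual elements can be adjusted on top of a predefined theme.
p_custom <- p_base +
  labs(title = "Weight vs. fuel efficiency") +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold"))

print(p_custom)
```

Publication-style themes are not part of ggplot2 itself; they come from third-party packages, for example `theme_economist()` in the ggthemes package.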

And that's all seven layers of a ggplot!

Alright, so far, this has all been theoretical. But we're ready to proceed with a practical example. Next up, you'll learn how to apply your newly gained knowledge to create your very own ggplot2 scatter plot.

## How to Create a Simple GGplot2 Scatter plot?

We'll be using ggplot2, one of the most popular visualization libraries in R.

To use a library in R, we need to load it in our script. We do this by calling `library()` with the name of the library inside the round brackets, in our case `ggplot2`.
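In code, loading the package looks like this. The commented-out installation line is only needed if ggplot2 is not yet installed on your machine.

```r
# install.packages("ggplot2")  # run once if the package is not installed yet
library(ggplot2)
```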

Next, you'll need some data. In this example, we're using a data set containing information on California real estate.

Of course, you can use your own data for plotting. Once you have it, you need the `ggplot()` function. For a complete plot, we require all three mandatory layers.

Nonetheless, as this is an introductory tutorial, let's examine what adding each individual layer gets us. We begin with:

- Data layer. Creating a ggplot with only the data layer results in an empty plot.

- Aesthetics. Here, we specify the x and y axes. This allows us to assign the variables to their corresponding axis.

- Lastly comes the geometry. For a scatter plot, the required geometry is geom_point, as each data entry is displayed as a point on our plot.
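The three steps above can be sketched as follows. The tutorial's California real estate data set is not reproduced here, so the `homes` data frame, its column names (`area`, `price`), and all of its values are hypothetical placeholders invented for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the real estate data; all values are invented.
homes <- data.frame(
  area  = c(60, 75, 90, 110, 130, 150),    # square meters
  price = c(210, 250, 300, 360, 410, 480)  # thousands of dollars
)

# Step 1: data layer only, which gives an empty plot.
p_data <- ggplot(data = homes)

# Step 2: add aesthetics, so the axes appear but nothing is drawn yet.
p_aes <- ggplot(data = homes, aes(x = area, y = price))

# Step 3: add the geometry, so each row becomes a point.
p_scatter <- p_aes + geom_point()

print(p_scatter)
```

Printing `p_data` and `p_aes` as well is a quick way to see what each layer contributes on its own.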

And that's all you need to make a ggplot2 scatter plot. For more awesome tips and tricks, you should most definitely check out the ggplot2 cheat sheet. It is a superb resource!

## Some final words on GGPlot2 Scatter Plot…

You already have the tools necessary to make a ggplot2 scatter plot. But this tutorial is only the beginning. R programming and the ggplot2 library offer a variety of options for visualizing and styling exceptional charts. We encourage you to explore the ggplot2 package on your own and try your hand at crafting ggplots. Bet you'll come up with some great ideas!
