Working with the LEFT JOIN in SQL

Join over 2 million students who advanced their careers with 365 Data Science. Learn from instructors who have worked at Meta, Spotify, Google, IKEA, Netflix, and Coca-Cola and master Python, SQL, Excel, machine learning, data analysis, AI fundamentals, and more.

Start for Free
Martin Ganchev Martin Ganchev 25 Apr 2023 12 min read

How does a LEFT JOIN in SQL work?

If you have any idea what JOINS are and you are familiar with the INNER JOIN type, learning how to use the LEFT JOIN in SQL should be a walk in the park. Moreover, we will be handling duplicate records, so make sure you know a thing or two about it. You can learn all this and more in our SQL course.

Side note: If you don’t have the ‘employees’ database, you can download it here.

A Venn Diagram

The Venn diagram you see in the picture below, allows you to visualize how a LEFT JOIN in SQL works.

Highlighted left side, left join in sql

Its output allows us to see all records from the table on the left side of the JOIN, including all matching rows of the two tables. That’s why, compared to the INNER JOIN, the result set, coloured in red, includes the rest of the area of the left table.

In SQL terms, this translates to retrieving all matching values of the two tables plus all values from the left table that do not match any values from the right table.

all matching values and all values on the left, left join in sql

To clarify the concept, let’s look at an application.

An Application of the LEFT JOIN in SQL

So, if you’ve read the tutorial on the duplicate records, there are a few things we want to delete. First, let’s remove the duplicates from the two tables. We can achieve that by writing the following code:

DELETE FROM dept_manager_dup
WHERE emp_no = ‘110228’;
 
DELETE FROM departments_dup
WHERE dept_no = ‘d009’;

If you are not sure how the DELETE statement works, the linked tutorial may well shed some light on the subject.

Duplicate values inflate our output with repeating records and do not contribute to the results we obtain meaningfully.

Technically, after deleting the rows in which we see the respective employee and department numbers, MySQL will delete all initial records and all their duplicates.

2 rows affected, left join in sql

How to Re-add the Initial Records?

In order to add them back, try coding:

INSERT INTO dept_manager_dup
VALUES (‘110228’, ‘d003’, ‘1992-03-21’, ‘9999-01-01’);
 
INSERT INTO departments_dup
VALUES (‘d009’, ‘Customer Service’);

The INSERT statement is discussed in the following tutorial.

After running the query, everything looks good.

1 row affected, left join in sql

How to Implement the LEFT JOIN in SQL?

We can write similar code to the one we would use with the INNER JOIN. However, the JOIN syntax will be subject to a slight change. Instead of relating the two tables through the keyword JOIN, or INNER JOIN, we must type LEFT JOIN. This is what the code would look like:

SELECT
    m.dept_no, m.emp_no, d.dept_name
FROM
    dept_manager_dup m
        LEFT JOIN
    departments_dup d ON m.dept_no = d.dept_no
GROUP BY m.emp_no
ORDER BY m.dept_no;

That’s all that needs to be modified in the code.

Left join

After executing, the query returned 26 rows, as shown in the picture below!

26 rows affected, left join in sql

The Extra Rows

6 rows more than the 20 rows we obtained in the example about INNER JOINS. Which are these 6 rows that make the difference?

26 rows left join 20 rows inner join

To start our analysis, please, refer to the Venn diagram. These 6 records represent the area filled with blue. This is the part of the left set where no values match with the right table.

All values that don't match values from the right table, left join in sql

All 6 records of interest are in the top part of the output, as you can see in the picture below.

26 rows left join 20 rows inner join, left join in sql

The First Rows

To begin with, all 4 records that appear in the “dept_manager_dup” table, but not in the “departments_dup” table, have been retrieved. They refer to employees with numbers 9999 0-4, 5, 6, and 7.

First four rows, left join in sql

Why were these 4 rows retrieved? Because they were registered in the table on the left, but not in the table on the right. There are no department number values that could match these records with the table on the right. Hence, they are an example of non-matching values from the left table.

Venn Diagram representing tables, left join in sql

The Last Rows

There are two more records we must discuss – the ones of department 2 from the left table, “dept_manager_dup”. They appear in the LEFT JOIN output. There are no such records in the “departments_dup” table.

2 rows with no corresponding values, left join in sql

There we had information for all departments except department 2. The two records appear in the final left-join list, but there are no corresponding values for the “dept_name” field that can come from the table on the right.

no d002, left join in sql

The Order Matters

Basically, this is a proof that, differently from an INNER JOIN, when working with a LEFT JOIN in SQL, the order in which you join tables matters. Having the “dept_manager_dup” table, M, or the “departments_dup” table, D, on the left, can change results completely.

How to Change the Order?

Let’s invert the order of the tables in the query. We will place the “departments_dup” table on the left and the “dept_manager_dup” table on the right.

“departments_dup” table on the left and the “dept_manager_dup” table on the right, left join in sql

Let’s see what will happen, after we write the following code:

SELECT
    m.dept_no, m.emp_no, d.dept_name
FROM
    departments_dup d
        LEFT JOIN
    dept_manager_dup m ON m.dept_no = d.dept_no
ORDER BY m.dept_no;

 

The output, left join in sql

The Results

A different output! This was due to swapping the direction of the two tables. So, now we will have all matching values from the two tables plus the department numbers from the “departments_dup” table that match no value from the “dept_manager_dup” table.

dept_no column highlighted, left join in sql

Therefore, we can expect to see records for departments 1, 10, and 11.

no records for for departments 1, 10, 11, left join in sql

But why don’t we see any of them?

Correcting the Mistake We Made

The answer lies in the way we’ve chosen the first selection in this query. After we inverted the order of the tables and executed the query, the output changed, but that is not so obvious. To re-organize it properly, please, retrieve the department number data from the “departments_dup” table, not from the “dept_manager_dup” table.

not from the “dept_manager_dup” table, left join in sql

The correct way to do that is to retrieve the first selection from the first table you have set in the join syntax.

What We Have Swapped

The picture is the opposite of the one the analysis just conducted. Referring to the Venn diagram, imagine that, with this query, we swapped the places of the two tables. That’s why, in the result section, we have only one null department number value, and we have data about department 1 that has no information about an employee number.

d001, left join in sql

The latter is a consequence of the fact that we have no data about department number 1 in the “dept_manager_dup” table.

And because the “departments_dup” table contains no information about the second department, there will be no record of it in the result set either.

no d002, left join in sql

The Rest of the Departments

Finally, as you scroll down, you’ll see information about all departments from 3 to 9 with corresponding employee numbers and department names. At the end, as you probably expected, departments 10 and 11 are quoted, with no information in the remaining two fields.

d010 and d011, left join in sql

Some Syntactical Peculiarities

Why are INNER JOINS called that? Because the values in the output are common to the two tables, and the result set is in the inner part of the Venn diagram; it is the area where both circles overlap.

Inner join result set

OUTER JOINS

So, OUTER JOINS must exist, too! The LEFT JOIN in SQL falls in this category since in the output obtained you have data from the outer part of the Venn diagram too.

My point here is, in this case, you could also use the keywords LEFT OUTER JOIN, and you will extract an identical result.

left join is left outer join

Therefore, if we re-run the query, adding the keyword OUTER in the code, the outcome will not differ from the one we obtained earlier.

highlight table

Side note: The LEFT JOIN and LEFT OUTER JOIN are two interchangeable phrases. From a professional perspective, there is no reason for typing the keyword OUTER here, because if you are using a LEFT JOIN in SQL, it will always be an OUTER type of JOIN. This is like omitting the word INNER from INNER JOIN. Therefore, as long as it is clear what type of join will be applied, professionals prefer shorter phrases.

left join will always be outer

How to Obtain Records Only from the Left Table?

There is one final interesting detail I’d like to add. LEFT JOINS can deliver a list with all records from the left table that do not match any rows from the right table. Put differently, the idea is to obtain all records that appear only in the left outer blue part of the Venn diagram in this picture.

all records that appear only in the left outer blue part of the Venn diagram

Which are these records, in our case?

Important: Let’s get the code back to its original state:

SELECT
    m.dept_no, m.emp_no, d.dept_name
FROM
    dept_manager_dup m
        LEFT JOIN
    departments_dup d ON m.dept_no = d.dept_no
ORDER BY m.dept_no;

Looking at the output obtained with the first LEFT JOIN, these are the top six rows of the data list. It is shown in the picture below.

venn diagram and table

They have null values in the “dept_name” field. And how could we get to these rows programmatically?

Using the WHERE Clause

The trick is to include a WHERE clause, whose condition will be all “dept_name” values to be null.

column name is null

So, we can add the following code, right before the ORDER BY statement:

WHERE dept_name IS NULL

Through a LEFT JOIN, we will retrieve this bit of the table, whose column information comes from the right table and is null.

6 rows returned

We managed to obtain a 6-row result, as we wanted.

What is Different about the LEFT JOIN?

In conclusion, when we use a LEFT JOIN in SQL, we are still joining data. However, it is more of an OUTER JOIN. What this means, is that some of the data belongs to only one of the two tables. Particularly, to the left one. Contrary to the INNER JOIN, changing the order of the tables in our code will make a huge difference to the output.

So, I suppose you are thinking that if there is a LEFT JOIN in SQL, there must be a RIGHT JOIN, too. Well, you are absolutely right. Feel free to jump into the tutorial where we will be discussing it.

***

Eager to hone your SQL skills? Learn how to put theory into practice with our hands-on tutorials!  

Next Tutorial: SQL RIGHT JOIN

Martin Ganchev

Martin Ganchev

Instructor at 365 Data Science

Martin holds an MSc degree in Economic and Social Sciences from Bocconi University. His diverse academic and research experience combined with his friendly and explanatory approach to teaching have made him one of the most beloved instructors on our team. Some of the courses he has authored include: SQL, SQL + Tableau, SQL+Tableau+Python, Introduction to Python, Introduction to Jupyter, to name a few.

Top