How to Write SQL Subqueries

Join over 2 million students who advanced their careers with 365 Data Science. Learn from instructors who have worked at Meta, Spotify, Google, IKEA, Netflix, and Coca-Cola and master Python, SQL, Excel, machine learning, data analysis, AI fundamentals, and more.

Start for Free
Martin Ganchev Martin Ganchev 25 Apr 2023 9 min read

In this lesson, we will see how SQL subqueries can be incorporated within queries.

Side note: We are going to mention the possible uses of JOINS. So, if you are not familiar with the subject, you can check out the following set of tutorials, beginning with the INNER JOIN.

As their name suggests, subqueries are queries embedded within a query. They are also called inner queries or nested queries, and they are part of another query, called an outer query. Alternative names for these SQL features are inner select and outer select, respectively.

outer select, SQL subqueries

Subqueries can be applied in many ways. Nevertheless, the main idea is the same.

Side note: You can see how to load the databases we are using, in the linked tutorial.

A Case in Point

Most often, SQL subqueries are employed in the WHERE clause of a SELECT statement. Let’s see what will happen after we run the following code:

SELECT
    *
FROM
    dept_manager;

24 rows returned, sql subqueries

We retrieved 24 records, which means 24 people currently work in the enterprise as managers. If our database is in perfect condition, these same 24 people must be registered in the “employees” table, where we can find different information about them.

24 rows, sql subqueries

Why We Need SQL Subqueries

So, assume you are interested in obtaining the names of all employees working within the enterprise as managers. Keep in mind that names can be found only in the “employees” table, not in the “dept_manager” table.

names in employees table, sql subqueries

There are a few ways you could obtain the desired result. We will try one involving the use of SQL subqueries, as it will allow us to extract the required information from the first table, so long as the same employees have been registered in the “dept_manager” table.

24 rows, sql subqueries

The Syntax

We select the first and last name from the “employees” table for the same employee numbers that can be found in the “dept_manager” table. The last condition can be set through the WHERE clause and the IN operator. It will enlist all employee numbers recorded in the “dept_manager” table.  In programming language, it will look like this: 

SELECT
    e.first_name, e.last_name
FROM
    employees e
WHERE e.emp_no IN (SELECT
    dm.emp_no
FROM
    dept_manager dm);

After we execute this query, we will be able to verify that the data about all 24 managers has been stored in the “employees” table as well.

24 rows returned, sql subqueries

Analysis

Let’s analyse this example. First, the outer query is the entire code, starting from SELECT first name, last name, and so on, until the IN operator.

outer query

The subquery, or the inner query, that is nested within the outer query, is what is underlined in the picture below.

subquery

Important: A subquery should always be placed within parentheses.

In this case, we should make no exception to this rule as well.

The Process

From a processing point of view, the SQL engine starts by running the inner query. Then it uses its returned output, which is intermediate, to execute the outer query.

In general, a subquery may return a single value (a scalar), a single row, a single column, or an entire table. In our case, it delivers the entire “emp_no” column from the “dept_manager” table.

24 rows emp number, sql subqueries

Actually, to prove this is the case, we could run the subquery as a stand-alone query.

So, we will simply write:

SELECT
    dm.emp_no
FROM
    dept_manager dm;

24 rows returned

As you can see in the picture above, we obtained the “emp_no” column the way we should have.

Multiple Nested SQL Subqueries

Although we will not dive into tasks that are too complex, please remember you can have a lot more than one subquery in your outer query. It is possible to nest inner queries within other inner queries, thus allowing your analysis to achieve significant depth. In that case, the SQL engine would execute the innermost query first, and then each subsequent query, until it runs the outermost query last.

What Does the EXISTS Operator Do?

There is another logical operator we need to introduce, as it is related strictly to using SQL subqueries.

The EXISTS operator checks whether certain row values are found within a subquery. This check is conducted row by row. It returns a Boolean value. If a row value of a subquery exists, the operator returns TRUE and the corresponding record of the outer query is extracted. Otherwise, EXISTS returns FALSE, and no row value from the outer query is extracted.

operator returns true exists returns false

An Example

Try writing the following code:

SELECT
    e.first_name, e.last_name
FROM
    employees e
WHERE EXISTS( SELECT
        *
    FROM
        dept_manager dm
    WHERE
        dm.emp_no = e.emp_no);

It will deliver all first and last names of the people in the “employees” table who are also found in the “dept_manager” table. As a matter of fact, we’ll create a whole table, not just a column, as we did with the IN operator.

dept manager table

After running the query, the output contains 24 rows, as shown in the picture below.

24 rows returned

EXISTS vs IN

Obviously, the EXISTS and the IN operators resemble each other when applied to SQL subqueries. Here is the substantial difference between the two.

It is noticeable, particularly when working on large web, mobile, or business applications, where visualizing data quickly is crucial.

EXISTS tests row values for existence, while IN searches among values. Hence, EXISTS is quicker in retrieving large amounts of data, while IN is faster with smaller datasets.

exists and in

ORDER BY in Nested Queries

Let me make a short side note about using ORDER BY in the context of nested queries. In MySQL, you can order the result list by “emp_no” and insert the ORDER BY clause within the subquery.

The code should look like this:

SELECT
    e.first_name, e.last_name
FROM
    employees e
WHERE
    EXISTS( SELECT
        *
    FROM
        dept_manager dm
    WHERE
        dm.emp_no = e.emp_no
    ORDER BY emp_no);

Of course, the query will run properly – the outer query will simply follow the order of the inner query.

exists order by

The Better Way to Do It

However, it is more professional to apply ORDER BY in the outer query.

Just like this:

SELECT
    e.first_name, e.last_name
FROM
    employees e
WHERE
    EXISTS( SELECT
        *
    FROM
        dept_manager dm
    WHERE
        dm.emp_no = e.emp_no)
ORDER BY emp_no;

So, please try to get used to leaving this clause in the outer select. It is more acceptable logically to sort the final version of your dataset and not versions preceding the final one.

exists where

The Alternative

Be aware that some, though not all, nested queries can be rewritten using JOINs, which are more efficient in general. This is true particularly for inner queries using the WHERE clause. Although sometimes the same answers can be obtained with a join, nested queries are considered an essential tool, and that’s why you must learn them.

nested queries can be rewritten using joins

What Are the Benefits of SQL Subqueries?

On certain occasions, subqueries can be bad for performance concerns, but here’s why people would still need to use them.

First, they allow for better structuring of the outer query. Thus, each inner query can be thought of in isolation, and organizing the extraction process can be improved. Hence the name of SQL – Structured Query Language!

allow for better structuring of the outer query

Second, in some situations, the use of subqueries is much more intuitive compared to the use of complex joins and unions.

Finally, many users prefer subqueries simply because they offer enhanced code readability.

What’s Next?

So, if you didn’t know, there is a tool in SQL that allows you to join a table with itself. Feel free to embark on the journey where we will be discussing the SELF JOIN.

***

Eager to hone your SQL skills? Check out our SQL course.

Next Tutorial: Use the SELF JOIN in SQL

Martin Ganchev

Martin Ganchev

Instructor at 365 Data Science

Martin holds an MSc degree in Economic and Social Sciences from Bocconi University. His diverse academic and research experience combined with his friendly and explanatory approach to teaching have made him one of the most beloved instructors on our team. Some of the courses he has authored include: SQL, SQL + Tableau, SQL+Tableau+Python, Introduction to Python, Introduction to Jupyter, to name a few.

Top