How to Write SQL Subqueries

SQL Tutorials 9 min read
SQL Subqueries

How to Write SQL Subqueries

9 min read

In this lesson, we will see how SQL subqueries can be incorporated within queries.

Side note: We are going to mention the possible uses of JOINS. So, if you are not familiar with the subject, you can check out the following set of tutorials, beginning with the INNER JOIN.

As their name suggests, subqueries are queries embedded within a query. They are also called inner queries or nested queries, and they are part of another query, called an outer query. Alternative names for these SQL features are inner select and outer select, respectively.

outer select

Subqueries can be applied in many ways. Nevertheless, the main idea is the same.

Side note: You can see how to load the databases we are using, in the linked tutorial.

 

A Case in Point

Most often, SQL subqueries are employed in the WHERE clause of a SELECT statement. Let’s see what will happen after we run the following code:

SELECT

               *

FROM

               dept_manager;

24 rows returned

We retrieved 24 records, which means 24 people currently work in the enterprise as managers. If our database is in perfect condition, these same 24 people must be registered in the “employees” table, where we can find different information about them.

24 rows

 

Why We Need SQL Subqueries

 

So, assume you are interested in obtaining the names of all employees working within the enterprise as managers. Keep in mind that names can be found only in the “employees” table, not in the “dept_manager” table.

names in employees table

There are a few ways you could obtain the desired result. We will try one involving the use of SQL subqueries, as it will allow us to extract the required information from the first table, so long as the same employees have been registered in the “dept_manager” table.

24 rows

 

The Syntax

 

We select the first and last name from the “employees” table for the same employee numbers that can be found in the “dept_manager” table. The last condition can be set through the WHERE clause and the IN operator. It will enlist all employee numbers recorded in the “dept_manager” table.  In programming language, it will look like this:

SELECT

               e.first_name, e.last_name

FROM

               employees e

WHERE

               e.emp_no IN (SELECT

                              dm.emp_no

               FROM

               dept_manager dm);

After we execute this query, we will be able to verify that the data about all 24 managers has been stored in the “employees” table as well.

24 rows returned

 

Analysis

 

Let’s analyse this example. First, the outer query is the entire code, starting from SELECT first name, last name, and so on, until the IN operator.

outer query

The subquery, or the inner query, that is nested within the outer query, is what is underlined in the picture below.

subquery

Important: A subquery should always be placed within parentheses.

In this case, we should make no exception to this rule as well.

 

The Process

From a processing point of view, the SQL engine starts by running the inner query. Then it uses its returned output, which is intermediate, to execute the outer query.

In general, a subquery may return a single value (a scalar), a single row, a single column, or an entire table. In our case, it delivers the entire “emp_no” column from the “dept_manager” table.

24 rows emp number

Actually, to prove this is the case, we could run the subquery as a stand-alone query.

So, we will simply write:

SELECT

               dm.emp_no

FROM

               dept_manager dm;

24 rows returned

As you can see in the picture above, we obtained the “emp_no” column the way we should have.

 

Multiple Nested SQL Subqueries

 

Although we will not dive into tasks that are too complex, please remember you can have a lot more than one subquery in your outer query. It is possible to nest inner queries within other inner queries, thus allowing your analysis to achieve significant depth. In that case, the SQL engine would execute the innermost query first, and then each subsequent query, until it runs the outermost query last.

 

The EXISTS Operator

 

There is another logical operator we need to introduce, as it is related strictly to using SQL subqueries.

The EXISTS operator checks whether certain row values are found within a subquery. This check is conducted row by row. It returns a Boolean value. If a row value of a subquery exists, the operator returns TRUE and the corresponding record of the outer query is extracted. Otherwise, EXISTS returns FALSE, and no row value from the outer query is extracted.

operator returns true exists returns false

 

An Example

Try writing the following code:

SELECT

               e.first_name, e.last_name

FROM

               employees e

WHERE

               EXISTS( SELECT

                              *

               FROM

                              dept_manager dm

               WHERE

                              dm.emp_no = e.emp_no);

It will deliver all first and last names of the people in the “employees” table who are also found in the “dept_manager” table. As a matter of fact, we’ll create a whole table, not just a column, as we did with the IN operator.

dept manager table

After running the query, the output contains 24 rows, as shown in the picture below.

24 rows returned

 

EXISTS vs IN

 

Obviously, the EXISTS and the IN operators resemble each other when applied to SQL subqueries. Here is the substantial difference between the two.

It is noticeable, particularly when working on large web, mobile, or business applications, where visualizing data quickly is crucial.

EXISTS tests row values for existence, while IN searches among values. Hence, EXISTS is quicker in retrieving large amounts of data, while IN is faster with smaller datasets.

exists and in

 

ORDER BY in Nested Queries

 

Let me make a short side note about using ORDER BY in the context of nested queries. In MySQL, you can order the result list by “emp_no” and insert the ORDER BY clause within the subquery.

The code should look like this:

SELECT

               e.first_name, e.last_name

FROM

               employees e

WHERE

               EXISTS( SELECT

                              *

               FROM

                              dept_manager dm

               WHERE

                              dm.emp_no = e.emp_no

               ORDER BY emp_no);

Of course, the query will run properly – the outer query will simply follow the order of the inner query.

exists order by

 

The Better Way to Do It

However, it is more professional to apply ORDER BY in the outer query. Just like this:

SELECT

               e.first_name, e.last_name

FROM

               employees e

WHERE

               EXISTS( SELECT

                              *

               FROM

                              dept_manager dm

               WHERE

                              dm.emp_no = e.emp_no)

ORDER BY emp_no;

So, please try to get used to leaving this clause in the outer select. It is more acceptable logically to sort the final version of your dataset and not versions preceding the final one.

exists where

 

The Alternative

 

Be aware that some, though not all, nested queries can be rewritten using JOINs, which are more efficient in general. This is true particularly for inner queries using the WHERE clause. Although sometimes the same answers can be obtained with a join, nested queries are considered an essential tool, and that’s why you must learn them.

nested queries can be rewritten using joins

 

The Benefits of SQL Subqueries

On certain occasions, subqueries can be bad for performance concerns, but here’s why people would still need to use them.

First, they allow for better structuring of the outer query. Thus, each inner query can be thought of in isolation, and organizing the extraction process can be improved. Hence the name of SQL – Structured Query Language!

allow for better structuring of the outer query

Second, in some situations, the use of subqueries is much more intuitive compared to the use of complex joins and unions.

Finally, many users prefer subqueries simply because they offer enhanced code readability.

 

What’s Next?

 

So, if you didn’t know, there is a tool in SQL that allows you to join a table with itself. Feel free to embark on the journey where we will be discussing the SELF JOIN.

Next Tutorial: Use the SELF JOIN in SQL

Leave a Reply

Your email address will not be published.

A Free Data Science Career Guide?

Check your email shortly!