Solving a Business Task in SQL and Visualizing it in Tableau

SQL Tutorials Tableau Tutorials 17 min read
business task in sql
Blog / SQL Tutorials / Solving a Business Task in SQL and Visualizing it in Tableau

If you want to become a business intelligence analyst, a great way to start is to solve a business task in SQL. As the saying goes:

“Knowledge that is not put into practice is like food that is not digested.”

So, it’s time to get our hands dirty and start writing some code! Moreover, we will learn how to visualize the solution of the task in Tableau. We will be working with the ‘employees_mod’ database. Provided that you don’t have it on your computer, you can get it here.

laptop next to pot of tea, business task in sql

The Task

Now, let’s see what business task in SQL we are going to be dealing with:

Compare the number of male managers to the number of female managers from different departments for each year, starting from 1990.

The First Step to Solve a Business Task in SQL

First, you should imagine a type of visualization that could respond to the problem clearly, and then you should figure out a way to retrieve a set from the ‘employees_mod’ database containing the data that will let you obtain the desired graph.

Graphs, business task in sqlFollowing this logic, perhaps you are thinking that the solution would be a bar chart where each bar is split into two rows. One representing the male employees, and the other one – the female employees.F M, business task in sql

The Right Approach

However, we will ask you to accept the following challenge: Respond to the given problem with an area chart.

Area chart, business task in sqlAn area chart is a type of graph which can be perceived as a line chart where the area between the lines

can b perceived as a line chart, business task in sqland between the lowest line and the horizontal axis has been filled with colours.

colours, business task in sqlThus, the result allows you to visually compare the proportional relationship between the quantities examined.

The Information We Need to Extract

Apparently, we will need a column for department name and gender, as we will want to compare our results in these fields. Then, to count the managers who have worked in a certain department, we will use their employee numbers.Sql output, busness task in sqlHere comes the challenging part.

How can we show that a manager has worked in a particular department in year X, and has not worked there in year Y?

By creating the “active” column which will display 1 or 0 if a person has or has not worked for the company in a certain calendar year!active, business task in sql

The Rest of the Fields

So, the remaining three fields we need in the SQL output would be the “from_date“ and “to_date“ from the ‘t_dept_manager’ table for each employee, and a calendar year.

In this way, if a calendar year falls within the boundaries of the “from_date“ and “to_date“ , in the ‘active’ field we will see 1. Otherwise, 0 will be displayed.

dates, business task in sql

A Few Hints for Solving Our Business Task in SQL

While creating the ‘active’ column, you’ll need a CASE – WHEN statement to designate the circumstances under which ‘active’ must indicate 1. And then the ones under which it must indicate 0. If you have no idea what is a CASE-WHEN statement, don’t worry. You can find out all you need to know about it in the linked tutorial.

active, business task in sql

Later, when joining the tables, you can use the following information:

cross join, business task in sql

A column containing the hire date of the “t_employees” table must cross-joint_dept_manager”. Then, it must join “t_departments” and “t_employees”.

Important: Remember that a cross join ensures that you can combine each row from a table with each row from the other table.

Finally, order by employee number and calendar year to finish writing the query.

emp_no and calendar year, business task in sql

The Solution

Here’s the query that will give us the desired SQL output:

1 SELECT
2
3                d.dept_name,
4
5                 ee.gender,
6
7                 dm.emp_no,
8
9                 dm.from_date,
10
11                 dm.to_date,
12
13                 e.calendar_year,
14
15                 CASE
16
17                                WHEN YEAR(dm.to_date) >= e.calendar_year AND YEAR(dm.from_date) <= e.calendar_year THEN 1
18
19                       ELSE 0
20
21                                   END as active
22
23                  FROM
24
25                                    (SELECT
26
27                                                   YEAR(hire_date) AS calendar_year
28
29                                       FROM
30
31                                                    t_employees
32
33                                       GROUP BY calendar_year) e
34
35                                                         CROSS JOIN
36
37                                       t_dept_manager dm
38
39                                                              JOIN
40
41                                      t_departments d ON dm.dept_no = d.dept_no
42
43                                                              JOIN
44
45                                       t_employees ee ON dm.emp_no = ee.emp_no
46
47                       ORDER BY dm.emp_no, calendar_year;

Let’s analyse it step by step.

Analysis

Naturally, we must start by stating the fields that we want to see in the table output – ‘dept_name’, ‘gender’, ‘emp_no’, ‘from_date’, ‘to_date’, and ‘calendar_year’.

select statement, business task in sql

All these columns need to be preceded by the aliases of the tables they will be extracted from.

extracted column, business task in sql

And the last required field is the ‘active’ field.select statement, business task in sql

To create it, let’s take advantage of one of the hints we mentioned before.

The CASE WHEN Statement

We will use CASE, WHEN, and then the following two conditions:

  • the year of the final date of the contract to be greater than a certain calendar year

>=, business task in sql

  • the year of the start date to be smaller than the same calendar year.

<=, business task in sql

If these are true, we can let ‘active’ display 1 by typing ‘THEN 1’.

Otherwise, (or as can be said in SQL terms – ELSE) we will let it display 0.

Finally, to give this field a name, let’s type END ASactive’.

end as active, business task in sql

We are done with designating the columns of interest!

The JOINS

Now, let’s think about the joins.

The hint related to this part of the query began in the following way: ”a column containing the year from the hire date of the “t_employees” table must cross-join the “t_dept_manager” table”.

The idea of extracting only a column from the “t_employees” table must ring a bell, and it is to use a subquery within the FROM clause.

(), business task in sql

The Subquery

In this subquery, we must select the year of the “hire_date” field from the “t_employees” table. Then, we must type “CROSS JOIN”, “t_dept_manager”, and then JOINt_departments” and “t_employees” on the relevant matching fields.

Thus, by using this code in the FROM clause, we will retrieve information only for those calendar years where data is available for all designated tables.

From statement, business task in sql

A Quick Test

To demonstrate this is the join we need, let’s execute it in a separate SELECT statement, starting with “SELECT * FROM”.

The whole query looks like this:

SELECT

    *
FROM

    (SELECT
 
        YEAR(hire_date) AS calendar_year
    FROM

        t_employees

    GROUP BY calendar_year) e

        CROSS JOIN

    t_dept_manager dm
  
        JOIN

    t_departments d ON dm.dept_no = d.dept_no

        JOIN

    t_employees ee ON dm.emp_no = ee.emp_no

ORDER BY dm.emp_no, calendar_year; 

output, business task in sql

The Output

As you can see in the picture above, this intermediate output shows that for each manager, as expressed by the different employee numbers, there’s enough information regarding all the subsequent fields. These fields can be found within the data tables mentioned in the join for all years from 1990 to 2000 inclusive.

calendar year, business task in sql

The Overall Look

So, looking at the query from top to bottom, we can say that we have selected seven of the many columns we retrieved in the last query output.

To conclude the initial query, don’t forget to abide by the last instruction – order by employee number and calendar year.

Well, at this stage we can execute the query and obtain the output we were aiming for at the beginning!

output, business task in sql

Now, let’s export this SQL output in a CSV format.

export, business task in sql

The Final Step of the Solution of a Business Task in SQL

Now, it’s time to finalize the solution of this business task in SQL. We will achieve that by loading the data in Tableau and then visualizing it in a way that will respond to it.

Let’s load the data in Tableau.

We can start a new workbook, click on “Text file”, select the CSV file in question, and press “Open”.

text file

Another Way to Load the Data

However, there’s another option we may want to take advantage of if we wish to remain in the same workbook and create a graph in a new worksheet.

new work sheet

First, let’s click on “New Worksheet” in the lower-left part of Tableau to create a new worksheet.

Notice that if we have previously worked with another worksheet, by clicking on “Data Source”, we’ll go to the same start page we used for it.

data source

Therefore, we will create a new data source!

data-science-training

Creating a New Data Source

To do that, we must go back to the new worksheet and expand the “Data” menu.

data

We have to choose “New Data Source”.

new data source

We can select “Text file” again.

text file

After that, we should pick the new CSV file and then press “Open”.

open

Now we can access Sheet 2.

sheet 2

Let’s rename it to “Chart 2”.

chart 2

Now, we are ready to work on our visualization.

The Visualization

So, we want to compare the number of ‘active’ male employees versus the number of ‘active’ female employees in the company for each year.

Hence, we should set the years to be laid on the horizontal axis by dragging the “Calendar Year” field up and dropping it in the “Columns” section.

columns

The Rows

Then, we should designate which statistic will represent the rows on our visualization.

There’s no need for discussion – this must be the ‘active’ field. Let’s click on and hold ‘active’, drag and drop it right over the “Rows” section.

rows

As you can see in the picture above, a SUM function with an argument ‘active’ appeared in the “Rows” section. Why so?

The Benefits of the SUM Function

The reason is that Tableau automatically aggregates the numeric values of a field. In our case, this is doing a perfect job, because in the SQL output, ‘active’ would take values of 1 or 0 only.

active

By summing the values automatically for us, Tableau shows us the total number of employees that have been hired in a certain calendar year. And that is precisely what we need.

total values

The Detail Level and the Colors

Now, let’s assign ‘Gender’ as a field that will define the detail level and the colors of the graph. We can achieve this by dragging and dropping “Gender” over “Detail”.

gender over detail

And once again over “Color”.

color

We can also swap the colours representing the male and female employees.

swapping colors

At this point we’ve gotten to something that’s probably new for you.

The Type of Visualization

We can expand the drop-down list from the “Marks” section and change the type of visualization to “Area”.

area

Let’s see how it looks.

chart

How to Interpret it

In Tableau, hovering over certain parts of the visualization allows you to obtain specific quantitative information.

For example, as defined by the two dots you can see in the picture below, the area between them tells us that in 1996, there were 64 male managers working in the company.

62 male managers, tableau

While if we hover a little bit higher, we can see that 44 female managers were employed during the same calendar year.

44 female managers, tableau

Analogically, we can check the information related to any of the years depicted on the chart.

Finalizing the Solution of Our Business Task in SQL

To reply to the initial business task in SQL accurately, we should create the same type of chart for each department.

The quickest way is to right click on “Department name” and then select “Show filter”.

department name show filter, tableau

That being done, “Dept name” will appear in the “Filters” section, because the term for this operation in Tableau is called ‘filtering’.

filtering, tableau

For instance, it is correct to say, “Filter the chart by dept name”.

Exploring the Distribution of Male vs Female Employees in Different Departments

Besides, a multiple value list appears on the right part of the screen.

multiple value list, tableau

By ticking and un-ticking the different boxes, we can explore the distribution of male versus female employees for the different years for the entire company. Or a specific department.

Such as “Marketing”:

marketing

Production”:

production, tableau

and “Quality Management”:

quality management, tableau

Why We Use This Database

Before we conclude, we must admit that if you deem some of these graphs slightly odd, we will agree with you.

The data has been fabricated. It is heavily based on the ‘employees’ database, which can be found on GitHub.

We’ve chosen to keep using the same data set in this section, too, albeit slightly modified. We’ve done this simply for learning purposes. We had to face a trade-off between solidifying your SQL knowledge and focusing on the combination with Tableau or shifting the focus on an entirely new dataset.

And we chose the first option.

trade off, tableau

The Proper Way to Solve a Business Task in SQL

To sum up, when you are facing a business task in SQL, which you want to visualize in Tableau, you should follow several steps. Let’s look back at what we did:

  1. First, we figured out what data we are supposed to extract.
  2. Then, we started writing the code, extremely precisely.
  3. After that, we went over it, in order to fully understand what it does.
  4. What we did next, was load the CSV file in Tableau.
  5. From then on, we created a new data source and managed the way we want our data to be visualized.

So, finding the solution to another business task in SQL shouldn’t be rocket science after solving this one yourself.

That’s the last of our SQL Tutorials but there is one last thing for you – Check out the next article to get some useful tip on how to use SQL to its full potential.

***

Eager to hone your SQL skills? Learn how to put theory into practice with our hands-on tutorials!

Next Tutorial: SQL Best Practices

Earn your Data Science Degree

Expert instructions, unmatched support and a verified certificate upon completion!

  1. Hi, im still new to SQL but i have some experience working with data. I thought of a way to solve this question that results in a slightly different extracted data set and i was wondering if its possible to write a query in SQL that will do this. The resulting table I want is basically a pivot table where the rows are grouped by Year>Department and the columns are Managers(Male and Female) aggregated by Count.

Leave a Reply

Your email address will not be published.