🛠️ Scheduled Maintenance | We’ll be undergoing scheduled maintenance and upgrades between 00:00 PST Jan 26th until 00:00 PST Jan 28th. There may be brief interruption of services in that period. We apologize for the inconvenience.

×
The 365 Data Science team is proud to invite you to our own community forum. A very well built system to support your queries, questions and give the chance to show your knowledge and help others in their path of becoming Data Science specialists.
Ask
Anybody can ask a question
Answer
Anybody can answer
Vote
The best answers are voted up and moderated by our team

Problem 3: Solution in SQL

Problem 3: Solution in SQL

Super Learner
0
Votes
1
Answer

Hello everyone.
I would like to ask if the code in the solution includes the salaries of people who had the same salary for 2 and more years. For example, if people with the salary they began to earn in 2001 and had still amount in 2002 were included in calculating average per year.
I think that the code does not consider it. (1)
Also, I think that in the obtained dataset people, who are no longer employed in the company will not show up because of join with employees table for acquiring gender of employee. (2)
Not sure how to resolve problem (2), but could my query address at least problem (1)?
 

SELECT
y.calendar_year AS calendar_year,
AVG(s.salary) AS avg_salary,
e.gender AS gender,
de.dept_no AS dept_no,
d.dept_name as dept_name
FROM
( SELECT DISTINCT(YEAR(from_date)) AS calendar_year
FROM t_salaries
UNION SELECT DISTINCT(YEAR(to_date)) AS calendar_year
FROM t_salaries) as y
JOIN
t_salaries s ON YEAR(s.from_date) <= calendar_year AND calendar_year <= YEAR(s.to_date)
JOIN
t_dept_emp de ON s.emp_no = de.emp_no
JOIN
t_departments d ON de.dept_no = d.dept_no
JOIN
t_employees e ON s.emp_no = e.emp_no
WHERE
calendar_year <= 2002
GROUP BY
dept_name,
calendar_year,
gender
ORDER BY
calendar_year ASC,
dept_name ASC,
gender DESC
;

 
Sorry for the long code, I know you have better things to do than read it :D, but it would really help my understanding.
Thank you.
 
Best,
Maros Jevocin

1 Answer

365 Team

Hi Maros!
Thanks for reaching out.
Thank you very much for sharing this query. You are right when you say that more clarity is needed in this case, so thank you for reaching out.

  1. We step on the assumption that the data in the t_salaries table has been recorded for every year, practically. You can run the following query to see that in practice.
SELECT * FROM t_salaries LIMIT 100;

Hope this answers your question but please feel free to reply should you need more clarifications or still have any doubts.
2. According to the query we’ve used, the average will only be taken until the year when they last worked for the company. Please execute the following query to observe that there are many such employees in the company.

SELECT 
emp_no, MAX(to_date) AS max_date
FROM
t_salaries
GROUP BY emp_no
ORDER BY max_date
LIMIT 10000;

Hope this helps.
Best,
Martin

Hello Martin, thank you for your reply. As for (1), thanks for clarifying, now it makes sense :). As for (2) thanks for showing an interesting application of the MAX function. It gave me new ideas on how it could be used. However, what I meant was that t_salaries has more distinct emp_no values than t_employees as you can find out using SELECT emp_no FROM t_employees ORDER BY emp_no LIMIT 100; and SELECT DISTINCT(emp_no) FROM t_salaries ORDER BY emp_no LIMIT 100; I think the reason is that once employee left the company, he will no longer show up in t_employees (but still will be in t_salaries). Then if we want to use JOIN to obtain a gender of an employee e.g. 10002, who is no longer employed JOIN will eliminate him from the result table. He is not in the t_employees table. These are just details, but maybe it has some value in interpreting the result.

1 month

Hi Maros! Thank you very much for your reply. Also, thank you very much for sharing your reasoning. You are right. We have run our SQL-Tableau tasks on several assumptions that we had decided to not specifically mention, the reason being we wanted to deprive these tasks from extra complications but here they arise anyway. Therefore, thank you very much for providing your reasoning. I will discuss it with my colleagues and get back to you if we’ll address in our next update, or how we’ll update the lecture. Thank you very much! Good luck and please feel free to post another question should you encounter any difficulties. Thank you.
Best,
The 365 Team

1 month