I would like to ask if the code in the solution includes the salaries of people who had the same salary for 2 and more years. For example, if people with the salary they began to earn in 2001 and had still amount in 2002 were included in calculating average per year.
I think that the code does not consider it. (1)
Also, I think that in the obtained dataset people, who are no longer employed in the company will not show up because of join with employees table for acquiring gender of employee. (2)
Not sure how to resolve problem (2), but could my query address at least problem (1)?
y.calendar_year AS calendar_year,
AVG(s.salary) AS avg_salary,
e.gender AS gender,
de.dept_no AS dept_no,
d.dept_name as dept_name
( SELECT DISTINCT(YEAR(from_date)) AS calendar_year
UNION SELECT DISTINCT(YEAR(to_date)) AS calendar_year
FROM t_salaries) as y
t_salaries s ON YEAR(s.from_date) <= calendar_year AND calendar_year <= YEAR(s.to_date)
t_dept_emp de ON s.emp_no = de.emp_no
t_departments d ON de.dept_no = d.dept_no
t_employees e ON s.emp_no = e.emp_no
calendar_year <= 2002
Sorry for the long code, I know you have better things to do than read it :D, but it would really help my understanding.
Thanks for reaching out.
Thank you very much for sharing this query. You are right when you say that more clarity is needed in this case, so thank you for reaching out.
- We step on the assumption that the data in the t_salaries table has been recorded for every year, practically. You can run the following query to see that in practice.
SELECT * FROM t_salaries LIMIT 100;
Hope this answers your question but please feel free to reply should you need more clarifications or still have any doubts.
2. According to the query we’ve used, the average will only be taken until the year when they last worked for the company. Please execute the following query to observe that there are many such employees in the company.
emp_no, MAX(to_date) AS max_date
GROUP BY emp_no
ORDER BY max_date
Hope this helps.
Hello Martin, thank you for your reply. As for (1), thanks for clarifying, now it makes sense :). As for (2) thanks for showing an interesting application of the MAX function. It gave me new ideas on how it could be used. However, what I meant was that t_salaries has more distinct emp_no values than t_employees as you can find out using SELECT emp_no FROM t_employees ORDER BY emp_no LIMIT 100; and SELECT DISTINCT(emp_no) FROM t_salaries ORDER BY emp_no LIMIT 100; I think the reason is that once employee left the company, he will no longer show up in t_employees (but still will be in t_salaries). Then if we want to use JOIN to obtain a gender of an employee e.g. 10002, who is no longer employed JOIN will eliminate him from the result table. He is not in the t_employees table. These are just details, but maybe it has some value in interpreting the result.
Hi Maros! Thank you very much for your reply. Also, thank you very much for sharing your reasoning. You are right. We have run our SQL-Tableau tasks on several assumptions that we had decided to not specifically mention, the reason being we wanted to deprive these tasks from extra complications but here they arise anyway. Therefore, thank you very much for providing your reasoning. I will discuss it with my colleagues and get back to you if we’ll address in our next update, or how we’ll update the lecture. Thank you very much! Good luck and please feel free to post another question should you encounter any difficulties. Thank you.
The 365 Team