Resolved: Another Prospective for the Solution
I am quite confused regarding the solution as the given solution is just checking the year when an employee starteed working
However, maybe there are some employees that are employed before 1990 but still working in the company by that year and even after
and this can be applied to all years
Also there is nothing mentioned regarding when the employee has left
I had a solution trying to consider those factors, it seems quite complicated and do not know if I am right or not
SELECT
cal_year,
gender,
SUM(CASE
WHEN cal_year BETWEEN YEAR(hire_date) AND YEAR(end_date) THEN 1
ELSE 0
END) AS no_of_emp
FROM
(SELECT
e.emp_no, e.gender, e.hire_date, MAX(d.to_date) AS end_date
FROM
employees e
JOIN dept_emp d ON e.emp_no = d.emp_no
GROUP BY e.emp_no) a
CROSS JOIN
(SELECT DISTINCT
YEAR(hire_date) AS cal_year
FROM
employees
WHERE
YEAR(hire_date) >= 1990) b
GROUP BY cal_year , gender
ORDER BY cal_year;
Thank you for raising this, I am confused as well.I hope 365datascience repond to this.
Hi Eslam and Temitope!
Thanks for reaching out!
Please accept my apologies for the delayed response.
In your query, the condition WHERE YEAR(hire_date) >= 1990 is applied only in the subquery b, which generates distinct calendar years. However, this condition isn't applied to the main query, which performs a cross join between the generated calendar years and the result set from the subquery a. This means that for each calendar year in the result set, the query will attempt to calculate the number of employees for all years regardless of the condition.
In contrast, the solution query directly filters the data based on the condition HAVING calendar_year >= 1990, ensuring that only years from 1990 and onwards are included in the result set. This results in fewer rows returned compared to the first query, where data for all years may be included.
Hope this helps.
Best,
Martin