Resolved: What are the units of salary?
If Salary (as most of us think about it) is a rate, then these are $/year and to get the amount spent you need to multiply that by the number of years.
If the salary is just $s then it is a lump sum of money made during that period of time and then no need to do anything but sum.
The first interpretation is how I interpreted the problem. But, I also noticed that 81,635 emp_no's had `Year(to_dates) = '9999'`, I figured those people were still working there.
So I created a field that calculated the number of years each person worked at a particular salary using the `datediff()/365` function, and then also created a case for those who are still working there to have to_dates equal to today's date. After that I could then sum my new collumn WHERE from_date > '1997-01-01' because the question wanted the money spent on contracts made after the start of 1997. This gave me about 153 Billion dollars, but then I got to thinking that maybe we also wanted to know about the people that started before 1997 but were still working after that time, so I created an other case to only count the time in a contract after 1997 and found that the company spent between 2 and 3 Billion dollars on those contracts. That means from the begining of 1997 to today this company spent approimately 155 Billion dollars on employee contracts.
SELECT
SUM(salary*years_at_salary)
FROM
(SELECT
*,
CASE
# in this case the employee started working after 1997 and is still working there
WHEN YEAR(to_date) > '2023' AND from_date >= '1997-01-01'
THEN DATEDIFF('2023-11-08', from_date) / 365
# in this case the employee is still working there but started before 1997
WHEN YEAR(to_date) > '2023' AND from_date < '1997-01-01'
THEN DATEDIFF('2023-11-08', '1997-01-01') / 365
# in this case the employee is no longer working there but started
# before 1997
WHEN from_date < '1997-01-01'
THEN DATEDIFF(to_date, '1997-01-01') / 365
#in this case the person is working in the correct window
ELSE DATEDIFF(to_date, from_date) / 365
END AS years_at_salary
FROM
salaries
# only consider dates where the end date falls in the correct time period
WHERE to_date > '1997-01-01') salaries_ext;
I am hoping that someone will look at this and see if I made an error in my thought process or if they see I did some calculation incorrectly. I know that by dividing by 365 there are leap year issues, but I figured that since I was rounding to the nearest billion in my report that it wouldn't matter.
Also, It would be nice to have meta data to go along with the schema so that the units could be interpreted. And then also people to talk to about the ambiguities of the question.
Thanks in advance for any feed back.
Hi Sarah!
Thanks for reaching out.
The SQL code you've provided seems like a well-thought-out approach to calculate the total amount spent on employee salaries from the start of 1997 to the present day, considering different scenarios for employees who joined before and after 1997 and those still employed.Here's a review of the thought process and the SQL code:
Interpretation of Salary as a Rate: You've correctly treated salary as a rate (dollars per year) to calculate the total amount paid to each employee by multiplying the annual salary rate by the number of years they have been paid that salary.
Handling Current Employees: For employees still working there (indicated by a to_date of '9999'), you've adapted the to_date to today's date, which is a common practice for handling current employees in salary calculations.
Timeframe Adjustment: You've set up cases to ensure that only the salary paid after the start of 1997 is calculated, which aligns with the question's requirement. This also correctly includes salaries for those who started before 1997 but were still working after that date.
Calculation of Years at Salary: You've used DATEDIFF() divided by 365 to calculate the number of years at a salary. While this method doesn't account for leap years, as you've stated, the rounding to the nearest billion dollars likely absorbs this discrepancy, making it insignificant in terms of the total amount calculation.
Potential Improvements and Considerations:
Leap Year Handling: To improve accuracy, you might consider calculating the exact number of days worked and then converting this to years by dividing by 365.25 instead, which averages the additional day provided by leap years over four years.
Clarity on to_date Handling for Past Employees: The code snippet assumes that the to_date for past employees is accurate and doesn't need adjustment. It's important to ensure that this assumption holds true in the actual dataset.
Data Validation: It's always good practice to validate some of your calculated salary years and totals against known data points or manually calculated samples to ensure the logic is functioning as expected.
Hope this helps.
Best,
Tsvetelin