How to delete duplicate records in SQL?
Here is my table with dept_no and dept_name.
NULL,'Public Relations'
'd001','Marketing'
'd001','Marketing'
'd002','Finance'
'd002','Finance'
'd003','Human Resources'
'd003','Human Resources'
'd004','Production'
'd004','Production'
'd005','Development'
'd005','Development'
'd006','Quality Management'
'd006','Quality Management'
'd007','Sales'
'd007','Sales'
'd008','Research'
'd008','Research'
'd009','Customer Service'
'd009','Customer Service'
'd010',NULL
'd010',NULL
'd010',NULL
'd010',NULL
'd011',NULL
'd011',NULL
'd011',NULL
'd011',NULL
I want to delete the duplicates of these? Just wondering if you guys went over how to delete duplicates in the lessons, I probably don't remember, I tried looking at some solutions in stackoverflow, but to no avail, all of there solutions seems confusing. This Question is from intro to Joins, INNER JOINs part I.
Thanks, hope you respond soon,
Okuich Abella
Hi Okuich!
Thanks for reaching out.
As of now, we have not covered this topic, since removing duplicates is neither straightforward, nor intuitive. There are several ways to remove duplicates and actually all of them require a combination of several tools.
The one I'd suggest you try is a combination of your skills for self joins, the use of the DELETE statement and the knowledge on how to select duplicates (the latter, at least analytically).
https://learn.365datascience.com/courses/sql/duplicate-records/
https://learn.365datascience.com/courses/sql/the-delete-statement-part-i/
https://learn.365datascience.com/courses/sql/sql-self-join/
The structure you can use to remove duplicates is the following one:
DELETE t1 FROM table_name t1
JOIN table_name t2
WHERE t1.dept_no < t2.dept_no AND t1.dept_name = t2.dept_name;
Hope this helps.
Best,
Martin