I recently had to clean up a database table of user data because it had the same email addresses assigned to multiple users. The first step was putting together a query that pulled only the duplicate email addresses and also told me how many times they occurred – from there the actual clean up process was fairly straight forward.
This simple query looked something like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
SELECT email_address, COUNT(email_address) AS occurrences | |
FROM users | |
GROUP BY email_address | |
HAVING occurrences > 1 |
I find this to be a particularly helpful SQL snippet as this kind of clean up process is fairly common. It uses MySQL’s HAVING
clause because WHERE
cannot handle aggregate functions – COUNT()
in this case.
Thanks for the snippet … but how can I get all records returned so I can compare the rows. In my case, duplicate records need to be compared manually, so I need more than just the count. Can you help with that?
@dcolumbus, I have a suggestion. Sometimes the easiest thing to do is just export the table to Excel and then use Excel’s more intuitive tools to highlight the duplicate rows/values.