#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2009
    Posts
    5
    Rep Power
    0

    Delete older duplicates


    I've got a table with duplicate records and timestamps. I'm trying to figure out how to delete only the oldest duplicate records.

    For simplicity, we'll say the table has 3 fields. Name, ID, and Date.

    The ID is where the duplication is.

    So a sample of the records looks like this -

    Dave | 123 | 1-2-2011
    Jim | 123 | 1-3-2011
    Mike | 123 | 1-10-2011
    Bill | 111 | 1-2-2011
    Henry | 222 | 1-3-2011
    Larry | 222 | 1-4-2011

    I have a query that will show me the duplicates -

    Code:
     SELECT Name,ID, Date
     FROM UserTable
     where ID IN (SELECT ID FROM UserTable GROUP BY ID HAVING COUNT(*) >1) 
     GROUP BY ID, NAME,DATE
    That will give me the duplicates and omit the unique ones.

    I'm trying to figure out how to make a statement that can delete the older duplicates and leave the newest one of each record as well as the unique ones.

    Ideally the end result run on the above data would look like this -

    Mike | 123 | 1-10-2011
    Bill | 111 | 1-2-2011
    Larry | 222 | 1-4-2011

    Any help would be appreciated.
  2. #2
  3. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Feb 2005
    Posts
    724
    Rep Power
    295
    What does this query give you?

    Code:
    SELECT Name,ID, Date
    FROM UserTable a
    WHERE
    	Date = (SELECT MAX(Date) FROM UserTable b WHERE a.ID = b.ID) 
    	AND ID IN (SELECT ID FROM UserTable GROUP BY ID HAVING COUNT(*) > 1)
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2009
    Posts
    5
    Rep Power
    0
    That appears to have given me the most recent entry for each record.

    So if I change the Date part of the statement like this


    Code:
    SELECT Name,ID, Date
    FROM UserTable a
    WHERE
    	Date <> (SELECT MAX(Date) FROM UserTable b WHERE a.ID = b.ID) 
    	AND ID IN (SELECT ID FROM UserTable GROUP BY ID HAVING COUNT(*) > 1)
    It looks like I get the older records.

    Would this then be a sensible way to go -

    Code:
    Delete FROM UserTable a
    WHERE
    	Date <> (SELECT MAX(Date) FROM UserTable b WHERE a.ID = b.ID) 
    	AND ID IN (SELECT ID FROM UserTable GROUP BY ID HAVING COUNT(*) > 1)
    I'm still learning SQL without any formal training and am not really versed on best practices or methods. Thanks for your help!
  6. #4
  7. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Feb 2005
    Posts
    724
    Rep Power
    295
    Yes, if that is giving you the records you need to delete then use it. I'm glad you tested before running

    Comments on this post

    • whyareyou agrees
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2009
    Posts
    5
    Rep Power
    0
    Great! Thanks for you help.

IMN logo majestic logo threadwatch logo seochat tools logo