The Shed is going Social! Join us on FaceBook and Twitter and chime in on the conversation.
|
 |
|
Dev Shed Forums
> Databases
> MySQL Help
|
Finding duplicates based on multiple columns AND deleting the second occurrence.
Discuss Finding duplicates based on multiple columns AND deleting the second occurrence. in the MySQL Help forum on Dev Shed. Finding duplicates based on multiple columns AND deleting the second occurrence. MySQL Help forum discussing administration, SQL syntax, and other MySQL-related topics. MySQL is an open-source relational database management system (RDBMS).
|
|
 |
|
|
|
|

Dev Shed Forums Sponsor:
|
|
|

September 19th, 2012, 07:34 AM
|
|
Registered User
|
|
Join Date: Sep 2012
Posts: 5
Time spent in forums: 45 m 51 sec
Reputation Power: 0
|
|
|
Finding duplicates based on multiple columns AND deleting the second occurrence.
Good morning.
I currently have the following MySQL query that finds rows in a table that have identical values for "tutor", "startTime", "endTime", and "date":
Quote: | select tutor, date, startTime, endTime, count(*) cnt from reservations group by date, tutor, startTime, endTime having cnt>1 order by cnt asc; |
I would like to expand this query to delete all but the first occurrence of each of the duplicates.
In the table, the only difference between the two rows is a field called "appointmentID". Therefore, the end result of the query should be that the first appointmentID row is kept while any subsequent duplicate rows are deleted.
Is this possible? I've searched here and through many other sites and, while I've found numerous examples for finding and deleting duplicates, none have dealt with doing so based on the values in several columns.
Thanks so much!
Richard
|

September 19th, 2012, 07:49 AM
|
|
|
|
Which one's first?
Think before you answer!
|

September 19th, 2012, 07:50 AM
|
|
Registered User
|
|
Join Date: Sep 2012
Posts: 5
Time spent in forums: 45 m 51 sec
Reputation Power: 0
|
|
Quote: | Originally Posted by cafelatte Which one's first?
Think before you answer! |
Hi Cafe Latte!
Thank you for writing!
The "first" one would be defined as the one with the lowest "appointmentID" as sorted numerically.
Is this what you're asking?
Richard
|

September 19th, 2012, 01:55 PM
|
|
Registered User
|
|
Join Date: Sep 2012
Posts: 6
Time spent in forums: 1 h 28 m 40 sec
Reputation Power: 0
|
|
Quote: | Originally Posted by heyrichard I currently have the following MySQL query that finds rows in a table that have identical values for "tutor", "startTime", "endTime", and "date":
I would like to expand this query to delete all but the first occurrence of each of the duplicates. |
Hopefully this will be a job carried out only once! After you have taken care of this, you should add a unique constraint on those columns in your table. Otherwise, you'll have to clean things up over and over again...
|

September 19th, 2012, 03:25 PM
|
|
Registered User
|
|
Join Date: Sep 2012
Posts: 5
Time spent in forums: 45 m 51 sec
Reputation Power: 0
|
|
Quote: | Originally Posted by bobhairgrove Hopefully this will be a job carried out only once! After you have taken care of this, you should add a unique constraint on those columns in your table. Otherwise, you'll have to clean things up over and over again... |
It is a one time job, and it's not my original code. But, I'm still looking for ideas on the find and delete if anyone has any!
|

September 19th, 2012, 03:41 PM
|
|
Registered User
|
|
Join Date: Sep 2012
Posts: 6
Time spent in forums: 1 h 28 m 40 sec
Reputation Power: 0
|
|
Try this:
Code:
CREATE TABLE tmp_reservations AS
SELECT
`date`, -- reserved word ... I would choose another name
tutor,
startTime,
endTime,
MIN(appointmentID) AS appointmentID
FROM reservations
GROUP BY `date`, tutor, startTime, endTime;
TRUNCATE TABLE reservations;
INSERT INTO reservations (appointmentID,`date`,tutor,startTime,endTime)
SELECT appointmentID,`date`,tutor,startTime,endTime
FROM tmp_reservations;
DROP TABLE tmp_reservations;
Caveat emptor, I didn't try this at home! 
|

September 19th, 2012, 05:24 PM
|
|
|
Or just...
Code:
DELETE x
FROM my_table x
JOIN my_table y
ON y.id = x.id
AND y.date < x.date;
|

September 19th, 2012, 05:31 PM
|
 |
SQL Consultant
|
|
Join Date: Feb 2003
Location: Toronto Canada
|
|
Quote: | Originally Posted by cafelatte Or just... | um... no

|

September 20th, 2012, 06:22 AM
|
|
Registered User
|
|
Join Date: Sep 2012
Posts: 5
Time spent in forums: 45 m 51 sec
Reputation Power: 0
|
|
Quote: | Originally Posted by r937 um... no
 |
Since there doesn't seem to be a good solution without creating a new table, is there a way, in the original request, to have MySQL output both duplicated appointmentIDs? That way, I could manually construct a deletion run, I guess...
|

September 20th, 2012, 06:29 AM
|
 |
SQL Consultant
|
|
Join Date: Feb 2003
Location: Toronto Canada
|
|
Quote: | Originally Posted by heyrichard Since there doesn't seem to be a good solution without creating a new table... | i dispute this assertion
cafelatte was on the right track but didn't get the self-join conditions right
Code:
DELETE x
FROM my_table x
JOIN my_table y
ON y.tutor = x.tutor
AND y.date = x.date
AND y.startTime = x.startTime
AND y.endTime = x.endTime
AND y.id < x.id;
|

September 20th, 2012, 06:41 AM
|
|
Registered User
|
|
Join Date: Sep 2012
Posts: 5
Time spent in forums: 45 m 51 sec
Reputation Power: 0
|
|
Quote: | Originally Posted by r937 i dispute this assertion |
Thank you for disputing and helping! I'm out to play with this now. I appreciate it and understand how it works, so thank you.
|
Developer Shed Advertisers and Affiliates
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Rate This Thread |
Linear Mode
|
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|