MySQL Help
 
#1  heyrichard, September 19th, 2012, 07:34 AM
Finding duplicates based on multiple columns AND deleting the second occurrence.

Good morning.

I currently have the following MySQL query that finds rows in a table that have identical values for "tutor", "startTime", "endTime", and "date":

Quote:
SELECT tutor, date, startTime, endTime, COUNT(*) cnt
FROM reservations
GROUP BY date, tutor, startTime, endTime
HAVING cnt > 1
ORDER BY cnt ASC;

I would like to expand this query to delete all but the first occurrence of each of the duplicates.

In the table, the only difference between the two rows is a field called "appointmentID". Therefore, the end result of the query should be that the first appointmentID row is kept while any subsequent duplicate rows are deleted.

Is this possible? I've searched here and through many other sites and, while I've found numerous examples for finding and deleting duplicates, none have dealt with doing so based on the values in several columns.

Thanks so much!

Richard

#2  cafelatte, September 19th, 2012, 07:49 AM
Which one's first?

Think before you answer!

#3  heyrichard, September 19th, 2012, 07:50 AM
Quote:
Originally Posted by cafelatte
Which one's first?

Think before you answer!


Hi Cafe Latte!

Thank you for writing!

The "first" one would be defined as the one with the lowest "appointmentID" as sorted numerically.

Is this what you're asking?

Richard

#4  bobhairgrove, September 19th, 2012, 01:55 PM
Quote:
Originally Posted by heyrichard
I currently have the following MySQL query that finds rows in a table that have identical values for "tutor", "startTime", "endTime", and "date":

I would like to expand this query to delete all but the first occurrence of each of the duplicates.

Hopefully this will be a job carried out only once! After you have taken care of this, you should add a unique constraint on those columns in your table. Otherwise, you'll have to clean things up over and over again...
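
A minimal sketch of such a constraint, assuming the table and column names from the first post; the index name uq_reservation_slot is made up for illustration. It should only be run after the duplicates are gone, since MySQL rejects the ALTER with a duplicate-key error while duplicate rows still exist.
Code:
-- Hypothetical index name; columns are the four that define a duplicate.
ALTER TABLE reservations
  ADD UNIQUE KEY uq_reservation_slot (tutor, `date`, startTime, endTime);
Note that a MySQL unique index still allows several rows that contain NULL in one of these columns, so this only helps if the columns are always filled in.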

#5  heyrichard, September 19th, 2012, 03:25 PM
Quote:
Originally Posted by bobhairgrove
Hopefully this will be a job carried out only once! After you have taken care of this, you should add a unique constraint on those columns in your table. Otherwise, you'll have to clean things up over and over again...


It is a one-time job, and it's not my original code. But I'm still looking for ideas on the find-and-delete part if anyone has any!

#6  bobhairgrove, September 19th, 2012, 03:41 PM
Try this:
Code:
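-- Step 1: copy one row per (date, tutor, startTime, endTime) group,
-- keeping the lowest appointmentID.
-- NOTE: only these five columns are copied; any other columns in
-- reservations would be lost.
CREATE TABLE tmp_reservations AS
SELECT
  `date`, -- a MySQL keyword (though not strictly reserved) ... I would choose another name
  tutor,
  startTime,
  endTime,
  MIN(appointmentID) AS appointmentID
FROM reservations
GROUP BY `date`, tutor, startTime, endTime;

-- Step 2: empty the original table. TRUNCATE cannot be rolled back,
-- so take a backup first.
TRUNCATE TABLE reservations;

-- Step 3: reload the de-duplicated rows.
INSERT INTO reservations (appointmentID,`date`,tutor,startTime,endTime)
SELECT appointmentID,`date`,tutor,startTime,endTime
FROM tmp_reservations;

-- Step 4: drop the scratch table.
DROP TABLE tmp_reservations;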
Caveat emptor, I didn't try this at home!

#7  cafelatte, September 19th, 2012, 05:24 PM
Or just...
Code:
DELETE x 
  FROM my_table x
  JOIN my_table y
    ON y.id = x.id
   AND y.date < x.date;

#8  r937, September 19th, 2012, 05:31 PM
Quote:
Originally Posted by cafelatte
Or just...
um... no


#9  heyrichard, September 20th, 2012, 06:22 AM
Quote:
Originally Posted by r937
um... no



Since there doesn't seem to be a good solution without creating a new table, is there a way, in the original request, to have MySQL output both duplicated appointmentIDs? That way, I could manually construct a deletion run, I guess...
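
One way to get every appointmentID of each duplicate group in a single row is GROUP_CONCAT. This is only a sketch, assuming the table and column names from the first post; the aliases keep_id and all_ids are illustrative.
Code:
-- keep_id: the lowest appointmentID (the "first" occurrence to keep)
-- all_ids: every appointmentID in the group, lowest first
SELECT tutor, `date`, startTime, endTime,
       COUNT(*) AS cnt,
       MIN(appointmentID) AS keep_id,
       GROUP_CONCAT(appointmentID ORDER BY appointmentID) AS all_ids
FROM reservations
GROUP BY tutor, `date`, startTime, endTime
HAVING COUNT(*) > 1
ORDER BY cnt ASC;
GROUP_CONCAT output is cut off at group_concat_max_len (1024 bytes by default), which should be plenty for a handful of IDs per group but is worth knowing if a group is very large.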

#10  r937, September 20th, 2012, 06:29 AM
Quote:
Originally Posted by heyrichard
Since there doesn't seem to be a good solution without creating a new table...
i dispute this assertion

cafelatte was on the right track but didn't get the self-join conditions right
Code:
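-- Delete every row x for which an identical booking y with a lower
-- appointmentID exists, so the lowest appointmentID in each group survives.
-- (Table and column names are taken from the first post.)
DELETE x 
  FROM reservations x
  JOIN reservations y
    ON y.tutor = x.tutor
   AND y.date = x.date
   AND y.startTime = x.startTime
   AND y.endTime = x.endTime
   AND y.appointmentID < x.appointmentID;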
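
Before running a DELETE like this, the same self-join can be run as a SELECT to preview the rows that would be removed. A sketch, assuming the table and column names from the first post (reservations, appointmentID) in place of the my_table / id placeholders:
Code:
-- Lists each row that has an identical booking with a lower appointmentID,
-- i.e. exactly the rows the DELETE would remove.
-- DISTINCT collapses repeats when a group holds three or more rows.
SELECT DISTINCT x.*
FROM reservations x
JOIN reservations y
  ON y.tutor = x.tutor
 AND y.`date` = x.`date`
 AND y.startTime = x.startTime
 AND y.endTime = x.endTime
 AND y.appointmentID < x.appointmentID
ORDER BY x.appointmentID;
The = comparisons never match NULL, so rows with a NULL in any of the four columns are not treated as duplicates here, even though GROUP BY would put them in the same group. If that matters, MySQL's null-safe operator <=> could be used instead of = in the join.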

#11  heyrichard, September 20th, 2012, 06:41 AM
Quote:
Originally Posted by r937
i dispute this assertion


Thank you for disputing and helping! I'm out to play with this now. I appreciate it and understand how it works, so thank you.
