Dev Shed Forums > Databases > MySQL Help

#1  mwaterous, November 1st, 2009, 03:18 PM
With ROLLUP not accurately summing up rows?

I might have missed something in MySQL's documentation about the proper usage of WITH ROLLUP. My impression was that it adds a summary row containing totals for the data returned by a SELECT statement.

Here's the Query:

mysql Code:
    SELECT SUBSTR( timestamp, 1, 10 ) AS timestamp,
           COUNT( DISTINCT ip ) AS visitors,
           COUNT( timestamp ) AS pageviews
    FROM `%s`
    WHERE feed = ''
      AND spider = ''
      AND preserved IS NULL
    GROUP BY SUBSTR( timestamp, 1, 10 ) WITH ROLLUP


And the results are obviously not accurate totals:

Code:
timestamp    | visitors | pageviews
2009-10-31   |        8 |        14
2009-11-01   |       72 |       149
NULL         |       78 |       163


72 + 8 = 78? Maybe according to my high school math teacher, but I've learned since that 2 + 8 = 10, which means...

#2  r937, November 1st, 2009, 03:31 PM
try it with COUNT instead of COUNT DISTINCT and see if that gives you a hint

#3  mwaterous, November 1st, 2009, 04:01 PM
The problem is that I need the DISTINCT clause for the query to return the results I'm looking for. Without it, the two columns return duplicate results.

If there's a way to get the same type of result that simply adds the total of all columns with no other bias than straight math, I'd be interested in knowing.

#4  r937, November 1st, 2009, 04:42 PM
Quote:
Originally Posted by mwaterous
The problem is that I need the DISTINCT clause for the query to return the results I'm looking for.
that's what it's giving you -- COUNT DISTINCT on all results

#5  mwaterous, November 1st, 2009, 05:29 PM
Quote:
Originally Posted by r937
that's what it's giving you -- COUNT DISTINCT on all results


Ohhh, wait, I get what you mean. Two of the IPs from the second day also appear on the first day, so the number of distinct IPs across both days is actually only 78.
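
To make the arithmetic concrete, here is a minimal Python sketch. The IP addresses are made up; only the counts (8 on the first day, 72 on the second, 2 shared) are chosen to match the numbers in this thread.

```python
# Hypothetical visitor IPs per day; the addresses are invented,
# only the set sizes mirror the result posted above.
day1 = {f"10.0.0.{i}" for i in range(1, 9)}    # 8 distinct IPs on 2009-10-31
day2 = {f"10.0.0.{i}" for i in range(7, 79)}   # 72 distinct IPs on 2009-11-01

shared = day1 & day2                   # IPs seen on both days
per_day_sum = len(day1) + len(day2)    # "straight math" on the visitors column
overall_distinct = len(day1 | day2)    # what COUNT(DISTINCT ip) yields on the ROLLUP row

print(len(shared), per_day_sum, overall_distinct)
```

The ROLLUP row re-evaluates COUNT(DISTINCT ip) over all rows rather than adding up the per-day values, which is why it lands on the smaller number.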

So, back to my original question... is there a version of ROLLUP I could use that would instead return the plain sum of each column, unaffected by the DISTINCT in the SELECT?

#6  r937, November 1st, 2009, 05:43 PM
sorry, there is only one version of WITH ROLLUP, and the options are: WITH ROLLUP

#7  pabloj, November 2nd, 2009, 04:39 AM
Wrap it into another select and apply the rollup to the external one.
Something like
sql Code:
SELECT
  aa.time_stamp,
  SUM(aa.visitors) total_visitors,
  SUM(aa.pageviews) total_pageviews
FROM
(
  -- inner query: per-day counts, no ROLLUP here
  SELECT
    SUBSTR( timestamp, 1, 10 ) AS time_stamp,
    COUNT( DISTINCT ip ) AS visitors,
    COUNT( timestamp ) AS pageviews
  FROM
    `%s`
  WHERE
    feed = ''
    AND spider = ''
    AND preserved IS NULL
  GROUP BY
    SUBSTR( timestamp, 1, 10 )
) aa
GROUP BY aa.time_stamp
WITH ROLLUP  -- the summary row now adds up the per-day values
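
For anyone who wants to try the wrap-and-sum idea without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. It rests on several assumptions: the table name `hits` and the sample rows are invented, the WHERE filters are left out, and because SQLite has no WITH ROLLUP the summary row is appended with UNION ALL instead. It shows the same principle, not the exact MySQL statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, just enough to show overlapping IPs across days.
conn.execute("CREATE TABLE hits (timestamp TEXT, ip TEXT)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?)",
    [
        ("2009-10-31 09:00:00", "10.0.0.1"),
        ("2009-10-31 09:05:00", "10.0.0.1"),
        ("2009-10-31 10:00:00", "10.0.0.2"),
        ("2009-11-01 08:00:00", "10.0.0.2"),
        ("2009-11-01 08:30:00", "10.0.0.3"),
    ],
)

# Inner query: per-day distinct visitors and pageviews.
per_day = """
    SELECT SUBSTR(timestamp, 1, 10) AS time_stamp,
           COUNT(DISTINCT ip)       AS visitors,
           COUNT(timestamp)         AS pageviews
    FROM hits
    GROUP BY SUBSTR(timestamp, 1, 10)
"""

# Outer query: SQLite lacks WITH ROLLUP, so the total row is built with UNION ALL.
query = f"""
    SELECT time_stamp, visitors, pageviews FROM ({per_day}) AS d
    UNION ALL
    SELECT NULL, SUM(visitors), SUM(pageviews) FROM ({per_day}) AS t
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

In this toy data the summary row reports 4 visitors even though only 3 distinct IPs exist overall, which is exactly the "straight math" total asked for earlier in the thread.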

#8  mwaterous, November 2nd, 2009, 02:34 PM
Quote:
Originally Posted by pabloj
Wrap it into another select and apply the rollup to the external one.


That worked great, thanks!

The difference is obviously invisible to the naked eye, but does anybody have a good summary of the overhead of wrapping a SELECT inside another SELECT? Is it negligible, or, given that I have to loop through each row anyway, is it faster to just do the counting in PHP?

I realize that's probably not as simple as yes or no, given possible factors, but I've only just recently started learning how to write more complex SQL queries to try and reduce the workload on my actual PHP scripts... so I wonder here and there whether I'm actually optimizing things or just placing more load on the SQL server haha... :P
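
For comparison, the application-side alternative (totalling while looping over the rows) might look roughly like the sketch below. It is written in Python rather than PHP, and the rows are hard-coded from the result posted at the top of the thread instead of being fetched from the database.

```python
# Per-day rows as the query without ROLLUP would return them
# (values copied from the result posted at the top of the thread).
rows = [
    ("2009-10-31", 8, 14),
    ("2009-11-01", 72, 149),
]

total_visitors = 0
total_pageviews = 0
for day, visitors, pageviews in rows:
    # A real script would render the row here; this sketch only accumulates.
    total_visitors += visitors
    total_pageviews += pageviews

print(total_visitors, total_pageviews)
```

Since the loop already touches every row, the extra additions cost next to nothing; whether that beats the nested SELECT depends on the data and is best measured rather than guessed.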

#9  pabloj, November 3rd, 2009, 04:10 AM
Start by looking at the EXPLAIN output for that query.
You might want to check this blog post as far as performance of "with rollup" goes.
