|
|
|||||||||
|
|||||||||
| |||||||||
|
|
|
| |||||||||
![]() |
|
|
«
Previous Thread
|
Next Thread
»
|
Thread Tools | Search this Thread | Rate Thread | Display Modes |
|
|
|
Stop making mediocre tutorials.The best tutorials are video! Camtasia Studio makes it easy to create engaging, buzz-building screen videos at any size, in any popular format. Download the free trial!
|
|
#1
|
|||
|
|||
|
SQL Modal Problem
Hi,
I have a form that just displays some data regarding web logs. My SQL is not too hot so i hope somebody can point me in the right direction. I have a DB table that amongst others has a server name and an execution times column. There are thousands of records each for every second within a particular week. I basically want to be able to select for each distinct server name the most popular execution time. For example if for server 'A' there are 5 records with an execution time of 0.5secs and 4 records with an execution time of 0.35secs then i want to return a record with the execution time of 0.5secs. However i want to be able to do this for every different occurrence of server name in my table. So i want to return records for server name 'B' and server name 'C' etc. Here is the SQL i have at the moment but it only returns the most popular or most occurring execution time in the table not the most popular execution time for EVERY server name in my table. For example Server_Name | Execution_Time ------------------------------------- A | 0.5 <code> SELECT DISTINCT instance, exec_time AS modal FROM Llserverlogs GROUP BY instance, exec_time HAVING (COUNT(*) >= ALL (SELECT COUNT(*) FROM llserverlogs GROUP BY instance, exec_time)) </code> Hope somebody can help and thanks in davance for the help. |
|
#2
|
||||
|
||||
|
Code:
select X.instance
, X.exec_time AS modal
from Llserverlogs as X
inner
join Llserverlogs as Y
on X.instance
= Y.instance
group
by X.instance
, X.exec_time
having X.exec_time
= max(Y.exec_time)
|
|
#3
|
|||
|
|||
|
r937, thanks for the reply, i have posted this problem on a few different forums u r the first one to offer a solution.
However there is a problem. Unfortunately my table has hundreds of thousands of records and the query you posted takes many, many seconds to query. Is it possible for you to refine it so that the timeout property of the DB does not kick in. Thanks for you help. |
|
#4
|
||||
|
||||
|
i am so sorry, the query i gave you was woefully incomplete
you will need a temp table Code:
create temporary table instance_counts
select instance
, exec_time
, count(*) as time_count
from llserverlogs
group
by instance
, exec_time
then try my query -- Code:
select X.instance
, X.exec_time AS modal
, X.time_count
from instance_counts as X
inner
join instance_counts as Y
on X.instance
= Y.instance
group
by X.instance
, X.exec_time
, X.time_count
having X.time_count
= max(Y.time_count)
|
|
#5
|
|||
|
|||
|
r937, works like an absolute charm. Thanks for your help.
P.S. You dont have any queries for calculating the 'exec_time' median do you? Thanks again. |
|
#6
|
||||
|
||||
|
yep, it's in here -- http://www.oreilly.com/catalog/transqlcook/chapter/ch08.html
not sure if you want to run it on the llserverlogs table or the temp instance_counts table where there's only one row per instance and exec_time |
|
#7
|
|||
|
|||
|
I want to run it on the llserverlogs table. Will try that particular query you are referring to in the link.
Thanks. |
|
#8
|
|||
|
|||
|
Hi r937,
unfortunately the example above times out on my DB could be something to do with the number of records in my table eg 600k+ and rising to 2.5M in a few weeks. Could you have a look at my SQL and see if i am doing something wrong please. Code:
SELECT x.exec_time median
FROM llserverlogs x, llserverlogs y
GROUP BY x.exec_time
HAVING
SUM(CASE WHEN y.exec_time <= x.exec_time
THEN 1 ELSE 0 END)>=(COUNT(*)+1)/2 AND
SUM(CASE WHEN y.exec_time >= x.exec_time
THEN 1 ELSE 0 END)>=(COUNT(*)/2)+1
|
|
#9
|
||||
|
||||
|
you are seeing cross-join effects, as every row in the table is joined to every other row
you need to have a join condition: Code:
from Llserverlogs as X
inner
join Llserverlogs as Y
on X.instance
= Y.instance
Code:
from Llserverlogs as X
, Llserverlogs as Y
where X.instance
= Y.instance
|
|
#10
|
|||
|
|||
|
Code:
SELECT x.exec_time median
FROM Llserverlogs AS X INNER JOIN
Llserverlogs AS Y ON X.instance = Y.instance
GROUP BY x.exec_time
HAVING SUM(CASE WHEN y.exec_time <= x.exec_time THEN 1 ELSE 0 END) >= (COUNT(*) + 1) / 2 AND
SUM(CASE WHEN y.exec_time >= x.exec_time THEN 1 ELSE 0 END) >= (COUNT(*) / 2) + 1
Code:
SELECT x.exec_time median
FROM Llserverlogs AS X, Llserverlogs AS Y
WHERE X.instance = Y.instance
GROUP BY x.exec_time
HAVING SUM(CASE WHEN y.exec_time <= x.exec_time THEN 1 ELSE 0 END) >= (COUNT(*) + 1) / 2 AND
SUM(CASE WHEN y.exec_time >= x.exec_time THEN 1 ELSE 0 END) >= (COUNT(*) / 2) + 1
r937 tried both your suggestions but got a timeout on both. Sorry for being such a pain but anymore suggestions? |
|
#11
|
||||
|
||||
|
no, sorry
|
|
#12
|
||||
|
||||
|
uh, you have indexes on those fields, right?
|
|
#13
|
|||
|
|||
|
r937, no it doesnt have an ID column. Let me explain i am using a script that imports CSV data to a SQL Server 2000 DB table above in the queries. When i did try to add an ID column that incremented automatically it always kept failing on the last record insert. Do i need to have an ID column before using the queries above?
|
|
#14
|
||||
|