Oracle Development
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsDatabasesOracle Development

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
  #1  
Old October 14th, 2005, 06:04 AM
Mark Baker Mark Baker is online now
Contributing User
Dev Shed Novice (500 - 999 posts)
 
Join Date: Nov 2001
Location: North West UK
Posts: 651 Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level) 
Time spent in forums: 1 Week 1 Day 13 h 22 m 26 sec
Reputation Power: 42
Query that combines multiple date range records into a single record

I have a particularly knotty problem at the moment that is driving me insane trying to work out a solution, and hope that somebody on the forum might have had a similar issue before that they've been able to resolve.

Working on an HR system with data held on an Oracle 9 database, I have an absences table with the following key fields:
  • Person_ID
  • Start_Date
  • End_Date
and I need to count the frequency of absences for each individual over a given period.

It seems simple enough:
SELECT Person_ID,
Count(*)
FROM Absence_Table
GROUP BY Person_ID

Except that there's a couple of issues with the data that make it rather harder.

Absences may be entered as a single record for the entire period; so
Code:
Start_Date=1-Jan-2005 and End_Date=14-Jan-2005
or they may be entered as multiple records (typically blocks based on Saturday to Friday), so
Code:
Start_Date=1-Jan-2005 and End_Date=3-Jan-2005
Start_Date=4-Jan-2005 and End_Date=10-Jan-2005
Start_Date=11-Jan-2005 and End_Date=14-Jan-2005
giving three records that need to be counted as a single entry.

And just for luck, I can get multiple records covering the same absence period, or that overlap periods of absence; so
Code:
Start_Date=1-Jan-2005 and End_Date=14-Jan-2005
Start_Date=4-Jan-2005 and End_Date=10-Jan-2005
or
Code:
Start_Date=1-Jan-2005 and End_Date=10-Jan-2005
Start_Date=7-Jan-2005 and End_Date=14-Jan-2005

which should obly be counted as one absence.

I can't clean up the core data, but I need to work out a select that will filter all these ambiguities.

Code:
select * from (
select A1.date_start ,
   A1.date_end
from tbl_absences A1
 where A1.person_id = 65955
   and not exists ( select A2.person_id
                           from tbl_absences A2
                        where A2.person_id = A1.person_id
                           and A2.date_start = A1.date_end + 1 )
  and not exists ( select A3.person_id
                          from owbrun.per_absence_attendances A3
                        where A3.person_id = A1.person_id
                           and A3.date_end = A1.date_start - 1 )
MINUS
select A1.date_start,
         A2.date_end,
  from tbl_absences A1 ,
         tbl_absences A2
 where A1.person_id = 65955
    and A2.person_id = A1.person_id
    and A2.date_start = A1.date_end + 1 )
 order by date_start ,
        date_end
;

Which works well for two consecutive entries, but 3 consecutive entries cause problems.

Likewise
Code:
select Person_ID
         MIN(date_start),
         MAX(date_end),
 from ( select date_start,
                   date_end,
                   date_start - rank() over ( partition by Person_ID
                                                      date_end,
                                              order by date_start ) group_col
            from tbl_absences )
 group by person_id,
             group_col
 order by person_id,
             group_col
;

Filters out double bookings with a matching start date, but can't handle true overlap bookings.

Any help at fixing my logic, or suggestions for a totally new approach would be appreciated.
__________________
---
Mark Baker
Development Projects: PHPExcel


Reply With Quote
  #2  
Old October 14th, 2005, 09:44 AM
Mark Baker Mark Baker is online now
Contributing User
Dev Shed Novice (500 - 999 posts)
 
Join Date: Nov 2001
Location: North West UK
Posts: 651 Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level) 
Time spent in forums: 1 Week 1 Day 13 h 22 m 26 sec
Reputation Power: 42
As an update, I've been working on the query and come up with the following SQL that breaks the date ranges down into 'virtual' records fro each individual date in the range, filters out the duplicates, and then recombines the remaining individual date records into rows for uniquely consecutive ranges.
Code:
SELECT person_id ,
       reason , 
       MIN(ActualDate) start_date ,
       MAX(ActualDate) end_date
  FROM ( SELECT person_id ,
                reason ,
                ActualDate ,
                ActualDate - rank() OVER ( PARTITION BY person_id ,
                                                        reason
                                               ORDER BY ActualDate ) group_col
           FROM ( SELECT abs.person_id ,
                         abst.name                  Reason ,
                         DateRange.DateDay + TO_DATE('01-Jan-2000','DD-Mon-YYYY') ActualDate
                    FROM tbl_absences abs ,
                         tbl_absence_reasons  abst ,
                         ( SELECT ROWNUM - 1 DateDay
                             FROM sys.all_objects
                            WHERE ROWNUM <= TO_DATE('31-Dec-2020','DD-Mon-YYYY') - TO_DATE('01-Jan-2000','DD-Mon-YYYY') ) DateRange
                    WHERE DateRange.DateDay + TO_DATE('01-Jan-2000','DD-Mon-YYYY') >= abs.date_start (+)
                      AND DateRange.DateDay + TO_DATE('01-Jan-2000','DD-Mon-YYYY') <= abs.date_end (+)
                      AND abs.person_id = 65955
                      AND abs.absence_reason_id = abst.absence_reason_id
                    GROUP BY abs.person_id ,
                             abst.name ,
                             DateRange.DateDay ) )
 GROUP BY person_id ,
          reason ,
          group_col
;

The only change from my original request for help is that I've factored in the absence reason code that I hadn't mentioned previously.


It works exactly as required, but with one problem.
The inner select:
Code:
SELECT ROWNUM - 1 DateDay
  FROM sys.all_objects
 WHERE ROWNUM <= TO_DATE('31-Dec-2020','DD-Mon-YYYY') - TO_DATE('01-Jan-2000','DD-Mon-YYYY') ) DateRange

generates a complete list of day records for every date between 01-Jan-2000 and 31-Dec-2020, which is extremely time consuming, as it occurs for every absence record on a large database; but if I try to replace TO_DATE('01-Jan-2000','DD-Mon-YYYY') and TO_DATE('31-Dec-2020','DD-Mon-YYYY') with the actual start and end dates from the absence record, Oracle complains about abs.date_start being an invalid identifier.


I've brain overloaded on the basic query so far, and can't see the wood for the trees any more. Can anybody help me flush in the correct date values so I'm doing a minimum of work with the day generator loop?

Reply With Quote
  #3  
Old October 14th, 2005, 10:01 AM
Mark Baker Mark Baker is online now
Contributing User
Dev Shed Novice (500 - 999 posts)
 
Join Date: Nov 2001
Location: North West UK
Posts: 651 Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level)Mark Baker User rank is Sergeant Major (2000 - 5000 Reputation Level) 
Time spent in forums: 1 Week 1 Day 13 h 22 m 26 sec
Reputation Power: 42
Smile

Forget my witterings, I'm really brain-burned and the answer was so obvious when I looked at my SQL again:
Code:
SELECT person_id ,
	   reason , 
	   MIN(ActualDate) start_date ,
	   MAX(ActualDate) end_date
  FROM ( SELECT person_id ,
  	   	 		reason ,
				ActualDate ,
				ActualDate - rank() OVER ( PARTITION BY person_id ,
							  		 	  			    reason
											   ORDER BY ActualDate ) group_col
		   FROM ( SELECT abs.person_id ,
	   	   		  		 abst.name 	   		  Reason ,
	   					 DateRange.DateDay + abs.date_start ActualDate
  					FROM owbrun.per_absence_attendances abs ,
  	   					 owbrun.per_absence_attendance_types  abst ,
  	   					 ( SELECT ROWNUM - 1 DateDay
	      				     FROM sys.all_objects
         					WHERE ROWNUM <= 365 ) DateRange
 				   WHERE DateRange.DateDay + abs.date_start >= abs.date_start
   				     AND DateRange.DateDay + abs.date_start <= abs.date_end
   					 AND abs.person_id = 65955
   					 AND abs.absence_attendance_type_id = abst.absence_attendance_type_id
 				   GROUP BY abs.person_id ,
 	   	  		   		 	abst.name ,
							abs.date_start ,
		  					DateRange.DateDay ) )
 GROUP BY person_id ,
  	   	  reason ,
		  group_col
;


Thanks for taking the time to read, but I'm considering this problem closed now.

Reply With Quote
Reply

Viewing: Dev Shed ForumsDatabasesOracle Development > Query that combines multiple date range records into a single record


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump


Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 6 hosted by Hostway
Stay green...Green IT