#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2008
    Posts
    72
    Rep Power
    7

    Advanced SQL delete command


    I have a DB for instance like below

    id
    company
    salesman
    saledate

    I need a SQL string that would delete oldest records for each salesman over 10 records. So if company1, salesman "John Doe" had sales everyday from 1-1 to 1-15 the records for 1-1 through 1-5 would be deleted.

    Also this would need to not be for a specific salesman, it would need to be for ALL salesman with one SQL command, so that any of their sale records over 10 would be deleted, oldest first.

    Is this possible? Can anybody give me some tips please?
  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,280
    Rep Power
    4279
    Originally Posted by wakerunner
    Is this possible?
    yes, although i gotta ask... why do you want to do this at all???
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2008
    Posts
    72
    Rep Power
    7
    Originally Posted by r937
    yes, although i gotta ask... why do you want to do this at all???
    My example a simplified version of what I'm trying to do, those are not the actual fields or numbers. I have a table that keeps a history of sales records for salesmen, I do not want to keep the history forever because of the number of records involved it would make the DB very large very quickly. I only want to keep the 1000 newest records for each salesman. Currently I am using the command below for each salesman, and it works but slows down my processing considerably when processing thousands of salesmen.

    delete from saleshistory WHERE company=@company and storenumber =@storenumber and salesman =@salesman and id NOT IN (SELECT TOP 1000 id FROM saleshistory where company=@company and storenumber =@storenumber and salesman =@salesman ORDER BY saledate DESC)

    I want to be able to run the command either for all companies / storenumbers/salesmen, or for a specific company/storenumber and for all salesmen for that company/storenumber. So far everything I have tried does not delete the oldest for each salesmen, but rather the oldest records period.

    Maybe I'm going about this all wrong? Any help would be appreciated.
  6. #4
  7. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,280
    Rep Power
    4279
    Originally Posted by wakerunner
    I do not want to keep the history forever because of the number of records involved it would make the DB very large very quickly.
    very large is hundreds of millions of rows

    a few hundred thousand rows is not large

    i think you might be trying to solve a problem that isn't really a problem
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  8. #5
  9. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,280
    Rep Power
    4279
    i am so sorry, i accidentally deleted your post instead of replying to it

    (i have moderator privileges and accidentally clicked on Edit instead of Reply)

    if you would like to re-post your last post, i will do my best not to mess it up again
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2008
    Posts
    72
    Rep Power
    7
    Originally Posted by r937
    i am so sorry, i accidentally deleted your post instead of replying to it

    (i have moderator privileges and accidentally clicked on Edit instead of Reply)

    if you would like to re-post your last post, i will do my best not to mess it up again
    A couple thousand companies X 200 salesmen each x 60 sales day... adds up to billions a year.

    If you don't know how to do what I'm trying to do or can't give me another way of doing it maybe somebody else can?
  12. #7
  13. Wiser? Not exactly.
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    May 2001
    Location
    Bonita Springs, FL
    Posts
    5,959
    Rep Power
    4035
    I do not believe there is any single query that will do precisely what you want. You'll have to create a stored procedure that you can run which will do the proper delete statements for the given parameters.

    For example:
    Code:
    CREATE PROCEDURE cleanSales(@company int, @store int, @person int) AS 
    BEGIN
    	--Get a list of company, storenumber, salesman that we use to go through the delete
    	DECLARE delCursor CURSOR FOR
    	SELECT company, storenumber, salesman 
    	from saleshistory 
    	where
    		(@company is null OR company = @company)
    		AND (@store is null or storenumber = @store)
    		AND (@person is null or salesman = @person)
    	
    	DECLARE @del_company int, @del_store int, @del_person int
    	OPEN delCursor
    	FETCH FROM delCursor INTO @del_company, @del_store, @del_person
    	WHILE @@FETCH_STATUS=0
    	BEGIN
    		DELETE FROM saleshistory
    		WHERE
    			company=@del_company
    			AND storenumber=@del_store
    			AND salesman=@del_person
    			AND id NOT IN (
    				SELECT TOP 1000 id
    				FROM saleshistory
    				WHERE
    					company=@del_company
    					AND storenumber=@del_store
    					AND salesman=@del_person
    				ORDER BY
    					salesdate DESC
    			)
    		FETCH FROM delCursor INTO @del_company, @del_store, @del_person
    	END
    	
    	CLOSE delCursor
    	DEALLOCATE delCursor
    END
    The main issue is that, as far as I know, there is no way to get a TOP X list for each group, TOP X always applies to the whole result set, not individual groups. Since you want the top 1000 for each individual person you need to run a delete for each person.

    The above procedure queries first for a list of company/stores/salesmen based on the passed parameters, then goes through and issues a delete for each one.


    It would be much simpler if you did your cutoff by date rather than trying to keep the last x records. Such as maybe delete everything older than a week or whatever.
    Recycle your old CD's, don't just trash them



    If I helped you out, show some love with some reputation, or tip with Bitcoins to 1N645HfYf63UbcvxajLKiSKpYHAq2Zxud
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2008
    Posts
    72
    Rep Power
    7
    Great, thanks for the code, I thought something like this would be required. I'll try it out and see if its faster than what I have now... and yes, it would be much easier to do it by date older than X, but unfortunately this will not work in this situation, although I may end up having to do something like that just to speed up the process, give up a little extra hard drive space to save time.

    Thank you!
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2007
    Posts
    168
    Rep Power
    7
    Code:
    --generate some test data
    SELECT id, name , 'dt'=dateadd(dd,-colid,getdate()) 
    INTO #t1 FROM syscolumns 
    
    --The delete
    delete #t1 
    from   #t1 a
    join  (SELECT *, rowid=dense_rank () over (partition by id order by id,dt desc) FROM #t1)b
      on  rowid>10 -- keep last 10
      and b.id=a.id
      and b.dt=a.dt
    With large volums I suggest deleting in batches e.g.
    Code:
    declare @del_batch_size int, @rowcnt int
    select @del_batch_size=10, @rowcnt=@del_batch_size
    set rowcount @del_batch_size
    while @del_batch_size = @rowcnt
    begin
    delete #t1 
    from   #t1 a
    join  (SELECT *, rowid=dense_rank () over (partition by id order by id,dt desc) FROM #t1)b
      on  rowid>10 -- keep last 10
      and b.id=a.id
      and b.dt=a.dt
    	select @rowcnt=@@rowcount
    end
    set rowcount 0

IMN logo majestic logo threadwatch logo seochat tools logo