#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2005
    Posts
    62
    Rep Power
    9

    Drop duplicates and count different items from list... the correct way?


    I'm using this function that I wrote, but I was wondering if there is a better or faster way to do it. It counts only the different items in a sorted list.
    Code:
    def ListReduce(items):
        # Count case-insensitively distinct items; items must already be sorted.
        previous = None
        counter = 0
        for item in items:
            lowered = item.lower()
            if lowered != previous:
                counter += 1
                previous = lowered
        return counter
    I'm sorting the list with:
    Code:
    lst.sort(key=lambda s: s.lower())
    Cheers!
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    394
    Rep Power
    51
    Hi!

    A better, faster and shorter way is to use a set:
    Code:
    >>> lst = [1,1,2,1,3,4,3,2,3,2,1]
    >>> len(set(lst))
    4
    Regards, mawe
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Nov 2004
    Location
    There where the rabbits jump
    Posts
    556
    Rep Power
    11
    What do you want it sorted by? Give us an end example.
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2005
    Posts
    62
    Rep Power
    9
    Thank you! I didn't know that method!
    But with strings it does not give the result I want:
    Code:
    lst = ["hola","HoLa","hola","HolA","HOLA"]
    print(len(set(lst)))
    It prints 4, not 1 like my function does. I need "AnYCaSE" to be treated the same as "anycase".
    I'll play a bit with your idea and see what I can accomplish... thanks again for your help!
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    394
    Rep Power
    51
    Hi!

    One possibility:
    Code:
    >>> len(set([x.lower() for x in lst]))
    Regards, mawe
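    For anyone reading this on Python 3, here is a minimal sketch of the same idea. The helper names count_distinct_ci and unique_ci are made up for illustration, and using casefold() (Python 3.3+) instead of lower() is my assumption, not something from the posts above. The second helper also covers the "drop duplicates" half of the thread title by keeping the first spelling seen.

```python
# Python 3 sketch; count_distinct_ci and unique_ci are hypothetical helper names.
def count_distinct_ci(items):
    """Count the items that still differ after case folding."""
    # casefold() is a more aggressive lower() intended for caseless matching.
    return len({item.casefold() for item in items})


def unique_ci(items):
    """Drop case-insensitive duplicates, keeping the first spelling seen."""
    seen = set()
    result = []
    for item in items:
        key = item.casefold()
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


lst = ["hola", "HoLa", "hola", "HolA", "HOLA"]
print(count_distinct_ci(lst))  # 1
print(unique_ci(lst))          # ['hola']
```

    Neither helper needs the list to be sorted first, since a set does not care about order.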
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2005
    Posts
    62
    Rep Power
    9
    Originally Posted by monkeyman23555
    What do you want it sorted by? Give us an end example.
    OK, I'll explain a little more about what I'm doing. I'm using two functions: one to transform a tuple into a list and sort it alphabetically, and another to count only the different items in the list.
    Code:
    def Tuple2ListSort(rows):
        # Take the first column of each row and sort it case-insensitively.
        names = [row[0] for row in rows]
        names.sort(key=lambda s: s.lower())
        return names

    def ListReduce(items):
        # Count case-insensitively distinct items; items must already be sorted.
        previous = None
        counter = 0
        for item in items:
            lowered = item.lower()
            if lowered != previous:
                counter += 1
                previous = lowered
        return counter
    So, for example...
    Code:
    # I get this from MySQL
    rows = (('levis',), ('LeviS',), ('Sampier',), ('Sampier',), ('Levis',))
    names = Tuple2ListSort(rows)
    count = ListReduce(names)
    print(count)  # prints 2
    My concern is that the data fetched from MySQL (the tuple) may get very large, and how that would affect performance.
    That's it... thanks again, I appreciate it.
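    A rough sketch of how the two functions above could collapse into a single pass with a set, written for Python 3. The name count_distinct_names is hypothetical, and the rows are just the sample data from this post. Because a set ignores order, the sort step is not needed: the work is roughly one pass over the rows, and the memory used grows with the number of distinct names rather than the number of rows.

```python
# Python 3 sketch; count_distinct_names is a hypothetical helper name.
def count_distinct_names(rows):
    """Count case-insensitively distinct values in the first column of rows."""
    # rows can be any iterable of 1-tuples, e.g. what a DB cursor returns.
    return len({row[0].lower() for row in rows})


rows = (('levis',), ('LeviS',), ('Sampier',), ('Sampier',), ('Levis',))
print(count_distinct_names(rows))  # 2
```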
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2005
    Posts
    62
    Rep Power
    9
    Originally Posted by mawe
    Hi!

    One possibility:
    Code:
    >>> len(set([x.lower() for x in lst]))
    Regards, mawe
    Yes it worked!
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Nov 2003
    Posts
    624
    Rep Power
    34
    Code:
    SELECT LOWER(column) FROM table



    My concern is that the data fetched from MySQL (the tuple) may get very large, and how that would affect performance.
    If you're worrying about performance before performance is a problem, that's premature optimisation, and it's kind of a waste of time - why spend time fixing a problem that isn't yet a problem? Chances are you already have too much to do and not enough time to do it in - why make more work for yourself if you don't have to?

    By the time you've spent another few hours on it, you may well have changed enough that your optimisations make no sense anymore...
    Last edited by sfb; August 26th, 2005 at 05:25 PM.
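    Taking the SQL suggestion one step further, the database can also do the counting, so only a single number comes back instead of every row. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the table name people and column name name are invented. MySQL also supports COUNT(DISTINCT ...) and LOWER(), but the connection code would differ.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")  # hypothetical table/column
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("levis",), ("LeviS",), ("Sampier",), ("Sampier",), ("Levis",)],
)

# Let the database lower-case, de-duplicate and count in one query.
(count,) = conn.execute(
    "SELECT COUNT(DISTINCT LOWER(name)) FROM people"
).fetchone()
print(count)  # 2
conn.close()
```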
  16. #9
  17. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Feb 2005
    Posts
    588
    Rep Power
    64



    Originally Posted by mawe
    Hi!

    One possibility:
    Code:
    >>> len(set([x.lower() for x in lst]))
    Regards, mawe
    Wow, mawe, that was sweet code!
    Just one question, if I may, how do you start an empty set?
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    394
    Rep Power
    51
    Hi!

    Originally Posted by Dietrich
    Just one question, if I may, how do you start an empty set?
    Code:
    >>> s = set()
    >>> s
    set([])


    Regards, mawe
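    A small Python 3 sketch to go with the answer above. The variable names are arbitrary. The one trap worth knowing is that empty braces create a dict, not a set, so set() is the way to start an empty one.

```python
s = set()        # an empty set
d = {}           # careful: this is an empty dict, not a set

s.add("hola")
s.add("hola")    # adding a duplicate has no effect
s.add("adios")

print(type(d).__name__)  # dict
print(len(s))            # 2
```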
