#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2005
    Posts
    2
    Rep Power
    0

    Problems with in-built sort()


    Hi,

    I have an array of arrays in the form of
    list = [[3,'fork',0.3,1],[2,'fork,0.1,2],[3,'exec',0.2,2]]

    The in-built sort(),list.sort() sorts on the first element, if the first elts are equal then it sorts on the second elt and so on...But i really dont want to search on the second elt if the first elts are equal...the 1-D lists shud be left in the same position i.e. i want the sorted list to be [[2,'fork',0.1,2],[3,'fork,0.3,1],[3,'exec',0.2,2]] and not [[2,'fork',0.1,2],[3,'exec',0.2,2],[3,'fork,0.3,1]].

    I tried the following code:

    def mysort(x,y):
    return x[0]-y[0]

    list.sort(mysort)

    This somehow seems to work for a small subset of my input...but not for my real input which has more than 2000 sub-lists(and so obviously i cant trace each pass..). Can someone tell me what i'm missing?
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    London, England
    Posts
    1,585
    Rep Power
    1373
    Your code should work for any size array so long as the first element of each sublist are always numbers - if you had one starting with a string it would throw an exception. It is safer to use the builtin cmp function rather than do a subtraction:
    Code:
    def mysort(x,y):
       return cmp(x[0], y[0])
    However using a comparison function significantly slows down sorting. If you are using Python 2.4 you can use the new 'key' argument to sort, which is a function called on each element that returns a value to compare. so your code now becomes:

    Code:
    def mysort(x):
        return x[0]
    
    list.sort(key=mysort)
    This is much faster than using a comparison function since it is only called once for each element, instead of for every comparison.

    Dave - The Developers' Coach
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2005
    Posts
    2
    Rep Power
    0
    Thanks for the reply..but i forgot to mention that im using Python 2.2...i also tried using cmp(x[0],y[0]), but it did not work either....so my problem still remains unsolved.
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Location
    Toronto
    Posts
    3
    Rep Power
    0
    Try using the decorate-sort-undecorate idiom, sort of explained here:
    http://www.python.org/moin/SortingListsOfDictionaries

    Basically you want to create a list of touples where the first element is the thing you want to use for sorting and the 2nd element is the original element, and then use the built in sort, and then get back your data:

    Code:
    decorated = [(a[1],a) for a in list]
    decorated.sort()
    sorted_list = [b[1] for b in decorated]
    You can modify the first step to search for the proper key to sort on if it is not always found in a[1].
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    London, England
    Posts
    1,585
    Rep Power
    1373
    I am curious as to why cmp(x[0], y[0]) did not work. Did it not sort it in the order you expected, or did it throw an exception?

    The standard decorate/sort/undecorate pattern will not work in this case - the sort function will sort on the x[0] element first, but if two of them are equal it will then sort on the rest of the arrays - the end result will be the same as if we had just done a straight sort.

    The solution is to use the cmp function, or to decorate with the original index. This is tricky to do in Python 2.2, since it does not have the enumerate function that was introduced in 2.3.

    This should do the trick (but is untested, so don't take my word for it).
    Code:
    decorated = []
    index = 0
    for a in list:
       decorated.append((a[0],index, a))
       index += 1
    
    decorated.sort()
    sorted_list = [b[2] for b in decorated]
    this will preserve the original order when two arrays start with the same value.

    Dave - The Developers' Coach
  10. #6
  11. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
    I'm with Dave, I'm not sure why the cmp() doesn't work for you. Anyway, sorting on the third element gives me the results you're after. Maybe I missed something .

    Code:
    >>> toSort = [[3, 'fork', 0.3, 1], [2, 'fork', 0.1, 2], [3, 'exec', 0.2, 2]]
    >>>                 
    >>> def sortThird(x, y):
    ...     return cmp(x[2], y[2])
    ... 
    >>> toSort.sort(sortThird)
    >>> toSort
    [[2, 'fork', 0.10000000000000001, 2], [3, 'exec', 0.20000000000000001, 2], [3, 'fork', 0.29999999999999999, 1]]
    >>>
    Note: this code is being run on Python 2.3 since I don't have 2.2 anymore . In fact if you can I would suggest upgrading your Python installation.

    Give it a go and see,

    Mark.
    programming language development: www.netytan.com Hula


IMN logo majestic logo threadwatch logo seochat tools logo