#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    2
    Rep Power
    0

    Optimize nested loops


    Hello,

    I run into a performance problem in python3 :

    I need to iterate over a list twice.

    Code:
    mylist = ["45","89","59","47","88","554"]
    
    for x in mylist:
        print("x {}".format(x))
        o = dna.index(x)
        for p in mylist[o:]:
            print(p)
    My goal is to get the values of which the index is greater than the current value index.

    The result here :
    45 -> 45, 89,59,47,88,554
    89 -> 89,59 ...

    BUT with a large list the performances are really bad : 5 seconds for a 8000 elements list. (which turns into millions of iterated values)

    I am sure I missed something, a structure or whatever to really optimize that.

    Any clue ?

    Thanks,
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    I'm not quite sure I understand your case here (your snippet includes an undefined list "dna"...) but perhaps this would at least get you thinking a different direction?
    python Code:
    for i in range(len(mylist)):
        print('{0} -> {1}'.format(mylist[i], ','.join(mylist[i:])))

    or maybe the point isn't really to print them, just to arrange them? If so maybe generating a new list of strings would be better (for later file I/O, for example):
    python Code:
    q = ['{0} -> {1}'.format(mylist[i], ','.join(mylist[i:])) for i in range(len(mylist))]

    I'm much more familiar with Python 2 efficiency issues and there are about a million slick ways to arrange iterables in Python. But hopefully this will spark some ideas from other lurkers or in your own mind.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    2
    Rep Power
    0
    @zxq9

    If you think I can make it really faster in python 2.x, I can switch

    And yes I am sorry : dna stands for "dynamic new array" and i replaced it with "mylist" to be clearer.

    My bad !

    The intention is rather simple actually.

    Let's make this clearer :
    I won't reuse your code right now since I am not sure how to use it.
    I have a dict.
    i want to feed this dict with the values found with your code.

    The values in the dict are never the same.

    Code:
    #THE DICTIONARY
    listdict = {}
    
    #THE TEMP LIST WHICH WILL KEEP THE TEMP VALUES
    
    
    #MY LIST OF NUMBERS 
    #mylist = ["45","89","59","47","88","554"]
    mylist = ["45","89","78"]
    
    # -1 BECAUSE WE WILL INCREMENT BY ONE AND START AT 0. COULD HAVE SET IT AT 0 AND INCREMENT AT THE END BUT...
    i = -1
    
    #ITERATOR
    for x in mylist:
        
        #SET/RESET THE LIST
        tmp = []
        i += 1
        g = -1
        
        for p in mylist:
            g += 1
            
            if g > i:
    
                tmp.append(p)
        
        listdict[x] = tmp
    
    print(listdict)
    This code does exactly what I need.
    But it not good performance-wise.

    How could I implement your solution here, if it fits ?

    Thanks !
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2009
    Posts
    492
    Rep Power
    33
    You can first sort the list so once you find the first "greater than" the rest are also greater. You can also reduce the number of lookups by looking at the middle value of the sorted list (removes 1/2 of the list items), then the one-quarter or three-quarters value, depending on the middle value compare, etc. until you get down to 10 or so of the list items left which can be compared individually as it would take about as much time as additional lookups.

    Also, you could divide the list, so if you have values from 1 to 8000, divide it into 8 lists of 1000 each and that would mean a maximum of 1000 lookups=approximately 1/8 of what you are doing now, but would probably require a list or dictionary to keep track of the starting and ending values of each list and and some if statements to test each start and end
    Code:
    if value >= start and value <= end:
        ## look for first greater than
    elif start > value:
    Last, a third way is to sort the list and go through it once taking each value, placing it in a diictionary, and calculating the number of items that are greater than (using the length and position you are at now). Post what you come up with for additional help as none of these are difficult to code so don't spend too much time trying to figure it out.
    Last edited by dwblas; July 6th, 2013 at 12:56 PM.
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    Originally Posted by Larry3
    If you think I can make it really faster in python 2.x, I can switch
    There are some things that are faster in Python2, but if efficiency is your main concern then writing the slow-performing module in C is really the way to go.
    I won't reuse your code right now since I am not sure how to use it.
    From your explanation above, I think this is what you're after:
    python Code:
    # This is all you need
    mylist = ["45","89","59","47","88","554"]
    d = dict([(mylist[i], ','.join(mylist[i:])) for i in range(len(mylist))])
     
    # Let's make it pretty
    from pprint import pprint
    pprint(d)

    This produces the following output:
    python Code:
    {'45': '45,89,59,47,88,554',
     '47': '47,88,554',
     '554': '554',
     '59': '59,47,88,554',
     '88': '88,554',
     '89': '89,59,47,88,554'}

    You will notice that the order of the elements in the dictionary is random (well, with pprint() it is actually sorted by key alphabetically). If we walk down mylist and print each element in order, we get:
    python Code:
    >>> for x in mylist:
    ...   print(x, ':', d[x])
    ... 
    45 : 45,89,59,47,88,554
    89 : 89,59,47,88,554
    59 : 59,47,88,554
    47 : 47,88,554
    88 : 88,554
    554 : 554

    The important line is this one:
    python Code:
    d = dict([(mylist[i], ','.join(mylist[i:])) for i in range(len(mylist))])

    The outermost function is dict(). It takes a list of tuples and returns a dictionary where each element "x[0]" from the list becomes a key and each matching element "x[1]" becomes a value.

    Inside that we are doing a list comprehension (a much more efficient way of building a list than with a loop -- more efficient both in execution and the mind once you learn the idiom). List comprehensions are lists, but instead of specifying each element in the list (which you do one at a time when you loop and append()) you write your list [ ] and inside you describe what should go in the list. In the bit above we are creating a list of tuples (to pass to dict()).
    python Code:
    #Two identical results:
    mylist = ["45","89","59","47","88","554"]
     
    #Loop version (slow, verbose, not considered Pythonic)
    list_a = []
    for i in range(len(mylist):
        list_a.append((mylist[i], ','.join(mylist[i:])))
     
    # List comprehension version (optimized, considered Pythonic)
    list_b = [(mylist[i], ','.join(mylist[i:])) for i in range(len(mylist))]

    The part with "range(len(mylist))" gives us a generator that will yield an integer, incrementing it by 1 all the way to the max size passed to it.
    python Code:
    # The C-style way
    i = 0
    while i < 10:
      print(i)
      i += 1
     
    # Python way
    for i in range(10):
      print(i)

    The tuple we are creating (mylist[i], ','.join(mylist[i:])) is the value at i as member x[0], and a string containing every member from i to the end of the list connected by commas as member x[1]. The join() method on a string is a little weird in form, but very useful (and much faster than doing "foo" + "bar"). It accepts a list of strings and concatenates them together with its principal value in the middle.
    python Code:
    >>> 'z'.join(mylist)
    '45z89z59z47z88z554'
    >>> '...'.join(mylist)
    '45...89...59...47...88...554'
    >>> '...'.join(mylist[2:5])
    '59...47...88'
    >>> (mylist[1], ','.join(mylist[1:]))
    ('89', '89,59,47,88,554')
    Last edited by zxq9; July 7th, 2013 at 12:00 AM.

IMN logo majestic logo threadwatch logo seochat tools logo