Forums: » Register « |  Free Tools |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support |

New Free Tools on Dev Shed!

#1
July 6th, 2013, 03:53 AM
 Larry3
Registered User

Join Date: Jul 2013
Posts: 2
Time spent in forums: 43 m 59 sec
Reputation Power: 0
Optimize nested loops

Hello,

I run into a performance problem in python3 :

I need to iterate over a list twice.

Code:
```mylist = ["45","89","59","47","88","554"]

for x in mylist:
print("x {}".format(x))
o = dna.index(x)
for p in mylist[o:]:
print(p)
```

My goal is to get the values of which the index is greater than the current value index.

The result here :
45 -> 45, 89,59,47,88,554
89 -> 89,59 ...

BUT with a large list the performances are really bad : 5 seconds for a 8000 elements list. (which turns into millions of iterated values)

I am sure I missed something, a structure or whatever to really optimize that.

Any clue ?

Thanks,

#2
July 6th, 2013, 04:33 AM
 zxq9
Contributing User

Join Date: May 2013
Location: Usually Japan when not on contract
Posts: 240
Time spent in forums: 2 Days 11 h 54 m
Reputation Power: 11
I'm not quite sure I understand your case here (your snippet includes an undefined list "dna"...) but perhaps this would at least get you thinking a different direction?
python Code:
 Original - python Code
```for i in range(len(mylist)):    print('{0} -> {1}'.format(mylist[i], ','.join(mylist[i:])))
```

or maybe the point isn't really to print them, just to arrange them? If so maybe generating a new list of strings would be better (for later file I/O, for example):
python Code:
 Original - python Code
```q = ['{0} -> {1}'.format(mylist[i], ','.join(mylist[i:])) for i in range(len(mylist))]
```

I'm much more familiar with Python 2 efficiency issues and there are about a million slick ways to arrange iterables in Python. But hopefully this will spark some ideas from other lurkers or in your own mind.

#3
July 6th, 2013, 06:46 AM
 Larry3
Registered User

Join Date: Jul 2013
Posts: 2
Time spent in forums: 43 m 59 sec
Reputation Power: 0
@zxq9

If you think I can make it really faster in python 2.x, I can switch

And yes I am sorry : dna stands for "dynamic new array" and i replaced it with "mylist" to be clearer.

The intention is rather simple actually.

Let's make this clearer :
I won't reuse your code right now since I am not sure how to use it.
I have a dict.
i want to feed this dict with the values found with your code.

The values in the dict are never the same.

Code:
```#THE DICTIONARY
listdict = {}

#THE TEMP LIST WHICH WILL KEEP THE TEMP VALUES

#MY LIST OF NUMBERS
#mylist = ["45","89","59","47","88","554"]
mylist = ["45","89","78"]

# -1 BECAUSE WE WILL INCREMENT BY ONE AND START AT 0. COULD HAVE SET IT AT 0 AND INCREMENT AT THE END BUT...
i = -1

#ITERATOR
for x in mylist:

#SET/RESET THE LIST
tmp = []
i += 1
g = -1

for p in mylist:
g += 1

if g > i:

tmp.append(p)

listdict[x] = tmp

print(listdict)
```

This code does exactly what I need.
But it not good performance-wise.

How could I implement your solution here, if it fits ?

Thanks !

#4
July 6th, 2013, 01:38 PM
 dwblas
Contributing User

Join Date: May 2009
Posts: 412
Time spent in forums: 5 Days 5 h 27 m 28 sec
Reputation Power: 32
You can first sort the list so once you find the first "greater than" the rest are also greater. You can also reduce the number of lookups by looking at the middle value of the sorted list (removes 1/2 of the list items), then the one-quarter or three-quarters value, depending on the middle value compare, etc. until you get down to 10 or so of the list items left which can be compared individually as it would take about as much time as additional lookups.

Also, you could divide the list, so if you have values from 1 to 8000, divide it into 8 lists of 1000 each and that would mean a maximum of 1000 lookups=approximately 1/8 of what you are doing now, but would probably require a list or dictionary to keep track of the starting and ending values of each list and and some if statements to test each start and end
Code:
```if value >= start and value <= end:
## look for first greater than
elif start > value:  ```
Last, a third way is to sort the list and go through it once taking each value, placing it in a diictionary, and calculating the number of items that are greater than (using the length and position you are at now). Post what you come up with for additional help as none of these are difficult to code so don't spend too much time trying to figure it out.

Last edited by dwblas : July 6th, 2013 at 01:56 PM.

#5
July 6th, 2013, 08:55 PM
 zxq9
Contributing User

Join Date: May 2013
Location: Usually Japan when not on contract
Posts: 240
Time spent in forums: 2 Days 11 h 54 m
Reputation Power: 11
Quote:
 Originally Posted by Larry3 If you think I can make it really faster in python 2.x, I can switch

There are some things that are faster in Python2, but if efficiency is your main concern then writing the slow-performing module in C is really the way to go.
Quote:
 I won't reuse your code right now since I am not sure how to use it.

From your explanation above, I think this is what you're after:
python Code:
 Original - python Code
```# This is all you needmylist = ["45","89","59","47","88","554"]d = dict([(mylist[i], ','.join(mylist[i:])) for i in range(len(mylist))]) # Let's make it prettyfrom pprint import pprintpprint(d)
```

This produces the following output:
python Code:
 Original - python Code
```{'45': '45,89,59,47,88,554', '47': '47,88,554', '554': '554', '59': '59,47,88,554', '88': '88,554', '89': '89,59,47,88,554'}
```

You will notice that the order of the elements in the dictionary is random (well, with pprint() it is actually sorted by key alphabetically). If we walk down mylist and print each element in order, we get:
python Code:
 Original - python Code
```>>> for x in mylist:...   print(x, ':', d[x])... 45 : 45,89,59,47,88,55489 : 89,59,47,88,55459 : 59,47,88,55447 : 47,88,55488 : 88,554554 : 554
```

The important line is this one:
python Code:
 Original - python Code
```d = dict([(mylist[i], ','.join(mylist[i:])) for i in range(len(mylist))])
```

The outermost function is dict(). It takes a list of tuples and returns a dictionary where each element "x[0]" from the list becomes a key and each matching element "x[1]" becomes a value.

Inside that we are doing a list comprehension (a much more efficient way of building a list than with a loop -- more efficient both in execution and the mind once you learn the idiom). List comprehensions are lists, but instead of specifying each element in the list (which you do one at a time when you loop and append()) you write your list [ ] and inside you describe what should go in the list. In the bit above we are creating a list of tuples (to pass to dict()).
python Code:
 Original - python Code
```#Two identical results:mylist = ["45","89","59","47","88","554"] #Loop version (slow, verbose, not considered Pythonic)list_a = []for i in range(len(mylist):    list_a.append((mylist[i], ','.join(mylist[i:]))) # List comprehension version (optimized, considered Pythonic)list_b = [(mylist[i], ','.join(mylist[i:])) for i in range(len(mylist))]
```

The part with "range(len(mylist))" gives us a generator that will yield an integer, incrementing it by 1 all the way to the max size passed to it.
python Code:
 Original - python Code
```# The C-style wayi = 0while i < 10:  print(i)  i += 1 # Python wayfor i in range(10):  print(i)
```

The tuple we are creating (mylist[i], ','.join(mylist[i:])) is the value at i as member x[0], and a string containing every member from i to the end of the list connected by commas as member x[1]. The join() method on a string is a little weird in form, but very useful (and much faster than doing "foo" + "bar"). It accepts a list of strings and concatenates them together with its principal value in the middle.
python Code:
 Original - python Code
```>>> 'z'.join(mylist)'45z89z59z47z88z554'>>> '...'.join(mylist)'45...89...59...47...88...554'>>> '...'.join(mylist[2:5])'59...47...88'>>> (mylist[1], ','.join(mylist[1:]))('89', '89,59,47,88,554')
```

Last edited by zxq9 : July 7th, 2013 at 01:00 AM.

 Viewing: Dev Shed Forums > Programming Languages > Python Programming > Optimize nested loops