### Thread: Check if Items in a List are Part of Items in A Bigger List

1. No Profile Picture
Registered User
Devshed Newbie (0 - 499 posts)

Join Date
Feb 2013
Posts
15
Rep Power
0

#### Check if Items in a List are Part of Items in A Bigger List

list1 = ['a', 'b', 'c', 'd', 'e']
list2 = ['b', 'c']

I want to compare list1 and list2. All elements of list2 have to be in list1 but not vice versa.

Also, an element in list2 can not have more occurrences than list1 does. So if b shows up two times in list1, list 2 can not have 3 or more.

Any ideas?

Thanks
2. No Profile Picture
Contributing User
Devshed Newbie (0 - 499 posts)

Join Date
Oct 2012
Posts
194
Rep Power
3
This seems more of a job for sets than lists. This of course makes the assumption that all items in your lists will be unique but:
Code:
```>>> first = {'a', 'b', 'c', 'd', 'e'}
>>> second = {'b', 'c'}
>>> first.issubset(second)
False
>>> second.issubset(first)
True
>>>```
If the values are allowed multiple times as you seem to want, you will have to do some dancing around with dictionaries and counting items. Make an attempt and post back.

-Mek
3. No Profile Picture
Registered User
Devshed Newbie (0 - 499 posts)

Join Date
Feb 2013
Posts
15
Rep Power
0
Originally Posted by Mekire
This seems more of a job for sets than lists. This of course makes the assumption that all items in your lists will be unique but:
Code:
```>>> first = {'a', 'b', 'c', 'd', 'e'}
>>> second = {'b', 'c'}
>>> first.issubset(second)
False
>>> second.issubset(first)
True
>>>```
If the values are allowed multiple times as you seem to want, you will have to do some dancing around with dictionaries and counting items. Make an attempt and post back.

-Mek
Will do, I will try it out and get back to you. Thanks for the help, much appreciated.
4. No Profile Picture
Contributing User
Devshed Newbie (0 - 499 posts)

Join Date
Dec 2012
Posts
80
Rep Power
3
*whistles away to find his list intersection function*
*comes back smiling*

Okay, so awhile back I defined a function to find the intersection of two lists:

Code:
```def intersect(a,b):
#Gives the intersection of two sets, a and b.
out=list(v for v in a if v in b)
for v in out:
if out.count(v) > min(a.count(v),b.count(v)):
out.remove(v)
return out```
The code can easily be modified to do what you want. This one will do the counts also, and doesn't rely on dictionaries or any of that.

Code:
```def inclusive(little,big):
#Figures out if all items in the list 'little' are represented in the list 'big'
if not set(little).issubset(set(big)): #Simple test, to determine if little list has extra inclusions
return False
for item in set(little): #set() makes sure we only check once.
if little.count(item) > big.count(item):
return False #The little list has more instances of 'item' than it should.
return True #If these tests are passed, the list is purely inclusive.```
5. No Profile Picture
Contributing User
Devshed Newbie (0 - 499 posts)

Join Date
Oct 2012
Posts
194
Rep Power
3
This one will do the counts also, and doesn't rely on dictionaries or any of that.
You are missing the point of counting with dictionaries I believe.

Your code does not scale. It is O(N^2). Try using it on much bigger lists and you will see what I mean. However, there is actually an even bigger problem. You are changing the size of the list out while you are iterating over it. This is actually giving incorrect results in certain cases. You need to iterate over a copy of the list if you plan to change its size mid-iteration.

Let's compare the intersect function with one that counts first using a dictionary:
python Code:
```import timeit
from random import randint

def count_items(mylist):
count_dict = {}
for item in mylist:
count_dict[item] = count_dict.get(item,0)+1
return count_dict

def intersect(a,b):
result = []
a_dict = count_items(a)
b_dict = count_items(b)
for item in a_dict:
if item in b_dict:
for i in xrange(min(a_dict[item],b_dict[item])):
result.append(item)
return result

def mr909_intersect(a,b):
#Gives the intersection of two sets, a and b.
out=list(v for v in a if v in b)
for v in out:
if out.count(v) > min(a.count(v),b.count(v)):
out.remove(v)
return out```
First let's see an example of an incorrect result. We'll test with this:
python Code:
```if __name__ == "__main__":
a = [randint(0,1000) for _ in xrange(1000)]
b = [randint(0,1000) for _ in xrange(1000)]
test = intersect(a[:],b[:])
print("My intersect:")
print(sorted(test))
print("Length: {}\n".format(len(test)))
test = mr909_intersect(a[:],b[:])
print(sorted(test))
print("Length: {}\n".format(len(test)))```
Code:
```>>>
My intersect:
[2, 4, 13, 15, 22, 25, 31, 32, 35, 35, 36, 38, 38, 41, 42, 45, 45, 46, 48, 49, 51, 51, 52, 58, 61, 61, 62, 66, 66, 67, 69, 69, 70, 71, 76, 79, 79, 79, 86, 86, 90, 90, 92, 93, 95, 96, 96, 98, 99, 100, 101, 102, 104, 105, 107, 109, 110, 113, 115, 116, 117, 118, 131, 131, 132, 134, 135, 135, 137, 138, 138, 144, 145, 147, 150, 152, 152, 153, 163, 167, 173, 173, 175, 178, 179, 180, 184, 185, 185, 185, 189, 192, 192, 193, 198, 199, 200, 202, 204, 205, 207, 208, 211, 214, 215, 217, 219, 229, 229, 231, 234, 235, 238, 242, 244, 245, 246, 249, 250, 252, 254, 259, 260, 263, 263, 265, 265, 269, 269, 271, 274, 279, 279, 280, 280, 281, 282, 284, 284, 293, 296, 297, 298, 300, 301, 301, 305, 305, 308, 312, 313, 316, 318, 318, 321, 323, 330, 334, 338, 341, 343, 348, 348, 349, 351, 352, 355, 355, 356, 359, 361, 373, 373, 374, 374, 375, 378, 379, 379, 380, 384, 385, 391, 393, 394, 395, 398, 400, 401, 403, 405, 406, 408, 408, 409, 411, 412, 413, 416, 417, 419, 423, 424, 426, 427, 427, 429, 431, 432, 440, 440, 440, 441, 444, 447, 449, 451, 453, 453, 457, 458, 461, 463, 465, 465, 466, 469, 469, 469, 472, 473, 474, 479, 481, 481, 486, 487, 488, 494, 498, 499, 500, 503, 504, 505, 506, 506, 509, 510, 510, 514, 515, 518, 520, 521, 524, 527, 530, 530, 531, 532, 533, 534, 538, 539, 545, 546, 548, 548, 550, 551, 553, 556, 559, 561, 564, 565, 565, 567, 571, 573, 575, 576, 578, 582, 584, 586, 586, 589, 589, 592, 598, 600, 603, 605, 606, 606, 608, 608, 611, 612, 620, 630, 630, 631, 635, 635, 637, 638, 643, 644, 648, 649, 652, 652, 653, 656, 658, 663, 665, 666, 666, 667, 667, 669, 670, 671, 674, 675, 678, 681, 683, 684, 685, 693, 696, 698, 700, 701, 703, 704, 706, 711, 713, 714, 715, 715, 715, 716, 717, 718, 719, 720, 722, 723, 724, 725, 731, 732, 734, 742, 743, 747, 748, 749, 750, 751, 752, 752, 756, 758, 759, 760, 763, 765, 767, 769, 773, 774, 779, 788, 789, 790, 791, 794, 795, 797, 802, 805, 806, 807, 811, 814, 815, 816, 818, 819, 825, 825, 827, 838, 840, 841, 842, 843, 844, 844, 847, 848, 849, 850, 850, 853, 855, 856, 858, 862, 865, 865, 867, 872, 873, 875, 878, 879, 881, 883, 886, 887, 888, 889, 890, 893, 894, 897, 897, 900, 902, 904, 904, 905, 906, 907, 909, 909, 911, 912, 913, 914, 915, 918, 922, 925, 928, 930, 930, 931, 931, 934, 939, 939, 942, 951, 953, 954, 955, 961, 963, 967, 977, 979, 981, 985, 986, 986, 991, 994, 997, 998]
Length: 479

[2, 4, 13, 15, 22, 25, 31, 32, 35, 35, 36, 38, 38, 41, 42, 45, 45, 46, 48, 49, 51, 51, 52, 58, 61, 61, 62, 66, 66, 67, 69, 69, 70, 71, 76, 79, 79, 79, 86, 86, 90, 90, 92, 93, 95, 96, 96, 98, 99, 99, 100, 101, 102, 104, 105, 107, 109, 110, 113, 115, 116, 117, 118, 131, 131, 132, 134, 135, 135, 137, 138, 138, 144, 145, 147, 150, 152, 152, 153, 163, 167, 173, 173, 175, 175, 178, 179, 180, 184, 185, 185, 185, 189, 192, 192, 193, 198, 199, 200, 202, 204, 205, 207, 208, 211, 214, 215, 217, 219, 229, 229, 231, 234, 235, 238, 242, 244, 245, 246, 249, 250, 252, 254, 259, 260, 263, 263, 265, 265, 269, 269, 271, 274, 279, 279, 280, 280, 281, 282, 284, 284, 293, 296, 297, 298, 300, 301, 301, 305, 305, 308, 312, 313, 316, 318, 318, 321, 323, 330, 334, 338, 341, 343, 343, 348, 348, 349, 351, 352, 355, 355, 356, 359, 361, 373, 373, 374, 374, 375, 378, 379, 379, 380, 384, 385, 391, 393, 394, 395, 398, 400, 401, 403, 405, 406, 408, 408, 409, 411, 412, 413, 416, 417, 419, 423, 424, 424, 424, 426, 427, 427, 429, 431, 432, 440, 440, 440, 441, 444, 447, 449, 451, 453, 453, 457, 458, 461, 463, 465, 465, 466, 469, 469, 469, 472, 473, 474, 479, 481, 481, 486, 487, 487, 488, 494, 498, 499, 500, 503, 504, 505, 506, 506, 509, 510, 510, 514, 515, 518, 520, 521, 524, 527, 530, 530, 531, 532, 533, 534, 538, 539, 545, 546, 548, 548, 550, 551, 553, 556, 559, 561, 564, 565, 565, 567, 571, 573, 575, 576, 578, 582, 584, 586, 586, 589, 589, 592, 598, 600, 603, 605, 606, 606, 608, 608, 611, 612, 620, 630, 630, 631, 635, 635, 637, 638, 643, 644, 648, 649, 652, 652, 653, 656, 658, 663, 665, 666, 666, 667, 667, 669, 670, 671, 674, 675, 678, 681, 683, 684, 685, 693, 696, 698, 700, 701, 703, 704, 706, 711, 713, 714, 715, 715, 715, 716, 717, 718, 719, 720, 722, 723, 724, 725, 731, 732, 734, 742, 743, 747, 748, 749, 750, 751, 752, 752, 756, 758, 759, 760, 763, 765, 767, 769, 773, 774, 779, 788, 789, 790, 791, 794, 795, 797, 802, 805, 806, 807, 811, 814, 815, 816, 818, 819, 825, 825, 827, 838, 840, 841, 842, 843, 844, 844, 847, 848, 849, 850, 850, 853, 855, 856, 858, 862, 865, 865, 867, 872, 873, 875, 878, 879, 881, 883, 886, 887, 888, 889, 890, 893, 894, 897, 897, 900, 902, 904, 904, 905, 906, 907, 909, 909, 911, 912, 913, 914, 915, 918, 922, 925, 928, 930, 930, 931, 931, 934, 939, 939, 942, 951, 953, 954, 955, 961, 963, 967, 967, 977, 979, 981, 985, 986, 986, 991, 994, 997, 998]
Length: 486```
The first result that differs between our methods is 99 (you found 2, I found 1). But...
Code:
```>>> a.count(99)
2
>>> b.count(99)
1```
Now we can fix that easily by itterating over out[:] instead of out. So lets fix it and check timing. Using the code found at the top (save for the fix) and timing with this code:
python Code:
```if __name__ == "__main__":
a = [randint(0,1000) for _ in xrange(1000)]
b = [randint(0,1000) for _ in xrange(1000)]

times = 10
print("Starting")
T = timeit.Timer("intersect(a[:],b[:])","from __main__ import a,b,intersect")
print("My intersect: {:.4f} s".format(T.timeit(times)/times))
T2 = timeit.Timer("mr909_intersect(a[:],b[:])",
"from __main__ import a,b,mr909_intersect")
print("Mr909 intersect: {:.4f} s".format(T2.timeit(times)/times))```
Code:
```>>>
Starting
My intersect: 0.0010 s
Mr909 intersect: 0.0548 s
>>>```
Now let's ramp up the ranges and see how it scales. Change a and b to this:
python Code:
```a = [randint(0,100000) for _ in xrange(10000)]
b = [randint(0,100000) for _ in xrange(10000)]```
and now:
Code:
```>>>
Starting
My intersect: 0.0079 s
Mr909 intersect: 3.1739 s
>>>```
The list methods count and index are O(N). If you iterate over a list and use index or count on the elements every loop, your code instantly goes to O(N^2). This is why people use the dict approach.

Sorry about the mile long post,
-Mek

#### Comments on this post

• zxq9 agrees : An explanation and example worth referencing later.
• Mr909 agrees : Eh, point taken. Your first code didn't do it, I just wanted to provide one that did. But your complete example is better. Thank you.
6. No Profile Picture
Registered User
Devshed Newbie (0 - 499 posts)

Join Date
Feb 2013
Posts
15
Rep Power
0
Thanks for all the input...

Finally got around to it, and finally finished the code. What I was trying to do is create a list of all words that can be made from a given set of letters.

Here is my final piece:

Code:
```def get_letters(letter): # makes sure all characters are letters and removes any numbers, whitespaces or other special characters
letters = []
for l in letter:
if l.isalpha():
letters.append(l)
return letters

def check_word(letters, word): # checks to see if the potential word is a subset of the initial list
w = []
for let in word:
w.append(let)
w.sort()
word_set = set(word)
letters_set = set(letters)

if word_set.issubset(letters_set):
#print(word, 'is subset')
return w
else:
#print(word, 'is not subset')
return False

def get_word():
letter = input('enter your letters ') # get set of letters to generate a word list from
letters = get_letters(letter)
letters.sort()
fh = open('words.txt') #opens dictionary of words
words = []
for line in fh: #puts those words into list
words.append(line.rstrip())

possibles = [] #start a list of words that can be made of the inputed letters
for w in words:
if len(w) > 3: #add to the list if the letters are larger than n letters long
word = check_word(letters, w)
if word is False:
pass
else:
pass

count = 0

#print('w is', w)
#print(); print()
for item in set(w): # check if the potential word can be made with the letters entered
if word.count(item) > letters.count(item):
break
elif count + 1== len(set(w)) and word.count(item) <= letters.count(item): #if at the end of iteration and no occurs more times than initial letter list
#print('appending', w)
possibles.append(w)
else:
count += 1
#print('count is', count)
else:
pass
print(possibles)
get_word()```

Might be messy but did the job. Any comments or suggestions?

Thanks again
7. No Profile Picture
Contributing User
Devshed Newbie (0 - 499 posts)

Join Date
Oct 2012
Posts
194
Rep Power
3
I couldn't actually get yours to perform correctly, but regardless it is a bit overcomplicated.

Take a look at the following (python 3). I use Counter from the collections module in the standard library here, but it counts in a way identical to my previous post:
python Code:
```from collections import Counter

def get_letters(user_input):
return [char.lower() for char in user_input if char.isalpha()]

def is_subset(word,letters):
word_dict = Counter(word)
letter_dict = Counter(letters)
for item in word_dict:
if (item not in letter_dict) or (word_dict[item] > letter_dict[item]):
return False
return True

def get_possible_words():
letters = input("Enter your letters: ")
letters = get_letters(letters)
possible = set()
with open("words.txt") as myfile:
for word in myfile:
word_parse = get_letters(word)
if is_subset(word_parse,letters):
return possible

if __name__ == "__main__":
print(get_possible_words())```

-Mek
8. No Profile Picture
Registered User
Devshed Newbie (0 - 499 posts)

Join Date
Feb 2013
Posts
15
Rep Power
0
Originally Posted by Mekire
I couldn't actually get yours to perform correctly, but regardless it is a bit overcomplicated.

Take a look at the following (python 3). I use Counter from the collections module in the standard library here, but it counts in a way identical to my previous post:
python Code:
```from collections import Counter

def get_letters(user_input):
return [char.lower() for char in user_input if char.isalpha()]

def is_subset(word,letters):
word_dict = Counter(word)
letter_dict = Counter(letters)
for item in word_dict:
if (item not in letter_dict) or (word_dict[item] > letter_dict[item]):
return False
return True

def get_possible_words():
letters = input("Enter your letters: ")
letters = get_letters(letters)
possible = set()
with open("words.txt") as myfile:
for word in myfile:
word_parse = get_letters(word)
if is_subset(word_parse,letters):