I have a list of word pairs in the file result.txt:
and so on. I need to check for pairwise occurrences of each pair in a directory containing multiple files (counting at most one occurrence per file), and print each pair with its frequency count, in decreasing order of frequency.
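import os
import re
import itertools as it
from collections import Counter
from glob import iglob

folderpath = 'path/to/directory'
pairs = Counter()  # (word1, word2) -> number of files containing both words

with open('result.txt', 'r') as logfile:
    loglist = logfile.readlines()

for line in loglist:
    for filepath in iglob(os.path.join(folderpath, '*.txt')):
        with open(filepath, 'r') as filehandle:
            words = set(re.findall(r'\w+', filehandle.read()))
        for pair in it.combinations(re.findall(r'\w+', line), 2):
            # assumption: a pair "occurs" in a file when both words appear anywhere in it
            if pair[0] in words and pair[1] in words:
                pairs[pair] += 1  # counted at most once per file

# (word1, word2, count) tuples, highest count first
resultList = [pair + (occurences,) for pair, occurences in pairs.most_common()]
for word1, word2, occurences in resultList:
    print('%s %s %d' % (word1, word2, occurences))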
The output must be of the form:
group their 205
they is 180
the of 56
and so on...
Please help, I am lost.
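
One possible way to approach the counting described above, as a rough sketch rather than a definitive solution: read each file only once, turn its words into a set, and then test every pair against that set. The function names `count_pair_files`, `format_counts` and `demo` are made up for illustration, and so are the sample file contents. The sketch assumes that a pair "occurs" in a file when both words appear anywhere in it, that matching is case-sensitive, and that the pairs have already been parsed from result.txt into 2-tuples.

```python
import os
import re
import tempfile
from collections import Counter
from glob import iglob


def count_pair_files(pairs, folderpath, pattern='*.txt'):
    """Count, for each word pair, the number of files containing both words."""
    counts = Counter()
    for filepath in iglob(os.path.join(folderpath, pattern)):
        with open(filepath, 'r') as filehandle:
            # A set gives fast membership tests and ignores repeats within a file
            words = set(re.findall(r'\w+', filehandle.read()))
        for first, second in pairs:
            if first in words and second in words:
                counts[(first, second)] += 1  # at most once per file
    return counts


def format_counts(counts):
    """Return 'word1 word2 count' lines, highest count first."""
    return ['%s %s %d' % (first, second, n)
            for (first, second), n in counts.most_common()]


def demo():
    """Build a throwaway directory with made-up contents and count two pairs."""
    samples = {
        'a.txt': 'the group met their friends',
        'b.txt': 'their group is large',
        'c.txt': 'they say it is fine',
    }
    with tempfile.TemporaryDirectory() as folder:
        for name, text in samples.items():
            with open(os.path.join(folder, name), 'w') as handle:
                handle.write(text)
        pairs = [('group', 'their'), ('they', 'is')]
        return format_counts(count_pair_files(pairs, folder))


if __name__ == '__main__':
    for line in demo():
        print(line)
```

With the made-up sample files, `demo()` yields the lines `group their 2` and `they is 1`. Pairs that never occur together in any file are simply absent from the result, because `Counter` only stores the keys that were incremented.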