Loop through each file in folder and look for specific character strings
March 4th, 2013, 04:22 PM
Registered User
Join Date: Mar 2013
Posts: 8
Time spent in forums: 2 h 1 m 38 sec
Reputation Power: 0
Loop through each file in folder and look for specific character strings
Hi,
First off, I should mention that I have never programmed anything in Python, but I would like to write this little program to learn more about the language.
I currently work with VBA within Excel, but I would like to know whether I can do this job with Python.
I have an Excel file with 600 lines. Those lines contain the specific character strings I want to search for in each .txt file in a folder.
Here is my current code in VBA. I commented the code a bit so that you can understand each step.
To start, how do I input my 600 values in Python in order to look for them in each file?
Secondly, how do I rewrite the following code in Python?
Code:
Sub FINDlinesinfolder()
    MsgBox ("Please choose the folder")
    Application.ScreenUpdating = False
    With Application.FileDialog(msoFileDialogFolderPicker)
        .AllowMultiSelect = False
        .Show
        If .SelectedItems.Count > 0 Then
            fd = .SelectedItems(1)
        End If
    End With
    fn = Dir(fd & "\" & "*.*")
    Set ws1 = Workbooks("myvalue.xls").Sheets(1)
    'I set my sheet that contains the values I want to look at
    ws1.Cells(1, 17) = "Found in following file"
    ws1.Cells(1, 18) = "found on following line"
    Do While fn <> "" ' I loop through each file
        Set ws2 = Workbooks.Open(fd & "\" & fn).Sheets(1)
        lr2 = ws2.Cells.Find(What:="*", After:=[A1], SearchDirection:=xlPrevious).Row
        For i = 1 To 600 ' I loop through my 600 values
            mydate = Right(ws1.Cells(i, 2), 4) & Mid(ws1.Cells(i, 2), 3, 2) & Left(ws1.Cells(i, 2), 2)
            myaccount = WorksheetFunction.Substitute(ws1.Cells(i, 5), "-", "")
            myamount = WorksheetFunction.Substitute(ws1.Cells(i, 3), ",", "")
            'My values are a combination of the formatted value of 3 cells
            'I could input only the end result in python
            For y = 1 To lr2
                If mydate <> "" Then
                    If ws2.Cells(y, 1) Like "*" & mydate & "*" & myaccount & "*" & myamount & "*" Then
                        ws1.Cells(i, 17) = fn
                        ws1.Cells(i, 18) = y
                    End If
                End If
            Next y
            'If the line in the text file is like "wildcard" & mydate & "wildcard" & myaccount & "wildcard" & myamount then write filename and line in my original excel file
        Next i ' loop each line
        ws2.Parent.Close False
        fn = Dir
    Loop 'loop each file
End Sub
Hope you understand what I am trying to do.
Thank you for your help and time.

March 4th, 2013, 07:14 PM
Registered User
Code
This is my first attempt at a simplified version of what I want to do. I want to loop through each file in a folder and print the filename if a line contains the text 'mytest'.
The only thing is that I am unable to make it run.
Can you please help me?
Code:
>>> import os
rootdir='c:\test\'
def myscan(line):
    return line
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        f=open(file, 'r')
        lines=f.readlines()
        for line in lines:
            if "mytest" in line: print f.path
        f.close()

March 4th, 2013, 08:23 PM
Contributing User
Code:
import os
rootdir='c:\\test\\' # The backslashes are a problem.
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        with open(file, 'r') as f:
            lines=f.readlines()
        for line in lines:
            if "mytest" in line:
                print f.path
# In unix, I'd use this command
# find root_path -type f -exec grep --silent mytest {} \; -print
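A short illustration of why the backslashes matter, as a sketch rather than part of the original reply. In an ordinary Python string literal, `\t` is a tab character, and a literal such as `'c:\test\'` does not even compile because `\'` escapes the closing quote. The variable names below are made up for the example.

```python
# '\t' inside an ordinary string literal becomes a single tab character.
broken = 'c:\test'

# Doubling each backslash, as in the post above, stores real backslashes.
escaped = 'c:\\test'

# A raw string keeps backslashes literally. Note that a raw string still
# cannot end with a single backslash, so the trailing one is left off here.
raw = r'c:\test'

print(repr(broken))   # shows the tab as \t
print(repr(escaped))  # shows the stored backslash, doubled by repr
print(escaped == raw)
```

Either `escaped` or `raw` can be passed to `os.walk`; the trailing backslash is not required.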

March 4th, 2013, 08:45 PM
Registered User
Thank you for the reply.
I get an invalid syntax error at f.path, with f highlighted.
Can you please help me solve this problem?

March 5th, 2013, 01:42 AM
Registered User
Got it to work, but I still have a lot of questions.
Code:
import os
rootdir='c:\\test\\' #rootdir seems to be the directory where my python project (.py file) is located and not c:\test\. How do I solve this?
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        with open(file, 'r') as f:
            lines=f.readlines()
            for line in lines:
                if "aaa" in line:
                    with open("Output.txt", "w") as text_file:text_file.write(f.name)
        f.close() #my loop does not seem right because even though I have multiple files that contain 'aaa' it only prints one
Also, since I am very much a beginner, I would really appreciate it if you could help me write the final program. I learned quite a lot of VBA just by looking at other people's code, and I would be grateful if you could help me with this.
What I want it to do (I realize that my first post might not have been comprehensible) is this:
#1 I will have a .txt file (myvalues.txt) that contains 600 lines, and on each line there will be three values separated by a space.
Let's call those three variables mydate, myaccount and myamount.
#2 Open each file in a specified folder and read each line from it.
#3 If the opened file contains a line matching the following string built from myvalues.txt (I will use "*" as a wildcard and & to join the strings, even though I'm not sure this is how you do it in Python): "*" & mydate & "*" & myaccount & "*" & myamount & "*", then print the file name to an output file named output.txt.
#4 Loop through each file.
Hope this is clearer and you can help me with this.
Thank you for your help and time.

March 5th, 2013, 10:45 AM
Contributing User
instead of `f.path' use `f.name', since name is an attribute of file objects.
instead of `print string' use `print(string)', as this works in both python 2 and python 3.
Where you have `f.close()', remove it. The with context has already closed your file f.
Maybe you should open this file in append mode. What you've got overwrites the file each time you execute the statement
with open("Output.txt", "w")
Restructuring is better.
Therefore:
Code:
import os
rootdir='c:\\test\\' #rootdir seems to be the directory where is located my python project (.py file) and not c:\test\. How do I solve this.
with open("Output.txt", "w") as text_file:
    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            with open(file, 'r') as f:
                lines=f.readlines()
            # f is closed when the code finishes the block
            for line in lines:
                if "aaa" in line:
                    text_file.write(f.name)
                    break # I assume you don't need to write the name for each occurrence of the target string
# python closes text_file when you get back to this indentation level.
Untested. My untested code almost never works.
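To make the point about file modes concrete, here is a minimal sketch (not from the original reply) comparing "w" and "a" when the same file is opened twice. The helper name `write_twice` is hypothetical and exists only for this illustration.

```python
def write_twice(path, mode):
    """Open `path` two separate times in `mode`, write one line each time,
    then return the lines the file ends up containing."""
    for text in ("first\n", "second\n"):
        with open(path, mode) as f:
            f.write(text)
    with open(path) as f:
        return f.read().splitlines()
```

With mode "w" each open truncates the file, so only the last line survives; with mode "a" on a file that did not exist before, both lines are kept. Opening the output file once, outside the loop, avoids the question entirely.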

March 5th, 2013, 07:22 PM
Registered User
Thank you for the reply.
Two concerns:
The program only writes one filename in output.txt, even though I have multiple files with the 'aaa' string in them.
The program reads files in the directory where the program file (.py) is. I want it to read data from c:\test\
Can you help me solve this?

March 5th, 2013, 08:54 PM
Contributing User
# the file is in subdir. Use os.path.join
with open(os.path.join(subdir,file), 'r') as f:
# you'll probably want a separator between the files
# listed in the output.
text_file.write(f.name+'\n')
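Putting the two fragments above into a complete program might look like the following sketch for Python 3. The function names `find_files_containing` and `write_report` are made up for this example, and `errors="replace"` is an assumption added so that a file with unexpected bytes does not stop the scan.

```python
import os

def find_files_containing(rootdir, needle):
    """Return the full paths of files under rootdir that have at least
    one line containing the text `needle`."""
    matches = []
    for subdir, dirs, files in os.walk(rootdir):
        for name in files:
            # os.walk yields bare file names; join with subdir to get a
            # path that works regardless of the current directory.
            path = os.path.join(subdir, name)
            with open(path, "r", errors="replace") as f:
                # any() stops at the first matching line, like `break`.
                if any(needle in line for line in f):
                    matches.append(path)
    return matches

def write_report(paths, out_path):
    """Write one path per line; the file is opened once, so nothing is
    overwritten between matches."""
    with open(out_path, "w") as out:
        for p in paths:
            out.write(p + "\n")
```

A call such as `write_report(find_files_containing('c:\\test', 'aaa'), 'Output.txt')` would then cover the case discussed in this thread. If the output file lives inside the scanned folder, it will be scanned on the next run as well.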

March 5th, 2013, 09:42 PM
Registered User
Works like a charm, thanks for the reply.
Now let's say I have a text file (named myvalues.txt) with the following contents:
aaa bbb ccc
ddd eee fff
ggg hhh iii
Instead of having if "aaa" in line:
how do I get something like this:
myline(x) # This is to illustrate a variable that would contain the value of line x, which I could loop over
if "*" & left(myline(x)) & "*" & mid(myline(x),5,3) & "*" & right(myline(x),3) & "*" #Here & is used to join strings together and "*" is a wildcard (not sure how you do this in Python)
How do I code this?
Thank you for your help and time with this.
Really appreciated.

March 6th, 2013, 09:57 PM
Registered User
bump...

March 6th, 2013, 10:21 PM
Contributing User
Input:
aaa bbb ccc
ddd eee fff
ggg hhh iii
What is the output you desire?
I can't derive a sane meaning from
if "*" & left(myline(x)) & "*" & mid(myline(x),5,3) & "*" & right(myline(x),3) & "*"
The nonsense I see:
) not connected in any way to the input
) "*"&left
why would there be anything to the left of left? (or to the right of right)
Returning to your first post, you had
mydate = Right(ws1.Cells(i, 2), 4) & Mid(ws1.Cells(i, 2), 3, 2) & Left(ws1.Cells(i, 2), 2)
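For comparison, VBA's Right, Mid and Left map onto Python string slices, and WorksheetFunction.Substitute from the first post maps onto str.replace. The sketch below assumes the date cell holds eight characters in ddmmyyyy form, which the thread does not actually state; the function name `vba_style_key` and the sample values are hypothetical.

```python
def vba_style_key(date_cell, account_cell, amount_cell):
    """Rebuild the three search values the way the VBA code does."""
    # Right(s, 4) -> s[-4:], Mid(s, 3, 2) -> s[2:4], Left(s, 2) -> s[:2]
    mydate = date_cell[-4:] + date_cell[2:4] + date_cell[:2]
    # Substitute(s, "-", "") -> s.replace("-", "")
    myaccount = account_cell.replace("-", "")
    myamount = amount_cell.replace(",", "")
    return mydate, myaccount, myamount

print(vba_style_key("04032013", "12-345", "1,250.00"))
```

Note that VBA's Mid counts from 1 while Python slices count from 0, which is why Mid(s, 3, 2) becomes s[2:4].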
Study the re module.
You can join python strings with + or with the join method.
>>> A='abc'
>>> A += 'def'
>>> ' , '.join((A,'ghi'))
'abcdef , ghi'
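One way the re module and join could be combined for the wildcard match is sketched below. It is an interpretation of what the question seems to ask, not something confirmed in the thread: each line of the values file is split on whitespace and the pieces must appear in that order with anything in between. The function names `load_patterns` and `line_matches` are made up for the example.

```python
import re

def load_patterns(values_path):
    """Compile one pattern per non-empty line of the values file."""
    patterns = []
    with open(values_path) as f:
        for line in f:
            parts = line.split()
            if parts:
                # re.escape keeps characters such as '.' literal;
                # '.*' between the parts plays the role of VBA's "*".
                joined = ".*".join(re.escape(p) for p in parts)
                patterns.append(re.compile(joined))
    return patterns

def line_matches(patterns, line):
    """True if any pattern is found somewhere in the line."""
    # search() scans the whole line, so no leading or trailing
    # wildcard is needed in the pattern itself.
    return any(p.search(line) for p in patterns)
```

In the file-scanning loop, `if "aaa" in line:` would then become `if line_matches(patterns, line):`, with `patterns` loaded once before the loop starts.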

March 7th, 2013, 01:30 PM
Registered User
Join Date: Mar 2013
Location: Santa Clara, CA
Posts: 5
Time spent in forums: 2 h 51 m 16 sec
Reputation Power: 0
On a side note, I've been developing a board of Python resources related to looping. Do you have any recommendations to add to it? http://www. verious.com/board/AKumar/looping-in-python/ (Sorry, I can't post links yet)