Python Programming
 
#1 nicnad, March 4th, 2013, 04:22 PM
Loop through each file in folder and look for specific character strings

Hi,

First off, I should point out that I have never programmed anything in Python, but I would like to write this small program to learn more about the language.

I currently work with VBA in Excel, and I would like to know whether I could do this job in Python.

I have an Excel file with 600 rows. Each row contains the specific character strings I want to search for in every .txt file in a folder.

Here is my current VBA code. I commented it a bit so that you can follow each step.

To start, how do I get my 600 values into Python so I can look for them in each file?

Secondly, how do I rewrite the following code in Python?

Code:
Sub FINDlinesinfolder()


    MsgBox ("Please choose the folder")



    Application.ScreenUpdating = False

    With Application.FileDialog(msoFileDialogFolderPicker)
        .AllowMultiSelect = False
        .Show
        If .SelectedItems.Count > 0 Then
            fd = .SelectedItems(1)
        End If
    End With



    fn = Dir(fd & "\" & "*.*")

    
        Set ws1 = Workbooks("myvalue.xls").Sheets(1)

'I set my sheet that contains the values I want to look at


        ws1.Cells(1, 17) = "Found in following file"
        ws1.Cells(1, 18) = "found on following line"

        Do While fn <> "" ' I loop through each file



            Set ws2 = Workbooks.Open(fd & "\" & fn).Sheets(1)
            lr2 = ws2.Cells.Find(What:="*", After:=[A1], SearchDirection:=xlPrevious).Row

            For i = 1 To 600 ' I loop through my 600 values

                'My values are a combination of the formatted values of 3 cells
                'I could input only the end result in Python
                mydate = Right(ws1.Cells(i, 2), 4) & Mid(ws1.Cells(i, 2), 3, 2) & Left(ws1.Cells(i, 2), 2)
                myaccount = WorksheetFunction.Substitute(ws1.Cells(i, 5), "-", "")
                myamount = WorksheetFunction.Substitute(ws1.Cells(i, 3), ",", "")

                'If the line in the text file is like "wildcard" & mydate & "wildcard" & myaccount & "wildcard" & myamount,
                'then write the filename and line number in my original Excel file
                For y = 1 To lr2
                    If mydate <> "" Then
                        If ws2.Cells(y, 1) Like "*" & mydate & "*" & myaccount & "*" & myamount & "*" Then
                            ws1.Cells(i, 17) = fn
                            ws1.Cells(i, 18) = y
                        End If
                    End If
                Next y

            Next i ' loop each line

            ws2.Parent.Close False


            fn = Dir



        Loop 'loop each files
 

    End Sub


I hope you understand what I am trying to do.

Thank you for your help and time.
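
For readers following along, here is a minimal sketch of how the VBA formatting and the Like test might look in Python. It assumes the date cell holds eight characters in DDMMYYYY order (the Right/Mid/Left calls suggest this, but the post never says so). The names build_like_pattern and line_matches are hypothetical. The standard-library fnmatch.fnmatchcase treats "*" as a wildcard in much the same way as Like.

```python
import fnmatch


def build_like_pattern(raw_date, raw_account, raw_amount):
    """Mirror the VBA formatting, then wrap the parts in '*' wildcards."""
    # Right(d, 4) & Mid(d, 3, 2) & Left(d, 2): DDMMYYYY -> YYYYMMDD (assumed layout)
    date = raw_date[-4:] + raw_date[2:4] + raw_date[:2]
    account = raw_account.replace("-", "")  # Substitute(..., "-", "")
    amount = raw_amount.replace(",", "")    # Substitute(..., ",", "")
    return "*" + "*".join((date, account, amount)) + "*"


def line_matches(line, pattern):
    # fnmatchcase compares case-sensitively and must match the whole line,
    # so the leading and trailing '*' allow text before and after the values.
    return fnmatch.fnmatchcase(line, pattern)


# Made-up sample values, only for illustration.
pattern = build_like_pattern("04032013", "123-456", "1,250.00")
print(pattern)
print(line_matches("x 20130304 y 123456 z 1250.00\n", pattern))
```

Note that fnmatch also gives "?" and "[...]" a special meaning, so values containing those characters would need different handling.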

#2 nicnad, March 4th, 2013, 07:14 PM
Code

This is my first attempt at a simplified version of what I want to do. I want to loop through each file in a folder and print the filename if a line contains the text 'mytest'.

The only thing is that I am unable to make it run.

Can you please help me?

Code:
>>> import os
rootdir='c:\test\'
def myscan(line):
    return line
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        f=open(file, 'r')
        lines=f.readlines()
        for line in lines:
            if "mytest"
            in line: print f.path
        f.close()
        

#3 b49P23TIvg, March 4th, 2013, 08:23 PM
Code:
import os

rootdir='c:\\test\\'  # The backslashes are a problem.

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        with open(file, 'r') as f:
            lines=f.readlines()
        for line in lines:
            if "mytest" in line:
                print f.path


# In unix, I'd use this command

# find root_path -type f -exec grep --silent mytest {} \; -print
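
To spell out why the backslashes are a problem: in an ordinary Python string literal, \t means a tab and \' means a quote character, so 'c:\test\' never reaches its closing quote and is a syntax error. The short sketch below shows three common ways to write the path. The variable names are only for illustration.

```python
escaped = "c:\\test\\"   # each \\ in the source is one literal backslash
raw = r"c:\test"         # raw string: backslashes are kept exactly as typed
forward = "c:/test/"     # forward slashes are also accepted in Windows paths

# Without escaping, \t silently becomes a single tab character.
broken = "c:\test"

print(escaped)
print(raw + "\\" == escaped)
print(len(broken), "\t" in broken)
```

A raw string cannot end in a single backslash, which is why the raw variant above leaves the trailing separator off.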

#4 nicnad, March 4th, 2013, 08:45 PM

Thank you for the reply.

I get an "invalid syntax" error at f.path, with the f highlighted.

Can you please help me solve this problem?

#5 nicnad, March 5th, 2013, 01:42 AM
I got it to work, but I still have a lot of questions.

Code:
import os

rootdir='c:\\test\\'  #rootdir seems to be the directory where is located my python project (.py file) and not c:\test\. How do I solve this.

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        with open(file, 'r') as f:
            lines=f.readlines()
        for line in lines:
            if "aaa" in line:
                with open("Output.txt", "w") as text_file:text_file.write(f.name)
                f.close() #my loop does not seem right because even tought I have multiple files that contains 'aaa' it only prints one


Also, since I am very much a beginner, I would really appreciate it if you could help me write the final program. I learned quite a lot of VBA just by looking at other people's code, and I would be grateful for your help with this.

What I want it to do (I realize that my first post might not have been clear) is this:

#1 I will have a .txt file (myvalues.txt) containing 600 lines, each with three values separated by spaces.
Let's call those three variables mydate, myaccount and myamount.

#2 Open each file in a specified folder and read each of its lines.

#3 If the opened file contains a line matching the following string built from myvalues.txt (I will use "*" as a wildcard and & to join the strings, even though I'm not sure this is how you do it in Python): "*" & mydate & "*" & myaccount & "*" & myamount & "*", write that file's name to an output file named output.txt.

#4 Loop through each file.

I hope this is clearer and that you can help me with it.

Thank you for your help and time.

#6 b49P23TIvg, March 5th, 2013, 10:45 AM
Instead of `f.path' use `f.name', since name is an attribute of file objects.

Instead of `print string' use `print(string)', as this will work in both Python 2 and Python 3.

Remove the `f.close()'. The with context has already closed your file f.
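
The three points above can be checked with a small, self-contained sketch. It creates a throwaway file in a temporary folder (the file name sample.txt is made up for the example) and inspects the file object after the with block has ended.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a throwaway file so the example does not depend on c:\test.
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w") as f:
        f.write("mytest\n")

    with open(path, "r") as f:
        lines = f.readlines()

    # The with block has already closed f, so no f.close() is needed.
    print(f.closed)
    # File objects expose .name (the path given to open), not .path.
    print(f.name == path)
    print(hasattr(f, "path"))
```

Writing print with parentheses, as here, is what avoids the "invalid syntax" error under Python 3.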

Maybe you should open this file in append mode. What you've got overwrites itself each time you execute the statement.
with open("Output.txt", "w")
Restructuring is better.
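
To see the overwrite problem in isolation, the sketch below reopens an output file once per name, first in "w" mode and then in "a" mode. The helper write_names and the file names are hypothetical and exist only for this demonstration.

```python
import os
import tempfile


def write_names(path, names, mode):
    # Reopen the output file for every name, as a loop that opens it per match would.
    for name in names:
        with open(path, mode) as out:
            out.write(name + "\n")
    with open(path) as result:
        return result.read().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    names = ["a.txt", "b.txt", "c.txt"]
    overwritten = write_names(os.path.join(tmp, "w.txt"), names, "w")
    appended = write_names(os.path.join(tmp, "a.txt"), names, "a")

print(overwritten)  # only the last name survives in "w" mode
print(appended)     # every name is kept in "a" mode
```

Append mode keeps every name, but it also keeps results from earlier runs, which is one reason to open the output once outside the loop instead.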

Therefore:
Code:
import os

rootdir='c:\\test\\'  #rootdir seems to be the directory where is located my python project (.py file) and not c:\test\. How do I solve this.

with open("Output.txt", "w") as text_file:

    for subdir, dirs, files in os.walk(rootdir):
        for file in files:

            with open(file, 'r') as f:
                lines=f.readlines()
            # f is closed when the code finishes the block

            for line in lines:
                if "aaa" in line:
                    text_file.write(f.name)
                    break # I assume you don't need to write the name for each occurrence of the target string 

# python closes text_file when you get back to this indentation level.
Untested. My untested code almost never works.

#7 nicnad, March 5th, 2013, 07:22 PM
Thank you for the reply.

Two concerns:

The program only writes one filename to output.txt, even though I have multiple files with the 'aaa' string in them.

The program reads files from the directory where the program file (.py) is. I want it to read from c:\test\.

Can you help me solve this?

#8 b49P23TIvg, March 5th, 2013, 08:54 PM
# the file is in subdir. Use os.path.join
with open(os.path.join(subdir,file), 'r') as f:

# you'll probably want a separator between the files
# listed in the output.
text_file.write(f.name+'\n')
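
Putting the fixes from this thread together, a complete version might look like the sketch below. It is one possible arrangement, not the code anyone in the thread actually ran. The function names files_containing and write_report are hypothetical, the errors="replace" argument is an added assumption so that non-text files do not stop the scan, and the demonstration uses temporary folders in place of c:\test.

```python
import os
import tempfile


def files_containing(rootdir, needle):
    """Return paths of files under rootdir that have a line containing needle."""
    matches = []
    for subdir, dirs, files in os.walk(rootdir):
        for name in files:
            # os.walk yields bare file names; join each with the folder being walked.
            path = os.path.join(subdir, name)
            with open(path, "r", errors="replace") as f:
                # any() stops at the first matching line, like the break in post #6.
                if any(needle in line for line in f):
                    matches.append(path)
    return matches


def write_report(rootdir, needle, output_path="Output.txt"):
    # Open the output once and write one path per line.
    found = files_containing(rootdir, needle)
    with open(output_path, "w") as text_file:
        for path in found:
            text_file.write(path + "\n")
    return found


# Small demonstration with made-up files.
with tempfile.TemporaryDirectory() as root, tempfile.TemporaryDirectory() as out_dir:
    os.mkdir(os.path.join(root, "sub"))
    samples = {
        "one.txt": "xx aaa xx\n",
        "two.txt": "nothing here\n",
        os.path.join("sub", "three.txt"): "first\naaa\n",
    }
    for rel, text in samples.items():
        with open(os.path.join(root, rel), "w") as f:
            f.write(text)

    report_path = os.path.join(out_dir, "Output.txt")
    found = write_report(root, "aaa", report_path)
    found_names = sorted(os.path.relpath(p, root) for p in found)
    with open(report_path) as f:
        report_lines = f.read().splitlines()

print(found_names)
```

If the output file lives inside the folder being scanned, it will be scanned too, so it is probably safer to write it somewhere else.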

#9 nicnad, March 5th, 2013, 09:42 PM
Works like a charm, thanks for the reply.

Now let's say I have a text file (named myvalues.txt) with the following content:


aaa bbb ccc
ddd eee fff
ggg hhh iii

Instead of having if "aaa" in line:

How do I get something like this :

myline(x) # This illustrates a variable that would contain the value of line x, which I could loop over

if "*" & left(myline(x)) & "*" & mid(myline(x),5,3) & "*" & right(myline(x),3) & "*" #Here & joins strings together and "*" is a wildcard (not sure how you do this in Python)

How do I code this?

Thank you for your help and time with this.

Really appreciated.

#10 nicnad, March 6th, 2013, 09:57 PM
bump...

#11 b49P23TIvg, March 6th, 2013, 10:21 PM
Input:

aaa bbb ccc
ddd eee fff
ggg hhh iii


What is the output you desire?


I can't derive a sane meaning from
if "*" & left(myline(x)) & "*" & mid(myline(x),5,3) & "*" & right(myline(x),3) & "*"

The nonsense I see:
) it is not connected in any way to the input
) "*"&left
why would there be anything to the left of left? (or to the right of right)


Returning to your first post, you had
mydate = Right(ws1.Cells(i, 2), 4) & Mid(ws1.Cells(i, 2), 3, 2) & Left(ws1.Cells(i, 2), 2)


Study the re module.

You can join python strings with + or with the join method.

>>> A='abc'
>>> A += 'def'
>>> ' , '.join((A,'ghi'))
'abcdef , ghi'
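
One possible reading of the question in post #9 is: for each line of myvalues.txt, test whether a line of text contains the three values in order, with anything in between. Under that assumption, a sketch with the re module could look like this. The names load_patterns and first_match are hypothetical.

```python
import os
import re
import tempfile


def load_patterns(values_path):
    """Compile one regular expression per non-empty line of the values file."""
    patterns = []
    with open(values_path) as values:
        for row in values:
            fields = row.split()  # "aaa bbb ccc" -> ["aaa", "bbb", "ccc"]
            if not fields:
                continue
            # re.escape keeps characters such as "." literal; ".*" plays the role of "*".
            patterns.append(re.compile(".*".join(re.escape(f) for f in fields)))
    return patterns


def first_match(line, patterns):
    """Return the index of the first pattern found in line, or None."""
    for index, pattern in enumerate(patterns):
        # search() looks anywhere in the line, so no leading or trailing wildcard is needed.
        if pattern.search(line):
            return index
    return None


# Demonstration with the sample input from the thread.
with tempfile.TemporaryDirectory() as tmp:
    values_path = os.path.join(tmp, "myvalues.txt")
    with open(values_path, "w") as f:
        f.write("aaa bbb ccc\nddd eee fff\nggg hhh iii\n")
    patterns = load_patterns(values_path)

print(first_match("xx aaa 12 bbb 34 ccc yy", patterns))
print(first_match("ccc bbb aaa", patterns))
```

Splitting on whitespace replaces the left/mid/right slicing, since the three values are separated by spaces.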

#12 AK223, March 7th, 2013, 01:30 PM
On a side note I've been developing a board on Python resources related to looping. Any recommendations you guys might have to add to this? http://www. verious.com/board/AKumar/looping-in-python/ (Sorry, I can't post links yet)
