#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Location
    Atlantic City, NJ
    Posts
    327
    Rep Power
    13

    Regular Expressions


    Hello everybody. I'm a python noob here and I've been learning things on my own very slowly. I've hit a brick wall here with regular expressions.

    The script I am trying to write needs to check the contents of a file for the word 'INFECTED'. If the word exists I want to print out the entire line that the word 'INFECTED' is on. What I have so far does not do the trick:

    Code:
    #!/usr/bin/env python
    
    import re
    
    file = open('/tmp/fake.txt', 'r+')
    
    contents = file.readlines()
    infected = re.search("INFECTED", str(contents))
    
    print infected.group()
    
    file.close
    Of course, this code will simply print out the word "INFECTED" if I have the string in my text file. The problem is that it only prints out that string and not the rest of the line.

    I'm sure this is an easy one but I'm a novice here and I was hoping you gurus could help a noob out. Thanks in advance.
    I'll learn this stuff someday.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2004
    Location
    Regensburg, Germany
    Posts
    147
    Rep Power
    16
    If you only need to find the word "INFECTED", you don't need regular expressions. Use the find() method of strings instead. Example:
    Code:
    #!/usr/bin/env python
    
    file = open('/tmp/fake.txt', 'r+')
    
    for line in file.readlines():
        if line.find("INFECTED") >= 0:
            print line
    
    file.close()
    If you want to use regular expressions because you are searching for a more complex pattern, compile the regular expression first. This will speed up line processing:
    Code:
    #!/usr/bin/env python
    
    import re
    
    regex = re.compile("INFECTED")
    
    file = open('/tmp/fake.txt', 'r+')
    
    for line in file.readlines():
        if regex.search(line):
            print line
    
    # don't forget the parentheses on close():
    # a bare 'file.close' only evaluates to the bound method object
    # and never calls it, so the file is not closed
    file.close()
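    For a pattern that plain find() cannot express, the per-line search might look like the sketch below. It is written for Python 3 (print is a function there), and the pattern, the sample lines and the name matching_lines are made up for illustration; they are not chkrootkit's documented output format.

```python
import re

# Hypothetical pattern: the word INFECTED, optionally followed by a
# parenthesised detail such as "(PORTS: 465)".
INFECTED_RE = re.compile(r"\bINFECTED\b(?:\s*\((?P<detail>[^)]*)\))?")


def matching_lines(lines):
    """Return (line, detail) pairs for every line that matches the pattern."""
    results = []
    for line in lines:
        match = INFECTED_RE.search(line)
        if match:
            # rstrip() drops the trailing newline that file iteration keeps
            results.append((line.rstrip("\n"), match.group("detail")))
    return results


# Made-up sample input standing in for the lines of a real file
sample = [
    "Checking `ls'... not infected\n",
    "Checking `bindshell'... INFECTED (PORTS: 465)\n",
    "Checking `lkm'... INFECTED\n",
]

for line, detail in matching_lines(sample):
    print(line, "| detail:", detail)
```

    The search is case-sensitive, so the "not infected" line is skipped. Passing an open file object instead of the sample list should work the same way, since both yield one line at a time.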
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Location
    Atlantic City, NJ
    Posts
    327
    Rep Power
    13
    Hey thanks a lot. Your method makes more sense. One quick question:

    Code:
    if line.find("INFECTED") >= 0:
    Does the line.find method return the number of times the word INFECTED is found? Is that why you have >= 0?
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2005
    Posts
    37
    Rep Power
    10
    line.find("INFECTED") returns the index of the first occurrence of "INFECTED". If it doesn't find it, it returns -1. The line.count("INFECTED") method will count the occurrences.
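    A quick way to see the difference is to try both methods on one string. This is a small Python 3 sketch; the sample line is invented.

```python
line = "eth0: INFECTED, lo: INFECTED\n"

print(line.find("INFECTED"))    # 6, the index of the first occurrence
print(line.find("CLEAN"))       # -1, because the substring is absent
print(line.count("INFECTED"))   # 2, the number of non-overlapping occurrences

# A match at the very start of the line has index 0, which is falsy,
# so the test must be ">= 0" (or "!= -1") rather than a bare "if".
print("INFECTED here".find("INFECTED"))   # 0
```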
  8. #5
  9. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
    If you don't care about where the sub-string is in the line then you can use the in operator; this returns True if the sub-string is present and False otherwise.

    Worth noting: in Python 2.3+ the file() object is iterable, so you don't have to call the readlines() method to loop over each line. This is more efficient because the whole file is not read into a list first.

    Here's an example that uses both of the above methods:

    Code:
    #!/usr/bin/env python
    
    import os
    
    # file() and open() do not expand '~', so expand it explicitly
    path = os.path.expanduser('~/path/to/file.txt')
    
    for line in file(path):
        if 'INFECTED' in line: print line
    If you want to play around with the in operator a little then you can fire up your Python shell (or IDLE) and give it a go.
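    Something like the following is what you might type to experiment. It is a Python 3 sketch with an invented sample line; in the interactive shell you can leave out the print() calls and just enter the expressions.

```python
line = "Checking `lkm'... INFECTED\n"

print("INFECTED" in line)           # True
print("infected" in line)           # False: the test is case-sensitive
print("INFECTED" not in line)       # False
print("infected" in line.lower())   # True: lower-case the line to ignore case
```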

    Hope this helps,

    Mark.
    programming language development: www.netytan.com Hula

  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Location
    Atlantic City, NJ
    Posts
    327
    Rep Power
    13
    Netytan, I was unable to get your example to work; I'm such a python noob that I don't know how to implement it. This is what my script has so far for the chkrootkit section:

    Code:
    os.system('touch /tmp/results.txt')
    os.system('touch /tmp/chkroot.txt')
    
    resultstmp = open('/tmp/results.txt', 'w')
    chktmp = open('/tmp/chkroot.txt', 'r+')
    
    chktmp.write(commands.getoutput('chkrootkit'))
    for line in chktmp.readlines():
    	if line.find("INFECTED") >=0:
    		resultstmp.write(line)
    	else:
    		resultstmp.write("Nothing Infected...\n\n\n")
    Nothing is being written to resultstmp at all. I'm sure nothing should match the string "INFECTED" so I should be getting the "Nothing Infected" string added to my resultstmp file, but it's not working. Any ideas why?

    P.S. I'm aware that python has a tempfile module I can use. I have yet to figure that out but that's for another thread.
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2004
    Location
    Regensburg, Germany
    Posts
    147
    Rep Power
    16
    If there are no lines in the resultstmp file this might be because no lines are read from the chktmp file.

    The problem is that when a file is opened for reading and writing with the "r+" mode, reads and writes share a single file position. After write() that position is at the end of the data, and depending on the underlying operating system you may not be able to read back what you have just written without flushing and repositioning the file pointer. You can use the flush() and seek() methods of the file object for this.
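    A minimal sketch of the flush() and seek() route, assuming Python 3 and a scratch file created with the tempfile module (the file contents are invented):

```python
import os
import tempfile

# Create an empty scratch file first, because "r+" needs an existing file.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "r+") as handle:
    handle.write("Checking `lkm'... INFECTED\nChecking `ls'... not infected\n")
    handle.flush()   # push buffered data out to the file
    handle.seek(0)   # move the file position back to the start
    lines = handle.readlines()

os.remove(path)

infected = [line for line in lines if "INFECTED" in line]
print(infected)
```

    Without the seek(0) call the read would start where the write stopped, at the end of the file, and readlines() would have nothing to return.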

    In general I don't want to bother with file pointers, operating system peculiarities, etc., so I try to do it the easy way:
    Code:
    # not required because python creates files
    # automatically if opened for writing
    # os.system('touch /tmp/results.txt')
    # os.system('touch /tmp/chkroot.txt')
    
    resultstmp = open('/tmp/results.txt', 'w')
    # open the file for writing first
    chktmp = open('/tmp/chkroot.txt', 'w')
    
    chktmp.write(commands.getoutput('chkrootkit'))
    # now close the file and reopen it for reading
    chktmp.close()
    
    chktmp = open('/tmp/chkroot.txt', 'r')
    ...
    or shorter:
    Code:
    # write the result of the chkrootkit command to a file
    os.system('chkrootkit>/tmp/chkroot.txt')
    
    chktmp = open('/tmp/chkroot.txt', 'r')
    ...
    or if you need the output of the chkrootkit command for parsing only, you could eliminate the chktmp file:
    Code:
    for line in commands.getoutput('chkrootkit').splitlines():
        # do something
        ...
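    On Python 3 the commands module no longer exists, so the same idea would go through subprocess. The sketch below is only an illustration under that assumption: infected_lines is a made-up helper name, and a tiny Python one-liner stands in for chkrootkit so the example runs anywhere.

```python
import subprocess
import sys


def infected_lines(command):
    """Run command (a list of arguments) and return output lines containing INFECTED."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold error output into the same stream
        universal_newlines=True,    # decode the output to str
    )
    return [line for line in completed.stdout.splitlines() if "INFECTED" in line]


# Stand-in for chkrootkit: prints two made-up result lines.
fake_scan = [
    sys.executable,
    "-c",
    "print('Checking ls... not infected'); print('Checking lkm... INFECTED')",
]

print(infected_lines(fake_scan))
```

    With the real tool the call would presumably be infected_lines(["chkrootkit"]), run with whatever privileges chkrootkit needs on your system.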
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2004
    Location
    Atlantic City, NJ
    Posts
    327
    Rep Power
    13

    Okay I actually got the script to work finally. For reasons unknown to me, opening a file for reading and writing (r+) does not automatically create the file, so I actually have to 'touch /tmp/chkroot.txt'. Anyway, here is the whole script just in case you guys were wondering what I was up to:

    Code:
    #!/usr/bin/env python
    
    import os, commands
    
    resultstmp = open('/tmp/results.txt', 'w')
    
    os.system('touch /tmp/chkroot.txt')
    chktmp = open('/tmp/chkroot.txt', 'r+')
    
    def header(headername):
    	"""Create the header files for 
    	each section"""
    	resultstmp.write("-" * 60 + "\n")
    	resultstmp.write("                   " + headername + "\n")
    	resultstmp.write("-" * 60 + "\n")
    
    	
    header("Python Security Check Script")
    resultstmp.write("\n")
    
    header("Chkrootkit Results")
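    chktmp.write(commands.getoutput('chkrootkit'))
    # flush and rewind, otherwise readlines() starts at the end of the file
    chktmp.flush()
    chktmp.seek(0)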
    
    bolin = 0
    for line in chktmp.readlines():
    	if "INFECTED" in line:
    		resultstmp.write(line)
    		bolin = 1
    if bolin == 0:
    	resultstmp.write("Nothing Infected...\n\n")
    
    
    header("Available Package Updates")
    resultstmp.write(commands.getoutput('emerge -up world | grep ebuild') + "\n\n")
    
    header("Possible Security Updates")
    resultstmp.write(commands.getoutput('glsa-check -ln') + "\n\n")
    
    date = commands.getoutput('date')
    hostname = commands.getoutput('hostname')
    
    resultstmp.close()
    chktmp.close()
    
    #Mail Everything
    os.system('cat /tmp/results.txt | mail -s "Security Report for ' + hostname + ' on ' + date + '" security@localhost')
    
    #Clean up the temporary files
    os.system('rm -f /tmp/results.txt')
    os.system('rm -f /tmp/chkroot.txt')
    I'm sure it could use some fixing but I'm still a noob so I'm happy I got it working.

    Thanks for all your help guys and gals! You'll be hearing from me again I'm sure.
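    On the r+ question: as far as I know, "r+" only opens a file that already exists, while "w+" also allows reading and writing but creates the file (or empties an existing one). A small Python 3 sketch of the difference, using a throwaway directory from the tempfile module rather than /tmp directly:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "chkroot.txt")

# "r+" opens an existing file for reading and writing; it never creates one.
try:
    open(path, "r+")
    created_by_r_plus = True
except IOError:  # in Python 3 this is OSError, the parent of FileNotFoundError
    created_by_r_plus = False

# "w+" also allows reading and writing, but creates (or truncates) the file.
with open(path, "w+") as handle:
    handle.write("Checking `lkm'... INFECTED\n")
    handle.seek(0)  # rewind before reading back what was just written
    contents = handle.read()

print(created_by_r_plus)
print(contents)

# Clean up the throwaway file and directory
os.remove(path)
os.rmdir(workdir)
```

    Opening with "w+" would make the touch call unnecessary, at the cost of wiping whatever the file held before.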