    scanning a logfile help

    OK, here is the code I have so far. Everything works, but I need some additional help.

    import getopt
    import sys

    def count_lines(logfile):
        # Count every line in the log file
        linecount = 0
        for line in open(logfile):
            linecount = linecount + 1
        print("accesses: %d" % linecount)

    def processline(logfile):
        # Count requests whose path (7th whitespace-separated field) starts with /pipermail
        count = 0
        total = 0
        for line in open(logfile):
            words = line.split()
            total = total + 1
            if words[6].startswith('/pipermail'):
                count = count + 1
        # Compute the percentage once, after the loop, so it is always defined
        percentage = count * 100 // total if total else 0
        print("Accesses directory: %d ( %d%% )" % (count, percentage))

    try:
        optlist, args = getopt.getopt(sys.argv[1:], 'p:f:')
    except getopt.GetoptError:
        print("called exception")
        sys.exit(2)
    for opt in optlist:
        if opt[0] == '-p':
            processline(opt[1])
        if opt[0] == '-f':
            count_lines(opt[1])

    When this code is run in a terminal with the following options:

    ./python.py -f access.data -p access.data

    I get the following output:

    accesses: 6
    Accesses directory: 3 ( 50% )

    But instead of putting the filename after the -p option, I would like to be able to put in a directory name such as:
    /pipermail or,
    and then the output would display the number of accesses to that specific directory, not to the whole file.
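One way the behaviour described above might be sketched: let -f carry the log file and -p carry the directory prefix, then count only the lines whose request path starts with that prefix. This is a rough sketch, not a definitive answer. The names `parse_options`, `count_prefix` and `report` are made up for illustration, and it assumes, as the code in this post does, that the request path is the 7th whitespace-separated field of each line.

```python
import getopt


def parse_options(argv):
    """Return (logfile, prefix) taken from the -f and -p options."""
    optlist, _args = getopt.getopt(argv, "f:p:")
    opts = dict(optlist)
    return opts.get("-f"), opts.get("-p")


def count_prefix(lines, prefix):
    """Return (matches, total) for request paths starting with prefix."""
    matches = 0
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) <= 6:
            continue  # too short to contain a request path; skip it
        total += 1
        # Assumption from the post: field 6 holds the request path
        if fields[6].startswith(prefix):
            matches += 1
    return matches, total


def report(lines, prefix):
    """Format the count and integer percentage for one directory prefix."""
    matches, total = count_prefix(lines, prefix)
    percentage = matches * 100 // total if total else 0
    return "Accesses %s: %d ( %d%% )" % (prefix, matches, percentage)
```

With something like this, the script would be called as `./python.py -f access.data -p /pipermail`, and the main part would do roughly `logfile, prefix = parse_options(sys.argv[1:])` followed by `print(report(open(logfile), prefix))`.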

    The logfile I am processing contains data like the following (just a sample; it has many entries, some with /pipermail and some with /Glamorgan):

    --{12/Dec/2002:07:41:19 +0000} " GET /pipermail/notes/web/m2f.html HTTP/1.0" 301 - - - [16/Nov/2003:08:15:21 +0000] "GET /Glamorgan/notes/scripting/html.html HTTP/1.1" 200 3974
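Since picking a field by position depends on the exact spacing of each line (a stray space after the opening quote, as in the first entry of the sample above, would shift every later field by one), the request path could instead be pulled out of the quoted request with a regular expression. A minimal sketch, assuming the request always looks like a quote, an upper-case method such as GET, then the path; `request_path` is a hypothetical helper name.

```python
import re

# A quote, optional spaces, an upper-case method, then the path
# (captured up to the next whitespace or closing quote).
REQUEST_RE = re.compile(r'"\s*[A-Z]+\s+([^\s"]+)')


def request_path(line):
    """Return the requested path from a log line, or None if none is found."""
    match = REQUEST_RE.search(line)
    return match.group(1) if match else None
```

The returned path can then be tested with `startswith('/pipermail')` in the same way as `words[6]` in the code above.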

    Edit: Fixed broken [CODE] tag.
    Last edited by netytan; January 19th, 2004 at 06:31 PM.
