#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2005
    Posts
    34
    Rep Power
    0

    help clean my code


    Hi, I am new to Python. I was wondering whether there is any way of making my code shorter/cleaner?

    Thank you.


    Code:
    def grab_title(filename):
        f = open(filename).read()
        pos = f.find("<TITLE>")
        pos1 = f.find("</TITLE>")
        title = f[pos:pos1]
        
        if f[pos:pos1] == '':
            pos = f.find("<title>")
            pos1 = f.find("</title>")
            title = f[pos:pos1]        
        
        title = title[7:]
        print title
    
    def grab_keywords(filename):
        f = open(filename).read()
        pos = f.find('''"keywords" content="''')
        pos1 = f.find('''<meta name="description"''')
        title = f[pos:pos1]      
        
        title = title[20:-4]
        
        print title
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    40
    Rep Power
    10
    This should save you from having to search for the title tag in both upper and lower case: convert the file contents to lower case first with f = f.lower().

    If you found this helpful, you can click on the scales in the top right-hand corner of my post.

    Code:
    def grab_title(filename):
        f = open(filename).read()
        # Lower-case everything so one search covers <TITLE> and <title>.
        # Note that the extracted title comes back lower-cased as well.
        f = f.lower()  
        pos = f.find("<title>")
        pos1 = f.find("</title>")
        title = f[pos:pos1]
        title = title[7:]
        return title
    
    def grab_keywords(filename):
        f = open(filename).read()
        pos = f.find('''"keywords" content="''')
        pos1 = f.find('''<meta name="description"''')
        keywords = f[pos:pos1]      
        
        keywords = keywords[20:-4]
        return keywords
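
A small sketch of what lower-casing buys you, and of its side effect. The sample string and variable names are made up for illustration; nothing here comes from the poster's actual files.

```python
# Hypothetical sample input; the tag is upper case, as in the first post.
html = "<HTML><TITLE>My Home Page</TITLE></HTML>"

# A case-sensitive search for the lower-case tag misses it.
missed = html.find("<title>")        # find() returns -1 for "not found"

# Searching a lower-cased copy finds the tag whatever its original case.
lowered = html.lower()
start = lowered.find("<title>") + len("<title>")
end = lowered.find("</title>")

# Slicing the lower-cased copy loses the original capitalisation...
from_lowered = lowered[start:end]
# ...while slicing the original with the same positions keeps it.
# (The positions line up here because the sample is plain ASCII.)
from_original = html[start:end]
```

So if the capitalisation of the title matters, search the lower-cased copy but slice the original string.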
  4. #3
  5. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
    Here's my take on things. Everything has really just been cut down, but there are a few bigger changes:

    The functions now accept any file-like object (anything with a read method) rather than opening a file by name.

    There is also a new temp variable, which is used to implement the case-insensitive find() without affecting the original data.

    Code:
    def grabTitle(fileObject):
        data = fileObject.read()
        temp = data.lower()
        pos1 = temp.find('<title>') + 7
        pos2 = temp.find('</title>')
        return data[pos1:pos2]
    
    def grabKeywords(fileObject):
        data = fileObject.read()
        temp = data.lower()
        pos1 = temp.find('"keywords" content="') + 20
        pos2 = temp.find('<meta name="description"') - 4
        return data[pos1:pos2]
    Note: this hasn't been tested with actual data and may need a little tweaking. If you have any problems let me know.

    Hope this helps,

    Mark.
    programming language development: www.netytan.com Hula
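
One tweak the untested code above may need: find() returns -1 when the text is not there, so a page without a title would give a meaningless slice rather than an error. Below is a rough sketch that checks for this, assuming you are happy to get None back in that case. grab_between is a hypothetical helper name, and io.StringIO just stands in for an open file to show the "any file-like object" idea.

```python
import io

def grab_between(file_object, start_tag, end_tag):
    """Return the text between start_tag and end_tag, or None if either is missing.

    Hypothetical helper for illustration; not part of the code posted above.
    """
    data = file_object.read()
    lowered = data.lower()      # searched copy; data keeps its original case
    start = lowered.find(start_tag.lower())
    if start == -1:             # opening tag not found
        return None
    start += len(start_tag)
    end = lowered.find(end_tag.lower(), start)
    if end == -1:               # closing tag not found
        return None
    return data[start:end]

# Anything with a read() method works, not only real files.
page = io.StringIO(u"<html><TITLE>Example Page</TITLE></html>")
title = grab_between(page, "<title>", "</title>")

empty = io.StringIO(u"<html></html>")
missing = grab_between(empty, "<title>", "</title>")
```

With a real file you would pass open('index.html') instead of the StringIO object.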

  6. #4
  7. Caress me down
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2005
    Location
    Pennsylvania
    Posts
    289
    Rep Power
    511
    so what will this program do?
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2005
    Posts
    133
    Rep Power
    11
    Originally Posted by lw22
    so what will this program do?
    It prints the website's title.
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2005
    Posts
    34
    Rep Power
    0
    With the help of a different post I managed to create some better code. Any thoughts on how to make it even better?

    Code:
    import re
    
    aString = open('index.html').read()
    print get_title(aString)
    print get_keywords(aString)
    print get_description(aString)
    
    def get_title():
        res = re.findall('''<title>(.*?)</title>''',
                         aString, re.I | re.S)
        return res[0]
    
    def get_keywords():
        res = re.findall('''<meta name="keywords" content="(.*?)">''',
                         aString, re.I | re.S)
        return res[0]
    
    def get_description():
        res = re.findall('''<meta name="description" content="(.*?)">''',
                         aString, re.I | re.S)
        return res[0]
  12. #7
  13. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Feb 2005
    Posts
    611
    Rep Power
    65



    Originally Posted by sitepoint
    With the help of a different post I managed to create some better code. Any thoughts on how to make it even better?

    Code:
    import re
    
    aString = open('index.html').read()
    print get_title(aString)
    print get_keywords(aString)
    print get_description(aString)
    
    def get_title():
        res = re.findall('''<title>(.*?)</title>''',
                         aString, re.I | re.S)
        return res[0]
    
    def get_keywords():
        res = re.findall('''<meta name="keywords" content="(.*?)">''',
                         aString, re.I | re.S)
        return res[0]
    
    def get_description():
        res = re.findall('''<meta name="description" content="(.*?)">''',
                         aString, re.I | re.S)
        return res[0]
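    This code is just loaded with errors! The functions are called before they are defined, and they are called with an argument even though they are defined to take none.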
  14. #8
  15. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
    Firstly, you need to reorganize your code: the function definitions should come before you try to use the functions. You also seem to want to use multi-line (triple-quoted) strings for everything. This isn't a problem, but IMO it's bad practice; use single-line strings instead.

    You might also want to put the function calls inside an if __name__ == '__main__' block so that the file can be imported into another program without the calls being run. This helps with code reuse and is generally a good idea.

    After applying all this you should have something like:

    Code:
    import re
    
    # Each function takes the page text as a parameter, matching the calls below.
    def get_title(aString):
        res = re.findall('<title>(.*?)</title>',
                         aString, re.I | re.S)
        return res[0]
    
    def get_keywords(aString):
        res = re.findall('<meta name="keywords" content="(.*?)">',
                         aString, re.I | re.S)
        return res[0]
    
    def get_description(aString):
        res = re.findall('<meta name="description" content="(.*?)">',
                         aString, re.I | re.S)
        return res[0]
        
    if __name__ == '__main__':
        # Runs only when the file is executed directly, not when imported.
        aString = open('index.html').read()
        print(get_title(aString))
        print(get_keywords(aString))
        print(get_description(aString))
    Mark.
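
One possible next step, offered as a sketch rather than the definitive version: res[0] raises an IndexError when a tag is missing, so the variant below uses re.search and returns None instead. get_meta and SAMPLE are hypothetical names, and the pattern still assumes the name and content attributes appear in exactly this order with double quotes, as in the patterns above.

```python
import re

def get_title(html):
    """Return the text inside <title>...</title>, or None if there is none."""
    match = re.search('<title>(.*?)</title>', html, re.I | re.S)
    return match.group(1) if match else None

def get_meta(html, name):
    """Return the content of <meta name="NAME" content="...">, or None.

    Hypothetical helper: one function covers both keywords and description.
    """
    pattern = '<meta name="%s" content="(.*?)"' % re.escape(name)
    match = re.search(pattern, html, re.I | re.S)
    return match.group(1) if match else None

# Made-up sample page so the sketch runs without an index.html on disk.
SAMPLE = ('<html><head><TITLE>Example Page</TITLE>'
          '<meta name="keywords" content="python, regex">'
          '</head></html>')

if __name__ == '__main__':
    # Runs only when the file is executed directly, not when imported.
    print(get_title(SAMPLE))
    print(get_meta(SAMPLE, 'keywords'))
    print(get_meta(SAMPLE, 'description'))
```

For real pages you would read the text with open('index.html').read() and pass that in. Regular expressions stay fragile against attribute reordering or single quotes, so treat this as good enough for pages whose layout you already know.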

