#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2014
    Posts
    49
    Rep Power
    5

    How to use a regular expression to pick out all the numbers in a file


    Hi all,

    I am reading the attached file with Python.

    Below is the code I use to read the file, loop through it line by line, find all the numbers and collect them in a list.

    Code:
     
    import re
    openf = open('regex_sum_197551.txt')
    numL = list()
    for line in openf:
        line = line.rstrip()
        #stuff = re.findall('([0-9]+)',line)
        stuff = re.findall('^([0-9]+\S)',line)
        #print 'stuff', stuff
        if len(stuff)!=1:
            continue
        #print stuff
    
        num = int(stuff[0])
        numL.append(num)
    print numL
    print sum(numL)
    When I print the list, I don't see all the numbers from the file.

    Is my regex OK? How can I improve it so that it extracts all the numbers?

    Thanks,

    Ron
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2007
    Location
    Joensuu, Finland
    Posts
    471
    Rep Power
    71
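    I think your main mistake is the “^” in the regex; with it, only a number at the very beginning of a line is found. The trailing “\S” should go as well: it demands a non-whitespace character after the digits, so a single-digit number followed by a space or the end of the line is missed, and a number directly followed by a letter drags that letter along and makes int() fail. Also note that your “len(stuff)!=1” check skips every line that holds more than one number.

    Code:
    #!/usr/bin/env python3
    
    import re
    
    with open('regex_sum_197551.txt') as openf:
        numL = []
        for line in openf:
            line = line.rstrip()
            # findall returns every run of digits in the line (possibly none)
            numL += [int(num) for num in re.findall(r'[0-9]+', line)]
    print(numL)
    print(sum(numL))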
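
    To see what the “^” anchor and a trailing “\S” each do to the matches, here is a small sketch. The sample lines are made up for illustration (the attached file is not shown in the thread), and extract() is just a hypothetical helper name.

```python
import re

# Hypothetical sample lines; the real attachment is not shown in the thread.
sample = [
    "3036 Why should you learn to write programs? 7746",
    "12 1929 8827",
    "Writing programs is a very creative 7",
    "and rewarding activity 42abc",
]

def extract(pattern, lines):
    """Return every match of `pattern` across `lines`, as strings."""
    found = []
    for line in lines:
        found += re.findall(pattern, line)
    return found

# Anchored: only a number at the very start of a line can match.
anchored = extract(r'^([0-9]+\S)', sample)

# Unanchored but with \S: the last matched character must be non-whitespace,
# so a lone "7" at end of line is lost and "42abc" yields "42a".
with_nonspace = extract(r'([0-9]+\S)', sample)

# Plain digit runs: every number is found, and each result is safe for int().
digits_only = extract(r'[0-9]+', sample)

print(anchored)
print(with_nonspace)
print(digits_only)
print(sum(int(n) for n in digits_only))
```

    On these sample lines only the last pattern returns all seven numbers; the “\S” variant returns "42a", which int() would reject with a ValueError.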

    Comments on this post

    • Ron256 agrees: I understood the mistake I was making.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2014
    Posts
    49
    Rep Power
    5
    Thanks for the tip. It is much appreciated.

  6. #4
  7. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,782
    Rep Power
    4302
    Just a quick note: your regular expression only matches unsigned integers. It doesn't match negative numbers, floating-point numbers, etc. If you need to match those, a little googling will turn up a regular expression that does.
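
    As a rough sketch of what such a pattern could look like, the one below accepts an optional sign and an optional fractional part. It is one possible choice rather than a standard one: the name NUMBER and the sample text are assumptions for illustration, and exponent notation such as 1e5 is deliberately left out.

```python
import re

# One possible pattern (an assumption, not the only reasonable choice):
# optional sign, then either digits with an optional fractional part,
# or a bare fraction such as ".25".
NUMBER = re.compile(r'[-+]?(?:\d+(?:\.\d*)?|\.\d+)')

# Made-up sample text for illustration.
text = "gain 12, loss -7, ratio 3.14, offset +0.5, tiny .25"

# float() accepts every form the pattern can produce.
values = [float(tok) for tok in NUMBER.findall(text)]
print(values)
```

    Note that this treats any hyphen directly before a digit as a minus sign, so "page-7" would yield -7. Whether that is wanted depends on the data.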
