Thread: Re.findall help

    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    27
    Rep Power
    0

    Re.findall help


    Hi, I'm fairly new to Python, but I've been practicing coding on my own.

    Recently I've been working with re.match, re.findall, and re.IGNORECASE.

    This is my code:
    Code:
    import re
    
    def mission(s):
      match = re.findall(r'\d+', s, re.IGNORECASE)
      return match
    It gives me this output:
    Code:
    ['1', '0']
    ['20', '500']
    ['3']
    ['20', '1']
    From this:
    Code:
    mission('Recon Mission 1 accomplished. Enemy found: 0.') 
    mission('recon mission 20 accomplished. enemies found: 500.') 
    mission("Recon Mission 3 failed.")
    mission("I have 20 carrots and 1 mushroom.")
    But I need to get this output:
    Code:
    (1, 0)
    (20, 500)
    (3, failed)
    fourth line returns empty b/c it does not contain mission or status
    I'm not sure how to extract the mission number and the status (either 'failed' or the number of enemies found).

    Any suggestions on how to do this correctly?
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,701
    Rep Power
    480
    1) Convert the number strings to integers.
    2) Use simple regular expressions. Don't try to parse English with a single re.
    This code might be closer to what you want. It could be more restrictive about which sentences it accepts as interesting.
    Code:
    # python -m doctest -v p.py
    
    import re
    
    def mission(s):
        '''
            >>> (1, 0) == mission('Recon Mission 1 accomplished. Enemy found: 0.')
            True
            >>> (20, 500) == mission('recon mission 20 accomplished. enemies found: 500.')
            True
            >>> (3, 'failed') == mission("Recon Mission 3 failed.")
            True
            >>> () == mission("I have 20 carrots and 1 mushroom.")
            True
        '''
        match = list(map(int,re.findall(r'\d+', s)))
        sl = s.lower()
        if 'mission' in sl:
            if ('accomplished' in sl) and ('found' in sl):
                return tuple(match)
            if match and ('failed' in sl):
                return (match[0],'failed',)
        return ()
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    27
    Rep Power
    0
    Thanks for that @b49P23TIvg

    what does map() do?

    But while I was away, I figured out another way to do it. The only problem I have now is how to remove the brackets and quotation marks from the numbers so that only the parentheses remain.

    This is my code now:
    Code:
    def mission(s):
      first = re.findall(r'Recon Mission (\d+)', s, re.IGNORECASE)
      second = re.findall(r'found: (\d+)', s)
      if first and second:
        s = first, second
      elif first and second !=None:
        second = 'failed'
        s = first, second
      else:
        s = None
      return s
    My output:
    Code:
    (['1'], ['0'])
    (['2'], ['500'])
    (['3'], 'failed')
    None
    How do I get my (['1'], ['0']) to output as (1, 0)?
    Or is that not possible unless I do it your way or some other way?
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,701
    Rep Power
    480
    Using your mission function, we can clean the output into strings matching your requirement.
    Code:
    import re
    
    def mission(s):
        first = re.findall(r'Recon Mission (\d+)', s, re.IGNORECASE)
        second = re.findall(r'found: (\d+)', s)
        if first and second:
            s = first, second
        elif first and second !=None:
            second = 'failed'
            s = first, second
        else:
            s = None
        return s
    
    def clean(s):
        '''
            >>> print(clean(mission('Recon Mission 1 accomplished. Enemy found: 0.')))
            (1, 0)
            >>> print(clean(mission('recon mission 20 accomplished. enemies found: 500.')))
            (20, 500)
            >>> print(clean(mission("Recon Mission 3 failed.")))
            (3, failed)
            >>> print(clean(mission("I have 20 carrots and 1 mushroom.")))
            None
        '''
        return re.sub(r"[\[\]']",'',str(s)) # as a regular expression
    
        # Alternative with builtin string methods (commented out; a second return would never run):
        # return str(s).replace('[','').replace(']','').replace("'",'')
    About map(function, iterable): suppose function(x) returns y. Then
    map(function, xs)
    is an iterable that, when iterated, gives the corresponding y for each x in xs. map changed between Python 2 and 3: in Python 2 it returns a list, while in Python 3 it uses lazy evaluation and avoids calling the function until the result is needed. map isn't so hard.
    For example, in Python 2 and 3
    Code:
    >>> def double(a): return 2*a
    ... 
    
    >>> list(map(double,[3,'ho']))
    [6, 'hoho']
    >>>
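    Tying map back to the original question, here is a minimal sketch that applies it to re.findall output. The helper name numbers_in is made up for illustration; the only behavior it relies on is that in Python 3 map returns a lazy iterator, which has to be consumed (here by tuple()) before the values exist.

```python
import re

def numbers_in(s):
    """Return every run of digits in s as a tuple of ints (hypothetical helper)."""
    found = re.findall(r'\d+', s)   # list of digit strings, e.g. ['20', '500']
    lazy = map(int, found)          # Python 3: a lazy iterator, nothing converted yet
    return tuple(lazy)              # consuming the iterator performs the int() calls

print(numbers_in('recon mission 20 accomplished. enemies found: 500.'))
print(numbers_in('Recon Mission 3 failed.'))
```

    The result holds real integers, so no bracket or quote stripping is needed afterwards.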
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    27
    Rep Power
    0
    thank you, I understand the map function now

    Also, thanks for the other code. I've implemented it, and although it removed the brackets and the quotes around the numbers, the whole result still seems to be wrapped in single quotes.

    My output is closer to what I want but instead of (1,0) I now get '(1,0)'

    Is there a way to get rid of ' around the parentheses? Or is there a way to replace ' with an empty str?

    thank you again!
  10. #6
  11. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,701
    Rep Power
    480
    >>> print( 'string' )
    string
    >>> 'string'
    'string'
    >>>
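    The same point as a small script, assuming the quotes being seen come from the interactive prompt echoing repr() of the returned string. The variable names are illustrative only.

```python
value = '(1, 0)'                # a string that merely looks like a tuple

shown_by_prompt = repr(value)   # what the >>> prompt echoes for a bare expression
shown_by_print = str(value)     # what print() writes

print(shown_by_prompt)          # includes the surrounding quotes
print(shown_by_print)           # no quotes

real_tuple = (1, 0)             # an actual tuple of ints
print(real_tuple)               # prints the same characters as the string above
```

    The quotes are not part of the string itself, so there is nothing to replace; they only mark the value as a str when the prompt displays it.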
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    27
    Rep Power
    0
    Hmm...I understand the print ('string') but I think it's different from what I actually want b/c that's the word itself: string
    I'm not sure that it will work in my situation...sry if I'm a bit slow and not really explaining myself well...

    I guess I'll continue to play around with the code and google some more.

    this is my code now:
    Code:
    def mission(s):
      first = re.findall(r'Recon Mission (\d+)', s, re.IGNORECASE)
      second = re.findall(r'found: (\d+)', s)
      if first and second:
        s = first, second
        x = str(s).replace('[','').replace(']','').replace("'",'')
      elif first and second !=None:
        second = 'failed'
        t = first, second
        x = str(t).replace('[','').replace(']','').replace("'",'')
      else:
        x = None
      return x
    My output:
    Code:
    '(1, 0)'
    '(2, 500)'
    '(3, failed)'
    None
    I think I'm supposed to use int(), but when I tried implementing that it didn't really work; I kept getting an error.
  14. #8
  15. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,701
    Rep Power
    480
    We've come full circle, using integers then removing characters from strings to make them look like numbers.

    If you truly want
    (3, failed)
    then it's a string, not a tuple holding a number and a piece that looks like neither a number nor a string.
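    One way to keep the two concerns apart, sketched under the assumption that callers want real integers and only the display should look like (3, failed). parse_mission and show are hypothetical names, not from the thread; the patterns follow the ones in post #3, with re.IGNORECASE on both and an explicit check for 'failed' added as assumptions.

```python
import re

def parse_mission(s):
    """Return (number, enemies), (number, 'failed'), or None (hypothetical helper)."""
    first = re.search(r'recon mission (\d+)', s, re.IGNORECASE)
    if not first:
        return None
    number = int(first.group(1))            # a real int, not ['3'] or '3'
    second = re.search(r'found: (\d+)', s, re.IGNORECASE)
    if second:
        return (number, int(second.group(1)))
    if re.search(r'failed', s, re.IGNORECASE):
        return (number, 'failed')
    return None

def show(result):
    """Format a parsed result for display only, e.g. '(3, failed)'."""
    if result is None:
        return 'None'
    return '(' + ', '.join(str(part) for part in result) + ')'

for line in ('Recon Mission 1 accomplished. Enemy found: 0.',
             'recon mission 20 accomplished. enemies found: 500.',
             'Recon Mission 3 failed.',
             'I have 20 carrots and 1 mushroom.'):
    print(show(parse_mission(line)))
```

    parse_mission returns data that other code can compute with, and show is the only place where the unquoted (3, failed) text is produced.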
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    27
    Rep Power
    0
    yeah, I feel like I'm making things worse and confusing myself more. Sry about that.

    I went back and decided to just implement your earlier code so now I have this:

    Code:
    def mission(s):
      match = list(map(int,re.findall(r'\d+', s)))  # list() so match[0] also works in Python 3
      sl = s.lower()
      if 'mission' in sl:
        if ('accomplished' in sl) and ('found' in sl):
          s = tuple(match)
        if match and ('failed' in sl):
          s = (match[0],'failed',)
      else:
        s = None
      return s
    It works the way I want it to now so thank you.

    Just one last question: if I wanted to use re.IGNORECASE instead of s.lower(), how would I go about doing that?
  18. #10
  19. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,701
    Rep Power
    480
    I removed IGNORECASE because case has nothing to do with matching digits. It might make sense with these tests instead.
    Code:
      if re.search('mission',s,re.IGNORECASE):
        if re.search('accomplished.*found',s,re.IGNORECASE):
          s = tuple(match)
        if match and (re.search('failed',s,re.IGNORECASE)):
          s = (match[0],'failed',)
    untested.
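    Since the fragment above is marked untested, here is a self-contained sketch that can be run as is. It assumes the post #9 version of mission, with three adjustments that are not in the posted code: list() around map so that match can be tested and indexed in Python 3, a separate result variable so the input string is never returned by accident, and elif for the 'failed' branch.

```python
import re

def mission(s):
    # list(...) so that match can be tested and indexed under Python 3
    match = list(map(int, re.findall(r'\d+', s)))
    result = None                     # assumption: None when nothing applies
    if re.search('mission', s, re.IGNORECASE):
        if re.search('accomplished.*found', s, re.IGNORECASE):
            result = tuple(match)
        elif match and re.search('failed', s, re.IGNORECASE):
            result = (match[0], 'failed')
    return result

print(mission('Recon Mission 1 accomplished. Enemy found: 0.'))
print(mission('RECON MISSION 3 FAILED.'))
print(mission('I have 20 carrots and 1 mushroom.'))
```

    Note that the flag is spelled re.IGNORECASE (or re.I); re.Ignorecase does not exist and raises AttributeError.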