#1

    How to understand look ahead in grep?


    \w+(?=abc) is a lookahead expression; it matches the word characters that come before the specified string abc.
    echo "adabc" |grep -oP "\w+(?=abc)"
    ad
    (?=abc)\w+ is not a look ahead expression.
    echo "adabc" |grep -oP "(?=abc)\w+"
    abc
    Why is the result not
    Code:
    adabc
    ?
    How can this be explained?
#2
    (?=...) is a lookahead. That's what it is called. It doesn't matter where you put it in the expression, it's still a lookahead.

    (?=abc) means "at this point there must be 'abc'".
    Code:
    \w+(?=abc) - one or more word characters, then at that point there must be 'abc'
    Code:
    (?=abc)\w+ - at this point there must be 'abc', then match one or more word characters
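    The latter cannot match all of "adabc" because the lookahead has to succeed at the exact position where the match starts, so the match can only begin where 'abc' begins: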
    Code:
              adabc
    (?=abc)   ^ no match
    \w+
    
              adabc
    (?=abc)    ^ no match
    \w+
    
              adabc
    (?=abc)     ^   match
    \w+         ^^^ abc
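To get "adabc" with the lookahead in front, the lookahead itself has to be satisfiable from the first character. A minimal sketch, assuming GNU grep built with PCRE support (the -P flag); the second pattern is an illustrative variant and not something from the original post:

```shell
# Assumes GNU grep with PCRE support (-P).
input="adabc"

# Lookahead first: the match can only start where "abc" begins.
anchored=$(echo "$input" | grep -oP '(?=abc)\w+')
echo "$anchored"    # abc

# Illustrative variant: \w* inside the lookahead lets it scan forward,
# so the condition already holds at the first character.
whole=$(echo "$input" | grep -oP '(?=\w*abc)\w+')
echo "$whole"       # adabc
```

In both cases the lookahead consumes nothing. Only \w+ decides how much text ends up in the output.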
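    The first one is more complicated in how it matches, because \w+ is greedy: it grabs as much as it can, then gives back one character at a time until the lookahead succeeds: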
    Code:
              adabc
    \w+       ^^^^^  adabc
    (?=abc)        ^ no match
    
              adabc
    \w+       ^^^^  adab
    (?=abc)       ^ no match
    
              adabc
    \w+       ^^^  ada
    (?=abc)      ^ no match
    
              adabc
    \w+       ^^  ad
    (?=abc)     ^ match
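The backtracking is easier to see on an input where 'abc' occurs twice. A small sketch, again assuming GNU grep with -P; the input string "xabcyabc" and the lazy quantifier \w+? are made up for illustration:

```shell
# Assumes GNU grep with PCRE support (-P).
input="xabcyabc"

# Greedy \w+ takes everything, then backs off to the LAST place
# where 'abc' follows.
greedy=$(echo "$input" | grep -oP '\w+(?=abc)')
echo "$greedy"    # xabcy

# Lazy \w+? grows one character at a time and stops at the FIRST place
# where 'abc' follows; grep -o then keeps searching after that match.
lazy=$(echo "$input" | grep -oP '\w+?(?=abc)')
echo "$lazy"      # x, then abcy on a second line
```

The lookahead is the same in both commands. The difference in output comes only from the direction in which \w+ searches for a position the lookahead accepts.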
    Last edited by requinix; August 12th, 2017 at 11:00 PM.
