#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2017
    Posts
    7
    Rep Power
    0

    How to understand '(\w)((?=\1\1\1)(\1))+'


    Code:
    echo "aaa ffffff 999999999" | grep -oP '(\w)((?=\1\1\1)(\1))+'
    ffff
    9999999
    Why does the regex match only four of the six f's, and only seven of the nine 9's?
  2. #2
  3. Lazy Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,395
    Rep Power
    9645
    Code:
                 aaa
    \w           ^    a
    (?=\1\1\1)    ^   no match
    
                 aaa
    \w            ^   a
    (?=\1\1\1)     ^  no match
    
                 aaa
    \w             ^  a
    (?=\1\1\1)      ^ no match
    
    fail
    Code:
                 ffffff
    \w           ^       f
    (?=\1\1\1)    ^      match #1
    \1            ^      f
    (?=\1\1\1)     ^     match #2
    \1             ^     f
    (?=\1\1\1)      ^    match #3
    \1              ^    f
    (?=\1\1\1)       ^   no match
    
    match "ffff"
    Code:
                 999999999
    \w           ^          9
    (?=\1\1\1)    ^         match #1
    \1            ^         9
    (?=\1\1\1)     ^        match #2
    \1             ^        9
    (?=\1\1\1)      ^       match #3
    \1              ^       9
    (?=\1\1\1)       ^      match #4
    \1               ^      9
    (?=\1\1\1)        ^     match #5
    \1                ^     9
    (?=\1\1\1)         ^    match #6
    \1                 ^    9
    (?=\1\1\1)          ^   no match
    
    match "9999999"
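    The lookahead (?=\1\1\1) only succeeds while at least three more copies of the captured character lie ahead, so the repetition stops once 2 or fewer remain in the run. The match is therefore always the run minus its last two characters: "f" x6 gives "f" x4, "9" x9 gives "9" x7, and "z" x1000 would give "z" x998. A run of 3 or fewer, like "aaa", never matches at all, because the group has to repeat at least once.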
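
If you want to check those counts yourself, here is a minimal sketch. It assumes Python's re module handles this pattern the same way grep -P (PCRE) does, which should hold for plain lookaheads and backreferences like these. The helper name matches is only for illustration and is not part of the original command.

```python
import re

# Same pattern as in the question, compiled with Python's re module
# (an assumption: the thread itself only uses grep -P).
PATTERN = re.compile(r'(\w)((?=\1\1\1)(\1))+')

def matches(text):
    """Return every non-overlapping match, similar to what grep -o prints."""
    return [m.group(0) for m in PATTERN.finditer(text)]

print(matches("aaa ffffff 999999999"))

# For a run of n identical characters, compare the match length with n - 2.
for n in (4, 6, 9, 1000):
    m = PATTERN.search("z" * n)
    print(n, len(m.group(0)), n - 2)
```

Each loop line prints the run length, the matched length, and n - 2, so you can see the last two characters of every run are left out.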
