1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2002

    Basic regex understanding help

    (a|b)+x = ax, bx, aax, abx, bax, bbx, ...

    The strings on the RHS are ones that match the regex on the LHS. However, as I read it, the regex matches either 'a' one or more times or 'b' one or more times, followed by exactly one 'x'.

    So 'ax', 'aax', and 'bbx' seem valid, but why are 'abx' and 'bax' valid? I thought | means 'or', so it's one or the other: either 'a' or 'b', repeated one or more times.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    San Francisco Bay
    You've got the grouping mixed up. Take it apart.

    (a|b) -- matches either 'a' or 'b'.
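    (a|b)+ -- matches one or more repetitions of (a|b), so 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aba', etc. are all good. The choice between 'a' and 'b' is made again on every repetition, so any string is good as long as each character individually matches (a|b). (a|b)+ is not the same as (a+|b+), which only matches a run of a single letter such as 'aa' or 'bbb'.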
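
One way to see the difference is to run both patterns over the strings from the question. The thread doesn't name a language or regex engine, so this is only a minimal sketch assuming Python's standard re module; the names mixed, same_letter, samples, and matching are made up for illustration.

```python
import re

# Pattern from the question: the group (a|b) is repeated, so each
# repetition may pick 'a' or 'b' independently of the previous one.
mixed = re.compile(r"(a|b)+x")

# What the question assumed: one letter is chosen, then that letter repeats.
same_letter = re.compile(r"(a+|b+)x")

# The six strings from the question plus two that should match neither.
samples = ["ax", "bx", "aax", "abx", "bax", "bbx", "x", "acx"]


def matching(pattern, strings):
    """Return the strings that the pattern matches in full."""
    return [s for s in strings if pattern.fullmatch(s)]


mixed_hits = matching(mixed, samples)
same_letter_hits = matching(same_letter, samples)

print(mixed_hits)        # includes 'abx' and 'bax'
print(same_letter_hits)  # excludes 'abx' and 'bax'
```

fullmatch is used so that the whole string has to match; with search, (a+|b+)x would still find the 'bx' inside 'abx' and hide the difference.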

    Comments on this post

    • extrovertive agrees : Excellent explanation!
