#1
    RegEx hell - help appreciated


    Hi all

    I'm struggling with a regular expression. I need to match:

    1) The name of a place
    2) Then any of [.,: -]
    3) Then either nothing or a word that is NOT one of the following three words: oblast, region, krai.

    So for example it would match:

    "Investment into Chelyabinsk is up on 2007"
    "...factory in Chelyabinsk."

    But not...

    "Investment into Chelyabinsk oblast is up on 2007"
    "...factory in Chelyabinsk oblast."

    I've looked through lots of guides and found all sorts of info on exceptions (^) and alternatives (|) but can't seem to nail it.

    Thanks in advance.
#2
    If you're using a regex flavor that supports it, a negative lookahead is the construct you need. In Perl that would look something like:
    Code:
    /Chelyabinsk[.,: -](?!(?:oblast|region|krai)\b)/
    Otherwise you'll probably have to capture the next word and check if it's valid.
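    For regex flavors without lookahead, the capture-and-check fallback could look roughly like this in PHP. This is only a sketch: the function name place_mentioned_alone() and the default word list are illustrative assumptions, not part of any library.

```php
<?php
// Hypothetical helper: returns true if $place occurs in $text at least once
// WITHOUT being followed (after separators) by one of the excluded words.
function place_mentioned_alone(
    string $text,
    string $place,
    array $excluded = ['oblast', 'region', 'krai']
): bool {
    // Match the place name and optionally capture the word that follows it.
    $pattern = '/\b' . preg_quote($place, '/') . '\b(?:[.,: -]+(\w+))?/i';

    if (!preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        return false; // the place name does not occur at all
    }

    foreach ($matches as $m) {
        // Group 1 is absent when nothing (or no word) follows the name.
        $next = strtolower($m[1] ?? '');
        if (!in_array($next, $excluded, true)) {
            return true; // this occurrence is acceptable
        }
    }
    return false; // every occurrence was followed by an excluded word
}
```

    The check on the captured word happens in ordinary PHP code, so the list of excluded words can be changed without touching the pattern.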
#3
    Thanks - but how can I do that in PHP?
#4
    It's OK, I improvised and it works, thanks again.

    preg_match('/' . preg_quote($placename, '/') . '\b(?![.,: -]+(?:oblast|region|krai|okrug)\b)/i', $text)
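
    A slightly more defensive sketch of the same negative-lookahead idea, run against the example sentences from the first post. It escapes the place name with preg_quote() and allows any run of the separators [.,: -] before an excluded word. The function name build_place_pattern() is made up for illustration.

```php
<?php
// Hypothetical helper: builds a pattern that matches $place unless it is
// followed by separators and then one of the excluded words.
function build_place_pattern(
    string $place,
    array $excluded = ['oblast', 'region', 'krai', 'okrug']
): string {
    // Escape each excluded word so it is treated literally in the pattern.
    $words = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $excluded));

    // \b keeps "Chelyabinsk" from matching inside a longer word;
    // the negative lookahead rejects "<separators><excluded word>".
    return '/\b' . preg_quote($place, '/') . '\b(?![.,: -]+(?:' . $words . ')\b)/i';
}

$pattern = build_place_pattern('Chelyabinsk');
$samples = [
    'Investment into Chelyabinsk is up on 2007',
    '...factory in Chelyabinsk.',
    'Investment into Chelyabinsk oblast is up on 2007',
    '...factory in Chelyabinsk oblast.',
];

foreach ($samples as $s) {
    echo $s, ' => ', preg_match($pattern, $s) ? 'match' : 'no match', PHP_EOL;
}
```

    With this pattern the first two sample sentences match and the two "oblast" sentences do not.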
