#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2009
    Posts
    3
    Rep Power
    0

    RegEx to match a string only if it doesn't have a particular string within it


    Hi All

    Requesting help on this one. I am working with webMethods, a Java-based tool that provides built-in functionality to replace a character or string in a given input; it also accepts a RegEx to match the search string.

    My requirement is to create a RegEx that can be supplied to the above built-in functionality to match any string (which may contain newlines and other whitespace characters) only if it doesn't contain a particular word, say BRANCH.

    Examples of positive matches can be-

    * The bank of America.
    * The #123 bank.
    * @$%# (*%&.

    Examples of negative matches can be-

    * BRANCH
    * The #123 BRANCH of this bank.
    * This is the last BRANCH.
    * BRANCH BRANCH


    I tried using the expression [\s\S\s]*(?!BRANCH)[\s\S\s]*, but this doesn't work for all the scenarios.


    Thanks !!
  2. #2
  3. Transforming Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,127
    Rep Power
    9398
    Using a regular expression to do this is silly.
    Code:
    ^((?!BRANCH).)*$
  4. #3
  5. No Profile Picture
    User 165270
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2005
    Posts
    497
    Rep Power
    937
    As already mentioned by requinix, regex isn't well suited to negate something (except a single character). Regex is more intended to match strings, not "not match" them.

    Anyway, if you find requinix' answer a bit confusing, you may find this approach a bit easier to comprehend:

    Code:
    ^(?!.*?BRANCH).*$
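
To see how the two patterns above behave on the single-line examples from the first post, here is a minimal Java sketch. The class and method names are hypothetical and have nothing to do with webMethods, and it assumes the whole input must match (`Matcher.matches()`), which may differ from how the webMethods replace service applies the pattern.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not part of webMethods.
class BranchPatterns {
    // requinix's pattern: before consuming each character, check that "BRANCH" does not start there.
    static final Pattern TEMPERED = Pattern.compile("^((?!BRANCH).)*$");
    // prometheuzz's pattern: a single lookahead at the start scans ahead for "BRANCH".
    static final Pattern LOOKAHEAD = Pattern.compile("^(?!.*?BRANCH).*$");

    static boolean temperedAccepts(String input) {
        return TEMPERED.matcher(input).matches();
    }

    static boolean lookaheadAccepts(String input) {
        return LOOKAHEAD.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "The bank of America.", "The #123 bank.", "@$%# (*%&.",
            "BRANCH", "The #123 BRANCH of this bank.",
            "This is the last BRANCH.", "BRANCH BRANCH"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + temperedAccepts(s) + " / " + lookaheadAccepts(s));
        }
    }
}
```

Both patterns accept the three positive examples and reject the four negative ones; they differ only in where the negative lookahead runs.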
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2009
    Posts
    3
    Rep Power
    0
    Thanks a lot for the RegEx pattern, and please accept my apologies for the late response.

    This pattern works for all the possible cases except those where there is a newline in the string. For example-

    Positive match-

    * The
    #123 of this bank.

    Negative match -

    * The
    #123 BRANCH of this
    bank. .

    Is there a way we can add the newline option in the RegEx pattern?

    Thanks !!


    arsh
  8. #5
  9. No Profile Picture
    User 165270
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2005
    Posts
    497
    Rep Power
    937
    Originally Posted by arshsidhu
    Thanks a lot for the RegEx pattern, and please accept my apologies for the late response.
    No problem.

    Originally Posted by arshsidhu
    This pattern works for all the possible cases except those where there is a newline in the string. For example-

    Positive match-

    * The
    #123 of this bank.

    Negative match -

    * The
    #123 BRANCH of this
    bank. .

    Is there a way we can add the newline option in the RegEx pattern?

    Thanks !!


    arsh
    That is because the DOT meta character matches any character except new line characters. So, when your input consists of multiple lines and the first line does not have your predefined "forbidden" string, it will fail (as you have noticed).
    To overcome this, you would have to "tell" the regex engine to let the DOT meta character match any character possible (so, including new line characters!). You can do that by adding the DOT-ALL flag ("(?s)") to your regex. So, here's requinix' proposal (I like it better than what I proposed) including the DOT-ALL flag:

    Code:
    ^(?s)((?!BRANCH).)*$
    Good luck.
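
A small Java sketch of the DOT-ALL effect described above, using the multi-line examples from post #4. The class name and helper methods are hypothetical. `Pattern.DOTALL` is the compile-time equivalent of the inline `(?s)` flag, in case the flag can be passed separately from the pattern.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class DotAllDemo {
    // Without DOT-ALL, '.' stops at line terminators.
    static final Pattern WITHOUT_FLAG = Pattern.compile("^((?!BRANCH).)*$");
    // Inline flag, exactly as posted in the thread.
    static final Pattern WITH_INLINE_FLAG = Pattern.compile("^(?s)((?!BRANCH).)*$");
    // Same behaviour, with the flag supplied as a compile option instead.
    static final Pattern WITH_CONSTANT = Pattern.compile("^((?!BRANCH).)*$", Pattern.DOTALL);

    static boolean accepts(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        String positive = "The\n#123 of this bank.";
        String negative = "The\n#123 BRANCH of this\nbank. .";

        // The pattern without the flag rejects both inputs, because it cannot cross the first newline.
        System.out.println(accepts(WITHOUT_FLAG, positive) + " " + accepts(WITHOUT_FLAG, negative));
        // With DOT-ALL, only the input that lacks BRANCH is accepted.
        System.out.println(accepts(WITH_INLINE_FLAG, positive) + " " + accepts(WITH_INLINE_FLAG, negative));
        System.out.println(accepts(WITH_CONSTANT, positive) + " " + accepts(WITH_CONSTANT, negative));
    }
}
```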

    Comments on this post

    • Annie79 agrees
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2009
    Posts
    3
    Rep Power
    0


    Originally Posted by prometheuzz
    No problem.

    That is because the DOT meta character matches any character except new line characters. So, when your input consists of multiple lines and the first line does not have your predefined "forbidden" string, it will fail (as you have noticed).
    To overcome this, you would have to "tell" the regex engine to let the DOT meta character match any character possible (so, including new line characters!). You can do that by adding the DOT-ALL flag ("(?s)") to your regex. So, here's requinix' proposal (I like it better than what I proposed) including the DOT-ALL flag:

    Code:
    ^(?s)((?!BRANCH).)*$
    Good luck.


    Thanks a lot, the last pattern worked for me.
    I was not aware of the DOT-ALL flag and was trying to add '\n' to the pattern, something like this:
    Code:
    ^(?:(?!BRANCH)[\s.\s]*\n?)*$
    but it wasn't looking good either

    Thanks again !!
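
One likely reason the `[\s.\s]` attempt above did not behave as hoped: inside a character class the dot is a literal `.`, not "any character", so `[\s.\s]` only matches whitespace or a full stop. The Java sketch below illustrates this and shows a flag-free alternative using `[\s\S]`. That alternative is not from the thread, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class DotInClassDemo {
    // The class from the attempted pattern: whitespace or a literal dot.
    static boolean attemptedClassMatches(String input) {
        return Pattern.matches("[\\s.\\s]*", input);
    }

    // Tempered pattern with [\s\S] standing in for '.', so no DOT-ALL flag is needed.
    static boolean lacksBranch(String input) {
        return Pattern.matches("^(?:(?!BRANCH)[\\s\\S])*$", input);
    }

    public static void main(String[] args) {
        System.out.println(attemptedClassMatches(" . \n"));    // whitespace and dots only
        System.out.println(attemptedClassMatches("The bank")); // letters are not in the class
        System.out.println(lacksBranch("The\n#123 of this bank."));
        System.out.println(lacksBranch("The\n#123 BRANCH of this\nbank. ."));
    }
}
```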
  12. #7
  13. No Profile Picture
    User 165270
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2005
    Posts
    497
    Rep Power
    937
    Originally Posted by arshsidhu
    Thanks a lot, the last pattern worked for me.
    I was not aware of the DOT-ALL flag and was trying to add '\n' to the pattern, something like this:
    Code:
    ^(?:(?!BRANCH)[\s.\s]*\n?)*$
    but it wasn't looking good either

    Thanks again !!
    You're welcome.
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2004
    Posts
    233
    Rep Power
    76
    You don't need a regex to do this. Use InString.

    Code:
    mystr = "bank of america"
    
    if instr(lcase(mystr), "bank")
    {
        // string contains the word "bank"
    }
    else
    {
        // string does not contain the word "bank"
    }
    (Syntax varies with the language you're using. JavaScript has no function called instr, but its built-in indexOf method does the same job.)
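
Since the original question concerns a Java-based tool, here is roughly how the same substring check could look in Java, as a sketch only. It applies where you can call code directly rather than pass a pattern to a built-in service. The class and method names are hypothetical; `String.contains` and `String.toLowerCase(Locale)` are standard JDK methods.

```java
import java.util.Locale;

// Hypothetical helper class; the names are illustrative only.
class ContainsCheck {
    // True when the input does not contain the word (case-sensitive).
    static boolean lacksWord(String input, String word) {
        return !input.contains(word);
    }

    // Case-insensitive variant, mirroring the lcase() call in the snippet above.
    static boolean lacksWordIgnoreCase(String input, String word) {
        return !input.toLowerCase(Locale.ROOT).contains(word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(lacksWord("The bank of America.", "BRANCH"));
        System.out.println(lacksWord("The\n#123 BRANCH of this\nbank. .", "BRANCH"));
        System.out.println(lacksWordIgnoreCase("the last branch.", "BRANCH"));
    }
}
```

Newlines need no special handling here, because `contains` treats the input as a plain sequence of characters.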
