Thread: Help with regex

    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2010
    Posts
    11
    Rep Power
    0

    Help with regex


    Hi, I need help to write a regex which will match a text if the text contains a certain substring but not if it contains another substring.
    That is, I want a match if the text contains "&q=" except if the text contains "&q=user+desc".
    The text could be
    "?action=save&q=all&list=true" - which should match.
    "?action=save&q=user+desc&list=false" - which should not match

    Thanks for any pointers!
  2. #2
  3. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    Hi,

    why make it so damn complicated? Simply check the string twice: First you check if it contains the wanted string. Then you check if it contains the unwanted string. It's really simple.

    Yeah, theoretically, you can write a complicated regex hack to stuff those two checks into a single expression. But why would you do that?
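
    The two-step check described above might look like the following minimal sketch. The function name `should_process` and the two constants are made up for illustration; the substrings are taken from the original question.

```python
# Hypothetical names; the substrings come from the question in post #1.
WANTED = "&q="
UNWANTED = "&q=user+desc"


def should_process(text: str) -> bool:
    """Return True if text contains WANTED and does not contain UNWANTED."""
    # Two plain substring checks, no regex involved.
    return WANTED in text and UNWANTED not in text


if __name__ == "__main__":
    for sample in ("?action=save&q=all&list=true",
                   "?action=save&q=user+desc&list=false"):
        print(sample, should_process(sample))
```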

    Comments on this post

    • Laurent_R agrees : I fully agree.
    The 6 worst sins of security • How to (properly) access a MySQL database with PHP

    Why can't I use certain words like "drop" as part of my Security Question answers?
    There are certain words used by hackers to try to gain access to systems and manipulate data; therefore, the following words are restricted: "select," "delete," "update," "insert," "drop" and "null".
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2010
    Posts
    11
    Rep Power
    0
    The reason I would like a single regex, is because the text is being fed to an application which has one regex configuration parameter. The application uses the single regex to decide whether or not to process the text.
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2010
    Posts
    11
    Rep Power
    0
    I have been able to "hack" the following together, from bits and pieces of info I found...

    ^(?=.*?(&q=))((?!&q=user\+desc).)*$
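
    A quick way to check a pattern of this shape against the two sample strings from the first post is sketched below, with the pattern written starting at the `^` anchor. Python's `re` module is an assumption here, since the thread never says which regex engine the application uses; the helper name `accepts` is hypothetical. The sketch also assumes single-line input, because `.` does not match a newline by default.

```python
import re

# Lookahead requires "&q=" somewhere in the text; the tempered dot then
# consumes the whole line only if "&q=user+desc" starts at no position.
PATTERN = re.compile(r"^(?=.*?(&q=))((?!&q=user\+desc).)*$")


def accepts(text: str) -> bool:
    """Return True if the single-regex filter matches the text."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    for sample in ("?action=save&q=all&list=true",
                   "?action=save&q=user+desc&list=false"):
        print(sample, accepts(sample))
```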
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Location
    Paris area, France
    Posts
    843
    Rep Power
    496
    Originally Posted by xdzgor
    The reason I would like a single regex, is because the text is being fed to an application which has one regex configuration parameter. The application uses the single regex to decide whether or not to process the text.
    I fully agree with Jacques, and I can hardly believe that there is no way of filtering out bad data beforehand.
  10. #6
  11. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    Interestingly, I've heard this justification several times, so maybe applications today actually do have broken processing tools and force their users to come up with weird regex workarounds.

    Well, then I have bad news: Not everything can be expressed with a single regex. You'll run into this same issue again and again, and some day, there simply won't be a workaround.

    So if you have any chance to fix the underlying problem, do it. A validation module has to be smarter than "Gimme a regex, I'll try to match the input against it". It must take custom functions so that you can specify more "complex" checks. Anything else is just the laziness of the original programmer.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2010
    Posts
    11
    Rep Power
    0
    Thanks for the comments. The application actually allows us to write a "custom analysis" class (implement an interface), which is used to pre-process the text. The one class supplied is a "regex analysis" which can accept one regex which is used to accept or deny the text.
    I guess I could easily write my own analysis class which combines two of the regex classes...
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Location
    Paris area, France
    Posts
    843
    Rep Power
    496
    Yes, a (zero-width) negative lookahead assertion, written (?!pattern), is probably going to be the right solution for your specific problem within one regex only, but can't you call your validation class twice?
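
    For illustration, a single negative lookahead anchored at the start of the text is enough for this particular case. The sketch below assumes an engine with Perl-style lookaheads (Python's `re` here, which may differ from the application's engine) and single-line input; the helper name `accepts` is hypothetical.

```python
import re

# (?!.*&q=user\+desc) fails the match if the unwanted substring occurs
# anywhere in the line; ".*&q=" then requires the wanted substring.
PATTERN = re.compile(r"^(?!.*&q=user\+desc).*&q=")


def accepts(text: str) -> bool:
    """Return True if text contains "&q=" but not "&q=user+desc"."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    for sample in ("?action=save&q=all&list=true",
                   "?action=save&q=user+desc&list=false",
                   "?action=save&list=true"):
        print(sample, accepts(sample))
```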
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2010
    Posts
    11
    Rep Power
    0
    I can now. The application instantiates a configured "pre-process" class, and calls its "analyze" method.
    I've written my own "pre-process" class, which allows calling multiple other "pre-process" classes - so I can easily use several simple regexes.
    Thanks.
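
    A composite "pre-process" class of the kind described above could be sketched roughly as follows. This is not the application's real API: only the method name `analyze` comes from the thread, while the class names `RegexAnalysis` and `CompositeAnalysis`, the `accept_on_match` flag, and the choice of Python are assumptions made for illustration.

```python
import re
from typing import Protocol, Sequence


class Analysis(Protocol):
    """Hypothetical interface: analyze() returns True to accept the text."""

    def analyze(self, text: str) -> bool: ...


class RegexAnalysis:
    """Accepts or denies text based on one simple regex."""

    def __init__(self, pattern: str, accept_on_match: bool = True) -> None:
        self._pattern = re.compile(pattern)
        self._accept_on_match = accept_on_match

    def analyze(self, text: str) -> bool:
        matched = self._pattern.search(text) is not None
        # Accept when the match result agrees with the configured polarity.
        return matched == self._accept_on_match


class CompositeAnalysis:
    """Accepts text only if every wrapped analysis accepts it."""

    def __init__(self, analyses: Sequence[Analysis]) -> None:
        self._analyses = list(analyses)

    def analyze(self, text: str) -> bool:
        return all(a.analyze(text) for a in self._analyses)


# Two simple regexes instead of one complicated one.
query_filter = CompositeAnalysis([
    RegexAnalysis(r"&q="),
    RegexAnalysis(r"&q=user\+desc", accept_on_match=False),
])
```

    With this arrangement each regex stays trivial, and further conditions can be added by appending another analysis to the list.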
