#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2011
    Posts
    6
    Rep Power
    0

    Some mistake with quantifier and look-ahead assertion


    Hi,

    I don't understand why my expression doesn't work and hope somebody can help me.

    In the string "(*.ini; *.exe; )" the first "; " should be found, but the second should not. The search expression is "; *(?!\))". But this way the second semicolon is still matched, just without its trailing space. There should be only one match: the first "; ".

    What's the mistake and how can I do it right? Thanks
  2. #2
  3. Did you steal it?
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,071
    Rep Power
    9398
    Regular expressions always try to match. Your space* means that it will try to match as many spaces as possible, but if the expression fails later on, the engine will backtrack and match fewer spaces.
    In your case, the space* matches zero spaces (which is allowed because *=0 or more) and the negative lookahead passes (because the next character is a space, not a closing parenthesis).

    Using space+ will fix this instance, but you'll have the same problem if there are two or more spaces.

    What are you trying to match? The filename patterns?
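
    The backtracking described above can be reproduced with a short Python sketch. The first expression is the one from the question. The second is only one possible workaround and an assumption, not something proposed in this thread: it adds the space to the negative lookahead, so giving back spaces can no longer rescue the match.

```python
import re

text = "(*.ini; *.exe; )"

# Original expression: "; *" is greedy, but may backtrack to zero spaces.
original = re.compile(r"; *(?!\))")
original_matches = [m.group() for m in original.finditer(text)]
# The second hit is the bare ";": the engine gave back the space so that
# the lookahead sees " " instead of ")".

# Possible workaround (an assumption, not from the thread): also reject a
# space in the lookahead, so backtracking to fewer spaces cannot succeed.
stricter = re.compile(r"; *(?![ )])")
stricter_matches = [m.group() for m in stricter.finditer(text)]

print(original_matches)  # ['; ', ';']
print(stricter_matches)  # ['; ']
```

    Engines with possessive quantifiers or atomic groups offer another route, since those constructs forbid giving back the spaces in the first place.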
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2011
    Posts
    6
    Rep Power
    0
    Thanks for the help. Now it's clear to me.

    I want to check the validity of a string you can pass to a file-save dialog, like "All supported formats (*.abc; *.def; *.ghi)". The * in the search expression is intentional: any number of spaces, including zero, should be allowed. So the problem remains. Is there a solution?
  6. #4
  7. Did you steal it?
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,071
    Rep Power
    9398
    The dialogs I know use the two parts separately: the display text and the actual filename filter...

    Anyways, how about
    Code:
    \([^\s;)]+(;\s*[^\s;)]+)*\)$
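
    A quick way to try the expression above is a small Python check. The pattern is copied unchanged from the post; the sample strings are taken from this thread, except the last one, which is a made-up case without a space after the semicolon.

```python
import re

# Pattern quoted from the post above, unchanged.
filter_group = re.compile(r"\([^\s;)]+(;\s*[^\s;)]+)*\)$")

samples = [
    "All supported formats (*.abc; *.def; *.ghi)",  # well-formed
    "(*.ini; *.exe; )",                             # dangling separator
    "(*.ini;*.exe)",                                # no space after ";"
]

# search() is used because the pattern is anchored only at the end.
results = {s: bool(filter_group.search(s)) for s in samples}
for s, ok in results.items():
    print(f"{s!r}: {ok}")
```

    The dangling "; " is rejected because every separator must be followed by at least one pattern character before the closing parenthesis.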
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2011
    Posts
    6
    Rep Power
    0
    This works for the part in parentheses. But I have the same issue on a larger scale: elements separated by a separator.

    You can see the function I use, if you google for "autoit reference FileSaveDialog". Multiple filters can be stated in the form of
    Code:
    All (*.*)|Text files (*.txt)
    So how can I build an expression in which I only have to write a subpattern once, with a separator after it, where the separator is only matched if the end does not follow? To make an easy example:
    Code:
    string|string|string
    has to match,
    Code:
    string|string|stng
    (stands for error in subpattern) and
    Code:
    string|string|string|
    not.
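
    Two ways to approach the "write the subpattern once" requirement are sketched below in Python. Neither is confirmed as the solution the poster eventually used. ITEM is a hypothetical stand-in for the real subpattern. Option A keeps a single definition in the source and lets the code repeat it; option B keeps a single occurrence in the regex itself by allowing the separator only when the end does not follow.

```python
import re

ITEM = r"string"  # hypothetical stand-in for the real subpattern

# Option A: ITEM is written once in the source; the code repeats it.
composed = re.compile(rf"{ITEM}(?:\|{ITEM})*")

# Option B: ITEM appears once in the regex itself. Each item must be
# followed by a separator that is not at the end, or by the end itself.
single = re.compile(rf"(?:{ITEM}(?:\|(?!\Z)|\Z))+")

def is_valid(pattern, text):
    # fullmatch() requires the whole string to match, like ^...$.
    return pattern.fullmatch(text) is not None

for candidate in ("string|string|string",
                  "string|string|stng",
                  "string|string|string|"):
    print(candidate, is_valid(composed, candidate), is_valid(single, candidate))
```

    Option A is usually easier to read; option B is closer to the literal request, at the cost of a lookahead.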
  10. #6
  11. Did you steal it?
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,071
    Rep Power
    9398
    Before I do anything else:

    Is that the actual string you're working with? The full story? Is there anything else you haven't said that will invalidate anything I say because I'm giving the right answer to the wrong question?
    Because twice now you've changed the problem. Before it was just filename patterns, then it was a filter, and now it's a set of filters, and I don't really feel like answering one question after another if you don't actually need all those answers.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2011
    Posts
    6
    Rep Power
    0
    I wrote a program in which you can export some kind of log file. The user may want any extension he prefers. Therefore, in the INI file that holds the strings for a specific language, you can set the "filter" parameter of the FileSaveDialog function.

    So there could be, e.g.
    Code:
    All Files (*.*)|Text files (*.txt)|Log files (*.log)|Supported files (*.txt; *.log)
    The issue is the same on different scales:
    - Between ^ and $ there are elements with a specific pattern, separated by | (e.g. "All Files (*.*)")
    - Nested in the other pattern between ( and ) there are elements with a specific pattern, separated by ";\s*" (e.g. "*.log").

    What I want to do is to validate the string. I thought there must be a way I don't have to write every subpattern twice.
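
    One way to avoid writing each subpattern twice is to assemble the expression from named pieces, as in the Python sketch below. All names and the exact character classes are assumptions made for illustration, including the rule that every entry needs a description before its parenthesized group. It is not the expression documented at the link later in this thread.

```python
import re

# Hypothetical building blocks; the character classes are assumptions.
PATTERN = r"[^\s;()|]+"                         # e.g. "*.log"
GROUP = rf"\({PATTERN}(?:;\s*{PATTERN})*\)"     # e.g. "(*.txt; *.log)"
ENTRY = rf"[^|()]+{GROUP}"                      # e.g. "Log files (*.log)"
FILTER = re.compile(rf"{ENTRY}(?:\|{ENTRY})*")  # entries joined by "|"

def is_valid_filter(text: str) -> bool:
    # fullmatch() plays the role of the ^ and $ anchors.
    return FILTER.fullmatch(text) is not None

example = ("All Files (*.*)|Text files (*.txt)|Log files (*.log)"
           "|Supported files (*.txt; *.log)")
print(is_valid_filter(example))
print(is_valid_filter(example + "|"))
print(is_valid_filter("All Files (*.*; )"))
```

    Each level follows the same "item, then any number of separator plus item" shape, so a dangling separator fails at both the "|" level and the ";" level.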
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2011
    Posts
    6
    Rep Power
    0
    In case somebody reads this thread and wants to get an answer:
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2011
    Posts
    6
    Rep Power
    0
    I found one and documented it there: http://regexlib.com/REDetails.aspx?regexp_id=3325.
