#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2018
    Posts
    5
    Rep Power
    0

    Regular Expression Combining Does not start with AND a contains in same group


    Help!

    I want to say first that I am pretty new to regular expressions. I have played with them a bit, but I am nowhere near an expert.

    I want to create a group that applies two separate conditions when searching PDF file names:

    1) Does Not Contain "Confirm"
    2) Starts with "SOW" OR "[TEST] SOW"

    This means a PDF named "SOW TEST Confirm REPORT.PDF" OR "[TEST] SOW TEST Confirm REPORT.PDF" would be excluded, but one named "SOW TEST REPORT.PDF" OR "[TEST] SOW REPORT.PDF" would be included.

    Here is what I have written. It works when I remove the negative "does not contain Confirm" condition, but when I add it in, no matter how I write it, it either hangs (never finishes) or makes everything negative.

    Here is the code without condition 1) above in it:

    Code="^(^C://Users/All Users/TEST\\SOW.*\.pdf|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).*$"

    This works exactly as expected. Anything that starts with "SOW" or "[TEST] SOW" is returned. Anything that doesn't is not.

    Now I want to add in the "Does not contain Confirm". I believe the syntax is:

    ?!Confirm.*\.pdf

    I have tried:

    Code="^((?!Confirm.*\.pdf|^C://Users/All Users/TEST\\SOW.*\.pdf|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).)*$"

    But that doesn't filter it properly. It basically turns everything into a negative condition because of where I put the closing parenthesis at the end.

    If I move the parenthesis up:

    Code="^((?!Confirm.*\.pdf)|^C://Users/All Users/TEST\\SOW.*\.pdf|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).*$"

    It doesn't filter anything and brings everything back (basically ignores my 3 parameters). How do I get the code to do what I want?

    Thank you,
    J
  2. #2
  3. Impoverished Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,817
    Rep Power
    9646
    You can't just throw syntax at a regular expression and expect it to work. Have you tried to understand what stuff you wrote?

    The first one
    Code:
    ^(^C://Users/All Users/TEST\\SOW.*\.pdf|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).*$
    means
    Start at the beginning of the string. Then either (a) it must be the beginning of the string followed by C:/Users/All Users/TEST\SOW and stuff and .pdf, or (b) it must be the beginning of the string followed by C:/Users/All Users/TEST\[TEST] SOW and stuff and .pdf. Then there is more stuff and the end of the string.
    Did you notice the redundancy?

    Then there's
    Code:
    ^((?!Confirm.*\.pdf|^C://Users/All Users/TEST\\SOW.*\.pdf|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).)*$
    Start at the beginning of the string. There must not be either (a) Confirm and stuff and .pdf, or (b) the beginning of the string followed by C:/Users/All Users/TEST\SOW and stuff and .pdf, or (c) the beginning of the string followed by C:/Users/All Users/TEST\[TEST] SOW and stuff and .pdf; after all that there's a single character. Repeat the previous sentence zero or more times until the string ends.
    See how that means "not be either (a) or (b) or (c)"? That's because the (?!) works on everything in there. (?!a|b|c) means neither a nor b nor c.
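
A minimal sketch of that behaviour, assuming Python's `re` module stands in for whatever tool the poster is using (the thread never names it). The pattern is the second attempt copied unchanged; the two sample strings are hypothetical. Because the file that should be found satisfies alternative (b) inside the `(?!)`, the negation rejects it at the very first character, while an unrelated name passes.

```python
import re

# The second attempt from the thread, split across lines only for readability.
second_attempt = re.compile(
    r"^((?!Confirm.*\.pdf"
    r"|^C://Users/All Users/TEST\\SOW.*\.pdf"
    r"|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).)*$"
)

# Hypothetical sample strings for illustration.
wanted = r"C://Users/All Users/TEST\SOW REPORT.pdf"
unrelated = "notes.txt"

wanted_matches = second_attempt.match(wanted) is not None
unrelated_matches = second_attempt.match(unrelated) is not None

print(wanted_matches)     # False: the wanted file is what the (?!) forbids
print(unrelated_matches)  # True: nothing forbidden appears anywhere
```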

    Blindly moving the parentheses changes the meaning quite a bit.
    Code:
    ^((?!Confirm.*\.pdf)|^C://Users/All Users/TEST\\SOW.*\.pdf|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).*$
    Start at the beginning of the string. Then either (a) there is no Confirm and stuff and .pdf, or (b) the beginning of the string followed by C:/Users/All Users/TEST\SOW and stuff and .pdf, or (c) the beginning of the string followed by C:/Users/All Users/TEST\[TEST] SOW and stuff and .pdf. Then there is more stuff and the end of the string.
    It's better than the second one but is still a far cry from being correct.
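
The same kind of check for the third attempt, again a sketch that assumes Python's `re` module and uses hypothetical sample strings. Here `(?!Confirm.*\.pdf)` is only tested at the start of the string, so it succeeds for anything that does not literally begin with Confirm, and the trailing `.*$` then accepts the rest.

```python
import re

# The third attempt from the thread, split across lines only for readability.
third_attempt = re.compile(
    r"^((?!Confirm.*\.pdf)"
    r"|^C://Users/All Users/TEST\\SOW.*\.pdf"
    r"|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).*$"
)

# Hypothetical sample strings for illustration.
samples = [
    "notes.txt",
    r"C://Users/All Users/TEST\SOW TEST Confirm REPORT.pdf",
    "Confirm REPORT.pdf",
]

results = {s: third_attempt.match(s) is not None for s in samples}

for name, matched in results.items():
    print(matched, name)
# True  notes.txt
# True  C://Users/All Users/TEST\SOW TEST Confirm REPORT.pdf
# False Confirm REPORT.pdf
```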

    Let's start over from scratch.

    String starts with that initial path, right?
    Code:
    ^C:/Users/All Users/TEST\\
    Next it has to be either SOW or [TEST] SOW. Or in other words, an optional [TEST] followed by SOW.
    Code:
    (\[TEST\] )?SOW
    Then stuff until the .pdf extension, and the end of the string.
    Code:
    .*\.pdf$
    All together
    Code:
    ^C:/Users/All Users/TEST\\(\[TEST\] )?SOW.*\.pdf$
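
As a quick check of the combined pattern, here is a sketch assuming Python's `re` module and hypothetical file paths written with a single slash after `C:` to match the pattern as posted. At this stage nothing excludes Confirm yet, so the last sample still matches.

```python
import re

# The combined pattern from the post, as a raw string.
base = re.compile(r"^C:/Users/All Users/TEST\\(\[TEST\] )?SOW.*\.pdf$")

# Hypothetical paths for illustration.
samples = [
    r"C:/Users/All Users/TEST\SOW TEST REPORT.pdf",
    r"C:/Users/All Users/TEST\[TEST] SOW REPORT.pdf",
    r"C:/Users/All Users/TEST\OTHER REPORT.pdf",
    r"C:/Users/All Users/TEST\SOW TEST Confirm REPORT.pdf",
]

base_results = [base.match(s) is not None for s in samples]
print(base_results)  # [True, True, False, True]
```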
    Now we can add in the bit about not containing Confirm. If it's in the string anywhere it must be after the path and SOW prefix.
    Code:
    ^C:/Users/All Users/TEST\\(\[TEST\] )?SOW (?!___).*\.pdf$
    But what to put in there? If you put Confirm then all you're doing is saying that it can't be immediately after SOW. Which is wrong. You have to say that Confirm cannot appear at that point or anywhere after.
    There is no "not at this point or anywhere after" operator in regular expressions. It's only "not at this point". You have to use that to say "not at this point, and not at this point, and not at this point..." until the end. Which means repeating.
    Code:
    ^C:/Users/All Users/TEST\\(\[TEST\] )?SOW ((?!___))*.*\.pdf$
    Now if you put Confirm it will repeat that test over and over, which is good except nothing actually changes after each test. The regex has to advance. That means using a dot to match a character. Not in the (?!) because you want "not Confirm", and not "not Confirm and a character", but in the repetition part.
    Code:
    ^C:/Users/All Users/TEST\\(\[TEST\] )?SOW ((?!Confirm).)*.*\.pdf$
    Except now there's that .* still in there. The Confirm won't be in the ((?!)) part but it could be in the .* part. Since you already have a sort of .* with the negation you can get rid of the original .*
    Code:
    ^C:/Users/All Users/TEST\\(\[TEST\] )?SOW ((?!Confirm).)*\.pdf$
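
To see the difference between a single `(?!Confirm)` and the repeated `((?!Confirm).)*`, here is a sketch assuming Python's `re` module. The file names are hypothetical, and the `PREFIX` constant is only a convenience for building test paths. Note that both patterns require a space after SOW, as written in the post.

```python
import re

# Single lookahead: only forbids Confirm immediately after "SOW ".
single = re.compile(
    r"^C:/Users/All Users/TEST\\(\[TEST\] )?SOW (?!Confirm).*\.pdf$"
)
# Repeated lookahead: forbids Confirm at every position up to ".pdf".
repeated = re.compile(
    r"^C:/Users/All Users/TEST\\(\[TEST\] )?SOW ((?!Confirm).)*\.pdf$"
)

PREFIX = "C:/Users/All Users/TEST\\"  # hypothetical folder, one trailing backslash

# Hypothetical file names for illustration.
names = [
    "SOW TEST REPORT.pdf",
    "[TEST] SOW REPORT.pdf",
    "SOW Confirm REPORT.pdf",
    "SOW TEST Confirm REPORT.pdf",
]

single_results = [single.match(PREFIX + n) is not None for n in names]
repeated_results = [repeated.match(PREFIX + n) is not None for n in names]

print(single_results)    # [True, True, False, True]
print(repeated_results)  # [True, True, False, False]
```

Only the last file differs: the single lookahead lets Confirm through once it is no longer directly after SOW, while the repeated form rejects it.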

    Comments on this post

    • jpmuus47 agrees : It worked! Thank you so much.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2018
    Posts
    5
    Rep Power
    0
    Ok what you state makes sense. Thank you for your help!

    Now it doesn't seem to work as intended.

    I tried using:

    ^C:/Users/All Users/TEST\\(\[TEST\] )?SOW ((?!Confirm).)*\.pdf$

    and it didn't do anything, as if it didn't recognize it.

    I then tried to remove the confirm and try it:

    ^C:/Users/All Users/TEST\\(\[TEST\] )?SOW.*\.pdf$

    It still didn't like it. I then set it back to what I had without the Confirm part:

    ^(^C://Users/All Users/TEST\\SOW.*\.pdf|^C://Users/All Users/TEST\\\[TEST\] SOW.*\.pdf).*$

    and it works.
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2018
    Posts
    5
    Rep Power
    0
    OK, I figured it out! The slash after the C: needed an escape character, so it needs two slashes (//).

    You wrote:

    ^C:/Users/All Users/TEST\\(\[TEST\] )?SOW ((?!Confirm).)*\.pdf$

    This worked:

    ^C://Users/All Users/TEST\\(\[TEST\] )?SOW ((?!Confirm).)*\.pdf$

    THANK YOU SO MUCH!!!
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2018
    Posts
    5
    Rep Power
    0
    I tried to apply what I learned above in a different circumstance. It is partly working, but not fully working as I expected. I am trying:

    TEST ((?!Confirm).)*\.pdf

    so that it includes anything with TEST but not with Confirm.

    Of these 3 files:

    1. TEST Report.pdf
    2. FMU TEST Confirm.pdf
    3. FMU TEST.pdf

    the 1st and 3rd should be found. It is only finding the 1st, as if it were doing a "Starts with TEST". Can you tell me why my above code isn't finding #3?
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2018
    Posts
    5
    Rep Power
    0
    I figured out it was the space after TEST.

    TEST((?!Confirm).)*\.pdf
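
A small sketch of why the space mattered, assuming Python's `re.search` as a stand-in for the unnamed tool. In "FMU TEST.pdf" the character right after TEST is a dot, so a pattern that demands a literal space there cannot match.

```python
import re

with_space = re.compile(r"TEST ((?!Confirm).)*\.pdf")
without_space = re.compile(r"TEST((?!Confirm).)*\.pdf")

# File names taken from the posts above.
files = ["TEST Report.pdf", "FMU TEST.pdf"]

with_space_results = [with_space.search(f) is not None for f in files]
without_space_results = [without_space.search(f) is not None for f in files]

print(with_space_results)     # [True, False]
print(without_space_results)  # [True, True]
```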

    However, now I have another issue. If I use the 3 files plus a new one:

    1. TEST Report.pdf
    2. FMU TEST Confirm.pdf
    3. FMU TEST.pdf
    4. FMU Confirm TEST.pdf

    it finds the 1st and 3rd and correctly doesn't find the 2nd. However, it doesn't find the 4th. Why doesn't it find the 4th? It seems like it is doing a "Does not end with Confirm".
  12. #7
  13. Impoverished Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,817
    Rep Power
    9646
    I don't know what regex you're trying now because I would not have expected what you came up with to match #3. At least not if it was based on what I posted.
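
For comparison only, here is how the pattern from post #6 behaves under Python's `re.search`. This is an assumption, not a diagnosis: the poster's tool is never named and may apply patterns differently (for example, against a full path), so results there can differ. The second pattern, with `(?!.*Confirm)` anchored at the start, is not from the thread; it is one common way to reject Confirm anywhere in the name rather than only after TEST.

```python
import re

# Pattern from post #6: Confirm is only forbidden between TEST and ".pdf".
after_test_only = re.compile(r"TEST((?!Confirm).)*\.pdf")
# Not from the thread: forbids Confirm anywhere in the whole name.
anywhere = re.compile(r"^(?!.*Confirm).*TEST.*\.pdf$")

# The four file names from post #6.
files = [
    "TEST Report.pdf",
    "FMU TEST Confirm.pdf",
    "FMU TEST.pdf",
    "FMU Confirm TEST.pdf",
]

after_test_results = [after_test_only.search(f) is not None for f in files]
anywhere_results = [anywhere.search(f) is not None for f in files]

print(after_test_results)  # [True, False, True, True]
print(anywhere_results)    # [True, False, True, False]
```

The two patterns disagree only on the 4th file, where Confirm comes before TEST.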
