Thread: Regex question

    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2016
    Posts
    4
    Rep Power
    0

    Question Regex question


    Hi Forum. I have a question about a regex that I cannot get to work properly.
    I want my regex to match a 10-digit number in my text header. It should allow text before and after the digits.
    It should NOT match if there are more than 10 digits, and this is the part that doesn't work for me. What should I change in my regex?

    Bold Black= are always same value
    Bold Red= can be different digits, but a maximum of 10 digits, and should support -, _ and spaces between the digits.

    Auto:Abc0909802323
    Auto:Abc0909802323

    My regex:
    [Aa][uU][tT][oO]:.*(0[1-9]|[1-2][0-9]|3[01])[ -\.\/\_]{0,1}(0[0-9]|1[0-2])[ -\.\/\_]{0,1}\d{2}[ -\.\/\_]{0,1}\d{4}
  2. #2
  3. Lazy Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,333
    Rep Power
    9645
    So "Auto:" (case-insensitive), stuff that is not digits, up to 10 digits with possible -s and _s, then stuff that is not digits?

    I can't really comment on what you need to change with your regex because it doesn't agree with your description... According to that,
    Code:
    ^Auto:\D*(\d[-_]?){0,9}\d\D*$
    1. "Auto:" at the beginning of the string
    2. \D* to match zero or more non-digits
    3. \d[-_]? to match one digit possibly followed by a hyphen or underscore
    4. Repeat that up to 9 times, then match another digit
    5. Zero or more non-digits and the end of the string
    If your regex engine can do a case-insensitive flag then that's nicer than writing out all those [Aa]s. If not then try prepending (?i) to the regex. If that doesn't work either then do the [Aa] thing.
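A minimal sketch of the pattern above, assuming Python's `re` module (the thread does not say which engine is in use). The repeat count is written as `{0,9}` because some engines read `{,9}` as literal text rather than as a quantifier. The helper name `matches` and the test subjects are made up for illustration.

```python
import re

# "Auto:" (any case), non-digits, 1 to 10 digits with optional - or _
# after each of the first nine, then non-digits up to the end.
PATTERN = re.compile(r"^Auto:\D*(\d[-_]?){0,9}\d\D*$", re.IGNORECASE)


def matches(subject: str) -> bool:
    """Return True if the whole subject fits the described shape."""
    return PATTERN.search(subject) is not None


print(matches("Auto:Abc0909802323"))     # 10 digits
print(matches("auto:abc09-09-80-2323"))  # lowercase prefix, hyphens
print(matches("Auto:Abc09098023231"))    # 11 digits
```

Note that this pattern accepts anything from 1 to 10 digits, which is what "up to 10 digits" asks for. It does not check that the digits form a date.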

    But it looks like the numbers are supposed to form a date? That's quite different from merely "10 digits" (not least because of the count). If there's a problem with yours then it's probably (because I don't know what "doesn't work for me" means) the ".*" being too permissive - try \D* instead.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2016
    Posts
    4
    Rep Power
    0
    Hi requinix, thank you for your reply! My first post was maybe not so precise. Yes, that is correct: it is a date format, followed by a 4-digit random number (EU social security number).
    My own regex works, but it also matches if I type more than 10 digits (social security number), and I am not interested in matching that. I only want a match if the subject contains a social security number of the correct length.

    Facts:
    1. The subject will always begin with "Auto:"
    2. There can be random text between "Auto:" and the social security number. Example: "Auto:abc0109862423"
    3. It should only match if the social security number is 10 digits long. This should not be matched: Auto:01098624232456
    4. Hyphens, underscores and spaces should be supported in the social security number. Example: Auto:01-09-86-2423 or Auto:01 09 86 2423 or

    Day   Month   Year   4-digit random number
    08    09      86     1234

    Thanks again !
  6. #4
  7. Lazy Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,333
    Rep Power
    9645
    So I'm assuming that the "random text" cannot have digits? Otherwise it could have a 10-digit number in there. And because the text can come before or after, you wouldn't know which 10-digit number was the social security number.
    Or there can't be text after - like all your examples so far have been.

    Like I said, the fix could simply be swapping the .*, which will match anything, with a \D*, which will only match non-digits.
    Oh, there is one more thing though: hyphens in a character set [] will indicate a range unless you put them at the beginning or end or escape them. Additionally . doesn't have its special meaning inside them, and _ is never special at all. (/ could be special depending how you're using this regex.)
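To illustrate the point about hyphens in character sets, here is a small sketch, assuming Python's `re` module. It compares the set from the original regex with the reordered one. The names `original_class` and `fixed_class` and the list of probe characters are only for illustration.

```python
import re

# In the original set, " -\." is read as the range from space (0x20)
# to "." (0x2E), so it also accepts characters such as "+", "#" and "*".
original_class = re.compile(r"[ -\.\/\_]")

# With the hyphen first it is a literal hyphen and no range is formed.
fixed_class = re.compile(r"[- .\/_]")

for ch in ["-", " ", ".", "/", "_", "+", "#", "*"]:
    print(
        repr(ch),
        "original:", bool(original_class.fullmatch(ch)),
        "fixed:", bool(fixed_class.fullmatch(ch)),
    )
```

The original set still accepts the hyphen, but only because "-" happens to fall inside the space-to-dot range.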
    Code:
    [Aa][uU][tT][oO]:\D*(0[1-9]|[1-2][0-9]|3[01])[- .\/_]{0,1}(0[0-9]|1[0-2])[- .\/_]{0,1}\d{2}[- .\/_]{0,1}\d{4}
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2016
    Posts
    4
    Rep Power
    0
    Thanks again requinix.

    Correct. I am not looking for digits in the "random text". I expect only letters before and after the social security number.

    If I take
    Code:
    [Aa][uU][tT][oO]:\D*(0[1-9]|[1-2][0-9]|3[01])[- .\/_]{0,1}(0[0-9]|1[0-2])[- .\/_]{0,1}\d{2}[- .\/_]{0,1}\d{4}
    and test it on https://regex101.com/, it also matches if I write Auto:0809861234333343, which should not match because of the length.
  10. #6
  11. Lazy Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,333
    Rep Power
    9645
    Then you need to add a \D*$ at the end - it works like the \D* before but also requires that there be nothing else after the non-digits.
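A short sketch of the effect of the trailing \D*$, assuming Python's `re` module and search semantics (a match anywhere in the subject, as on regex101). The names `PREFIX`, `BODY`, `unanchored` and `anchored` are only for illustration. The pattern itself is the one posted above.

```python
import re

PREFIX = r"[Aa][uU][tT][oO]:\D*"
BODY = (
    r"(0[1-9]|[1-2][0-9]|3[01])[- .\/_]{0,1}"
    r"(0[0-9]|1[0-2])[- .\/_]{0,1}\d{2}[- .\/_]{0,1}\d{4}"
)

# Without the tail, the regex is satisfied as soon as it has seen
# 10 digits and ignores whatever follows them.
unanchored = re.compile(PREFIX + BODY)

# With \D*$ only non-digits may follow, so extra digits make it fail.
anchored = re.compile(PREFIX + BODY + r"\D*$")

for subject in [
    "Auto:abc0809861234",
    "Auto:08-09-86-1234 xyz",
    "Auto:0809861234333343",
]:
    print(
        subject,
        "unanchored:", bool(unanchored.search(subject)),
        "anchored:", bool(anchored.search(subject)),
    )
```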
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2016
    Posts
    4
    Rep Power
    0
    Thanks, it actually works! I was thinking about making it even sharper. What should I do if I want to support digits before and after my social security number? It should only match if the social security number exists in this format and length in the subject: 0908862411 or 090886-2411.

    Example. (This should be matched)
    Auto:Casenr. 5460551 - something abz, 030286-1234 or
    Auto:something abz,. 030286-1234 Casenr 123332234343434354353535

    Thank you !
  14. #8
  15. Lazy Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,333
    Rep Power
    9645
    What if one of those numbers is also 10 digits long and happens to match the social security number format? Like
    Code:
    Auto:asdhnj0123456789dfulhg030286-1234
    Is that not a problem? Would you only care that there is a number in there that looks like it could be an SSN?
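One possible way to explore that question is sketched below, assuming Python's `re` module and an engine with lookbehind support. This approach is not something settled in the thread. It looks for a date-shaped run of 10 digits (with an optional hyphen before the last four) that is not directly touching any other digit, and returns every candidate so the ambiguity stays visible. The function name `find_candidates` and the subject containing 0101991234 are made up for illustration.

```python
import re

# (?<!\d) and (?!\d) reject a candidate that is part of a longer digit run.
# The day and month checks are the ones from the regex posted earlier.
CANDIDATE = re.compile(
    r"(?<!\d)(0[1-9]|[12][0-9]|3[01])(0[0-9]|1[0-2])\d{2}-?\d{4}(?!\d)"
)


def find_candidates(subject: str) -> list[str]:
    """Return every SSN-shaped number in a subject starting with "Auto:"."""
    if not subject.lower().startswith("auto:"):
        return []
    return [m.group(0) for m in CANDIDATE.finditer(subject)]


for subject in [
    "Auto:Casenr. 5460551 - something abz, 030286-1234",
    "Auto:something abz,. 030286-1234 Casenr 123332234343434354353535",
    "Auto:asdhnj0123456789dfulhg030286-1234",
    "Auto:ref 0101991234 abz, 030286-1234",  # hypothetical ambiguous subject
]:
    print(subject, "->", find_candidates(subject))
```

In this sketch the date check happens to rule out 0123456789 (there is no month 23), but the last subject yields two candidates, which is exactly the ambiguity raised above. The caller would still have to decide which one, if any, is the real number.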
