December 28th, 2012, 12:45 PM
Need a Regex Guru
i'm working with an xsd that cannot change; it has a regex restriction value that determines the validation format. i am taking this xsd and automagically generating a wrapper class, which is then parsed generically via reflection.
what i'd like to know, is it possible to parse an example of the following regex strings (these could be anything since i won't know what string i'll encounter):
to obtain its constituent parts to create a format string thusly: "yyyy-mm-ddThh:mm:ss" ?
or better yet, dynamically create a new replacement regex based on the validation regex?
i know i'm probably talking about a lexer/parser/tokenizer etc, but just looking at this it *should* be possible, right?
please note, the point is to determine the format strictly from the regex validation string and it cannot change. i'd like to dynamically parse it to create a replace regex string or generate a format string to use with a DateTime object etc..
i appreciate your help in advance..
December 28th, 2012, 05:45 PM
I understand what you are talking about, but not really what you want.
is clearly a right way to match yyyy-mm-dd and check that the date is within some valid constraints, but do you need to be so strict?
Wouldn't something like this:
be sufficient to match a validly formatted expression that looks like a date?
(I did not say a valid date, but your quite complicated expression will also fail to see that 2013-02-29 or even 2012-02-31 is not a valid date. So the bottom line is: do you want to validate a date format, in which case my much simpler expression might be sufficient, or do you want to validate a date, in which case a regular expression is probably not what you are looking for?)
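To illustrate the distinction between checking a date format and checking a date, here is a minimal Python sketch. The pattern `DATE_SHAPE` and the function names are assumptions made for illustration; they are not the expressions posted in this thread.

```python
import re
from datetime import datetime

# Hypothetical simplified pattern: checks only the shape yyyy-mm-dd.
DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}")


def has_date_format(text: str) -> bool:
    """Return True if text is shaped like yyyy-mm-dd (no calendar check)."""
    return DATE_SHAPE.fullmatch(text) is not None


def is_real_date(text: str) -> bool:
    """Return True only if text names a day that exists on the calendar."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False
```

With these helpers, "2012-02-31" passes `has_date_format` but fails `is_real_date`, which is the gap a regular expression alone is unlikely to close.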
December 28th, 2012, 05:57 PM
so to be a bit more in-depth, the regex string above isn't mine, it's part of an overall xsd (xsd:restriction/xsd:pattern) that needs to be untouched, bugs and all unfortunately ;(
Originally Posted by Laurent_R
which is another reason why i would like to be able to dynamically generate a new replace string based off of their validation string. since my code walks the object via reflection, i cannot obtain enough information dynamically to discern what the format is supposed to be. thus my choices are to somehow figure out what format is implied by the validation regex, or to hard code the format into my dynamic code..
we will be using this automation scheme to download the xsd -> autogen the wrapper classes from it -> generate a gui with the appropriate controls per type -> serializing the object after manipulation -> calling a webservice with the resultant xml. using it this way means that if they ever make changes all we ever have to do is obtain the xsd, and then everything else is autogen'd from it, thus no recoding etc..
unfortunately, i have no control over the regexes contained in the pattern elements, they are what they are ;(
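One possible way to get a format out of an untouchable validation regex, without writing a regex parser, is to probe it: render known date values with a list of candidate formats and keep the formats whose output the regex accepts. The sketch below is in Python and is only an illustration under stated assumptions. `CANDIDATE_FORMATS`, `PROBES`, and `infer_formats` are hypothetical names, the formats are strftime directives rather than "yyyy-mm-dd" style strings, and it assumes the XSD pattern uses only syntax that Python's `re` module also understands. XSD patterns must match the whole value, hence `fullmatch`.

```python
import re
from datetime import datetime
from typing import List

# Hypothetical candidate formats, written as strftime directives.
CANDIDATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d",
    "%d/%m/%Y",
    "%m/%d/%Y",
    "%H:%M:%S",
]

# Probe values chosen so that day and month cannot be mistaken for each other.
PROBES = [
    datetime(2012, 12, 28, 17, 45, 9),
    datetime(2001, 1, 31, 23, 59, 58),
]


def infer_formats(xsd_pattern: str) -> List[str]:
    """Return the candidate formats whose rendered probes all match the pattern."""
    compiled = re.compile(xsd_pattern)
    return [
        fmt
        for fmt in CANDIDATE_FORMATS
        if all(compiled.fullmatch(probe.strftime(fmt)) for probe in PROBES)
    ]
```

If exactly one candidate survives, that is the format to use; if none or several survive, the code can fall back to a hard-coded default or flag the field for manual review.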
December 29th, 2012, 03:27 AM
Just taking again only the first part of your regex to simplify the discussion:
will match, for example, "2012-12-29", but will not capture the year. "((0[1-9])|(1))" will capture "12" and "((0[1-9])|([0-9])|(3))" will capture "29". From there on, you can try to figure out what substitution will be possible.
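As a rough illustration of how capture groups behave in a pattern of this shape, here is a small Python sketch. The pattern is an assumed stand-in written for this example, not the regex from the XSD: the year has no parentheses around it, so it is matched but not captured, while the month and day alternations are.

```python
import re

# Illustrative pattern only; the thread's actual regex is not reproduced here.
# Groups 1 and 4 are the outer month and day groups; the year is not captured.
pattern = re.compile(
    r"\d{4}-((0[1-9])|(1[0-2]))-((0[1-9])|([12][0-9])|(3[01]))"
)

match = pattern.fullmatch("2012-12-29")
month = match.group(1)
day = match.group(4)

# A substitution can only reuse what was captured, so the year is lost here.
day_month = pattern.sub(r"\4/\1", "2012-12-29")

# Wrapping the year in parentheses makes it available as group 1.
with_year = re.compile(
    r"(\d{4})-((0[1-9])|(1[0-2]))-((0[1-9])|([12][0-9])|(3[01]))"
)
reordered = with_year.sub(r"\5/\2/\1", "2012-12-29")
```

The point is that a replacement string can only be built from the groups the validation regex happens to capture, which is why the first step is to list what each group returns for a known input.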
January 2nd, 2013, 12:27 PM
i am by no stretch of the imagination completely up on regex, let alone being a guru..
Originally Posted by Laurent_R
would you happen to have something to start me off? or a link that might deal with something like this?
January 3rd, 2013, 06:57 AM
These are useful tutorials on regular expressions in Perl:
More detailed tutorial:
Very detailed reference:
I think they will explain many things to you, even if you are using a programming language other than Perl (most modern regex packages are directly derived from the Perl regexes), but you can also look into the tutorials for your own language (if it is not Perl).