October 7th, 2012, 02:12 PM
Match quoted strings excluding the quotes..?
Is it possible to write a simple regex that matches the contents of a quoted
string _excluding_ the quotes?
Consider the following text:
| This is a quoted string "string1", and this is another string: "string2"
Per the tutorial I am reading, the textbook regex that matches "string1", then
"string2" including the quotes would go something like:
| " : a double quote,
| followed by..
| [^"] : any character that is not a double quote,
| + : 1 to n times,
| followed by..
| " : a double quote
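For reference, the pattern described above written out compactly is `"[^"]+"`. Here is a minimal sketch of it run against the sample text, using Python's `re` module as a stand-in for the tutorial's engine (an assumption; the variable names are just for illustration):

```python
import re

# Sample text from the post.
text = 'This is a quoted string "string1", and this is another string: "string2"'

# A double quote, then one or more non-quote characters, then a double quote.
with_quotes = re.findall(r'"[^"]+"', text)

print(with_quotes)  # ['"string1"', '"string2"']
```

Both quoted strings come back with their quotes included, as the tutorial says.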
I tried using zero-length matches for the opening/closing quotes, but with the
above sample text, I ended up with something that also matches:
| ', and this is another string: '
Which is clearly not what I want..
My (limited) understanding of what is going on is that excluding the closing
quote from the match by specifying a zero-length match causes the regex engine
to start at/before the closing quote's location when looking for the next match.
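To make that concrete, here is a sketch of the zero-length attempt, assuming it was written with a lookbehind for the opening quote and a lookahead for the closing one (my guess at the pattern; Python's `re` is used as a stand-in engine):

```python
import re

# Sample text from the post.
text = 'This is a quoted string "string1", and this is another string: "string2"'

# (?<=") : zero-length check that a quote precedes the match
# [^"]+  : one or more non-quote characters
# (?=")  : zero-length check that a quote follows the match
lookaround_matches = re.findall(r'(?<=")[^"]+(?=")', text)

print(lookaround_matches)
# ['string1', ', and this is another string: ', 'string2']
```

Because the closing quote of "string1" is never consumed, it is still available to satisfy the lookbehind of the next match, and the opening quote of "string2" satisfies its lookahead. That is where the unwanted middle match comes from.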
I can think of something clumsy that would involve doing my zero-length match on
(1) start-of-text followed by 1 to n 'non-quotes', or
(2) a double-quoted string followed by 1 to n 'non-quotes',
... followed in both cases by my target string's opening quote..
Looks like this strategy might work, but I'm wondering if anyone knows of
a simple/obvious solution to this.. something that would force the regex to
consume the zero-length matched closing quote, before it starts looking for the
next match, perhaps..?
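One candidate that seems to have this effect, assuming the engine supports capture groups: let the match itself consume both quotes, and read only the group between them. A minimal sketch, again with Python's `re` as a stand-in rather than the tutorial's perl-compatible engine:

```python
import re

# Sample text from the post.
text = 'This is a quoted string "string1", and this is another string: "string2"'

# Both quotes are part of the overall match, so they are consumed;
# the parentheses capture only the characters between them.
contents = re.findall(r'"([^"]+)"', text)

print(contents)  # ['string1', 'string2']
```

Strictly speaking the match still includes the quotes; only the captured group excludes them. With a single group in the pattern, `re.findall` returns the group contents rather than the whole match.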
Not really concerned about implementation at this point.. I believe the tutorial
I'm working with uses a perl-compatible syntax.