Regex Programming
 
  #1  
October 7th, 2012, 02:12 PM
Henry Adams
Match quoted strings excluding the quotes..?

Is it possible to write a simple regex that matches the contents of a quoted
string _excluding_ the quotes?

Consider the following text:

| This is a quoted string "string1", and this is another string: "string2"

Per the tutorial I am reading, the textbook regex that matches "string1", then
"string2" including the quotes would go something like:

| "[^"]+"

Match:

| " : a double quote,
|
| followed by..
|
| [^"] : any character that is not a double quote,
| + : 1 to n times,
|
| followed by..
|
| " : a double quote
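To see the breakdown above in action, here is a minimal sketch in Python rather than perl (an assumption made for illustration; the pattern itself is unchanged, and the variable names are made up):

```python
import re

text = 'This is a quoted string "string1", and this is another string: "string2"'

# "[^"]+" : a quote, one or more non-quote characters, then a quote.
# The quotes are part of the pattern, so they are part of every match.
with_quotes = re.findall(r'"[^"]+"', text)
print(with_quotes)  # ['"string1"', '"string2"']
```

Both matches come back with their surrounding quotes, which is the behaviour the question is trying to avoid.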

I tried using zero-length matches for the opening/closing quotes, but with the
above sample text, I ended up with something that also matches:

| ', and this is another string: '

Which is clearly not what I want..

My (limited) understanding of what is going on is that excluding the closing
quote from the match by specifying a zero-length match causes the regex engine
to start at/before the closing quote's location when looking for the next match.
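The exact pattern tried is not shown in the post, so the following is only a guess at it: a Python sketch (not perl) that uses a lookbehind and a lookahead for the two quotes. It reproduces the unwanted extra match described above.

```python
import re

text = 'This is a quoted string "string1", and this is another string: "string2"'

# (?<=") and (?=") only assert that a quote is adjacent; they consume nothing.
# After matching string1, the closing quote is still unconsumed, so it can
# serve as the "opening" quote for the text that follows it.
zero_width = re.findall(r'(?<=")[^"]+(?=")', text)
print(zero_width)
# ['string1', ', and this is another string: ', 'string2']
```

Because the closing quote is never consumed, the engine is free to reuse it, which matches the explanation given above.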

I can think of something clumsy that would involve doing my zero-length match on
either:

(1) start-of-text followed by 1 to n 'non-quotes', or

(2) a double-quoted string followed by 1 to n 'non-quotes',

... followed in both cases by my target string's opening quote..

Looks like this strategy might work, but I'm wondering if anyone knows of
a simple/obvious solution to this.. something that would force the regex to
consume the zero-length matched closing quote, before it starts looking for the
next match, perhaps..?

Not really concerned about implementation at this point.. I believe the tutorial
I'm working with uses a perl-compatible syntax.

Thanks..!

#2
October 7th, 2012, 03:42 PM
spacebar208
Try this and see if it is what you want:
Code:
echo 'This is a quoted string "string1", and this is another string: "string2"' | perl -wnE 'say for /"([0-9a-zA-Z]*)"/g'

#3
October 7th, 2012, 06:38 PM
Henry Adams
Quote:
Originally Posted by spacebar208
Try this and see if it is what you want:
Code:
echo 'This is a quoted string "string1", and this is another string: "string2"' | perl -wnE 'say for /"([0-9a-zA-Z]*)"/g'


That was quick..!

I made it more general to (hopefully) include special characters, etc. like so:

| % echo 'ascii string: "string_1", unicode string: "κορδόνι"' | perl -wnE 'say for /"([^"]*)"/g'
| string_1
| κορδόνι

I then noticed that if I remove the parentheses, I get the following:

| % echo 'ascii string: "string_1", unicode string: "κορδόνι"' | perl -wnE 'say for /"[^"]*"/g'
| "string_1"
| "κορδόνι"

So it's the parentheses that do the trick..!
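The same contrast can be sketched in Python (an assumption for illustration, not perl). Python's `re.findall` behaves much like perl's `/g` match in list context here: with one pair of parentheses it returns only the captured part, and without any it returns the whole match. The variable names are hypothetical.

```python
import re

text = 'ascii string: "string_1", unicode string: "κορδόνι"'

# No parentheses: each result is the whole match, quotes included.
without_group = re.findall(r'"[^"]*"', text)

# One capturing group: each result is only what the group matched.
with_group = re.findall(r'"([^"]*)"', text)

# The narrower class from the earlier reply rejects "_" and Greek letters,
# so it finds nothing in this particular text.
narrow = re.findall(r'"([0-9a-zA-Z]*)"', text)

print(without_group)  # ['"string_1"', '"κορδόνι"']
print(with_group)     # ['string_1', 'κορδόνι']
print(narrow)         # []
```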

I'm not familiar with perl's intricacies, but perhaps you could direct me to the
perl doc that describes this feature..?

Thanks for the help..!

#4
October 8th, 2012, 12:33 AM
spacebar208
It's called a "capturing group". Just google "Perl Regular Expressions". This is just one link you can look at:
http://www.tutorialspoint.com/perl/..._expression.htm
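As a rough illustration of why a capturing group answers the original question, here is a small Python sketch (Python is an assumption here; the thread itself uses perl). The whole match still includes and consumes both quotes, so the next search starts after the closing quote, while the group exposes only the text in between.

```python
import re

text = 'This is a quoted string "string1", and this is another string: "string2"'
pattern = re.compile(r'"([^"]+)"')

results = []
for m in pattern.finditer(text):
    # group(0) is the whole match (quotes included);
    # group(1) is only the part inside the parentheses.
    results.append((m.group(0), m.group(1)))

print(results)
# [('"string1"', 'string1'), ('"string2"', 'string2')]
```

No zero-length assertions are needed, so the text between the two quoted strings is never picked up as a match.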

#5
October 8th, 2012, 01:43 PM
Henry Adams
"capturing group" is exactly what I was looking for..

Thanks again for the help.
