Regex Programming
 
  #1  
December 30th, 2011, 02:55 AM
WhiteRau
Match multiple versions of city names?

i need a regex to match the following possible variations of city name patterns:

city
city st. town
st. city
big city
twin-city
some-town-city

the `st.` can be matched literally if necessary. matching should be case insensitive.

this is what i have so far, though some of it was built by RegexBuddy and i have no idea what the ?: means...
Code:
[a-z]+(?:[\s]?)(?:[\.]?[\s]?)?[a-z]*


thanks gang!

WR!

  #2  
December 30th, 2011, 06:33 AM
requinix
Short of using alternation (writing a regex that means "this or this or this"), only the first three can be combined into one pattern. The others match completely different strings.
Code:
(st\. city|city( st\. town)?)

(?:...) is a non-capturing group: it groups the subpattern, but what it matches isn't "remembered" (captured) for later use. Check our resources sticky for more information.
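
As a rough illustration of that difference, here is a small JavaScript sketch (JavaScript because that is where the pattern ends up later in the thread; the variable names are made up for the example). It runs the same alternation once with capturing groups and once with (?:...):

```javascript
// Same alternation twice: once with capturing groups, once with (?:...).
// The dot is escaped (\.) so it only matches a literal period.
const capturing = /^(st\. city|city( st\. town)?)$/i;
const nonCapturing = /^(?:st\. city|city(?: st\. town)?)$/i;

const input = "city st. town";

const withGroups = input.match(capturing);
const withoutGroups = input.match(nonCapturing);

// Index 0 is always the whole match; later indexes are the remembered groups.
console.log(withGroups.length);    // 3: whole match plus two captured groups
console.log(withGroups[2]);        // " st. town"
console.log(withoutGroups.length); // 1: whole match only, nothing remembered
```

Both versions accept and reject the same strings; the only difference is whether the pieces are kept for later.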

  #3  
December 30th, 2011, 10:25 AM
WhiteRau
ah! thank you very much! i didn't think i could do it in one pattern, so thanks for verifying that. dang, eh?

i've done some reading since last night and was wondering if this was something i could do with word boundaries...?

bah. in the end i suppose i could just test against anything NOT alphabetic or dot/dash...

thank you so much.

WR!

  #4  
December 30th, 2011, 01:40 PM
requinix
Word boundaries are the difference between "ark" and "market": \bark\b matches "ark" only as a whole word, not the "ark" inside "market".
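
A minimal JavaScript sketch of the word-boundary idea, with variable names invented for the example. It also shows why \b alone may not help much for city names, since it treats "-" and "." as word separators:

```javascript
// \b matches the position between a word character and a non-word character.
const bare = /ark/i;        // matches "ark" anywhere, even inside a longer word
const bounded = /\bark\b/i; // matches "ark" only as a whole word

console.log(bare.test("market"));        // true
console.log(bounded.test("market"));     // false
console.log(bounded.test("Noah's ark")); // true

// Caveat for city names: "-" and "." are non-word characters, so \b also
// matches next to them and cannot by itself reject odd punctuation.
const cityWord = /\bcity\b/i;
console.log(cityWord.test("twin-city")); // true
```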

What kind of text are you dealing with? Do you know specifically which cities/states/etc. you're searching for?

  #5  
December 30th, 2011, 02:18 PM
WhiteRau
i'm just trying to validate city names and make sure they don't input garbage. the dot, dash and space are the only allowed chars in city names, so i thought i'd test for those, but dump anything else. since we're generating legal docs, i thought it would be a nice feature to make sure they don't live in 123land or some crap like that.
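
A character-only check along those lines could be sketched in JavaScript roughly as follows. The function name is hypothetical, and the sketch assumes plain ASCII letters, so names with accents or apostrophes would be rejected:

```javascript
// Hypothetical helper: accepts only ASCII letters, dots, dashes and spaces.
function looksLikeCityName(input) {
  // ^ and $ anchor the whole string; the leading [a-z] requires a letter first.
  // Inside the class, the dash is placed last so it is taken literally.
  return /^[a-z][a-z .-]*$/i.test(input.trim());
}

console.log(looksLikeCityName("st. city"));  // true
console.log(looksLikeCityName("twin-city")); // true
console.log(looksLikeCityName("123land"));   // false
console.log(looksLikeCityName(""));          // false
```

This only checks which characters appear, not how they are arranged, so something like "a--b.." would still pass.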

i suppose, at the end of the day, if the client is that stupid, then we'll take their money and be done with it... :P

hope that helped.

WR!

  #6  
December 30th, 2011, 04:35 PM
ragax
Hi WhiteRau,

requinix is quite right that we need alternation on this one.

Here is a pattern that matches the sample text you gave. It groups options #3, #4 and #5 on one line of the alternation. Each of the other options has its own line.

Just dump this in the pattern window of RegexBuddy:
Code:
(?ix)
^[A-Z]+ # take the first word
(?:$| # just city
(?:-|[.]?[ ])[A-Z]+$| # st. city, big city or twin-city
[ ][A-Z]+\.[ ][A-Z]+$|  # city st. town
(?:-[A-Z]+){2}$   # some-town-city
)


Then dump this in the test string window:
Code:
city
city st. town
st. city
big city
twin-city
some-town-city


Make sure that RegexBuddy is set to line-by-line mode.
To see what each line does, remove the "Z": some of the strings will be unmatched.

Please let me know if this is what you are looking for.

Wishing you both a fun weekend and a fruitful new year.

  #7  
January 4th, 2012, 07:15 PM
WhiteRau
absolutely brilliant. thank you so much. there's no way i would have solved this one on my own. this padawan has much to learn in the ways of RegEx. lol.

what does the (?ix) actually do? i see that it is a mode modifier, but JavaScript does not support those... any way around that you know of? is it looking for a line-break? there aren't any. the list is just possible entry types. they will be evaluated individually as the client enters the name in the form.

thank you so much.

WR!

  #8  
January 4th, 2012, 11:16 PM
ragax
Hi WhiteRau, thrilled this is working for you.

Quote:
what does the (?ix) actually do?


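The x turns on "comment mode" (also called free-spacing or "whitespace mode"): whitespace in the pattern is ignored and everything from a # to the end of the line is a comment. That is what enabled me to write the regex on multiple lines (easier to read), and it is why the literal spaces are written as [ ]. If you remove the comments (everything after the #), the spaces in front of them, and the line breaks, you can bring it back to one line.

The i is for "case insensitive". In JavaScript you can use the /i modifier instead, so you can get rid of (?ix) and get everything on one line.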
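
For reference, the conversion described above might look like this in JavaScript. This is only a sketch: the pattern is the one from post #6 collapsed by hand, and the names cityPattern and samples are invented for the example:

```javascript
// The pattern from post #6 with the (?ix) prefix, comments, the spaces in
// front of them, and line breaks removed. The /i flag replaces the inline "i".
const cityPattern =
  /^[A-Z]+(?:$|(?:-|[.]?[ ])[A-Z]+$|[ ][A-Z]+\.[ ][A-Z]+$|(?:-[A-Z]+){2}$)/i;

// The six sample inputs from the original question.
const samples = [
  "city",
  "city st. town",
  "st. city",
  "big city",
  "twin-city",
  "some-town-city",
];

for (const sample of samples) {
  console.log(sample, cityPattern.test(sample)); // each sample prints true
}

console.log(cityPattern.test("123land")); // false: digits are not in [A-Z]
```

Because each entry is tested on its own, no multiline flag is needed: ^ and $ anchor the start and end of the single string being checked.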

  #9  
January 5th, 2012, 09:25 AM
WhiteRau
thank you so much. if you don't mind me asking, how long have you been doing RegEx? every time i think i have a grip on it, it explodes... having RegexBuddy is helping a LOT.

may i ask what resources you'd recommend for learning more? the O'Reilly book is in my sights, but is there anything else?

thanks for your time!

WR!

  #10  
January 5th, 2012, 04:06 PM
ragax
Hi again WhiteRau!

Quote:
may i ask what resources you'd recommend for learning more?

The same question came up on another thread a few days ago, so instead of doing a half job of repeating myself, I thought I would write a comprehensive answer to which I could refer time and again. So here is my detailed answer on the Regex Resources thread.

The learning curve is steep, but that's a good thing. If you apply yourself, you can know as much as I do in about a month!


Wishing you a beautiful weekend,
