Python Programming
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Dev Shed ForumsProgramming LanguagesPython Programming

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Dev Shed Forums Sponsor:
  #1  
Old July 12th, 2004, 11:52 AM
ryankask ryankask is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Dec 2001
Posts: 52 ryankask User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 29 m 44 sec
Reputation Power: 7
Need help with re module

Could anyone write a quick reg expression script to extract anything in between some sort of tag? For example, <title>this needs to become a python var</title> .....thanks!

Reply With Quote
  #2  
Old July 12th, 2004, 03:45 PM
DevCoach DevCoach is offline
Contributing User
Dev Shed Beginner (1000 - 1499 posts)
 
Join Date: Feb 2004
Location: London, England
Posts: 1,205 DevCoach User rank is Captain (20000 - 30000 Reputation Level)DevCoach User rank is Captain (20000 - 30000 Reputation Level)DevCoach User rank is Captain (20000 - 30000 Reputation Level)DevCoach User rank is Captain (20000 - 30000 Reputation Level)DevCoach User rank is Captain (20000 - 30000 Reputation Level)DevCoach User rank is Captain (20000 - 30000 Reputation Level)DevCoach User rank is Captain (20000 - 30000 Reputation Level)DevCoach User rank is Captain (20000 - 30000 Reputation Level)DevCoach User rank is Captain (20000 - 30000 Reputation Level) 
Time spent in forums: 1 Week 5 Days 17 h 39 m 53 sec
Reputation Power: 262
There are two answers to this question: (a) yes, it is trivial, and (b) no, it is virtually impossible.

If you do not care about nested tags then the answer is trivial:

regex = r'<title>.*?</title>'

will match anything between <title> tags. However if there is the possibility that tags will be nested then it cannot be done with a single regex (as far as I know) and needs more complex parsing.

Dave - The Developers' Coach

Reply With Quote
  #3  
Old July 12th, 2004, 06:59 PM
Strike Strike is offline
Contributing User
Dev Shed Newbie (0 - 499 posts)
 
Join Date: Dec 2001
Location: Houston, TX
Posts: 383 Strike User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 1 h 41 m 27 sec
Reputation Power: 7
Send a message via ICQ to Strike Send a message via AIM to Strike Send a message via Yahoo to Strike
And another caveat: If you're going to be heavy HTML/XML/SGML parsing, you should use the standard Python libraries for doing so. Parsing via regular expressions is often not the best solution simply because the regexes themselves are generally pretty ugly. Using simple string processing and/or *ML parsing libs like the ones mentioned above are often better solutions.
__________________
Debian - because life's too short for worrying.
Best. (Python.) IRC bot. ever.

Reply With Quote
Reply

Viewing: Dev Shed ForumsProgramming LanguagesPython Programming > Need help with re module


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest News | Latest Threads | Shoutbox
Forum Jump


Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 1 hosted by Hostway