Python Programming
 
  #1  
September 19th, 2012, 11:20 AM
Rich43
Custom Lenient URL Regex

I need a very lenient URL regular expression that works with Python's re module. Before you say it: I have Googled and tried many of the ones available on the web.

Writing a robust regular expression like this myself would be way beyond my current abilities.

If you Google for "improved_regex_for_matching_urls", you will find the most popular one. It has a couple of flaws:
* It cannot cope with subdomains.
* It matches www . google . com fine, but not google.com. (Ignore the spaces; they were added to get past the forum's new-user link block.)

Perhaps someone could modify the above one or write another one from scratch. I believe other people might find this solution useful, perhaps for making a wiki/CMS.

  #2  
September 19th, 2012, 07:17 PM
b49P23TIvg
I avoid writing complicated regular expressions, preferring a divide-and-conquer approach instead.
Another good thing to avoid: the ill-defined problem.
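
One way the divide-and-conquer idea might look in Python is sketched below: let urllib.parse.urlsplit pull out the host, then check each dot-separated label with a tiny regex. The names extract_host and LABEL_RE, and the validity rules themselves (at least two labels, letters-only last label), are assumptions made for illustration, not something defined in this thread.

```python
import re
from urllib.parse import urlsplit

# Assumed rule: a label is letters/digits, with hyphens allowed inside only.
LABEL_RE = re.compile(r"^[0-9a-z]([0-9a-z-]*[0-9a-z])?$", re.IGNORECASE)

def extract_host(candidate):
    """Return the host part of candidate if it looks like a domain, else None."""
    # urlsplit only recognises a host after "//", so add it when no scheme is given.
    if "://" not in candidate:
        candidate = "//" + candidate
    try:
        host = urlsplit(candidate).hostname
    except ValueError:
        # urlsplit rejects some malformed inputs (e.g. unbalanced brackets).
        return None
    if not host:
        return None
    labels = host.split(".")
    # Assumed rule: require at least "name.tld".
    if len(labels) < 2:
        return None
    if not all(LABEL_RE.match(label) for label in labels):
        return None
    # Assumed rule: the last label must be alphabetic.
    if not labels[-1].isalpha():
        return None
    return host

print(extract_host("google.com"))
print(extract_host("http://www.google.com/search?q=x"))
print(extract_host("localhost"))
```

Each check is small enough to read and change on its own, which is the point of splitting the problem up; what counts as "valid" still has to be decided first.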

Perhaps you could write in Backus-Naur form what you consider a valid address?

Perhaps you could use the popular regular expression and, if it fails, prepend www. and try again.
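
A rough sketch of that retry idea follows. STRICT_RE is a made-up stand-in that only accepts hosts starting with "www."; it is not the popular pattern mentioned above, which is not reproduced in this thread. The function name match_with_www_fallback is likewise hypothetical.

```python
import re

# Hypothetical stand-in for a pattern that insists on a leading "www.".
STRICT_RE = re.compile(
    r"^(?:https?://)?www\.[0-9a-z-]+(?:\.[0-9a-z-]+)+$",
    re.IGNORECASE,
)

def match_with_www_fallback(candidate):
    """Return the string that matched (possibly with www. added), or None."""
    if STRICT_RE.match(candidate):
        return candidate
    # Keep any scheme in front and insert "www." right after it.
    # rpartition returns ("", "", candidate) when "://" is absent.
    scheme, sep, rest = candidate.rpartition("://")
    retry = scheme + sep + "www." + rest
    return retry if STRICT_RE.match(retry) else None

print(match_with_www_fallback("www.google.com"))
print(match_with_www_fallback("google.com"))
print(match_with_www_fallback("http://google.com"))
print(match_with_www_fallback("not a url"))
```

Note that the returned string may differ from the input, so the caller would need to strip the added www. again if the original spelling matters.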
__________________
[code]Code tags[/code] are essential for python code!

  #3  
September 19th, 2012, 07:20 PM
Rich43
I remembered that I have an old friend who uses regexes, and he wrote me this wonderful little one:
Code:
(?i)([^a-z0-9]|^)((http|https)://)?(?P<domain>([0-9a-z]+\.)*[^\.][0-9a-z]*\.[a-z]{2,5})([^a-z0-9]|$)
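
For anyone wanting to try it, here is a minimal sketch of using the pattern above with Python's re module. The pattern is copied unchanged; the helper name find_domains and the sample strings are only illustrative.

```python
import re

# The pattern exactly as posted above, split across lines for readability.
# The (?i) flag must stay at the very start of the combined string.
URL_RE = re.compile(
    r"(?i)([^a-z0-9]|^)((http|https)://)?"
    r"(?P<domain>([0-9a-z]+\.)*[^\.][0-9a-z]*\.[a-z]{2,5})"
    r"([^a-z0-9]|$)"
)

def find_domains(text):
    """Return the 'domain' group of every match found in text."""
    return [m.group("domain") for m in URL_RE.finditer(text)]

print(find_domains("google.com"))
print(find_domains("www.google.com"))
print(find_domains("http://sub.example.org/path"))
print(find_domains("Visit google.com or http://docs.python.org today"))
```

The pattern consumes the delimiter character on each side of a match, so two addresses separated by a single space may not both be matched in full. Lookarounds would be one way around that, though that changes the posted pattern.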
