September 19th, 2012, 12:20 PM
Custom Lenient URL Regex
I need a very very lenient URL regular expression that works with Python's re module. Before you say it, I have Googled and tried many of the ones available on the web.
Writing a robust regular expression like this myself would be way beyond my current abilities.
If you Google for "improved_regex_for_matching_urls" you will find the most popular one, but it has a couple of flaws:
* It cannot cope with subdomains
* It will match www . google . com fine, but not google.com (ignore the spaces; I had to add them because the forum blocks links from new users)
Perhaps someone could modify the above one or write another one from scratch. I believe other people might find this solution useful, perhaps for making a wiki/CMS.
September 19th, 2012, 08:17 PM
I avoid writing complicated regular expressions, preferring instead a divide-and-conquer approach.
Another good thing to avoid: the ill-defined problem.
Perhaps you could write in Backus-Naur form what you consider a valid address?
Perhaps you could use the popular regular expression and, if it fails, prepend www. and try again.
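A rough sketch of that fallback, in case it helps. The pattern below is only a simple stand-in that insists on a scheme or a leading www. and is not the popular regex itself, so swap in whichever one you actually use. The names STRICT_URL and looks_like_url are made up for this example.

```python
import re

# Stand-in pattern (an assumption, NOT the popular regex from the web):
# it only accepts text that starts with http://, https:// or www.
STRICT_URL = re.compile(
    r"(?:https?://|www\.)[^\s/]+\.[a-z]{2,}(?:/\S*)?",
    re.IGNORECASE,
)


def looks_like_url(text):
    """Try the strict pattern as-is, then retry with 'www.' prepended."""
    if STRICT_URL.fullmatch(text):
        return True
    # Fallback: bare names such as google.com fail the first attempt,
    # so prepend www. and test again.
    return STRICT_URL.fullmatch("www." + text) is not None


if __name__ == "__main__":
    for candidate in ("google.com", "mail.google.com/inbox", "not a url"):
        print(candidate, looks_like_url(candidate))
```

The nice part of doing it this way is that the regex stays untouched; the leniency lives in two lines of ordinary Python that are easy to adjust.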
September 19th, 2012, 08:20 PM
I remembered that I have an old friend who uses regexes, and he wrote me this wonderful little one: