March 9th, 2005, 05:56 PM
if a string matches a regex...
I'm trying to work out if a string 'fits' a regular expression.
I have the regular expression '/(\w+/)*(\w+).(\w+)' and if a string matches it then I would like it to return true, otherwise false.
I was thinking something along the lines of:
import re
path = """/path/to/file.jpg"""
r = r'/(\w+/)*(\w+).(\w+)'
ext_r = re.compile(r)
ext = ext_r.findall(path)
but I cannot find out how to tell if the string exactly matches the expression; I can only print it.
Correct me if I'm wrong, but it should find expressions like '/directory/file.ext' with many directory levels and one file.ext at the end.
For instance, I have the string /absolute/path/image.jpg and the regex should find this and return true, whereas if the string was:
'/absolute/path/image' or '/absolute/path' or 'absolute/path/image.jpg' then it should return false.
Sorry, I am totally new to both Python and regex. I have tried the re methods, but I just get object links returned to me.
Cheers for any help.
March 9th, 2005, 06:21 PM
You can use the methods without needing them to return True/False. Python accepts more than just True/False in conditional statements: a match object or a non-empty list counts as true, while None or an empty list counts as false. For instance:
>>> import re
>>> re.match("\d", "1")
<_sre.SRE_Match object at 0x0156D3A0>
>>> re.match("\d", "a")
>>> if re.match("\d", "1"):
...     print "hi"
...
hi
If you actually want the groups from your regex, then you do need the returned objects:
>>> result = re.match("/(\w+/)*(\w+).(\w+)", "/dir/x/file.jpg")
>>> if result:
...     print result.groups()
...
('x/', 'file', 'jpg')
Though it doesn't seem to produce exactly what you're looking for.
You would need to build something into the regex to mean "match this and only this" - perhaps change it to include start and end of line anchors (^ and $, IIRC).
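A minimal sketch of the anchoring idea, written in Python 3 syntax (the thread itself uses Python 2). The helper name `is_file_path` is made up for illustration, and escaping the dot as `\.` is an assumption about what the original pattern intended, since a bare `.` matches any character:

```python
import re

# Hypothetical helper, not from the original post.
# ^ anchors the start and $ anchors the end, so the whole string must fit.
# Assumption: the dot is escaped so that it only matches a literal ".".
ANCHORED = re.compile(r'^/(\w+/)*(\w+)\.(\w+)$')

def is_file_path(path):
    """Return True if path looks like /dir/.../name.ext, else False."""
    return ANCHORED.match(path) is not None

for candidate in ['/absolute/path/image.jpg',
                  '/absolute/path/image',
                  '/absolute/path',
                  'absolute/path/image.jpg']:
    print(candidate, is_file_path(candidate))
```

One caveat: `$` also matches just before a trailing newline, so `\Z` may be the safer end anchor if the input can contain one.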
"if the string exactly matches the expression"
I would be tempted to use something much simpler:
>>> path = "/path/to/file.jpg"
>>> print path.split('/')
['', 'path', 'to', 'file.jpg']
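To turn the split idea into a yes/no check, something like the following might do. It is only a sketch in Python 3 syntax: `looks_like_file_path` is a hypothetical name, and the rules it enforces (a leading slash, no empty components, a dot with text on both sides in the last component) are one reading of the question, not something it states outright. Unlike the `\w+` regex, it also accepts spaces and other characters in names.

```python
def looks_like_file_path(path):
    """Rough check for /dir/.../name.ext using str.split (illustrative only)."""
    parts = path.split('/')
    # A leading "/" makes the first element an empty string.
    if len(parts) < 2 or parts[0] != '':
        return False
    # Reject empty components, e.g. "//" or a trailing "/".
    if any(part == '' for part in parts[1:]):
        return False
    # The last component must have text on both sides of its final dot.
    name, dot, ext = parts[-1].rpartition('.')
    return bool(name) and bool(dot) and bool(ext)

print(looks_like_file_path('/path/to/file.jpg'))
print(looks_like_file_path('/path/to/file'))
```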
Last edited by sfb; March 9th, 2005 at 06:25 PM.
March 9th, 2005, 09:53 PM
Cheers, but really I was looking for something that, as you said, exactly matches the regex. It does need to return true or false, as a function, for this purpose.
After testing your solution, I'm thinking my regex is incorrect, because it will only print false (I edited it) if only a "/" is in the 'path'.
Is there a way of making all portions of the regex compulsory? I.e. the path must have a "file.jpg" portion as well as "/" (root) and then *possibly* a directory in the form of "directory/".
March 11th, 2005, 02:53 PM
What if the file has no extension?
What if it is file.jpg.vbs?
What if the system is one where / isn't the path separator?
What if the directory name has spaces or dots in the name?
Does the path have to actually exist on the system?
Do you care about invalid characters for a real path - e.g. Windows won't allow <>\? and so on in file or folder names?
pattern = "^/(.+\/)*(.+)$"
A string starting with a /
(Optionally followed by:
One or more characters and a /)
Repeated as many times as possible
Ending in one or more characters.
"/a" or "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a"
Is that what you're looking for?
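For what it's worth, here is a quick way to try that pattern against a few strings (Python 3 syntax; the sample strings are just ones picked for illustration). Since `.+` matches any characters, including dots and slashes, the pattern accepts names without an extension, and even a trailing slash:

```python
import re

# The pattern from the post, written as a raw string.
pattern = r"^/(.+\/)*(.+)$"

# Sample inputs chosen for illustration only.
samples = ["/a", "/dir/sub/file.jpg", "/dir/file", "/dir/", "/", "relative/file.jpg"]

# Map each sample to True/False depending on whether the pattern matches.
results = {s: bool(re.match(pattern, s)) for s in samples}
for s in samples:
    print(repr(s), results[s])
```

So if the extension is compulsory, the last group would need to be tightened to require a literal dot.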
I think you will have to do the True/False yourself with:
if re.match(pattern, text):
March 13th, 2005, 06:05 PM
Nope - you can make it much simpler:
return bool(re.match(pattern, text))
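Spelled out as a complete function, that could look like the sketch below (Python 3 syntax; the function names are made up here). `re.match` only anchors at the start of the string, so the second helper uses `re.fullmatch`, available from Python 3.4, to require that the whole string fits. The escaped dot in the example pattern is an assumption about the original intent.

```python
import re

def matches(pattern, text):
    """True if text matches pattern from its start (hypothetical helper)."""
    # re.match returns a match object or None; bool() maps that to True/False.
    return bool(re.match(pattern, text))

def matches_exactly(pattern, text):
    """True only if the entire text matches pattern (needs Python 3.4+)."""
    return bool(re.fullmatch(pattern, text))

# Assumed variant of the original pattern, with the dot escaped.
file_pattern = r'/(\w+/)*(\w+)\.(\w+)'

print(matches(file_pattern, '/a/b.jpg!!!'))          # prefix match is enough here
print(matches_exactly(file_pattern, '/a/b.jpg!!!'))  # trailing junk is rejected
```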