April 16th, 2013, 02:18 AM
-
Regular expression matching
if( $numb =~ m/545958\s[<i> (.*) </i> ]<font color='green'></font><br><center><p><span class="Estilo15">****************************************</span></p></center>/g) {
my $result = $1;
How do I use this in Perl? I can't extract $1. Please help.
April 16th, 2013, 06:37 AM
-
Are you sure your regular expression matched your text? Without seeing what the contents of $numb are, it's hard to say what your problem is.
BTW I've changed the thread title: please try to describe your problem in the title.
April 16th, 2013, 08:24 AM
-
That regex will not even compile; Perl reports this error:
Unmatched [ in regex; marked by <-- HERE in m/545958\s[ <-- HERE <i> (.*) </
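You have several syntax issues, namely failure to escape several key characters. For example, the unescaped / in </i> ends the m/.../ pattern early, which is why the [ is reported as unmatched.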
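As a rough illustration of the escaping problem, here is a minimal sketch of two ways to keep the / in </i> from closing the pattern. The contents of $numb below are made up, since the real input was never posted, and the pattern is cut down to the part that captures $1.

```perl
use strict;
use warnings;

# Hypothetical input: the actual contents of $numb are unknown.
my $numb = '545958 <i> some value </i>';

# Option 1: use m{...} so that / is an ordinary character in the pattern.
my $result_braces;
if ($numb =~ m{545958\s<i> (.*) </i>}) {
    $result_braces = $1;
}

# Option 2: keep m/.../ and escape every literal / as \/.
my $result_slashes;
if ($numb =~ m/545958\s<i> (.*) <\/i>/) {
    $result_slashes = $1;
}

print "braces:  $result_braces\n";
print "slashes: $result_slashes\n";
```

Both forms capture the same text; the alternate delimiter is usually easier to read when the pattern contains HTML.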
April 16th, 2013, 12:49 PM
-
In addition, something like:
[<i> (.*) </i> ]
does not make sense, since square brackets define character classes. If you want to match an actual '[' in the input, you need to escape it.
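To make the difference concrete, here is a small sketch showing that a bracketed class matches a single character from a set, while escaped brackets match literal '[' and ']'. The input strings are invented for the example.

```perl
use strict;
use warnings;

# [<i>] is a character class: it matches exactly ONE character,
# which must be '<', 'i' or '>'.
my $class_matches_i   = ('i'   =~ /\A[<i>]\z/) ? 1 : 0;  # single 'i' is in the set
my $class_matches_tag = ('<i>' =~ /\A[<i>]\z/) ? 1 : 0;  # three characters, so no match

# \[ and \] match literal square brackets in the input.
my $bracketed = '[<i> foo </i> ]';   # hypothetical input
my ($inner) = $bracketed =~ m{\[<i> (.*) </i> \]};

print "class matches 'i':   $class_matches_i\n";
print "class matches '<i>': $class_matches_tag\n";
print "inner text: $inner\n";
```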
I would add that something like:
<i>(.*)</i>
is usually a very bad idea, because .* is greedy and may match a longer string than intended, for example "<i> foo</i>bar some other words<i>baz</i>", which could make the overall regex fail depending on how it is built. BTW, it would also match the string "<i></i>", because the * quantifier means 0 or more of the preceding item.
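The greedy behaviour is easy to see in a short test. This sketch reuses the example string from the post; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $two_tags = '<i> foo</i>bar some other words<i>baz</i>';

# Greedy .* runs to the LAST </i>, swallowing everything in between.
my ($greedy) = $two_tags =~ m{<i>(.*)</i>};

# * also accepts zero characters, so an empty tag pair still matches.
my ($empty) = '<i></i>' =~ m{<i>(.*)</i>};

print "greedy capture: '$greedy'\n";
print "empty capture:  '$empty'\n";
```

The first capture spans both tag pairs rather than stopping after " foo", which is the failure mode described above.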
If you want to make sure to capture the words between these HTML tags, you should at least use something like:
<i>([^<]*)</i>
so that the match will stop at the next opening of a tag.
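One way to get that behaviour is a negated character class, which refuses to cross a '<'. The sketch below assumes simple, unnested <i> tags and reuses the example string from earlier in the thread; using + instead of * is an extra assumption here, chosen to skip empty tag pairs.

```perl
use strict;
use warnings;

my $sample = '<i> foo</i>bar some other words<i>baz</i>';

# [^<]* matches any run of characters that are not '<',
# so the capture stops at the next tag opening.
my ($first) = $sample =~ m{<i>([^<]*)</i>};

# With /g in list context, every <i>...</i> pair is captured in turn.
# [^<]+ requires at least one character, so "<i></i>" is skipped.
my @all = $sample =~ m{<i>([^<]+)</i>}g;

print "first: '$first'\n";
print "all:   ", join(' | ', @all), "\n";
```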
Finally, using regexes to parse HTML is usually not a good idea; there are several good modules that do it much better. But if you nonetheless insist on using regexes, do it only on very limited HTML strings whose structure you know very well (certainly not on Web pages), and... try to do it right.