    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013

    Regular expression matching

    if( $numb =~ m/545958\s[<i> (.*) </i> ]<font color='green'></font><br><center><p><span class="Estilo15">****************************************</span></p></center>/g) {
    my $result = $1;

    How do I use this in Perl? I can't extract $1. Please help.
  2. #2
  3. kill 9, $$;
    Devshed Supreme Being (6500+ posts)

    Join Date
    Sep 2001
    Shanghai, An tSín
    Are you sure your regular expression matched your text? Without seeing what the contents of $numb are, it's hard to say what your problem is.

    BTW I've changed the thread title: please try to describe your problem in the title.
  4. #3
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Apr 2009
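    That regex will not compile; it dies with this error:
    Unmatched [ in regex; marked by <-- HERE in m/545958\s[ <-- HERE <i> (.*) </
    You have several syntax issues, i.e., failure to escape several key characters. In particular, the unescaped / in </i> ends the m/.../ pattern early, which is why the message stops at </ and the [ is left unclosed.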
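    To illustrate the escaping problem: with m/.../ delimiters, the first unescaped "/" (the one in "</i>") ends the pattern. One way around it is to pick other delimiters such as m{...}. This is only a sketch. The value of $numb is made up, since the real contents were never posted, and it assumes the square brackets were meant as literal characters.

```perl
use strict;
use warnings;

# Hypothetical input; the real contents of $numb are unknown.
my $numb = '545958 [<i> some text </i> ]';

# m{...} lets "/" appear unescaped inside the pattern.
# \[ and \] match literal square brackets instead of opening a character class.
my $result;
if ( $numb =~ m{545958\s\[<i> (.*) </i> \]} ) {
    $result = $1;    # $1 is only meaningful when the match succeeded
}

print defined $result ? "captured: $result\n" : "no match\n";
```

    Checking the return value of the match before reading $1 also answers the earlier question of whether the regex matched at all.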
  6. #4
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Paris area, France
    In addition, something like:

    [<i> (.*) </i> ]
    does not make sense, since square brackets define character classes. Or, if you want to match an actual '[' in the input, you need to escape it.
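    A small sketch of the difference between a character class and escaped brackets. The sample strings are invented for illustration.

```perl
use strict;
use warnings;

# Unescaped: [<i>] is a character class that matches ONE character,
# which may be "<", "i", or ">".
my @class_hits = ( 'a<i>b' =~ /[<i>]/g );

# Escaped: \[ and \] match literal square brackets around the tag.
my ($literal) = ( 'x [<i>] y' =~ /(\[<i>\])/ );

print scalar(@class_hits), " single-character matches: @class_hits\n";
print "literal match: $literal\n";
```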

    I would add that something like:

    <i> (.*) </i>
    is usually a very bad idea, because .* is greedy: on a string such as "<i> foo </i>bar some other words<i> baz </i>" it captures everything up to the last " </i>", which could make the overall regex fail depending on how it is built. BTW, it would also match a pair of tags with nothing between them apart from the two spaces the pattern requires, because the * quantifier means zero or more of the preceding item (here ".", i.e. any character).
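    The greedy behaviour is easy to check on a made-up string. In this sketch the literal spaces are left out of the pattern to keep it short, and the non-greedy form .*? is shown only for comparison; it is not something suggested in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample string with two <i> elements.
my $html = '<i>foo</i> bar some other words <i>baz</i>';

# Greedy .* runs on to the LAST </i> in the string.
my ($greedy) = $html =~ m{<i>(.*)</i>};

# Non-greedy .*? stops at the first </i>.
my ($lazy) = $html =~ m{<i>(.*?)</i>};

# * allows zero repetitions, so empty tags match as well.
my $empty_ok = '<i></i>' =~ m{<i>(.*)</i>} ? 1 : 0;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
print "empty tags match: $empty_ok\n";
```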

    If you want to make sure to capture the words between these HTML tags, you should at least use something like:

    so that the match will stop at the next opening of a tag.
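    The pattern that belonged here did not survive in the post. A negated character class such as [^<]* fits the description (it cannot run past the next "<"), so the sketch below assumes that was the idea; it is not recovered from the original. The sample strings are made up.

```perl
use strict;
use warnings;

# Hypothetical sample string with two <i> elements.
my $html = '<i>foo</i> bar <i>baz</i>';

# [^<]* matches any run of characters other than "<",
# so each capture ends where the next tag opens.
my @words = $html =~ m{<i>([^<]*)</i>}g;

# Limitation: a nested tag inside <i>...</i> makes the match fail outright.
my $nested_ok = '<i>a <b>b</b></i>' =~ m{<i>([^<]*)</i>} ? 1 : 0;

print "captured: @words\n";
print "nested tags match: $nested_ok\n";
```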

    Finally, using regexes to parse HTML is usually not a good idea; there are several good modules that do it much better than regexes. But if you nonetheless insist on using regexes, do it only on very limited HTML strings whose structure you know very well (certainly not on Web pages), and... try to do it right.
