#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2009
    Posts
    113
    Rep Power
    16

    Regex for <A> tag


    OK, so I'm scraping a page for a link's text.

    Here is the link:
    Code:
    <A class="Title" onclick="" href="javascript:toggleDetails('1', 1, 0, '', 0, 0, '9781551103761', '14303')" target="">Bears </A>
    I'm using REMatch to check, but I can't find a regular expression that will detect this link. The only part that will stay the same is the <A class="Title", so that is what I need to detect, and then save the whole link in my array.

    Code:
    <cfset arrTitles = REMatch(
    '[\?&]<a class="Title"',
    objGet.FileContent
    ) />
    Note that this doesn't detect the link. I've never been good with regular expressions, so if anybody can lend a helping hand, that'd be great.

    Thanks,
    DSFX
  2. #2
  3. No Profile Picture
    Moderator

    Join Date
    Jun 2002
    Location
    Raleigh, NC
    Posts
    5,286
    Rep Power
    968
  4. #3
    Originally Posted by kiteless
    I'd refer to Google for something like this: http://tinyw.in/ja21
    Thanks for the links. I wasn't sure if regex was the same in JS and CF, so I was trying to look specifically for CF regex examples, with little luck.

    I think I've got it working now:

    Code:
    <cfsavecontent variable="s">
    This is some text. It is true that <a class="test" href="http://www.cnn.com">Harry Potter</a> is a good
    magician, but the real <a href="http://www.coldfusionjedi.com">question</a> is how he would stand up
    against Godzilla. That is what I want to <a href="http://www.adobe.com">see</a> - a Harry Potter vs Godzilla
    grudge match. Harry has his wand, Godzilla has his <a href="http://www.cfsilence.com">breath</a>, it would
    be <i>so</i> cool.
    </cfsavecontent>
    
    <cfset matches = reMatch('<[aA]\s+class="test".*?>.*?</[aA]>',s)>
    This picks up only the link with the test class.
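
    Adapted to the link from my first post, the same idea might look like the sketch below. My first attempt failed for two reasons: the leading [\?&] demands a literal ? or & right before the tag, and REMatch is case-sensitive, so a lowercase <a never matches <A. The sketch assumes REMatchNoCase is available (ColdFusion 8 or later) and that class="Title" always comes directly after the tag name. The variable html and the second link are made up for illustration; in the real page the input would be objGet.FileContent.

    ```cfml
    <!--- Sample input: the link from the first post plus one made-up link that should be skipped --->
    <cfsavecontent variable="html">
    <A class="Title" onclick="" href="javascript:toggleDetails('1', 1, 0, '', 0, 0, '9781551103761', '14303')" target="">Bears </A>
    <a class="Other" href="http://www.example.com">Not wanted</a>
    </cfsavecontent>

    <!--- REMatchNoCase ignores case, so <A ...> and <a ...> are both candidates.
          [^>]* stays inside the opening tag; .*? stops at the first closing </a>. --->
    <cfset arrTitles = REMatchNoCase('<a\s+class="Title"[^>]*>.*?</a>', html)>

    <!--- Each array element holds one whole link, from the opening tag through </a> --->
    <cfdump var="#arrTitles#">
    ```

    Because the match ignores case, class="title" would be picked up as well. If that matters, plain REMatch with <[aA] at the start keeps the class name case-sensitive.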

    Thanks for pointing me in the right direction, Kite.
    Last edited by dsfx; June 27th, 2013 at 01:53 PM.
  6. #4
    As a side note, I found an awesome regex tool that's helped me immensely while dealing with this stuff.

    http://gskinner.com/RegExr/

    It's all live and has most of the regex commands handy in the sidebar, with an explanation of what they do and how they work.
  8. #5
