#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2003
    Posts
    80
    Rep Power
    11

    I've almost got this...


    I'm trying to run preg_replace on text that uses the following format:

    "Link text goes in here":http://www.mylink.com

    Which I want to be replaced with:

    <a href="http://www.mylink.com">Link text goes in here</a>

    Here's what I've got so far:

    Code:
    $text = 'The catalyst for the event was the release, on Monday, of the "Winograd Commission&rsquo;s report":http://www.vaadatwino.org.il/pdf/press%20release%20april%2030-yd-final.pdf, detailing the failures of Israel&rsquo;s leadership during the war against Hezbollah last summer.<br><br>Eli Khoury, in my opinion, is one of modern Lebanon\'s great men. He is an ideologically tireless and physically brave advocate for Lebanese independence, a champion of Lebanon\'s nascent civil society, a successful businessman, and a sophisticated analyst of local and regional politics. Michael "interviewed him for this blog":http://www.michaeltotten.com/archives/001380.html in February, and Michael and I spent some time with him in Beirut last December.<br><br>The catalyst for the event was the release, on Monday, of the "Winograd Commission&rsquo;s report":http://www.vaadatwino.org.il/pdf/press%20release%20april%2030-yd-final.pdf detailing the failures of Israel&rsquo;s leadership during the war against Hezbollah last summer.';
    
    $match = '/"([^:]+)":(http[^\s]*){1}/';
    
    $replacement = '<a href="$2">$1</a>';
    
    $newtext = preg_replace($match, $replacement, $text);
    This regex works great if the URL is followed immediately by a space -- but if a comma or period follows it, that character gets attached to the URL, and you get a 404 error when you click the link. Example:

    Blah, blah, blah "This is my sample text":http://www.mystupidlink.com/blah.html. And some even more blahing right here.

    Ends up like:

    Blah, blah, blah <a href="http://www.mystupidlink.com/blah.html.">This is my sample text</a> And some even more blahing right here.

    I'm totally stumped on how to account for the space/period/comma in the regex. Any suggestions?
  2. #2
  3. Come play with me!
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    13,742
    Rep Power
    9397
    The easiest way would be a negative lookbehind, which basically means "just before this point there could not have been ___".
    Code:
    /"([^:]+)":(http[^\s]*)(?<![!:;"',.?])/
    a) The engine will go as far as it can go and then realize "oh wait, it can't end in a period", so it'll backtrack a character.
    b) I included normal punctuation in that character set.
    c) You don't need a {1} - that means "one of the previous things" which is implied when you put that thing there in the first place.
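To see the lookbehind at work, here is a minimal sketch that runs the pattern above against the sample sentence from the first post. The only change is that the apostrophe inside the character class is escaped as \', which is needed once the pattern sits in a single-quoted PHP string. Variable names follow the original snippet.

```php
<?php
// Sample sentence from the question: the URL is followed directly by a period.
$text = 'Blah, blah, blah "This is my sample text":http://www.mystupidlink.com/blah.html. And some even more blahing right here.';

// [^\s]* first grabs everything up to the next whitespace, including the period.
// The negative lookbehind then rejects a match that ends in punctuation,
// so the engine backtracks one character and the period stays outside the link.
$match = '/"([^:]+)":(http[^\s]*)(?<![!:;"\',.?])/';
$replacement = '<a href="$2">$1</a>';

$newtext = preg_replace($match, $replacement, $text);
echo $newtext, "\n";
// Blah, blah, blah <a href="http://www.mystupidlink.com/blah.html">This is my sample text</a>. And some even more blahing right here.
```

Note that the period now lands after the closing </a> tag instead of inside the href.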
  4. #3
    Thank you -- that was a huge help!
  6. #4
    It's me again -- I ran into an issue with the negative lookbehind. If there's a string of text like this:

    The catalyst for the "Super Special" event was the release, on Monday, of the "Winograd Commission&rsquo;s report":http://www.vaadatwino.org.il/pdf/press%20release%20april%2030-yd-final.pdf, detailing the failures of Israel&rsquo;s leadership during...

    Then the link starts all the way back at the first set of quotes (in "Super Special") instead of at the quote that opens the link text. I'm totally stumped as to how to fix this. Any suggestions for what I should try?
  8. #5
    Try
    Code:
    /"([^:"]+)":(http[^\s]*)(?<![!:;"',.?])/
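A quick way to check the difference is to run both patterns over a shortened version of the problem sentence. This is only a sketch: the sample text and the example.com URL are made up for illustration, and $old / $new are hypothetical variable names. Excluding the double quote from the first group means the link text can no longer span a quote, so a match can only begin at the quote directly before the link text.

```php
<?php
// Hypothetical sample: an unrelated quoted phrase appears before the real link.
$text = 'The catalyst for the "Super Special" event was the release, on Monday, of the "Winograd report":http://www.example.com/report.pdf, detailing the failures.';

// Earlier pattern: [^:]+ may run across quote characters.
$old = '/"([^:]+)":(http[^\s]*)(?<![!:;"\',.?])/';
// Revised pattern: [^:"]+ stops at the next quote as well as at a colon.
$new = '/"([^:"]+)":(http[^\s]*)(?<![!:;"\',.?])/';

$replacement = '<a href="$2">$1</a>';

$oldResult = preg_replace($old, $replacement, $text);
$newResult = preg_replace($new, $replacement, $text);

echo $oldResult, "\n";
// The catalyst for the <a href="http://www.example.com/report.pdf">Super Special" event was the release, on Monday, of the "Winograd report</a>, detailing the failures.

echo $newResult, "\n";
// The catalyst for the "Super Special" event was the release, on Monday, of the <a href="http://www.example.com/report.pdf">Winograd report</a>, detailing the failures.
```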
