#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2007
    Location
    Sweden
    Posts
    48
    Rep Power
    8

    Replacing URLs in included site


    Hello,

    I am including a site in a PHP page using file_get_contents().
    I have XAMPP set up, so everything runs locally.
    The page I am including is located at sub.domain.dev, while the script runs from the main domain (domain.dev).
    Of course, this means none of the images in the included HTML page load, since each img src is not a full URL but a relative path such as "img/bg.gif".

    As a regex newbie I have managed to replace the img src URLs in the HTML page with the full path so that they load, using this *embarrassing* code:
    PHP Code:
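    // Fetch the HTML of the page to include
    $site_content = file_get_contents('http://sub.domain.dev/');

    // Prefix the relative image paths ("img/...") with the full URL
    $match = '/src="img/';
    $url = 'http://sub.domain.dev/';
    $replace = 'src="'.$url.'img';

    $site_content = preg_replace($match, $replace, $site_content);

    echo $site_content;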
    Though not fancy, it does work...
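A slightly more general version of the same idea, as a minimal sketch: rewrite every relative src value instead of only the ones starting with "img", and leave values that are already absolute alone. The function name absolutize_src is hypothetical, and the sketch assumes the base URL is the site root (as http://sub.domain.dev/ is here) and that attributes use double quotes.

```php
<?php
// Hypothetical helper (not from the original post): prefix relative src
// values with a base URL and leave absolute ones untouched.
// Assumes $base is the site root and that attributes are double-quoted.
function absolutize_src(string $html, string $base): string
{
    // (?<![\w-])  do not match inside names such as "data-src"
    // (?! ... )   skip values that start with a scheme ("http:", "data:")
    //             or with a protocol-relative "//"
    $pattern = '~(?<![\w-])src="(?![a-z][a-z0-9+.-]*:|//)([^"]+)"~i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            // Join base and path with exactly one slash between them.
            return 'src="' . rtrim($base, '/') . '/' . ltrim($m[1], '/') . '"';
        },
        $html
    );
}
```

It would be called as absolutize_src($site_content, 'http://sub.domain.dev/') in place of the preg_replace line.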

    However, the CSS won't load because its path is wrong too, and even if I replaced it with the full path, the image URLs inside the CSS would still point to the wrong location.

    So I was thinking that if I fetched the CSS file content the same way as the HTML, with file_get_contents(), ran a preg_replace on all the URLs and pasted the result directly into $site_content, it would work.

    So I see it like this:

    1. Get the CSS file contents with file_get_contents()
    2. Locate the CSS link tag in the HTML content variable and remove it from the string (preg_replace?)
    3. Replace all URLs in the CSS content with the full http://sub.domain.dev/ URL using preg_replace
    4. Insert the CSS content into the head of the HTML content variable.
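The steps above can be sketched roughly as follows. The function names are hypothetical, and the sketch assumes the HTML and CSS have already been fetched into strings (step 1), that style.css sits at the site root, and that the link tag uses rel="stylesheet" with double quotes. It uses preg_replace_callback rather than preg_replace for the final swap, because preg_replace treats sequences such as $1 or \2 in the replacement string as backreferences, and CSS can legitimately contain backslash escapes.

```php
<?php
// Hypothetical helpers; only the URLs come from the post.

// Step 3: make relative url(...) values in the CSS absolute.
function absolutize_css_urls(string $css, string $base): string
{
    // Optional quote, then a value that is not already absolute
    // (no scheme such as "http:" or "data:", no leading "//").
    $pattern = '~url\(\s*([\'"]?)(?![a-z][a-z0-9+.-]*:|//)([^\'")]+)\1\s*\)~i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            $full = rtrim($base, '/') . '/' . ltrim($m[2], '/');
            return 'url(' . $m[1] . $full . $m[1] . ')';
        },
        $css
    );
}

// Steps 2 and 4: swap the stylesheet link tag for an inline style block.
function inline_stylesheet(string $html, string $css, string $base): string
{
    $style = '<style type="text/css">'
           . absolutize_css_urls($css, $base)
           . '</style>';

    // The callback's return value is inserted literally, so "$1" or "\2"
    // inside the CSS is not interpreted as a backreference.
    return preg_replace_callback(
        '~<link\b[^>]*\brel="stylesheet"[^>]*>~i',
        function (array $m) use ($style): string {
            return $style;
        },
        $html,
        1 // replace only the first stylesheet link
    );
}
```

With the two strings fetched via file_get_contents(), the call would be inline_stylesheet($site_content, $css_content, 'http://sub.domain.dev/').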

    Would anyone be willing to help me with the regex patterns for this? I find them difficult...
    General help and pointers are also greatly appreciated; perhaps there's a better way of doing this.

    Cheers,
    Marcus.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2007
    Location
    Sweden
    Posts
    48
    Rep Power
    8
    OK, so I wrote this and got it working.
    Ideas on how to optimize it are of course welcome, especially the regex!

    Cheers

    PHP Code:
    $URL = 'http://sub.domain.dev/'; // domain URL
    $site_content = file_get_contents('http://sub.domain.dev/'); // get page HTML
    $css_content = file_get_contents('http://sub.domain.dev/style.css'); // get CSS page

    // Convert URLs in HTML to full path
    $htmlImgURLPattern = '/img src="/';
    $htmlImgURLReplace = 'img src="'.$URL;

    $html_replaced = preg_replace($htmlImgURLPattern, $htmlImgURLReplace, $site_content);

    // Convert URLs in CSS to full path
    $cssImgURLPattern = "/url\('/";
    $cssImgURLReplace = "url('$URL";

    $css_replaced = preg_replace($cssImgURLPattern, $cssImgURLReplace, $css_content);

    // Replace style link with style tag and insert converted CSS
    // ([^>]* keeps the match inside the single link tag)
    $linkTagPattern = '/<link\s[^>]*rel="stylesheet"[^>]*>/';
    $linkTagReplace = '<style type="text/css">'.$css_replaced.'</style>';
    $resultPage = preg_replace($linkTagPattern, $linkTagReplace, $html_replaced);

    echo $resultPage;
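A possible alternative to rewriting every path, offered as a sketch and not as a tested fix for this setup: insert a base element into the head of the fetched page. Browsers resolve relative URLs in a document against its base href, and url() values in an external stylesheet are resolved relative to the stylesheet's own URL, so the link tag and the images referenced from style.css should then load from sub.domain.dev without any further replacement. The function name add_base_href is hypothetical, and the sketch assumes the page has a head tag and no base element of its own.

```php
<?php
// Hypothetical helper: insert <base href="..."> right after the opening
// <head> tag so relative URLs resolve against the original site.
function add_base_href(string $html, string $base): string
{
    $tag = '<base href="' . htmlspecialchars($base, ENT_QUOTES) . '">';

    return preg_replace_callback(
        '~<head\b[^>]*>~i', // \b stops this from matching <header>
        function (array $m) use ($tag): string {
            return $m[0] . $tag;
        },
        $html,
        1 // only the first <head>
    );
}
```

It would be used as echo add_base_href($site_content, 'http://sub.domain.dev/');. One trade-off is that the base element also changes where relative links and form actions in the included page point.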
