August 21st, 2009, 01:45 PM
Replacing URLs in an included site
I am working on including a site with PHP using file_get_contents().
I have XAMPP set up, so I run everything locally.
The page that I have included is located at sub.domain.dev, while I am running the script from the main domain (domain.dev).
Of course, doing this causes all images in the included HTML page to fail to load, since the img src is not a full URL but a relative path such as "img/bg.gif".
As a regex newbie, I have managed to replace the img src URLs in the HTML page with the full path so they load, using this *embarrassing* code...
Though not fancy, it does work...
$site_content = file_get_contents('http://sub.domain.dev/');
$match = '#src="img/#'; // relative image folder as it appears in the page; # delimiters so the slash needs no escaping
$url = 'http://sub.domain.dev/';
$replace = 'src="'.$url.'img/';
$site_content = preg_replace($match,$replace,$site_content);
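The replacement above only handles one hard-coded folder. A slightly more general sketch is below. It assumes PHP 7+, that $base is the site root with a trailing slash, and that every src attribute uses double quotes. The function name absolutize_src is made up for illustration. It leaves alone any src that already starts with a scheme (http:, data:, ...) or with "//".

```php
<?php
// Hypothetical helper (not from the original post): prefix relative
// src="..." values with $base, skipping values that are already absolute.
function absolutize_src(string $html, string $base): string
{
    // src=" followed by a value that does NOT start with "scheme:" or "//".
    $pattern = '#\bsrc="(?![a-z][a-z0-9+.-]*:|//)([^"]*)"#i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($base) {
            // Join base and path with exactly one slash between them.
            return 'src="' . rtrim($base, '/') . '/' . ltrim($m[1], '/') . '"';
        },
        $html
    );
}

$html = '<img src="img/bg.gif"><img src="http://x.dev/a.png">';
echo absolutize_src($html, 'http://sub.domain.dev/');
```

Note that this also rewrites script src attributes, which is probably what you want when embedding a whole page, and that root-relative paths like "/img/bg.gif" are resolved against $base as well.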
However, the CSS won't load because of the wrong path, and even if I replaced it with the full path, the image URLs in the CSS would be pointing to the wrong source.
So I was thinking that if I could get the CSS file content the same way as I did the HTML, with file_get_contents(), run a preg_replace on all URLs, and paste it directly into $site_content, it would work.
So I see it like this...
1. Get the CSS file contents with file_get_contents()
2. Locate the CSS link tag in the HTML content variable and remove it from the string (preg_replace?)
3. Replace all URLs in the CSS content with the full http://sub.domain.dev/ path using preg_replace
4. Insert the CSS content into the head of the HTML content variable.
Would anyone be willing to help me with the regex patterns for this? I find them difficult...
General help and pointers are also greatly appreciated; perhaps there's a better way of doing this.
August 23rd, 2009, 02:09 PM
OK, so I wrote this and got it working.
Ideas on how to optimize this are of course welcome, especially the regex!
$URL = 'http://sub.domain.dev/'; // domain URL
$site_content = file_get_contents($URL); // get page HTML
$css_content = file_get_contents($URL.'style.css'); // get CSS file
// Convert URLs in HTML to full path
$htmlImgURLPattern = '/img src="/';
$htmlImgURLReplace = 'img src="'.$URL;
$html_replaced = preg_replace($htmlImgURLPattern,$htmlImgURLReplace,$site_content);
// Convert URLs in CSS to full path
$cssImgURLPattern = "/url\('/";
$cssImgURLReplace = "url('$URL";
$css_replaced = preg_replace($cssImgURLPattern,$cssImgURLReplace,$css_content);
// Replace style link with style tag and insert converted CSS
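// [^>]* keeps the match inside a single <link> tag (a greedy .* can run on to the last > on the line)
$linkTagPattern = '/<link\s[^>]*rel="stylesheet"[^>]*>/';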
$linkTagReplace = '<style type="text/css">'.$css_replaced.'</style>';
$resultPage = preg_replace($linkTagPattern,$linkTagReplace,$html_replaced);
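Two things in the code above can bite later: the CSS pattern only matches url('...') with single quotes, and preg_replace treats sequences like $1 or \20 in the replacement string as backreferences, so CSS escapes such as content:"\201C" can get mangled when the stylesheet is pasted in. Here is a rough sketch that tries to cover both. It assumes PHP 7+ and that $base is the site root. The helper names absolutize_css_urls and inline_stylesheet are made up for illustration.

```php
<?php
// Hypothetical helper: make relative url(...) references in CSS absolute.
// Accepts url(x), url('x') and url("x"); skips "scheme:" and "//" values.
function absolutize_css_urls(string $css, string $base): string
{
    $pattern = '#url\(\s*([\'"]?)(?![a-z][a-z0-9+.-]*:|//)([^\'")]+)\1\s*\)#i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($base) {
            // $m[1] is the quote character (possibly empty), $m[2] the path.
            $abs = rtrim($base, '/') . '/' . ltrim($m[2], '/');
            return 'url(' . $m[1] . $abs . $m[1] . ')';
        },
        $css
    );
}

// Hypothetical helper: swap the first stylesheet <link> for an inline <style>.
function inline_stylesheet(string $html, string $css): string
{
    $pattern = '#<link\b[^>]*rel="stylesheet"[^>]*>#i';

    // The callback's return value is inserted verbatim, so "$1" or "\20"
    // inside the CSS is not interpreted as a backreference.
    return preg_replace_callback(
        $pattern,
        function (array $m) use ($css) {
            return '<style type="text/css">' . $css . '</style>';
        },
        $html,
        1 // only replace the first matching tag
    );
}

$css  = absolutize_css_urls("body{background:url('img/bg.gif')}", 'http://sub.domain.dev/');
$page = inline_stylesheet('<head><link rel="stylesheet" href="style.css"></head>', $css);
echo $page;
```

Like the original, this only looks for rel="stylesheet" written with double quotes, and it resolves every path against the site root, so a stylesheet living in a subfolder would need its own base URL.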