#1
"included" html
If you include an HTML page from another site (for example, a list of something) in an HTML page on your own site, and the method (PHP cURL) leaves the included page's head, title, etc. tags in the middle of your page, does that hurt anything as far as the search engines go?
#2
It's not the best thing to do. Apart from breaking HTML standards by having two <head> and <body> sections, you also run the risk of the search engines seeing this in your code and realising that you're scraping it from somewhere else.
If you're using cURL in PHP, it's pretty easy to write a small wrapper function that strips off everything up to and including the opening <body> tag, and everything from the closing </body> tag onwards. That way you'll be sure to have no issues with it.
#3
Quote:
Can you give me a hint where to start? My code is:
PHP Code:
#4
It's easy. Once you've got the returned value from your cURL call, just use the PHP string functions (strpos() and substr(), for example) to find the closing ">" of the opening <body> tag, and remove everything up to and including it. Then find the position of the closing "</body" tag and strip everything from there onwards. It's 5-6 lines of code, and not too hard. Should be a good exercise for you.
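For anyone who wants to check their attempt, here is a minimal sketch of the steps described above. The function name extract_body_contents() is made up for illustration and is not from the thread. It assumes the page has at most one <body> tag and that no attribute value inside that tag contains a ">" character.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread):
// returns only the markup between the opening <body ...> tag and </body>.
function extract_body_contents(string $html): string
{
    // Locate the start of the opening body tag, ignoring case.
    $open = stripos($html, '<body');
    if ($open === false) {
        // No body tag at all: assume the input is already a fragment.
        return $html;
    }

    // Locate the ">" that finishes the opening tag.
    $openEnd = strpos($html, '>', $open);
    if ($openEnd === false) {
        return $html;
    }
    $start = $openEnd + 1;

    // Locate the closing tag, searching only after the opening tag.
    $close = stripos($html, '</body', $start);
    if ($close === false) {
        // No closing tag: keep everything after the opening tag.
        return substr($html, $start);
    }

    return substr($html, $start, $close - $start);
}
```

You would call it on whatever curl_exec() gave you, for example echo extract_body_contents($page_contents);. Plain string searching is the approach suggested in this thread; for messy real-world markup a proper HTML parser such as DOMDocument may be more robust.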
Also, you will want to use this to get the page returned to the script.
Code:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$page_contents = curl_exec($ch);
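To show where those two lines fit, here is a rough sketch of a complete fetch wrapper. The function name fetch_page() and the 10-second timeout are assumptions for illustration only; the original poster's own code is not shown in the thread, so adapt this to whatever you already have.

```php
<?php
// Hypothetical wrapper (name and timeout are assumptions, not from the thread):
// fetches a URL and returns the page as a string, or null on failure.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }

    // Return the response from curl_exec() instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Give up after 10 seconds rather than hanging the including page.
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $page_contents = curl_exec($ch);

    // curl_exec() returns false on failure; the handle is released
    // automatically when $ch goes out of scope.
    return $page_contents === false ? null : $page_contents;
}
```

Without CURLOPT_RETURNTRANSFER, curl_exec() writes the remote page straight to the output, so there is nothing in a variable to strip the head and body tags from.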