XML Programming
 
#1
February 3rd, 2003, 09:57 AM
poring
rss & html entities

I'm trying to syndicate my news system in RSS format, but there is a problem whenever my news content contains an HTML entity. Here is an example of the error message I get:

Code:
Reference to undefined entity 'agrave'. Error processing resource 'http://localhost/mytest/index.rss'. Line 5, Position 17 
 

  <title>Tarte &agrave; la fraise</title>
----------------^


Here is the source code:

Code:
<?xml version="1.0"?>
<rss version="0.91">
	<channel>
		<title>Tarte &agrave; la fraise</title>
		<link>http://localhost/mytest/</link>
		<description></description>
		<lastBuildDate>Mon, 03 Feb 2003 15:55:02 GMT</lastBuildDate>
		<docs>http://backend.userland.com/rss091</docs>
		<managingEditor></managingEditor>
		<language>en</language>

		<item>
			<title>This is a title!</title>
			<description>This is<br>a<br><br>body.</description>
			<link>http://localhost/mytest/index.php?action=viewcom&id=37</link>
		</item>
	</channel>
</rss>


Can anyone please help? Thanks.

#2
February 3rd, 2003, 11:05 AM
bricker42
The simplest way is to replace your entities with numeric character references, using the character's hex or decimal code: &agrave; = &#xE0; (or &#224;).

Here's a list (scroll down) http://www.bbsinc.com/iso8859.html

They render exactly the same as the HTML entities, but don't need to be defined in XML.
__________________
-james
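
For anyone generating the feed from PHP, the replacement described above could be sketched roughly as follows. This is only a minimal sketch, assuming PHP 7 or later with the mbstring extension and input that is plain ASCII or UTF-8. The function name xml_safe_text is hypothetical and not part of the original scripts. It first decodes named HTML entities, then escapes the characters XML reserves, then rewrites every non-ASCII character as a decimal character reference.

```php
<?php
// Hypothetical helper (not from the thread): make HTML-flavoured text safe
// to place inside an XML element such as <title> or <description>.
// Assumes the input is ASCII or UTF-8 and that mbstring is available.
function xml_safe_text(string $html): string
{
    // 1. Resolve named HTML entities (&agrave;, &nbsp;, ...) to UTF-8 characters.
    $text = html_entity_decode($html, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // 2. Escape the characters XML itself reserves (&, <, > and quotes).
    $text = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

    // 3. Rewrite every non-ASCII character as a decimal character reference.
    //    The conversion map covers U+0080 through U+10FFFF.
    return mb_encode_numericentity($text, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
}

echo xml_safe_text('Tarte &agrave; la fraise'), "\n";                // Tarte &#224; la fraise
echo xml_safe_text('This is<br>a<br><br>body.'), "\n";               // This is&lt;br&gt;a&lt;br&gt;&lt;br&gt;body.
echo xml_safe_text('index.php?action=viewcom&id=37'), "\n";          // index.php?action=viewcom&amp;id=37
```

The same call also deals with the bare "&" in the item link and the "<br>" tags in the description, both of which an XML parser would reject as written in the original feed.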

#3
February 4th, 2003, 09:39 PM
AlexNinjaFighte
A timely topic for me too.

My RSS-generating script (created by jpenn from the PHP forum) encountered an "&" and choked. I tried htmlentities() and htmlspecialchars() in the hope of sliding by, but no dice.

I'm assuming that we must create a mega preg_replace script like the one below, except replacing with ASCII codes?
PHP Code:
// $document should contain an HTML document.
// This will remove HTML tags, javascript sections
// and white space. It will also convert some
// common HTML entities to their text equivalent.

$search = array ("'<script[^>]*?>.*?</script>'si",  // Strip out javascript
                 "'<[\/\!]*?[^<>]*?>'si",           // Strip out html tags
                 "'([\r\n])[\s]+'",                 // Strip out white space
                 "'&(quot|#34);'i",                 // Replace html entities
                 "'&(amp|#38);'i",
                 "'&(lt|#60);'i",
                 "'&(gt|#62);'i",
                 "'&(nbsp|#160);'i",
                 "'&(iexcl|#161);'i",
                 "'&(cent|#162);'i",
                 "'&(pound|#163);'i",
                 "'&(copy|#169);'i",
                 "'&#(\d+);'e");                    // evaluate as php (the /e modifier was removed in PHP 7.0)

$replace = array ("",
                  "",
                  "\\1",
                  "\"",
                  "&",
                  "<",
                  ">",
                  " ",
                  chr(161),
                  chr(162),
                  chr(163),
                  chr(169),
                  "chr(\\1)");

$text = preg_replace ($search, $replace, $document);
as shown in the manual (http://www.php.net/manual/en/function.preg-replace.php), or is there a more direct way?

Alex
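
One possibly more direct route, sketched here under assumptions rather than taken from the thread, is to let an XML library do the escaping instead of hand-writing replacement tables. The sketch below assumes PHP 7 or later with the DOM extension enabled. add_text_element is a hypothetical helper name, and the feed is cut down to a few elements from the first post. Text is added through createTextNode(), which escapes reserved characters on output. Named HTML entities are decoded beforehand so the tree only holds plain characters.

```php
<?php
// Hypothetical helper (not from the thread): append <name>text</name> to
// $parent. Because the text goes in as a text node, DOM escapes &, < and >
// when the document is serialized.
function add_text_element(DOMDocument $doc, DOMNode $parent, string $name, string $text): DOMElement
{
    $el = $doc->createElement($name);
    $el->appendChild($doc->createTextNode($text));
    $parent->appendChild($el);
    return $el;
}

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;

$rss = $doc->createElement('rss');
$rss->setAttribute('version', '0.91');
$doc->appendChild($rss);

$channel = $doc->createElement('channel');
$rss->appendChild($channel);

// Decode named HTML entities first so the tree holds the real character.
$title = html_entity_decode('Tarte &agrave; la fraise', ENT_QUOTES | ENT_HTML401, 'UTF-8');
add_text_element($doc, $channel, 'title', $title);
add_text_element($doc, $channel, 'link', 'http://localhost/mytest/');

$item = $doc->createElement('item');
$channel->appendChild($item);
add_text_element($doc, $item, 'title', 'This is a title!');
add_text_element($doc, $item, 'description', 'This is<br>a<br><br>body.');
add_text_element($doc, $item, 'link', 'http://localhost/mytest/index.php?action=viewcom&id=37');

// Serialize the feed; the "&" in the link and the "<br>" tags come out escaped.
$xml = $doc->saveXML();
echo $xml;
```

The trade-off is that the feed is built as a tree rather than as a string template, so an existing script that prints XML by hand would need restructuring; in return, no entity table has to be maintained.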

© 2003-2008 by Developer Shed. All rights reserved.