SunQuest
           XML Programming
 
#1
August 22nd, 2003, 06:41 AM
hirez
Question encoding

> hi
> i have a stylesheet that should remove unnecessary tags
> that contain only non-breaking spaces, like '<p> </p>'
> my input file is XP-Word-HTML, which is completely invalid :)
> so i JTidy it and convert it to xhtml with UTF-8 encoding
> next i want to use my stylesheet to filter out all unnecessary tags,
> including empty elements & those with nbsps.
>
> i've tried:
>
> (1) test="normalize-space(.)" on each p node
>
> and
>
> (2) declaring nbsp as an ENTITY and then finding it between p tags
>
> test="p=nbsp"
>
> but neither works - i still get <p>&#xA0;</p> in my output-xml
>
> any ideas ?
> ez & thanks for replies
> hirez
>
> STYLESHEET
> -------------------------------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE stylesheet [
> <!ENTITY nbsp "&#160;">
> ]>
> <xsl:stylesheet
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     version="1.0">
>   <xsl:output method="xml" indent="no" encoding="UTF-8"/>
>
>   <!-- p elements: remove empty tags -->
>   <xsl:template match="p">
>     <xsl:choose>
>       <xsl:when test="not(normalize-space(.))">
>         <empty>
>           <xsl:apply-templates/>
>         </empty>
>       </xsl:when>
>       <xsl:otherwise>
>         <content>
>           <xsl:apply-templates/>
>         </content>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
> </xsl:stylesheet>
>
> -------------------------------------------------------------------
> INPUT FILE
> is an xhtml file
> (formerly winXPword-html after a JTidy)
> looks something like this if i view it in UTF-8:
>
> <p>valid text</p>
>
> <p> </p>
>
> <p>valid text</p>
> -----------------------------------
>
> BUT looks like this if i view it
> in ISO-8859-1 (latin 1)
> what's this? (an Acirc and a space?)
>
> <p>valid text</p>
>
> <p>Â </p>
>
> <p>valid text</p>
> -----------------------------------------
>
> OUTPUT FILE
> my output looks like this:
>
> <?xml version="1.0" encoding="UTF-8"?>
> ....
> <p>valid text</p>
>
> <p> </p><p> </p>
>
> <p>valid text</p>
> ....
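
A possible way to approach this, as a sketch rather than a tested fix for this exact input: XPath 1.0's normalize-space() only treats space, tab, CR and LF as whitespace, so a non-breaking space (U+00A0) survives it and the test in (1) never sees the paragraph as empty. Mapping U+00A0 to a plain space with translate() before normalizing gets around that. The "Â" in the Latin-1 view is expected: U+00A0 is stored in UTF-8 as the two bytes C2 A0, and read as ISO-8859-1 those show up as Â followed by a non-breaking space, so the file itself is fine. The sketch below makes a few assumptions that are not in the original post: it drops the empty paragraphs instead of wrapping them in <empty>/<content>, it copies everything else through with an identity template, and it binds the prefix h to the XHTML namespace in case the JTidy output declares it (in which case a plain match="p" would not match at all).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"
    version="1.0">
  <xsl:output method="xml" indent="no" encoding="UTF-8"/>

  <!-- identity template: copy every node that no other template handles -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- p elements with no child elements whose text is only whitespace
       and/or U+00A0: the empty template body drops them from the output.
       The h: prefix is an assumption (XHTML namespace); the unprefixed
       alternative covers input without a namespace. -->
  <xsl:template match="h:p[not(*)][not(normalize-space(translate(., '&#xA0;', ' ')))]
                       | p[not(*)][not(normalize-space(translate(., '&#xA0;', ' ')))]"/>
</xsl:stylesheet>
```

The [not(*)] predicate is there so that a paragraph holding only an image or another element is kept; leave it out if those should go too. The character reference &#xA0; is used instead of an nbsp entity so the stylesheet needs no DOCTYPE.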

© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 5 hosted by Hostway