Data to and from an XML document with SAX
Hello out there!
First, I should say that I am a beginner at XML with PHP. To make a long story short: my site currently uses text files for storing data (text plus all sorts of tags, such as form and a tags). I was thinking about storing all of this data in a single XML document instead of many files. I started reading about SAX in Beginning PHP4, and the examples worked just fine for me. Fine, I thought, then this can at least be used to store text data.
But what if I want to store form tags, link tags, bold tags or other sorts of tags together with text in one element? OK, you may say that a link tag inside an XML element is valid if I declare it in the DTD, but I don't have time to write a DTD that covers every tag and entity that may appear in the element. So I came up with the idea that, before I put the data in the element, I could convert < to &_lt; and & to &_amp; and so on (the underscore is intentional). Then I could put whatever data I wanted in an XML element without defining any DTD for it, and when I wanted the data back from the XML document I would just convert it back again.
I thought this would work, but it did not. I could not use the entities for < and & in the element: when I parsed the document to a page, all data after the ampersand was cut off, and I don't know why. I know you professional programmers may think this is not a very good solution, but can someone please help me build this system in the easiest way? Can you describe how YOU would solve this problem?
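A minimal sketch of the escaping step described above, assuming PHP's built-in htmlspecialchars() does the conversion. The helper name wrap_as_text and the element name content are made up for illustration.

```php
<?php
// Hypothetical helper: store arbitrary markup as plain text inside one element.
function wrap_as_text($elementName, $markup) {
    // htmlspecialchars() converts & < > " into &amp; &lt; &gt; &quot;
    // (ENT_QUOTES additionally escapes single quotes).
    $escaped = htmlspecialchars($markup, ENT_QUOTES, 'UTF-8');
    return "<$elementName>$escaped</$elementName>";
}

$html = '<a href="page.php?a=1&b=2">A <b>bold</b> link</a>';
$xml  = wrap_as_text('content', $html);
echo $xml, "\n";
```

No manual "convert back" step should be needed when reading: an XML parser turns the predefined entities back into the original characters before handing the text to your code.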
Maybe &_lt; needs to be &_amp;lt; ?
It does not work anyway. As soon as there is an ampersand (&), everything gets messed up in the PHP SAX parser, and I don't know why this is happening. According to the books I have read, you can put something like &_amp; in an element if the content of the element is declared as #PCDATA. I will try to validate my document to see if there is anything else I have missed...
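One likely cause of the "cut off" text (an assumption, since the handler code is not shown in the thread): the SAX parser may call the character-data handler several times for a single element, often splitting at entities, so a handler that assigns the data instead of appending it keeps only one piece. The sketch below appends and stores the result when the element ends. The element name content and the variable names are made up, and the closures assume PHP 5.3 or later.

```php
<?php
$buffer = '';        // text collected for the element currently being read
$items  = array();   // finished text of every <content> element

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    // Start tag: element names arrive upper-cased (case folding is on by default).
    function ($p, $name, $attrs) use (&$buffer) {
        if ($name === 'CONTENT') {
            $buffer = '';
        }
    },
    // End tag: the text is complete only now.
    function ($p, $name) use (&$buffer, &$items) {
        if ($name === 'CONTENT') {
            $items[] = $buffer;
        }
    }
);
xml_set_character_data_handler($parser, function ($p, $data) use (&$buffer) {
    $buffer .= $data;   // append, never assign: the text may arrive in pieces
});

$xml = '<doc><content>&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;</content></doc>';
if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}

echo $items[0], "\n";   // the entities come back already decoded
```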
...to be continued
I have had exactly the same problem with Perl.
I think the problem stems from the local encoding settings, character sets and things I don't understand!
My solution was to replace every "non-word" character with:
"MySpecialString - the ascii character number - MySpecialString"
In Perl this is easy using regular expressions: \w matches "word characters" and ord($char) gives the ASCII number.
I then swap it all back at the other end.
So the xml text that represents a space, for instance, is
I am sure PHP will have very similar functions.
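A rough PHP equivalent of the Perl scheme described above might look like this. It is only a sketch: the function names are made up, the marker is assumed never to occur in the real data, and the closures assume PHP 5.3 or later.

```php
<?php
// Hypothetical marker; it must never appear in the data being stored.
define('MARK', 'MySpecialString');

// Replace every non-word byte with MARK + its byte value + MARK.
function encode_nonword($text) {
    return preg_replace_callback('/\W/', function ($m) {
        return MARK . ord($m[0]) . MARK;
    }, $text);
}

// Reverse step: turn MARK + number + MARK back into the original byte.
function decode_nonword($text) {
    return preg_replace_callback('/' . MARK . '(\d+)' . MARK . '/', function ($m) {
        return chr((int) $m[1]);
    }, $text);
}

$original = '<a href="x.php?a=1&b=2">Tom & Jerry</a>';
$stored   = encode_nonword($original);
$restored = decode_nonword($stored);
var_dump($restored === $original);
```

The encoded form contains only letters, digits and underscores, so nothing in it needs XML escaping, at the cost of a much longer and unreadable document.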
June 13th, 2002, 01:39 AM
Yes, I have heard about some other people who had the same problem too. I solved it by replacing all ampersands with something like what you described, and then converting it back again. This works just fine, but I don't think it's a "good-looking solution". But if it has to be this way, then so be it.
Another thing the parser did not like was the newline character, "\n" or Windows "\r\n"; I don't know which one. All I know is that as soon as I hit Enter while typing in an element, all data after that would be cut off. I solved this by replacing all "\n" and "\r" with "" (nothing). Then it all works just fine, but it's still not a "good-looking solution".
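A small diagnostic sketch for the line-break symptom, under the assumption that the text is not really lost but delivered to the character-data handler in more than one call. It records every piece as it arrives and joins them afterwards; the element name note is made up, and the closure assumes PHP 5.3 or later.

```php
<?php
$chunks = array();

$parser = xml_parser_create();
xml_set_character_data_handler($parser, function ($p, $data) use (&$chunks) {
    $chunks[] = $data;   // record each piece exactly as the parser delivers it
});
xml_parse($parser, "<note>first line\r\nsecond line</note>", true);

print_r($chunks);               // the number of pieces depends on the parser build
$text = implode('', $chunks);   // joined together, the second line is still there
echo $text, "\n";
```

If the pieces are joined like this, stripping "\n" and "\r" from the data beforehand should not be necessary. XML parsers are expected to report a Windows "\r\n" as a single "\n".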
I would like to validate my XML document. Does anyone know where to get a good validator? I have tried XML for Java by IBM, but I did not get it to work; I don't really know how to install it properly...
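Short of installing a separate validator, PHP's own parser can at least report whether a document is well-formed and where it first breaks. This is a sketch with a made-up function name, and it checks well-formedness only, not validity against a DTD.

```php
<?php
// Hypothetical helper: returns null if $xml is well-formed,
// otherwise a message with the position of the first error.
function xml_wellformed_error($xml) {
    $parser = xml_parser_create();
    if (xml_parse($parser, $xml, true)) {
        return null;
    }
    return sprintf(
        '%s at line %d, column %d',
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser),
        xml_get_current_column_number($parser)
    );
}

var_dump(xml_wellformed_error('<doc>Tom &amp; Jerry</doc>'));
echo xml_wellformed_error('<doc><b>unclosed</doc>'), "\n";
```

Running the stored document through a check like this before displaying it should show quickly whether a stray & or < is the real culprit.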