#1

    Data to and from an XML document with SAX


    Hello out there!

    First, I should say that I am a beginner at XML with PHP. To make a long story short: my site currently uses text files to store data (text plus all sorts of tags, such as form and a tags). I was thinking about storing all of this data in a single XML document instead of using many files. I started reading about SAX in Beginning PHP4, and the examples worked fine for me. Good, I thought, then this can at least be used to store text data.

    But what if I want to store form tags, link tags, bold tags or other tags together with text in one element? You may say that a link tag inside an XML element is valid if I define it in the DTD, but I don't have time to write a DTD that covers every tag and entity that might appear in the element. So I came up with the idea that, before putting the data in the element, I could convert < to &_lt; and & to &_amp; and so on (the underscore is intentional). Then I could put whatever data I wanted in an XML element without defining a DTD for it, and when I wanted the data back from the XML document I would just convert it back again.

    I thought this would work, but it did not. I could not use the entities for < and & in the element: when I parsed the document into a page, all data after the ampersand was cut off, and I don't know why. I know you professional programmers may think this is not a good way to solve it, but can someone please help me develop this system in the easiest way? Can you describe how YOU would solve this problem?

    Best regards

    Andreas
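
A minimal sketch of the escape-and-store idea described in the post above, written in Python rather than PHP because its standard library offers the same kind of XML tooling. It assumes the standard entity references (`&lt;`, `&amp;`) without the underscore the poster inserted for display. The `<item>` element and the `store`/`load` helper names are hypothetical. An XML parser decodes these references on its own, so a manual conversion back should not be needed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def store(fragment):
    """Wrap an HTML fragment in a hypothetical <item> element."""
    # escape() replaces &, < and > with &amp;, &lt; and &gt;.
    return "<item>" + escape(fragment) + "</item>"


def load(document):
    """Parse the document and return the element text, already decoded."""
    return ET.fromstring(document).text or ""


fragment = '<a href="page.php?id=1&lang=sv"><b>Read more</b></a>'
document = store(fragment)
restored = load(document)
print(document)
print(restored)
```

The stored document contains no raw markup from the fragment, so no DTD has to describe the tags inside it.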
#2
    Maybe &_lt; needs to be &_amp;lt; ?
#3

    It does not work anyway. As soon as there is an ampersand (&_amp;), everything gets messed up in the PHP SAX parser, and I don't know why this is happening. According to the books I have read, you can put something like &_amp; in an element if the content of the element is declared as #PCDATA. I will try to validate my document to see if there is anything else I have missed...

    ...to be continued
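
One possible explanation, not confirmed anywhere in this thread: SAX-style parsers may deliver the text of a single element in several pieces, for example splitting around an entity reference, so a handler that overwrites its buffer on every call keeps only part of the text. The sketch below shows the append-and-join pattern in Python's `xml.sax`. The `<data>`/`<item>` document and the `ItemHandler` class are hypothetical. In PHP the corresponding callback would be the one registered with `xml_set_character_data_handler`.

```python
import xml.sax


class ItemHandler(xml.sax.ContentHandler):
    """Gathers the complete text of each hypothetical <item> element."""

    def __init__(self):
        super().__init__()
        self.chunks = []  # pieces of the element currently being read
        self.items = []   # finished element texts

    def startElement(self, name, attrs):
        if name == "item":
            self.chunks = []

    def characters(self, content):
        # Append instead of assigning: the parser may call this method
        # several times for one element.
        self.chunks.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self.chunks))


document = b"<data><item>Tom &amp; Jerry &lt;b&gt;live&lt;/b&gt;</item></data>"
handler = ItemHandler()
xml.sax.parseString(document, handler)
items = handler.items
print(items)
```

Joining the pieces only when the closing tag arrives gives the full text no matter how the parser chose to split it.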
#4
    I have had exactly the same problem with Perl.
    I think the problem stems from the local encoding settings, character sets and things I don't understand!
    My solution was to replace every "non-word" character with:
    "MySpecialString - the ASCII character number - MySpecialString"
    In Perl this is easy using regular expressions: (\w) matches "word characters" and ord($char) gives the ASCII number.
    I then swap it all back at the other end.
    So the XML text that represents a space, for instance, is
    "MySpecialString32MySpecialString".
    I am sure PHP has very similar functions.
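
The placeholder scheme described above can be sketched roughly as follows, here in Python rather than Perl or PHP. The `encode` and `decode` names are hypothetical, and `\W` is used to match the non-word characters directly. One known weakness of this sketch: text that already contains the marker followed by digits and the marker again would be decoded incorrectly.

```python
import re

MARKER = "MySpecialString"


def encode(text):
    """Replace every non-word character with MARKER + code point + MARKER."""
    return re.sub(r"\W", lambda m: f"{MARKER}{ord(m.group())}{MARKER}", text)


def decode(text):
    """Turn each MARKER + number + MARKER sequence back into its character."""
    pattern = re.escape(MARKER) + r"(\d+)" + re.escape(MARKER)
    return re.sub(pattern, lambda m: chr(int(m.group(1))), text)


original = "Tom & Jerry <b>live</b>\n"
encoded = encode(original)
decoded = decode(encoded)
print(encoded)
```

The encoded string contains only letters, digits and underscores, so nothing in it needs escaping inside an XML element.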
#5
    Yes, I have heard about other people who had the same problem too. I solved it by replacing all ampersands with something like what you described and then converting them back again. This works just fine, but I don't think it is a "good-looking solution". Still, if it has to be this way, then so be it.

    Another thing the parser did not like was the newline character, "\n" or the Windows "\r\n"; I don't know which of them. All I know is that as soon as I hit Enter while typing in an element, all data after that point is cut off. I solved this by replacing all "\n" and "\r" with "" (nothing). Then it all works just fine, but it is still not a "good-looking solution".

    I would like to validate my XML document. Does anyone know where to get a good validator? I have tried XML for Java by IBM, but I could not get it to work; I don't really know how to install it properly...
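
Short of full validation, a quick well-formedness check can already catch problems such as an unescaped ampersand. The sketch below uses Python's standard `xml.sax` parser, which is non-validating: it reports syntax errors but does not check a document against a DTD. The `check_well_formed` helper and the sample documents are hypothetical.

```python
import xml.sax


def check_well_formed(document):
    """Return None if the bytes parse cleanly, else a short error message."""
    try:
        # A bare ContentHandler ignores all content; only errors matter here.
        xml.sax.parseString(document, xml.sax.ContentHandler())
    except xml.sax.SAXParseException as error:
        return f"line {error.getLineNumber()}: {error.getMessage()}"
    return None


good = b"<data><item>Tom &amp; Jerry</item></data>"
bad = b"<data><item>Tom & Jerry</item></data>"
good_result = check_well_formed(good)
bad_result = check_well_formed(bad)
print(good_result)
print(bad_result)
```

A document that passes this check can still be invalid against its DTD, so a validating parser would remain the tool for that second step.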
