#1
  1. No Profile Picture
    Python/RDF Freak
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2003
    Posts
    14
    Rep Power
    0

    parsing xml with strange characters


    Hello,

    I am trying to parse an XML file. The file contains the following entry:
    <option id='2'>occupé</option>

    I parse it with the following code:

    Code:
    from xml.dom import minidom 
    xmldoc = minidom.parse(file)
    Here, file is the path to the XML file that contains the entry.

    I tried changing the encoding of my XML file. With some encodings I get a parse error; with others there is no error, but the entry is not shown.

    Is there a way to solve this in Python, or do I have to change something in my XML file (in which case the question belongs in the XML forum)?

    thanks anyway,
    greetings,
    Johie
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2003
    Location
    Norway
    Posts
    41
    Rep Power
    12
    There is a way to specify the encoding of Python source files, see
    http://www.python.org/peps/pep-0263.html
    but I'm not sure it solves your problem. I had a similar problem that was not solved by this or by changing the encoding of the XML file; I still got an error from a function in the expat library (the parser underlying the Python XML functions). You can use a numeric character reference instead, &#233; in your case. The named entity &eacute; only works if the document's DTD declares it.
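    To illustrate the entity suggestion, here is a minimal sketch, assuming Python 3 and the standard-library minidom and expat modules. It parses the entry with the numeric character reference &#233; and, for comparison, tries the named form &eacute; without a DTD, which a plain XML parser is expected to reject as an undefined entity. The variable names are only for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Numeric character reference: valid in any XML document, no DTD needed.
doc = minidom.parseString("<option id='2'>occup&#233;</option>")
text = doc.documentElement.firstChild.data  # Unicode string with the accented character

# Named HTML entity without a DTD: expat should report an undefined entity.
try:
    minidom.parseString("<option id='2'>occup&eacute;</option>")
    named_entity_ok = True
except ExpatError:
    named_entity_ok = False

print(repr(text), named_entity_ok)
```

    The numeric form sidesteps the file encoding question entirely, because the document then contains only ASCII bytes.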
  4. #3
  5. No Profile Picture
    Python/RDF Freak
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2003
    Posts
    14
    Rep Power
    0
    Hello,

    Thanks, but it's already solved now; someone at my company helped me out.

    It was all a matter of decoding and encoding. I solved it by using UTF-8 encoding.
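    The exact fix is not shown in this thread, so the following is only a sketch of one way a UTF-8 approach can look, assuming Python 3. The file is written as UTF-8 bytes with a matching encoding declaration, parsed with minidom, and the resulting text is encoded back to UTF-8 only where bytes are needed. The temporary file and the names xml_text, value and encoded are hypothetical.

```python
import os
import tempfile
from xml.dom import minidom

# "\u00e9" is the accented e from the original entry.
xml_text = (
    "<?xml version='1.0' encoding='utf-8'?>\n"
    "<options><option id='2'>occup\u00e9</option></options>"
)

# Write the bytes in the same encoding that the XML declaration names.
fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "wb") as handle:
    handle.write(xml_text.encode("utf-8"))

try:
    xmldoc = minidom.parse(path)
    option = xmldoc.getElementsByTagName("option")[0]
    value = option.firstChild.data    # decoded Unicode string
    encoded = value.encode("utf-8")   # bytes, for output that requires them
finally:
    os.remove(path)

print(repr(value), repr(encoded))
```

    The key point in this sketch is that the bytes on disk and the encoding named in the declaration agree; a mismatch between the two is a common cause of parse errors or garbled text.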

    Have a nice weekend,
    Johie
