I've got some code that uses HTMLParser. I prefer this Python version:
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
[GCC 4.7.2] on linux2
because it parses the HTML cleanly.
These Python versions raise an error:
html.parser.HTMLParseError: malformed start tag, at line 1993, column 565
2.6.6 (r266:84292, Apr 3 2012, 13:01:54)
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)]
and
Python 3.2.3 (default, Oct 19 2012, 19:53:16)
[GCC 4.7.2] on linux2
HTML source.
I suppose the answer is "install Python 2.7 on the Red Hat system, dumbhead". The code also works with Python 3 when the parser is initialised with the strict=False option: HTMLParser.__init__(self, strict=False)
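For anyone hitting the same thing, here is a minimal sketch of that workaround that should run on several interpreters. The class name LinkCollector and the link-collecting handler are hypothetical, made up for illustration. The strict keyword only exists on Python 3.2 to 3.4 (it was removed in 3.5, where the parser is always tolerant), so the sketch falls back to the plain constructor when the keyword is rejected.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2


class LinkCollector(HTMLParser):
    """Hypothetical example parser: collects href values from <a> tags."""

    def __init__(self):
        try:
            # Python 3.2 to 3.4: ask for the tolerant (non-strict) parser.
            HTMLParser.__init__(self, strict=False)
        except TypeError:
            # Python 2.x and Python 3.5+ do not accept the strict keyword.
            HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


if __name__ == "__main__":
    parser = LinkCollector()
    # The <p> is never closed; the parser still reports both links.
    parser.feed('<p><a href="a.html">A</a><a href="b.html">B</a>')
    parser.close()
    print(parser.links)
```

On Python 2.6 the fallback only avoids the TypeError from the unknown keyword; it does not make that version's parser any more tolerant of the malformed start tag.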
Thanks, Dave.