How should I deal with an XMLSyntaxError in Python's lxml while parsing a large XML file?

The right thing to do here is make sure that the creator of the XML file makes sure that: A.) that the encoding of the file is declared B.) that the XML file is well formed (no invalid characters control characters, no invalid characters that are not falling into the encoding scheme, all elements are properly closed etc.) C.) use a DTD or an XML schema if you want to ensure that certain attributes/elements exist, have certain values or correspond to a certain format (note: this will take a performance hit).

The codecs Python module suply an EncodedFile class that works as a wrapper to a file - you should pass an object of this class to lxml, set to replace unknown characters with XML char entities.

I cant really gove you an answer,but what I can give you is a way to a solution, that is you have to find the anglde that you relate to or peaks your interest. A good paper is one that people get drawn into because it reaches them ln some way.As for me WW11 to me, I think of the holocaust and the effect it had on the survivors, their families and those who stood by and did nothing until it was too late.

Related Questions