What is an XML parser? Using Expat?

It took a while to wrap my head around XML parsing (though I do it in Perl, not C). Basically, you register callback functions. The parser will ping your callback for each node and pass in a data structure containing all kinds of juicy bits (like plaintext, any attributes, child nodes, etc.).

You have to maintain some kind of state information, like a hash tree you plug stuff into, or a string that contains all the guts but none of the XML. Just remember that XML is not linear and it doesn't make much sense to parse it like a long hunk of text. Instead, you parse it like a tree. Good luck.
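To make the "maintain state" part concrete, here is a minimal sketch using Python's standard-library binding to Expat (xml.parsers.expat), chosen only because it runs without linking anything. The function name parse_to_tree and the dict layout are my own illustration, not part of any library. With raw Expat handlers you only get the tag name and attributes; children show up as later events, so a stack is one simple way to remember where you are in the tree.

```python
import xml.parsers.expat


def parse_to_tree(xml_text):
    """Build a nested dict tree from XML using Expat callbacks (illustrative layout)."""
    # Sentinel parent so the real root element has somewhere to attach.
    holder = {"name": None, "attrs": {}, "text": "", "children": []}
    stack = [holder]  # the state: path from the top down to the current node

    def start(name, attrs):
        node = {"name": name, "attrs": dict(attrs), "text": "", "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    def chars(data):
        # May be called several times for one run of text, so accumulate.
        stack[-1]["text"] += data

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)  # True: this is the whole document
    return holder["children"][0]


tree = parse_to_tree('<book id="1"><title>Expat</title><year>1999</year></book>')
print(tree["name"], tree["attrs"], [c["name"] for c in tree["children"]])
```

The same idea carries over to Perl or C: the handlers stay tiny and all the real work is in whatever structure you keep between calls.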

Expat is an event-driven parser. You write code to deal with tags, attributes, etc., and then register that code with the parser. There is an article here which describes how to do this.
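As a rough illustration of what "event-driven" means in practice, this sketch registers three handlers and records the order in which Expat calls them. It uses Python's standard xml.parsers.expat module; the handler names (on_start, on_end, on_text) and the sample document are made up for the example. In the C API the equivalent registration is done with XML_SetElementHandler and XML_SetCharacterDataHandler.

```python
import xml.parsers.expat

events = []


def on_start(name, attrs):
    # attrs arrives as a dict of attribute name -> value
    events.append(("start", name, dict(attrs)))


def on_end(name):
    events.append(("end", name))


def on_text(data):
    if data.strip():  # skip whitespace-only runs between tags
        events.append(("text", data))


parser = xml.parsers.expat.ParserCreate()
parser.buffer_text = True  # deliver contiguous text in one callback where possible
parser.StartElementHandler = on_start
parser.EndElementHandler = on_end
parser.CharacterDataHandler = on_text
parser.Parse('<msg to="bob"><body>hi</body></msg>', True)

for event in events:
    print(event)
```

Any handler you don't register is simply never called, so you only write code for the parts of the document you care about.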

Regarding reading from a socket: depending on your platform, you may be able to treat the socket like a file handle. Otherwise, you need to do your own reading from the socket and then pass the data to Expat explicitly. There is an API to do this.

However, I'd try to get it working with ordinary files first.
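Here is a small sketch of the "do your own reading and pass the data in" approach, again with Python's xml.parsers.expat. The helper parse_stream and its chunk_size parameter are hypothetical names for this example. It takes any read function that returns an empty bytes object at end of input, so it works with a file opened in binary mode and, as an assumption I have not tested here, should also work with a socket's recv method. In C the corresponding call is XML_Parse, which takes a buffer, its length, and a final flag.

```python
import io
import xml.parsers.expat


def parse_stream(read_chunk, chunk_size=16):
    """Feed Expat from any read function that returns b'' at end of input."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    while True:
        chunk = read_chunk(chunk_size)
        if not chunk:
            parser.Parse(b"", True)  # final call: tells Expat the document is complete
            break
        parser.Parse(chunk, False)  # more data may follow
    return names


# An in-memory file stands in for the real file or socket.
source = io.BytesIO(b"<log><entry level='info'/><entry level='warn'/></log>")
print(parse_stream(source.read))
```

The chunk size is deliberately tiny so that tags get split across reads; Expat keeps the incomplete piece and picks up where it left off on the next call.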

Instead of expat, you might want to have a look at libxml2, which is probably already included in your distribution. It's a lot more powerful than expat, and gives you all sorts of goodies: DOM (tree mode), SAX (streaming mode), XPath (indispensable to do anything complex with XML IMHO) and more. It's not as lightweight as expat, but it's a lot easier to use.
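For a feel of what tree mode plus path queries buys you, here is a short sketch. Note that it does not use libxml2: it uses Python's standard xml.etree.ElementTree, which only supports a limited subset of XPath, as a stand-in so the example runs anywhere. The document and element names are invented.

```python
import xml.etree.ElementTree as ET

# The whole document is parsed into a tree up front (no callbacks to write).
doc = ET.fromstring(
    "<library>"
    "<book year='1999'><title>Using Expat</title></book>"
    "<book year='2003'><title>Tree Parsing</title></book>"
    "</library>"
)

# Path expressions pick out nodes directly.
titles = [t.text for t in doc.findall("./book/title")]
from_1999 = [b.find("title").text for b in doc.findall("./book[@year='1999']")]

print(titles)
print(from_1999)
```

Compare this with the callback style: there is no state to maintain, at the cost of holding the whole document in memory.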

Well, you chose the most complicated XML parser (event-driven parsers are more difficult to handle). Why Expat and not libxml?

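Expat is a library, written in C, for parsing XML documents. It is the parser underneath Perl's XML::Parser and other open-source XML parsers and, according to the benchmark article, it's very fast.

It also sets a high standard for reliability, robustness and correctness. Its author, James Clark, also wrote Jade (a style-language implementation for SGML), XP (a Java XML parser package), and XT (a Java XSL engine), and was technical lead of the working group that produced the XML specification.

This article is based on a test version, Version 19990709. Expat is a stream-oriented parser. You register handler functions with the parser and then start feeding it the document.

Because the document is fed to the parser in pieces, you can start parsing before you have the whole document. This also lets you parse really huge documents that won't fit into memory.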
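A quick sketch of the "start parsing before you have the whole document" point, using Python's xml.parsers.expat. The document is handed over in two pieces, and the handlers have already fired for the first piece before the second one exists. The variable names and the sample feed are made up for the example.

```python
import xml.parsers.expat

seen = []
parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = lambda name, attrs: seen.append(name)

# First piece: the document is not complete yet (no closing </feed>).
parser.Parse("<feed><item/><item/>", False)
early = list(seen)  # snapshot of what the handlers saw so far

# Second piece arrives later; True marks the end of the document.
parser.Parse("<item/></feed>", True)

print("after first piece:", early)
print("after whole document:", seen)
```

Since nothing forces you to keep what the handlers receive, memory use depends on the state you choose to retain, not on the size of the document.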
