Theodore Ts'o wrote:
> The original code which was contributed to me by Andreas Dilger. It
> uses a hand-coded XML parser which is quite small (2.7k of text in a
> shared library). In contrast, libxml2.so is 610k text, 43k data, and
> 3.5k BSS, which just boggles the mind.

There is probably a lot of stuff in XML that Andreas' code doesn't
support. I have also written my own subset-of-XML parser for a specific
application, and it is quite small too. If all you want to do is handle
simple, self-contained documents that consist of tags with attributes
and PCDATA, for example:

    <tag1 att1='foo' att2='bar'>
      <tag2 att3='baz'/>
      <tag3 att4='quux'>pcdata</tag3>
    </tag1>

then it's not much work at all to tokenize that and generate either
SAX-style callback events for each element, or a DOM-like tree of
element objects.

It's when you get into all the other things that XML defines, like a
wide variety of character encodings, external entities, DTD parsing and
validation, etc., that your code base starts getting pretty big. I
admit, though, 610k of text still seems pretty huge.

Craig
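P.S. To give a feel for how little code the simple case needs, here is a rough sketch of a tokenizer for the kind of document shown above. It is not Andreas' parser or mine; the names `parse_events`, `TAG_RE` and `ATTR_RE` are made up for illustration, and it assumes no entities, comments, CDATA sections, processing instructions, or DTDs. Anything that does not look like a well-formed tag is simply treated as text.

```python
import re

# Hypothetical subset: element and attribute names, quoted attribute values.
NAME = r"[A-Za-z_][\w.-]*"
# One tag: optional leading '/', a name, raw attribute text, optional trailing '/'.
TAG_RE = re.compile(
    r"<(/?)(" + NAME + r")((?:\s+" + NAME
    + r"""\s*=\s*(?:'[^']*'|"[^"]*"))*)\s*(/?)>"""
)
# One attribute: name = 'value' or name = "value".
ATTR_RE = re.compile("(" + NAME + r""")\s*=\s*(?:'([^']*)'|"([^"]*)")""")


def parse_events(text):
    """Yield SAX-style events: ('start', name, attrs), ('text', data), ('end', name)."""
    pos = 0
    stack = []  # names of the currently open elements
    for m in TAG_RE.finditer(text):
        # Everything between the previous tag and this one is PCDATA.
        pcdata = text[pos:m.start()].strip()
        if pcdata:
            yield ("text", pcdata)
        pos = m.end()

        closing, name, raw_attrs, self_closing = m.groups()
        if closing:
            if not stack or stack.pop() != name:
                raise ValueError("unexpected closing tag: " + name)
            yield ("end", name)
            continue

        # findall gives '' for the quote style that did not match.
        attrs = {k: single or double for k, single, double in ATTR_RE.findall(raw_attrs)}
        yield ("start", name, attrs)
        if self_closing:
            yield ("end", name)
        else:
            stack.append(name)

    if stack:
        raise ValueError("unclosed element: " + stack[-1])


if __name__ == "__main__":
    doc = "<tag1 att1='foo' att2='bar'> <tag2 att3='baz'/> <tag3 att4='quux'>pcdata</tag3> </tag1>"
    for event in parse_events(doc):
        print(event)
```

Building a DOM-like tree on top of this is mostly a matter of keeping a stack of nodes instead of a stack of names. None of the hard parts (encodings, entities, validation) are touched here, which is the point.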