Newbie: parsing simple XML with C/C++


Steven Feil

I'm looking for simple examples of XML parsing using C/C++ that could
be applicable to both Unix and Windows programming. I want to
parse an XML structure that is basically flat. The information that I
wish to extract would be held within XML attributes. It is important
that the order of the document be preserved in the parsing process.


Here is a document similar to the type of XML that will be generated.

<collector name="John Tomas">
<cd title="Moon Light City" artist="Fred Ziffle" />
<book title="Hiking Big Bend" author="Laurence Parent" pages="171" />
<cd title="Ups and Downs" artist="Hank Kimble" />
<cd title="Eat My Shorts" artist="Bart Simpson" />
<book title="100 things to cook" author="Tom Duly" pages="100" />
</collector>

My program would need to know what order the information was
in. Specifically, the cd "Moon Light City" was found first, followed by
the book "Hiking Big Bend", and so forth. I would also need to extract
all the data, noting that the data for a cd is different from that for
a book.

I've looked into XPath, but what I've read about it so far seems much
more complicated than what I need. Most of the XML books I've seen
seem to concentrate on the DTD and interoperability between HTML and
XML. I've written a DTD that I believe is right, but I haven't tested
it yet.

Any websites or steering on what I should search the Internet for
would be greatly appreciated.
 

Patrick TJ McPhee

% I'm looking for simple examples of XML parsing using C/C++ that could
% be applicable to both Unix and Windows programming.

You should use an existing XML parser rather than writing the parsing
code yourself. There are a bunch of them out there -- my current favourite
is libxml (http://xmlsoft.org), but my old favourite (expat -- don't
have a URL) works well and is a little lighter-weight. I didn't like
using Xerces, from the Apache project, partly because I find the name
irritating, and partly because I was doing some non-standard things with
it and ran into problems with version changes.

There are two styles of parser interfaces. You can write a bunch of
callback functions which get passed bits of the file as they're parsed
and do what you want with them, or you can have the parser give you a
tree structure, then have your code walk the tree when it wants to get
at the data. Expat originally supported only the first style of
interface, while most of the more recent parsers support both styles. I
believe there's an expat project on SourceForge which has likely moved the
interface forward a bit from the version I use.
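
As a rough sketch of the callback style, here is what a start-element
handler for the sample document might look like. The function signature
follows the convention expat uses for start-element callbacks (a user-data
pointer, the element name, and a NULL-terminated array of alternating
attribute names and values). The names Item, Collection, on_start_element
and attr are made up for illustration; they are not part of any library.
Because each element is appended as the parser reports it, document order
is preserved without extra work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// One attribute as a (name, value) pair.
typedef std::pair<std::string, std::string> Attr;

// One parsed element: its tag name plus its attributes in source order.
// Keeping attributes generic lets a cd and a book carry different fields.
struct Item {
    std::string kind;          // "cd" or "book"
    std::vector<Attr> attrs;   // e.g. ("title", "Moon Light City")
};

// Items in the order the parser reported them.
typedef std::vector<Item> Collection;

// Start-element callback (hypothetical name). `user_data` must point to a
// Collection; `atts` is a NULL-terminated list of name/value pairs.
void on_start_element(void *user_data, const char *name, const char **atts)
{
    assert(user_data != NULL);

    // Ignore the <collector> wrapper and anything unexpected.
    if (std::strcmp(name, "cd") != 0 && std::strcmp(name, "book") != 0)
        return;

    Item item;
    item.kind = name;
    for (std::size_t i = 0; atts && atts[i] && atts[i + 1]; i += 2)
        item.attrs.push_back(Attr(atts[i], atts[i + 1]));

    static_cast<Collection *>(user_data)->push_back(item);
}

// Look up one attribute by name; returns "" if the item does not have it.
std::string attr(const Item &item, const std::string &name)
{
    for (std::size_t i = 0; i < item.attrs.size(); ++i)
        if (item.attrs[i].first == name)
            return item.attrs[i].second;
    return "";
}
```

With expat, a handler like this would be registered with
XML_SetStartElementHandler, and the Collection would be passed in with
XML_SetUserData before the document is fed to the parser. After parsing,
the program walks the Collection from front to back and checks each
item's kind to decide which attributes to read.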

% I've looked into XPath, but what I've read about it so far seems much
% more complicated than what I need.

It depends on what you need to do with the data. If your goal is to
find a specific CD, say, XPath can save you a lot of coding. If you
need to look at all of the data anyway, it doesn't really buy you
anything.

% Most of the XML books I've seen
% seem to concentrate on the DTD and interoperability between HTML and
% XML. I've written a DTD that I believe is right, but I haven't tested
% it yet.

You don't need a DTD, although it doesn't hurt to define your data
model up front, and the DTD is a reasonable way of documenting it.
 
