Parse String or tokenizer

Slickuser

I'm new to C++.

How do I parse the information below from a string held in memory?
I do not want to write it out to an XML file and then parse the file. Thanks.

<UserRoot>
<Header>
<Transmit>True</Transmit>
<Exception>None</Exception>
</Header>
<Body>
<InfoScheme Version="5.0">
<FunctionCall>GetUserName</FunctionCall>
<Login id="1" user="a" name="Jack" location="newyork:usa" />
<Login id="2" user="b" name="Ken" location="tokyo:japan" />
<Login id="3" user="c" name="Jen" location="xx:yy" />
</InfoScheme>
</Body>
</UserRoot>

Information I want to get back:

Transmit = True
Exception = None
Each unique user, with its id, name and location.
 
Syron

On 21.02.2010 04:50, Sam wrote:
1) Use some XML parsing library. There are plenty of XML parsing
libraries for various platforms.

I personally recommend tinyxml, as this is a very easy-to-use library.
 
Slickuser

You have two options:

1) Use some XML parsing library. There are plenty of XML parsing libraries
for various platforms.

2) Write your own XML parser. Depending upon your required level of
sophistication (grokking entity references, namespaces, or DTDs), this may
not necessarily be very complicated. If you're new to C++, this will be an
excellent vehicle for you to learn how to use the STL.
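
As a rough illustration of option 2, here is a minimal sketch that uses only std::string::find and substr to pull the text out of a simple element such as <Transmit>. The helper name element_text is hypothetical, and the sketch assumes the element carries no attributes and is not nested inside another element of the same name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies the text between <name> and </name> into `out`.
// Returns false if either the start tag or the end tag is missing.
// Assumption: the element has no attributes and does not nest within itself.
bool element_text(const std::string& xml, const std::string& name,
                  std::string& out) {
    const std::string open = "<" + name + ">";
    const std::string close = "</" + name + ">";

    const std::size_t start = xml.find(open);
    if (start == std::string::npos) return false;

    const std::size_t text_begin = start + open.size();
    const std::size_t end = xml.find(close, text_begin);
    if (end == std::string::npos) return false;

    out = xml.substr(text_begin, end - text_begin);
    return true;
}
```

Calling element_text(xml, "Transmit", value) on the document from the first post would leave "True" in value. Anything beyond this fixed shape (attributes, entity references, nesting) needs more than two find calls.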

That entire XML file will be a single string. Is there some kind of
search, like Perl regular expressions?
 
Slickuser

There are also various libraries that implement regular expressions, such as
the PCRE library. However, attempting to use regular expressions to parse
XML documents is a very common newbie mistake.

Regular expressions cannot be used to parse XML documents. It may seem to
be an easy hack, at first, but in the long run, your code will be guaranteed
to become a feature article on http://www.thedailywtf.com

Even if you put together a bare bones XML parser yourself, that does little
more than carve out the starting and the closing tags, ignoring namespaces,
custom entities, and DTDs, that'll still work better than any regular
expression hack.

Thanks for the advice. Where do I start if I want to write a bare-bones XML
parser for the data I described above?
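
One possible starting point, sketched under loud assumptions: scan for start tags by name, find the closing '>' while skipping anything inside double quotes, and split the tag's interior into name="value" pairs. The names Attributes, find_tag_end, parse_attributes and find_elements are all hypothetical. The sketch assumes double-quoted values, no whitespace around '=', no entity references, and no comments or CDATA sections.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> Attributes;

// Index of the '>' that ends the tag, ignoring any '>' inside a quoted value.
std::size_t find_tag_end(const std::string& xml, std::size_t from) {
    bool in_quotes = false;
    for (std::size_t i = from; i < xml.size(); ++i) {
        if (xml[i] == '"') in_quotes = !in_quotes;
        else if (xml[i] == '>' && !in_quotes) return i;
    }
    return std::string::npos;
}

// Splits the inside of a start tag, e.g.  id="1" user="a" , into pairs.
Attributes parse_attributes(const std::string& s) {
    Attributes attrs;
    std::size_t pos = 0;
    while (true) {
        const std::size_t eq = s.find("=\"", pos);
        if (eq == std::string::npos) break;
        const std::size_t value_end = s.find('"', eq + 2);
        if (value_end == std::string::npos) break;
        // The attribute name starts after the last whitespace before '='.
        std::size_t name_begin = s.find_last_of(" \t\r\n", eq);
        if (name_begin == std::string::npos || name_begin < pos) name_begin = pos;
        else ++name_begin;
        attrs[s.substr(name_begin, eq - name_begin)] =
            s.substr(eq + 2, value_end - (eq + 2));
        pos = value_end + 1;
    }
    return attrs;
}

// Collects the attributes of every <tag ...> start tag found in xml.
std::vector<Attributes> find_elements(const std::string& xml,
                                      const std::string& tag) {
    std::vector<Attributes> result;
    const std::string open = "<" + tag;
    std::size_t pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        const std::size_t after = pos + open.size();
        const std::size_t close = find_tag_end(xml, after);
        if (close == std::string::npos) break;
        // Skip longer names that merely share the prefix, e.g. <LoginInfo>.
        if (std::string(" \t\r\n/>").find(xml[after]) != std::string::npos)
            result.push_back(parse_attributes(xml.substr(after, close - after)));
        pos = close + 1;
    }
    return result;
}
```

With the document from the first post, find_elements(xml, "Login") would return three attribute maps, so logins[0]["name"] is "Jack" and logins[2]["location"] is "xx:yy". The element text for Transmit and Exception can be carved out the same way, by looking for the matching end tag after the start tag.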
 
Philip Potter

That entire XML file will be a single string. Is there some kind of
search, like Perl regular expressions?

<OT>
Have you heard of XPath? It's a way of searching XML documents which was
designed with XML in mind. There exist C++ XPath libraries (a quick
Google search turns up TinyXPath).

As others have said, you generally shouldn't use regexes to parse XML;
there are some cases where it is a useful tactic, however, such as when
the XML is in a fixed form which is guaranteed not to change. If your
XML document will always have exactly one Header with exactly one
Transmit and one Exception, and it will always have exactly one Body, I
can't say that regexes would be an entirely bad solution.
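
For that fixed-form case only, a regex lookup might look like the sketch below. It swaps in std::regex from the standard library (C++11 or later) rather than PCRE, which was mentioned earlier in the thread. The function name regex_element_text is hypothetical, and the sketch assumes the element has no attributes, its text contains no '<', and the element name contains no regex metacharacters.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: text of the first <name>...</name> element,
// or an empty string if there is no match.
// Assumption: fixed-form input, as described above; this is not a parser.
std::string regex_element_text(const std::string& xml,
                               const std::string& name) {
    // [^<]* stops at the first '<', so the match cannot run past the end tag.
    const std::regex pattern("<" + name + ">([^<]*)</" + name + ">");
    std::smatch match;
    if (std::regex_search(xml, match, pattern)) return match[1].str();
    return std::string();
}
```

On the sample document, regex_element_text(xml, "Transmit") would give "True" and regex_element_text(xml, "Exception") would give "None".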

The problem with using regexes and XML starts when you can have
recursive document structures (for example, another Body somewhere
within a Body tag), or when attributes can look like tags (such as
<Login name="<Body>"> ), or in a few other nasty cases. Regexes are only
appropriate if you know such things Can't Happen now, and Can't Happen
ever. Note that if your regex gets sufficiently confused by malicious
user input, it could become a security issue.
</OT>
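
To make the "attributes can look like tags" case concrete, here is a small sketch of how a naive pattern goes wrong. The function naive_body_offset is hypothetical and deliberately simplistic; it uses std::regex (C++11 or later) and treats the first textual match of <Body> as the start of the Body element.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Deliberately naive: offset of the first "<Body>" in the text, or -1.
// It has no idea whether the match sits inside a quoted attribute value.
std::ptrdiff_t naive_body_offset(const std::string& xml) {
    static const std::regex body_tag("<Body>");
    std::smatch match;
    if (std::regex_search(xml, match, body_tag)) return match.position(0);
    return -1;
}
```

Given the input <Header><Login name="<Body>"/></Header><Body>real</Body>, the function reports the offset of the <Body> text inside the name attribute, not the offset of the real Body element further along. That is the kind of confusion described above, and why the fixed-form guarantee matters.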
 
