Parse String or tokenizer

Discussion in 'C++' started by Slickuser, Feb 21, 2010.

  1. Slickuser

    Slickuser Guest

    I'm new to C++.

    How do I parse the information below back from a string in memory?
    I do not want to write it out to an XML file and then parse the file. Thanks.

    <UserRoot>
    <Header>
    <Transmit>True</Transmit>
    <Exception>None</Exception>
    </Header>
    <Body>
    <InfoScheme Version="5.0">
    <FunctionCall>GetUserName</FunctionCall>
    <Login id="1" user="a" name="Jack" location="newyork:usa" />
    <Login id="2" user="b" name="Ken" location="tokyo:japan" />
    <Login id="3" user="c" name="Jen" location="xx:yy" />
    </InfoScheme>
    </Body>
    </UserRoot>

    Information I want to get back:

    Transmit = True
    Exception = None
    Each unique user with id, name, and location.
     
    Slickuser, Feb 21, 2010
    #1

  2. Slickuser

    Syron Guest

    On 21.02.2010 04:50, Sam wrote:
    > 1) Use some XML parsing library. There are plenty of XML parsing
    > libraries for various platforms.


    I personally recommend tinyxml, as this is a very easy-to-use library.
     
    Syron, Feb 21, 2010
    #2

  3. Slickuser

    Slickuser Guest

    On Feb 20, 7:50 pm, Sam <> wrote:
    > Slickuser writes:
    > > I'm new to C++.

    >
    > > How do I parse the information below back from a string in memory?
    > > I do not want to output this to a XML file and parse the file. Thanks.

    >
    > > <UserRoot>
    > >    <Header>
    > >            <Transmit>True</Transmit>
    > >            <Exception>None</Exception>
    > >    </Header>
    > >    <Body>
    > >            <InfoScheme Version="5.0">
    > >                    <FunctionCall>GetUserName</FunctionCall>
    > >                    <Login id="1" user="a" name="Jack" location="newyork:usa" />
    > >                    <Login id="2" user="b" name="Ken" location="tokyo:japan" />
    > >                    <Login id="3" user="c" name="Jen" location="xx:yy" />
    > >            </InfoScheme>
    > >    </Body>
    > > </UserRoot>

    >
    > > Information want to get back:

    >
    > > Transmit   = True
    > > Exception  = None
    > > Unique user with id, name and location.

    >
    > You have two options:
    >
    > 1) Use some XML parsing library. There are plenty of XML parsing libraries
    > for various platforms.
    >
    > 2) Write your own XML parser. Depending upon your required level of
    > sophistication (grokking entity references, namespaces, or DTDs), this may
    > not necessarily be very complicated. If you're new to C++, this will be an
    > excellent vehicle for you to learn how to use the STL.
    >


    The entire XML document will be a single string. Is there some kind of
    search, like Perl regular expressions?
     
    Slickuser, Feb 21, 2010
    #3
  4. Slickuser

    Slickuser Guest

    On Feb 21, 5:59 am, Sam <> wrote:
    > Slickuser writes:
    > > That entire XML file will be a single string. Is there some kind
    > > search like Perl regular expression?

    >
    > There are also various libraries that implement regular expressions, such as
    > the PCRE library. However, attempting to use regular expressions to parse
    > XML documents is a very common newbie mistake.
    >
    > Regular expressions cannot be used to parse XML documents. It may seem to
    > be an easy hack, at first, but in the long run, your code will be guaranteed
    > to become a feature article on http://www.thedailywtf.com
    >
    > Even if you put together a bare bones XML parser yourself, that does little
    > more than carve out the starting and the closing tags, ignoring namespaces,
    > custom entities, and DTDs, that'll still work better than any regular
    > expression hack.
    >


    Thanks for the advice. Where do I start on writing a bare-bones XML
    parser for the document I described above?
     
    Slickuser, Feb 22, 2010
    #4
  5. Philip Potter

    On 21/02/2010 08:02, Slickuser wrote:
    > That entire XML file will be a single string. Is there some kind
    > search like Perl regular expression?


    <OT>
    Have you heard of XPath? It's a way of searching XML documents which was
    designed with XML in mind. There exist C++ XPath libraries (a quick
    Google search turns up TinyXPath).

    As others have said, you generally shouldn't use regexes to parse XML;
    there are some cases where it is a useful tactic, however, such as when
    the XML is in a fixed form which is guaranteed not to change. If your
    XML document will always have exactly one Header with exactly one
    Transmit and one Exception, and it will always have exactly one Body, I
    can't say that regexes would be an entirely bad solution.

    The problem with using regexes and XML starts when you can have
    recursive document structures (for example, another Body somewhere
    within a Body tag), or when attributes can look like tags (such as
    <Login name="<Body>"> ), or in a few other nasty cases. Regexes are only
    appropriate if you know such things Can't Happen now, and Can't Happen
    ever. Note that if your regex gets sufficiently confused by malicious
    user input, it could become a security issue.
    </OT>
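
    For the fixed-form case described above, a minimal sketch with the
    C++11 <regex> header might look like the following. The Login struct
    and the function names element_text and find_logins are made up for
    illustration. It assumes the layout never changes: no nested elements
    of the same name, and Login attributes always in the order id, user,
    name, location. It breaks as soon as any of that stops being true.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical record type for one <Login ... /> element.
struct Login {
    std::string id, name, location;
};

// Returns the text of the first <tag>text</tag> pair, or "" if none is found.
// Assumes the element holds plain text only (no child elements, no '<').
std::string element_text(const std::string& xml, const std::string& tag) {
    const std::regex re("<" + tag + ">([^<]*)</" + tag + ">");
    std::smatch m;
    return std::regex_search(xml, m, re) ? m[1].str() : std::string();
}

// Collects every <Login ... /> whose attributes appear in exactly the
// order id, user, name, location, as in the sample document.
std::vector<Login> find_logins(const std::string& xml) {
    static const std::regex re(
        "<Login\\s+id=\"([^\"]*)\"\\s+user=\"[^\"]*\"\\s+"
        "name=\"([^\"]*)\"\\s+location=\"([^\"]*)\"\\s*/>");
    std::vector<Login> out;
    for (std::sregex_iterator it(xml.begin(), xml.end(), re), end;
         it != end; ++it) {
        out.push_back({(*it)[1].str(), (*it)[2].str(), (*it)[3].str()});
    }
    return out;
}
```

    Calling element_text(xml, "Transmit") and element_text(xml, "Exception")
    on the in-memory string gives the two header values, and find_logins(xml)
    gives one entry per Login. Reordering the attributes or nesting a Header
    inside another Header is enough to defeat it, which is exactly the
    limitation described above.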
     
    Philip Potter, Feb 22, 2010
    #5
  6. Slickuser

    Jeff Flinn Guest

    Leigh Johnston wrote:
    >
    >
    > "Slickuser" <> wrote in message
    > news:...
    >> On Feb 21, 5:59 am, Sam <> wrote:
    >>> Slickuser writes:
    >>> > That entire XML file will be a single string. Is there some kind
    >>> > search like Perl regular expression?
    >>>
    >>> There are also various libraries that implement regular expressions,
    >>> such as


    ....

    >>
    >> Thanks for the advice. Where do I start to write the bare bone XML
    >> parsing as I describe above?

    >
    > A recursive descent parser is probably the easiest way.
    > http://en.wikipedia.org/wiki/Recursive_descent_parser


    Also see the Boost.Spirit parsing library at
    http://www.boost.org/doc/libs/1_42_0/libs/spirit/doc/html/index.html
    which lets you write an EBNF-like Parsing Expression Grammar (PEG)
    natively in C++. The Boost serialization library also has a mini XML
    grammar that is used in its XML archive implementation.
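
    To give an idea of the hand-written recursive descent approach
    mentioned above, here is a rough sketch using only the standard
    library. Node and Parser are hypothetical names, not part of Boost or
    any other library. It assumes a very small XML subset: elements,
    double-quoted attributes, and text. There are no comments, entities,
    CDATA, namespaces, or DTDs. It also relies on std::vector accepting an
    incomplete element type, which is guaranteed only from C++17 on,
    although common implementations allow it earlier.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tree node for the tiny XML subset handled below.
struct Node {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::string text;            // character data, whitespace included
    std::vector<Node> children;

    // First direct child with the given name, or nullptr if there is none.
    const Node* child(const std::string& n) const {
        for (const Node& c : children)
            if (c.name == n) return &c;
        return nullptr;
    }
};

class Parser {
public:
    explicit Parser(const std::string& s) : s_(s), pos_(0) {}

    // document := ws element ws
    Node parse() {
        skip_ws();
        Node root = element();
        skip_ws();
        if (pos_ != s_.size()) fail("trailing content");
        return root;
    }

private:
    std::string s_;
    std::size_t pos_;

    [[noreturn]] void fail(const std::string& why) const {
        throw std::runtime_error(why + " at offset " + std::to_string(pos_));
    }
    bool at(char c) const { return pos_ < s_.size() && s_[pos_] == c; }
    void expect(char c) {
        if (!at(c)) fail(std::string("expected '") + c + "'");
        ++pos_;
    }
    void skip_ws() {
        while (pos_ < s_.size() &&
               std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }
    // name := one or more characters other than whitespace, '=', '/', '<', '>'
    std::string name() {
        const std::size_t start = pos_;
        while (pos_ < s_.size() &&
               !std::isspace(static_cast<unsigned char>(s_[pos_])) &&
               std::string("=/<>").find(s_[pos_]) == std::string::npos) ++pos_;
        if (pos_ == start) fail("expected a name");
        return s_.substr(start, pos_ - start);
    }
    // element := '<' name attribute* ( '/>' | '>' content '</' name '>' )
    Node element() {
        Node n;
        expect('<');
        n.name = name();
        for (skip_ws(); !at('/') && !at('>'); skip_ws()) {
            const std::string key = name();      // attribute := name '=' '"' ... '"'
            skip_ws(); expect('='); skip_ws(); expect('"');
            const std::size_t end = s_.find('"', pos_);
            if (end == std::string::npos) fail("unterminated attribute");
            n.attrs[key] = s_.substr(pos_, end - pos_);
            pos_ = end + 1;
        }
        if (at('/')) { ++pos_; expect('>'); return n; }  // self-closing tag
        expect('>');
        content(n);                                      // stops just before "</"
        expect('<'); expect('/');
        if (name() != n.name) fail("mismatched closing tag");
        skip_ws(); expect('>');
        return n;
    }
    // content := ( text | element )*
    void content(Node& n) {
        while (pos_ < s_.size()) {
            if (at('<')) {
                if (pos_ + 1 < s_.size() && s_[pos_ + 1] == '/') return;
                n.children.push_back(element());         // the recursive step
            } else {
                const std::size_t end = s_.find('<', pos_);
                if (end == std::string::npos) break;
                n.text += s_.substr(pos_, end - pos_);
                pos_ = end;
            }
        }
        fail("unterminated element");
    }
};
```

    With the document held in a std::string named xml, Parser(xml).parse()
    returns the root Node; root.child("Header")->child("Transmit")->text
    then holds the Transmit value, and the Login elements are children of
    the InfoScheme node with their attributes in attrs. Each grammar rule
    maps to one member function, which is the idea behind recursive
    descent; a malformed document makes parse() throw std::runtime_error.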

    Jeff
     
    Jeff Flinn, Feb 22, 2010
    #6