XML as a stream protocol.


jbi130

Hello,

We're redesigning a custom binary application protocol built on TCP
and I thought we should evaluate some XML options. I have a few
questions on how to best handle a stream of XML data.

The first option I looked at is similar to XMPP. After looking at
XMPP we may be able to use some of it, but since we'd end up writing
our own implementation we are not worried about sticking to a spec.
So, our first option would use one complete document in each direction
for the whole connection, e.g.:

<connection>
<message>
...
</message>
...
</connection>
close socket.

What is the best parse model for this? Because the <message> elements
are never going to be that big, DOM would be nice because it seems
simpler to code. But can DOM be applied to just a section of the whole
document? Is our only (standard) option here to use SAX?

The other option I thought of is to add some framing so each message
is its own document:

NUMBER_OF_BYTES\r\n
<message>
...
</message>

Where NUMBER_OF_BYTES is the length of the following "document" so it
can be read in from the socket then passed to a DOM parser.

So what is the better way? What works better with the available XML
parsers? Our applications are written in Python, C and Java.

Sorry if this is all basic, but my XML experience is only about 2 days
old.

Thanks.
 

Patrick TJ McPhee

[...]

% So, our first option would use one complete document in each direction
% for the whole connection, e.g.:

[...]

% What is the best parse model for this? Because the <message> elements
% are never going to be that big, DOM would be nice because it seems
% simpler to code. But can DOM be applied to just a section of the whole
% document?

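A DOM parser reads the entire document before giving the tree back to
you. With one document per connection, that means you'd get nothing
until the closing </connection> tag arrives.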

% Is our only (standard) option here to use SAX?

Standard is a relative term, but yes.
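
For what it's worth, here is a rough sketch of the SAX route in Python, using the standard library's xml.sax incremental parser rather than any particular third-party one. The MessageHandler class and parse_chunks function are names made up for this illustration, and it assumes each <message> is a direct child of <connection> and that only its text content matters. The point is that data can be fed in whatever pieces the socket delivers, and each message is available as soon as its end tag has been seen.

```python
import xml.sax


class MessageHandler(xml.sax.ContentHandler):
    """Collects the text of each <message> child of the root element."""

    def __init__(self):
        super().__init__()
        self.messages = []  # text of every completed message so far
        self._depth = 0     # current element nesting depth
        self._buf = None    # pieces of the message being read, or None

    def startElement(self, name, attrs):
        self._depth += 1
        if self._depth == 2 and name == "message":
            self._buf = []

    def characters(self, content):
        # May be called several times for one run of text.
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if self._depth == 2 and name == "message" and self._buf is not None:
            self.messages.append("".join(self._buf))
            self._buf = None
        self._depth -= 1


def parse_chunks(chunks):
    """Feed arbitrary byte chunks (e.g. from recv()) to the parser."""
    parser = xml.sax.make_parser()
    handler = MessageHandler()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)  # chunk boundaries may fall mid-tag
    parser.close()
    return handler.messages
```

In a real server you would act on each message inside endElement instead of collecting them in a list, since the document only ends when the peer closes the connection.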


% The other option I thought of is to add some framing so each message
% is its own document:

% NUMBER_OF_BYTES\r\n
% <message>
% ...
% </message>

I like this approach better. Each message really _is_ its own document,
so it ought to be represented as such.
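
A minimal sketch of that framing in Python, assuming the length is written as ASCII decimal digits and counts only the bytes of the document that follows the \r\n. The function names frame_message and split_frames are invented for this example. split_frames takes whatever bytes have arrived so far and returns the complete documents plus any leftover bytes, which you would keep and prepend to the next read.

```python
def frame_message(document: bytes) -> bytes:
    """Prefix one XML document with its byte length and CRLF."""
    return str(len(document)).encode("ascii") + b"\r\n" + document


def split_frames(buffer: bytes):
    """Return (complete_documents, leftover_bytes) from received data.

    Raises ValueError if a length line is not a valid integer.
    """
    documents = []
    while True:
        header_end = buffer.find(b"\r\n")
        if header_end < 0:
            break  # length line not complete yet
        length = int(buffer[:header_end])
        start = header_end + 2
        if len(buffer) - start < length:
            break  # body not complete yet
        documents.append(buffer[start:start + length])
        buffer = buffer[start + length:]
    return documents, buffer
```

Counting bytes rather than characters matters here: the length has to be taken after the document is encoded, or any non-ASCII content will throw the framing off.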

% So what is the better way? What works better with the available XML
% parsers? Our applications are written in Python, C and Java.

Using Python, you'll probably want to use libxml as the parser, and you
might as well also use it for C. This approach will work well with
the parser -- you hand it a buffer with the document in it, and it
hands you a tree back. With Java, there's more standardisation, and
any parser ought to handle it OK.
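
To make the "buffer in, tree out" step concrete, here is a small Python sketch. It uses the standard library's xml.dom.minidom as a stand-in, not libxml, so treat it as an illustration of the shape of the code rather than the parser recommended above. read_message is a hypothetical helper; it assumes a binary file-like object such as the one returned by sock.makefile("rb"), and the same ASCII-length-then-\r\n framing as in the original post.

```python
import io
from xml.dom import minidom


def read_message(stream):
    """Read one length-prefixed document and return it as a DOM tree.

    Returns None on a clean end of stream before a new message starts.
    """
    header = stream.readline()        # e.g. b"21\r\n"
    if not header:
        return None
    length = int(header.strip())
    body = stream.read(length)        # blocks until length bytes or EOF
    if len(body) != length:
        raise EOFError("connection closed in the middle of a message")
    return minidom.parseString(body)  # the whole buffer is one document


if __name__ == "__main__":
    doc_bytes = b"<message>hi</message>"
    wire = str(len(doc_bytes)).encode("ascii") + b"\r\n" + doc_bytes
    tree = read_message(io.BytesIO(wire))
    print(tree.documentElement.tagName)
```

Because every message is a complete document, a malformed one only costs you that message; the length prefix still tells you where the next one begins.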
 
