Looking for XML linearization information


Generic Usenet Account

Hello,

Are there any tools/W3C standards/design patterns etc. for linearizing
XML content? Basically I want to send information, which is natively
in XML, to a resource constrained device that does not have XML
awareness. In other words, the resource constrained device does not
do any DOM or SAX processing of XML.

Thanks in advance,
Baht
 

BGB

depends on what exactly you are wanting...


if a library:

one option is to use (or write) an XML library, but depending on memory
resources, this may be too memory-hungry (for example, a lot of XML as
DOM nodes will eat up a large chunk of memory even on desktop PCs).

if one adjusts the implementation to their needs, they can do a DOM-like
implementation which needs a lot less memory than standard DOM (if one
omits namespaces and doubly-linked structures, and uses ASCII or UTF-8
rather than UTF-16, a fair bit can be saved).


SAX could be better, as it allows a small streaming implementation which
does not need to hold the whole document in memory.
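As a rough illustration of the streaming approach on the sending side, the sketch below uses Python's standard `xml.sax` module to turn a document into flat `path=value` lines while keeping only the stack of open element names in memory. The line format and the names `Flattener` and `flatten` are assumptions made for this example, not part of any standard, and the sketch ignores mixed content and namespaces.

```python
import xml.sax


class Flattener(xml.sax.ContentHandler):
    """Emit one 'path=value' line per attribute and per non-empty text run."""

    def __init__(self, emit):
        super().__init__()
        self.emit = emit   # callback that receives each output line
        self.path = []     # names of the currently open elements
        self.text = []     # text chunks seen since the last start/end tag

    def startElement(self, name, attrs):
        self.path.append(name)
        self.text = []
        for key in attrs.getNames():
            self.emit("/".join(self.path) + "/@" + key + "=" + attrs.getValue(key))

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so collect them.
        self.text.append(content)

    def endElement(self, name):
        value = "".join(self.text).strip()
        if value:
            self.emit("/".join(self.path) + "=" + value)
        self.text = []
        self.path.pop()


def flatten(xml_bytes):
    """Return the flat lines for a whole document (hypothetical helper)."""
    lines = []
    xml.sax.parseString(xml_bytes, Flattener(lines.append))
    return lines
```

For `<book><title>Book Title One</title><price cur="USD">10.50</price></book>` this yields `book/title=Book Title One`, `book/price/@cur=USD` and `book/price=10.50`, which a device without any XML awareness can split on the first `=`.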


if a binary interchange:
well, WBXML could work.

http://en.wikipedia.org/wiki/WBXML

there is EXI, but EXI looks likely to require a more complex
implementation (but it uses grammar-driven event codes plus optional
deflate-based compression, so could save some bytes).

http://en.wikipedia.org/wiki/Efficient_XML_Interchange


also maybe relevant:
http://msdn.microsoft.com/en-us/library/cc219210(PROT.10).aspx


for my uses, I rolled my own format (which I call SBXE) which is
structurally vaguely similar to WBXML, but in general is more compact in
my tests (for generic/schema-free operation, which is my main use-case),
and is simpler and faster to decode than textual XML. its main
difference from WBXML is that tags/strings are defined inline and go
into MRU lists; a tag/string already in the list is then referenced by
its MRU index (a variant of "move to front" was used).
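The inline-definition idea can be sketched with a plain move-to-front list. This is not the SBXE byte layout or its exact MRU variant (the post does not spell those out); the `("def", ...)` and `("ref", ...)` tuples and the unbounded list are assumptions made purely to show the principle.

```python
def mtf_encode(tokens):
    """Replace repeated strings with their index in a most-recently-used list."""
    mru = []   # most recently used string is at index 0
    out = []
    for tok in tokens:
        if tok in mru:
            idx = mru.index(tok)
            out.append(("ref", idx))   # already known: send only the index
            mru.pop(idx)
        else:
            out.append(("def", tok))   # first use: define the string inline
        mru.insert(0, tok)             # either way it becomes the newest entry
    return out


def mtf_decode(items):
    """Rebuild the token stream by maintaining the same list as the encoder."""
    mru = []
    out = []
    for kind, value in items:
        tok = mru.pop(value) if kind == "ref" else value
        mru.insert(0, tok)
        out.append(tok)
    return out
```

Because recently used names sit near the front, the indices for repetitive tag names stay small, which is what makes them cheap to store as short variable-length integers.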

it also responds favorably to deflate.

some info (if server stays up...):
http://cr88192.dyndns.org/2010-10-27_SBXE11.txt

it was first defined/implemented around 2005, but I forgot about it for
several years due to not having much use for it at the time.

I designed a new variant which could be (potentially) more compact, but
the improvement was likely modest and not worth the hassle of having to
re-implement it.

looking, there are a few holes in the spec...
the UVLI (unsigned variable-length integer) scheme is like this:
0..127: 0xxxxxxx
128..16383: 10xxxxxx xxxxxxxx
16384.. ...: 110xxxxx xxxxxxxx xxxxxxxx
....

note: high-bits/bytes come first.

with sign folding (for VLI) being into the LSB, so:
0, -1, 1, -2, 2, -3, 3, ...
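A minimal sketch of the integer scheme described above, covering only the three byte layouts that are actually listed (the longer forms are elided in the post, so the code rejects anything needing them). The function names are made up for this example, and the 3-byte upper bound of 2097151 is inferred from the 21 payload bits in the third pattern.

```python
def uvli_encode(n):
    """Encode a non-negative integer, high bits first, in 1 to 3 bytes."""
    if n < 0:
        raise ValueError("UVLI is unsigned; fold the sign first")
    if n < 0x80:                       # 0xxxxxxx
        return bytes([n])
    if n < 0x4000:                     # 10xxxxxx xxxxxxxx
        return bytes([0x80 | (n >> 8), n & 0xFF])
    if n < 0x200000:                   # 110xxxxx xxxxxxxx xxxxxxxx
        return bytes([0xC0 | (n >> 16), (n >> 8) & 0xFF, n & 0xFF])
    raise ValueError("longer forms are not listed in the post")


def uvli_decode(data, pos=0):
    """Return (value, next_position) for the integer starting at data[pos]."""
    first = data[pos]
    if first < 0x80:
        return first, pos + 1
    if first < 0xC0:
        return ((first & 0x3F) << 8) | data[pos + 1], pos + 2
    if first < 0xE0:
        return ((first & 0x1F) << 16) | (data[pos + 1] << 8) | data[pos + 2], pos + 3
    raise ValueError("longer forms are not listed in the post")


def fold_sign(n):
    """Map 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... (sign in the LSB)."""
    return 2 * n if n >= 0 else -2 * n - 1


def unfold_sign(u):
    """Inverse of fold_sign."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)
```

A signed value is written as `uvli_encode(fold_sign(n))` and read back with `unfold_sign(uvli_decode(data)[0])`, so small magnitudes of either sign stay within one byte.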
 

Joe Kesselman

This sounds like a standard data-extraction problem -- read the
document, parse out the portions which will be meaningful to the device,
format them in a way that the device's software will understand, and
send them along.

If the format you need to generate is textual in nature,
XPath/XSLT/XQuery may be useful in doing the data extraction and
formatting. If the format you need is binary, you're going to be writing
your own code; the X* tools may be useful at the extraction end but
formatting's going to be up to you and/or whatever libraries are
available for talking to that specific device.
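For the binary case, the "extract, then format it yourself" step might look like the sketch below. The record layout (a big-endian 32-bit price in cents followed by two length-prefixed UTF-8 strings), the element names and the function `pack_book` are all hypothetical; a real device would dictate its own layout.

```python
import struct
import xml.etree.ElementTree as ET
from decimal import Decimal


def pack_book(xml_text):
    """Extract three fields from a <book> document and pack them as bytes."""
    book = ET.fromstring(xml_text)
    title = book.findtext("title", default="").encode("utf-8")
    author = book.findtext("author", default="").encode("utf-8")
    # Decimal avoids binary floating-point rounding when converting to cents.
    cents = int(Decimal(book.findtext("price", default="0")) * 100)
    # Hypothetical layout: u32 cents, then u8 length + bytes for each string.
    # struct.pack raises struct.error if a string is longer than 255 bytes.
    return (
        struct.pack(">I", cents)
        + struct.pack(">B", len(title)) + title
        + struct.pack(">B", len(author)) + author
    )
```

The receiving side then needs only fixed offsets and two length bytes to read the record, with no XML parsing at all.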

Without knowing exactly what the device is prepared to handle, I don't
think I can offer much more specific advice than that.

--
Joe Kesselman,
http://www.love-song-productions.com/people/keshlam/index.html

{} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
/\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
 

Peter Flynn

There is a useful GPL'd tool called lxprintf, part of the LTXML2 package
from Edinburgh. This reads an XML file, extracts specific nodes
(elements, attributes) and then outputs values you specify in XPath
notation, formatted with a printf-like specification.

To re-use Frank's example:

<book>
<references>This if ref #1</references>
<references>This if ref #2</references>
<references>This if ref #3</references>

<title>Book Title One</title>
<author>Joe Blog</author>
<price>10.50</price>
</book>

$ lxprintf -e 'references' "%s\n" '.' test.xml
This if ref #1
This if ref #2
This if ref #3

Or perhaps

$ lxprintf -e book "%s/%s/$%s\n" author title price test.xml
Joe Blog/Book Title One/$10.50
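If LTXML2 is not to hand, roughly the same effect can be approximated with Python's standard library. The helper below is only a sketch: `xml_printf` is a made-up name, it supports just the simple child-element paths used in the second command above, and it makes no attempt to match lxprintf's full option set.

```python
import xml.etree.ElementTree as ET


def xml_printf(xml_text, element, fmt, *paths):
    """For each <element>, format the text of the given child paths with fmt."""
    root = ET.fromstring(xml_text)
    pieces = []
    for match in root.iter(element):   # iter() also visits the root itself
        values = tuple(match.findtext(p, default="") for p in paths)
        pieces.append(fmt % values)
    return "".join(pieces)
```

Called as `xml_printf(text, "book", "%s/%s/$%s\n", "author", "title", "price")` on the sample document, it returns the same `Joe Blog/Book Title One/$10.50` line.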

///Peter (followups reset to c.t.x)
 
