Efficient serialization of binary data within XML documents?

Noozer

From what I can find online, the most efficient way to store binary data in
an XML document is to use Base64 encoding. This wastes 25% of the storage
space, since computers store data in 8-bit bytes (or even more when other
code pages are considered).

Is there a more efficient way to store binary data within an XML document?
Perhaps an "8-bit character" code page that contains 256 "real" characters,
and that would have a common base regardless of language or dialect?

I'm trying to figure out a method to store data, such as a music track
with album art, lyrics and custom user data, that still works well with the
XML model.
 
Anthony Jones

You can't place raw binary in an XML file. XML is ultimately a text-based
stream.

First, are you sure the 33% bloat is really a problem? You get massive
amounts of storage for peanuts these days.
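
For what it's worth, the 25% and 33% figures describe the same overhead measured from opposite ends: Base64 turns every 3 input bytes into 4 output characters, so the output is a third larger than the input, and a quarter of the output is overhead. A rough sketch to check the arithmetic (the function name is just for illustration, and it assumes one byte per encoded character, as in ASCII or UTF-8):

```python
import base64
import os


def base64_overhead(n_bytes):
    """Return (encoded length, growth relative to raw, share of output that is overhead)."""
    raw = os.urandom(n_bytes)          # random bytes stand in for media data
    encoded = base64.b64encode(raw)    # ASCII output, one byte per character
    extra = len(encoded) - n_bytes
    growth = extra / n_bytes           # how much bigger the encoded form is
    wasted = extra / len(encoded)      # how much of the encoded form is overhead
    return len(encoded), growth, wasted


if __name__ == "__main__":
    size, growth, wasted = base64_overhead(3_000_000)
    print(size, f"growth={growth:.1%}", f"wasted={wasted:.1%}")
```

If the document is stored in a 16-bit encoding such as UTF-16, each Base64 character takes two bytes and the overhead roughly doubles again.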

If you're concerned about bandwidth usage, that's a different matter.
However, I don't think XML is really a sensible messaging medium to use for
the transmission (or indeed storage) of multimedia content.

Have you considered using a Zip file to contain the media files and an XML
manifest, a la M$'s new Office document formats?
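
A minimal sketch of that idea, using Python's standard zipfile module. The entry names (manifest.xml, media/track.mp3, media/cover.jpg), the element names and the helper functions are made up for illustration; they are not taken from any real document format:

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_package(track_bytes, art_bytes, lyrics):
    """Pack media as raw zip entries plus an XML manifest that names them."""
    manifest = ET.Element("track")
    ET.SubElement(manifest, "audio", href="media/track.mp3")
    ET.SubElement(manifest, "art", href="media/cover.jpg")
    ET.SubElement(manifest, "lyrics").text = lyrics  # text stays in the XML

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.xml", ET.tostring(manifest, encoding="utf-8"))
        zf.writestr("media/track.mp3", track_bytes)  # binary stored as-is, no Base64
        zf.writestr("media/cover.jpg", art_bytes)
    return buf.getvalue()


def read_entry(package, name):
    """Read one entry back out of the in-memory package."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.read(name)
```

The binary parts never pass through a text encoding, so there is no Base64 overhead, and the manifest remains an ordinary XML document.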
 
Joseph Kesselman

(Not crossposted to microsoft.public.xml because -- despite its name --
it does not accept posts from the public, or at least not from that
portion of the public using non-Microsoft tools.)


Anthony said:
However, I don't think XML is really a sensible messaging medium to use for
the transmission (or indeed storage) of multimedia content.

A better solution might be for the XML document to contain a URI that
can be used to retrieve the multimedia content separately -- the same
way (X)HTML currently handles images, and for much the same reasons.
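
As a rough illustration, a consumer could pull the references out of the document and resolve them against the document's own location, much as a browser does with image sources. The element names, the src attribute and the example.com/example.org addresses are all invented for this sketch:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical document: the media stays outside, only URIs are stored.
DOC = """<track title="Example">
  <audio src="media/track.mp3"/>
  <art src="http://example.org/covers/1.jpg"/>
</track>"""


def media_uris(xml_text, base_uri):
    """Map each element carrying a src attribute to its absolute URI."""
    root = ET.fromstring(xml_text)
    return {
        el.tag: urljoin(base_uri, el.get("src"))  # relative URIs resolve against the base
        for el in root.iter()
        if el.get("src")
    }


if __name__ == "__main__":
    print(media_uris(DOC, "http://example.com/music/doc.xml"))
```

Fetching the content is then an ordinary HTTP (or file) request, and the binary data never has to be made XML-safe.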

If you must include binary info within an XML document: Base64 encoding
is simple and effective, available as off-the-shelf code, and
automatically avoids the cases where XML would have trouble representing
the character. Yes, you could come up with a tighter representation
than Base64 which still avoids the problematic characters/ranges, but
the computational overhead would be nontrivial. If you're worried about
bloat, I'd suggest that you compress the data before encoding it (good
practice anyway).
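
A minimal sketch of compress-then-encode, assuming zlib for the compression step (any compressor would do) and a made-up blob element with attributes recording what was done to the payload:

```python
import base64
import zlib
import xml.etree.ElementTree as ET


def embed(payload, tag="blob"):
    """Compress, then Base64-encode, then wrap the result in an XML element."""
    packed = base64.b64encode(zlib.compress(payload, 9)).decode("ascii")
    el = ET.Element(tag, encoding="base64", compression="zlib")
    el.text = packed  # Base64 uses only characters that are safe in XML text
    return ET.tostring(el, encoding="unicode")


def extract(xml_text):
    """Reverse the steps: parse, Base64-decode, decompress."""
    el = ET.fromstring(xml_text)
    return zlib.decompress(base64.b64decode(el.text))


if __name__ == "__main__":
    data = b"example payload " * 1000
    doc = embed(data)
    print(len(data), len(doc), extract(doc) == data)
```

Keep in mind that formats which are already compressed, such as MP3 or JPEG, will shrink very little, so for those the Base64 overhead remains more or less in full.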
 
