Size of XML-Document

Vollerthun

Hi guys.
I'm trying to figure out how many bytes a given XML document
(org.w3c.dom.Document) occupies.

To be more specific: the final XML document is generated, but it
basically consists of several XML files that all live on the hard disk.
The resulting file is sent to a client that needs to know the size of
the document (or an upper bound on it) beforehand, in order to allocate
memory or something like that.

At the moment I simply add up the sizes of every single file I read,
but that's not the elegant way, is it?

What happens if I start to generate XML-Documents in memory rather
than reading "real" files? I'll tell you what: I'm having a #@%&#
problem. I'd have to serialize that document to disk, find out the
size of that document and add it to the overall size. (Mark it for
deletion afterwards)


I can't seem to find any meaningful Google results, either on the web
or in the groups. Same at java.sun.com.

If you have any idea where I could keep searching, or perhaps have
a hint on how to proceed, I'd be very pleased.

tom
 
Ben_

What happens if I start to generate XML-Documents in memory rather
than reading "real" files?
With Xerces, you can serialize DOM to memory (to a StringWriter for example,
but also any Writer or OutputStream, in fact). See
http://xml.apache.org/xerces-j/apiDocs/org/apache/xml/serialize/XMLSerializer.html.
Then you can get the length.

See also, http://java.sun.com/xml/jaxp/faq.html:
"
Q. How do I output/marshal/serialize a DOM tree into a stream?
Currently, there is only one way to do this using JAXP but it requires using
the transform component. See the list of implementations. This note may also
be useful.
Note there are several implementation-dependent ways of doing this such as
using the org.apache.xml.serialize package in Xerces or the
XmlDocument.write(OutputStream) method in Crimson, but this ties your
application to particular parsers and is thus non-portable.
In the future, DOM Level 3 should also provide this feature and it will
likely be incorporated into a future version of JAXP.
[ This page was last updated Apr-12-2003 ]
"
 
Sudsy

Vollerthun said:
What happens if I start to generate XML-Documents in memory rather
than reading "real" files? I'll tell you what: I'm having a #@%&#
problem. I'd have to serialize that document to disk, find out the
size of that document and add it to the overall size. (Mark it for
deletion afterwards)

You could always write it to a ByteArrayOutputStream and then use
the size() method. Or you could write your own OutputStream which
would merely swallow the data while counting the bytes. Something
like this, perhaps?

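import java.io.OutputStream;

// Swallows everything written to it and only counts the bytes.
public class CountOutputStream extends OutputStream {
    private long count = 0L;

    public void write( int b ) {
        count++;
    }

    public void write( byte[] b ) {
        count += b.length;
    }

    public void write( byte[] b, int off, int len ) {
        count += len;
    }

    // Total number of bytes written so far.
    public long size() {
        return count;
    }
}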

You gets what you pay for, right?
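
For what it's worth, here is a minimal sketch of how such a counting stream might be plugged into the JAXP identity transform, next to the ByteArrayOutputStream variant for comparison. The class name CountingDemo, the helper methods, and the sample document content are made up for illustration. It assumes a JAXP implementation is available on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class CountingDemo {

    /** Discards everything written to it and only keeps a running byte count. */
    static class CountOutputStream extends OutputStream {
        private long count = 0L;

        public void write(int b) {
            count++;
        }

        // write(byte[]) in OutputStream delegates to this method,
        // so it does not need its own override.
        public void write(byte[] b, int off, int len) {
            count += len;
        }

        public long size() {
            return count;
        }
    }

    /** Serializes the document into the given stream with an identity transform. */
    static void serialize(Document doc, OutputStream out) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
    }

    /** Builds a tiny in-memory document (hypothetical sample content). */
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("order");
        root.setAttribute("id", "42");
        root.appendChild(doc.createTextNode("three widgets"));
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = sampleDocument();

        // Variant 1: count the bytes without keeping them.
        CountOutputStream counter = new CountOutputStream();
        serialize(doc, counter);

        // Variant 2: keep the bytes in memory and ask for the size.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        serialize(doc, buffer);

        System.out.println("counted:  " + counter.size() + " bytes");
        System.out.println("buffered: " + buffer.size() + " bytes");
    }
}
```

The counting variant never holds the serialized document in memory, which may matter for large documents. The buffered variant is handy if the same bytes are sent to the client right afterwards.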
 
Vollerthun

Yeeha, that's it.
Actually it was Ben who put me on the right track, but I kinda like
Sudsy's idea of rewriting the outputstream as well. I didn't know it
could be made that easy.

Anyway, I have to stay away from Xerces (I must stay backwards
compatible with 1.3 :( ), but the javax.xml.transform.Transformer that I
took instead uses a Result for the transformation.

Now I'm using the Transformer together with a StringWriter in a
StreamResult as Ben said and that perfectly does the job for me.
Sometimes it helps if you read the API, but I must have read over that
Writer-Constructor of the StreamResult a dozen times without seeing
it.
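
A rough sketch of the Transformer plus StringWriter approach described above, in case it helps anyone finding this thread later. The class name StringWriterSizeDemo, the helper names, and the sample content are hypothetical. One thing to keep in mind: the length of the resulting String is a count of characters, not bytes, so the sketch also encodes the string to get a byte count. UTF-8 is an assumption here; the encoding the client actually receives is what counts.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StringWriterSizeDemo {

    /** Serializes a DOM document to a String via the JAXP identity transform. */
    static String toXmlString(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    /** Number of bytes the text occupies once encoded as UTF-8 (assumed encoding). */
    static int utf8Size(String xml) throws Exception {
        return xml.getBytes("UTF-8").length;
    }

    /** Builds a tiny in-memory document with non-ASCII text (hypothetical content). */
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("city");
        root.appendChild(doc.createTextNode("K\u00f6ln"));
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        String xml = toXmlString(sampleDocument());

        // Character count and byte count can differ for non-ASCII content.
        System.out.println("characters:  " + xml.length());
        System.out.println("UTF-8 bytes: " + utf8Size(xml));
    }
}
```

If the byte count is all that is needed, handing the StreamResult an OutputStream instead of a Writer lets the serializer do the encoding and avoids the extra conversion step.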

Thank you both for your advice.
(I think I'll try and subclass an OutputStream anyway, just for the fun
of it. Thanks, Sudsy ;)

tom
 
