Dynamically-sized memory buffer?

Robert Mischke

I'm curious: Is there a common "best practice" to store binary data in
memory, when the amount of data is not known in advance? For example,
think of retrieving data from a URL, and there is nothing like an HTTP
"content-length" field, and you don't want to use a temporary file.

My current idea is providing a container class (let's call it
DynamicMemBuffer) that uses a linked list of fixed-size byte[] buffers
internally. So a new DynamicMemBuffer starts with a single internal
buffer of size n. When n bytes have been added/written to the
DynamicMemBuffer, a new internal buffer is created, added to the
linked list, and writing of new bytes is directed into this buffer.
From the outside, the DynamicMemBuffer looks like a continuous byte
stream held in memory. As an interface, standard getInputStream() and
getOutputStream() methods could be provided.
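For illustration, the chunked design described above could be sketched roughly as follows. This is a minimal sketch, not a tested library class: the method signatures, the lazy allocation of the first chunk, and the use of an ArrayList of chunk references in place of a literal linked list are all assumptions made for the example. It is not thread-safe.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: an append-only byte store made of fixed-size chunks. */
final class DynamicMemBuffer {
    private final int chunkSize;
    // Assumption: an ArrayList of chunk references stands in for the linked list.
    private final List<byte[]> chunks = new ArrayList<>();
    private long size = 0; // total number of bytes written so far

    DynamicMemBuffer(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    long size() {
        return size;
    }

    /** Returns a stream that appends to this buffer; filled chunks are never copied. */
    OutputStream getOutputStream() {
        return new OutputStream() {
            @Override
            public void write(int b) {
                int offset = (int) (size % chunkSize);
                if (offset == 0) {
                    chunks.add(new byte[chunkSize]); // last chunk is full, or none exists yet
                }
                chunks.get(chunks.size() - 1)[offset] = (byte) b;
                size++;
            }

            @Override
            public void write(byte[] src, int off, int len) {
                while (len > 0) {
                    int offset = (int) (size % chunkSize);
                    if (offset == 0) {
                        chunks.add(new byte[chunkSize]);
                    }
                    int n = Math.min(len, chunkSize - offset); // room left in the last chunk
                    System.arraycopy(src, off, chunks.get(chunks.size() - 1), offset, n);
                    size += n;
                    off += n;
                    len -= n;
                }
            }
        };
    }

    /** Returns a stream that reads the stored bytes from the beginning. */
    InputStream getInputStream() {
        return new InputStream() {
            private long pos = 0;

            @Override
            public int read() {
                if (pos >= size) {
                    return -1; // end of stored data
                }
                byte b = chunks.get((int) (pos / chunkSize))[(int) (pos % chunkSize)];
                pos++;
                return b & 0xFF;
            }
        };
    }
}
```

The trade-off in this sketch is that at most one chunk's worth of space is wasted per buffer, and growing never copies data that has already been stored.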

While all of this is quite simple to implement, I'm reluctant to do so
because I can't imagine this problem hasn't been solved a thousand
times before.

So are there better approaches and/or ready-to-use classes out there?


Thanks,
Robert
 
dar7yl

Robert Mischke said:
I'm curious: Is there a common "best practice" to store binary data in
memory, when the amount of data is not known in advance? For example,
think of retrieving data from a URL, and there is nothing like an HTTP
"content-length" field, and you don't want to use a temporary file.

Looks like a ring-buffer to me. Yet another stab at reinventing the wheel.
The buffer classes available in the standard java libraries are sufficient
for most applications. If you wanted absolute control over efficiency, you
would probably be coding in C/C++ or assembly anyway. Start with the
generic, and if that's not enough performance, think about optimization
after you've identified where the bottlenecks really are.

regards,
Dar7yl
 
Robert Mischke

dar7yl said:
Looks like a ring-buffer to me. Yet another stab at reinventing the wheel.
The buffer classes available in the standard java libraries are sufficient
for most applications.

A ring buffer would not do what I need here, as it has to be created
with a fixed capacity. Perhaps choosing the word "buffer" was
misleading. Although what I need is a FIFO buffer, what sets it apart
is that it must have basically unlimited capacity, which a ring buffer
does not provide.

Imagine your input data can be anything from a few bytes to several
megabytes, and you want to store all the input before you start
reading from the buffer at all - for example, if you only want to
process the data *after* you know the transfer completed successfully.

Robert
 
Robert Mischke

Jacob said:
Did you check the java.nio and the java.nio.channels packages?

I know the nio package, but it doesn't meet the requirements here. All
the buffer classes are fixed-capacity, which rules them out; and the
channels are not designed for storage, either.
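To make the fixed-capacity point concrete, here is a small sketch using java.nio.ByteBuffer. The class and method names (FixedCapacityDemo, overflows) are made up for this example; only ByteBuffer.allocate, ByteBuffer.put and BufferOverflowException come from the standard API.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

final class FixedCapacityDemo {
    /** Returns true if writing `count` bytes into a buffer of `capacity` bytes overflows. */
    static boolean overflows(int capacity, int count) {
        ByteBuffer buf = ByteBuffer.allocate(capacity); // capacity is fixed at creation
        try {
            for (int i = 0; i < count; i++) {
                buf.put((byte) i);
            }
            return false;
        } catch (BufferOverflowException e) {
            // The buffer never grows on its own; the caller would have to
            // allocate a larger one and copy the contents over.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows(4, 4)); // false
        System.out.println(overflows(4, 5)); // true
    }
}
```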

What I need is a buffer that will read in ALL of the available input,
as long as memory is available. Blocking the data source (for example,
the URL connection) until the data sink (the application) has consumed
some of the data is not an option here - it must all be kept in memory
as a whole. And as I don't see how any fixed-sized buffer could work,
I'm looking for a tried-and-tested dynamic-size solution.

Robert
 
Andrey Kuznetsov

My current idea is providing a container class (let's call it
DynamicMemBuffer) that uses a linked list of fixed-size byte[] buffers
internally. So a new DynamicMemBuffer starts with a single internal
buffer of size n. When n bytes have been added/written to the
DynamicMemBuffer, a new internal buffer is created, added to the
linked list, and writing of new bytes is directed into this buffer.
From the outside, the DynamicMemBuffer looks like a continuous byte
stream held in memory. As an interface, standard getInputStream() and
getOutputStream() methods could be provided.

UnifiedIO has exactly what you need.

see http://uio.dev.java.net
 
Chris Uppal

Robert said:
Imagine your input data can be anything from a few bytes to several
megabytes, and you want to store all the input before you start
reading from the buffer at all - for example, if you only want to
process the data *after* you know the transfer completed successfully.

If you don't want to treat the thing as a pipe, then a
java.io.ByteArrayWriteStream seems to be the kind of thing you are looking for.

-- chris
 
Robert Mischke

Chris Uppal said:
java.io.ByteArrayWriteStream

Hmm... are you sure this class exists in the standard API? There is no
class of that name in the 5.0 docs, and the only page that comes up in
a Google search for "ByteArrayWriteStream" is

www.pocketsmalltalk.com/doc/reference/ByteArrayWriteStream.html

?

Robert
 
Chris Uppal

Robert said:
Hmm.. are you sure this class exists in the standard API?

Sorry, my typo -- should be java.io.ByteArrayOutputStream.

(Getting confused with Smalltalk class names ;-)

-- chris
 
Robert Mischke

Chris Uppal said:
java.io.ByteArrayOutputStream

Hmm... indeed, I overlooked this one when scanning the Java API.
("This class implements an output stream in which the data is written
into a byte array. The buffer automatically grows as data is written
to it.")

From what is visible in the javadoc, it seems it only uses one
internal byte array, which may lead to a lot of copying and buffer
swapping when the data gets big, which is what I wanted to avoid with
my idea. However, I'll test its performance first, and judge then :)
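As a rough sketch of what such a test could start from, the following reads a stream of unknown length into a java.io.ByteArrayOutputStream. The class name ReadAllDemo, the 8192-byte block size and the in-memory stand-in for the URL stream are assumptions for the example. How the internal array grows is an implementation detail; common JDK implementations enlarge it geometrically, so the copying cost may turn out smaller than feared, but that is worth measuring rather than assuming.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadAllDemo {
    /** Copies everything from `in` into memory; the total size need not be known up front. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // grows as data is written
        byte[] block = new byte[8192]; // assumed transfer block size
        int n;
        while ((n = in.read(block)) != -1) {
            out.write(block, 0, n);
        }
        return out.toByteArray(); // returns a newly allocated, right-sized copy
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a URL connection's input stream.
        InputStream in = new ByteArrayInputStream(new byte[100000]);
        System.out.println(readAll(in).length); // prints 100000
    }
}
```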

Thanks a lot,
Robert
 
Chris Uppal

From what is visible in the javadoc, it seems it only uses one
internal byte array, which may lead to a lot of copying and buffer
swapping when the data gets big,

So create it with a large initial buffer (see the constructors).
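A minimal sketch of that suggestion, using the ByteArrayOutputStream(int size) constructor. The class name PresizedDemo and the 1 MiB figure are hypothetical; the right value depends on the data actually expected.

```java
import java.io.ByteArrayOutputStream;

final class PresizedDemo {
    // Hypothetical guess at a typical payload size; tune for the real workload.
    static final int EXPECTED_SIZE = 1 << 20; // 1 MiB

    /** Creates a stream whose internal array starts out at EXPECTED_SIZE bytes. */
    static ByteArrayOutputStream newBuffer() {
        return new ByteArrayOutputStream(EXPECTED_SIZE);
    }
}
```

The initial size is only a starting point: the stream still grows if more than EXPECTED_SIZE bytes are written, so an underestimate costs some copying rather than causing a failure.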

-- chris
 
Robert Mischke

Chris Uppal said:
So create it with a large initial buffer (see the constructors).

Of course, but this in turn is wasteful when I have to hold lots of
small files in memory. Remember, the premise is that the size of the
data to get is completely unknown - if it were known, I'd create a
byte[size] buffer and be done :)

This "unknown size" dilemma was why I came up with the "linked list of
fixed buffers" idea at all.

Robert
 
Chris Uppal

Robert said:
Of course, but this in turn is wasteful when I have to hold lots of
small files in memory.

A megabyte or so isn't /that/ expensive today. Why worry?

I'm normally opposed to object pooling and re-use, but this sounds as if it
might be a valid application of the idea. So you could use a big
ByteArrayOutputStream, and reset() and re-use it as needed.
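One way the reset-and-reuse idea might look, as a sketch only. The class name ReusableBuffer, the 1 MiB initial size and the 8192-byte block size are assumptions; reset() and toByteArray() are the standard ByteArrayOutputStream methods. The class is not thread-safe, so each thread would need its own instance.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReusableBuffer {
    // One large stream kept for the lifetime of this object (assumed size: 1 MiB).
    private final ByteArrayOutputStream out = new ByteArrayOutputStream(1 << 20);
    private final byte[] block = new byte[8192];

    /**
     * Reads all of `in` and returns a right-sized copy of the data.
     * The large internal array stays allocated for the next call.
     */
    byte[] readAll(InputStream in) throws IOException {
        out.reset(); // discards earlier contents but keeps the allocated array
        int n;
        while ((n = in.read(block)) != -1) {
            out.write(block, 0, n);
        }
        return out.toByteArray();
    }
}
```

With this arrangement only the returned copies are held per file, so many small files do not each carry a large, mostly empty buffer.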

Of course, you could write a custom buffer class instead, but that's not what
you were asking about.

(Come to think of it, you could use a List<ByteArrayOutputStream> and put the
old stream onto the list when it reached a certain size, and start a new one
;-)
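For what it's worth, the List<ByteArrayOutputStream> variant might look something like this. It is a sketch under assumptions: the class name RollingBuffer, the limit parameter and the size() and writeTo() helpers are invented for the example, and a segment can overshoot the limit by the length of a single write.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

final class RollingBuffer extends OutputStream {
    private final int limit; // assumed rollover threshold in bytes
    private final List<ByteArrayOutputStream> full = new ArrayList<>();
    private ByteArrayOutputStream current = new ByteArrayOutputStream();

    RollingBuffer(int limit) {
        this.limit = limit;
    }

    /** Parks the current stream on the list once it has reached the limit. */
    private void rollIfNeeded() {
        if (current.size() >= limit) {
            full.add(current);
            current = new ByteArrayOutputStream();
        }
    }

    @Override
    public void write(int b) {
        rollIfNeeded();
        current.write(b);
    }

    @Override
    public void write(byte[] src, int off, int len) {
        rollIfNeeded();
        current.write(src, off, len); // may overshoot the limit by this one write
    }

    /** Total number of bytes held across all segments. */
    long size() {
        long total = current.size();
        for (ByteArrayOutputStream s : full) {
            total += s.size();
        }
        return total;
    }

    /** Writes all segments, in order, to the given target stream. */
    void writeTo(OutputStream target) throws IOException {
        for (ByteArrayOutputStream s : full) {
            s.writeTo(target);
        }
        current.writeTo(target);
    }
}
```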

-- chris
 
