Performance Issues in Random Access Files

gwlucas

I have an application where I need to read data from random positions
within a file. In this case performance is VERY important. The problem I
am running into is that the RandomAccessFile class appears to have
serious performance issues. Based on a bit of experimentation, I
suspect that it doesn't implement any kind of intrinsic buffering.

Anyway, I can think of five or six ways of working around this
problem, but I can't believe that there isn't a standard solution. As
a rule, I like to stick to "recognized practices of the community"
even when it would be more fun to "roll my own." So... is there a
canonical solution? Would somebody be able to point me in the right
direction?

Background

I have a file containing blocks of data that need to be accessed at
random. I never know which block I am going to need to read next but,
within each block, the data is read in sequence. The pattern of access
is somewhat similar to an old fashioned ISAM file. Thus I would like
to set the file pointer to the beginning of the block, read some
integers, read some doubles, etc, all in sequence. Later, I would jump
to another file position and do the same. Normally, I would accomplish
these sequential reads with a BufferedInputStream. But since I do have
the random access component, that doesn't seem to be an option
(apparently, you can't wrap a BufferedInputStream around a random
access file).

Thanks in advance for your help.

Gary
 
gwlucas

I have an application where I need to read data from random positions
within a file. In this case performance is VERY important. [snip]

An addendum... Strictly for testing purposes, I have implemented a
wrapper around RandomAccessFile following a suggestion posted here by
Tom Anderson on 10 Oct 2002, which makes a class that looks like an
InputStream but makes pass-through calls to RandomAccessFile. It works,
but definitely falls into the kludge category.

I was wondering whether the java.nio classes might be a solution here.
I've never used them and just assumed that they were related to
network I/O, but there does appear to be some stuff related to
FileChannels that might be relevant. Frankly, the whole java.nio API
looks rather unfathomable.

Thanks again.

Gary
 
Roedy Green

I have an application where I need to read data from random positions
within a file. In this case performance is VERY important. The problem I
am running into is that the RandomAccessFile class appears to have
serious performance issues. Based on a bit of experimentation, I
suspect that it doesn't implement any kind of intrinsic buffering.

see http://mindprod.com/jgloss/nio.html

I would try implementing this with nio. If the file is not too
enormous, you can use the memory mapping features to tap into the
system caching.
 
Esmond Pitt

I was wondering whether the java.nio classes might be a solution here.

They are indeed. What you want is FileChannel.map(), then you just deal
directly with a MappedByteBuffer. This only works reasonably for
fixed-length files of course ... but that's much the same for any
file-mapping API.
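
For readers following along, here is a minimal sketch of the FileChannel.map() approach, not a definitive implementation. The class name MappedBlockReader and the block layout (two ints followed by two doubles) are assumptions made up for illustration; the thread does not describe the real file format. A single mapping is limited to Integer.MAX_VALUE bytes, so this only suits files under 2 GB.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical reader: maps the file once, then reads blocks at arbitrary offsets. */
final class MappedBlockReader {

    /** Assumed block layout for illustration only: two ints, then two doubles. */
    static final class Block {
        final int id;
        final int count;
        final double x;
        final double y;

        Block(int id, int count, double x, double y) {
            this.id = id;
            this.count = count;
            this.x = x;
            this.y = y;
        }
    }

    private final MappedByteBuffer buf;

    MappedBlockReader(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map the whole file read-only. The mapping stays valid after the
            // channel is closed. Byte order defaults to big-endian, which matches
            // what DataOutputStream and RandomAccessFile write.
            buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /** Jumps to the block's offset, then reads its fields in sequence. */
    Block readBlock(int blockOffset) {
        buf.position(blockOffset);   // the "seek"
        int id = buf.getInt();       // relative gets advance the position
        int count = buf.getInt();
        double x = buf.getDouble();
        double y = buf.getDouble();
        return new Block(id, count, x, y);
    }
}
```

Because the mapping is created once and reused, repeated jumps cost only a position change, and the operating system's page cache does the buffering. The buffer's position is shared state, so this sketch is not safe for concurrent use without a duplicate() per thread.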
 
gwlucas

Roedy and Esmond,

Thank you both for your help (Roedy, I am a long time admirer of your
contributions to the Java community).

I plan on looking into Java NIO. Roedy's web site provides a link to a
pretty extensive tutorial.

In the meantime, I tried some experiments using Tom Anderson's 2002
suggestion (which pre-dates NIO) of wrapping RandomAccessFile in a
container class that allows it to be wrapped in a BufferedInputStream.

Just using a quick-and-dirty approach, one that makes very wasteful
use of object creation, I achieved a factor of 9 improvement in
speed. It would be interesting to see if it gets better with a more
careful approach. Anyway, it turns out that RandomAccessFile is
a pretty lousy implementation... a big step back from the C language
standard i/o implementation of fread/fwrite that was created in the
1970's.
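
A rough sketch of what such a wrapper might look like follows. This is not Tom Anderson's original code, which is not reproduced in this thread; the class name RafInputStream and the helper openAt are hypothetical names chosen for illustration.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

/** Hypothetical adapter: looks like an InputStream, passes calls through to a RandomAccessFile. */
final class RafInputStream extends InputStream {
    private final RandomAccessFile raf;

    RafInputStream(RandomAccessFile raf) {
        this.raf = raf;
    }

    @Override
    public int read() throws IOException {
        return raf.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        // Bulk reads are what let BufferedInputStream fill its buffer in one call.
        return raf.read(b, off, len);
    }

    // close() is deliberately not overridden, so discarding a stream
    // leaves the underlying file open for the next seek.

    /**
     * Seeks to the given position and returns a fresh buffered view.
     * A new buffer is created per seek because any previously buffered
     * bytes belong to the old file position (this is the wasteful part).
     */
    static DataInputStream openAt(RandomAccessFile raf, long position) throws IOException {
        raf.seek(position);
        return new DataInputStream(new BufferedInputStream(new RafInputStream(raf)));
    }
}
```

After `DataInputStream in = RafInputStream.openAt(raf, offset);`, calls such as `in.readInt()` and `in.readDouble()` are served from the buffer rather than hitting the file one field at a time. Note that the buffer reads ahead, so the file pointer of the RandomAccessFile no longer reflects how much the caller has consumed.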

Gary
 
Esmond Pitt

a pretty lousy implementation... a big step back from the C language
standard i/o implementation of fread/fwrite that was created in the
1970's.

That's because it doesn't share stdio's fundamental problem, i.e. the
user-side buffering, which makes it useless for multi-user I/O. RAF
omits the user-side buffering, which you can add yourself, as you have,
and leaves the resulting multi-user problem up to you too. As shipped,
RAF works multi-user.

Re Java, the bigger mystery to me is why InputStream and OutputStream
are abstract classes instead of interfaces so that RandomAccessFile
could extend them, and/or why there aren't adapters so you could get
buffered streams out of an RAF.
 
Joshua Cranmer

Esmond said:
Re Java, the bigger mystery to me is why InputStream and OutputStream
are abstract classes instead of interfaces so that RandomAccessFile
could extend them?

My guess is so that every implementation wouldn't have to rewrite the
near-functionally equivalent overloaded read()/write() methods.

and/or why aren't there adapters so you could get
buffered streams out of an RAF?

I think part of this was the impetus for NIO. But I haven't really dealt
with this kind of problem, so I can't say.
 
Patricia Shanahan

Joshua said:
My guess is so that every implementation wouldn't have to rewrite the
near-functionally equivalent overloaded read()/write() methods.


I think part of this was the impetus for NIO. But I haven't really dealt
with this kind of problem, so I can't say.

Maybe attack the problem using a byte[] to represent the record? Call
readFully to fill a byte[] the same size as the record. Wrap the byte[]
in a ByteArrayInputStream and a DataInputStream to access the fields in
the record.
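
Something along these lines, as a minimal sketch that assumes the record size is known before the read. The class and method names (RecordReader, readRecord) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper: one seek and one bulk read per record, then in-memory field access. */
final class RecordReader {

    /**
     * Reads recordSize bytes starting at position and returns a DataInputStream
     * over the in-memory copy. Throws EOFException if the file ends before
     * the record is complete (that is how readFully behaves).
     */
    static DataInputStream readRecord(RandomAccessFile raf, long position, int recordSize)
            throws IOException {
        byte[] record = new byte[recordSize];
        raf.seek(position);
        raf.readFully(record); // single bulk read instead of one file access per field
        return new DataInputStream(new ByteArrayInputStream(record));
    }
}
```

The returned stream supports the usual readInt(), readDouble() and so on, all served from the byte[] with no further file access.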

Patricia
 
