Re: reading a file that's actively being written to


Gordon Beaton

I have this app where I try to ftp to a box and read a file that is
actively being written to. After reading the first forty or fifty
lines, a call to in.available() on the InputStream returns 0, even
though the file being accessed is still being written to. I am
wondering if there is a way of making the InputStream realize that
there is more data to be read.

Not really.

When you open the file, data structures (in the kernel or in libc, I'm
really not sure which) are initialized with the size of the file at
that particular moment. That information doesn't change even though
the file does, so you will get EOF at *that* point and you can't read
past it without closing the file and reopening it.

To continue reading from a known point that isn't the start of the
file, you should use a RandomAccessFile instead. Make a note of how
far you got, close and reopen the file, then seek to that position to
begin reading.

BTW your method of testing for EOF is flawed. If you have read the
last byte of the file, then all subsequent calls to available() will
return 0, "read" will never be assigned -1 and you will continue to
loop forever. In general, using available() makes it more difficult to
read from a stream because of that, and it gains you nothing. Simply
call read(). It will wait until there is something to read and return
EOF (-1) when the end of the stream has been reached. There will no
longer be any need for Thread.sleep() either.
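
In other words, something like this (just a sketch; the stream could
be the FTP data connection or a FileInputStream):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    public class StreamCopy {
        // Read until read() returns -1 (end of stream). read() blocks while the
        // stream is open but has no data yet, so no available()/Thread.sleep()
        // polling is needed.
        public static void copy(InputStream in, OutputStream out) throws IOException {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }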

I wonder if you've considered how you intend to determine when the
file has finally stopped growing (EOF won't look any different).

/gordon
 

Gordon Beaton

Thanks Gordon, I thought something like that might have been the
case. Your explanation helped me understand this a lot better. As
for the RandomAccessFile, I am not sure I have that option since
these are logs that are being generated and written to all day long.

From your last sentence I can't see what should prevent the
RandomAccessFile from working. The difference between a
RandomAccessFile and a FileInputStream is really only that the
RandomAccessFile lets you jump forward to a specific location.

Anyway perhaps another solution will work (depending on your
platform):

Have whatever is creating the logfile write to a named pipe instead. A
named pipe looks like a file anyway to the program, so it shouldn't
mind.

Your application can read directly from the pipe (using a
FileInputStream and InputStreamReader) then write to the "real"
logfile itself. That way you have no "false EOF" issues to deal with.
You will see everything the other program writes to the log, and you
will get EOF only when it has closed the file.
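
A rough sketch of the reading side (the FIFO and logfile paths are
just examples):

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class PipeTap {
        public static void main(String[] args) throws IOException {
            // example paths: the FIFO the other program writes to, and the real logfile
            FileInputStream pipe = new FileInputStream("/var/log/app.pipe");
            FileOutputStream log = new FileOutputStream("/var/log/app.log", true);
            try {
                byte[] buf = new byte[4096];
                int n;
                // read() blocks while the writer keeps the pipe open, and returns -1
                // only after the writer closes it, so there is no false EOF
                while ((n = pipe.read(buf)) != -1) {
                    log.write(buf, 0, n);           // copy everything to the "real" logfile
                    // ...process the same bytes here...
                }
            } finally {
                pipe.close();
                log.close();
            }
        }
    }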

/gordon
 

Bryan Castillo

Gordon Beaton said:
Not really.

The problem is most likely on the FTP server side. The FTP server is
the one that reaches the end of the file and sends the EOF. There is
no way (that I know of) to force the FTP server to keep looking for
new data. However, some servers support file resume, where the byte
offset for the transfer can be specified. You might be able to use
this together with a SIZE command, querying the FTP server every so
often. Jakarta's Commons Net FTPClient does have a setRestartOffset
method. I don't know which FTP library the OP was using.
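
Something along these lines might work with Commons Net (an untested
sketch; host, login and path are placeholders, and the remote size is
taken from listFiles() here rather than a raw SIZE command):

    import java.io.InputStream;
    import org.apache.commons.net.ftp.FTP;
    import org.apache.commons.net.ftp.FTPClient;
    import org.apache.commons.net.ftp.FTPFile;

    public class FtpTail {
        public static void main(String[] args) throws Exception {
            long offset = 0;                            // bytes transferred so far
            FTPClient ftp = new FTPClient();
            ftp.connect("ftp.example.com");             // placeholder host and login
            ftp.login("user", "secret");
            ftp.enterLocalPassiveMode();
            ftp.setFileType(FTP.BINARY_FILE_TYPE);

            while (true) {
                FTPFile[] files = ftp.listFiles("/logs/app.log");
                if (files.length > 0 && files[0].getSize() > offset) {
                    ftp.setRestartOffset(offset);       // resume (REST) from where we stopped
                    InputStream in = ftp.retrieveFileStream("/logs/app.log");
                    if (in != null) {
                        byte[] buf = new byte[4096];
                        int n;
                        while ((n = in.read(buf)) != -1) {
                            System.out.write(buf, 0, n); // process the new bytes
                            offset += n;
                        }
                        in.close();
                        ftp.completePendingCommand();    // finish the RETR transfer
                    }
                }
                Thread.sleep(5000);                     // poll every few seconds
            }
        }
    }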
When you open the file, data structures (in the kernel or in libc, I'm
really not sure which) are initialized with the size of the file at
that particular moment. That information doesn't change even though
the file does, so you will get EOF at *that* point and you can't read
past it without closing the file and reopening it.

In C, Python and Perl you don't have to close and reopen the file.
You can use seek. If you seek 0 bytes from the current position, EOF
is reset. I thought you could use the mark and reset methods of
InputStream to do this. Unfortunately, FileInputStream does not
support mark and reset. Isn't there a way to do this in Java?
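
The closest thing I can think of is to keep a RandomAccessFile open
and retry after hitting -1, with a "seek 0 from current" like the
C/Perl idiom (sketch only; I haven't verified whether the extra seek
is actually required in Java, and the file name is a placeholder):

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class ReReadAfterEof {
        public static void main(String[] args) throws IOException, InterruptedException {
            RandomAccessFile raf = new RandomAccessFile("growing.log", "r");
            byte[] buf = new byte[4096];
            while (true) {
                int n = raf.read(buf);
                if (n == -1) {
                    raf.seek(raf.getFilePointer());   // "seek 0 from current", as in C/Perl
                    Thread.sleep(1000);               // wait for the file to grow
                } else {
                    System.out.write(buf, 0, n);      // new data picked up without reopening
                }
            }
        }
    }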
To continue reading from a known point that isn't the start of the
file, you should use a RandomAccessFile instead. Make a note of how
far you got, close and reopen the file, then seek to that position to
begin reading.

Unfortunately, the code is executing on the client, retrieving the
file via FTP.
BTW your method of testing for EOF is flawed. If you have read the
last byte of the file, then all subsequent calls to available() will
return 0, "read" will never be assigned -1 and you will continue to

http://java.sun.com/j2se/1.4.1/docs/api/java/io/InputStream.html#read(byte[])

Returns:
the total number of bytes read into the buffer, or -1 if there is no
more data because the end of the stream has been reached.

loop forever. In general, using available() makes it more difficult to
read from a stream because of that, and it gains you nothing. Simply
call read(). It will wait until there is something to read and return
EOF (-1) when the end of the stream has been reached. There will no
longer be any need for Thread.sleep() either.

I wonder if you've considered how you intend to determine when the
file has finally stopped growing (EOF won't look any different).

/gordon


In another thread you mentioned using a named pipe. I tried using a
FIFO (Unix mkfifo command) to create a named pipe. I then tried to
get the file via FTP and received a 550 error "not a plain file".
Perhaps some FTP servers support this?

I think you were assuming the InputStream was a FileInputStream,
though, instead of a TCP connection?
 
