convenient way to read text file multiple times without reopening it

Tomas Mikula

Hi,

I suppose that a convenient way to read a text file is through FileReader.
But FileReader does not support reset().
So, as a temporary workaround, I am opening the file twice:

Reader fr = new FileReader("filename");
......
fr.close();
......
fr = new FileReader("filename");
......


Of course, this is not a clean approach. For one thing, the file could be
removed or renamed by another program between the two calls to "new
FileReader()". So I want to open the file once and then read it
(sequentially) twice.
I suppose Java provides a convenient way to do this, but I have been unable
to figure it out.

Thank you for pointing me the right way!

Tomas
 
Matt Humphrey

On some OS's there's no guarantee the file will be locked while you are
reading it so if you are seriously concerned about preventing the file from
changing you'll have to know more about how your OS handles that problem.
Java NIO includes file locking, but how it works is up to the OS.
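
As a rough illustration of the NIO locking mentioned above, the following sketch takes a shared (read) lock on the whole file before reading it. The class and method names are made up for this example, and whether other programs actually respect the lock depends on the OS.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SharedLockSketch { // hypothetical name, for illustration only
    // Holds a shared lock on the whole file while its bytes are read.
    // Enforcement is OS-dependent: on many systems the lock is advisory.
    static String readWhileLocked(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ);
             FileLock lock = channel.lock(0, Long.MAX_VALUE, true)) {
            ByteBuffer buf = ByteBuffer.allocate((int) channel.size());
            while (buf.hasRemaining() && channel.read(buf) != -1) {
                // keep filling the buffer until it is full or EOF is reached
            }
            return new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("locked-", ".txt");
        Files.write(path, "0123456789".getBytes(StandardCharsets.UTF_8));
        System.out.println(readWhileLocked(path));
        Files.delete(path);
    }
}
```

The lock is released when the try block ends, so it only covers a single pass here; a second pass would have to happen inside the same block.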

Presuming, however, that if you open a file for reading it will naturally be
locked for writing, you can always solve your problem by doing the following
so that the overlapping read locks block out write access.

Reader reader1 = new FileReader ("filename");

// Read as much as you like

Reader reader2 = new FileReader ("filename");
reader1.close ();

// Read it again

reader2.close ();


It may also be possible (I haven't tried this but it's a fairly easy test)
to open your file as a FileInputStream, get the stream's FileChannel and
then reset the position of that channel.

Matt Humphrey http://www.iviz.com/
 
Tomas Mikula

On some OS's there's no guarantee the file will be locked while you are
reading it so if you are seriously concerned about preventing the file from
changing you'll have to know more about how your OS handles that problem.

I am not seriously concerned about it.
I just want to make sure I read the same file.
Presuming, however, that if you open a file for reading it will naturally be
locked for writing, you can always solve your problem by doing the following
so that the overlapping read locks block out write access.

Reader reader1 = new FileReader ("filename");

// Read as much as you like

Reader reader2 = new FileReader ("filename");
reader1.close ();

// Read it again

reader2.close ();

Without locking this will not work (at least not on Unix-like systems), and
if a write lock does not prevent file removal (which requires write access
to the directory, not the file), then even a lock will not help.
Imagine this scenario:
- my program opens the file
- somebody else renames or removes the file (the file will disappear from
the directory, but its open file descriptors still work)
- optionally: another file with the same name is created
- I try to open the file by name again, but either it does not exist or
it is another file


So if I open the file just once, I can be sure I'm still working with the same
file.

It may also be possible (I haven't tried this but it's a fairly easy test)
to open your file as a FileInputStream, get the stream's FileChannel and
then reset the position of that channel.

I just tried it and it works when I am reading from that FileInputStream,
but not when I create InputStreamReader from that FileInputStream.
Here is my code:

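import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.channels.FileChannel;

public class Test { // class name is a placeholder
    public static void main(String[] args) throws IOException {
        FileInputStream is = new FileInputStream("input");
        FileChannel fc = is.getChannel();
        InputStreamReader reader = new InputStreamReader(is);
        for (int j = 0; j < 2; ++j) {
            for (int i = 0; i < 5; ++i) {
                int x = is.read(); /* (X) */
                if (x == -1)
                    System.exit(1); // unexpected end of file
                System.out.print((char) x);
            }
            fc.position(0); // rewind the underlying file to the start
        }
    }
}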

File "input" contains text "0123456789".
When I run this program, it outputs
"0123401234" as expected.

But when I change the line marked with /* (X) */ to
int x = reader.read();
then I get this output:
"0123456789", so the fc.position(0) had no effect.

Maybe there is some buffering in InputStreamReader,
but I would expect that only from BufferedReader.
 
Zig

Without locking this will not work (at least not on Unix-like systems), and
if a write lock does not prevent file removal (which requires write access
to the directory, not the file), then even a lock will not help.
Imagine this scenario:
- my program opens the file
- somebody else renames or removes the file (the file will disappear from
the directory, but its open file descriptors still work)
- optionally: another file with the same name is created
- I try to open the file by name again, but either it does not exist or
it is another file

I vaguely remember that on Windows, when you force deletion of an open
file, its file descriptors are invalidated. Thus, even if you have a
stream created, reads after the point of deletion will throw an
IOException.

If you really need to be able to reproduce a read, you might consider
using File.createTempFile, copying the source to the temp file, and then
just re-reading from the temp file, since the probability of other apps
mucking with temp files might be somewhat lower.
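
A minimal sketch of the temp-file idea, assuming a UTF-8 text file. It uses Files.createTempFile and Files.copy (the NIO counterparts of File.createTempFile), and the class and method names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class TempCopySketch { // hypothetical name, for illustration only
    // Copies the source once, then reads the private copy twice.
    static String readTwiceFromCopy(Path source) throws IOException {
        Path copy = Files.createTempFile("reread-", ".tmp");
        try {
            // createTempFile already created the file, so it must be replaced
            Files.copy(source, copy, StandardCopyOption.REPLACE_EXISTING);
            StringBuilder out = new StringBuilder();
            for (int pass = 0; pass < 2; pass++) {
                try (BufferedReader reader =
                         Files.newBufferedReader(copy, StandardCharsets.UTF_8)) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        out.append(line).append('\n');
                    }
                }
            }
            return out.toString();
        } finally {
            Files.deleteIfExists(copy);
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("source-", ".txt");
        Files.write(source, "0123456789\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(readTwiceFromCopy(source));
        Files.deleteIfExists(source);
    }
}
```

Note that the copy itself is still a single read of the source, so it does not protect against the source changing while it is being copied.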
It may also be possible (I haven't tried this but it's a fairly easy
test)
to open your file as a FileInputStream, get the stream's FileChannel and
then reset the position of that channel.

I just tried it and it works when I am reading from that FileInputStream,
but not when I create InputStreamReader from that FileInputStream.
Here is my code:

public static void main(String[] args) throws IOException {
    FileInputStream is = new FileInputStream("input");
    FileChannel fc = is.getChannel();
    InputStreamReader reader = new InputStreamReader(is);
    for (int j = 0; j < 2; ++j) {
        for (int i = 0; i < 5; ++i) {
            int x = is.read(); /* (X) */
            if (x == -1)
                System.exit(1);
            System.out.print((char) x);
        }
        fc.position(0);
    }
}

File "input" contains text "0123456789".
When I run this program, it outputs
"0123401234" as expected.

But when I change the line marked with /* (X) */ to
int x = reader.read();
then I get this output:
"0123456789", so the fc.position(0) had no effect.

Maybe there is some buffering in InputStreamReader,
but that I would expect only from BufferedReader.

Think of InputStreamReader as a bridge from byte[] to char[].
BufferedReader provides the decoded char buffer (char[]), but the
InputStreamReader is still responsible for the binary reading, and thus it
does maintain its own private byte[] buffer.

When it comes to streams, once you wrap one stream with another stream,
any further direct changes to the underlying stream (without going through
the outer stream first) usually result in "undefined" behavior of the
outer stream. Thus, it's usually best to avoid making such changes.
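
The read-ahead can be observed directly. The sketch below (class and method names are invented for the example) reads a single char through an InputStreamReader and then asks the channel where it is. On a small file the reported position is normally the file length rather than 1, because the reader has already pulled the bytes into its private buffer.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadAheadSketch { // hypothetical name, for illustration only
    // Returns the channel position after one char was read through a Reader.
    static long positionAfterOneChar(Path path) throws IOException {
        try (FileInputStream in = new FileInputStream(path.toFile())) {
            FileChannel channel = in.getChannel();
            InputStreamReader reader =
                new InputStreamReader(in, StandardCharsets.US_ASCII);
            reader.read(); // consume exactly one char from the reader's view
            return channel.position(); // where the underlying file really is
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("input-", ".txt");
        Files.write(path, "0123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.println(positionAfterOneChar(path));
        Files.delete(path);
    }
}
```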

For your case, if you are using NIO, you can use CharsetDecoder. You
should be able to create your own ByteBuffer and CharBuffer, and use
FileChannel.position + CharsetDecoder.reset to return to a known
state. Alternatively, you might use
java.nio.channels.Channels.newReader( <your FileChannel>, <your
CharsetDecoder> ): just use position + reset on the channel + decoder
and throw away the Reader without closing it, eventually closing the
channel when you are done.
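
A sketch of the Channels.newReader variant, assuming UTF-8 input. Two details differ from the description above and are choices made for this example: the decoder overload of Channels.newReader takes a third argument (minimum buffer capacity, where -1 selects the default), and a fresh decoder is created for each pass instead of resetting one. Class and method names are invented.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelRereadSketch { // hypothetical name, for illustration only
    // Opens the file once and decodes it twice from the same channel.
    static String readTwice(Path path) throws IOException {
        StringBuilder out = new StringBuilder();
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            for (int pass = 0; pass < 2; pass++) {
                channel.position(0); // rewind before handing the channel to a reader
                CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
                // The Reader is deliberately not closed: closing it would
                // close the channel as well.
                Reader reader = Channels.newReader(channel, decoder, -1);
                int c;
                while ((c = reader.read()) != -1) {
                    out.append((char) c);
                }
            }
        } // the channel is closed here, once, after both passes
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("input-", ".txt");
        Files.write(path, "0123456789".getBytes(StandardCharsets.UTF_8));
        System.out.println(readTwice(path));
        Files.delete(path);
    }
}
```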

HTH,

-Zig
 
Mike Schilling

Can you read the whole thing into memory?

Alternatively, you can open it as a RandomAccessFile, which does allow you
to reset the file offset back to 0.
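
A minimal sketch of the RandomAccessFile route, with invented class and method names. One caveat worth stating: RandomAccessFile.readLine() turns each byte into one char and is unbuffered, so this form is only suitable for ASCII/Latin-1 text and small files.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class RandomAccessRereadSketch { // hypothetical name, for illustration only
    // Opens the file once and reads all of its lines twice.
    static List<String> readLinesTwice(String fileName) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile file = new RandomAccessFile(fileName, "r")) {
            for (int pass = 0; pass < 2; pass++) {
                file.seek(0); // reset the file offset back to the start
                String line;
                while ((line = file.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("input-", ".txt");
        Files.write(path, "first\nsecond\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readLinesTwice(path.toString()));
        Files.delete(path);
    }
}
```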
 
Nigel Wade

Tomas said:
Hi,

I suppose that a convenient way to read text file is through FileReader.
But FileReader does not support the reset().

Could you create a FileInputStream, and then create your FileReader from the
FileReader(FileDescriptor) constructor, using FileInputStream.getFD() to get
the FileDescriptor?

FileInputStream also doesn't support reset() (does any file class actually
support a reset() method?), but you can "coerce" it by using NIO, namely
FileInputStream.getChannel().position(0);

Not neat, but at least it keeps the file open.

So, as a temporary workaround, I am opening the file twice:

Reader fr = new FileReader("filename");
.....
fr.close();
.....
fr = new FileReader("filename");
.....


Of course this is not a clean way. At least, the file could possibly be
removed or renamed by another program between the two calls to "new
FileReader()".

Indeed. If that is a possibility then you need to keep the file open.
So I want to open the file once and then read it
(sequentially) twice.
I suppose Java provides a convenient way for doing it, but I'm unable to
figure it out.

I couldn't figure out a convenient way.
Thank you for pointing me the right way!

I wouldn't call this "the right way", more a case of "a way that works".
 
Matt Humphrey

Tomas Mikula said:
I am not seriously concerned about it.
I just want to make sure I read the same file.

Then just use the same file name. If you really think the file might be
deleted or modified while you are reading it, you will have to take some
action to preserve the integrity of what you're reading. I have plenty of
programs where I assume that files will not change or be deleted between
accesses even though it is physically possible for an external program or
idiotic user to do so. Those cases are far too rare to worry about
safeguarding.
Without locking this will not work (at least not on Unix-like systems), and
if a write lock does not prevent file removal (which requires write access
to the directory, not the file), then even a lock will not help.

Yes, I said implicit locking is needed for that example to work.
Imagine this scenario:
- my program opens the file
- somebody else renames or removes the file (the file will disappear from
the directory, but its open file descriptors still work)
- optionally: another file with the same name is created
- I try to open the file by name again, but either it does not exist or
it is another file

If you think this scenario could actually occur and you have a system that
does not honor any locking, I don't know how you will solve the simpler
problem in which you open a file for reading while some other user or
program opens it for writing and proceeds to make changes. That case is not
preventable. You can't even read the file into memory or copy the file to a
secure area because some other person may try to modify it while you are
copying it. The upcoming example demonstrates this.
So if I open the file just once, I can be sure I'm still working with the
same
file.

This is not true if the OS does not honor locking because the file contents
can be changed while you're reading it.

On Windows XP, I used the following program to continuously read a file
slowly while I used notepad to modify the file and eventually delete it.
When I saved new file contents (about 18K of repeating but identifiable
text), the program would output the old contents up to about 8K at which
point it would shift to the new contents. The OS appears to buffer one 8K
block and the file status is checked only when the buffer is refilled. The
program could not tell that the file contents had changed--it saw only a
continuous stream of characters. When I truncated the file, it would simply
pick up EOF immediately and when I deleted the file, the program threw an
exception.

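import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.channels.FileChannel;

public class SlowReader { // class name is a placeholder
    public static void main(String[] args) throws IOException {
        FileInputStream is = new FileInputStream("test.txt");
        FileChannel fc = is.getChannel();

        for (int j = 0; j < 20; ++j) {
            // Fresh reader per pass so no stale buffered bytes carry over
            InputStreamReader reader = new InputStreamReader(is);
            while (true) {
                int x = reader.read();
                if (x == -1) break;
                System.out.print((char) x);
                try {
                    Thread.sleep(10); // read slowly so the file can be edited meanwhile
                } catch (InterruptedException ex) {
                    System.err.println("Oops");
                }
            }
            // The reader is left open: closing it would also close the
            // underlying stream and channel, and fc.position(0) would fail.

            fc.position(0);
            System.out.println("****");
        }
        is.close();
    }
}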
It may also be possible (I haven't tried this but it's a fairly easy
test)
to open your file as a FileInputStream, get the stream's FileChannel and
then reset the position of that channel.

I just tried it and it works when I am reading from that FileInputStream,
but not when I create InputStreamReader from that FileInputStream.
Here is my code:

public static void main(String[] args) throws IOException {
    FileInputStream is = new FileInputStream("input");
    FileChannel fc = is.getChannel();
    InputStreamReader reader = new InputStreamReader(is);
    for (int j = 0; j < 2; ++j) {
        for (int i = 0; i < 5; ++i) {
            int x = is.read(); /* (X) */
            if (x == -1)
                System.exit(1);
            System.out.print((char) x);
        }
        fc.position(0);
    }
}

After you set fc.position(0), reopen a new InputStreamReader--don't reuse
the old one. That is, put the new InputStreamReader inside your first for
loop.
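
Spelled out, that change might look like the sketch below. It collects the output in a string instead of printing directly, and the class and method names are invented for the example. For a file containing "0123456789" it yields "0123401234".

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FreshReaderSketch { // hypothetical name, for illustration only
    // Reads the first five chars twice, creating a new reader for each pass.
    static String firstFiveTwice(String fileName) throws IOException {
        StringBuilder out = new StringBuilder();
        try (FileInputStream is = new FileInputStream(fileName)) {
            FileChannel fc = is.getChannel();
            for (int j = 0; j < 2; ++j) {
                // A new reader starts with an empty buffer, so it sees the
                // bytes at the channel's current position.
                InputStreamReader reader =
                    new InputStreamReader(is, StandardCharsets.US_ASCII);
                for (int i = 0; i < 5; ++i) {
                    int x = reader.read();
                    if (x == -1) break;
                    out.append((char) x);
                }
                fc.position(0); // rewind; the old reader is simply discarded
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("input-", ".txt");
        Files.write(path, "0123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.println(firstFiveTwice(path.toString()));
        Files.delete(path);
    }
}
```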
File "input" contains text "0123456789".
When I run this program, it outputs
"0123401234" as expected.

But when I change the line marked with /* (X) */ to
int x = reader.read();
then I get this output:
"0123456789", so the fc.position(0) had no effect.

Maybe there is some buffering in InputStreamReader,
but that I would expect only from BufferedReader.

The answer to your original question is, yes you can read the same file
twice, but without OS file locking there is no way to ensure that even a
single read will be uncontaminated by external writers. If you can assert
that a single read is correct, read the file into memory or copy it to a
secure location.

Matt Humphrey http://www.iviz.com/
 
Andreas Leitgeb

Tomas Mikula said:
I am not seriously concerned about it.
I just want to make sure I read the same file.

The same file, or the same content?

Between these reads, someone else might append data to the file,
or truncate it, or both, or whatever.

If you want to be sure you get the same *data* twice, then you're
probably best off keeping the content in memory during the first read
and reading from memory the next time.
If you're interested in changes to the file, then why worry about
whether the file itself changed or whether it was replaced with a
different one? Just reopen it by name and see what's there this
time.
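
The in-memory option can be sketched as follows, assuming the file is UTF-8 text and small enough to hold in memory. The class and method names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class InMemoryRereadSketch { // hypothetical name, for illustration only
    // Reads the file once; every later pass works on the in-memory copy.
    static String loadOnce(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    // One sequential pass over the cached text through an ordinary Reader.
    static int countLines(String content) throws IOException {
        int lines = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("input-", ".txt");
        Files.write(path, "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        String content = loadOnce(path);
        Files.delete(path); // the file may even disappear after loading
        System.out.println(countLines(content)); // first pass
        System.out.println(countLines(content)); // second pass, same data
    }
}
```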
 
