Weird File Problem - A Detective Story

Chris Parker

Here is a very simple example of the problem:

irb(main):001:0> i = 0
=> 0
irb(main):002:0> File.open("tcpdump/al2ak_contents.dat"){|file|file.each_byte{|byte| i+=1};print i}
794=> nil
irb(main):003:0> File.size("tcpdump/al2ak_contents.dat")
=> 3329
irb(main):004:0> File.open("tcpdump/al2ak_contents.dat"){|file|file.each_byte{|byte|};file.eof?}
=> true

i should be equal to the size of the file, so this is a real
discrepancy. The file actually has 3329 bytes in it: that is what the
OS reports, and opening the file shows roughly that much data.

irb(main):019:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte|};file.pos}
=> 3329

This implies that some bytes are being ignored or that each_byte just
sets pos to file size at the end.

irb(main):028:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i);file.read;}
=> "\r\016\f\021\017\022\023\024\023\022\017\030\030"
irb(main):029:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i);file.read;file.eof}
=> true
irb(main):030:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i);file.read;file.eof;file.pos}
=> 1306
irb(main):031:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i);file.read.length}
=> 13

So we are at eof after reading i bytes, as shown above, but when I seek
to i I can read another 13 bytes before hitting eof again. But look at
the huge change in pos, which didn't go to 3329 this time. Let's try
seeking to i+13:

irb(main):021:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+13);file.read;}
=> ""
irb(main):022:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+13);file.read;file.eof;}
=> true
irb(main):023:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+13);file.read;file.eof;file.pos}
=> 1319

That sort of makes sense. If 1306 were actually the end of file, then
adding 13 to it would produce exactly these results. Adding 14 to i
gives the same results as 13. Adding 15 to i has a different outcome:

irb(main):043:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+15);file.read}
=>
"\030\030#\"\"\"#''''''''''\001\t\010\010\t\n\t\v\t\t\v\016\v\r\v\016\021\016
\016\016\016\021\023\r\r\016\r\r\023\030\021\017\017\017\017\021\030\026\027\024
\024\024\027\026"
irb(main):044:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+15);file.read;file.eof}
=> true
irb(main):045:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+15);file.read;file.eof;file.pos}
=> 1321

So adding 15 to i returns a new string of bytes and gets an eof again
and doesn't move the pos past 1306 + 15.

I am at a loss for explaining what is going on here. Why can't
each_byte get to the true eof? And why can I seek to the true eof (I
didn't show it, but it worked) if each_byte can't make it there?
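
One way to investigate would be to look at the raw bytes around the
offset where each_byte stopped (794 in the session above), reading in
binary mode so that no translation can get in the way. The following
is only a sketch: bytes_around is a hypothetical helper, and the
temporary file is a made-up stand-in for the real capture, containing
a CR LF pair and a 0x1A byte, which some platforms (notably Windows)
may treat specially in text mode.

```ruby
require "tempfile"

# Hypothetical helper: return the raw bytes in the window
# [offset - before, offset + after), read in binary mode.
def bytes_around(path, offset, before: 4, after: 4)
  start = [offset - before, 0].max
  length = offset + after - start
  File.open(path, "rb") do |file|
    file.seek(start)
    # read(n) returns nil when already at end of file
    (file.read(length) || "").bytes
  end
end

# Made-up sample data standing in for the capture file.
sample = Tempfile.new("sample")
sample.binmode
sample.write("abc\r\ndef\x1Aghi".b)
sample.close

# Show two bytes on either side of offset 8 as hex.
window = bytes_around(sample.path, 8, before: 2, after: 2)
puts window.map { |b| format("%02x", b) }.join(" ")
```

Against the real file one would call something like
bytes_around("tcpdump/al2ak_contents.dat", 794) and check whether a
suspicious byte sits at the point where the count stopped.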

Any help is appreciated. I'll try to answer any questions as well.

Sincerely,

Chris Parker
 
William James

Chris said:
Here is a very simple example of the problem:

irb(main):001:0> i = 0
=> 0
irb(main):002:0>
File.open("tcpdump/al2ak_contents.dat"){|file|file.each_byte{|byte|
i+=1};print i}
794=> nil
irb(main):003:0> File.size("tcpdump/al2ak_contents.dat")
=> 3329
irb(main):004:0>
File.open("tcpdump/al2ak_contents.dat"){|file|file.each_byte{|byte|
};file.eof?}
=> true

i should be equal to the size of the file. This is definitely a real
difference. The file actually has 3329 bytes in it. That is what the
OS thinks and something close to that is what opening the file shows.

Is this under windoze?
 
Robert Klemme

I guess this is a binary file. Such files should be opened in binary
mode regardless of platform, even though some platforms make no
distinction between text and binary mode. Please recheck with open
mode "rb" and let us know the outcome.
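
A minimal sketch of that recheck follows. count_bytes_binary is a
hypothetical helper name, and the temporary file is only a made-up
stand-in for the capture file; the point is that a byte count taken
with mode "rb" can be compared directly with File.size.

```ruby
require "tempfile"

# Count bytes by iterating in binary mode ("rb"), so that no newline or
# end-of-file translation is applied on platforms that distinguish
# text mode from binary mode.
def count_bytes_binary(path)
  count = 0
  File.open(path, "rb") { |file| file.each_byte { |_byte| count += 1 } }
  count
end

# Made-up sample data: includes CR LF and 0x1A, bytes that text mode
# may treat specially on some platforms.
sample = Tempfile.new("sample")
sample.binmode
sample.write("abc\r\ndef\x1Aghi".b)
sample.close

# In binary mode the iterated count should agree with the size on disk.
puts count_bytes_binary(sample.path) == File.size(sample.path)
```

For the file in question, the same comparison would be
count_bytes_binary("tcpdump/al2ak_contents.dat") against
File.size("tcpdump/al2ak_contents.dat").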

Kind regards

robert
 
