ARGF.eof? behavior

Mike Kasick

Hi folks,

In Ruby 1.8, I know that:

$ ruby -e 'while !ARGF.eof?; puts ARGF.readline; end' /tmp/foo /tmp/bar

prints every line in /tmp/foo, but not /tmp/bar. However, in Ruby 1.9:

$ ruby1.9 -e 'p ARGF.eof?' /tmp/foo
true

Which means that lines from neither /tmp/foo nor /tmp/bar would be printed
in the first example. Is this an expected change in behavior? Seems to be
consistent for both 1.9.1p0 and the 1.9.2 svn trunk I just compiled.

If it is, it's not that big of a deal, except I'm not sure how to "switch
ARGF" to the next file without calling ARGF.gets or ARGF.readline.

For example, is there a method to complete the following code, so as to
print lines from all files listed in ARGV?

while !ARGV.empty?
  # ARGF.some_method_to_advance_to_next_file

  while !ARGF.eof?
    puts ARGF.readline
  end
end

I know I can use ARGF.each, gets, or readline. However, I'm really calling
a parsing method that takes ARGF as an argument and calls readline on my
behalf. I'd like to be able to distinguish between EOFErrors due to reaching
EOF before parsing (no more data records), and EOFErrors due to reaching
EOF during parsing (an incomplete data record).

Thanks!
 
Heesob Park

Hi,

2009/7/24 Mike Kasick said:
Hi folks,

In Ruby 1.8, I know that:

$ ruby -e 'while !ARGF.eof?; puts ARGF.readline; end' /tmp/foo /tmp/bar

prints every line in /tmp/foo, but not /tmp/bar. However, in Ruby 1.9:

$ ruby1.9 -e 'p ARGF.eof?' /tmp/foo
true

Which means that lines from neither /tmp/foo nor /tmp/bar would be printed
in the first example. Is this an expected change in behavior? Seems to be
consistent for both 1.9.1p0 and the 1.9.2 svn trunk I just compiled.

If it is, it's not that big of a deal, except I'm not sure how to "switch
ARGF" to the next file without calling ARGF.gets or ARGF.readline.

For example, is there a method to complete the following code, so as to
print lines from all files listed in ARGV?

while !ARGV.empty?
  # ARGF.some_method_to_advance_to_next_file

  while !ARGF.eof?
    puts ARGF.readline
  end
end

You can use ARGF.each for both 1.8.x and 1.9.x.

Try
$ ruby -e 'ARGF.each{|l|puts l}' /tmp/foo /tmp/bar

Regards,

Park Heesob
 
Mike Kasick

You can use ARGF.each for both 1.8.x and 1.9.x.

Try
$ ruby -e 'ARGF.each{|l|puts l}' /tmp/foo /tmp/bar

Right, I understand that this works in this particular example. Perhaps
a more in-depth example will illustrate the problem better.

Presume I have a method, "parse", that parses data records from an IO
stream. It looks something like this:

def parse(io)
  first = io.readline
  ... # Code to validate first
  second = io.readline
  ...
  third = io.readline
  ...

  ParsedThing.new(first, second, third)
end

The idea is to call "parse ARGF" only when I know there's data left in
the stream to be parsed. Otherwise if I get an EOFError its meaning is
ambiguous--there could be an incomplete data record (i.e., could parse
"first" and "second", but got an EOF while reading "third"), or there
could be no more records in the file.

ARGF.each isn't going to work since ARGF is being used as an external
iterator by the parse method, and calling ARGF.gets/readline outside the
parse method strips the first line of a record. I'm looking for a
non-destructive file advance operation, if that makes sense.
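
A minimal sketch of the distinction being asked for, using StringIO in place of ARGF so that it runs anywhere. The three-line parse body, parse_all, and the IncompleteRecord class are hypothetical stand-ins, not part of the original code. An eof? check before each call marks the clean stop, so an EOFError raised inside parse can only mean a truncated record.

```ruby
require "stringio"

# Hypothetical error class for a record cut short by end of input.
class IncompleteRecord < StandardError; end

# Hypothetical stand-in for "parse": a record is three consecutive lines.
def parse(io)
  3.times.map { io.readline.chomp }
end

# Returns every complete record; raises IncompleteRecord on a truncated one.
def parse_all(io)
  records = []
  until io.eof?               # EOF seen here: no more records, a clean stop
    begin
      records << parse(io)
    rescue EOFError           # EOF seen here: the record ended early
      raise IncompleteRecord, "input ended in the middle of a record"
    end
  end
  records
end
```

The same loop shape should carry over to ARGF once eof? and the file-advance step behave as expected, which is what the rest of this thread is about.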
 
Mike Kasick

I'm looking for a non-destructive file advance operation, if that makes
sense.

Two things:

- Turns out Ruby 1.9's ARGF.eof? behavior was a bug, now fixed in svn
trunk. A workaround is to call "ARGF.file" (or another ARGF accessor)
before the while loop.

- ARGF.skip is the non-destructive file advance operation that I was
looking for. ARGF.close works too, but can also close $stdin which may
not be preferred. Problem is that neither currently works when followed
by ARGF.eof?. I submitted a patch to fix that to the issue tracker.
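
To illustrate what ARGF.skip does, here is a small sketch against two temporary files. It assumes a current Ruby release; the 1.8.7 and 1.9.1 patchlevels discussed here may behave differently. ARGF.class.new is used to build a private ARGF-style stream so the real ARGV is left alone, and the file names and contents are made up for the example.

```ruby
require "tempfile"

# Two throwaway input files standing in for /tmp/foo and /tmp/bar.
foo = Tempfile.new("foo")
foo.write("foo 1\nfoo 2\n")
foo.close

bar = Tempfile.new("bar")
bar.write("bar 1\n")
bar.close

# A private ARGF-like stream over the two paths (the global ARGV is untouched).
argf = ARGF.class.new(foo.path, bar.path)

first      = argf.readline  # reads the first line of foo
argf.skip                   # abandons the rest of foo without reading it
after_skip = argf.readline  # reads the first line of bar
```

The second line of foo is never returned: skip moves on to the next file rather than consuming lines one at a time.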

Unfortunately this means that the behavior of ARGF with regard to
close/eof/skip changes somewhat between patchlevels on Ruby 1.8.7 & 1.9.1.

To answer my original question (and for the benefit of anyone else looking
to do something similar), the following code includes the appropriate
workarounds to work on, I believe, all 1.8/1.9 versions:

# Print (or whatever) every line from all files listed in ARGV or $stdin.

loop do
  current = ARGF.file

  while !ARGF.eof?
    puts ARGF.readline # Or whatever.
  end

  ARGF.skip.file == current and break
end
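
If depending on ARGF's version-specific behavior is undesirable, one alternative (not what this thread settled on) is to open each path yourself, so that eof? and EOFError always refer to a single, known file. This is only a sketch; each_input is a hypothetical helper name.

```ruby
# Yields one IO per input: each path in turn, or $stdin when no paths are given.
def each_input(paths)
  return yield($stdin) if paths.empty?

  paths.each do |path|
    File.open(path) { |io| yield io }  # the file is closed when the block returns
  end
end

# Usage, assuming the script is run as: ruby script.rb /tmp/foo /tmp/bar
#
#   each_input(ARGV) do |io|
#     puts io.readline until io.eof?
#   end
```

The trade-off is losing ARGF conveniences such as in-place editing, in exchange for behavior that does not vary between patchlevels.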
 
