Making File.open work on gzipped files

Martin Hansen

Hello all,

I am trying to write a class for a file parser with a class method for
opening files for reading:

class Parser
  def self.open(*args)
    ios = File.open(*args)
    parser = self.new(ios)

    if block_given?
      begin
        yield parser
      ensure
        ios.close
      end

      return true
    else
      return parser
    end
  end
end


This works nicely, but I would like it to work on gzipped files too.

I was thinking about checking the file type with a system call, e.g.
`file`.match("gzip"), and if that matches, possibly using popen with
"|gzip -f". But I have no idea how to get that working in this block
context.

Cheers,


Martin
 
Brian Candler

Martin said:
This works nicely, but I would like it to work on gzipped files too.

Ruby's Zlib library should do that nicely for you.

But if you want to do it using the external gzip program, then

ios = IO.popen("gzip -dc '#{filename}'")

should be all you need - plus a bit of checking that filename doesn't
include a single quote.
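
One way to sidestep the quoting check, assuming Ruby 1.9 or later and a gzip binary on the PATH, is to pass the command to IO.popen as an array so that no shell is involved. This is only a sketch; the helper name read_gzipped_via_gzip is made up for illustration.

```ruby
# Sketch: decompress through the external gzip program without a shell.
# Passing an array to IO.popen (Ruby 1.9+) hands each element to gzip as a
# separate argument, so quotes or spaces in the filename need no escaping.
# `read_gzipped_via_gzip` is a hypothetical helper name.
def read_gzipped_via_gzip(filename)
  IO.popen(["gzip", "-dc", filename]) do |io|
    io.read
  end
end
```

The block form of IO.popen closes the pipe when the block finishes, which fits the open-with-block pattern from the first post.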
 
Martin Hansen

Thanks Brian,

I did look at ruby's zlib and wondered why there is no method to check
if a file is gzipped or not - one could perhaps fix something by
rescuing the Zlib exception?

Looking at the docs there are a couple of TODOs. Perhaps this is another
one.


Martin
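
For what it's worth, a check can be sketched by hand: gzip data starts with the two magic bytes 0x1f 0x8b. The helper name gzipped? below is made up for illustration and is not part of Zlib, and it only inspects the header, so it does not prove the rest of the file is valid.

```ruby
# Sketch: detect gzip data by its two-byte magic number (0x1f 0x8b).
# `gzipped?` is a hypothetical helper, not a Zlib method.
GZIP_MAGIC = [0x1f, 0x8b].freeze

def gzipped?(path)
  File.open(path, "rb") do |f|
    header = f.read(2)   # nil when the file is empty
    !header.nil? && header.bytes.to_a == GZIP_MAGIC
  end
end
```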
 
Robert Klemme

Martin said:
I did look at ruby's zlib and wondered why there is no method to check
if a file is gzipped or not - one could perhaps fix something by
rescuing the Zlib exception?

Exactly, just try to open with GzipReader, and if that raises, just work
with the regular file which you have already opened.

Martin said:
Looking at the docs there are a couple of TODOs. Perhaps this is another
one.

Which TODO do you mean?

Cheers

robert
 
Brian Candler

Robert said:
Exactly, just try to open with GzipReader and if that throws just work
with the regular file which you have opened already.

Remember to rewind it too.

Zlib::GzipFile::Error: not in gzip format
        from (irb):3:in `initialize'
        from (irb):3:in `new'
        from (irb):3
        from :0
=> "root:x:0:0:root:/root:/bin/bash\n"
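
The try-then-rewind idea could look roughly like this. It is a minimal sketch; the helper name open_maybe_gzipped is made up for illustration.

```ruby
require 'zlib'

# Sketch: return a GzipReader for gzipped input, else the plain File.
# `open_maybe_gzipped` is a hypothetical helper name.
def open_maybe_gzipped(path)
  file = File.open(path, "rb")
  begin
    Zlib::GzipReader.new(file)
  rescue Zlib::GzipFile::Error
    # GzipReader already consumed bytes while looking for the gzip header,
    # so rewind before handing the plain file back.
    file.rewind
    file
  end
end
```

Without the rewind, the first read from the plain file would start after the bytes GzipReader had already consumed.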
 
Martin Hansen

Thanks Brian and Robert. The below snippet appears to be working nicely
- though I am not sure that the file is closed if zipped?



require 'zlib'

class Parser
  def self.open(*args)
    ios = File.open(*args)

    begin
      ios = Zlib::GzipReader.new(ios)
    rescue Zlib::GzipFile::Error
      # Not gzipped: keep the plain File, rewound to the start.
      ios.rewind
    end

    parse = self.new(ios)

    if block_given?
      begin
        yield parse
      ensure
        ios.close
      end

      return true
    else
      return parse
    end
  end
end
 
Robert Klemme

2010/8/18 Martin Hansen said:
Thanks Brian and Robert. The below snippet appears to be working nicely
- though I am not sure that the file is closed if zipped?



class Parser
  def self.open(*args)
    ios = File.open(*args)

    begin
      ios = Zlib::GzipReader.new(ios)
    rescue
      ios.rewind
    end

    parse = self.new(ios)

    if block_given?
      begin
        yield parse
      ensure
        ios.close
      end

      return true
    else
      return parse
    end
  end
end

I would apply these changes:

1. refactor opening code (everything before "if block_given?") into a
separate method which returns either IO or GzipReader.

2. fold parse and ios into one (i.e. the value returned from the other
method).

See http://ruby-doc.org/core/classes/Zlib/GzipFile.html#method-M007448

3. In case of block_given? do not return true but rather nothing (i.e.
what the block returned). This is more flexible.

That way your code will become simpler.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
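
One possible reading of the three suggestions above, sketched under some assumptions: the private helper name open_stream is made up, and Parser.open here yields the stream itself rather than a separate parser object.

```ruby
require 'zlib'

class Parser
  # With a block: yields the stream, closes it afterwards, and returns
  # whatever the block returned. Without a block: returns the open stream.
  def self.open(*args)
    ios = open_stream(*args)
    return ios unless block_given?

    begin
      yield ios
    ensure
      ios.close
    end
  end

  # Returns a Zlib::GzipReader for gzipped files, else the plain File.
  # `open_stream` is a hypothetical helper name.
  def self.open_stream(*args)
    file = File.open(*args)
    Zlib::GzipReader.new(file)
  rescue Zlib::GzipFile::Error
    file.rewind
    file
  end
  private_class_method :open_stream
end
```

Because the ensure clause does not change the value of the begin block, Parser.open(path) { |io| io.read } returns the file contents rather than true.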
 
Brian Candler

Martin said:
Thanks Brian and Robert. The below snippet appears to be working nicely
- though I am not sure that the file is closed if zipped?

Good question, but the documentation for GzipReader#close gives you the
answer:

"Closes the GzipFile object. This method calls close method of the
associated IO object. Returns the associated IO object."
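
A quick way to see this for yourself is sketched below. It builds a throwaway gzipped file with Tempfile, and also shows GzipFile#finish, which closes the reader without closing the underlying IO; the variable names are only for illustration.

```ruby
require 'zlib'
require 'tempfile'

# Build a small gzipped file to experiment with.
tmp = Tempfile.new(["demo", ".gz"])
tmp.close
Zlib::GzipWriter.open(tmp.path) { |gz| gz.write("hello\n") }

file = File.open(tmp.path, "rb")
gz   = Zlib::GzipReader.new(file)
gz.read
gz.close                           # also closes the wrapped File
closed_by_close = file.closed?

file = File.open(tmp.path, "rb")
gz   = Zlib::GzipReader.new(file)
gz.read
gz.finish                          # closes the reader, leaves the File open
closed_by_finish = file.closed?
file.close
```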
 
Robert Klemme

2010/8/18 Brian Candler said:
Good question, but the documentation for GzipReader#close gives you the
answer:

"Closes the GzipFile object. This method calls close method of the
associated IO object. Returns the associated IO object."

That's why I suggested my item 1. :)

Cheers

robert
 
Martin Hansen

I must be missing something :o/

#<Parse:0x000001008a5920>
parse.rb:14:in `open': undefined method `close' for
#<Parse:0x000001008a5920 @io=#<File:/etc/passwd>> (NoMethodError)
from parse.rb:38:in `<main>'

Test code below.


Martin

require 'zlib'
require 'pp'

class Parse
  def self.open(*args)
    ios = self.zopen(*args)

    if block_given?
      begin
        yield ios
      ensure
        ios.close   # ios is a Parse here, and Parse defines no #close
      end
    else
      return ios
    end
  end

  def initialize(io)
    @io = io
  end

  private

  def self.zopen(*args)
    ios = File.open(*args)

    begin
      ios = Zlib::GzipReader.new(ios)
    rescue
      ios.rewind
    end

    self.new(ios)
  end
end

Parse.open("/etc/passwd") do |ios|
  puts ios
end
 
Martin Hansen

Of course! And inserting the close method in the right place inside the
class even makes everything work!

Thanks a zillion guys!


Martin
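
The final code is not shown in the thread. One way to give Parse a close method, assuming the standard Forwardable module is acceptable, is to delegate to the wrapped stream. This is a sketch only; each_line is delegated as well so that the example can read something, and StringIO stands in for the File or GzipReader.

```ruby
require 'forwardable'
require 'stringio'

class Parse
  extend Forwardable
  # Forward #close (and #each_line, for reading) to the wrapped stream.
  def_delegators :@io, :close, :each_line

  def initialize(io)
    @io = io
  end
end

# StringIO stands in for the File or GzipReader used in the thread.
io    = StringIO.new("a\nb\n")
parse = Parse.new(io)
lines = parse.each_line.to_a
parse.close
```

Since GzipReader#close also closes the File it wraps, delegating close this way covers both the gzipped and the plain case.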
 
