What is the best way to edit a file to eliminate a line using Ruby?

  • Thread starter Steve [RubyTalk]

Steve [RubyTalk]

This sounds an easy task, but I'm certain that I'm yet to find the most
elegant solution.

I have a text file which I want to process using Ruby in order to update
it. I want to remove the single line which matches a regexp for which I
have a definition. I'd prefer not to explicitly use temporary files -
however (and this is important) I also don't want to risk losing data
through corruption if the Ruby process is killed unexpectedly... and I
definitely don't want any other process to read a file other than the
one with or without the line I'm deleting.

Is there something in a library which would make this task easy?
 

William James

Steve said:
This sounds an easy task, but I'm certain that I'm yet to find the most
elegant solution.

I have a text file which I want to process using Ruby in order to update
it. I want to remove the single line which matches a regexp for which I
have a definition. I'd prefer not to explicitly use temporary files -
however (and this is important) I also don't want to risk losing data
through corruption if the Ruby process is killed unexpectedly... and I
definitely don't want any other process to read a file other than the
one with or without the line I'm deleting.

Is there something in a library which would make this task easy?

ruby -i.bak -ne 'print if $_ !~ /foo/' stuff.txt
 

Steve [RubyTalk]

William said:
ruby -i.bak -ne 'print if $_ !~ /foo/' stuff.txt
Coo... that's a new one to me... very nifty.

Unfortunately, I misled you... I want to transform a file from within a
CGI script... which means I need to use standard output to generate other
feedback to the user. Is a similar facility available within a Ruby
program, without executing a new ruby process?
 

Mike Fletcher

steve_rubytalk said:
Coo... that's a new one to me... very nifty.

Unfortunately, I misled you... I want to transform a file from within a
CGI script... which means I need to use standard output to generate other
feedback to the user. Is a similar facility available within a Ruby
program, without executing a new ruby process?

File.new( "stuff.txt" ) do | f |
  f.each do |line|
    print unless line =~ /foo/
  end
end

Or if you needed to rewrite it to a different file

File.new( "stuff.txt" ) do | in |
  File.new( "newstuff.txt", "w" ) do |out|
    in.each { | line | out.print line unless line =~ /foo/ }
  end
end
 

William James

Mike said:
File.new( "stuff.txt" ) do | f |
  f.each do |line|
    print unless line =~ /foo/
  end
end

IO.foreach('stuff.txt'){|s| print s unless s =~ /foo/}
 

Jeffrey Schwab

Mike said:
File.new( "stuff.txt" ) do | f |
  f.each do |line|
    print unless line =~ /foo/
  end
end

Or if you needed to rewrite it to a different file

File.new( "stuff.txt" ) do | in |
  File.new( "newstuff.txt", "w" ) do |out|
    in.each { | line | out.print line unless line =~ /foo/ }
  end
end


File.new doesn't take a block. Use File.open. Also, "in" is a keyword,
so the above code produces a syntax error. With fixes:

# Like grep -v.

File.open( "stuff.txt" ) do |input|
  File.open( "newstuff.txt", "w" ) do |output|
    input.each { |line| output.print line unless line =~ /foo/ }
  end
end
 

Steve [RubyTalk]

Mike said:
File.new( "stuff.txt" ) do | in |
  File.new( "newstuff.txt", "w" ) do |out|
    in.each { | line | out.print line unless line =~ /foo/ }
  end
end
That's remarkably similar to my current rough-n-ready approach - the
one I consider inelegant...(N.B. the example above doesn't address the
problem of atomically replacing stuff.txt with newstuff.txt.) I was
thinking that something like this would be preferable:

FileModify.open('stuff.txt') { |mfile| mfile.delete(/foo/) }

Of course, I've just invented FileModify off the top of my head, and I
imagine it being 'transactional' - i.e. any exception arising in the
block would prevent any change to stuff.txt. I'd prefer not to go
around re-inventing the wheel if FileModify (or something similar)
already exists. I don't need it to be desperately scalable or quick -
on the other hand, reliability _is_ a key concern and I'd prefer to use
the neatest possible syntax.
 

zdennis

Steve said:
That's remarkably similar to my current rough-n-ready approach - the
one I consider inelegant...(N.B. the example above doesn't address the
problem of atomically replacing stuff.txt with newstuff.txt.) I was
thinking that something like this would be preferable:

FileModify.open('stuff.txt') { |mfile| mfile.delete(/foo/) }

Of course, I've just invented FileModify off the top of my head, and I
imagine it being 'transactional' - i.e. any exception arising in the
block would prevent any change to stuff.txt. I'd prefer not to go
around re-inventing the wheel if FileModify (or something similar)
already exists. I don't need it to be desperately scalable or quick -
on the other hand, reliability _is_ a key concern and I'd prefer to use
the neatest possible syntax.

I have done some things like:

def write filename, data
  File.open( filename, 'w' ){ |file| file.write data }
end

begin
  data = IO.read( 'stuff.txt' )
  write 'stuff.txt', data.gsub( /^foo(\n|$)/, '' )
rescue
  write 'stuff.txt.orig', data
end


Zach
 

zdennis

zdennis said:
def write filename, data
  File.open( filename, 'w' ){ |file| file.write data }
end

begin
  data = IO.read( 'stuff.txt' )
  write 'stuff.txt', data.gsub( /^foo(\n|$)/, '' )
rescue
  write 'stuff.txt.orig', data
end

You'd probably want the rescue statement to be a "rescue Exception" so you catch any/all errors...


Zach
 

zdennis

Ok, slightly more elegant...

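class File
  # Yields a writable handle and the original contents of the file.
  # If the block raises, the original contents are written back.
  def self.modify filename
    if block_given?
      data = IO.read filename
      begin
        file = File.open filename, 'w'
        yield file, data
      rescue Exception => ex
        # Close the half-written handle first so its buffered output
        # cannot land on top of the restored contents.
        file.close if file && !file.closed?
        File.open( filename, 'w' ){ |f| f.write data }
      ensure
        file.close if file && !file.closed?
      end
    end
  end
end


File.modify( 'stuff.txt' ) do |writable_file, original_file_contents|
  writable_file.write original_file_contents.gsub( /foo(\n|$)/, '' )
end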

Hope this works better...

Zach
 

Steve [RubyTalk]

zdennis said:
Hope this works better...
It will still permanently and irrecoverably lose data if the process
terminates (e.g. a process hard-limit is exceeded, an administrator
kills the process explicitly; or an old-fashioned power-cut etc.) just
after starting to write the updated file... so I wouldn't consider it
sufficiently robust for my purposes.
 

zdennis

Steve said:
It will still permanently and irrecoverably lose data if the process
terminates (e.g. a process hard-limit is exceeded, an administrator
kills the process explicitly; or an old-fashioned power-cut etc.) just
after starting to write the updated file... so I wouldn't consider it
sufficiently robust for my purposes.

You could modify for your needs:

require 'fileutils'

class File
  def self.modify filename
    if block_given?
      data = IO.read filename
      FileUtils.mv filename, "#{filename}.orig"
      begin
        file = File.open filename, 'w'
        yield file, data
      ensure
        file.close unless file.closed?
      end
    end
  end
end

File.modify( 'stuff.txt' ) do |writable_file, original_file_contents|
  writable_file.write original_file_contents.gsub( /foo(\n|$)/, '' )
end

Zach
 

Steve [RubyTalk]

zdennis said:
You'd probably want the rescue statement to be a "rescue Exception" so
you catch any/all errors...
Both versions look dangerous to me.

1. If an exception is raised on opening 'stuff.txt' to read then an
attempt will be made to truncate the file (or to overwrite it with
whatever happened to be in data previously). [This could be avoided by
reading before begin.]

2. If a disk becomes full (or nearly full) during the write operation
then the rescue will likely not be able to write all the unmodified data
back - hence permanently losing valuable information.

I need a more robust approach than this. :)
 

Zach Dennis

Steve said:
zdennis said:
You'd probably want the rescue statement to be a "rescue Exception" so
you catch any/all errors...

Both versions look dangerous to me.

1. If an exception is raised on opening 'stuff.txt' to read then an
attempt will be made to truncate the file (or to overwrite it with
whatever happened to be in data previously). [This could be avoided by
reading before begin.]

2. If a disk becomes full (or nearly full) during the write operation
then the rescue will likely not be able to write all the unmodified data
back - hence permanently losing valuable information.

I need a more robust approach than this. :)

Understood. I don't know the full extent of your issue. It appears you
can run into a lot of possibilities regarding where the *power goes up*.
It could happen during any system process, not just Ruby's.

If this helps lead you to an elegant implementation, great!
Otherwise... maybe it will steer you away from a potential disaster! Good
luck!

Zach
 

Alan Chen

I don't think you can get away without a tempfile and get safe
"in-place" modifications. It looks to me like the best compromise would
be to

- read in the original
- write the modified file to a temp (use Ruby's 'tempfile' which, I
think, should create a temp with secure permissions)
- use the most atomic OS facility you can to move the modified file atop
the original

On many platforms this might map to Ruby's File.rename or FileUtils.mv,
I'm not sure...

HTH,
- alan
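
The three steps above can be sketched roughly as follows. This is a minimal sketch, not an existing library API: the name `delete_matching_lines` and the `.tmp` naming scheme are made up for illustration. It assumes a POSIX filesystem, where `File.rename` onto an existing path replaces it in one step as long as both paths are on the same filesystem, and it creates the temporary file by hand next to the original rather than through 'tempfile'.

```ruby
# Hypothetical helper: rewrite `path` without the lines matching `pattern`,
# then swap the result into place with a single rename.
def delete_matching_lines(path, pattern)
  # Keep the temp file in the same directory so the rename never has to
  # cross a filesystem boundary.
  tmp_path = "#{path}.#{Process.pid}.tmp"
  mode = File.stat(path).mode & 0o777   # carry over the permission bits
  # File::EXCL makes the open fail rather than reuse an existing file.
  tmp = File.open(tmp_path, File::WRONLY | File::CREAT | File::EXCL, mode)
  begin
    File.foreach(path) { |line| tmp.write(line) unless line =~ pattern }
    tmp.flush
    tmp.fsync                            # ask the OS to put the data on disk
    tmp.close
    File.rename(tmp_path, path)          # readers see old or new, never half
  rescue Exception
    # On any failure the original is untouched; just discard the temp file.
    tmp.close unless tmp.closed?
    File.unlink(tmp_path) if File.exist?(tmp_path)
    raise
  end
end
```

Called as `delete_matching_lines('stuff.txt', /foo/)`, this leaves 'stuff.txt' untouched until the complete replacement has been written. Surviving a power cut would arguably also need an fsync of the containing directory, which this sketch leaves out.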
 

Steve [RubyTalk]

Alan said:
I don't think you can get away without a tempfile and get safe
"in-place" modifications. It looks to me like the best compromise would
be to

- read in the original
- write the modified file to a temp (use Ruby's 'tempfile' which, I
think, should create a temp with secure permissions)
- use the most atomic OS facility you can to move the modified file atop
the original

On many platforms this might map to Ruby's File.rename or FileUtils.mv,
I'm not sure...
Yup... that seems pretty reasonable to me too....though I have to say
I'm surprised that I seem to be defining something to do this rather
than just using a library component. It's exactly the sort of thing I'd
have previously been sure someone would have contributed.

Steve
 

Gavin Kistner

Yup... that seems pretty reasonable to me too....though I have to
say I'm surprised that I seem to be defining something to do this
rather than just using a library component. It's exactly the sort
of thing I'd have previously been sure someone would have contributed.

I think most of us have faith that, in general, the computer will not
lose power in the middle of an operation.

Which explains why I had to re-type a half-hour's worth of wiki page
editing when my company's building lost power a few days ago. :)
 
