Open file, get first line, delete first line, close file

Discussion in 'Ruby' started by Richard Schneeman, Aug 23, 2008.

  1. Hey, I'm trying to open a file, get the first line of the file, delete
    that line from the file, and then close the file. Using Ruby 1.8.6.

    I've tried using

    The_File = IO.readlines("public/languages/chinese/practice.txt")

    &

    The_File = open('public/languages/chinese/practice.txt','r+')

    but I can't figure out the correct syntax for deleting a line in a
    file and then saving that file.
    --
    Posted via http://www.ruby-forum.com/.
    Richard Schneeman, Aug 23, 2008
    #1

  2. Richard Schneeman wrote:
    > Hey, i'm trying to open a file, get the first line of the file, delete
    > that line from the file, and then close the file. Using ruby 1.8.6.
    >
    > I've tried using
    >
    > The_File = IO.readlines("public/languages/chinese/practice.txt")
    >
    > &
    >
    > The_File = open('public/languages/chinese/practice.txt','r+')
    >
    > but i can't figure out the correct syntax for how to delete a line in a
    > file, and then save that file.


    The simplest thing is to read it into memory, as you did with readlines
    above, write the altered contents out to a new file, and then move that
    file over the original. (I prefer File.readlines, although it is the same
    method either way.) Large files might need different treatment rather
    than being slurped into memory like that.
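
A minimal sketch of the read, rewrite, and move approach described above. The helper name `shift_first_line` and the `.tmp` suffix are assumptions made for illustration, and `File.write` needs a newer Ruby than the 1.8.6 mentioned in the question.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not from the thread): removes the first line of the
# file at `path` and returns it, or nil if the file is empty.
def shift_first_line(path)
  lines = File.readlines(path)   # slurp the whole file into memory
  first = lines.shift            # take the first line off the array
  tmp = "#{path}.tmp"            # assumed scratch name next to the original
  File.open(tmp, 'w') { |f| f.write(lines.join) }
  FileUtils.mv(tmp, path)        # replace the original with the shorter copy
  first
end

# Demo on a throwaway file.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'practice.txt')
  File.write(path, "one\ntwo\nthree\n")
  puts shift_first_line(path)    # the removed line
  puts File.read(path)           # what is left in the file
end
```

Writing to a second file and moving it into place means an interrupted run leaves either the old file or the new one, not a half-written mix.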

    That being said, I'd be curious to hear how people use r+ mode.

    Daniel
    Daniel Bush, Aug 24, 2008
    #2

  3. That is one option... The reason I need to be able to delete lines is
    that I am running a rake task on a server where I cannot easily modify
    files, and I can't predict when the rake task will time out. The
    task takes entries from text-based dictionaries and adds them to my DB.
    If I didn't delete the lines I've already added, then every time
    I ran the task (it has to be run multiple times due to timeouts) I
    would only re-add the same lines. The deletion acts as a placeholder of
    sorts. I'll play around with your suggestion and let you know. If
    anyone else has an alternate method, I'm all ears.

    Richard Schneeman, Aug 24, 2008
    #3
  4. Richard Schneeman wrote:
    > That is one option...the reason i need to be able to delete lines, is
    > that i am running a rake task on a server, that i cannot easily modify
    > files, and I can't predict how long the rake task will time out. The
    > task takes entries from text based dictionaries and adds it to my DB,
    > the thing is, if i didn't delete the lines i've already added, everytime
    > i ran the task, (it has to be run multiple times due to timeouts) i
    > would only re-add the same lines. The deletion acts as a place holder of
    > sorts. I'll play around with your suggestion, and i'll let you know. If
    > anyone else has an alternate method...i'm all ears


    It's gross, but it works:

    active_dictionary = File.readlines("public/languages/chinese/practice.txt")

    open('public/languages/chinese/practice.txt', 'w') do |file|
      # Write back every line except the first.
      file.puts active_dictionary[1, active_dictionary.size]
    end

    Once again, I'm interested in different approaches, or something not
    quite so processor-intensive.
    Richard Schneeman, Aug 24, 2008
    #4
  5. Richard Schneeman wrote:
    ...
    > Its gross but it works


    I don't know if it's all that gross. I have a feeling this is the
    standard way to do it for apps and editors, i.e. write out the entire
    altered content after working with the file in memory using whatever
    scheme. It's a different story if you're a database, I guess.

    Here is one scheme some text editors use:
    http://en.wikipedia.org/wiki/Gap_buffer
    although it doesn't discuss file system/persistence issues.
    Presumably when you hit the save button, the system writes
    out to a new file (the gap buffer is not playing around with
    the old file stream).

    If you're just replacing stuff character for character, then
    it seems ok to use the file stream (in r+ mode) or if you're
    appending (or both); but deleting or inserting content seems
    problematic - not sure it's possible let alone standardized.
    Anyone want to weigh in here?
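
As a rough illustration of the character-for-character case, here is a sketch that overwrites bytes in place in r+ mode. The helper name `overwrite_at` is made up for this example, and it only makes sense when the replacement has the same byte length as the text it covers, because nothing after it is shifted.

```ruby
require 'tmpdir'

# Hypothetical helper (not from the thread): overwrites `new_text` at byte
# `offset`. The file keeps its length; bytes after the write stay where they are.
def overwrite_at(path, offset, new_text)
  File.open(path, 'r+') do |f|
    f.seek(offset)       # jump to the byte position
    f.write(new_text)    # overwrite in place; nothing is inserted or deleted
  end
end

# Demo on a throwaway file.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'words.txt')
  File.write(path, "cat\ndog\n")
  overwrite_at(path, 4, "pig")   # "dog" starts at byte 4
  print File.read(path)          # the file now holds "cat\npig\n"
end
```

A shorter replacement would leave leftover bytes from the old text, and a longer one would run over the start of the next line, which is why deleting or inserting needs a different approach.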


    > Once again, i'm interested in different approaches, or something, not
    > quite so processor intensive


    If the file is really large, you can perhaps just move through the
    stream until you reach the point where you want to start, then begin
    writing from the old stream to the new file stream.
    There may be ways to optimise it.
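
One way to read the stream-to-stream idea above, as a sketch: skip the leading lines, then copy the rest one line at a time so only a single line is held in memory. The helper name `drop_leading_lines` and the `.tmp` suffix are assumptions for illustration.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not from the thread): drops the first `count` lines
# of the file at `path` by streaming the remainder into a new file.
def drop_leading_lines(path, count = 1)
  tmp = "#{path}.tmp"                            # assumed scratch name
  File.open(path) do |src|
    File.open(tmp, 'w') do |dst|
      count.times { src.gets }                   # read and discard leading lines
      src.each_line { |line| dst.write(line) }   # copy the rest, line by line
    end
  end
  FileUtils.mv(tmp, path)                        # swap the new file into place
end

# Demo on a throwaway file.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'practice.txt')
  File.write(path, "one\ntwo\nthree\nfour\n")
  drop_leading_lines(path, 2)
  puts File.read(path)                           # the lines after the first two
end
```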

    Sparse files and fixed line lengths ? :)
    http://en.wikipedia.org/wiki/Sparse_file

    Maybe I've said enough wrong things to provoke a reaction
    from someone else.

    Daniel
    Daniel Bush, Aug 24, 2008
    #5
  6. Richard Schneeman wrote:
    > but i can't figure out the correct syntax for how to delete a line in a
    > file, and then save that file.


    tail -n +2 the_file > the_new_file

    Not all problems are best solved with ruby :)

    Erik Hollensbe, Aug 24, 2008
    #6
  7. Erik Hollensbe wrote:
    > Not all problems are best solved with ruby :)


    And sometimes the problems are not with the program but with the data
    structure -- maybe a flat file isn't the right way to do things?

    And... it's a lot easier to delete the *last* line of a file than the
    first.

    Just some thoughts. :)
    Dave Bass, Aug 24, 2008
    #7
  8. Dave Bass wrote:
    > And... it's a lot easier to delete the *last* line of a file than the
    > first.


    I don't think this is actually true, can you explain further?

    -Erik
    Erik Hollensbe, Aug 25, 2008
    #8
  9. On Aug 25, 2008, at 4:07 AM, Erik Hollensbe wrote:

    > Dave Bass wrote:
    >> And... it's a lot easier to delete the *last* line of a file than the
    >> first.

    >
    > I don't think this is actually true, can you explain further?
    >
    > -Erik



    You can just truncate the file size. You don't have any subsequent
    lines (bytes) to move into a new position within the file.

    -Rob

    Rob Biedenharn http://agileconsultingllc.com
    Rob Biedenharn, Aug 25, 2008
    #9
  10. Rob Biedenharn wrote:
    > On Aug 25, 2008, at 4:07 AM, Erik Hollensbe wrote:
    >
    >> Dave Bass wrote:
    >>> And... it's a lot easier to delete the *last* line of a file than the
    >>> first.

    >>
    >> I don't think this is actually true, can you explain further?
    >>
    >> -Erik

    >
    >
    > You can just truncate the file size. You don't have any subsequent
    > lines (bytes) to move into a new position within the file.


    How do you know which line is the last line?

    Unless there's something I don't know, that involves reading the whole
    file, or a combination of seek/read from the end until you find the last
    newline, which is essentially what tail +2 does, but starts at the
    beginning of the file.

    "Moving" data in a file is the worst possible scenario for I/O at all.
    You can do both of these operations in a single pass read of the file
    without shoving the whole thing into memory at once. It just involves
    writing to one file and reading from another, is all.
    Erik Hollensbe, Aug 25, 2008
    #10
  11. On Aug 25, 2008, at 9:21 AM, Erik Hollensbe wrote:

    > Rob Biedenharn wrote:
    >> On Aug 25, 2008, at 4:07 AM, Erik Hollensbe wrote:
    >>
    >>> Dave Bass wrote:
    >>>> And... it's a lot easier to delete the *last* line of a file than
    >>>> the
    >>>> first.
    >>>
    >>> I don't think this is actually true, can you explain further?
    >>>
    >>> -Erik

    >>
    >>
    >> You can just truncate the file size. You don't have any subsequent
    >> lines (bytes) to move into a new position within the file.

    >
    > How do you know which line is the last line?
    >
    > Unless there's something I don't know, that involves reading the whole
    > file, or a combination of seek/read from the end until you find the
    > last
    > newline, which is essentially what tail +2 does, but starts at the
    > beginning of the file.
    >
    > "Moving" data in a file is the worst possible scenario for I/O at all.
    > You can do both of these operations in a single pass read of the file
    > without shoving the whole thing into memory at once. It just involves
    > writing to one file and reading from another, is all.


    Well, if you want/need the last line(s) of a file (presumably text or
    how would you define a "line"), you can take a look at the File::Tail
    gem.

    gem install file-tail

    I had some Perl code (lifted from some forum or article) that would
    cut initial lines out of a log file using sysread/syswrite with a
    truncate to reset the end-of-file. I don't recall if it used a single
    file descriptor or two separate ones, but the idea is the same -- move
    bytes "backward" across the gap that you want to eliminate. I agree
    with your "worst possible scenario for I/O" assessment.
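
This is not the Perl code mentioned above, only a Ruby sketch of the same idea under some assumptions: two handles on the same file, ordinary buffered reads and writes rather than sysread/syswrite, and a made-up helper name `remove_first_line_in_place`. The reader stays ahead of the writer, so each chunk is copied "backward" over the gap, and a final truncate resets the end of the file.

```ruby
require 'tmpdir'

# Hypothetical helper: removes the first line without a second file by
# sliding the remaining bytes toward the start and truncating the tail.
def remove_first_line_in_place(path, chunk_size = 64 * 1024)
  File.open(path, 'rb') do |reader|
    File.open(path, 'r+b') do |writer|
      reader.gets                               # skip past the first line
      while (chunk = reader.read(chunk_size))   # nil once the end is reached
        writer.write(chunk)                     # lands earlier than it was read from
      end
      writer.flush
      writer.truncate(writer.pos)               # cut off the stale tail
    end
  end
end

# Demo on a throwaway file.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'app.log')
  File.write(path, "first\nsecond\nthird\n")
  remove_first_line_in_place(path)
  puts File.read(path)                          # the file without its first line
end
```

If the process dies partway through, the file is left half-shifted, which is one more reason the copy-to-a-new-file approaches are usually the safer choice.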

    -Rob

    Rob Biedenharn http://agileconsultingllc.com
    Rob Biedenharn, Aug 25, 2008
    #11
  12. Rob Biedenharn wrote:

    > I had some Perl code (lifted from some forum or article) that would
    > cut initial lines out of a log file using sysread/syswrite with a
    > truncate to reset the end-of-file. I don't recall if it used a single
    > file descriptor or two separate ones, but the idea is the same -- move
    > bytes "backward" across the gap that you want to eliminate. I agree
    > with your "worst possible scenario for I/O" assessment.


    You are talking about `tail -f`. This is different.

    (And if you ever need to find that again, perldoc -q tail).

    -Erik
    Erik Hollensbe, Aug 25, 2008
    #12
  13. Erik Hollensbe wrote:
    > Dave Bass wrote:
    >> And... it's a lot easier to delete the *last* line of a file than the
    >> first.

    >
    > I don't think this is actually true, can you explain further?


    My Ruby isn't up to coding it, but in principle I'd seek to the end of
    the file, then backtrack until I found the appropriate newline. Then I'd
    truncate the file.
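
The seek, backtrack, and truncate steps described above could look roughly like this in Ruby. The helper name `remove_last_line` is an assumption, and the sketch reads one byte at a time for clarity rather than speed.

```ruby
require 'tmpdir'

# Hypothetical helper (not from the thread): removes the last line by
# scanning backward for the previous newline and truncating there.
def remove_last_line(path)
  File.open(path, 'r+b') do |f|
    pos = f.size
    return if pos.zero?            # nothing to remove from an empty file
    pos -= 1                       # step over the final byte (usually a newline)
    while pos > 0
      f.seek(pos - 1)
      break if f.read(1) == "\n"   # newline that ends the previous line
      pos -= 1
    end
    f.truncate(pos)                # everything from `pos` onward is dropped
  end
end

# Demo on a throwaway file.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'practice.txt')
  File.write(path, "one\ntwo\nthree\n")
  remove_last_line(path)
  puts File.read(path)             # the file without its last line
end
```

Only the tail of the file is read, and no bytes have to move, which is the asymmetry with deleting the first line that the earlier posts point out.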
    Dave Bass, Aug 25, 2008
    #13
  14. Richard Schneeman wrote:
    > the thing is, if i didn't delete the lines i've already added, everytime
    > i ran the task, (it has to be run multiple times due to timeouts) i
    > would only re-add the same lines. The deletion acts as a place holder of
    > sorts. I'll play around with your suggestion, and i'll let you know. If
    > anyone else has an alternate method...i'm all ears


    I suggest exploring a distributed worker system like Rinda,
    Backgroundrb, AP4R, or Sparrow. You can prepare a master list of
    dictionary words, and then worker processes can take one at a time and
    add them to your database. Having a timeout won't slow things down,
    nor will it cause you to have to re-read your wordlist.
    Mark Thomas, Aug 25, 2008
    #14
  15. I've set it up with a rake task and a cron job that re-runs every 5 hours
    (my timeout window).

    This is the code that I've already uploaded and that is currently
    running:

    namespace :chinese do

      desc "adds all chinese files to database"
      task :create => :environment do

        active_dictionary =
          File.readlines("public/languages/chinese/practice.txt")
        count = 0
        for @element in active_dictionary
          count += 1
          process_chinese
          open('public/languages/chinese/practice.txt', 'w') do |file|
            file.puts active_dictionary[count, active_dictionary.size]
          end
        end

      end

    end

    where process_chinese contains all my proprietary code. I played around
    with only writing to the file after every ten processed entries, but
    that cut my process time by only a trivial amount, so I just
    let it write over the file after every line. As we've seen here, there
    are quite a few ways this can be accomplished. This one ended up
    working and didn't kill my processor (the writes to the DB are
    far more expensive than opening and writing to this file).

    Thanks for the help!!
    Richard
    Richard Schneeman, Aug 25, 2008
    #15
  16. Richard Schneeman wrote:
    > i this is the code that i've already uploaded, and is currently
    > running...


    If the file is exceptionally large, you can save a lot of memory (and
    processing time, likely), by doing something like this:

    require 'fileutils'

    File.open("my_file") do |f|
      f.readline                        # skip the first line
      File.open("my_file.tmp", 'w') do |f2|
        f2 << f.read                    # copy the rest of the file
      end
    end

    FileUtils.mv("my_file.tmp", "my_file")

    The point here is that almost all the work is done on the file
    descriptors instead of in memory. I don't know if ruby has a sendfile()
    implementation, but that would be the most ideal, as it'd instruct the
    OS to do the copy.
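
For readers on Ruby 1.9 or later (not the 1.8.6 from the original question), `IO.copy_stream` is probably the closest built-in to the idea above: it copies from the source's current position without building one large string in Ruby, and on some platforms it may hand the copy to the operating system. A sketch, with the helper name `drop_first_line_streaming` and the `.tmp` suffix being assumptions:

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not from the thread): drops the first line by letting
# IO.copy_stream move everything after it into a new file.
def drop_first_line_streaming(path)
  tmp = "#{path}.tmp"                # assumed scratch name
  File.open(path, 'rb') do |src|
    File.open(tmp, 'wb') do |dst|
      src.gets                       # consume the first line
      IO.copy_stream(src, dst)       # copy the remainder from the current position
    end
  end
  FileUtils.mv(tmp, path)
end

# Demo on a throwaway file.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'my_file')
  File.write(path, "one\ntwo\nthree\n")
  drop_first_line_streaming(path)
  puts File.read(path)               # the file without its first line
end
```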
    Erik Hollensbe, Aug 25, 2008
    #16
  17. Erik Hollensbe wrote:
    > Richard Schneeman wrote:
    > If the file is exceptionally large, you can save a lot of memory (and
    > processing time, likely), by doing something like this:
    >
    > File.open("my_file") do |f|
    > f.readline
    > File.open("my_file.tmp", 'w') do |f2|
    > f2 << f.read
    > end
    > end
    >
    > FileUtils.mv("my_file.tmp", "my_file")
    >


    Just on the "f2 << f.read" part, isn't this still reading the rest of
    the file into ruby?
    I was thinking more of reading stuff into a fixed buffer and then
    writing it.
    i.e.
    while buf = f.read(32000)  # bytes
      f2.write buf             # or f2 << buf
    end
    which will result in a bazillion more calls to IO#read and IO#write on a
    large file but doesn't read the whole thing into memory. I'm not
    recommending this or anything - just wanted to clarify.


    Daniel
    Daniel Bush, Aug 27, 2008
    #17