Can I Force Perl to Bypass File Write Buffers?

Discussion in 'Perl Misc' started by Hal Vaughan, Aug 30, 2005.

  1. Hal Vaughan

    Hal Vaughan Guest

    I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed that
    when I'm processing files, Perl writes in blocks, so it'll process a
    number of items, and instead of the file having one line at a time written
    to it, a whole block is suddenly written to the disk all at once.

    Is there any way to avoid this and force Perl to write each line as I use a
    "print" statement to output the line? I log (in MySQL) each item as I
    finish it, so if power fails or the program is aborted, the system can pick
    up right where it left off. Because of the buffers, the log is ahead of
    what is written to the file, which would mean I'd lose the data between
    what's written and what's logged.

    Thanks!

    Hal
     
    Hal Vaughan, Aug 30, 2005
    #1

  2. Simon Taylor

    Simon Taylor Guest

    Hal Vaughan wrote:
    > I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed that
    > when I'm processing files, Perl writes in blocks, so it'll process a
    > number of items, and instead of the file having one line at a time written
    > to it, it'll get a whole block at once suddenly written to the disk.
    >
    > Is there any way to avoid this and force Perl to write each line as I use a
    > "print" statement to output the line?


    You'll need to disable buffering by setting $| to non-zero.
    See $| in

    perldoc perlvar

    and also check out

    perldoc -f select

    This sample should do what you want:

    #!/usr/bin/perl
    use strict;
    use warnings;

    open (OUTPUT, '>', 'sample') or die "Could not create file: $!";
    my $fd = select(OUTPUT);   # make OUTPUT the currently selected handle
    $| = 1;                    # enable autoflush for the selected handle
    select($fd);               # restore the previously selected handle
    for (0..20) {
        print OUTPUT "some data...\n";   # flushed to the file right away
        sleep 2;
    }
    close OUTPUT or die "Could not close file: $!";


    Regards,

    Simon Taylor


    --
    www.perlmeme.org
     
    Simon Taylor, Aug 30, 2005
    #2

  3. Anno Siegel

    Anno Siegel Guest

    Simon Taylor <> wrote in comp.lang.perl.misc:
    > Hal Vaughan wrote:
    > > I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed that
    > > when I'm processing files, Perl writes in blocks, so it'll process a
    > > number of items, and instead of the file having one line at a time written
    > > to it, it'll get a whole block at once suddenly written to the disk.
    > >
    > > Is there any way to avoid this and force Perl to write each line as I use a
    > > "print" statement to output the line?

    >
    > You'll need to disable buffering by setting $| to non-zero.


    [good advice snipped]

    Just one note: "$| = 1" doesn't disable buffering, it enables auto-flushing.
    The buffer(s) remain in place and active, but after each print-statement
    the buffer is automatically emptied (presumably into the next buffer down
    the line). You still have character buffering (and you want it).

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
     
    Anno Siegel, Aug 30, 2005
    #3
  4. Hal Vaughan

    Guest

    Hal Vaughan <> wrote:
    > I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed that
    > when I'm processing files, Perl writes in blocks, so it'll process a
    > number of items, and instead of the file having one line at a time
    > written to it, it'll get a whole block at once suddenly written to the
    > disk.


    To answer the question you asked, check out the variable $|.

    To answer the question you didn't ask, your method isn't very good. If you
    are truly concerned about data integrity, use a transactional database for
    both the data and the log, and make sure the data write and the log write
    are in the same transaction. Or make your program, upon restarting, tail the
    existing data file and figure out where to pick up based solely on the data
    file, and dispense with the logging altogether. Or do both--write the data
    into a database, and have the entry in the database be its own log.
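
    For instance, here is a minimal sketch of the first approach with DBI (the
    DSN, table names, and columns are made up for illustration, and the MySQL
    tables would need a transactional engine such as InnoDB):

    use strict;
    use warnings;
    use DBI;

    my @items = ('first item', 'second item');   # stand-in for the real work

    # AutoCommit off: nothing becomes permanent until commit() is called.
    my $dbh = DBI->connect('DBI:mysql:database=test', 'user', 'password',
                           { RaiseError => 1, AutoCommit => 0 });

    foreach my $item (@items) {
        eval {
            # The data write and the log write share one transaction...
            $dbh->do('INSERT INTO items (payload) VALUES (?)', undef, $item);
            $dbh->do('INSERT INTO log (payload, status) VALUES (?, ?)',
                     undef, $item, 'done');
            $dbh->commit;   # ...so either both are recorded or neither is.
        };
        if ($@) {
            $dbh->rollback;
            warn "Transaction failed, rolled back: $@";
        }
    }
    $dbh->disconnect;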

    >
    > Is there any way to avoid this and force Perl to write each line as I use
    > a "print" statement to output the line? I log (in MySQL) each item as I
    > finish it, so if power fails or the program is aborted, the system can
    > pick up right where it left off. Because of the buffers, the log is
    > ahead of what is written to the file, which would mean I'd lose the data
    > between what's written and what's logged.



    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Aug 30, 2005
    #4
  5. Hal Vaughan

    Hal Vaughan Guest

    Anno Siegel wrote:

    > Simon Taylor <> wrote in comp.lang.perl.misc:
    >> Hal Vaughan wrote:
    >> > I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed
    >> > that when I'm processing files, Perl writes in blocks, so it'll
    >> > process a number of items, and instead of the file having one line at a
    >> > time written to it, it'll get a whole block at once suddenly written to
    >> > the disk.
    >> >
    >> > Is there any way to avoid this and force Perl to write each line as I
    >> > use a "print" statement to output the line?

    >>
    >> You'll need to disable buffering by setting $| to non-zero.

    >
    > [good advice snipped]
    >
    > Just one note: "$| = 1" doesn't disable buffering, it enables
    > auto-flushing. The buffer(s) remain in place and active, but after each
    > print-statement the buffer is automatically emptied (presumably into the
    > next buffer down
    > the line). You still have character buffering (and you want it).
    >
    > Anno


    I have $| = 1 set, since I had to redirect output to a file for debugging
    and needed the errors to sync with the output, but it doesn't seem to make
    a difference in the problem I'm talking about. You seem to be the only
    person who has pointed out that this doesn't affect the buffers directly.

    Hal
     
    Hal Vaughan, Aug 30, 2005
    #5
  6. Hal Vaughan

    Hal Vaughan Guest

    wrote:

    > Hal Vaughan <> wrote:
    >> I'm using Perl 5.6.1 (and in some cases 5.8) on Linux. I've noticed that
    >> when I'm processing files, Perl writes in blocks, so it'll process a
    >> number of items, and instead of the file having one line at a time
    >> written to it, it'll get a whole block at once suddenly written to the
    >> disk.

    >
    > To answer the question you asked, check out the variable $|.


    Thanks. I've used it, and it helps with syncing output, so if I redirect
    output to a file the error messages and other output are synced, but it
    doesn't seem to help here.

    > To answer the question you didn't ask, your method isn't very good. If
    > you are truly concerned about data integrity, use a transactional database
    > for both the data and the log, and make sure both data write and log write
    > are
    > in the same transaction. Or make your program, upon restarting, tail the
    > existing data file and figure out where to pick up based solely on the
    > data
    > file, and dispense with the logging altogether. Or do both--write the
    > data into a database, and have the entry in the database by its own log.


    I seriously thought about putting the info into a database, but there were a
    number of reasons I didn't. Partly because different programs on different
    systems can use this, and it works better to share the directory through NFS
    than to share the database. I've also got a stream of data coming in, and it
    has been working much better to save it to a capture file. Trying to break it
    up into chunks so it could be put into a database as it comes in would be a
    nightmare.

    Thanks!

    Hal
     
    Hal Vaughan, Aug 30, 2005
    #6
  7. autoflush() and how to find the code of a method...

    [A complimentary Cc of this posting was sent to
    Hal Vaughan
    <>], who wrote in article <>:
    > I have $| = 1 set, since I had to redirect output to a file for debugging
    > and needed the errors to sync with the output, but it doesn't seem to make
    > a difference in the problem I'm talking about. You seem to be the only
    > person who has pointed out that this doesn't affect the buffers directly.


    Remember that $| affects the currently select(1arg)ed filehandle. Let
    me see... Yes, the ->autoflush() method will do select()ing for you....
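
    For instance, here is a minimal sketch of the method approach (the filename
    is just for illustration; IO::Handle ships with perl and provides
    autoflush() for lexical filehandles too):

    use strict;
    use warnings;
    use IO::Handle;   # makes the autoflush() method available on filehandles

    open(my $fh, '>', 'sample.log') or die "Could not create file: $!";
    $fh->autoflush(1);   # selects the handle, sets $| = 1, restores the old selection

    print $fh "one line at a time\n";   # flushed from perl's buffer after each print
    close $fh or die "Could not close file: $!";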

    Hope this helps,
    Ilya

    P.S. It took me a long time to find the source of autoflush(). Best attempt
    (I don't know how to do it in a way that would still work if IO::Handle
    defined it in an XSUB...; would some Emacs package help here?):

    perl -MFileHandle -wdle "(my $fh = new FileHandle)->open(q[> xx]); $fh->autoflush(1)"
    n
    s
    v

    IO::Handle::autoflush(i:/perllib/lib/5.8.2/os2/IO/Handle.pm:465):
    464 sub autoflush {
    465==> my $old = new SelectSaver qualify($_[0], caller);
    466: my $prev = $|;
    467: $| = @_ > 1 ? $_[1] : 1;
    468: $prev;
    469 }

    Actually, after doing

    n
    |m $fh

    the debugger thinks that autoflush() *is* in the FileHandle module; it is
    not. Is it some bug related to a recent change in the semantics of
    "exists &function" vs "defined &function"?
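
    As an aside, a quicker route (assuming you already suspect which module
    defines the sub, and that it is plain Perl rather than an XSUB) is to ask
    %INC where the module was loaded from and then grep that file for
    "sub autoflush":

    perl -MIO::Handle -le 'print $INC{"IO/Handle.pm"}'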
     
    Ilya Zakharevich, Aug 30, 2005
    #7
  8. Hal Vaughan <> wrote:

    > Perl writes in blocks,


    > Is there any way to avoid this and force Perl to write each line as I use a
    > "print" statement to output the line?



    Your Question is Asked Frequently:

    perldoc -q buffer

    How do I flush/unbuffer an output filehandle? Why must I do this?


    You must have missed it when you checked the Perl FAQ before
    posting to the Perl newsgroup.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Aug 30, 2005
    #8
