file size

Discussion in 'Perl Misc' started by George Mpouras, Jan 21, 2014.

  1. I have a file already open (in fact could be 100s). How can I get the
    size faster

    1) -f ...
    2) seek FILE, 0, 2; tell FILE;
     
    George Mpouras, Jan 21, 2014
    #1

  2. On 21/1/2014 12:08, George Mpouras wrote:
    > I have a file already open (in fact could be 100s). How can I get the
    > size faster
    >
    > 1) -f ...
    > 2) seek FILE, 0, 2; tell FILE;
    >

    1) -s
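
A minimal sketch of that correction: -s applied to a handle that is already open. The helper name size_of_handle and the scratch file are assumptions made for illustration; they are not from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Hypothetical helper: size in bytes of an already-open filehandle.
# -s accepts a filehandle as well as a file name.
sub size_of_handle {
    my ($fh) = @_;
    return -s $fh;
}

# Scratch file so the sketch runs anywhere (an assumption, not the
# poster's data). Close the writer first so all bytes are on disk.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} 'x' x 1234;
close $out or die "close: $!";

open my $fh, '<', $path or die "open $path: $!";
printf "%d bytes\n", size_of_handle($fh);
```

Note that -s is documented as "file has nonzero size", so for an empty file it returns a false value rather than a size.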
     
    George Mpouras, Jan 21, 2014
    #2

  3. George Mpouras <> writes:

    > On 21/1/2014 12:08, George Mpouras wrote:
    >> I have a file already open (in fact could be 100s). How can I get the
    >> size faster


    use Benchmark;

    This code:

    #!/usr/bin/perl -w

    use Benchmark qw/cmpthese/;

    open FILE, '<file';
    cmpthese(10000000, {
        -s       => sub { -s FILE },
        seektell => sub { seek FILE, 0, 2; tell FILE },
    })

    produces this output over here (Debian amd64, perl v5.18.2):

                    Rate seektell       -s
    seektell  3952569/s       --     -48%
    -s        7575758/s      92%       --

    So -s is almost twice as fast as seektell.

    If the seek is only done once, the result becomes:

                    Rate       -s seektell
    -s        7022472/s       --     -82%
    seektell 39370079/s     461%       --

    So tell is significantly faster than -s, but the combination of
    seek+tell is slower. Of course, YMMV. Run the benchmark and see what
    works best in your case.
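
One way the "seek only done once" variant could look is sketched below. This is an assumed reconstruction, not the exact code behind the second table: the seek happens once before timing starts, so the second entry times tell alone. The scratch file and the reduced iteration count are assumptions chosen to keep the run short.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw/cmpthese/;
use File::Temp qw/tempfile/;

# Scratch file standing in for the thread's 'file' (an assumption).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} 'x' x 4096;
close $out or die "close: $!";

open my $fh, '<', $path or die "open $path: $!";

# Seek to the end once, outside the timed code.
seek $fh, 0, 2 or die "seek: $!";

# The 'tell' entry now measures tell() only, not seek+tell.
# Benchmark may warn about too few iterations at this count.
cmpthese(1_000_000, {
    '-s' => sub { -s $fh },
    tell => sub { tell $fh },
});
```

The number from tell is only the file size as long as nothing else moves the handle's position or changes the file's length after the initial seek.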
    --
    Marius Gavrilescu

     
    Marius Gavrilescu, Jan 21, 2014
    #3
  4. Marius Gavrilescu <> writes:
    > George Mpouras <> writes:
    >> On 21/1/2014 12:08, George Mpouras wrote:
    >>> I have a file already open (in fact could be 100s). How can I get the
    >>> size faster

    >
    > use Benchmark;
    >
    > This code:
    >
    > #!/usr/bin/perl -w
    >
    > use Benchmark qw/cmpthese/;
    >
    > open FILE, '<file';
    > cmpthese(10000000, {
    >     -s       => sub { -s FILE },
    >     seektell => sub { seek FILE, 0, 2; tell FILE },
    > })
    >
    > produces this output over here (Debian amd64, perl v5.18.2):
    >
    >                 Rate seektell       -s
    > seektell  3952569/s       --     -48%
    > -s        7575758/s      92%       --
    >
    > So -s is almost twice as fast as seektell.
    >
    > If the seek is only done once, the result becomes:
    >
    >                 Rate       -s seektell
    > -s        7022472/s       --     -82%
    > seektell 39370079/s     461%       --
    >
    > So tell is significantly faster than -s, but the combination of
    > seek+tell is slower.


    The first seek moves the current file position to the end of the
    file, which causes the old current position to be lost. All subsequent
    seeks don't do any actual seeking. This should rather be something like

    ---------
    #!/usr/bin/perl -w

    use Benchmark qw/cmpthese/;

    open FILE, '<file';
    cmpthese(10000000, {
        -s => sub {
            -s FILE
        },
        seektell => sub {
            my ($old, $rc);

            $old = tell(FILE);
            seek FILE, 0, 2;
            $rc = tell FILE;
            seek FILE, $old, 0;
            return $rc;
        },
    })
    ----------
     
    Rainer Weikusat, Jan 21, 2014
    #4
  5. Ben Morrow <> writes:
    > Quoth Rainer Weikusat <>:
    >> Marius Gavrilescu <> writes:
    >> >
    >> > cmpthese(10000000, {
    >> >     -s       => sub { -s FILE },
    >> >     seektell => sub { seek FILE, 0, 2; tell FILE },
    >> > })

    >>
    >> The first seek moves the current file position to the end of the
    >> file, which causes the old current position to be lost. All subsequent
    >> seeks don't do any actual seeking.

    >
    > They may not move the (OS) file pointer, but they will still make two
    > lseek(2) calls, which is what takes the time. (Moving the file pointer
    > from within the kernel is obviously entirely trivial.) Perl doesn't
    > know, until the OS tells it, that the file hasn't changed length since
    > the last time it found the end.


    It seems to me that a better implementation should be possible here, but
    that's sort of beside the point, which was supposed to be that the first
    seek moves the file pointer to the end of the file and it then stays
    there for the purpose of this benchmark: there's nothing which magically
    causes it to revert to the 'current position' prior to the seek, hence


    [...]

    >>     seektell => sub {
    >>         my ($old, $rc);
    >>
    >>         $old = tell(FILE);
    >>         seek FILE, 0, 2;
    >>         $rc = tell FILE;
    >>         seek FILE, $old, 0;
    >>         return $rc;
    >>     },
    >> })

    >
    > This will do four lseek(2)s per iteration vs one fstat(2); not exactly a
    > fair comparison.


    would be a fairer comparison because fstat doesn't destroy the current
    file position.
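
The difference in side effects can be checked directly. The following is a small sketch, assuming a 100-byte scratch file and a hypothetical helper size_by_seek (neither is from the thread): -s leaves the read position where it was, while the seek/tell approach has to save and restore it explicitly.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw/SEEK_SET SEEK_END/;
use File::Temp qw/tempfile/;

# Hypothetical helper: size via seek/tell, restoring the caller's
# position afterwards (same steps as the benchmark in post #4).
sub size_by_seek {
    my ($fh) = @_;
    my $old = tell $fh;
    seek $fh, 0, SEEK_END or die "seek: $!";
    my $size = tell $fh;
    seek $fh, $old, SEEK_SET or die "seek: $!";
    return $size;
}

# Scratch file of exactly 100 bytes (an assumption for the demo).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} '0123456789' x 10;
close $out or die "close: $!";

open my $fh, '<', $path or die "open $path: $!";
read $fh, my $buf, 7;              # move the read position to offset 7

my $size_s         = -s $fh;       # size via -s
my $pos_after_s    = tell $fh;     # position is untouched by -s
my $size_seek      = size_by_seek($fh);
my $pos_after_seek = tell $fh;     # restored by the helper

print "-s: $size_s bytes, position $pos_after_s\n";
print "seek/tell: $size_seek bytes, position $pos_after_seek\n";
```

Dropping the final seek in size_by_seek would leave the handle at end of file, which is the situation the original benchmark ended up measuring.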
     
    Rainer Weikusat, Jan 21, 2014
    #5