getting file size

Discussion in 'Python' started by Bob Smith, Jan 22, 2005.

  1. Bob Smith

    Bob Smith Guest

    Are these the same:

    1. f_size = os.path.getsize(file_name)

    2. fp1 = file(file_name, 'r')
    data = fp1.readlines()
    last_byte = fp1.tell()

    I always get the same value when doing 1. or 2. Is there a reason I
    should do both? When reading to the end of a file, won't tell() be just
    as accurate as os.path.getsize()?

    Thanks guys,

    Bob
    Bob Smith, Jan 22, 2005
    #1

  2. Bob Smith

    John Machin Guest

    Bob Smith wrote:
    > Are these the same:
    >
    > 1. f_size = os.path.getsize(file_name)
    >
    > 2. fp1 = file(file_name, 'r')
    > data = fp1.readlines()
    > last_byte = fp1.tell()
    >
    > I always get the same value when doing 1. or 2. Is there a reason I
    > should do both? When reading to the end of a file, won't tell() be just
    > as accurate as os.path.getsize()?
    >


    Read the docs. Note the hint that you get what the stdio serves up.
    ftell() can only be _guaranteed_ to give you a magic cookie that you
    may later use with fseek(magic_cookie) to return to the same place in a
    more reliable manner than with Hansel & Gretel's non-magic
    bread-crumbs. On 99.99% of modern filesystems, the cookie obtained by
    ftell() when positioned at EOF is in fact the size in bytes. But why
    chance it? os.path.getsize does as its name suggests; why not use it,
    instead of a method with a side-effect? As for doing _both_, why would
    you??
    John Machin, Jan 22, 2005
    #2
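
A minimal sketch of the two approaches compared in the reply above, assuming Python 3 (where `file()` no longer exists and `open()` is used instead). The helper names `size_via_stat` and `size_via_seek` and the temporary demo file are hypothetical, introduced only for illustration. The second helper seeks to the end instead of calling `readlines()`, so it avoids reading the whole file, but it still has to open it.

```python
import os
import tempfile


def size_via_stat(path):
    """Ask the filesystem for the size; the file itself is never opened."""
    return os.path.getsize(path)


def size_via_seek(path):
    """Open in binary mode, jump to the end, and return the resulting offset."""
    with open(path, "rb") as fp:
        # In binary mode, seek() returns the new absolute position in bytes.
        return fp.seek(0, os.SEEK_END)


# Hypothetical demo file holding exactly 1000 bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1000)
    demo_path = tmp.name

stat_size = size_via_stat(demo_path)
seek_size = size_via_seek(demo_path)
os.remove(demo_path)

print(stat_size, seek_size)
```

Both calls should agree for an ordinary file opened in binary mode; `os.path.getsize` is still the more direct choice because it states the intent and leaves no open file handle behind.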

  3. In <cssbfo$ppn$>, Bob Smith wrote:

    > Are these the same:
    >
    > 1. f_size = os.path.getsize(file_name)
    >
    > 2. fp1 = file(file_name, 'r')
    > data = fp1.readlines()
    > last_byte = fp1.tell()
    >
    > I always get the same value when doing 1. or 2. Is there a reason I
    > should do both? When reading to the end of a file, won't tell() be just
    > as accurate as os.path.getsize()?


    You don't always get the same value, even on systems where `tell()`
    returns a byte position. You need the rights to read the file in case 2.

    >>> import os
    >>> os.path.getsize('/etc/shadow')
    612L
    >>> f = open('/etc/shadow', 'r')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    IOError: [Errno 13] Permission denied: '/etc/shadow'

    Ciao,
    Marc 'BlackJack' Rintsch
    Marc 'BlackJack' Rintsch, Jan 22, 2005
    #3
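
The permission point above can be sketched as follows, assuming Python 3 on a POSIX-like system. The function name `describe` and the demo file are hypothetical. The idea is that `os.stat` needs access to the containing directory rather than read permission on the file itself, so the size can be known even when opening the file would fail.

```python
import os
import tempfile


def describe(path):
    """Return (size_in_bytes, readable) without opening the file."""
    size = os.stat(path).st_size         # metadata lookup only, no read of the contents
    readable = os.access(path, os.R_OK)  # would this process be allowed to read it?
    return size, readable


# Hypothetical demo file that the current user owns and can read.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

size, readable = describe(demo_path)
os.remove(demo_path)

print(size, readable)
```

For a file such as `/etc/shadow`, an unprivileged user would typically get the size back with `readable` set to `False`, which matches the session shown in the post.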
  4. Bob Smith

    Tim Roberts Guest

    Bob Smith <> wrote:

    >Are these the same:
    >
    >1. f_size = os.path.getsize(file_name)
    >
    >2. fp1 = file(file_name, 'r')
    > data = fp1.readlines()
    > last_byte = fp1.tell()
    >
    >I always get the same value when doing 1. or 2. Is there a reason I
    >should do both? When reading to the end of a file, won't tell() be just
    >as accurate as os.path.getsize()?


    On Windows, those two are not equivalent. Besides the newline conversion
    done by reading text files, the solution in 2. will stop as soon as it sees
    a ctrl-Z.

    If you used 'rb', you'd be much closer.
    --
    - Tim Roberts,
    Providenza & Boekelheide, Inc.
    Tim Roberts, Jan 23, 2005
    #4
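
A small sketch of why text mode and the on-disk size can disagree, assuming Python 3. The demo file is written with explicit `\r\n` line endings (as a Windows text file would have), so the result is the same on any platform. In Python 3, text-mode `tell()` returns an opaque number rather than a byte count, so the comparison here uses the lengths of what was read instead.

```python
import os
import tempfile

# Hypothetical demo file with Windows-style line endings written explicitly.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"qwertyuiop\r\nasdfghjkl\r\nzxcvbnm\r\n")
    demo_path = tmp.name

size_on_disk = os.path.getsize(demo_path)

# Binary mode: bytes come back exactly as stored.
with open(demo_path, "rb") as fp:
    raw = fp.read()

# Text mode: the default newline handling translates "\r\n" to "\n".
with open(demo_path, "r", encoding="ascii") as fp:
    text = fp.read()

os.remove(demo_path)

print(size_on_disk, len(raw), len(text))  # 32 32 29
```

The three-byte gap is one `\r` per line, so counting characters read in text mode is not a substitute for `os.path.getsize`.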
  5. Bob Smith

    John Machin Guest

    Tim Roberts wrote:
    > Bob Smith <> wrote:
    >
    > >Are these the same:
    > >
    > >1. f_size = os.path.getsize(file_name)
    > >
    > >2. fp1 = file(file_name, 'r')
    > > data = fp1.readlines()
    > > last_byte = fp1.tell()
    > >
    > >I always get the same value when doing 1. or 2. Is there a reason I
    > >should do both? When reading to the end of a file, won't tell() be just
    > >as accurate as os.path.getsize()?
    >
    > On Windows, those two are not equivalent. Besides the newline conversion
    > done by reading text files,


    Doesn't appear to me to go wrong due to newline conversion:

    Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32
    >>> import os.path
    >>> txt = 'qwertyuiop\nasdfghjkl\nzxcvbnm\n'
    >>> file('bob', 'w').write(txt)
    >>> len(txt)
    29
    >>> os.path.getsize('bob')
    32L ##### as expected
    >>> f = file('bob', 'r')
    >>> lines = f.readlines()
    >>> lines
    ['qwertyuiop\n', 'asdfghjkl\n', 'zxcvbnm\n']
    >>> f.tell()
    32L ##### as expected

    > the solution in 2. will stop as soon as it sees
    > a ctrl-Z.


    .... and the value returned by f.tell() is not the position of the
    ctrl-Z but more likely the position of the end of the current block --
    which could be thousands/millions of bytes before the physical end of
    the file.

    Good ol' CP/M.

    >
    > If you used 'rb', you'd be much closer.


    And be much less hassled when that ctrl-Z wasn't meant to mean EOF, it
    just happened to appear in an unvalidated data field part way down a
    critical file :-(
    John Machin, Jan 23, 2005
    #5
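
To illustrate the 'rb' advice above, here is a minimal sketch, assuming Python 3, of a file with a ctrl-Z byte (0x1A) embedded in the middle of its data. The file contents and variable names are hypothetical. Binary mode applies no end-of-file interpretation to that byte, so everything after it is still read.

```python
import os
import tempfile

# Hypothetical data with a ctrl-Z (0x1A) sitting in the middle of a field.
data = b"header\x1apayload"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    demo_path = tmp.name

size_on_disk = os.path.getsize(demo_path)

# Binary mode returns every byte, including 0x1A and what follows it.
with open(demo_path, "rb") as fp:
    raw = fp.read()

os.remove(demo_path)

print(size_on_disk, len(raw), raw.index(b"\x1a"))  # 14 14 6
```

The behaviour described in the thread (text mode stopping at ctrl-Z) comes from the C runtime's text mode on Windows as used by Python 2's `file()`; the sketch only demonstrates the binary-mode side, which is the part that holds regardless of platform.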