Built-in open() with buffering > 1

Discussion in 'Python' started by Marco, Aug 24, 2012.

  1. Marco

    Marco Guest

    Can anyone please explain to me the meaning of
    "buffering > 1" in the built-in open()?
    The doc says: "...and an integer > 1 to indicate the size
    of a fixed-size chunk buffer."
    So I thought this size was the number of bytes or chars, but
    it is not:

    >>> f = open('myfile', 'w', buffering=2)
    >>> f.write('a')
    1
    >>> open('myfile').read()
    ''
    >>> f.write('b')
    1
    >>> open('myfile').read()
    ''
    >>> f.write('cdefghi\n')
    8
    >>> open('myfile').read()
    ''
    >>> f.flush()
    >>> open('myfile').read()
    'abcdefghi\n'
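
One way to see why nothing reaches the file before flush() in a session like this: in text mode, open() returns a text layer stacked on top of a buffered binary layer, and data can wait in either one. The sketch below only inspects those layers and checks the file size around a flush(); the temporary directory and file name are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "myfile")  # hypothetical scratch file
    with open(path, "w", buffering=2) as f:
        # Text mode returns a TextIOWrapper; it encodes str to bytes and
        # hands them to the buffered binary layer underneath it.
        layers = [
            type(f).__name__,
            type(f.buffer).__name__,
            type(f.buffer.raw).__name__,
        ]
        f.write("a")
        size_before_flush = os.path.getsize(path)  # data still held in memory
        f.flush()                                  # pushes data through both layers
        size_after_flush = os.path.getsize(path)

print(layers)
print(size_before_flush, size_after_flush)
```

The `buffering` argument is handed to the binary layer (`f.buffer`); the text layer on top keeps its own pending data, which is why a single short write may not show up on disk even with a tiny `buffering` value.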

    Regards,
    Marco
    Marco, Aug 24, 2012
    #1

  2. Marco

    Marco Guest

    On 08/24/2012 06:35 AM, Marco wrote:
    > Can anyone please explain to me the meaning of
    > "buffering > 1" in the built-in open()?
    > The doc says: "...and an integer > 1 to indicate the size
    > of a fixed-size chunk buffer."


    Sorry, I get it:

    >>> f = open('myfile', 'w', buffering=2)
    >>> f._CHUNK_SIZE = 5
    >>> for i in range(6):
    ...     n = f.write(str(i))
    ...     print(i, open('myfile').read(), sep=':')
    ...
    0:
    1:
    2:
    3:
    4:
    5:012345
    Marco, Aug 24, 2012
    #2

  3. Ramchandra Apte

    `f._CHUNK_SIZE = 5` modifies one of Python's internal variables; don't do that.
    Search for "buffering" to find out what it is.
    `buffering` is how much data Python will keep in memory.
    f.read(1) will actually read up to `buffering` bytes from the file into memory, so that later reads can be served from memory.
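
The read-ahead behaviour described here can be observed with a small sketch. It assumes CPython's io.BufferedReader and uses a hypothetical raw stream, CountingRaw, that records how many bytes each underlying read asks for; neither the class nor the 16-byte buffer size comes from the thread.

```python
import io

class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that records the size of every read request."""

    def __init__(self, data):
        super().__init__()
        self._data = io.BytesIO(data)
        self.requests = []

    def readable(self):
        return True

    def readinto(self, b):
        # len(b) is how many bytes the buffered layer is asking for.
        self.requests.append(len(b))
        return self._data.readinto(b)

raw = CountingRaw(b"x" * 100)
reader = io.BufferedReader(raw, buffer_size=16)

first = reader.read(1)   # the caller asks for a single byte
second = reader.read(1)  # served from the in-memory buffer, no new raw read
print(raw.requests)
```

The list printed at the end shows what the buffered layer requested from the raw stream, as opposed to what the caller requested from the buffered layer.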
    On Friday, 24 August 2012 10:51:36 UTC+5:30, Marco wrote:
    > On 08/24/2012 06:35 AM, Marco wrote:
    > > Can anyone please explain to me the meaning of
    > > "buffering > 1" in the built-in open()?
    > > The doc says: "...and an integer > 1 to indicate the size
    > > of a fixed-size chunk buffer."
    >
    > Sorry, I get it:
    >
    > >>> f = open('myfile', 'w', buffering=2)
    > >>> f._CHUNK_SIZE = 5
    > >>> for i in range(6):
    > ...     n = f.write(str(i))
    > ...     print(i, open('myfile').read(), sep=':')
    > ...
    > 0:
    > 1:
    > 2:
    > 3:
    > 4:
    > 5:012345
    Ramchandra Apte, Aug 24, 2012
    #3
  4. Hans Mulder

    Hans Mulder Guest

    On 24/08/12 06:35:27, Marco wrote:
    > Can anyone please explain to me the meaning of
    > "buffering > 1" in the built-in open()?
    > The doc says: "...and an integer > 1 to indicate the size
    > of a fixed-size chunk buffer."
    > So I thought this size was the number of bytes or chars, but
    > it is not


    The algorithm is explained at
    http://docs.python.org/library/io.html#io.DEFAULT_BUFFER_SIZE

    >> io.DEFAULT_BUFFER_SIZE
    >>
    >> An int containing the default buffer size used by the
    >> module’s buffered I/O classes. open() uses the file’s
    >> blksize (as obtained by os.stat()) if possible.


    In other words: open() tries to find a suitable size by
    calling os.stat(your_file).st_blksize and if that fails,
    it uses io.DEFAULT_BUFFER_SIZE, which is 8192 on my box.

    Whether you call open with buffering=2 or any larger
    number, does not matter: the buffer size will be the
    outcome of this algorithm.
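
A minimal sketch of the heuristic as the quoted documentation describes it for the default case: use the file's block size if os.stat() reports one, otherwise fall back to io.DEFAULT_BUFFER_SIZE. The function name guess_default_buffer_size is hypothetical, and this is not a copy of CPython's actual code, which may differ between versions.

```python
import io
import os
import tempfile

def guess_default_buffer_size(path):
    """Return the block size reported for `path`, or io.DEFAULT_BUFFER_SIZE."""
    try:
        blksize = os.stat(path).st_blksize
    except (OSError, AttributeError):
        # st_blksize is not available on every platform (for example Windows).
        blksize = 0
    return blksize if blksize > 1 else io.DEFAULT_BUFFER_SIZE

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "myfile")  # hypothetical scratch file
    with open(path, "wb"):
        pass
    size = guess_default_buffer_size(path)

print(size, io.DEFAULT_BUFFER_SIZE)
```

The sketch says nothing about what happens when an explicit `buffering` value greater than 1 is passed; it only mirrors the default-size rule from the quoted text.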


    Hope this helps,

    -- HansM
    Hans Mulder, Aug 26, 2012
    #4
  5. Marco

    Marco Guest

    On 08/26/2012 10:25 AM, Hans Mulder wrote:

    > The algorithm is explained at
    > http://docs.python.org/library/io.html#io.DEFAULT_BUFFER_SIZE


    Thanks ;)

    > In other words: open() tries to find a suitable size by
    > calling os.stat(your_file).st_blksize and if that fails,
    > it uses io.DEFAULT_BUFFER_SIZE, which is 8192 on my box.


    Yes, that is right when the `buffering` parameter is a
    negative integer.

    > Whether you call open with buffering=2 or any larger
    > number, does not matter: the buffer size will be the
    > outcome of this algorithm.


    Mmm, I don't think that is right: in this case the buffer
    size is not computed; it is the value you pass as the
    `buffering` parameter.
    In fact:

    >>> f = open('myfile', 'w', buffering=2)
    >>> f._CHUNK_SIZE = 1
    >>> f.write('ab')
    2
    >>> open('myfile').read()
    ''

    Now two bytes are in the buffer and the buffer is full.
    If you write another byte, it will not be written to the
    buffer, because the bytes in the queue are transferred
    to the buffer only when there are more than f._CHUNK_SIZE of them:

    >>> f.write('c')
    1
    >>> open('myfile').read()
    ''

    Now, if you write another byte 'd', the chunk 'cd' will
    be transferred to the buffer; but because the buffer is full,
    its content 'ab' is first written to disk, and then 'cd'
    is written to the buffer, which is full again:

    >>> f.write('d')
    1
    >>> open('myfile').read()
    'ab'

    So the buffer really is of size 2.
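
The same effect can be checked without touching private attributes by opening the file in binary mode, where there is no text layer in between. This is a sketch under the assumption that the default CPython io implementation is in use; sizes_on_disk is a hypothetical helper, and the exact intermediate sizes may vary with the implementation.

```python
import os
import tempfile

def sizes_on_disk(buffer_size, chunks):
    """Write each chunk in binary mode and record the file size after each write."""
    sizes = []
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "myfile")  # hypothetical scratch file
        with open(path, "wb", buffering=buffer_size) as f:
            for chunk in chunks:
                f.write(chunk)
                sizes.append(os.path.getsize(path))
        # After close(), everything that was written is on disk.
        sizes.append(os.path.getsize(path))
    return sizes

result = sizes_on_disk(2, [b"ab", b"c"])
print(result)
```

The first entry should be 0, because two bytes fit in a two-byte buffer, and a later entry becomes non-zero once a write no longer fits. The final entry is the total number of bytes written.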
    Marco, Aug 30, 2012
    #5
