file object, details of modes and some issues.

Discussion in 'Python' started by simon place, Aug 26, 2003.

  1. simon place

    simon place Guest

    is the code below meant to produce rubbish?, i had expected an exception.

    f=file('readme.txt','w')
    f.write(' ')
    f.read()

    ( PythonWin 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on
    win32. )

    I got this while experimenting, trying to figure out the file objects modes,
    which on very careful reading of the documentation left me completely in the
    dark. Below is a summary of that experimentation, some is for reference and
    some is more of a warning.

    'r' is (r)ead mode so you can't write to the file, you get
    'IOError:(0,'Error')' if you try, which isn't a particularly helpful error
    description. read() reads the whole file and read(x) x-bytes, unless there are
    less than x bytes left, then it reads as much as possible. so a test for less
    than the required number of bytes indicates the end of the file, i think maybe
    an exception when a read at the end of the file is attempted would be better,
    like iterators. if you try to open a non-existent file in 'r' mode you get
    'IOError: [Errno 2] No such file or directory: "filename"' which makes sense
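
A minimal sketch of the 'r' behaviour described above, assuming a modern Python 3 interpreter rather than the Python 2.3 build used for these experiments (so `open` stands in for `file`, and the scratch path is made up for the example). It reads fixed-size chunks until `read` returns an empty result, which is how end of file is signalled, and shows the error raised for a missing file.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")
with open(path, "wb") as f:
    f.write(b"abcdefghij")  # 10 bytes

chunks = []
with open(path, "rb") as f:
    while True:
        chunk = f.read(4)   # ask for 4 bytes at a time
        if not chunk:       # an empty result means end of file
            break
        chunks.append(chunk)

# Opening a non-existent file for reading raises an OSError subclass.
try:
    open(path + ".missing", "r")
    missing_error = None
except OSError as exc:
    missing_error = exc
```

The last chunk is shorter than requested, so a short or empty result is the end-of-file test; no exception is involved.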

    'w' is (w)rite mode so you can't read from the file, ( any existing file is
    erased or a new file created, and bear in mind that anything you write to the
    file can't be read back directly on this object.), you get 'IOError: [Errno 9]
    Bad file descriptor' if you try reading, which is an awful error description.
    BUT this only happens at the beginning of the file? when at the end of the
    file, as is the case when you have just written something ( without a backward
    seek, see below), you don't get an exception, but lots of rubbish data ( see
    example at beginning.) This mode allows you to seek backward and rewrite
    data, but if you try a read somewhere between the first character and the end,
    you get a different exception 'IOError: (0, 'Error')'
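
For comparison, a small sketch of the same write-then-read experiment, assuming Python 3, whose `io` layer checks the mode itself instead of leaving it to the C library. The file name is hypothetical. Here the read is refused with an exception rather than returning rubbish.

```python
import io
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")

with open(path, "w") as f:
    f.write(" ")
    can_read = f.readable()     # False for a file opened in 'w' mode
    try:
        f.read()
        read_error = None
    except io.UnsupportedOperation as exc:
        read_error = exc        # raised instead of returning garbage
```

Checking `readable()` or `writable()` first is one way to avoid relying on the exception at all.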

    'a' is (a)ppend mode, you can only add to the file, so basically write mode
    (with the same problems ) plus a seek to the end, obviously append doesn't
    erase an existing file and it also ignores file seeks, so all writes pile up
    at the end. tell() gives the correct location in the file after a write ( so
    actually always gives the length of the file.) but if you seek() you don't get
    an exception and tell() returns the new value but writes actually go to the
    end of the file, so if you use tell() to find out where writes are going, in
    this mode it might not always be right.
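
The mismatch between tell() and where writes really go can be sketched like this, assuming Python 3 and a hypothetical scratch file. The seek changes the reported position, but the write is still appended.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "log.bin")
with open(path, "wb") as f:
    f.write(b"12345")

with open(path, "ab") as f:
    f.seek(0)                   # move the reported position to the start
    tell_after_seek = f.tell()  # reports 0 ...
    f.write(b"X")               # ... but the write still lands at the end

with open(path, "rb") as f:
    content = f.read()
```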

    'r+' is (r)ead (+) update, which means read and write access, but
    don't read, without backward seeking, after a write because it will then read
    a lot of garbage.( the rest of the disk fragment/buffer i guess? )

    'w+' is (w)rite (+) update mode, which means read and write access,
    (like 'r+' but on a new or erased file).
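
A minimal sketch of the safe pattern for the update modes, assuming Python 3 and a hypothetical scratch file: reposition with seek() between a write and the following read.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "update.bin")

with open(path, "w+b") as f:
    f.write(b"hello")
    f.seek(0)           # reposition before switching from writing to reading
    data = f.read()     # reads back what was just written
```

The same seek-before-switching habit applies to 'r+' on an existing file.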

    'a+' is (a)ppend (+) update mode, which also means read and write, but
    file seeks are ignored, so any reads seem a bit pointless since they always
    read past the end of the file! returning garbage, but it does extend
    the file, so this garbage becomes incorporated in the file!! ( yes really )
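
For contrast, a sketch of 'a+' under Python 3 (an assumption; the results above are from Python 2.3 on win32), with a hypothetical scratch file. In this sketch the seek is honoured for reads, while the write is still appended at the end.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "append.bin")
with open(path, "wb") as f:
    f.write(b"12345")

with open(path, "a+b") as f:
    f.seek(0)
    existing = f.read()  # reads the existing content from the start
    f.seek(0)
    f.write(b"X")        # still appended at the end of the file

with open(path, "rb") as f:
    content = f.read()
```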

    'b', all modes can have a 'b' appended to indicate binary mode, i think this
    is something of a throw-back to serial comms ( serial comms being bundled into
    the same handlers as files because when these things were developed, 20+ years
    ago, nothing better was around. ) Binary mode turns off the 'clever' handling
    of line ends and ( depending on use and os ) other functional characters (
    tabs expanded to spaces etc ), the normal mode is already binary on windows so
    binary makes no difference on win32 files. But since it may do on other
    o.s.'s, ( or when actually using the file object for serial comms.) i think
    you should actually ALWAYS use the binary version of the mode, and handle the
    line ends etc. yourself. ( then of course you'll have to deal with the
    different line end types!)
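
A small sketch of the difference between the two kinds of mode, assuming Python 3 with its default newline handling and a hypothetical scratch file. Binary mode returns the bytes untouched; text mode translates line ends on reading.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")

with open(path, "wb") as f:
    f.write(b"a\r\nb\n")    # mixed line ends, written exactly as given

with open(path, "rb") as f:
    raw = f.read()          # binary mode: bytes come back unchanged

with open(path, "r", encoding="ascii") as f:
    text = f.read()         # text mode: "\r\n" is translated to "\n"
```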

    Bit surprised that the file object doesn't do ANY access control, multiple
    file objects on the same actual file can ALL write to it!! and other software
    can edit files opened for writing by the file object. However a write lock on
    a file made by other software causes an 'IOError: [Errno 13] Permission denied'
    when opened by python with write access. i guess you need
    os.access to test file locks and os.chmod to change the file locks, but i
    haven't gone into this, shame that there doesn't appear to be a nice simple
    file object subclass that does all this! Writes to the file object actually
    get done when flush() ( or seek() ) is called.
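
The buffering can be observed with a sketch like the following, assuming Python 3 with default buffering, a write much smaller than the buffer, and a hypothetical scratch file. The size on disk only changes once flush() is called.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "buffered.bin")

with open(path, "wb") as f:
    f.write(b"hello")                           # held in the buffer
    size_before_flush = os.path.getsize(path)   # nothing on disk yet
    f.flush()                                   # push the buffer to the OS
    size_after_flush = os.path.getsize(path)
```

Closing the file also flushes it, so the explicit flush() matters mainly when another reader needs to see the data while the file is still open.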

    suffice to say, i wasn't entirely impressed with the python file object, then
    i remembered the cross platform problems it's dealing with and all
    the code that works ok with it, and thought i'd knock up this post of my
    findings to try to elicit some discussion / get it improved / stop others
    making mistakes.
     
    simon place, Aug 26, 2003
    #1

  2. Jeff Epler

    Jeff Epler Guest

    Here's what I get on my system:
    >>> f = file("xyzzy", "w")
    >>> f.read()

    Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    IOError: [Errno 9] Bad file descriptor
    >>> f.write(' ')
    >>> f.read()

    Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    IOError: [Errno 9] Bad file descriptor

    Python relies fairly directly on the C standard library for correct
    behavior when it comes to file objects. I suspect that the following C
    program will also "succeed" on your system:

    #include <stdio.h>
    int main(void) {
        FILE *f = fopen("xyzzy", "w");
        char buf[2];
        char *res;
        fputs(" ", f);
        res = fgets(buf, 2, f);
        if(!res) {
            perror("fgets");
            return 1;
        }
        return 0;
    }

    On my system, it does give an error:
    $ gcc simon.c
    $ ./a.out
    fgets: Bad file descriptor
    $ echo $?
    1
    If the C program prints an error message like above, but Python does not
    raise an exception on the mentioned code, then there's a Python bug.
    Otherwise, if the C program executes on your system without printing an
    error and returns the 0 (success) exit code, then the problem is the
    poor quality of your platform's stdio implementation.

    Jeff
    PS relevant text from the fgets manpage on my system:
    gets() and fgets() return s on success, and NULL on error or when
    end of file occurs while no characters have been read.
    and from fopen:
    w Truncate file to zero length or create text file for writing.
    The stream is positioned at the beginning of the file.
     
    Jeff Epler, Aug 26, 2003
    #2

  3. On Tue, 26 Aug 2003 21:11:52 +0300, rumours say that Christos "TZOTZIOY"
    Georgiou <> might have written:

    >I will open a bug report if none other does,
    >but first I would like to know if it's the Windows stdio to blame or
    >not.


    I didn't wait that long, it's bug 795550 in SF.
    --
    TZOTZIOY, I speak England very best,
    Microsoft Security Alert: the Matrix began as open source.
     
    Christos TZOTZIOY Georgiou, Aug 26, 2003
    #3
  4. simon place <> writes:

    > is the code below meant to produce rubbish?


    Python uses C's stdio. According to the C standard:

    >, i had expected an exception.
    >
    > f=file('readme.txt','w')
    > f.write(' ')
    > f.read()


    engages in undefined behaviour (i.e. is perfectly entitled to make
    demons fly out of your nose). You can apparently trigger hair-raising
    crashes on Win98 by playing along these lines. There's not a lot that
    Python can do about this except include its own implementation of a
    stdio-a-like, and indeed some future version of Python may do just
    this.

    Cheers,
    mwh

    --
    <arigo> something happens, what I'm not exactly sure.
    -- PyPy debugging fun
     
    Michael Hudson, Aug 26, 2003
    #4
  5. On Tue, 26 Aug 2003 18:37:03 GMT, rumours say that Michael Hudson
    <> might have written:

    >>, i had expected an exception.
    >>
    >> f=file('readme.txt','w')
    >> f.write(' ')
    >> f.read()

    >
    >engages in undefined behaviour (i.e. is perfectly entitled to make
    >demons fly out of your nose).


    OK, then, let's close the 795550 bug (I saw your reply there after
    posting the second comment).
    --
    TZOTZIOY, I speak England very best,
    Microsoft Security Alert: the Matrix began as open source.
     
    Christos TZOTZIOY Georgiou, Aug 26, 2003
    #5
  6. Jeff Epler <> writes:

    > On Tue, Aug 26, 2003 at 06:37:03PM +0000, Michael Hudson wrote:
    > > simon place <> writes:
    > >
    > > > is the code below meant to produce rubbish?

    > >
    > > Python uses C's stdio. According to the C standard:
    > >
    > > >, i had expected an exception.
    > > >
    > > > f=file('readme.txt','w')
    > > > f.write(' ')
    > > > f.read()

    > >
    > > engages in undefined behaviour (i.e. is perfectly entitled to make
    > > demons fly out of your nose). You can apparently trigger hair-raising
    > > crashes on Win98 by playing along these lines. There's not a lot that
    > > Python can do about this except include its own implementation of a
    > > stdio-a-like, and indeed some future version of Python may do just
    > > this.

    >
    > If it's true that stdio doesn't guarantee an error return from fwrite() on
    > a file opened for reading, then the Python documentation should be
    > changed (it claims an exception is raised, but this depends on the
    > return value being different from the number of items written
    > (presumably 0))


    I may be getting confused. The undefined behaviour I was on about was
    interleaving reads & writes without an intervening seek.

    > It's my feeling that this is intended to be an error condition, not
    > undefined behavior. But I can't prove it. Here are some relevant pages
    > from the SUS spec, which intends to follow ISO C:
    > http://www.opengroup.org/onlinepubs/007904975/functions/fopen.html
    > http://www.opengroup.org/onlinepubs/007904975/functions/fwrite.html


    The EBADF error seems to be marked as an extension to ISO C, but I
    don't know what that signifies.

    > Hm, and there's a bug even on Linux:
    > >>> f = open("/dev/null", "r")
    > >>> f.write("") # should cause exception (?)
    > >>> # nope, it doesn't


    That might well not even call a C library routine at all (I don't
    know).

    Cheers,
    mwh

    --
    59. In English every word can be verbed. Would that it were so in
    our programming languages.
    -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html
     
    Michael Hudson, Aug 27, 2003
    #6
