[RELEASED] Python 3.1 final

Discussion in 'Python' started by Benjamin Peterson, Jun 27, 2009.

  1. On behalf of the Python development team, I'm thrilled to announce the first
    production release of Python 3.1.

    Python 3.1 focuses on the stabilization and optimization of the features and
    changes that Python 3.0 introduced. For example, the new I/O system has been
    rewritten in C for speed. File system APIs that use unicode strings now handle
    paths with undecodable bytes in them. Other features include an ordered
    dictionary implementation, a condensed syntax for nested with statements, and
    support for ttk Tile in Tkinter. For a more extensive list of changes in 3.1,
    see http://doc.python.org/3.1/whatsnew/3.1.html or Misc/NEWS in the Python
    distribution.

    To download Python 3.1 visit:

    http://www.python.org/download/releases/3.1/

    The 3.1 documentation can be found at:

    http://docs.python.org/3.1

    Bugs can always be reported to:

    http://bugs.python.org


    Enjoy!

    --
    Benjamin Peterson
    Release Manager
    benjamin at python.org
    (on behalf of the entire python-dev team and 3.1's contributors)
    Benjamin Peterson, Jun 27, 2009
    #1

  2. Nobody Guest

    On Sat, 27 Jun 2009 16:12:10 -0500, Benjamin Peterson wrote:

    > Python 3.1 focuses on the stabilization and optimization of the features and
    > changes that Python 3.0 introduced. For example, the new I/O system has been
    > rewritten in C for speed. File system APIs that use unicode strings now
    > handle paths with undecodable bytes in them.


    That's a significant improvement. It still decodes os.environ and sys.argv
    before you have a chance to call sys.setfilesystemencoding(), but it
    appears to be recoverable (with some effort; I can't find any way to re-do
    the encoding without manually replacing the surrogates).

    However, sys.std{in,out,err} are still created as text streams, and AFAICT
    there's nothing you can do about this from within your code.

    All in all, Python 3.x still has a long way to go before it will be
    suitable for real-world use.
    Nobody, Jun 28, 2009
    #2

  3. > That's a significant improvement. It still decodes os.environ and sys.argv
    > before you have a chance to call sys.setfilesystemencoding(), but it
    > appears to be recoverable (with some effort; I can't find any way to re-do
    > the encoding without manually replacing the surrogates).


    See PEP 383.

    > However, sys.std{in,out,err} are still created as text streams, and AFAICT
    > there's nothing you can do about this from within your code.


    That's intentional, and not going to change. You can access the
    underlying byte streams if you want to, as you could already in 3.0.

    Regards,
    Martin

    P.S. Please identify yourself on this newsgroup.
    Martin v. Löwis, Jun 28, 2009
    #3
  4. Nobody <nobody <at> nowhere.com> writes:
    > All in all, Python 3.x still has a long way to go before it will be
    > suitable for real-world use.


    Such as?
    Benjamin Peterson, Jun 28, 2009
    #4
  5. Paul Moore Guest

    2009/6/28 "Martin v. Löwis" <>:
    >> However, sys.std{in,out,err} are still created as text streams, and AFAICT
    >> there's nothing you can do about this from within your code.

    >
    > That's intentional, and not going to change. You can access the
    > underlying byte streams if you want to, as you could already in 3.0.


    I had a quick look at the documentation, and couldn't see how to do
    this. It's the first time I'd read the new IO module documentation, so
    I probably missed something obvious. Could you explain how I get the
    byte stream underlying sys.stdin? (That should give me enough to find
    what I was misunderstanding in the docs).

    Thanks,
    Paul.
    Paul Moore, Jun 28, 2009
    #5
  6. >>>>> Paul Moore <> (PM) wrote:

    >PM> 2009/6/28 "Martin v. Löwis" <>:
    >>>> However, sys.std{in,out,err} are still created as text streams, and AFAICT
    >>>> there's nothing you can do about this from within your code.
    >>>
    >>> That's intentional, and not going to change. You can access the
    >>> underlying byte streams if you want to, as you could already in 3.0.


    >PM> I had a quick look at the documentation, and couldn't see how to do
    >PM> this. It's the first time I'd read the new IO module documentation, so
    >PM> I probably missed something obvious. Could you explain how I get the
    >PM> byte stream underlying sys.stdin? (That should give me enough to find
    >PM> what I was misunderstanding in the docs).


    http://docs.python.org/3.1/library/sys.html#sys.stdin
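
    For illustration, a minimal sketch of what that page describes: the text streams in sys expose the binary stream they wrap through their buffer attribute. The helper name read_all_bytes is made up here, and an in-memory stream stands in for the real sys.stdin so the sketch runs without piped input.

```python
import io
import sys


def read_all_bytes(text_stream=None):
    """Read raw bytes from the binary layer beneath a text stream.

    read_all_bytes is a hypothetical helper; it defaults to sys.stdin.
    """
    if text_stream is None:
        text_stream = sys.stdin
    # .buffer is the binary stream that the text layer wraps
    return text_stream.buffer.read()


# Stand-in for sys.stdin: a text wrapper over bytes that are not valid UTF-8
fake_stdin = io.TextIOWrapper(io.BytesIO(b"caf\xe9\n"), encoding="utf-8")
data = read_all_bytes(fake_stdin)
```

    Because the bytes are read below the text layer, no decoding is attempted and the invalid byte comes back untouched. Writing works the same way through sys.stdout.buffer.write().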
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
    Piet van Oostrum, Jun 28, 2009
    #6
  7. Nobody Guest

    On Sun, 28 Jun 2009 15:22:15 +0000, Benjamin Peterson wrote:

    > Nobody <nobody <at> nowhere.com> writes:
    >> All in all, Python 3.x still has a long way to go before it will be
    >> suitable for real-world use.

    >
    > Such as?


    Such as not trying to shoe-horn every byte string it encounters into
    Unicode. Some of them really are *just* byte strings.
    Nobody, Jun 28, 2009
    #7
  8. Nobody <nobody <at> nowhere.com> writes:
    >
    > Such as not trying to shoe-horn every byte string it encounters into
    > Unicode. Some of them really are *just* byte strings.



    You're certainly allowed to convert them back to byte strings if you want.
    Benjamin Peterson, Jun 28, 2009
    #8
  9. Terry Reedy Guest

    Nobody wrote:
    > On Sun, 28 Jun 2009 15:22:15 +0000, Benjamin Peterson wrote:
    >
    >> Nobody <nobody <at> nowhere.com> writes:
    >>> All in all, Python 3.x still has a long way to go before it will be
    >>> suitable for real-world use.

    >> Such as?

    >
    > Such as not trying to shoe-horn every byte string it encounters into
    > Unicode. Some of them really are *just* byte strings.


    Let's ignore the disinformation. So false it is hardly worth refuting.
    Terry Reedy, Jun 28, 2009
    #9
  10. Paul Moore <p.f.moore <at> gmail.com> writes:

    > The "buffer" attribute doesn't seem to be documented in the docs for
    > the io module. I'm guessing that the TextIOBase class should have a
    > note that you get at the buffer through the "buffer" attribute?



    Good point. I've now documented it, and the "raw" attribute of BufferedIOBase.
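
    A short sketch of how the two attributes relate, assuming the usual three-layer stack from the io module. The in-memory BytesIO object is only a stand-in for a real raw file; the variable names are illustrative.

```python
import io

raw = io.BytesIO(b"hello\n")                          # stand-in for the raw layer
buffered = io.BufferedReader(raw)                     # buffered binary layer
text = io.TextIOWrapper(buffered, encoding="ascii")   # text layer

# Each layer exposes the one beneath it
assert text.buffer is buffered
assert buffered.raw is raw

first_line = text.readline()
```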
    Benjamin Peterson, Jun 28, 2009
    #10
  11. Aahz Guest

    In article <>,
    Benjamin Peterson <> wrote:
    >Nobody <nobody <at> nowhere.com> writes:
    >>
    >> Such as not trying to shoe-horn every byte string it encounters into
    >> Unicode. Some of them really are *just* byte strings.

    >
    >You're certainly allowed to convert them back to byte strings if you want.


    Yes, but do you get back the original byte strings? Maybe I'm missing
    something, but my impression is that this is still an issue for the email
    module as well as command-line arguments and environment variables.
    --
    Aahz () <*> http://www.pythoncraft.com/

    "as long as we like the same operating system, things are cool." --piranha
    Aahz, Jun 28, 2009
    #11
  12. Aahz <aahz <at> pythoncraft.com> writes:
    > Yes, but do you get back the original byte strings? Maybe I'm missing
    > something, but my impression is that this is still an issue for the email
    > module as well as command-line arguments and environment variables.


    The email module is, yes, broken. You can recover the bytestrings of
    command-line arguments and environment variables.
    Benjamin Peterson, Jun 28, 2009
    #12
  13. Nobody Guest

    On Sun, 28 Jun 2009 19:21:49 +0000, Benjamin Peterson wrote:

    >> Yes, but do you get back the original byte strings? Maybe I'm missing
    >> something, but my impression is that this is still an issue for the email
    >> module as well as command-line arguments and environment variables.

    >
    > The email module is, yes, broken. You can recover the bytestrings of
    > command-line arguments and environment variables.


    1. Does Python offer any assistance in doing so, or do you have to
    manually convert the surrogates which are generated for unrecognised bytes?

    2. How do you do this for non-invertible encodings (e.g. ISO-2022)?

    Most of the issues can be worked around by calling
    sys.setfilesystemencoding('iso-8859-1') at the start of the program, but
    sys.argv and os.environ have already been converted by this point.
    Nobody, Jun 28, 2009
    #13
  14. Nobody Guest

    On Sun, 28 Jun 2009 13:31:50 -0400, Terry Reedy wrote:

    >>> Nobody <nobody <at> nowhere.com> writes:
    >>>> All in all, Python 3.x still has a long way to go before it will be
    >>>> suitable for real-world use.
    >>> Such as?

    >>
    >> Such as not trying to shoe-horn every byte string it encounters into
    >> Unicode. Some of them really are *just* byte strings.

    >
    > Let's ignore the disinformation.


    Translation: let's ignore anything which falsifies the assumptions.

    > So false it is hardly worth refuting.


    Your copy of Trolling by Numbers must be getting pretty dog-eared by now.
    Nobody, Jun 28, 2009
    #14
  15. Nobody <nobody <at> nowhere.com> writes:

    >
    > On Sun, 28 Jun 2009 19:21:49 +0000, Benjamin Peterson wrote:
    >
    > >> Yes, but do you get back the original byte strings? Maybe I'm missing
    > >> something, but my impression is that this is still an issue for the email
    > >> module as well as command-line arguments and environment variables.

    > >
    > > The email module is, yes, broken. You can recover the bytestrings of
    > > command-line arguments and environment variables.

    >
    > 1. Does Python offer any assistance in doing so, or do you have to
    > manually convert the surrogates which are generated for unrecognised bytes?


    import sys

    fs_encoding = sys.getfilesystemencoding()
    bytes_argv = [arg.encode(fs_encoding, "surrogateescape") for arg in sys.argv]
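
    To see why this recovers the original bytes, here is a small round-trip sketch. It fixes the encoding to UTF-8 rather than asking the interpreter, and recover_bytes is a made-up helper name, so treat it as an illustration of the error handler and not of what any particular system does at startup.

```python
def recover_bytes(args, encoding="utf-8"):
    """Re-encode decoded arguments, turning escaped surrogates back into bytes.

    recover_bytes is a hypothetical helper mirroring the list comprehension above.
    """
    return [arg.encode(encoding, "surrogateescape") for arg in args]


raw = b"caf\xe9"                                 # 0xE9 alone is not valid UTF-8
text = raw.decode("utf-8", "surrogateescape")    # undecodable byte becomes U+DCE9
recovered = recover_bytes([text])
```

    The undecodable byte 0xE9 is carried through the str as the lone surrogate U+DCE9 and mapped back to the same byte on encoding.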

    >
    > 2. How do you do this for non-invertible encodings (e.g. ISO-2022)?


    What's a non-invertible encoding? I can't find a reference to the term.
    Benjamin Peterson, Jun 28, 2009
    #15
  16. Benjamin Peterson writes:
    >Nobody <nobody <at> nowhere.com> writes:
    >> On Sun, 28 Jun 2009 19:21:49 +0000, Benjamin Peterson wrote:
    >> 1. Does Python offer any assistance in doing so, or do you have to
    >> manually convert the surrogates which are generated for unrecognised bytes?

    >
    > fs_encoding = sys.getfilesystemencoding()
    > bytes_argv = [arg.encode(fs_encoding, "surrogateescape") for arg in sys.argv]
    >
    >> 2. How do you do this for non-invertible encodings (e.g. ISO-2022)?

    >
    > What's a non-invertible encoding? I can't find a reference to the term.


    Different ISO-2022 strings can map to the same Unicode string.
    Thus you can convert back to _some_ ISO-2022 string, but it won't
    necessarily match the original.
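
    A small sketch of that point, using Python's iso2022_jp codec as one concrete ISO-2022 variant (the choice of codec and the sample bytes are assumptions for illustration). An escape sequence that selects ASCII while ASCII is already active changes the byte string but not the decoded text.

```python
plain = b"abc"
redundant = b"\x1b(Babc"   # ESC ( B selects ASCII, which is already active

text_a = plain.decode("iso2022_jp")
text_b = redundant.decode("iso2022_jp")

# Both byte strings decode to the same text, so encoding cannot
# know which of the two it should reproduce.
reencoded = text_b.encode("iso2022_jp")
```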

    --
    Hallvard
    Hallvard B Furuseth, Jun 28, 2009
    #16
  17. > 2. How do you do this for non-invertible encodings (e.g. ISO-2022)?

    ISO-2022 cannot be used as a system encoding.

    Please do read the responses I write, and please do identify yourself.

    Regards,
    Martin
    Martin v. Löwis, Jun 28, 2009
    #17
  18. Scott David Daniels wrote:
    > Nobody wrote:
    >> On Sat, 27 Jun 2009 16:12:10 -0500, Benjamin Peterson wrote:
    >> <announcement of 3.1>
    >>
    >> That's a significant improvement....
    >> All in all, Python 3.x still has a long way to go before it will be
    >> suitable for real-world use.

    >
    > Fortunately, I have assiduously avoided the real word, and am happy to
    > embrace the world from our 'bot overlords.
    >
    > Congratulations on another release from the hydra-like world of
    > multi-head development.


    +1 QOTW

    -- Gerhard
    Gerhard Häring, Jun 28, 2009
    #18
  19. Nobody Guest

    On Sun, 28 Jun 2009 21:25:13 +0000, Benjamin Peterson wrote:

    >> > The email module is, yes, broken. You can recover the bytestrings of
    >> > command-line arguments and environment variables.

    >>
    >> 1. Does Python offer any assistance in doing so, or do you have to
    >> manually convert the surrogates which are generated for unrecognised bytes?

    >
    > fs_encoding = sys.getfilesystemencoding()
    > bytes_argv = [arg.encode(fs_encoding, "surrogateescape") for arg in sys.argv]


    This results in an internal error:

    >>> "\udce4\udceb\udcef\udcf6\udcfc".encode("iso-8859-1", "surrogateescape")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    SystemError: Objects/bytesobject.c:3182: bad argument to internal function

    [FWIW, the error corresponds to _PyBytes_Resize, which has a
    cautionary comment almost as large as the code.]

    The documentation gives the impression that "surrogateescape" is only
    meaningful for decoding.

    >> 2. How do you do this for non-invertible encodings (e.g. ISO-2022)?

    >
    > What's a non-invertible encoding? I can't find a reference to the term.


    One where different inputs can produce the same output.
    Nobody, Jun 29, 2009
    #19
  20. Nobody Guest

    On Sun, 28 Jun 2009 14:36:37 +0200, Martin v. Löwis wrote:

    >> That's a significant improvement. It still decodes os.environ and sys.argv
    >> before you have a chance to call sys.setfilesystemencoding(), but it
    >> appears to be recoverable (with some effort; I can't find any way to re-do
    >> the encoding without manually replacing the surrogates).

    >
    > See PEP 383.


    Okay, that's useful, except that it may have some bugs:

    >>> r = "\udce4\udceb\udcef\udcf6\udcfc".encode("iso-8859-1", "surrogateescape")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    SystemError: Objects/bytesobject.c:3182: bad argument to internal function

    Trying a few random test cases suggests that the ratio of valid to invalid
    bytes has an effect. Strings which consist mostly of invalid bytes trigger
    the error, those which are mostly valid don't.
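
    A sketch of the kind of test cases meant here, written as a check that should pass on an interpreter where this SystemError does not occur; on the 3.1 build shown above, the first sample is the one reported to fail. The helper name escaped and the sample byte strings are made up for illustration.

```python
def escaped(raw):
    """Decode as ASCII, mapping each byte >= 0x80 to a lone surrogate."""
    return raw.decode("ascii", "surrogateescape")


samples = [
    b"\xe4\xeb\xef\xf6\xfc",   # all invalid bytes (the failing case above)
    b"\xe4\xebab\xef",         # mostly invalid
    b"abcdef\xe4",             # mostly valid
]

# Each sample should survive decode-then-encode unchanged
results = [
    escaped(raw).encode("iso-8859-1", "surrogateescape") == raw
    for raw in samples
]
```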

    The error corresponds to _PyBytes_Resize(), which has the following
    words of caution in a preceding comment:

    /* The following function breaks the notion that strings are immutable:
       it changes the size of a string. We get away with this only if there
       is only one module referencing the object. You can also think of it
       as creating a new string object and destroying the old one, only
       more efficiently. In any case, don't use this if the string may
       already be known to some other part of the code...
       Note that if there's not enough memory to resize the string, the original
       string object at *pv is deallocated, *pv is set to NULL, an "out of
       memory" exception is set, and -1 is returned. Else (on success) 0 is
       returned, and the value in *pv may or may not be the same as on input.
       As always, an extra byte is allocated for a trailing \0 byte (newsize
       does *not* include that), and a trailing \0 byte is stored.
    */

    Assuming that this gets fixed, it should make most of the problems with
    3.0 solvable. OTOH, it wouldn't have killed them to have added e.g.
    sys.argv_bytes and os.environ_bytes.
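
    For comparison, a sketch of how such bytes views can be built by hand. It relies on os.fsencode() and os.fsdecode(), which were added in Python 3.2 and so are not available in the 3.1 release discussed here; Python 3.2 also added os.environb on platforms where the environment is natively bytes. The helper names argv_as_bytes and environ_as_bytes are hypothetical, and the round trip shown assumes a POSIX system.

```python
import os


def argv_as_bytes(argv):
    """Return the arguments as bytes (hypothetical stand-in for sys.argv_bytes)."""
    return [os.fsencode(arg) for arg in argv]


def environ_as_bytes(environ=None):
    """Return a bytes copy of an environment mapping (hypothetical helper)."""
    if environ is None:
        environ = os.environ
    return {os.fsencode(k): os.fsencode(v) for k, v in environ.items()}


# Simulate an argument that arrived as bytes the interpreter had to decode
decoded = os.fsdecode(b"caf\xe9")
recovered = argv_as_bytes(["prog", decoded])
```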

    >> However, sys.std{in,out,err} are still created as text streams, and AFAICT
    >> there's nothing you can do about this from within your code.

    >
    > That's intentional, and not going to change. You can access the
    > underlying byte streams if you want to, as you could already in 3.0.


    Okay, I've since been pointed to the relevant information (I was looking
    under "File Objects"; I didn't think to look at "sys").
    Nobody, Jun 29, 2009
    #20