"env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

Discussion in 'Python' started by John Nagle, Jan 15, 2008.

  1. John Nagle

    John Nagle Guest

    I passed a dict for the "env" variable to Popen with Unicode strings
    for the dictionary values.

    Got:

    File "D:\Python24\lib\subprocess.py", line 706, in _execute_child
    TypeError: environment can only contain strings

    It turns out that the strings in the "env" parameter have to be ASCII,
    not Unicode, even though Windows fully supports Unicode in CreateProcess.

    John Nagle
     
    John Nagle, Jan 15, 2008
    #1

  2. Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    John Nagle wrote:

    > It turns out that the strings in the "env" parameter have to be
    > ASCII, not Unicode, even though Windows fully supports Unicode in
    > CreateProcess.


    Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
    something like u"thestring".encode("utf16") will help.

    Regards,


    Björn

    --
    BOFH excuse #31:

    cellular telephone interference
     
    Bjoern Schliessmann, Jan 15, 2008
    #2
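A quick check of what that suggestion would actually produce, as a rough sketch only. The generic "utf16" codec prepends a byte-order mark and uses two bytes per character for this text, so even plain ASCII ends up containing NUL bytes. That matters because entries in a byte-string environment block are NUL-terminated. The variable names are made up for illustration.

```python
import codecs

text = u"thestring"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf16")         # BOM followed by native byte order
utf16le_bytes = text.encode("utf-16-le")   # no BOM, little-endian

# The generic "utf16" codec starts its output with a byte-order mark.
has_bom = utf16_bytes[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# Every ASCII character becomes that byte plus a zero byte in UTF-16.
contains_nul = b"\x00" in utf16le_bytes

print(len(utf8_bytes), len(utf16le_bytes), len(utf16_bytes), has_bom, contains_nul)
```

So a UTF-16 byte string is unlikely to survive a trip through a NUL-terminated ANSI environment, whatever else happens.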

  3. Benjamin

    Benjamin Guest

    Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    On Jan 14, 6:26 pm, Bjoern Schliessmann <usenet-
    > wrote:
    > John Nagle wrote:
    > > It turns out that the strings in the "env" parameter have to be
    > > ASCII, not Unicode, even though Windows fully supports Unicode in
    > > CreateProcess.

    >
    > Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
    > something like u"thestring".encode("utf16") will help.

    Otherwise: bugs.python.org
    >
    > Regards,
    >
    > Björn
    >
    > --
    > BOFH excuse #31:
    >
    > cellular telephone interference
     
    Benjamin, Jan 15, 2008
    #3
  4. Benjamin

    Benjamin Guest

    Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    On Jan 14, 6:26 pm, John Nagle <> wrote:
    > I passed a dict for the "env" variable to Popen with Unicode strings
    > for the dictionary values.
    >
    > Got:
    >
    > File "D:\Python24\lib\subprocess.py", line 706, in _execute_child
    > TypeError: environment can only contain strings
    >
    > It turns out that the strings in the "env" parameter have to be ASCII,
    > not Unicode, even though Windows fully supports Unicode in CreateProcess.


    >
    > John Nagle
     
    Benjamin, Jan 15, 2008
    #4
  5. John Nagle

    John Nagle Guest

    Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    Benjamin wrote:
    > On Jan 14, 6:26 pm, Bjoern Schliessmann <usenet-
    > > wrote:
    >> John Nagle wrote:
    >>> It turns out that the strings in the "env" parameter have to be
    >>> ASCII, not Unicode, even though Windows fully supports Unicode in
    >>> CreateProcess.

    >> Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
    >> something like u"thestring".encode("utf16") will help.

    > Otherwise: bugs.python.org


    Whatever translation is necessary should be done in "popen", which
    has cases for Windows and POSIX. "popen" is supposed to be cross-platform
    to the extent possible. I think it's just something that didn't get fixed
    when Unicode support went in.

    John Nagle
     
    John Nagle, Jan 15, 2008
    #5
  6. Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    John Nagle wrote:

    > Benjamin wrote:
    >> On Jan 14, 6:26 pm, Bjoern Schliessmann <usenet-
    >> > wrote:
    >>> John Nagle wrote:
    >>>> It turns out that the strings in the "env" parameter have to be
    >>>> ASCII, not Unicode, even though Windows fully supports Unicode in
    >>>> CreateProcess.


    That's of course nonsense, they don't need to be ascii, they need to be
    byte-strings in whatever encoding you like.

    >>> Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
    >>> something like u"thestring".encode("utf16") will help.

    >> Otherwise: bugs.python.org


    John's understanding of the differences between unicode and its encodings
    is a bit blurry, to say the least.

    > Whatever translation is necessary should be done in "popen", which
    > has cases for Windows and POSIX. "popen" is supposed to be cross-platform
    > to the extent possible. I think it's just something that didn't get fixed
    > when Unicode support went in.


    Sure thing, python will just magically convert unicode to the encoding the
    program YOU invoke will expect. Right after we introduced the

    solve_my_problem()

    built-in-function. Any other wishes?

    If I write this simple program

    ------ test.py -------
    import os
    import sys

    ENCODINGS = ['utf-8', 'latin1']

    # Interpret the raw environment bytes with the encoding chosen on the command line.
    os.environ["MY_VARIABLE"].decode(ENCODINGS[int(sys.argv[1])])
    ------ test.py -------


    how's python supposed to know that

    subprocess.call("python test.py 0", env=dict(MY_VARIABLE=u'foo'))

    needs to be UTF-8?

    Diez
     
    Diez B. Roggisch, Jan 15, 2008
    #6
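To make the point above concrete, here is a small sketch (not from the thread, names are illustrative) showing that the same text becomes different bytes under the two encodings in the example. A child process that guesses the wrong one gets either garbage or a decoding error.

```python
value = u"caf\xe9"  # "cafe" with an acute accent, escaped to keep the source ASCII

as_utf8 = value.encode("utf-8")
as_latin1 = value.encode("latin1")

# Same text, different byte strings.
same_bytes = as_utf8 == as_latin1

# A child that decodes UTF-8 bytes as Latin-1 silently gets mojibake ...
wrong_guess = as_utf8.decode("latin1")

# ... and one that decodes Latin-1 bytes as UTF-8 fails outright.
try:
    as_latin1.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(repr(as_utf8), repr(as_latin1), repr(wrong_guess), decode_failed)
```

For pure ASCII values such as u'foo' the two encodings happen to agree, which is why the problem often goes unnoticed.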
  7. Brian Smith

    Brian Smith Guest

    RE: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    Diez B. Roggisch wrote:
    > Sure thing, python will just magically convert unicode to the
    > encoding the program YOU invoke will expect. Right after we
    > introduced the
    >
    > solve_my_problem()
    >
    > built-in-function. Any other wishes?


    There's no reason to be rude.

    Anyway, at least on Windows it makes perfect sense for people to expect
    Unicode to be handled automatically. popen() knows that it is running on
    Windows, and it knows what encoding Windows needs for its environment
    (it's either UCS2 or UTF-16 for most Windows APIs). At least when it
    receives a unicode string, it has enough information to apply the
    conversion automatically, and doing so saves the caller from having to
    figure out what exact encoding is to be used.

    - Brian
     
    Brian Smith, Jan 15, 2008
    #7
  8. RE: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    Brian Smith wrote:

    > Diez B. Roggisch wrote:
    >> Sure thing, python will just magically convert unicode to the
    >> encoding the program YOU invoke will expect. Right after we
    >> introduced the
    >>
    >> solve_my_problem()
    >>
    >> built-in-function. Any other wishes?

    >
    > There's no reason to be rude.


    If you knew John, you'd know there is.

    > Anyway, at least on Windows it makes perfect sense for people to expect
    > Unicode to be handled automatically. popen() knows that it is running on
    > Windows, and it knows what encoding Windows needs for its environment
    > (it's either UCS2 or UTF-16 for most Windows APIs). At least when it
    > receives a unicode string, it has enough information to apply the
    > conversion automatically, and doing so saves the caller from having to
    > figure out what exact encoding is to be used.



    For one thing, the distinction between Windows and other platforms is
    debatable. I admit that subprocess already contains quite a few
    platform-specific aspects, but its purpose is to abstract these away as
    much as possible.

    However, I'm not sure that the mere availability of wide-char Windows APIs
    means that using UCS2/UTF-16 would succeed. A look into the Python sources
    (PC/_subprocess.c) reveals that someone already thought about this; setting
    CREATE_UNICODE_ENVIRONMENT in the CreateProcess call would have been easy
    enough to do if there were no trouble to expect.

    Additionally, passing unicode to env would also imply that os.environ should
    yield unicode as well. Not sure how much code _that_ breaks.

    Diez
     
    Diez B. Roggisch, Jan 15, 2008
    #8
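For reference, a rough sketch of the conversion being debated here: turning a dict of unicode strings into the wide-character environment block that CreateProcess takes when CREATE_UNICODE_ENVIRONMENT is set. The helper name and the sort order are assumptions made for illustration. This is not what subprocess or PC/_subprocess.c does, and it only builds the bytes without calling any Windows API.

```python
def build_unicode_env_block(env):
    """Return a UTF-16-LE environment block for a dict of text strings.

    Hypothetical helper, not part of subprocess. It only shows the layout:
    NAME=value entries, each NUL-terminated, followed by one final NUL.
    """
    entries = []
    # Sorting by upper-cased name is an assumption about the expected order.
    for name in sorted(env, key=lambda n: n.upper()):
        value = env[name]
        if u"\x00" in name or u"\x00" in value:
            raise ValueError("NUL character in environment entry")
        entries.append(name + u"=" + value + u"\x00")
    block = u"".join(entries) + u"\x00"
    return block.encode("utf-16-le")


# Value of the CreateProcess creation flag mentioned above.
CREATE_UNICODE_ENVIRONMENT = 0x00000400

block = build_unicode_env_block({u"MY_VARIABLE": u"foo", u"A": u"1"})
print(len(block))
```

Note that this only settles how the parent hands the strings to Windows. How the child program reads them back is a separate question.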
  9. RE: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    Brian Smith wrote:
    > popen() knows that it is running on Windows, and it knows what
    > encoding Windows needs for its environment (it's either UCS2 or
    > UTF-16 for most Windows APIs). At least when it receives a unicode
    > string, it has enough information to apply the conversion
    > automatically, and doing so saves the caller from having to figure
    > out what exact encoding is to be used.


    So you propose Python should employ a hidden automatism that
    automagically guesses the right encoding? Why not keep it
    explicit and consistent and let the user decide? What will happen
    if a future Windows changes its encoding? Will we need another
    magic routine to tell it apart?

    Regards,


    Björn

    --
    BOFH excuse #353:

    Second-system effect.
     
    Bjoern Schliessmann, Jan 15, 2008
    #9
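The explicit approach argued for above might look roughly like this: the caller names the encoding and converts the dict before handing it to Popen. The helper encode_env is hypothetical, not something subprocess provides, and the right encoding still depends on what the child program expects.

```python
text_type = type(u"")  # unicode on Python 2, str on Python 3


def encode_env(env, encoding):
    """Return a copy of env with every text key and value encoded to bytes.

    Hypothetical helper: the caller states the encoding explicitly.
    Entries that are already byte strings are passed through unchanged.
    """
    encoded = {}
    for name, value in env.items():
        if isinstance(name, text_type):
            name = name.encode(encoding)
        if isinstance(value, text_type):
            value = value.encode(encoding)
        encoded[name] = value
    return encoded


child_env = encode_env({u"MY_VARIABLE": u"caf\xe9"}, "utf-8")
print(child_env)
```

On the Python 2 versions discussed in this thread, the resulting dict of byte strings is the kind of value the env parameter accepts, so it could be passed as env=child_env.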
  10. John Nagle

    John Nagle Guest

    Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    Diez B. Roggisch wrote:
    > John Nagle wrote:
    >
    >> Benjamin wrote:
    >>> On Jan 14, 6:26 pm, Bjoern Schliessmann <usenet-
    >>> > wrote:
    >>>> John Nagle wrote:
    >>>>> It turns out that the strings in the "env" parameter have to be
    >>>>> ASCII, not Unicode, even though Windows fully supports Unicode in
    >>>>> CreateProcess.

    >
    > That's of course nonsense, they don't need to be ascii, they need to be
    > byte-strings in whatever encoding you like.
    >
    >>>> Are you sure it supports Unicode, not UTF8 or UTF16? Probably using
    >>>> something like u"thestring".encode("utf16") will help.
    >>> Otherwise: bugs.python.org

    >
    > John's understanding of the differences between unicode and its encodings
    > is a bit blurry, to say the least.


    Who's this guy?
    >
    >> Whatever translation is necessary should be done in "popen", which
    >> has cases for Windows and POSIX. "popen" is supposed to be cross-platform
    >> to the extent possible. I think it's just something that didn't get fixed
    >> when Unicode support went in.


    I've been looking at the source code. There's "_PyPopenCreateProcess"
    in "posixmodule.c". That one doesn't support passing an environment at
    all; see the call to Windows CreateProcess. Is that the one that Popen uses?

    Where is "win32process" in the source? It ought to be in Modules, but
    it's not.

    John Nagle
     
    John Nagle, Jan 15, 2008
    #10
  11. John Nagle

    John Nagle Guest

    Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    Diez B. Roggisch wrote:
    > Brian Smith wrote:
    >
    >> Diez B. Roggisch wrote:
    >>> Sure thing, python will just magically convert unicode to the
    >>> encoding the program YOU invoke will expect. Right after we
    >>> introduced the
    >>>
    >>> solve_my_problem()
    >>>
    >>> built-in-function. Any other wishes?

    >> There's no reason to be rude.

    >
    > If you knew John, you'd know there is.


    ?

    >> Anyway, at least on Windows it makes perfect sense for people to expect
    >> Unicode to be handled automatically. popen() knows that it is running on
    >> Windows, and it knows what encoding Windows needs for its environment
    >> (it's either UCS2 or UTF-16 for most Windows APIs). At least when it
    >> receives a unicode string, it has enough information to apply the
    >> conversion automatically, and doing so saves the caller from having to
    >> figure out what exact encoding is to be used.

    >
    >
    > For one thing, the distinction between Windows and other platforms is
    > debatable. I admit that subprocess already contains quite a few
    > platform-specific aspects, but its purpose is to abstract these away as
    > much as possible.
    >
    > However, I'm not sure that the mere availability of wide-char Windows APIs
    > means that using UCS2/UTF-16 would succeed. A look into the Python sources
    > (PC/_subprocess.c) reveals that someone already thought about this; setting
    > CREATE_UNICODE_ENVIRONMENT in the CreateProcess call would have been easy
    > enough to do if there were no trouble to expect.


    The problem is that only the NT-derived Microsoft systems talk Unicode.
    The DOS/Win16/Win9x family did not. But they did have CreateProcess.
    So the current code will handle Win9x, but not Unicode.

    When do we drop support for Win9x? It probably has to happen in
    Python 3K, since that's Unicode-everywhere.

    John Nagle
     
    John Nagle, Jan 15, 2008
    #11
  12. Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    John Nagle wrote:
    > The problem is that only the NT-derived Microsoft systems
    > talk Unicode. The DOS/Win16/Win9x family did not. But they did
    > have CreateProcess. So the current code will handle Win9x, but not
    > Unicode.


    Please explain, I don't understand. If you try using Windows system
    functions in older Windows versions, u"mystring" will fail, too.
    Those functions need byte strings, not Unicode string instances.
    The latter have to be encoded to byte strings to pass them.

    Regards,


    Björn

    --
    BOFH excuse #70:

    nesting roaches shorted out the ether cable
     
    Bjoern Schliessmann, Jan 15, 2008
    #12
  13. Re: "env" parameter to "popen" won't accept Unicode on Windows - minor Unicode bug

    On Tue, 15 Jan 2008 12:21:48 -0800, John Nagle <>
    declaimed the following in comp.lang.python:

    >
    > Where is "win32process" in the source? It ought to be in Modules, but
    > it's not.
    >

    Probably because it's part of the PythonWin (or whatever the name is
    this month) extension library, not part of base Python -- and is mostly
    a lot of compiled PYD files...
    --
    Wulfraed Dennis Lee Bieber KD6MOG

    HTTP://wlfraed.home.netcom.com/
    (Bestiaria Support Staff: )
    HTTP://www.bestiaria.com/
     
    Dennis Lee Bieber, Jan 16, 2008
    #13