strftime return value encoding (mbcs, locale, etc.)

Discussion in 'Python' started by Giovanni Bajo, Jan 27, 2008.

  1. Hello,

    I am trying to find a good way to portably get the output of strftime()
    and put it onto a dialog (I'm using PyQt, but it doesn't really matter).
    The problem is that I need to decode the byte stream returned by
    strftime() into Unicode.

    This old bug:
    http://mail.python.org/pipermail/python-bugs-list/2003-November/020983.html

    (last comment) mentions that the result is a "byte string in the locale's
    encoding".

    The comment also suggests using "mbcs" on Windows (which, AFAIK, is a
    sort of "alias" encoding for the current Windows codepage), and finding
    out the exact encoding using locale.getpreferredencoding().

    Thus, I was hoping that something like:

    strftime("%#c", localtime()).decode(locale.getpreferredencoding())

    would work... but alas, this exception was reported to me:

    LookupError: unknown encoding: cp932

    So: what is the correct code to achieve this? Will something like this
    work:

    data = strftime("%#c", localtime())
    if os.name == "nt":
        data = data.decode("mbcs")
    else:
        data = data.decode(locale.getpreferredencoding())

    Is this the correct way of doing it? (Yes, it sucks).

    Shouldn't Python automatically alias whatever is returned by
    locale.getpreferredencoding() to "mbcs", so that my original code works
    portably?

    Thanks in advance!
    --
    Giovanni Bajo
     
    Giovanni Bajo, Jan 27, 2008
    #1

  2. Giovanni Bajo

    Mark Tolonen Guest

    "Giovanni Bajo" <> wrote in message
    news:%D5nj.2630$...
    > Hello,
    >
    > I am trying to find a good way to portably get the output of strftime()
    > and put it onto a dialog (I'm using PyQt, but it doesn't really matter).
    > The problem is that I need to decode the byte stream returned by
    > strftime() into Unicode.
    >
    > This old bug:
    > http://mail.python.org/pipermail/python-bugs-list/2003-November/020983.html
    >
    > (last comment) mentions that it is "byte string in the locale's encoding".
    >
    > The comment also suggests to use "mbcs" in Windows (which, AFAIK, it's
    > sort of an "alias" encoding for the current Windows codepage), and to
    > find out the exact encoding using locale.getpreferredencoding().
    >
    > Thus, I was hoping that something like:
    >
    > strftime("%#c", localtime()).decode(locale.getpreferredencoding())
    >
    > would work... but alas I was reported this exception:
    >
    > LookupError: unknown encoding: cp932
    >
    > So: what is the correct code to achieve this? Will something like this
    > work:
    >
    > data = strftime("%#c", localtime())
    > if os.name == "nt":
    >     data = data.decode("mbcs")
    > else:
    >     data = data.decode(locale.getpreferredencoding())
    >
    > Is this the correct way of doing it? (Yes, it sucks).
    >
    > Shouldn't Python automatically alias whatever is returned by
    > locale.getpreferredencoding() to "mbcs", so that my original code works
    > portably?
    >
    > Thanks in advance!
    > --
    > Giovanni Bajo


    Odd, what version of Python are you using? Python 2.5 works:

    >>> import time, locale
    >>> time.strftime('%#c').decode(locale.getpreferredencoding())  # cp1252 on my system
    u'Sunday, January 27, 2008 12:56:30'
    >>> time.strftime('%#c').decode('cp932')
    u'Sunday, January 27, 2008 12:56:40'
    >>> time.strftime('%#c').decode('mbcs')
    u'Sunday, January 27, 2008 12:56:48'

    --Mark
     
    Mark Tolonen, Jan 27, 2008
    #2

  3. > LookupError: unknown encoding: cp932

    What Python version are you using? cp932 is supported cross-platform
    since Python 2.4.

    > So: what is the correct code to achieve this? Will something like this
    > work:
    >
    > data = strftime("%#c", localtime())
    > if os.name == "nt":
    >     data = data.decode("mbcs")
    > else:
    >     data = data.decode(locale.getpreferredencoding())
    >
    > Is this the correct way of doing it?


    Not necessarily. On some systems, and in some locales, Python will not
    have any codec that converts the locale's encoding to Unicode.

    In such a case, using ASCII with replacement characters might be the
    best bet, as long as the locale's charset is an ASCII superset (i.e.
    you don't work on an EBCDIC machine).
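
    A minimal sketch of the fallback described above, assuming the data is a
    byte string in the locale's encoding. The helper name
    decode_locale_bytes is hypothetical and does not come from the thread;
    it only relies on codecs.lookup() raising LookupError for an unknown
    encoding name.

```python
import codecs
import locale


def decode_locale_bytes(data, encoding=None):
    """Decode a byte string that is in the locale's encoding.

    Hypothetical helper: if Python has no codec for the encoding, fall back
    to ASCII with replacement characters, which is only sensible when the
    locale's charset is an ASCII superset.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding()
    try:
        codecs.lookup(encoding)
    except LookupError:
        # No codec for this encoding: keep the ASCII characters and
        # replace every other byte with U+FFFD.
        return data.decode("ascii", "replace")
    return data.decode(encoding)
```

    On Python 2 the byte string would be the return value of
    time.strftime(); on Python 3 strftime() already returns text, so the
    helper is only needed for data that is still in bytes.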

    > Shouldn't Python automatically alias whatever is returned by
    > locale.getpreferredencoding() to "mbcs", so that my original code works
    > portably?


    No. The "mbcs" codec has slightly different semantics from the cp932
    codec on your system. Specifically, the "mbcs" codec might map
    characters to approximations, whereas the cp932 codec will raise an
    error if a certain Unicode character is not supported in the target
    character set.
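
    The strict side of that difference can be checked on any platform. This
    is only an illustrative sketch: the sample character is an assumption
    chosen because it lies outside cp932, and the "mbcs" side is not shown
    because that codec exists only on Windows.

```python
# HEBREW LETTER ALEF, a character that cp932 cannot represent.
text = u"\u05d0"

try:
    text.encode("cp932")
    strict_failed = False
except UnicodeEncodeError:
    # The default "strict" error handler refuses unsupported characters.
    strict_failed = True

# With an explicit error handler the codec substitutes "?" instead.
replaced = text.encode("cp932", "replace")
```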

    Regards,
    Martin
     
    Martin v. Löwis, Jan 27, 2008
    #3
