I18n issue with optik

Discussion in 'Python' started by Thorsten Kampe, Mar 31, 2007.

  1. Hi,

    I've written a script which uses Optik/Optparse to display the
    options (which works fine). The text for the help message is localised
    (with German umlauts), and when I execute the script with the localised
    environment variable set, I get this traceback[1]. The interesting
    thing is that the localised optparse messages display fine -
    it's only my localisation that errors.

    From my understanding, my script doesn't output anything itself; it's
    optik/optparse that does that. My po file is directly copied from the
    optik po file (which displays fine) and modified, so the po file should
    be fine, too.

    What can I do to troubleshoot whether the culprit is my script, optik
    or gettext?

    Would it make sense to post the script and the mo or po files?


    Thorsten

    [1]
    Traceback (most recent call last):
      File "script.py", line 37, in <module>
        options, args = cmdlineparser.parse_args()
      File "/usr/lib/python2.5/optparse.py", line 1378, in parse_args
        stop = self._process_args(largs, rargs, values)
      File "/usr/lib/python2.5/optparse.py", line 1418, in _process_args
        self._process_long_opt(rargs, values)
      File "/usr/lib/python2.5/optparse.py", line 1493, in _process_long_opt
        option.process(opt, value, values, self)
      File "/usr/lib/python2.5/optparse.py", line 782, in process
        self.action, self.dest, opt, value, values, parser)
      File "/usr/lib/python2.5/optparse.py", line 804, in take_action
        parser.print_help()
      File "/usr/lib/python2.5/optparse.py", line 1648, in print_help
        file.write(self.format_help().encode(encoding, "replace"))
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 264: ordinal not in range(128)
    Thorsten Kampe, Mar 31, 2007
    #1

  2. Thorsten Kampe wrote:
    > I've written a script which uses Optik/Optparse to display the
    > options (which works fine). The text for the help message is localised
    > (with german umlauts) and when I execute the script with the localised
    > environment variable set, I get this traceback[1]. The interesting
    > thing is that the localised optparse messages from displays fine -
    > it's only my localisation that errors.
    >
    > From my understanding, my script doesn't put out anything, it's
    > optik/optparse who does that. My po file is directly copied from the
    > optik po file (who displays fine) and modified so the po file should
    > be fine, too.
    >
    > What can I do to troubleshoot whether the culprit is my script, optik
    > or gettext?
    >
    > Would it make sense to post the script and the mo or po files?


    Yes, probably. Though if you can reduce it to the simplest test case
    that produces the error, it'll increase your chances of having someone
    look at it.

    You could also try posting to the optik list:
    http://lists.sourceforge.net/lists/listinfo/optik-users

    STeVe
    Steven Bethard, Apr 1, 2007
    #2

  3. * Steven Bethard (Sat, 31 Mar 2007 20:08:45 -0600)
    > Yes, probably. Though if you can reduce it to the simplest test case
    > that produces the error, it'll increase your chances of having someone
    > look at it.


    The simplest test.py is:

    ###
    #! /usr/bin/env python

    import gettext
    import os
    import sys

    gettext.textdomain('optparse')
    gettext.install('test')

    from optparse import OptionParser, OptionGroup

    cmdlineparser = OptionParser(
        description=_('THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!'))

    options, args = cmdlineparser.parse_args()
    ###

    When I run LANGUAGE=de ./test.py --help I get the error.

    ### This is the test.de.po file
    # Copyright (C) 2006 Thorsten Kampe
    # Thorsten Kampe <>, 2006

    msgid ""
    msgstr ""
    "Project-Id-Version: Template 1.0\n"
    "POT-Creation-Date: Tue Sep 7 22:20:34 2004\n"
    "PO-Revision-Date: 2005-07-03 16:47+0200\n"
    "Last-Translator: Thorsten Kampe <>\n"
    "Language-Team: Thorsten Kampe <>\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=ISO-8859-15\n"
    "Content-Transfer-Encoding: 8-bit\n"
    "Generated-By: pygettext.py 1.5\n"

    msgid "THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!"
    msgstr "DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, HAFTUNG NOCH UNTERSTÜTZUNG!"
    ###

    The localisation now produces an error in the localised optik files,
    too.

    Under Windows I get:

      File "G:\program files\python\lib\encodings\cp1252.py", line 12, in encode
        return codecs.charmap_encode(input,errors,encoding_table)

    Is there something I have to do to put the terminal in "non-ascii
    output mode"?

    I tried

    ###
    #! /usr/bin/env python
    # -*- coding: ISO-8859-15 -*-

    print "DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, HAFTUNG NOCH UNTERSTÜTZUNG!"
    ###

    ...and this worked. That means that my terminal is willing to print,
    right?!

    > You could also try posting to the optik list:
    > http://lists.sourceforge.net/lists/listinfo/optik-users


    I already did this via Gmane (although the list seems pretty dead to
    me). SourceForge seems to have a bigger problem, as both [1] and [2]
    return errors.

    Sorry for the confusion, but this Unicode magic is far from being
    rational. I guess most people just don't get it...


    Thorsten
    [1] http://sourceforge.net/mailarchive/forum.php?forum=optik-users
    [2] https://lists.sourceforge.net/lists/listinfo
    Thorsten Kampe, Apr 1, 2007
    #3
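Whether the terminal can show a character is a separate question from which encoding Python believes the output stream uses. A minimal diagnostic sketch follows, written for Python 3 (the thread itself uses Python 2.5). The helper name `describe_output_encoding` is made up for illustration and is not part of optparse or gettext.

```python
import locale
import sys


def describe_output_encoding(stream=None):
    """Collect the encodings Python consults when writing text to a stream."""
    if stream is None:
        stream = sys.stdout
    return {
        # What the stream itself reports; None for objects without the attribute.
        "stream": getattr(stream, "encoding", None),
        # Default codec for str.encode()/bytes.decode();
        # 'ascii' on Python 2, 'utf-8' on Python 3.
        "default": sys.getdefaultencoding(),
        # Encoding derived from the locale settings (LANG, LC_ALL, ...).
        "locale": locale.getpreferredencoding(False),
    }


for name, value in describe_output_encoding().items():
    print("%-8s %s" % (name, value))
```

Running this in each terminal (rxvt, the Windows console, and so on) shows which value differs between a setup that prints umlauts and one that does not.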
  4. Just an addition: when I insert this statement...

    print _('THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!')

    into this script, the line is printed out. So if my script can output
    the localised text but Optparse can't, it should be an optparse bug,
    right?!

    Thorsten
    Thorsten Kampe, Apr 1, 2007
    #4
  5. I guess the culprit is this snippet from optparse.py:

    # used by test suite
    def _get_encoding(self, file):
        encoding = getattr(file, "encoding", None)
        if not encoding:
            encoding = sys.getdefaultencoding()
        return encoding

    def print_help(self, file=None):
        """print_help(file : file = stdout)

        Print an extended help message, listing all options and any
        help text provided with them, to 'file' (default stdout).
        """
        if file is None:
            file = sys.stdout
        encoding = self._get_encoding(file)
        file.write(self.format_help().encode(encoding, "replace"))

    So this means: when the encoding of sys.stdout is US-ASCII, Optparse
    sets the encoding of the help text to ASCII, too. But that's
    nonsense because the encoding is declared in the po (localisation)
    file.

    How can I set the encoding of sys.stdout to another encoding? Of
    course this would be a terrible hack if the encoding of the
    localisation changes or different translators use different
    encodings...

    Thorsten
    Thorsten Kampe, Apr 1, 2007
    #5
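The fallback logic in the quoted snippet can be tried in isolation. The sketch below is written for Python 3 and assumes the help text is already unicode. `pick_encoding` and `encode_help` are hypothetical stand-ins for the optparse internals, not the library's own functions, and the two `TextIOWrapper` objects merely simulate terminals that report different encodings.

```python
import io
import sys


def pick_encoding(stream):
    """Mirror the quoted fallback: stream.encoding, else the default encoding."""
    return getattr(stream, "encoding", None) or sys.getdefaultencoding()


def encode_help(text, stream):
    """Encode unicode help text for `stream`, replacing unencodable characters."""
    return text.encode(pick_encoding(stream), "replace")


help_text = u"HAFTUNG NOCH UNTERST\u00dcTZUNG!"

# Two stand-in streams that differ only in the encoding they report.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
latin_stream = io.TextIOWrapper(io.BytesIO(), encoding="iso-8859-15")

print(encode_help(help_text, ascii_stream))   # umlaut replaced by "?"
print(encode_help(help_text, latin_stream))   # umlaut kept as byte 0xDC
```

The same text therefore comes out differently depending only on what the stream claims about itself, which is independent of the charset declared in the po file.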
  6. Thorsten Kampe wrote:
    > The most simple test.py is:
    >
    > ###
    > #! /usr/bin/env python
    >
    > import gettext, \
    > os, \
    > sys
    >
    > gettext.textdomain('optparse')
    > gettext.install('test')
    >
    > from optparse import OptionParser, \
    > OptionGroup
    >
    > cmdlineparser = OptionParser(description = _('THIS SOFTWARE COMES
    > WITHOUT WARRANTY, LIABILITY OR SUPPORT!'))
    >
    > options, args = cmdlineparser.parse_args()
    > ###
    >
    > When I run LANGUAGE=de ./test.py --help I get the error.
    >
    > ### This is the test.de.po file
    > # Copyright (C) 2006 Thorsten Kampe
    > # Thorsten Kampe <>, 2006
    >
    > msgid ""
    > msgstr ""
    >
    > "Project-Id-Version: Template 1.0\n"
    > "POT-Creation-Date: Tue Sep 7 22:20:34 2004\n"
    > "PO-Revision-Date: 2005-07-03 16:47+0200\n"
    > "Last-Translator: Thorsten Kampe <>\n"
    > "Language-Team: Thorsten Kampe <>\n"
    > "MIME-Version: 1.0\n"
    > "Content-Type: text/plain; charset=ISO-8859-15\n"
    > "Content-Transfer-Encoding: 8-bit\n"
    > "Generated-By: pygettext.py 1.5\n"
    >
    > msgid "THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!"
    > msgstr "DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, HAFTUNG NOCH
    > UNTERSTÜTZUNG!"
    > ###
    >
    > The localisation now produces an error in the localised optik files,
    > too.
    >
    > Under Windows I get " File "G:\program files\python\lib\encodings
    > \cp1252.py", line 12, in encode
    > return codecs.charmap_encode(input,errors,encoding_table)"


    I'm not very experienced with internationalization, but if you change::

    gettext.install('test')

    to::

    gettext.install('test', unicode=True)

    what happens?

    STeVe
    Steven Bethard, Apr 1, 2007
    #6
  7. Thorsten Kampe wrote:
    > I guess the culprit is this snippet from optparse.py:
    >
    > # used by test suite
    > def _get_encoding(self, file):
    > encoding = getattr(file, "encoding", None)
    > if not encoding:
    > encoding = sys.getdefaultencoding()
    > return encoding
    >
    > def print_help(self, file=None):
    > """print_help(file : file = stdout)
    >
    > Print an extended help message, listing all options and any
    > help text provided with them, to 'file' (default stdout).
    > """
    > if file is None:
    > file = sys.stdout
    > encoding = self._get_encoding(file)
    > file.write(self.format_help().encode(encoding, "replace"))
    >
    > So this means: when the encoding of sys.stdout is US-ASCII, Optparse
    > sets the encoding to of the help text to ASCII, too. But that's
    > nonsense because the Encoding is declared in the Po (localisation)
    > file.
    >
    > How can I set the encoding of sys.stdout to another encoding? Of
    > course this would be a terrible hack if the encoding of the
    > localisation changes or different translators use different
    > encodings...


    If print_help() is what's wrong, you should probably hack print_help()
    instead of sys.stdout. You could try something like::

    def print_help(self, file=None):
        """print_help(file : file = stdout)

        Print an extended help message, listing all options and any
        help text provided with them, to 'file' (default stdout).
        """
        if file is None:
            file = sys.stdout
        file.write(self.format_help())

    optparse.OptionParser.print_help = print_help

    cmdlineparser = optparse.OptionParser(description=...)
    ...

    That is, you could monkey-patch print_help() before you create an
    OptionParser.

    STeVe
    Steven Bethard, Apr 1, 2007
    #7
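A variation on the monkey-patch suggested above is to override `print_help()` in a subclass, so that other code using `optparse` in the same process is unaffected. This is a sketch for Python 3, where the stock `print_help()` already writes text without encoding it, so the override mainly shows the shape of the change. The class name `PlainHelpParser` is hypothetical.

```python
import io
import sys
from optparse import OptionParser


class PlainHelpParser(OptionParser):
    """OptionParser whose print_help() writes the help text unencoded."""

    def print_help(self, file=None):
        if file is None:
            file = sys.stdout
        # No .encode() call: the stream is trusted to handle the text itself.
        file.write(self.format_help())


parser = PlainHelpParser(description=u"HAFTUNG NOCH UNTERST\u00dcTZUNG!")

# Passing a file object makes the help text easy to inspect without a terminal.
captured = io.StringIO()
parser.print_help(captured)
print(captured.getvalue())
```

Capturing the output this way also helps with troubleshooting, because the help text can be examined before any terminal encoding is involved.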
  8. * Steven Bethard (Sun, 01 Apr 2007 10:21:40 -0600)
    > Thorsten Kampe wrote:
    > > The most simple test.py is:
    > >
    > > ###
    > > #! /usr/bin/env python
    > >
    > > import gettext, \
    > > os, \
    > > sys
    > >
    > > gettext.textdomain('optparse')
    > > gettext.install('test')
    > >
    > > from optparse import OptionParser, \
    > > OptionGroup
    > >
    > > cmdlineparser = OptionParser(description = _('THIS SOFTWARE COMES
    > > WITHOUT WARRANTY, LIABILITY OR SUPPORT!'))
    > >
    > > options, args = cmdlineparser.parse_args()
    > > ###
    > >
    > > When I run LANGUAGE=de ./test.py --help I get the error.
    > >
    > > ### This is the test.de.po file
    > > # Copyright (C) 2006 Thorsten Kampe
    > > # Thorsten Kampe <>, 2006
    > >
    > > msgid ""
    > > msgstr ""
    > >
    > > "Project-Id-Version: Template 1.0\n"
    > > "POT-Creation-Date: Tue Sep 7 22:20:34 2004\n"
    > > "PO-Revision-Date: 2005-07-03 16:47+0200\n"
    > > "Last-Translator: Thorsten Kampe <>\n"
    > > "Language-Team: Thorsten Kampe <>\n"
    > > "MIME-Version: 1.0\n"
    > > "Content-Type: text/plain; charset=ISO-8859-15\n"
    > > "Content-Transfer-Encoding: 8-bit\n"
    > > "Generated-By: pygettext.py 1.5\n"
    > >
    > > msgid "THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!"
    > > msgstr "DIESES PROGRAMM HAT WEDER GEWÄHRLEISTUNG, HAFTUNG NOCH
    > > UNTERSTÜTZUNG!"
    > > ###
    > >
    > > The localisation now produces an error in the localised optik files,
    > > too.
    > >
    > > Under Windows I get " File "G:\program files\python\lib\encodings
    > > \cp1252.py", line 12, in encode
    > > return codecs.charmap_encode(input,errors,encoding_table)"

    >
    > I'm not very experienced with internationalization, but if you change::
    >
    > gettext.install('test')
    >
    > to::
    >
    > gettext.install('test', unicode=True)
    >
    > what happens?


    No traceback anymore from optparse but the non-ascii umlauts are
    displayed as question marks ("?").

    Thorsten
    Thorsten Kampe, Apr 1, 2007
    #8
  9. * Steven Bethard (Sun, 01 Apr 2007 10:26:54 -0600)
    > Thorsten Kampe wrote:
    > > I guess the culprit is this snippet from optparse.py:
    > >
    > > # used by test suite
    > > def _get_encoding(self, file):
    > > encoding = getattr(file, "encoding", None)
    > > if not encoding:
    > > encoding = sys.getdefaultencoding()
    > > return encoding
    > >
    > > def print_help(self, file=None):
    > > """print_help(file : file = stdout)
    > >
    > > Print an extended help message, listing all options and any
    > > help text provided with them, to 'file' (default stdout).
    > > """
    > > if file is None:
    > > file = sys.stdout
    > > encoding = self._get_encoding(file)
    > > file.write(self.format_help().encode(encoding, "replace"))
    > >
    > > So this means: when the encoding of sys.stdout is US-ASCII, Optparse
    > > sets the encoding to of the help text to ASCII, too. But that's
    > > nonsense because the Encoding is declared in the Po (localisation)
    > > file.
    > >
    > > How can I set the encoding of sys.stdout to another encoding? Of
    > > course this would be a terrible hack if the encoding of the
    > > localisation changes or different translators use different
    > > encodings...

    >
    > If print_help() is what's wrong, you should probably hack print_help()
    > instead of sys.stdout. You could try something like::
    >
    > def print_help(self, file=None):
    > """print_help(file : file = stdout)
    >
    > Print an extended help message, listing all options and any
    > help text provided with them, to 'file' (default stdout).
    > """
    > if file is None:
    > file = sys.stdout
    > file.write(self.format_help())
    >
    > optparse.OptionParser.print_help = print_help
    >
    > cmdlineparser = optparse.OptionParser(description=...)
    > ...
    >
    > That is, you could monkey-patch print_help() before you create an
    > OptionParser.


    Yes, I could do that but I'd rather know first if my code is wrong or
    the optparse code.

    Thorsten
    Thorsten Kampe, Apr 1, 2007
    #9
  10. * Thorsten Kampe (Sun, 1 Apr 2007 19:45:59 +0100)
    > Yes, I could do that but I'd rather know first if my code is wrong or
    > the optparse code.


    It might be the bug mentioned in
    http://mail.python.org/pipermail/python-dev/2006-May/065458.html

    The patch doesn't work, though. From my unicode-charset-codepage-
    codeset-challenged point of view, the encoding of sys.stdout doesn't
    matter. The charset is defined in the .po/.mo file (but of course
    optparse can't know whether the message has been translated by gettext
    ("_")).

    Thorsten
    Thorsten Kampe, Apr 1, 2007
    #10
  11. * Thorsten Kampe (Sun, 1 Apr 2007 20:08:39 +0100)
    > * Thorsten Kampe (Sun, 1 Apr 2007 19:45:59 +0100)
    > > Yes, I could do that but I'd rather know first if my code is wrong or
    > > the optparse code.

    >
    > It might be the bug mentioned in
    > http://mail.python.org/pipermail/python-dev/2006-May/065458.html
    >
    > The patch although doesn't work. From my unicode-charset-codepage-
    > codeset-challenged point of view the encoding of sys.stdout doesn't
    > matter. The charset is defined in the .po/.mo file (but of course
    > optparse can't know if the message has been translated by gettext
    > ("_").


    If I "patch" line 1648 (the one mentioned in the traceback) of
    optparse.py from

        file.write(self.format_help().encode(encoding, "replace"))

    to

        file.write(self.format_help())

    ...then everything works and is displayed fine (even without the
    "unicode = True" parameter to gettext.install).

    But the "patch" might make other things fail, of course...

    Thorsten
    Thorsten Kampe, Apr 1, 2007
    #11
  12. * Thorsten Kampe (Sun, 1 Apr 2007 20:22:51 +0100)
    > * Thorsten Kampe (Sun, 1 Apr 2007 20:08:39 +0100)
    > > * Thorsten Kampe (Sun, 1 Apr 2007 19:45:59 +0100)
    > > > Yes, I could do that but I'd rather know first if my code is wrong or
    > > > the optparse code.

    > >
    > > It might be the bug mentioned in
    > > http://mail.python.org/pipermail/python-dev/2006-May/065458.html
    > >
    > > The patch although doesn't work. From my unicode-charset-codepage-
    > > codeset-challenged point of view the encoding of sys.stdout doesn't
    > > matter. The charset is defined in the .po/.mo file (but of course
    > > optparse can't know if the message has been translated by gettext
    > > ("_").

    >
    > If I "patch" line 1648 (the one mentioned in the traceback) of
    > optparse.py from
    >
    > file.write(self.format_help().encode(encoding, "replace"))
    > to
    > file.write(self.format_help())
    >
    > ...then everything works and is displayed fine [...]


    ...but only in Cygwin rxvt; the standard Windows console doesn't show
    the right colors.

    I give up and revert to ASCII. This whole charset mess is not
    meant to be solved by mere mortals.

    Thorsten
    Thorsten Kampe, Apr 1, 2007
    #12
  13. Jarek Zgoda

    Thorsten Kampe wrote:

    >>> Under Windows I get " File "G:\program files\python\lib\encodings
    >>> \cp1252.py", line 12, in encode
    >>> return codecs.charmap_encode(input,errors,encoding_table)"

    >> I'm not very experienced with internationalization, but if you change::
    >>
    >> gettext.install('test')
    >>
    >> to::
    >>
    >> gettext.install('test', unicode=True)
    >>
    >> what happens?

    >
    > No traceback anymore from optparse but the non-ascii umlauts are
    > displayed as question marks ("?").


    And this is expected behaviour of encode() with errors set to 'replace'.
    I think this is the solution to your problem. I was a bit surprised I
    never saw this error, but I always use the unicode=True setting to
    gettext.install()...

    --
    Jarek Zgoda
    http://jpa.berlios.de/
    Jarek Zgoda, Apr 1, 2007
    #13
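The question marks come from the `"replace"` error handler, and they can be reproduced without optparse. A minimal sketch for Python 3, using a sample word from the translation in this thread:

```python
text = u"GEW\u00c4HRLEISTUNG"   # "GEWÄHRLEISTUNG"

# errors="replace": characters the codec cannot represent become "?".
print(text.encode("ascii", "replace"))        # b'GEW?HRLEISTUNG'

# errors="strict" (the default) raises for the same input instead.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("strict failed:", exc.reason)

# An encoding that contains the character round-trips without loss.
assert text.encode("cp1252").decode("cp1252") == text
```

So the "?" output is the handler doing its job for an ASCII target. The text itself is intact until it is encoded for a stream that reports ASCII.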
  14. Leo Kislov

    On Apr 1, 8:47 am, Thorsten Kampe <> wrote:
    > I guess the culprit is this snippet from optparse.py:
    >
    > # used by test suite
    > def _get_encoding(self, file):
    > encoding = getattr(file, "encoding", None)
    > if not encoding:
    > encoding = sys.getdefaultencoding()
    > return encoding
    >
    > def print_help(self, file=None):
    > """print_help(file : file = stdout)
    >
    > Print an extended help message, listing all options and any
    > help text provided with them, to 'file' (default stdout).
    > """
    > if file is None:
    > file = sys.stdout
    > encoding = self._get_encoding(file)
    > file.write(self.format_help().encode(encoding, "replace"))
    >
    > So this means: when the encoding of sys.stdout is US-ASCII, Optparse
    > sets the encoding to of the help text to ASCII, too.


    The .encode() method doesn't set an encoding. It encodes unicode text into
    bytes according to the specified encoding. That means optparse needs ASCII
    or unicode (at least) for the help text. In other words, you'd better use
    unicode throughout your program.

    > But that's
    > nonsense because the Encoding is declared in the Po (localisation)
    > file.


    For backward compatibility gettext is working with bytes by default,
    so the PO file encoding is not even involved. You need to use unicode
    gettext.

    > How can I set the encoding of sys.stdout to another encoding?


    What are you going to set it to? As I understand you're going to
    distribute your program to some users. How are you going to find out
    the encoding of the terminal of your users?

    -- Leo
    Leo Kislov, Apr 1, 2007
    #14
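The original traceback is a UnicodeDecodeError raised from an `.encode()` call. That happens because Python 2 first decodes a byte string with the ASCII codec when `.encode()` is called on it. Python 3 has no such implicit step, so the sketch below models it explicitly. `py2_style_encode` is a hypothetical helper written only to imitate that behaviour.

```python
def py2_style_encode(raw, encoding, errors="replace"):
    """Model Python 2's str.encode(): ASCII-decode the bytes, then encode."""
    return raw.decode("ascii").encode(encoding, errors)


# Bytes as a byte-oriented gettext would return them from an ISO-8859-15 catalog.
translated = u"GEW\u00c4HRLEISTUNG".encode("iso-8859-15")

try:
    py2_style_encode(translated, "ascii")
except UnicodeDecodeError as exc:
    # The failure is in the hidden decode step, before "replace" can apply.
    print("%s: byte 0x%02x at position %d"
          % (exc.reason, exc.object[exc.start], exc.start))

# Decoding explicitly with the catalog's charset first avoids the error.
print(translated.decode("iso-8859-15").encode("ascii", "replace"))
```

The byte reported is 0xc4, the ISO-8859-15 code for "Ä", which matches the byte named in the traceback in the first post.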
  15. * Leo Kislov (1 Apr 2007 14:24:17 -0700)
    > On Apr 1, 8:47 am, Thorsten Kampe <> wrote:
    > > I guess the culprit is this snippet from optparse.py:
    > >
    > > # used by test suite
    > > def _get_encoding(self, file):
    > > encoding = getattr(file, "encoding", None)
    > > if not encoding:
    > > encoding = sys.getdefaultencoding()
    > > return encoding
    > >
    > > def print_help(self, file=None):
    > > """print_help(file : file = stdout)
    > >
    > > Print an extended help message, listing all options and any
    > > help text provided with them, to 'file' (default stdout).
    > > """
    > > if file is None:
    > > file = sys.stdout
    > > encoding = self._get_encoding(file)
    > > file.write(self.format_help().encode(encoding, "replace"))
    > >
    > > So this means: when the encoding of sys.stdout is US-ASCII, Optparse
    > > sets the encoding to of the help text to ASCII, too.

    >
    > .encode() method doesn't set an encoding. It encodes unicode text into
    > bytes according to specified encoding. That means optparse needs ascii
    > or unicode (at least) for help text. In other words you'd better use
    > unicode throughout your program.
    >
    > > But that's
    > > nonsense because the Encoding is declared in the Po (localisation)
    > > file.

    >
    > For backward compatibility gettext is working with bytes by default,
    > so the PO file encoding is not even involved. You need to use unicode
    > gettext.


    You mean

    gettext.install('test', unicode = True)
    and
    description = _(u'THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!') ?

    If I modify my code like this, I don't get any traceback anymore, but
    the non-ascii umlauts are still displayed as question marks.


    Thorsten
    Thorsten Kampe, Apr 2, 2007
    #15
  16. * Jarek Zgoda (Sun, 01 Apr 2007 22:02:15 +0200)
    > Thorsten Kampe wrote:
    >
    > >>> Under Windows I get " File "G:\program files\python\lib\encodings
    > >>> \cp1252.py", line 12, in encode
    > >>> return codecs.charmap_encode(input,errors,encoding_table)"
    > >> I'm not very experienced with internationalization, but if you change::
    > >>
    > >> gettext.install('test')
    > >>
    > >> to::
    > >>
    > >> gettext.install('test', unicode=True)
    > >>
    > >> what happens?

    > >
    > > No traceback anymore from optparse but the non-ascii umlauts are
    > > displayed as question marks ("?").

    >
    > And this is expected behaviour of encode() with errors set to 'replace'.
    > I think this is the solution to your problem. I was a bit surprised I
    > never saw this error, but I always use the unicode=True setting to
    > gettext.install()...


    I can't see the "solution" here. Is the optparse "print_help" function
    wrong? Why should there even be errors if I use "unicode = True" with
    gettext.install?

    I have ISO-8859-15 gettext translations and I want optparse to display
    them correctly. What do I have to do?

    Thorsten
    Thorsten Kampe, Apr 2, 2007
    #16
  17. * Steven Bethard (Sun, 01 Apr 2007 10:21:40 -0600)
    > Thorsten Kampe wrote:
    > I'm not very experienced with internationalization, but if you change::
    >
    > gettext.install('test')
    >
    > to::
    >
    > gettext.install('test', unicode=True)
    >
    > what happens?


    Actually, this is the solution.

    But there's one more problem: the solution only works when the
    Terminal encoding is not US-ASCII. Unfortunately (almost) all
    terminals I tried are set to US-ASCII (rxvt under Cygwin, Console[1]
    running bash, Poderosa[2] running bash). Only the Windows Console is
    CP852 and this works.

    I got the tip to set a different encoding by
    sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')

    but unfortunately this does not change the encoding of any Terminal.
    So my question is: how can I set a different encoding to sys.stdout
    (or why can I set it without any error but nothing changes?)


    Thorsten

    [1] http://sourceforge.net/project/screenshots.php?group_id=43764
    [2] http://en.poderosa.org/present/about_poderosa.html
    Thorsten Kampe, Apr 2, 2007
    #17
  18. * Jarek Zgoda (Mon, 02 Apr 2007 17:52:34 +0200)
    > Thorsten Kampe wrote:
    >
    > > I can't see the "solution" here. Is the optparse "print_help" function
    > > wrong? Why should there even be errors if I use "unicode = True" with
    > > gettext.install?
    > >
    > > I have ISO-8859-15 gettext translations and I want optparse to display
    > > them correctly. What do I have to do?

    >
    > Please, see gettext module documentation on this topic.
    >
    > The solution is: always install your translation with unicode=True
    > setting. This assures usage of ugettext() instead of gettext() and works
    > properly with character sets other than ASCII. Your messages are
    > internally decoded to unicode objects and passed to output. Then the
    > displayed output will be limited only by the encoding of your terminal,


    You are right. My problem is that all the terminals I use are set to
    US-ASCII (rxvt under Cygwin, Console[1] running bash, Poderosa[2]
    running bash), even those that actually support non-ASCII characters.

    I got the tip to set a different encoding by
    sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')

    but unfortunately this does not change the encoding.

    So my question is: how can I set a different encoding to sys.stdout
    (or why can I set it without any error but nothing changes?)


    Thorsten

    [1] http://sourceforge.net/project/screenshots.php?group_id=43764
    [2] http://en.poderosa.org/present/about_poderosa.html
    Thorsten Kampe, Apr 2, 2007
    #18
  19. paul

    Thorsten Kampe wrote:
    [snip]
    > I got the tip to set a different encoding by
    > sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')
    >
    > but unfortunately this does not change the encoding of any Terminal.
    > So my question is: how can I set a different encoding to sys.stdout
    > (or why can I set it without any error but nothing changes?)

    AFAIK you can't. If the terminal is limited to ascii it won't be able to
    display anything else; it might not even have the right font, so how are
    you supposed to fix that? The .encode(encoding, "replace") ensures safe
    downgrades though.

    cheers
    Paul
    paul, Apr 2, 2007
    #19
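Python can control which bytes it emits, even though it cannot change what a terminal is able to render. One approach is to wrap a byte stream with a writer from `codecs.getwriter()`, sketched here for Python 3 with an in-memory buffer standing in for stdout. The helper name `wrap_with_encoding` is made up. As far as I can tell, `codecs.EncodedFile` recodes byte strings from one encoding to another rather than encoding unicode text, which may be why it appeared to change nothing.

```python
import codecs
import io


def wrap_with_encoding(byte_stream, encoding, errors="replace"):
    """Return a writer that encodes unicode text before it reaches byte_stream."""
    return codecs.getwriter(encoding)(byte_stream, errors=errors)


raw = io.BytesIO()                      # stand-in for a byte-oriented stdout
out = wrap_with_encoding(raw, "iso-8859-15")
out.write(u"UNTERST\u00dcTZUNG!\n")
print(repr(raw.getvalue()))
```

Whether those bytes then display as umlauts still depends on the terminal's own settings and font. On Python 3.7 and later, `sys.stdout.reconfigure(encoding=...)` is another way to change the encoding of the real stdout.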
  20. Jarek Zgoda

    Thorsten Kampe wrote:

    > I can't see the "solution" here. Is the optparse "print_help" function
    > wrong? Why should there even be errors if I use "unicode = True" with
    > gettext.install?
    >
    > I have ISO-8859-15 gettext translations and I want optparse to display
    > them correctly. What do I have to do?


    Please, see gettext module documentation on this topic.

    The solution is: always install your translation with unicode=True
    setting. This assures usage of ugettext() instead of gettext() and works
    properly with character sets other than ASCII. Your messages are
    internally decoded to unicode objects and passed to output. Then the
    displayed output will be limited only by the encoding of your terminal,
    but it will not crash your program on any inconsistency, you would see
    question marks.

    --
    Jarek Zgoda

    "We read Knuth so you don't have to."
    Jarek Zgoda, Apr 2, 2007
    #20
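The decoding step described above can be observed directly: unicode-aware gettext reads the charset from the catalog header and returns decoded text. The sketch below targets Python 3, where `gettext()` always behaves like Python 2's `ugettext()`. To stay self-contained it builds a tiny .mo catalog in memory with a hypothetical `build_mo` helper. In practice the .mo file would come from `msgfmt`.

```python
import gettext
import io
import struct


def build_mo(catalog, charset):
    """Serialise {msgid: msgstr} into the little-endian GNU .mo layout."""
    entries = sorted(catalog.items())          # "" (the header) sorts first
    ids = b""
    strs = b""
    spans = []
    for msgid, msgstr in entries:
        id_bytes = msgid.encode(charset)
        str_bytes = msgstr.encode(charset)
        spans.append((len(id_bytes), len(ids), len(str_bytes), len(strs)))
        ids += id_bytes + b"\0"
        strs += str_bytes + b"\0"
    count = len(entries)
    ids_start = 7 * 4 + 16 * count             # file header + two index tables
    strs_start = ids_start + len(ids)
    id_table = []
    str_table = []
    for id_len, id_off, str_len, str_off in spans:
        id_table += [id_len, ids_start + id_off]
        str_table += [str_len, strs_start + str_off]
    header = struct.pack("<7I", 0x950412DE, 0, count, 28, 28 + 8 * count, 0, 0)
    tables = struct.pack("<%dI" % (4 * count), *(id_table + str_table))
    return header + tables + ids + strs


MSGID = "THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR SUPPORT!"
catalog = {
    # Only the header line that declares the charset is needed here.
    "": "Content-Type: text/plain; charset=ISO-8859-15\n",
    MSGID: u"DIESES PROGRAMM HAT WEDER GEW\u00c4HRLEISTUNG, "
           u"HAFTUNG NOCH UNTERST\u00dcTZUNG!",
}

mo_bytes = build_mo(catalog, "iso-8859-15")
translations = gettext.GNUTranslations(io.BytesIO(mo_bytes))
message = translations.gettext(MSGID)
print(type(message).__name__, translations.info().get("content-type"))
```

The catalog stores ISO-8859-15 bytes, yet the lookup returns unicode text. This is why the po file's charset only comes into play once unicode gettext is in use.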
