printing a list with non-ascii strings

Discussion in 'Python' started by Helmut Jarausch, Jan 20, 2011.

  1. Hi,

    I don't understand Python's behaviour when printing a list.
    The following example uses 2 German non-ascii characters.

    #!/usr/bin/python
    # _*_ coding: latin1 _*_
    L=["abc","süß","def"]
    print L[1],L

    The output of L[1] is correct, while the output of L shows up as
    ['abc', 's\xfc\xdf', 'def']

    How can this be changed?

    Thanks for hint,
    Helmut.
     
    Helmut Jarausch, Jan 20, 2011
    #1

  2. Peter Otten (Guest)

    Helmut Jarausch wrote:

    > I don't understand Python's behaviour when printing a list.
    > The following example uses 2 German non-ascii characters.
    >
    > #!/usr/bin/python
    > # _*_ coding: latin1 _*_
    > L=["abc","süß","def"]
    > print L[1],L
    >
    > The output of L[1] is correct, while the output of L shows up as
    > ['abc', 's\xfc\xdf', 'def']
    >
    > How can this be changed?


    Use unicode and follow Martin's recipe:

    http://mail.python.org/pipermail/python-list/2011-January/1263783.html
     
    Peter Otten, Jan 20, 2011
    #2

  3. Helmut Jarausch writes:

    > Hi,
    >
    > I don't understand Python's behaviour when printing a list.
    > The following example uses 2 German non-ascii characters.
    >
    > #!/usr/bin/python
    > # _*_ coding: latin1 _*_
    > L=["abc","süß","def"]
    > print L[1],L
    >
    > The output of L[1] is correct, while the output of L shows up as
    > ['abc', 's\xfc\xdf', 'def']
    >
    > How can this be changed?
    >
    > Thanks for hint,
    > Helmut.


    That's because when you print a list, the code executed is roughly:

    print "[" + ", ".join(repr(x) for x in L) + "]"

    Now try:

    print repr("süß")
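    The same point as a sketch that runs under Python 3 (an assumption;
    this thread uses Python 2). A bytes object encoded as Latin-1 stands
    in for the Python 2 byte string, and the helper name list_as_text is
    hypothetical:

```python
# Python 3 sketch; bytes stands in for the Python 2 byte string.
L = ["abc".encode("latin1"), "süß".encode("latin1"), "def".encode("latin1")]


def list_as_text(items):
    """Rebuild what str(list) produces: the repr() of each element, joined."""
    return "[" + ", ".join(repr(x) for x in items) + "]"


manual = list_as_text(L)
builtin = str(L)

# Both lines show the escaped form of the non-ASCII bytes.
print(manual)
print(builtin)
```

    Both lines come out identical, which suggests the list never asks
    its elements for their str() form, only for their repr().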

    I don't think this can be changed in Python 2.x. I vaguely remember
    discussions about this issue for Python 3, but I can't remember the
    outcome, and the situation is different there anyway because Python 3
    strings are not the same as Python 2 strings (they correspond to
    Python 2 unicode strings).

    The issue, though, is that the Python interpreter doesn't know what
    encoding is supposed to be used for a string: a string in Python 2.x is
    a sequence of bytes. If you print the string, the bytes are sent to the
    terminal as they are, and the terminal decodes them according to its own
    settings, which have nothing to do with Python, so the appearance will
    differ according to the locale configuration of the terminal. However,
    the repr() of a string needs to be consistent irrespective of the
    configuration of the terminal, so the only viable option is to use
    nothing but ASCII characters. Hence the difference.
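    To make the encoding argument concrete, here is a small sketch,
    again assuming Python 3 rather than the Python 2 of this thread. It
    shows that the same text becomes different bytes under two
    encodings, while an ASCII-only escaped form stays the same
    everywhere. The variable names are illustrative only; ascii() is the
    Python 3 built-in that escapes non-ASCII characters.

```python
# Python 3 sketch: one text, two encodings, two different byte sequences.
word = "süß"

as_latin1 = word.encode("latin1")  # one byte per character
as_utf8 = word.encode("utf-8")     # two bytes each for ü and ß

# A terminal only shows the word correctly if it decodes the bytes with
# the encoding that produced them, so raw bytes are not portable output.
print(len(as_latin1), len(as_utf8))

# An escaped form built from ASCII characters alone does not depend on
# any terminal setting.
words = ["abc", word, "def"]
escaped = "[" + ", ".join(ascii(w) for w in words) + "]"
print(escaped)
```

    The escaped line should match the ['abc', 's\xfc\xdf', 'def'] output
    from the original question.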

    HTH

    --
    Arnaud
     
    Arnaud Delobelle, Jan 20, 2011
    #3