unicode newbie - printing mixed languages to the terminal

Discussion in 'Python' started by David, May 4, 2008.

  1. David

    David Guest

    Hi list.

    I've never used unicode in a Python script before, but I need to now.
    I'm not sure where to start. I'm hoping that a kind soul can help me
    out here.

    My current (almost non-existent) knowledge of unicode:

    From the docs I know about the unicode string type, and how to declare
    string types. What I don't understand yet is what encodings are and
    when you'd want/need to use them. What I'd like is to just be able to
    print out unicode strings in mixed languages, and they'd appear on the
    terminal the same way they get shown in a web browser (when you have
    appropriate fonts installed), without any fuss.

    Here is an example of how I'd like my script to work:

    $ ./test.py

    Random hiragana: <some jp characters>
    Random romaji: kakikukeko

    Is this possible?

    From my limited knowledge, I *think* I need to do two things:

    1) In my Python script, run .encode() on my unicode variable before
    printing it out (I assume I need to encode into Japanese)

    Question: How does this work when you have multiple languages in a
    single unicode string? Do you need to separate them into separate
    strings (one per language) and print them separately?

    Or is it the case that you can (unlike a web browser) *only*
    display/print one language at a time? (I really want mixed language -
    English AND Japanese).

    2) Set up the terminal to display the output. From various online docs
    it looks like I need to set the LANG environment variable to Japanese,
    and then start konsole (or gnome-terminal if that will work better).
    But again, it looks like this limits me to 1 language.

    If what I want to do is very hard, I'll output html instead and view
    it in a web browser. But I'd prefer to use the terminal instead if
    possible :)

    Thanks in advance.

    David.
    David, May 4, 2008
    #1

  2. David wrote:
    > Hi list.
    >
    > I've never used unicode in a Python script before, but I need to now.
    > I'm not sure where to start. I'm hoping that a kind soul can help me
    > out here.
    >
    > My current (almost non-existent) knowledge of unicode:
    >
    > From the docs I know about the unicode string type, and how to declare
    > string types. What I don't understand yet is what encodings are and
    > when you'd want/need to use them. What I'd like is to just be able to
    > print out unicode strings in mixed languages, and they'd appear on the
    > terminal the same way they get shown in a web browser (when you have
    > appropriate fonts installed), without any fuss.
    >
    > Here is an example of how I'd like my script to work:
    >
    > $ ./test.py
    >
    > Random hiragana: <some jp characters>
    > Random romaji: kakikukeko
    >
    > Is this possible?
    >
    > From my limited knowledge, I *think* I need to do two things:
    >
    > 1) In my Python script, run .encode() on my unicode variable before
    > printing it out (I assume I need to encode into Japanese)
    >
    > Question: How does this work when you have multiple languages in a
    > single unicode string? Do you need to separate them into separate
    > strings (one per language) and print them separately?
    >
    > Or is it the case that you can (unlike a web browser) *only*
    > display/print one language at a time? (I really want mixed language -
    > English AND Japanese).
    >
    > 2) Set up the terminal to display the output. From various online docs
    > it looks like I need to set the LANG environment variable to Japanese,
    > and then start konsole (or gnome-terminal if that will work better).
    > But again, it looks like this limits me to 1 language.
    >
    > If what I want to do is very hard, I'll output html instead and view
    > it in a web browser. But I'd prefer to use the terminal instead if
    > possible :)


    I suggest you read http://www.amk.ca/python/howto/unicode to demystify
    what Unicode is and does, and how to use it in Python.

    Printing text from different languages is possible if and only if the
    output device (terminal, in this case) supports a character encoding
    that accommodates all the characters you wish to print. UTF-8 is a
    fairly ubiquitous candidate that meets that criterion, since it
    encompasses Unicode in its entirety (as opposed to latin-1, for example,
    which only includes a very small subset of Unicode).
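
    As a rough illustration of the point above (a sketch only, assuming
    Python 3, where every str is already a Unicode string; in Python 2
    you would write u'...' literals instead), a single string can mix
    scripts freely. Whether it can be encoded depends on the target
    encoding, not on the "language" of the text. The sample text and
    variable names here are made up for the example:

```python
# Hypothetical sample text mixing Japanese hiragana and English.
mixed = "Random hiragana: かきくけこ / Random romaji: kakikukeko"

# UTF-8 can represent every Unicode code point, so this succeeds.
utf8_bytes = mixed.encode("utf-8")

# latin-1 only covers code points U+0000..U+00FF, so the hiragana
# characters cannot be encoded and Python raises UnicodeEncodeError.
try:
    mixed.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Decoding the UTF-8 bytes gives back the original string unchanged.
round_trip = utf8_bytes.decode("utf-8")

print(len(mixed), len(utf8_bytes), latin1_ok, round_trip == mixed)
```

    So there is no need to split the text into one string per language;
    one encoding that covers all the characters is enough.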

    HTH,

    --
    Carsten Haese
    http://informixdb.sourceforge.net
    Carsten Haese, May 4, 2008
    #2

  3. David

    David Guest

    > I suggest you read http://www.amk.ca/python/howto/unicode to demystify what
    > Unicode is and does, and how to use it in Python.


    That document really helped.

    This page helped me to set up the console: http://www.jw-stumpel.nl/stestu.html#T3

    I ran:

    dpkg-reconfigure locales # and enabled en_ZA.utf8
    update-locale LANG=en_ZA.utf8

    (And then rebooted, but I don't know if that was necessary).

    I can now print mixed language unicode to the console from Python.
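
    For anyone wanting to check the result from Python itself, here is
    a minimal sketch (assuming Python 3; the helper name can_print and
    the sample text are made up for this example) that looks at the
    encoding Python will use for stdout and only prints the mixed text
    if that encoding can represent it:

```python
import locale
import sys


def can_print(text, encoding):
    """Return True if `text` can be encoded with `encoding` without errors."""
    try:
        text.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        return False


# sys.stdout.encoding may be missing or None (for example when stdout is
# replaced by another object), so fall back to the locale's preferred encoding.
stdout_encoding = getattr(sys.stdout, "encoding", None) or locale.getpreferredencoding(False)

sample = "かきくけこ kakikukeko"  # hypothetical mixed Japanese/English text

if can_print(sample, stdout_encoding):
    print(sample)
else:
    print("stdout encoding %s cannot represent the sample text" % stdout_encoding)
```

    With a UTF-8 locale such as the one configured above, the reported
    encoding would normally be UTF-8 and the sample prints as is.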

    Thanks for your help.

    David.
    David, May 5, 2008
    #3
