UnicodeEncodeError when not running script from IDE

Discussion in 'Python' started by Magnus Pettersson, Feb 12, 2013.

  1. I am using Eclipse to write my Python scripts, and when I run them from inside Eclipse they work fine without errors.

    But in almost every script that handles some form of special characters, like Swedish åäö or Chinese characters, I get Unicode errors when running the script externally with python.exe or pythonw.exe (the scripts run completely fine from within Eclipse: standard PyDev projects, Python 2.7). I have usually launched the script GUI from within Eclipse because of this error, but now I want to get to the bottom of this so I don't have to open Eclipse every time I want to run a script!

    Here is the error I get now when running the script with python.exe:
    UnicodeEncodeError: 'charmap' codec can't encode character u'\u898b' in position 32: character maps to <undefined>

    What can I do to fix this?
    Magnus Pettersson, Feb 12, 2013
    #1

  2. Andrew Berg Guest

    On 2013.02.12 04:43, Magnus Pettersson wrote:
    > I am using Eclipse to write my python scripts and when i run them from inside eclipse they work fine without errors.
    >
    > But almost in every script that handle some form of special characters like swedish åäö and chinese characters etc i get Unicode errors when running the script externally with python.exe or pythonw.exe (but the scripts run completely fine from within Eclipse (standard pydev projects, python2.7). I have usually launched the script gui from wihin eclipse because of this error but now i want to get the bottom of this so i dont have to open eclipse everytime i want to run a script!
    >
    > Here is the error i get now when running the script with python.exe:
    > UnicodeEncodeError:'charmap' codec cant encode character u'\u898b' in position 32: character maps to <undefined>
    >
    > what can i do to fix this?
    >

    Since you didn't say what code actually does this, I'll turn to my
    crystal ball. It says you are trying to print characters to a terminal
    that doesn't support them. If that is the case, you could try changing
    the code page (but only 3.3 supports cp65001, so that probably won't
    help) or use replacement characters when printing.
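
A minimal sketch of the replacement-character idea, assuming the goal is only to keep print from raising. The helper name safe_text and the cp437 code page are illustrative assumptions, not something taken from the original script:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys


def safe_text(text, encoding=None):
    """Return text with characters the target encoding lacks replaced by '?'."""
    # sys.stdout (or its encoding) can be None when output is piped
    # or when the script runs under pythonw.exe, so fall back to ASCII.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, "replace").decode(encoding)


card = u"kanji: \u898b"
# cp437 stands in for a legacy Windows console code page (an assumption).
print(safe_text(card, "cp437"))
```

The kanji is lost in the output, so this only suits diagnostic printing; data written to disk should still go through an explicit UTF-8 encoding.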

    --
    CPython 3.3.0 | Windows NT 6.2.9200.16461 / FreeBSD 9.1-RELEASE
    Andrew Berg, Feb 12, 2013
    #2

  3. Magnus Pettersson wrote:

    > I am using Eclipse to write my python scripts and when i run them from
    > inside eclipse they work fine without errors.
    >
    > But almost in every script that handle some form of special characters
    > like swedish åäö and chinese characters etc


    A comment: they are not "special" characters. They're merely not American.


    > i get Unicode errors when
    > running the script externally with python.exe or pythonw.exe (but the
    > scripts run completely fine from within Eclipse (standard pydev projects,
    > python2.7). I have usually launched the script gui from wihin eclipse
    > because of this error but now i want to get the bottom of this so i dont
    > have to open eclipse everytime i want to run a script!
    >
    > Here is the error i get now when running the script with python.exe:
    > UnicodeEncodeError:'charmap' codec cant encode character u'\u898b' in
    > position 32: character maps to <undefined>


    Please show the *complete* traceback, including the line of code that causes
    the exception.


    > what can i do to fix this?


    My guess is that you are trying to print a character which your terminal
    cannot display. My terminal is set to use UTF-8, and so it can display it
    fine:

    py> c = u'\u898b'
    py> print(c)
    見


    (or at least it would display fine if the font used had a glyph for that
    character). Why there are still terminals in the world that don't default
    to UTF-8 is beyond me.

    If I manually change the terminal's encoding to Western European ISO 8859-1,
    I get some moji-bake:

    py> print(c)
    è¦


    I can't replicate the exception you give, so I assume it is specific to
    Windows.
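
The two outcomes described here can be reproduced without touching terminal settings. This is a sketch that assumes cp1252 as the Windows code page (the one that shows up in a later traceback in this thread); other code pages would behave similarly for this character:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

c = u"\u898b"                      # the character from the error message
utf8_bytes = c.encode("utf-8")     # three bytes: E8 A6 8B

# A terminal that reads those bytes as ISO 8859-1 shows one character
# per byte, which is the mojibake seen above.
mojibake = utf8_bytes.decode("latin-1")

# Encoding straight to a Windows code page fails instead, because
# cp1252 has no slot for U+898B.
try:
    c.encode("cp1252")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

print(error_message)
```

So mojibake comes from decoding bytes with the wrong codec, while the exception comes from encoding to a codec that lacks the character.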




    --
    Steven
    Steven D'Aprano, Feb 12, 2013
    #3
  4. Ahh, so it's the actual printing that makes it error out outside of Eclipse, because it's printing to a different terminal. It's the default DOS terminal in Windows that runs when I run the script with python.exe, and I guess it's the same when I run with pythonw.exe, just that the terminal window is not opened up, only the PyQt GUI in this case.

    I will try to fix it now that I know what it is :)

    I never thought about the terminal. Last time I had the same problem I just played around for hours with unicode encode and decode and all that not-so-fun stuff :)

    Andrew Berg: Thanks, your crystal ball seems to be right :p

    Magnus Pettersson, Feb 12, 2013
    #4
  5. I have now tried taking away the printing to the terminal and keeping just the writing to a .txt file on disk (which is the script's purpose):

    with open(filepath,"a") as f:
        for card in cardlist:
            f.write(card+"\n")

    The file it writes to exists and I'm just appending to it. When I run the script through Eclipse, all is fine. When I run it in the terminal I get this error instead:

    File "K:\dev\python\webscraping\kanji_anki.py", line 69, in savefile
      f.write(card+"\n")
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u898b' in position 32: ordinal not in range(128)

    Magnus Pettersson, Feb 12, 2013
    #5
  7. Peter Otten Guest

    Magnus Pettersson wrote:

    > I have tried now to take away printing to terminal and just keeping the
    > writing to a .txt file to disk (which is what the scripts purpose is):
    >
    > with open(filepath,"a") as f:
    >     for card in cardlist:
    >         f.write(card+"\n")
    >
    > The file it writes to exists and im just appending to it, but when i run
    > the script trough eclipse, all is fine. When i run in terminal i get this
    > error instead:
    >
    > File "K:\dev\python\webscraping\kanji_anki.py", line 69, in savefile
    >   f.write(card+"\n")
    > UnicodeEncodeError: 'ascii' codec can't encode character u'\u898b' in
    > position 32: ordinal not in range(128)


    Are you sure you are writing the same data? That would mean that pydev
    changes the default encoding -- which is evil.

    A portable approach would be to use codecs.open() or io.open() instead of
    the built-in:

    import io
    with io.open(filepath, "a") as f:
        ...

    io.open() uses UTF-8 by default, but you can specify other encodings with
    io.open(filepath, mode, encoding=whatever).
    Peter Otten, Feb 12, 2013
    #7
  8. > Are you sure you are writing the same data? That would mean that pydev
    > changes the default encoding -- which is evil.
    >
    > A portable approach would be to use codecs.open() or io.open() instead of
    > the built-in:
    >
    > import io
    > with io.open(filepath, "a") as f:
    >     ...
    >
    > io.open() uses UTF-8 by default, but you can specify other encodings with
    > io.open(filepath, mode, encoding=whatever).



    Interesting. PyDev must be doing something behind the scenes, because when I changed open() to io.open() I now get an error inside Eclipse:

    f.write(card+"\n")
    File "C:\python27\lib\encodings\cp1252.py", line 19, in encode
      return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character u'\u53c8' in position 32: character maps to <undefined>

    .....

    with io.open(filepath, "a", encoding="UTF-8") as f:

    Then it works in Eclipse. But I seem to be having an encoding problem all over the place: things work in Eclipse but don't work outside of Eclipse PyDev.

    Here is the flow of my data. I'm terrible at using unicode/encode/decode, so I could use some help here:

    kanji_anki_gui.py:

    def on_addButton_clicked(self):
        #code
        # self.kanji.text() comes from a kanji character typed into a PyQt4 QLineEdit
        kanji = unicode(self.kanji.text())
        card = kanji_anki.scrapeKanji(kanji,tags)
        #more code

    kanji_anki.py:

    def scrapeKanji(kanji, tags="", onlymeaning=False):
        baseurl = unicode("http://www.romajidesu.com/kanji/")
        url = unicode(baseurl+kanji)
        #test to write out url to disk, works outside of eclipse now
        savefile() #getting webpage works fine in e...d in cardlist: f.write(card "\n") return True
    Magnus Pettersson, Feb 12, 2013
    #8
  10. Peter Otten Guest

    Magnus Pettersson wrote:

    >> io.open() uses UTF-8 by default, but you can specify other encodings with
    >>
    >> io.open(filepath, mode, encoding=whatever).

    >
    >
    > Interesting. Pydev must be doing something behind the scenes because when
    > i changed open() to io.open() i get error inside of eclipse now:
    >
    > f.write(card+"\n")
    > File "C:\python27\lib\encodings\cp1252.py", line 19, in encode
    > return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    > UnicodeEncodeError: 'charmap' codec can't encode character u'\u53c8' in
    > position 32: character maps to <undefined>
    >
    > ....
    >
    > io.open(filepath, "a", encoding="UTF-8") as f:
    >
    > Then it works in eclipse. But I seem to be having an encoding problem all
    > over the place that works in eclipse but dosnt work outside of eclipse
    > pydev.


    No, I was wrong about the default; it is actually
    locale.getpreferredencoding(). Sorry for the confusion.
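
A short sketch of what that correction means in practice, assuming the file should be UTF-8. The sample cards and the temporary file path are hypothetical stand-ins for the real card list and deck file:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import io
import locale
import os
import tempfile

# The encoding io.open() falls back to when none is given. It depends on
# the machine's locale (the traceback in this thread points at cp1252).
print(locale.getpreferredencoding())

cardlist = [u"\u898b to see", u"\u53c8 again"]   # hypothetical sample cards

# Hypothetical path in a temporary directory, standing in for the real file.
filepath = os.path.join(tempfile.mkdtemp(), "cards.txt")

# Naming the encoding removes the dependence on the locale, so the write
# behaves the same from an IDE and from a console.
with io.open(filepath, "a", encoding="utf-8") as f:
    for card in cardlist:
        f.write(card + u"\n")

# Reading with the same encoding decodes the bytes back into unicode text.
with io.open(filepath, "r", encoding="utf-8") as f:
    lines = f.read().splitlines()
```

Passing encoding= on every open is the design choice here: it costs a few characters and makes the result independent of where the script is launched.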
    Peter Otten, Feb 12, 2013
    #10
  11. Dave Angel Guest

    On 02/12/2013 10:29 AM, Magnus Pettersson wrote:
    >> Are you sure you are writing the same data? That would mean that pydev
    >> changes the default encoding -- which is evil.
    >>
    >> A portable approach would be to use codecs.open() or io.open() instead of
    >> the built-in:
    >>
    >> import io
    >> with io.open(filepath, "a") as f:
    >>     ...
    >>
    >> io.open() uses UTF-8 by default, but you can specify other encodings with


    I think you are using Python 2.x, not Python 3. So you'd better be
    explicit what encodings you want for each file.

    >>
    >> io.open(filepath, mode, encoding=whatever).

    >
    >
    > Interesting. Pydev must be doing something behind the scenes because when i changed open() to io.open() i get error inside of eclipse now:


    What encoding is this file? Since you're appending to it, you really
    need to match the pre-existing encoding, or the next program to deal
    with it is in big trouble. So using the io.open() without the encoding=
    keyword is probably a mistake.

    >
    > f.write(card+"\n")
    > File "C:\python27\lib\encodings\cp1252.py", line 19, in encode
    > return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    > UnicodeEncodeError: 'charmap' codec can't encode character u'\u53c8' in position 32: character maps to <undefined>
    >
    > ....
    >



    --
    DaveA
    Dave Angel, Feb 12, 2013
    #11
  12. Terry Reedy Guest

    On 2/12/2013 7:34 AM, Magnus Pettersson wrote:
    > Ahh so its the actual printing that makes it error out outside of
    > eclipse because its a different terminal that its printing to. Its
    > the default DOS terminal in windows that runs then i run the script
    > with python.exe and i guess its the same when i run with pythonw.exe
    > just that the terminal window is not opened up, only the pyqt gui in
    > this case.


    Writing

    txt = <expression involving coding>
    print(txt)

    rather than

    print(<expression involving coding>)

    makes it easier to tell whether a UnicodeError comes from evaluating the
    expression or from the print operation.
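
As a concrete sketch of that split, assuming the text arrives as UTF-8 bytes (the raw value below is a hypothetical stand-in for scraped page data):

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

raw = b"\xe8\xa6\x8b"   # hypothetical bytes as they might arrive from a web page

# Step 1: evaluate the expression on its own line.
# A failure here is a decoding problem in the data handling.
txt = raw.decode("utf-8")

# Step 2: print separately. A failure here means the terminal's encoding
# cannot represent the character; the data itself is fine.
try:
    print(txt)
except UnicodeEncodeError as exc:
    print("print failed:", exc)
```

With the two steps on separate lines, the line number in the traceback says which of the two went wrong.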

    Using 3.3 instead of 2.7 will make using unicode somewhat easier.

    --
    Terry Jan Reedy
    Terry Reedy, Feb 12, 2013
    #12
  13. > What encoding is this file? Since you're appending to it, you really
    > need to match the pre-existing encoding, or the next program to deal
    > with it is in big trouble. So using the io.open() without the encoding=
    > keyword is probably a mistake.


    The .txt file is in UTF-8

    I have got it to work now in the terminal, but I don't understand what I'm doing, or why I didn't need all the unicode strings and encode mumbo jumbo in Eclipse.

    #Here kanji = u"ç§"
    baseurl = u"http://www.romajidesu.com/kanji/"
    url = baseurl+kanji
    savefile() #this test works now. uses: io.op... urllib2.urlopen(url.encode("UTF-8")) .....
    Magnus Pettersson, Feb 12, 2013
    #13
  14. Dave Angel Guest

    On 02/12/2013 12:12 PM, Magnus Pettersson wrote:
    >> < snip >
    >>

    > #Here kanji = u"ç§"
    > baseurl = u"http://www.romajidesu.com/kanji/"
    > url = baseurl+kanji
    > savefile() #this test works now. uses: io.op...me decoding when processing them. -- DaveA
    Dave Angel, Feb 12, 2013
    #14
  15. MRAB Guest

    On 2013-02-12 14:24, Magnus Pettersson wrote:
    > I have tried now to take away printing to terminal and just keeping the writing to a .txt file to disk (which is what the scripts purpose is):
    >
    > with open(filepath,"a") as f:
    >     for card in cardlist:
    >         f.write(card+"\n")
    >
    > The file it writes to exists and im just appending to it, but when i run the script trough eclipse, all is fine. When i run in terminal i get this error instead:
    >
    > File "K:\dev\python\webscraping\kanji_anki.py", line 69, in savefile
    >   f.write(card+"\n")
    > UnicodeEncodeError: 'ascii' codec can't encode character u'\u898b' in position 32: ordinal not in range(128)
    >

    When you open the file, tell it what encoding to use. For example:

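    # Python 3 syntax; on Python 2.7 the built-in open() has no encoding
    # argument, so use io.open() with the same arguments instead.
    with open(filepath, "a", encoding="utf-8") as f:
        for card in cardlist:
            f.write(card + "\n")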
    MRAB, Feb 12, 2013
    #15

  16. > You don't show the code that actually does the io.open(), nor the
    > url.encode, so I'm not going to guess what you're actually doing.


    Hmm, I'm not sure what you mean, but I wrote all the code needed in a previous post, so maybe you missed that one :)
    In short I basically just have:
    import io
    with io.open(myfile,"a",encoding="UTF-8") as f:
        f.write(my_ustring_with_kanji)

    The url.encode() is my unicode string variable named "url" using the built-in string method .encode(), which was the thing I wondered why I needed to use, and which you explained very well, thank you!

    Just one more question, since all this is still a little fuzzy in my head.

    When do I need to use .decode() in my code? Is it when I read lines from, for example, a UTF-8 file? And why didn't I have to use .encode() on my unicode string when running from within Eclipse PyDev? Someone wrote that it has a default codec setting, so maybe that handles it for me there (which is kind of dangerous, since my programs won't work outside of Eclipse: I didn't do any encoding or use unicode strings before in my script and it still worked).

    --Magnus
    Magnus Pettersson, Feb 13, 2013
    #16
  17. Magnus Pettersson wrote:


    > # This made the fetching of the website work. Why did i have to write
    > # url.encode("UTF-8") when url already is unicode? I feel i dont have a
    > # good understanding of this.
    > page = urllib2.urlopen(url.encode("UTF-8"))



    Start here:

    "The Absolute Minimum Every Software Developer Absolutely, Positively Must
    Know About Unicode and Character Sets (No Excuses!)"

    http://www.joelonsoftware.com/articles/Unicode.html


    Basically, Unicode is an in-memory data format. Python knows about Unicode
    characters (to be technical: code points), but files on disk do not.
    Neither do network protocols, or terminals, or other simple devices. They
    only understand bytes.

    So when you have Unicode text, and you want to write it to a file on disk,
    or print it, or send it over the network to another machine, it has to be
    *encoded* into bytes, and then *decoded* back into Unicode when you read it
    from the file again. Sometimes the system will "helpfully" do that encoding
    and decoding automatically for you, which is fine when it works but when it
    doesn't it can be perplexing.

    There are many, many, many different *encoding schemes*. ASCII is one. UTF-8
    is another. And then there are about a bazillion legacy encodings which, if
    you are lucky, you will never need to care about. Only some encodings can
    deal with the entire range of Unicode characters, most can only deal with a
    (typically small) subset of possible characters. E.g. ASCII only knows
    about 128 characters out of the million-plus that Unicode deals with.
    Latin-1 can handle close to 256 different characters. If you have a say in
    the matter, always use UTF-8, since it can handle the full set of Unicode
    characters in the most efficient manner.
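
The encode/decode round trip and the difference between encodings can be seen in a few lines. This is only a sketch; the helper name can_encode and the sample string are made up for illustration:

```python
# -*- coding: utf-8 -*-
text = u"\u00e5\u00e4\u00f6 \u898b"      # Swedish letters plus one kanji

encoded = text.encode("utf-8")           # unicode -> bytes, for disk or network
decoded = encoded.decode("utf-8")        # bytes -> unicode, after reading back


def can_encode(s, encoding):
    """Report whether every character of s exists in the given encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# ASCII fails on the Swedish letters, Latin-1 fails on the kanji,
# and UTF-8 handles all of them.
results = {name: can_encode(text, name) for name in ("ascii", "latin-1", "utf-8")}
```

This also answers when .decode() is needed: whenever bytes come in from a file, socket, or web page and have not already been decoded by something like io.open() with an encoding argument.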


    --
    Steven
    Steven D'Aprano, Feb 13, 2013
    #17
  18. Thanks a lot Steven, you gave me a good AHA experience! :)

    Now I understand why I had to use encoding when calling urllib2! So basically Eclipse PyDev does this in the background for me, and its console supports UTF-8, so that's why I never had to think about it before (and why some scripts tend to fail with Unicode errors when run outside of the Eclipse IDE).

    cheers
    Magnus

    Magnus Pettersson, Feb 13, 2013
    #18