Looking for an appropriate encoding standard that supports all languages

ata.jaf · Aug 17, 2010

I am developing a little program in Mac with wxPython.
But I have problems with the characters that are not in ASCII. Like
some special characters in French or Turkish.
So I am looking for a way to solve this. Like an encoding standard
that supports all languages. Or some other way.

Thanks
Ata Jafari

Thomas Jollans · Aug 17, 2010

I am developing a little program in Mac with wxPython.
But I have problems with the characters that are not in ASCII. Like
some special characters in French or Turkish.
So I am looking for a way to solve this. Like an encoding standard
that supports all languages. Or some other way.

Anything that supports all of Unicode will do. Like UTF-8. If your text is
mostly Latin, then just go for UTF-8, if you use other alphabets extensively,
you might want to consider UTF-16, which might then use a little less space.
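
To make the size trade-off concrete, here is a small sketch that encodes a few sample strings both ways and compares byte counts. The sample sentences and the `samples`/`sizes` names are made up for illustration; "utf-16-le" is used so that the 2-byte BOM is not counted. It is written for Python 3, though the u"" prefix keeps the literals valid in Python 2 as well.

```python
# -*- coding: utf-8 -*-
# Illustrative strings only; any text of your own can be substituted.
samples = {
    "french": u"Où est la bibliothèque ?",
    "turkish": u"Günaydın, nasılsınız?",
    "japanese": u"こんにちは、世界",
}

sizes = {}
for name, text in samples.items():
    utf8_len = len(text.encode("utf-8"))       # 1 to 4 bytes per character
    utf16_len = len(text.encode("utf-16-le"))  # 2 or 4 bytes per character, no BOM
    sizes[name] = (utf8_len, utf16_len)
    print(name, sizes[name])
```

Mostly-Latin text comes out smaller in UTF-8, while the Japanese sample comes out smaller in UTF-16.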

ata.jaf · Aug 19, 2010

Anything that supports all of Unicode will do. Like UTF-8. If your text is
mostly Latin, then just go for UTF-8, if you use other alphabets extensively,
you might want to consider UTF-16, which might then use a little less space.

OK, I used UTF-8.
I write a line of strings in the source code and I want my program to
show that as an output on GUI. And this line of strings includes a
character like "ü". But I see that in GUI this character is replaced
with other strange characters. I mean it doesn't work.
And when I try to use UTF-16, I get a syntax error that declares
"UTF-16 stream does not start with BOM".

Steven D'Aprano · Aug 19, 2010

OK, I used UTF-8.
I write a line of strings in the source code

Do you have a source code encoding line at the start of your script?

http://www.python.org/dev/peps/pep-0263/

and I want my program to
show that as an output on GUI. And this line of strings includes a
character like "ü". But I see that in GUI this character is replaced
with other strange characters. I mean it doesn't work. And when I try
to use UTF-16, I get a syntax error that declares "UTF-16 stream does
not start with BOM".

What GUI are you using?

Please COPY AND PASTE (do not retype) the EXACT error message you get,
including the entire traceback.

Martin v. Loewis · Aug 19, 2010

I write a line of strings in the source code and I want my program to
show that as an output on GUI. And this line of strings includes a
character like "ü".

Make sure you use Unicode literals in your source code, i.e. u"ü".

HTH,
Martin
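
A short sketch of why the "strange characters" appear, assuming the usual cause: the UTF-8 bytes of the character get interpreted as a single-byte encoding such as Latin-1 somewhere along the way. In Python 2, a plain "ü" in a UTF-8 source file is a two-byte byte string, whereas u"ü" is a single-character Unicode string. The variable names below are only for illustration.

```python
# -*- coding: utf-8 -*-
original = u"\u00fc"                     # the single character ü

utf8_bytes = original.encode("utf-8")    # two bytes: 0xC3 0xBC
misread = utf8_bytes.decode("latin-1")   # each byte read as its own character
roundtrip = utf8_bytes.decode("utf-8")   # decoded with the matching codec

print(repr(utf8_bytes))
print(len(misread), len(roundtrip))
```

Decoding with the wrong codec yields two characters (Ã followed by ¼) instead of one, which matches the kind of garbage typically seen in a GUI label.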

ata.jaf · Aug 20, 2010

Do you have a source code encoding line at the start of your script?

http://www.python.org/dev/peps/pep-0263/

What GUI are you using?

Please COPY AND PASTE (do not retype) the EXACT error message you get,
including the entire traceback.

Yes I have a source code encoding line.
Here it is:

# -*- coding: utf_16 -*-

I am using WxPython.

And the error that I get about using utf-16 is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "z.py", line 2
SyntaxError: UTF-16 stream does not start with BOM

Rami Chowdhury · Aug 20, 2010

Yes I have a source code encoding line.
Here it is:

# -*- coding: utf_16 -*-

I am using WxPython.

And the error that I get about using utf-16 is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "z.py", line 2
SyntaxError: UTF-16 stream does not start with BOM

Which encoding are you saving your script in? Very few of the text
editors I've used save to UTF-16 by default.
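
One way to check is to look at the first bytes of the file. The sketch below does this on in-memory data rather than on z.py itself: it encodes a two-line script both ways and tests for a UTF-16 byte order mark. The helper name `starts_with_utf16_bom` is made up for this example; the BOM constants come from the standard `codecs` module.

```python
import codecs

# Stand-in for the contents of a script; not the actual z.py.
text = u"# -*- coding: utf_16 -*-\nprint(u'\u00fc')\n"

as_utf16 = text.encode("utf-16")  # the "utf-16" codec prepends a BOM
as_utf8 = text.encode("utf-8")    # what most editors would actually save


def starts_with_utf16_bom(data):
    """Return True if the byte string begins with a UTF-16 BOM."""
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


print(starts_with_utf16_bom(as_utf16))
print(starts_with_utf16_bom(as_utf8))
```

If the file on disk looks like the second case, the coding line claims UTF-16 but the bytes are something else, which is consistent with the "does not start with BOM" error.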

John Nagle · Aug 20, 2010

Try "utf_8".

Which encoding are you saving your script in? Very few of the text
editors I've used save to UTF-16 by default.

Most editors that will do Unicode save files as "utf-8".
Try that.

John Nagle

Thomas Jollans · Aug 20, 2010

OK, I used UTF-8.
I write a line of strings in the source code and I want my program to
show that as an output on GUI. And this line of strings includes a
character like "ü". But I see that in GUI this character is replaced
with other strange characters. I mean it doesn't work.
And when I try to use UTF-16, I get a syntax error that declares
"UTF-16 stream does not start with BOM".

I get the feeling you're not actually using the encoding you say you're using,
or not telling every program involved what you're doing.

1. Save the file in the correct encoding. Either tell your text editor to use
a specific encoding (UTF-8 would be a good choice), or find out what encoding
your text editor is using and use that encoding during the rest of the
process.

2. Tell Python which encoding you're using. The coding: line will do the
trick, *provided* you don't lie, and the encoding you specify in the file is
actually the encoding you're using to store the file on disk.

3. Instruct your GUI library to do the right thing. If you use unicode strings
(either by using Python 3 or by using the u"Käse" syntax in Python 2), that
should be enough, otherwise, if you're using byte strings, which you shouldn't
be doing in this case, you might have to tell the library what you're doing,
or use the customary encoding. (For GTK+, this is UTF-8. For other libraries,
it might be Latin-1, or system-dependent)
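
Steps 1 and 2 can be sketched without any GUI, assuming Python 3: write a tiny script to disk as UTF-8 with a matching coding line, then compile the raw bytes so that Python has to rely on that declaration. The file name demo_utf8.py and the `label` variable are hypothetical; step 3 would amount to handing the resulting Unicode string to the GUI library.

```python
import io
import os
import tempfile

# Source text of a tiny script whose coding line matches how it is saved.
source = u'# -*- coding: utf-8 -*-\nlabel = u"K\u00e4se"\n'

# Step 1: save the file in a known encoding (UTF-8 here).
path = os.path.join(tempfile.mkdtemp(), "demo_utf8.py")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(source)

# Step 2: compile the raw bytes; Python reads the coding line to decode them.
namespace = {}
with io.open(path, "rb") as f:
    exec(compile(f.read(), path, "exec"), namespace)

label = namespace["label"]  # a 4-character Unicode string
print(len(label))
```

If the declared encoding and the bytes on disk disagree, this is the point where either a syntax error or a silently wrong string shows up.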

Ata Jafari · Aug 23, 2010

I get the feeling you're not actually using the encoding you say you're using,
or not telling every program involved what you're doing.

1. Save the file in the correct encoding. Either tell your text editor to use
a specific encoding (UTF-8 would be a good choice), or find out what encoding
your text editor is using and use that encoding during the rest of the
process.

2. Tell Python which encoding you're using. The coding: line will do the
trick, *provided* you don't lie, and the encoding you specify in the file is
actually the encoding you're using to store the file on disk.

3. Instruct your GUI library to do the right thing. If you use unicode strings
(either by using Python 3 or by using the u"Käse" syntax in Python 2), that
should be enough, otherwise, if you're using byte strings, which you shouldn't
be doing in this case, you might have to tell the library what you're doing,
or use the customary encoding. (For GTK+, this is UTF-8. For other libraries,
it might be Latin-1, or system-dependent)

Finally I did it.
I was making some stupid mistakes.
Thanks a lot.
Ata
