HTML, Unicode and character sets

jb

If an HTML file is created in 16-bit Unicode format and an appropriate
character set is used (e.g. GB2312, a Chinese character set), will it
display correctly?

Will it be necessary to use any character codes?
 
Benjamin Niemann

jb said:
If an HTML file is created in 16-bit Unicode format and an appropriate
character set is used (e.g. GB2312, a Chinese character set), will it
display correctly?

When using a 16-bit encoding, you must configure your web server so that
it sends the encoding in the Content-Type HTTP header (Content-Type:
text/html; coding=GB2312).
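
A minimal sketch of what the server side might look like, in Python. The helper name build_html_response is made up for illustration and is not part of any framework. Note that the parameter is spelled charset here, which is the spelling HTTP actually uses (see the correction at the end of this thread).

```python
def build_html_response(text, charset="gb2312"):
    """Return (headers, body) for an HTML page encoded in `charset`.

    Hypothetical helper: a real server would hand these values to its
    own response API instead of returning them.
    """
    body = text.encode(charset)  # raises UnicodeEncodeError if a character does not fit
    headers = {
        # The encoding travels in the charset parameter of Content-Type.
        "Content-Type": f"text/html; charset={charset.upper()}",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_html_response("<p>中文</p>")
print(headers["Content-Type"])  # text/html; charset=GB2312
```

The point of the sketch is only that the bytes of the body and the declared charset must agree; how the header is actually set depends on the web server in use.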

I don't know, though, whether GB2312 is among the widely supported
encodings, i.e. whether there are browsers that do not support it.

jb said:
Will it be necessary to use any character codes?

What do you mean by 'character codes'?
 
jb

Benjamin Niemann said:
When using a 16-bit encoding, you must configure your web server so that
it sends the encoding in the Content-Type HTTP header (Content-Type:
text/html; coding=GB2312).

That makes sense. Will it work for local files?

Benjamin Niemann said:
I don't know, though, whether GB2312 is among the widely supported
encodings, i.e. whether there are browsers that do not support it.

Benjamin Niemann said:
What do you mean by 'character codes'?

...codes like this: &ouml;

I'm working on an HTML generator which isn't working properly for a
Chinese user. Unicode support will be added if it solves the problem.
 
Benjamin Niemann

I have to correct myself. GB2312 is a variable-length encoding which is
ASCII compatible. This means that in most cases it could be sufficient to
declare the encoding as
<meta http-equiv="Content-Type" content="text/html; coding=GB2312">. This
will only work if the HTTP header does not contain the coding parameter,
because the value in the HTTP header takes precedence.
It is still *strongly* recommended to declare the encoding in the HTTP
headers.
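
The precedence described here could be sketched roughly like this in Python. effective_charset is a hypothetical helper, not a browser API, and real browsers apply more elaborate detection rules than this; the parameter is written charset, as corrected at the end of this thread.

```python
from email.message import Message


def effective_charset(http_content_type, meta_charset):
    """Pick an encoding: HTTP header first, META declaration second.

    Hypothetical helper illustrating the precedence only.
    """
    if http_content_type:
        msg = Message()
        msg["Content-Type"] = http_content_type
        header_charset = msg.get_content_charset()  # lowercased, or None if absent
        if header_charset:
            return header_charset
    return meta_charset.lower() if meta_charset else None


# The header wins when it names a charset ...
print(effective_charset("text/html; charset=GB2312", "utf-8"))  # gb2312
# ... and the META value is only used when the header is silent.
print(effective_charset("text/html", "GB2312"))                 # gb2312
```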

jb said:
Will it work for local files?

With a META element carrying the coding parameter, this should also work
for local files (which is the only reason I can think of to use <meta
http-equiv..> at all).
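
To illustrate why this can work for a local file in GB2312 but not in a 16-bit encoding, here is a rough Python sketch. sniff_meta_charset and PAGE are invented for this example, and the scan is far cruder than what a browser really does; it assumes the declaration uses the charset spelling mentioned at the end of this thread.

```python
import re

# Hypothetical sample page; only the META line matters for the sketch.
PAGE = (
    "<html><head>\n"
    '<meta http-equiv="Content-Type" content="text/html; charset=GB2312">\n'
    "<title>中文</title></head>\n"
    "<body><p>中文</p></body></html>\n"
)


def sniff_meta_charset(raw):
    """Look for charset=... in the first bytes, treating them as ASCII."""
    head = raw[:1024].decode("ascii", errors="replace")
    match = re.search(r"charset=[\"']?([\w-]+)", head, re.IGNORECASE)
    return match.group(1) if match else None


raw = PAGE.encode("gb2312")            # the bytes as they would sit on disk
charset = sniff_meta_charset(raw)
print(charset)                         # GB2312
print(raw.decode(charset) == PAGE)     # True

# In UTF-16 every ASCII character is followed by a zero byte, so the
# declaration cannot be found before the encoding is already known.
print(sniff_meta_charset(PAGE.encode("utf-16")))  # None
```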

jb said:
...codes like this: &ouml;

These are called 'character entity references'. You'll have to use these
whenever you need a character which is not available in GB2312.
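
For an HTML generator, that fallback might be sketched like this in Python. The function names to_entity and encode_with_entities are invented for this example; the sketch assumes Python's built-in gb2312 codec and uses the named references from html.entities where one exists, falling back to a numeric reference otherwise.

```python
from html.entities import codepoint2name


def to_entity(ch):
    """Return a named reference if HTML defines one, else a numeric one."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else f"&#{ord(ch)};"


def encode_with_entities(text, encoding="gb2312"):
    """Encode text, replacing characters the encoding lacks with references."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            # Character is not available in the target encoding.
            out += to_entity(ch).encode("ascii")
    return bytes(out)


print(to_entity("ö"))                # &ouml;
print(encode_with_entities("中€"))   # b'\xd6\xd0&euro;'
```

If numeric references are acceptable, the built-in error handler does the same job in one call: text.encode("gb2312", errors="xmlcharrefreplace").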
 
Toby Inkster

Benjamin said:
<meta http-equiv="Content-Type" content="text/html; coding=GB2312">. This
will only work if the HTTP header does not contain the coding parameter.

I think you mean "charset".
 
