HTML, Unicode and character sets

Discussion in 'HTML' started by jb, Mar 28, 2006.

  1. jb

    jb Guest

    If an HTML file is created in 16-bit Unicode format and an appropriate
    character set is used (e.g. GB2312, a flavor of Chinese), will it
    display correctly?

    Will it be necessary to use any character codes?
     
    jb, Mar 28, 2006
    #1

  2. jb wrote:

    > If an html file is created in 16-bit unicode format and an appropriate
    > character set is used (e.g. GB2312, a flavor of chinese), will it
    > display correctly?


    When using 16-bit encoding, you must configure your webserver correctly, so
    it sends the encoding in the Content-Type HTTP header (Content-Type:
    text/html; coding=GB2312).

    I don't know, though, whether GB2312 is among the widely supported
    encodings, i.e. whether there are browsers that do not support it.

    > Will it be necessary to use any character codes?


    What do you mean by 'character codes'?

    --
    Benjamin Niemann
    Email: pink at odahoda dot de
    WWW: http://pink.odahoda.de/
     
    Benjamin Niemann, Mar 28, 2006
    #2

  3. jb

    jb Guest

    > jb wrote:
    >
    >> If an html file is created in 16-bit unicode format and an appropriate
    >> character set is used (e.g. GB2312, a flavor of chinese), will it
    >> display correctly?

    >
    > When using 16-bit encoding, you must configure your webserver correctly, so
    > it sends the encoding in the Content-Type HTTP header (Content-Type:
    > text/html; coding=GB2312).


    That makes sense. Will it work for local files?

    >
    > I don't know though, if GB2312 belongs to the widely supported encodings -
    > i.e. if there are browsers that do not support it.
    >
    >> Will it be necessary to use any character codes?

    >
    > What do you mean by 'character codes'?
    >


    ...codes like this: &ouml;

    I'm working on an HTML generator which isn't working properly for a
    Chinese user. Unicode support will be added if it solves the problem.
     
    jb, Mar 28, 2006
    #3
  4. jb wrote:

    >> jb wrote:
    >>
    >>> If an html file is created in 16-bit unicode format and an appropriate
    >>> character set is used (e.g. GB2312, a flavor of chinese), will it
    >>> display correctly?

    >>
    >> When using 16-bit encoding, you must configure your webserver correctly,
    >> so it sends the encoding in the Content-Type HTTP header (Content-Type:
    >> text/html; coding=GB2312).


    I have to correct myself. GB2312 is a variable-length encoding which is
    ASCII compatible. This means that it could be sufficient in most cases to
    declare the encoding as
    <meta http-equiv="Content-Type" content="text/html; coding=GB2312">. This
    will only work, if the HTTP header does not contain the coding parameter.
    It is still *strongly* recommended to declare the encoding in the HTTP
    headers.
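
Editor's sketch of the HTTP-header approach described above, using Python's standard http.server module. Note that the parameter is actually named "charset", not "coding", as pointed out further down in this thread. The page text, the address and the port are made-up placeholders, and a real site would normally set this in the web server configuration instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

CHARSET = "GB2312"  # assumption: every character on the page exists in GB2312
PAGE = "<html><body><p>你好</p></body></html>"  # placeholder page text


def content_type(charset: str) -> str:
    """Build the Content-Type value; the parameter name is 'charset'."""
    return f"text/html; charset={charset}"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Encode the body with the same charset that the header announces.
        body = PAGE.encode(CHARSET)
        self.send_response(200)
        self.send_header("Content-Type", content_type(CHARSET))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the demo server until interrupted (hypothetical host and port)."""
    HTTPServer((host, port), Handler).serve_forever()
```

Calling serve() and opening http://127.0.0.1:8000/ should show the page; the point is only that the announced charset and the bytes actually sent must agree.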

    > Will it work for local files?


    With a META element with the coding parameter this should also work for
    local files (which is the only reason I could think of to use <meta
    http-equiv..> at all).
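
Editor's sketch of the local-file case: a generator could write the page as GB2312 bytes with the META declaration inside it. Because GB2312 is ASCII compatible, the declaration is stored as plain ASCII bytes that a browser can read before it knows the encoding; in a 16-bit encoding such as UTF-16 that is not the case. The file name and page text are placeholders, and the parameter is written as "charset" (see the correction later in this thread).

```python
import tempfile
from pathlib import Path

META = '<meta http-equiv="Content-Type" content="text/html; charset=GB2312">'
HTML = f"<html><head>{META}</head><body><p>你好</p></body></html>"


def write_page(path: Path, encoding: str = "gb2312") -> bytes:
    """Encode the page and write the raw bytes to a local file."""
    data = HTML.encode(encoding)
    path.write_bytes(data)
    return data


with tempfile.TemporaryDirectory() as tmp:
    # "page.html" is a hypothetical file name for this demonstration.
    data = write_page(Path(tmp) / "page.html")

# In GB2312 the declaration survives as a readable ASCII byte sequence.
meta_is_plain_ascii = META.encode("ascii") in data

# In UTF-16 every ASCII character takes two bytes, so it does not.
utf16_keeps_ascii = META.encode("ascii") in HTML.encode("utf-16")

print(meta_is_plain_ascii, utf16_keeps_ascii)
```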

    >>> Will it be necessary to use any character codes?

    >>
    >> What do you mean by 'character codes'?
    >>

    >
    > ...codes like this: &ouml;


    These are called 'character entity references'. You'll have to use these
    whenever you need a character which is not available in GB2312.
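
Editor's sketch of how a generator might apply this automatically in Python: the "xmlcharrefreplace" error handler of str.encode replaces every character the target encoding cannot represent. It emits numeric character references (such as &#246;) instead of named ones (such as &ouml;); browsers treat both the same way. The function name and the sample text are made up for illustration.

```python
import html


def encode_for_charset(text: str, charset: str = "gb2312") -> bytes:
    """Encode text, replacing unencodable characters with numeric references."""
    return text.encode(charset, errors="xmlcharrefreplace")


sample = "你好 ö"  # "ö" is not part of GB2312, the Chinese characters are
encoded = encode_for_charset(sample)

# Decoding the bytes and resolving the references restores the original text.
decoded = html.unescape(encoded.decode("gb2312"))

print(encoded)
print(decoded == sample)
```

Only text content should go through this step as-is; markup characters such as < and & in the text would still need separate escaping.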

    --
    Benjamin Niemann
    Email: pink at odahoda dot de
    WWW: http://pink.odahoda.de/
     
    Benjamin Niemann, Mar 28, 2006
    #4
  5. Toby Inkster

    Toby Inkster Guest

    Benjamin Niemann wrote:

    > <meta http-equiv="Content-Type" content="text/html; coding=GB2312">. This
    > will only work, if the HTTP header does not contain the coding parameter.


    I think you mean "charset".

    --
    Toby A Inkster BSc (Hons) ARCS
    Contact Me ~ http://tobyinkster.co.uk/contact
     
    Toby Inkster, Mar 28, 2006
    #5
  6. Toby Inkster wrote:

    > Benjamin Niemann wrote:
    >
    >> <meta http-equiv="Content-Type" content="text/html; coding=GB2312">. This
    >> will only work, if the HTTP header does not contain the coding parameter.

    >
    > I think you mean "charset".


    Oops. You're right.

    --
    Benjamin Niemann
    Email: pink at odahoda dot de
    WWW: http://pink.odahoda.de/
     
    Benjamin Niemann, Mar 29, 2006
    #6