Changing or at least detecting character encoding via JavaScript?

Discussion in 'Javascript' started by David Komanek, Sep 24, 2003.

  1. Hi all,

    I would like to ask whether it is possible to manipulate the
    character encoding settings in MS Internet Explorer 5.0, 5.5 and 6.0. The
    problem is that the default installation of MS IE seems to have the
    encoding hard-set to "Western European (ISO)", which means
    iso-8859-1. When browsing pages with some Central/Eastern European
    characters, these are interpreted as iso-8859-1 and so displayed wrong.

    I would suppose the "auto-select" option should be default, so the
    browser can select the right encoding according to the meta-tags in
    the head of webpage. But this is apparently not true.

    Is it possible to use JavaScript or a Java applet to get
    information about the client's current character encoding setting
    and/or change it to the "auto-select" value? If so, how?

    Thanks in advance,

    David Komanek
    David Komanek, Sep 24, 2003
    #1

  2. Re: changing or at least detecting character encoding via javascript?

    David Komanek wrote:

    > Hi all,
    >
    > I have a question if it is possible to manipulate the settings of
    > character encoding in Ms Internet Explorer 5.0, 5.5 and 6.0. The
    > problem is that the default installation of Ms IE seems to have hard
    > selected default encoding to "Western European (ISO)", which means
    > iso-8859-1. When browsing pages with some Central/Eastern European
    > characters these are converted to iso-8859-1 so displayed wrong.
    >
    > I would suppose the "auto-select" option should be default, so the
    > browser can select the right encoding according to the meta-tags in
    > the head of webpage. But this is apparently not true.
    >
    > Please, is it possible to use JavaScript or Java applet to get the
    > information about the current client character encoding settings
    > and/or change it to the "auto-select" value ? How to do this ?


    What about using the HTML <meta> tag:
    <meta http-equiv="Content-Type" content="text/html; charset=yourCharsetHere">

    --

    Martin Honnen
    http://JavaScript.FAQTs.com/
    Martin Honnen, Sep 24, 2003
    #2

  3. David Komanek

    VK Guest

    All browsers respect and properly handle the encoding set via the meta tag.
    The "auto-select" option in IE is used only if the page has no encoding set
    by either the meta tag or the server. Then the browser tries to guess the
    encoding from the byte values of the characters (and its guess is usually
    wrong, so you have to change it manually).

    If your browser doesn't display a page, or part of a page, in the proper
    encoding, it means one of the following:
    1) The page has no encoding set by either the meta tag or the server. The
    browser has "auto-select" enabled, it tried to guess the encoding, and it missed.
    2) The encoding is set properly, but your system doesn't have a
    corresponding font to display it.
    3) Several character sets are used on the same page (for example, Latin
    and Cyrillic), and the page encoding is not "utf-8". Only UTF-8
    (Unicode) allows this.
    4) Something's broken in your browser. Reinstall it.

    P.S. If you really want to go some very special way, see the document.charset
    property (read/write, but buggy).


    David Komanek <> wrote in message
    news:...
    > Hi all,
    >
    > I have a question if it is possible to manipulate the settings of
    > character encoding in Ms Internet Explorer 5.0, 5.5 and 6.0. The
    > problem is that the default installation of Ms IE seems to have hard
    > selected default encoding to "Western European (ISO)", which means
    > iso-8859-1. When browsing pages with some Central/Eastern European
    > characters these are converted to iso-8859-1 so displayed wrong.
    >
    > I would suppose the "auto-select" option should be default, so the
    > browser can select the right encoding according to the meta-tags in
    > the head of webpage. But this is apparently not true.
    >
    > Please, is it possible to use JavaScript or Java applet to get the
    > information about the current client character encoding settings
    > and/or change it to the "auto-select" value ? How to do this ?
    >
    > Thanks in advance,
    >
    > David Komanek
    VK, Sep 24, 2003
    #3
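As a rough illustration of VK's P.S., script can at least report which encoding the browser used to decode the page. The sketch below assumes the standard `document.characterSet` property and falls back to the older `document.charset` that VK mentions; the helper name `detectPageEncoding` is made up for this example. As far as I know, current browsers treat both properties as read-only, so this only detects the encoding and does not change any browser setting.

```javascript
// Returns the encoding the browser used to decode the page, or null if unknown.
// `doc` is any Document-like object; in a browser, pass the global `document`.
function detectPageEncoding(doc) {
  // document.characterSet is the standard name; document.charset is the
  // older, IE-era property referred to in the thread.
  const name = doc.characterSet || doc.charset || null;
  return name ? String(name).toLowerCase() : null;
}

// Only runs inside a browser, where a global `document` exists.
if (typeof document !== "undefined") {
  console.log("Page decoded as:", detectPageEncoding(document));
}
```

Comparing the reported value with the encoding the page was authored in (iso-8859-2 here) is one way to tell a visitor that the page is being decoded with the wrong charset.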
  4. Hello!

    (David Komanek) wrote in message news:<>...
    > Hi all,
    >
    > I have a question if it is possible to manipulate the settings of
    > character encoding in Ms Internet Explorer 5.0, 5.5 and 6.0. The
    > problem is that the default installation of Ms IE seems to have hard
    > selected default encoding to "Western European (ISO)", which means
    > iso-8859-1.


    No, there is no such thing as a 'default encoding' in Internet
    Explorer (Netscape/Mozilla do have such a thing).

    > When browsing pages with some Central/Eastern European
    > characters these are converted to iso-8859-1 so displayed wrong.


    Martin and VK have already answered that: it's a _site_ problem.
    The site probably does not specify its encoding, so you need to choose
    it manually in IE's menu. The exception is when the page you visited right
    before was Central European: then IE will show the new page
    OK, because if a new page does not specify its encoding, IE uses
    the *last used encoding* to show it.

    --
    Regards,
    Paul Gorodyansky
    "Russian On-screen Keyboard"
    (based on the JavaScript code by Martin Honnen et al):
    http://ourworld.compuserve.com/homepages/PaulGor/onscreen.htm
    Paul Gorodyansky, Sep 25, 2003
    #4
  5. Hi all,

    thank you for the responses. Unfortunately my colleague is abroad, in
    the Netherlands, and I have no possibility to play with his computer (or
    the other computers in his department :) But what I can tell for sure is
    that I have the appropriate meta-tag in the page: iso-8859-2. He says
    iso-8859-1 is what he sees selected in the "View|Encoding" menu.
    And he sees all the Czech characters converted to their
    English equivalents. For example, he sees &Aacute; as a simple "A" if I
    use the normal character. The only two ways I have found to get the right
    character on his display are to use the &Aacute; entity itself or to
    recode the page to utf-8. But if I use the "normal character"
    (not the corresponding entity) in the HTML source and my colleague
    manually switches the encoding to "Central European (ISO)", which
    means iso-8859-2, voila, he sees the character correctly ... but try
    telling all people abroad to do this ... :)

    I am pretty sure my meta-tag is OK, because I see the characters
    exactly as I should on my Windows machine (and on many others close to
    me), even though the default codepage in Czech editions of Windows is
    cp-1250, which is a different one. Yes, it differs in only a few characters,
    but I tried those too, with no problems.

    I would agree that if my colleague did not have fonts properly
    installed, he should see strange characters. But why are the characters
    implicitly converted on his side? And why on many computers? Is it
    possible that his proxy does it?

    Thanks,

    David




    David Komanek, Sep 26, 2003
    #5
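To make the mislabelling described above concrete, here is a small sketch that decodes the same three bytes under three different labels. It assumes a runtime whose `TextDecoder` knows these legacy labels (browsers do, and so do Node builds with full ICU); the sample bytes and the helper `decodeAs` are chosen only for illustration.

```javascript
// The bytes for "ěšž" as they are stored in an ISO-8859-2 file.
const sampleBytes = new Uint8Array([0xec, 0xb9, 0xbe]);

// Decodes the given bytes using the encoding named by `label`.
function decodeAs(label, data) {
  return new TextDecoder(label).decode(data);
}

const decoded = {
  // Correct label: the original text comes back.
  "iso-8859-2": decodeAs("iso-8859-2", sampleBytes),
  // cp-1250 agrees on 0xEC but not on 0xB9 and 0xBE.
  "windows-1250": decodeAs("windows-1250", sampleBytes),
  // Western European label: every byte maps to a different character.
  "iso-8859-1": decodeAs("iso-8859-1", sampleBytes),
};

console.log(decoded);
```

The bytes never change; only the label applied to them does. That is why a wrong charset announced by the server can spoil a page whose file and meta-tag are both correct.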
  6. David,

    David Komanek <> wrote in message news:<3f746a17$0$62077$>...
    > Hi all,
    >
    > thank you for the responses. Unfortunately my colleague is abroad, in
    > Netherlands and I have no possibility to play with his computer (and all
    > computers in his department, too :) But What I can tell for sure is
    > that I have the appropriate meta-tag in the page: iso-8859-2. He says he
    > has iso-8859-1 is his setting what he sees in the "view|encoding" menu
    > as selected.


    If you let us know the URL, it would be easier for us to
    help you.
    Anyway, the above often happens with Russian too, for the following reason:
    - the author created a good page with a correct META...charset=
    - he placed the .html file on the Web server of his Internet provider
    - the provider's Web server is configured in such a way that
      it places charset=iso-8859-1 ("Western European") into the
      HTTP header that is sent to the reader along with the page itself
      ( http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html )
    - the HTTP header, by the standards, has higher priority than
      META...charset=, so the browser gets it as an iso-8859-1 page!

    So your friend needs to ask the Web server people whether they do the above.
    For example, my Internet provider, CompuServe, does NOT fill out the
    charset field of the HTTP header, so in my files META...charset=
    works OK.

    There is a test page that shows the HTTP header, so:
    - create a Web page *without* META...charset= in it
    - place it on the Web server
    - go to this page, get the screen with the HTTP header, and see
      what the value of the charset parameter (in the "Content-Type" field) is:
      http://www.delorie.com/web/headers.html

    If you do the above for _my_ page, which has no META...charset=,
    http://ourworld.compuserve.com/homepages/PaulGor/test1251.htm
    you will see that CompuServe leaves the charset field empty...


    --
    Regards,
    Paul Gorodyansky
    "Cyrillic (Russian): instructions for Windows and Internet":
    http://ourworld.compuserve.com/homepages/PaulGor/
    Paul Gorodyansky, Sep 26, 2003
    #6
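The precedence Paul describes (a charset in the HTTP header wins over META...charset=) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the function names are invented here, the meta tag is found with a regular expression rather than a real HTML parser, and actual browsers apply further rules (byte order marks, user overrides) that are left out.

```javascript
// Extracts the charset parameter from a Content-Type value such as
// "text/html; charset=iso-8859-2". Returns null when there is none.
function charsetFromContentType(value) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(value || "");
  return match ? match[1].toLowerCase() : null;
}

// Finds the charset declared by a <meta http-equiv="Content-Type"> tag.
// A regular expression is enough for a sketch, not for production use.
function charsetFromMeta(html) {
  const meta = /<meta[^>]+http-equiv\s*=\s*["']?content-type["']?[^>]*>/i.exec(html || "");
  if (!meta) return null;
  const content = /content\s*=\s*["']([^"']*)["']/i.exec(meta[0]);
  return content ? charsetFromContentType(content[1]) : null;
}

// Applies the priority from the post above: HTTP header first, then META,
// and otherwise the browser is left to guess.
function effectiveCharset(httpContentType, html) {
  const fromHeader = charsetFromContentType(httpContentType);
  if (fromHeader) return { charset: fromHeader, source: "http-header" };
  const fromMeta = charsetFromMeta(html);
  if (fromMeta) return { charset: fromMeta, source: "meta" };
  return { charset: null, source: "browser-guess" };
}

const page =
  '<html><head><meta http-equiv="Content-Type" ' +
  'content="text/html; charset=iso-8859-2"></head><body></body></html>';

// Server announces iso-8859-1: the correct meta tag is overridden.
console.log(effectiveCharset("text/html; charset=iso-8859-1", page));
// Server sends no charset: the meta tag decides.
console.log(effectiveCharset("text/html", page));
```

The first call shows the situation suspected in this thread: the page itself is fine, but the server's header says otherwise and takes priority.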
  7. David Komanek wrote:
    >
    > Hi all,
    >
    > thank you for the responses. Unfortunately my colleague is abroad, in
    > Netherlands and I have no possibility to play with his computer (and all
    > computers in his department, too :) But What I can tell for sure is
    > that I have the appropriate meta-tag in the page: iso-8859-2. He says he
    > has iso-8859-1 is his setting what he sees in the "view|encoding" menu
    > as selected. And all the Czech characters he sees converted to the
    > english equivalents.


    I have an ISO-8859-2 test page (because I work as a software I18n engineer),
    so you can ask your friend to check how it is shown when served by *my*
    provider, who does not fill out the charset field of the HTTP header:
    http://ourworld.compuserve.com/homepages/paulgor/8859-2.htm

    --
    Regards,
    Paul Gorodyansky
    "Cyrillic (Russian): instructions for Windows and Internet":
    http://ourworld.compuserve.com/homepages/PaulGor/
    Paul Gorodyansky, Sep 27, 2003
    #7
  8. David Komanek

    Stephen Guest

    Re: changing or at least detecting character encoding via javascript?

    Paul Gorodyansky wrote:
    > David,
    >
    > David Komanek <> wrote in message news:<3f746a17$0$62077$>...
    >
    >>Hi all,
    >>
    >>thank you for the responses. Unfortunately my colleague is abroad, in
    >>Netherlands and I have no possibility to play with his computer (and all
    >>computers in his department, too :) But What I can tell for sure is
    >>that I have the appropriate meta-tag in the page: iso-8859-2. He says he
    >>has iso-8859-1 is his setting what he sees in the "view|encoding" menu
    >>as selected.

    >
    >
    > If you would let us know the URL it would be easier for us to
    > help you.
    > Any way, the above happens often with Russian too for the following reason:
    > - author created good page with correct META...charset=
    > - he placed .html to the Web Server of his Internet Provider
    > - The Web Server of the Provider is configured in such a way that
    > it places Charset=iso-8859-1 ("Western European") into
    > HTTP Header that is sent along with the page itself to a reader
    > ( http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html )
    >
    > - HTTP Header, by the standards, has higher priority than META...charset=
    > so browser gets it as a iso-8859-1 page!
    >
    > So your friend needs to ask Web Server people if they do the above.
    > For example, my Internet Provider, CompuServe, does NOT fill out
    > Charset field of HTTP Header, so in my files META...charset=
    > works OK.
    >


    Is it possible that you might also get 8859-1 because the client sends
    this in the Accept-charset request header? Without providing for
    alternatives, and regardless of server configuration?

    > There is a test page that shows HTTP Header, so:
    > - create a Web page *without* META...charset= in it
    > - place it to the Web Server
    > - go to this page, get the screen with HTTP Header and see
    > what is the value of "Charset" field:
    > http://www.delorie.com/web/headers.html
    >


    OT: The above URL is an example of an application that is broken by
    Verisign's implementing "sitefinder".

    Regards
    Stephen
    Stephen, Sep 27, 2003
    #8
  9. Re: changing or at least detecting character encoding via javascript?

    Hi,

    Stephen wrote:
    >
    > Paul Gorodyansky wrote:
    > > David,
    > >
    > > David Komanek <> wrote in message news:<3f746a17$0$62077$>...
    > >
    > >>Hi all,
    > >>
    > >>thank you for the responses. Unfortunately my colleague is abroad, in
    > >>Netherlands and I have no possibility to play with his computer (and all
    > >>computers in his department, too :) But What I can tell for sure is
    > >>that I have the appropriate meta-tag in the page: iso-8859-2. He says he
    > >>has iso-8859-1 is his setting what he sees in the "view|encoding" menu
    > >>as selected.

    > >
    > >
    > > If you would let us know the URL it would be easier for us to
    > > help you.
    > > Any way, the above happens often with Russian too for the following reason:
    > > - author created good page with correct META...charset=
    > > - he placed .html to the Web Server of his Internet Provider
    > > - The Web Server of the Provider is configured in such a way that
    > > it places Charset=iso-8859-1 ("Western European") into
    > > HTTP Header that is sent along with the page itself to a reader
    > > ( http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html )
    > >
    > > - HTTP Header, by the standards, has higher priority than META...charset=
    > > so browser gets it as a iso-8859-1 page!
    > >
    > > So your friend needs to ask Web Server people if they do the above.
    > > For example, my Internet Provider, CompuServe, does NOT fill out
    > > Charset field of HTTP Header, so in my files META...charset=
    > > works OK.
    > >

    >
    > Is it possible that you might also get 8859-1 because the client sends
    > this in the Accept-charset request header? Without providing for
    > alternatives, and regardless of server configuration?


    No, not really. First, and it's easy to verify, many browsers (MS
    Internet Explorer is one of them) do *not* fill out the Accept-Charset
    field. You can check this, for example, using the "CGI Test Script" link
    here: http://koi8.pp.ru/frame.html?htmlreq.html

    Second, Accept-Charset serves a different purpose: when a server has
    *several* variants of the same page, say one containing the Russian
    text in KOI8-R encoding and another in Windows-1251 encoding,
    a browser tells the server via "Accept-Charset: koi8-r" what it
    can take. The server cannot *make* a document KOI8-R if it does not
    have such a variant. The same holds in our case: if the server holds
    an ISO-8859-2 document and a browser (e.g. Mozilla) requests ISO-8859-1,
    it does not mean at all that the server will send the existing -2
    document as -1.

    --
    Regards,
    Paul Gorodyansky
    "Cyrillic (Russian): instructions for Windows and Internet":
    http://ourworld.compuserve.com/homepages/PaulGor/
    Paul Gorodyansky, Sep 28, 2003
    #9
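Paul's description of Accept-Charset as a way to choose among variants that already exist can be sketched like this. It is a simplified model, not how any particular server implements negotiation: the function names and the fallback policy (serve the first stored variant when nothing matches) are assumptions made for the example.

```javascript
// Parses "koi8-r, windows-1251;q=0.5" into [{ charset, q }], highest q first.
function parseAcceptCharset(header) {
  return (header || "")
    .split(",")
    .map((part) => part.trim())
    .filter(Boolean)
    .map((part) => {
      const [name, ...params] = part.split(";").map((s) => s.trim());
      const qParam = params.find((p) => /^q\s*=/i.test(p));
      const q = qParam ? parseFloat(qParam.split("=")[1]) : 1;
      return { charset: name.toLowerCase(), q: Number.isNaN(q) ? 1 : q };
    })
    .sort((a, b) => b.q - a.q);
}

// `available` lists the encodings in which the server really stores the page.
// The result is always one of those: the server picks a variant, it does not
// re-encode the document to satisfy the request.
function chooseVariant(acceptCharset, available) {
  const stored = available.map((c) => c.toLowerCase());
  for (const { charset, q } of parseAcceptCharset(acceptCharset)) {
    if (q > 0 && (charset === "*" || stored.includes(charset))) {
      return charset === "*" ? stored[0] : charset;
    }
  }
  // Nothing acceptable is stored: serve what exists, labelled as what it is.
  return stored[0];
}

// Two Russian variants exist and the browser prefers KOI8-R.
console.log(chooseVariant("koi8-r", ["windows-1251", "koi8-r"]));
// Only an ISO-8859-2 document exists; asking for ISO-8859-1 does not change it.
console.log(chooseVariant("iso-8859-1", ["iso-8859-2"]));
```

The second call mirrors the case in this thread: the request header cannot turn an existing -2 document into a -1 document.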
  10. David Komanek

    Stephen Guest

    Re: changing or at least detecting character encoding via javascript?

    Paul Gorodyansky wrote:
    > Hi,
    >
    > Stephen wrote:
    >
    >>Paul Gorodyansky wrote:
    >>
    >>>David,
    >>>
    >>> David Komanek <> wrote in message news:<3f746a17$0$62077$>...
    >>>[...snip...]

    >>
    >>Is it possible that you might also get 8859-1 because the client sends
    >>this in the Accept-charset request header? Without providing for
    >>alternatives, and regardless of server configuration?

    >
    > No, not really. First - and it's easy to verify - many browsers - and
    > MS Internet Explorer is one of them - do *not* fill out Accept-Charset
    > field - you can check it for example using "CGI Test Script" link
    > here: http://koi8.pp.ru/frame.html?htmlreq.html
    >
    > Second, Accept-Charset is for different reason - when server has
    > *several* variants of the same page, say one contains same Russian
    > text in KOI8-R encoding, another - in Windows-1251 encoding, then
    > a browser via Accept-Charset=koi8-r tells the server what it
    > can take. Server can not *make* a document to be KOI8-R if it does not
    > have such. Same in our case - if server contains ISO-8859-2
    > document and browser (f.e. Mozilla) requests ISO-8859-1, then
    > it does not mean at all that server will send existing -2 document
    > as -1.
    >

    Of course. Thanks for the commentary. I did notice that Gecko-based
    browsers (Netscape 7.0, Moz 1.4, Firebird) do send Accept-charset. And
    contrary to what I was remembering, you are right about IE: it does not.
    Thanks again,
    Stephen
    Stephen, Sep 28, 2003
    #10
  11. Thank you all for your help.

    In the meantime I got a workaround for my problem by recoding the
    pages to utf-8, as was suggested here. Because the recoding is done by
    a module in Apache on the server, where the implicit codepage served
    to clients is iso-8859-2, I just prefixed the page paths with /utf8, which
    tells the server to use the explicit encoding "utf-8". So, for
    example, one of the recoded pages where the problem occurs is

    http://www.natur.cuni.cz/utf8/fem_modflow/index.php?id=4

    and the original one is now at

    http://www.natur.cuni.cz/fem_modflow/index_test.php?id=4

    Please, could somebody from a non-Central/Eastern European region tell
    me what (s)he sees on the page between the lines

    "Organizing Committee"

    and

    "Institute of Hydrogeology, Engineering Geology and Applied
    Geophysics" ?

    There should be the name "Zbynek Hrkal", where the "e" has a special
    decoration (something like a tilde, but not exactly; I have no idea what
    this letter is called in English, sorry (does anybody know?)). I
    see it right in both encodings, with MS IE 6, Netscape 7.1, Mozilla
    ..... but my colleague in the Netherlands sees it correctly only in utf8,
    not in the original iso-8859-2. In the latter case he sees a "regular e" instead.

    I do not know how to get the HTTP header from the server. When I
    connect to port 80 of the webserver via unix telnet and type

    GET /fem_modflow/index_test.php?id=4

    I get just the source of the webpage, with no HTTP header lines:

    # telnet www.natur.cuni.cz 80
    Trying 195.113.56.1...
    Connected to tao.natur.cuni.cz.
    Escape character is '^]'.
    GET /fem_modflow/index_test.php?id=4
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    .....
    .....
    .....

    Thanks again for your comments.

    With best regards,

    David Komanek
    David Komanek, Sep 30, 2003
    #11
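The telnet session above probably shows no header lines because the request has no HTTP version token: a bare "GET /path" is an HTTP/0.9-style simple request, and the response to that form consists of the body only. Adding "HTTP/1.0", a Host line, and an empty line should make the server send its headers. The sketch below builds such a request and picks the Content-Type value out of a raw response; the function names are made up for this example, and `fetchContentType` is shown for completeness without having been run against the server in question.

```javascript
// Builds a raw HEAD request. The version token ("HTTP/1.0") is what the telnet
// session lacked; the empty line at the end terminates the request headers.
function buildHeadRequest(host, path) {
  return `HEAD ${path} HTTP/1.0\r\nHost: ${host}\r\n\r\n`;
}

// Returns the value of the Content-Type field from a raw response, or null.
function contentTypeOf(rawResponse) {
  const head = rawResponse.split("\r\n\r\n")[0];
  // Skip the status line, then look for the field by name.
  for (const line of head.split("\r\n").slice(1)) {
    const idx = line.indexOf(":");
    if (idx > 0 && line.slice(0, idx).trim().toLowerCase() === "content-type") {
      return line.slice(idx + 1).trim();
    }
  }
  return null;
}

// Sends the request over a plain TCP socket, much as telnet would (Node only).
function fetchContentType(host, path, callback) {
  const net = require("net");
  let raw = "";
  const socket = net.connect(80, host, () => socket.write(buildHeadRequest(host, path)));
  socket.setEncoding("latin1");
  socket.on("data", (chunk) => { raw += chunk; });
  socket.on("end", () => callback(null, contentTypeOf(raw)));
  socket.on("error", (err) => callback(err));
}

// What to type into telnet instead of the bare GET line:
console.log(buildHeadRequest("www.natur.cuni.cz", "/fem_modflow/index_test.php?id=4"));
```

Typed by hand, the same request is the HEAD line, the Host line, and then one extra press of Enter.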
  12. > - HTTP Header, by the standards, has higher priority than META...charset=
    > so browser gets it as a iso-8859-1 page!
    >
    > So your friend needs to ask Web Server people if they do the above.
    > For example, my Internet Provider, CompuServe, does NOT fill out
    > Charset field of HTTP Header, so in my files META...charset=
    > works OK.


    Well, this seems to be the problem. Thank you. The header displayed by
    http://www.delorie.com/web/headers.html says the charset is
    "us-ascii", regardless of the AddDefaultCharset setting in Apache's
    httpd.conf, the php.ini setting, and a "header()" call as the first line
    of the PHP source itself. Very strange. And even more strange is that on
    some computers the meta-tag based information about the encoding takes
    precedence and on some it does not ....

    David Komanek
    David Komanek, Sep 30, 2003
    #12