UTF-16 + Firefox + JavaScript = null

Discussion in 'Javascript' started by Andrew Poulos, Apr 20, 2009.

  1. I've written some javascript for a client that uses some in-house tool
    to create HTML. They are upgrading their tool so that it handles unicode
    (so they can, for example, insert Japanese ideograms directly into the
    HTML).

    The pages display in IE 7 but Firefox, Chrome etc complain about an
    illegal character and then

    Error: uncaught exception: [Exception... "Component returned failure
    code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIDOMLocation.replace]"
    nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)" location: "JS frame ::
    file:///C:/Users/Andrew/Desktop/Project/Standalone/content/index.htm ::
    anonymous :: line 44" data: no]

    and no javascript runs.

    When I build the pages from scratch they display.

    When I examine their pages they are encoded in UTF-16 with Unix style
    line ending.

    I manually changed the encoding to UTF-8 but it made no difference.

    If I comment out the javascript the page displays fine.

    I'm kind of lost with unicode. Is there something I can tell their
    developers to stop them killing my javascript?

    Andrew Poulos
    Andrew Poulos, Apr 20, 2009
    #1

  2. Andrew Poulos wrote:

    > I've written some javascript for a client that uses some in-house tool
    > to create HTML. They are upgrading their tool so that it handles unicode
    > (so they can, for example, insert Japanese ideograms directly into the
    > HTML).
    >
    > The pages display in IE 7 but Firefox, Chrome etc complain about an
    > illegal character and then
    >
    > Error: uncaught exception: [Exception... "Component returned failure
    > code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIDOMLocation.replace]"
    > nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)"  location: "JS frame ::
    > file:///C:/Users/Andrew/Desktop/Project/Standalone/content/index.htm ::
    > anonymous :: line 44"  data: no]
    >
    > and no javascript runs.
    >
    > When I build the pages from scratch they display.
    >
    > When I examine their pages they are encoded in UTF-16 with Unix style
    > line ending.
    >
    > I manually changed the encoding to UTF-8 but it made no difference.
    >
    > If I comment out the javascript the page displays fine.
    >
    > I'm kind of lost with unicode. Is there something I can tell their
    > developers to stop them killing my javascript?


    Welcome to Unicode! :)
    It's difficult to say anything concrete without being able to look at
    the code. My first idea is that some byte ranges are not being
    recognized because browsers differ in how they handle UTF-16
    (especially for the higher code points). Does it fail at the first
    non-ASCII character that it meets?

    Here are some basic thoughts:
    - Definitely go for UTF-8, not UTF-16.
    - Make sure that the HTTP header is correct (Content-Type: text/html;
    charset=utf-8).
    - The server-side script must send data that has been correctly
    converted to that Unicode encoding (sounds easy, I know...)
    - In the HTML source, add a meta tag in the head:
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">.
    - Unix-style line endings should *normally* be okay, though \r\n is
    the safer choice.
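
    To check which encoding a generated file was actually saved in, one
    option is to look at its first bytes for a byte order mark (BOM). This
    is only a rough sketch: sniffBom is a made-up helper name, it assumes
    the tool writes a BOM at all, and a file without one simply yields
    null (which says nothing about its real encoding).

```javascript
// Hypothetical helper: guess the encoding of a file from its BOM.
// `bytes` is the start of the file as a Uint8Array (or array of numbers).
function sniffBom(bytes) {
  // UTF-8 BOM: EF BB BF
  if (bytes.length >= 3 &&
      bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF) {
    return "utf-8";
  }
  // UTF-16 little-endian BOM: FF FE
  if (bytes.length >= 2 && bytes[0] === 0xFF && bytes[1] === 0xFE) {
    return "utf-16le";
  }
  // UTF-16 big-endian BOM: FE FF
  if (bytes.length >= 2 && bytes[0] === 0xFE && bytes[1] === 0xFF) {
    return "utf-16be";
  }
  // No BOM found: the encoding cannot be determined this way.
  return null;
}

// Example: "<" saved as UTF-16LE with a BOM starts with FF FE 3C 00.
const firstBytes = new Uint8Array([0xFF, 0xFE, 0x3C, 0x00]);
console.log(sniffBom(firstBytes)); // "utf-16le"
```

    If this reports utf-16le or utf-16be for their pages, that would
    confirm the tool is still writing UTF-16.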

    I understand the Unicode characters reside in the javascript only. You
    could ask their IT guys to send \uXXXX escapes instead.
    NS_ERROR_FILE_NOT_FOUND seems to indicate a file-not-found (404-style)
    error -> maybe you are using Unicode characters in file names?
    (Definitely to be avoided.)
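
    As an illustration of the \uXXXX idea, here is a minimal sketch that
    rewrites every non-ASCII character of a string as an escape, so the
    result is plain ASCII and survives any of the encodings discussed
    here. escapeNonAscii is a hypothetical name, and the output is only
    meaningful inside javascript string literals, not in HTML text.

```javascript
// Hypothetical helper: replace each non-ASCII UTF-16 code unit with \uXXXX.
// Characters outside the BMP come out as two escapes (a surrogate pair),
// which is still a valid javascript string literal.
function escapeNonAscii(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      out += text[i]; // plain ASCII passes through unchanged
    } else {
      out += "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out;
}

// "\u65E5\u672C" is the two-character string for "Japan" in Japanese.
console.log(escapeNonAscii("title: \u65E5\u672C")); // title: \u65E5\u672C
```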

    Hope this helps,

    --
    Bart
    Bart Van der Donck, Apr 20, 2009
    #2

  3. Bart Van der Donck wrote:
    > Andrew Poulos wrote:
    >
    >> I've written some javascript for a client that uses some in-house tool
    >> to create HTML. They are upgrading their tool so that it handles unicode
    >> (so they can, for example, insert Japanese ideograms directly into the
    >> HTML).
    >>
    >> The pages display in IE 7 but Firefox, Chrome etc complain about an
    >> illegal character and then
    >>
    >> Error: uncaught exception: [Exception... "Component returned failure
    >> code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIDOMLocation.replace]"
    >> nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)" location: "JS frame ::
    >> file:///C:/Users/Andrew/Desktop/Project/Standalone/content/index.htm ::
    >> anonymous :: line 44" data: no]
    >>
    >> and no javascript runs.
    >>
    >> When I build the pages from scratch they display.
    >>
    >> When I examine their pages they are encoded in UTF-16 with Unix style
    >> line ending.
    >>
    >> I manually changed the encoding to UTF-8 but it made no difference.
    >>
    >> If I comment out the javascript the page displays fine.
    >>
    >> I'm kind of lost with unicode. Is there something I can tell their
    >> developers to stop them killing my javascript?

    >
    > Welcome to Unicode! :)
    > It's difficult to say anything concrete without being able to look at
    > the code. My first idea is that some byte ranges cannot be recognized
    > because browsers can use various UTF-16 character tables (more vs.
    > less extended, esp. in the higher code points). Does it fail at the
    > first Unicoded char that it meets ?
    >
    > Here are some basic thoughts:
    > - Definitely go for UTF-8, not UTF-16.
    > - Make sure that the HTTP-header is correct (Content-Type: text/html;
    > charset=utf-8).
    > - The server-script must send data that has been correctly converted
    > into an Unicode set (sounds easy, I know...)
    > - In HTML source code, add a meta-tag in the header <meta http-
    > equiv="Content-Type" content="text/html; charset=UTF-8">.
    > - A Unix style line-ending should *normally* be okay; better is to use
    > \r\n anyhow.
    >
    > I understand the Unicode characters reside in javascript only. You
    > could ask their IT guys to send \uXXXX. NS_ERROR_FILE_NOT_FOUND seems
    > to indicate a 404 error -> maybe you are using Unicode characters in
    > file names ? (=definitely to avoid)


    I think I found the issue. The files they produce are UTF-16. My
    javascript files are UTF-8. When Firefox loads my files into a UTF-16
    web page it seems to decode them as UTF-16 as well, rather than as
    UTF-8, and so all the characters go awry.
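
    A minimal sketch of that kind of mismatch, assuming an environment
    that provides TextEncoder and TextDecoder (current browsers and
    Node.js do). The function name and the sample source line are made
    up; the point is only that UTF-8 bytes read as UTF-16 pair up into
    unrelated characters.

```javascript
// Hypothetical demo: take script text, store it as UTF-8 bytes, then
// decode those same bytes as if they were UTF-16 little-endian.
function misdecodeUtf8AsUtf16(sourceText) {
  const utf8Bytes = new TextEncoder().encode(sourceText); // always UTF-8
  return new TextDecoder("utf-16le").decode(utf8Bytes);
}

const original = "var x = 1;";            // 10 ASCII characters, 10 bytes
const garbled = misdecodeUtf8AsUtf16(original);

// Every two bytes collapse into one 16-bit code unit, so the text is
// half as long and no longer valid javascript.
console.log(garbled.length);                        // 5
console.log(garbled.charCodeAt(0).toString(16));    // "6176" ('v' and 'a' fused)
```

    That would fit the "illegal character" complaint: the parser never
    sees the original source text at all.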

    I'll make sure they use only UTF-8. Thank you for your help.

    Andrew Poulos
    Andrew Poulos, Apr 20, 2009
    #3
  4. Andrew Poulos wrote:
    ....
    > I think I found the issue. The files they produce are UTF-16. My
    > javascript files are UTF-8. When Firefox goes to load my files into a
    > UTF-16 web page it converts them from UTF-8 to 16 and so it appears that
    > all the characters go awry.
    > I'll make sure they use only UTF-8. Thank you for your help.


    You're welcome. <script src="X" type="text/javascript"
    charset="utf-8"> might help too in your scenario.

    --
    Bart
    Bart Van der Donck, Apr 20, 2009
    #4
