innerHTML problem in IE6

Discussion in 'HTML' started by Kiran Makam, Aug 26, 2008.

  1. Kiran Makam

    Kiran Makam Guest

    I am setting the content of a div dynamically using innerHTML
    property. If the content contains an ampersand, text after the
    ampersand is disappearing in IE6. It works properly in Firefox.

    This is my code:
    ----------------
    <body>

    <div id='div1'></div>
    <script>
    var div = document.getElementById('div1');
    div.innerHTML = "A&B";
    </script>

    </body>
    ---------------

    IE6 renders the content of div1 as 'A'
    Firefox renders the content properly as 'A&B'

    If there is a space after ampersand, IE6 renders it properly. So I
    think that IE is assuming anything after ampersand as an HTML entity
    ( like &nbsp; ).

    Is this a bug in IE6? Is there any workaround for this?

    Thanks
    Kiran Makam
    Kiran Makam, Aug 26, 2008
    #1

  2. Kiran Makam wrote:
    > I am setting the content of a div dynamically using innerHTML
    > property. If the content contains an ampersand, text after the
    > ampersand is disappearing in IE6. It works properly in Firefox.
    >
    > This is my code:
    > ----------------
    > <body>
    >
    > <div id='div1'></div>
    > <script>
    > var div = document.getElementById('div1');
    > div.innerHTML = "A&B";
    > </script>
    >


    Try:

    div.innerHTML = "A&amp;B";


    --
    Take care,

    Jonathan
    -------------------
    LITTLE WORKS STUDIO
    http://www.LittleWorksStudio.com
    Jonathan N. Little, Aug 26, 2008
    #2

  3. Lars Eighner

    Lars Eighner Guest

    In our last episode,
    <>,
    the lovely and talented Kiran Makam
    broadcast on alt.html:

    > I am setting the content of a div dynamically using innerHTML
    > property. If the content contains an ampersand, text after the
    > ampersand is disappearing in IE6. It works properly in Firefox.


    > This is my code:
    > ----------------
    ><body>


    ><div id='div1'></div>
    ><script>
    > var div = document.getElementById('div1');
    > div.innerHTML = "A&B";
    ></script>


    ></body>
    > ---------------


    > IE6 renders the content of div1 as 'A'
    > Firefox renders the content properly as 'A&B'


    > If there is a space after ampersand, IE6 renders it properly. So I
    > think that IE is assuming anything after ampersand as an HTML entity
    > ( like &nbsp; ).


    > Is this a bug in IE6?


    No, it is a bug in your markup. & should always be &amp; The browser is
    entitled to suppose any string starting with & is an attempt at a character
    entity. It may be that FF has a better error correction ability, but
    you can't blame a browser for how it handles errors.

    > Is there any workaround for this?


    Yes. Enter & as &amp;

    --
    Lars Eighner <http://larseighner.com/>
    War on Terrorism: History a Mystery
    "He's busy making history, but doesn't look back at his own, or the
    world's.... Bush would rather look forward than backward." --_Newsweek_
    Lars Eighner, Aug 26, 2008
    #3
  4. Lars Eighner wrote:
    > In our last episode,
    > <>,
    > the lovely and talented Kiran Makam
    > broadcast on alt.html:
    >
    >> I am setting the content of a div dynamically using innerHTML
    >> property. If the content contains an ampersand, text after the
    >> ampersand is disappearing in IE6. It works properly in Firefox.

    >
    >> This is my code:
    >> ----------------
    >> <body>

    >
    >> <div id='div1'></div>
    >> <script>
    >> var div = document.getElementById('div1');
    >> div.innerHTML = "A&B";
    >> </script>

    >
    >> </body>
    >> ---------------

    >
    >> IE6 renders the content of div1 as 'A'
    >> Firefox renders the content properly as 'A&B'

    >
    >> If there is a space after ampersand, IE6 renders it properly. So I
    >> think that IE is assuming anything after ampersand as an HTML entity
    >> ( like &nbsp; ).

    >
    >> Is this a bug in IE6?

    >
    > No, it is a bug in your markup. & should always be &amp; The browser is
    > entitled to suppose any string starting with & is an attempt at a character
    > entity. It may be that FF has a better error correction ability, but
    > you can't blame a browser for how it handles errors.
    >
    >> Is there any workaround for this?

    >
    > Yes. Enter & as &amp;


    To clarify for the original poster: this isn't a workaround, it's the
    proper way to escape the ampersand in HTML when it's being used as a
    literal instead of in its special role as first character in an entity code.
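
    As an illustration of that escaping, here is a minimal sketch of a
    helper that escapes a plain-text string before it is assigned to
    innerHTML. The name escapeHtml is hypothetical and does not come from
    this thread; it only covers the four characters shown.

    ```javascript
    // Hypothetical helper (not from the thread): escape the characters that
    // are special in HTML so the string can be assigned to innerHTML as text.
    function escapeHtml(text) {
      return String(text)
        .replace(/&/g, "&amp;")   // must run first, or the entities below get re-escaped
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
    }
    ```

    With this, the original assignment would read
    div.innerHTML = escapeHtml("A&B"); so the string the HTML parser
    receives is A&amp;B.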
    Harlan Messinger, Aug 26, 2008
    #4
  5. Lars Eighner wrote:

    >> This is my code:


    As so often, a URL would have been needed, even for an apparently trivial
    piece of code. Experienced authors know this, and others should just believe
    it. :)

    >> <script>
    >> var div = document.getElementById('div1');
    >> div.innerHTML = "A&B";


    The markup is invalid due to lack of required type="..." attribute, but this
    is really just a formality. More importantly, we don't know whether this is
    supposed to be HTML or XHTML and how it has been served.

    >> IE6 renders the content of div1 as 'A'
    >> Firefox renders the content properly as 'A&B'

    >
    >> If there is a space after ampersand, IE6 renders it properly. So I
    >> think that IE is assuming anything after ampersand as an HTML entity
    >> ( like &nbsp; ).

    >
    >> Is this a bug in IE6?

    >
    > No, it is a bug in your markup.


    Whether the markup is correct depends on whether this is HTML or XHTML. In
    HTML, the content model of <script> is CDATA, which means that entity
    references are not recognized, so "&B" means just the character "&" followed
    by the character "B". In XHTML, the content model is #PCDATA, in which
    case...

    > & should always be &amp;


    .... or something equivalent.

    > The browser is entitled to suppose any string starting with & is an
    > attempt at a character entity.


    No, not in HTML when inside <script> (or <style>). Otherwise, it is
    _required_ to treat "&" as potentially starting an entity reference or a
    character reference. Error processing rules are then different for different
    situations and flavors of HTML. In HTML 4.01, "&B" must be parsed as an
    entity reference, but since no such entity has been defined, we're in the
    error processing area, and treating "&" as a data character is conventional
    in browsers in such cases. In XHTML, "&B", when not followed by a semicolon
    (possibly after some name characters) is a well-formedness violation and XML
    processors should simply report an error and refuse to display the document
    at all.

    Note: There are no grounds for assuming &B to be a "character entity" in any
    flavor of HTML. The pseudo-term "character entity" is, at best, shorthand
    for "entity reference that happens to evaluate to a one-character string".
    The entity reference &B does not evaluate to anything; it is undefined.

    Confused? Fine. Just outsource the script, avoiding the mess!

    > It may be that FF has a better error
    > correction ability, but
    > you can't blame a browser for how it handles errors.


    Oh we can, both on practical grounds and, in some cases, on formal grounds.

    >> Is there any workaround for this?

    >
    > Yes. Enter & as &amp;


    The best way to solve the problem is to put the script in an external file
    and reference it via <script type="text/javascript" src="foo.js"></script>

    Yucca
    Jukka K. Korpela, Aug 26, 2008
    #5
  6. Ben C wrote:

    > In markup like:
    >
    > <script>
    > div.innerHTML = "A&B";
    > </script>
    >
    > "A&B" is certainly inside a script element. But is it also inside a
    > <div> element?


    A tricky question, which I tried to avoid. In terms of HTML specifications,
    it is not inside any <div> element, since whatever happens via scripting is
    outside the scope of those specs.

    As http://msdn.microsoft.com/en-us/library/ms533897.aspx says so eloquently,
    "There is no public standard that applies to this [innerHTML] property".
    That vendor-specific page says:
    "When the innerHTML property is set, the given string completely replaces
    the existing content of the object. If the string contains HTML tags, the
    string is parsed and formatted as it is placed into the document."

    I think it is fair to read this so that they promise to parse the content as
    HTML. This in turn means that &B would be detected as undefined entity
    reference. If, on the other hand, A&amp;B were used, then it would be first
    parsed (as <script> element content, assuming HTML 4.01 rules) as such, and
    the second parsing would recognize &amp; as a reference that denotes the &
    character. But they don't say exactly how the parsing works.
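
    If the string is meant as literal text rather than markup, one way to
    sidestep the question of how innerHTML parses it is to build a text
    node with the standard DOM methods instead. A minimal sketch follows;
    the function name setLiteralText and its parameters are hypothetical,
    and the document object is passed in only to keep the example
    self-contained.

    ```javascript
    // Hypothetical helper (not from the thread): replace an element's content
    // with literal text. No HTML parsing happens, so "&" needs no escaping.
    function setLiteralText(doc, element, text) {
      // Remove whatever the element currently contains.
      while (element.firstChild) {
        element.removeChild(element.firstChild);
      }
      // createTextNode treats its argument as character data, not markup.
      element.appendChild(doc.createTextNode(text));
    }
    ```

    In a page this would be called as
    setLiteralText(document, document.getElementById('div1'), "A&B");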

    > We can imagine that the browser recursively enters its HTML parser to
    > evaluate innerHTML,


    Why, oh why, do people speak of recursion when they mean iteration?

    > so its HTML parser will see something like this:
    >
    > <div>
    > A&B
    > </div>
    >
    > where it would be required to treat & as potentially starting an
    > entity reference or a character reference as you say.


    No, I don't think it sees any <div> tag. It is parsing the string "A&B", and
    I agree with the idea that here "&" should be treated as a special
    character, here starting an entity reference. But the widely accepted
    fallback for undefined entity references is to treat them "literally", i.e.
    as if e.g. "&B" were really defined to mean "&B".
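
    That fallback can be sketched as a small decoder that replaces the
    references it knows and leaves anything undefined exactly as written.
    This is only an illustration of the idea, not a description of what
    any browser does internally; the names KNOWN_ENTITIES and
    decodeLenient are hypothetical, and the entity table is deliberately
    tiny.

    ```javascript
    // Hypothetical table: a handful of the entities defined in HTML 4.01.
    var KNOWN_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"', nbsp: "\u00A0" };

    // Decode entity and decimal character references; leave undefined
    // entity references such as "&B" untouched (the "literal" fallback).
    function decodeLenient(text) {
      return text.replace(/&(#[0-9]+|[A-Za-z][A-Za-z0-9]*);?/g, function (match, name) {
        if (name.charAt(0) === "#") {
          // Decimal character reference; this sketch only handles code
          // points that fit in a single UTF-16 unit.
          return String.fromCharCode(parseInt(name.slice(1), 10));
        }
        // hasOwnProperty avoids matching inherited names like "constructor".
        return Object.prototype.hasOwnProperty.call(KNOWN_ENTITIES, name)
          ? KNOWN_ENTITIES[name]
          : match;
      });
    }
    ```

    Under this sketch both "A&B" and "A&amp;B" decode to the same three
    characters, which matches the behavior reported for Firefox above.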

    Yucca
    Jukka K. Korpela, Aug 27, 2008
    #6
  7. Ben C wrote:

    >>> We can imagine that the browser recursively enters its HTML parser
    >>> to evaluate innerHTML,

    >>
    >> Why, oh why, do people speak of recursion when they mean iteration?

    >
    > I don't know why people would do that. I only speak of recursion when
    > I mean recursion.


    Which recursion is involved when a browser, having parsed HTML data, starts
    interpreting it, finds some client-side script code, executes it, then
    starts parsing the data that results from the execution? (In this case, as
    so often, the generation of that data is trivial, since it is a string
    constant, but that's irrelevant here.) Answer: There is no recursion
    involved. The parsing was finished long before the script execution started,
    at the logical level at least, and then new parsing was initiated. It's
    really not even iteration, except in a trivial sense.

    Parsing HTML could itself be recursive (i.e., a parser routine might call
    itself), and that would be natural in a sense since HTML is defined
    recursively. But tag soup slurpers don't do that, and generally, recursive
    parsing is less efficient than non-recursive parsing.

    Yucca
    Jukka K. Korpela, Aug 27, 2008
    #7
  8. Ben C <> writes:

    > How would you parse HTML more efficiently than by using recursive
    > parsing?


    I don't know about other parsers, but Expat uses callback functions
    that it calls when it finds an opening tag, closing tag, text node,
    comment, etc. It's event driven, not recursive - the parser function
    never calls itself.
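
    To show the shape of that style (this is not Expat's actual API, just
    a sketch in JavaScript), here is a tiny event-driven tokenizer. The
    name parseEvents and the handler names startTag, endTag and text are
    hypothetical. It ignores attributes, comments and everything else a
    real parser must handle.

    ```javascript
    // Hypothetical event-driven tokenizer: one flat loop over the input,
    // calling a handler for each start tag, end tag, or run of text.
    // The function never calls itself.
    function parseEvents(input, handlers) {
      var tokenPattern =
        /<\/([A-Za-z][A-Za-z0-9]*)\s*>|<([A-Za-z][A-Za-z0-9]*)[^>]*>|[^<]+/g;
      var m;
      while ((m = tokenPattern.exec(input)) !== null) {
        if (m[1]) {
          handlers.endTag(m[1]);      // matched "</name>"
        } else if (m[2]) {
          handlers.startTag(m[2]);    // matched "<name ...>"
        } else {
          handlers.text(m[0]);        // matched a run of non-"<" characters
        }
      }
    }
    ```

    A caller passes an object with the three handler functions and keeps
    whatever state it needs (for example a stack of open elements) itself.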

    sherm--

    --
    My blog: http://shermspace.blogspot.com
    Cocoa programming in Perl: http://camelbones.sourceforge.net
    Sherm Pendley, Aug 28, 2008
    #8
  9. Neredbojias

    Neredbojias Guest

    On 27 Aug 2008, Ben C <> wrote:

    > I said we can "imagine" that the browser recursively enters its HTML
    > parser. I'm not talking about particular implementations, although I see
    > no reason why they wouldn't use recursion here.


    From the Neredbojias dictionary:

    Recursion - The proximate deployment of more than one swear word, any of
    which is not phraseologically related to the others.

    Iteration - Improper or excessive use of a pronoun.

    Hope that clears this up.

    --
    Neredbojias
    http://www.neredbojias.net/
    Great Sights and Sounds
    http://adult.neredbojias.net/ (adult)
    Neredbojias, Aug 28, 2008
    #9
  10. Ben C wrote:

    > I said we can "imagine" that the browser recursively enters its HTML
    > parser.


    There's no reason to imagine anything more complex than I described.

    > I'm not sure what you mean by "interpreting" HTML data.


    Processing it by some semantic rules, such as the rule that <script> element
    content is script code that needs to be passed to a script interpreter. This
    is something that can only be performed after the element has been parsed.

    > The basic
    > operation here is to build a DOM tree out of HTML.


    That's irrelevant. The point is that the HTML markup _has been parsed_, and
    then you start doing something else. If you will then start parsing HTML
    again, it ain't no recursion. It's just another instance of parsing.

    >> Parsing HTML could itself be recursive (i.e., a parser routine might
    >> call itself), and that would be natural in a sense since HTML is
    >> defined recursively. But tag soup slurpers don't do that

    >
    > Who cares about tag soup slurpers or knows what the hell they do?


    The innerHTML construct is all about tag soup slurpers, existing browsers, not
    ideal browsers as defined in specifications.

    > How would you parse HTML more efficiently than by using recursive
    > parsing?


    Browsers have done that for years. You just look at tags and turn them to
    actions. You see <strong>, you start bolding. You see </strong>, you turn
    bolding off. There are browser features that resemble structural processing,
    and newer browsers might even be good at it, but in fact structural
    processing can be performed by using explicit stacks, instead of the
    implicit stacking involved in recursion.
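
    A minimal sketch of that explicit-stack approach, under heavy
    simplifying assumptions: no attributes, no comments, no optional or
    empty elements (so <br> would be nested incorrectly). The function
    name buildTree and the node shape are hypothetical.

    ```javascript
    // Hypothetical non-recursive tree builder: the stack of open elements is
    // an ordinary array instead of the call stack of a recursive parser.
    function buildTree(input) {
      var root = { name: "#root", children: [] };
      var stack = [root];   // stack[stack.length - 1] is the innermost open element
      var tokenPattern =
        /<\/([A-Za-z][A-Za-z0-9]*)\s*>|<([A-Za-z][A-Za-z0-9]*)[^>]*>|[^<]+/g;
      var m;
      while ((m = tokenPattern.exec(input)) !== null) {
        var current = stack[stack.length - 1];
        if (m[2]) {
          // Start tag: attach a new node and make it the innermost element.
          var node = { name: m[2].toLowerCase(), children: [] };
          current.children.push(node);
          stack.push(node);
        } else if (m[1]) {
          // End tag: close the nearest open element with that name;
          // a stray end tag with no matching open element is ignored.
          var name = m[1].toLowerCase();
          for (var i = stack.length - 1; i > 0; i--) {
            if (stack[i].name === name) {
              stack.length = i;
              break;
            }
          }
        } else {
          // Text: attach to the innermost open element.
          current.children.push({ name: "#text", text: m[0] });
        }
      }
      return root;
    }
    ```

    For example, buildTree("<p>Hi <b>there</b></p>") yields a root with a
    single p child, which in turn holds a text node and a b element.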

    I could write a nonrecursive HTML parser for you, but then I would have
    to... charge you for it.

    Yucca
    Jukka K. Korpela, Aug 28, 2008
    #10
  11. Ben C wrote:

    >> Processing it by some semantic rules, such as the rule that <script>
    >> element content is script code that needs to be passed to a script
    >> interpreter. This is something that can only be performed after the
    >> element has been parsed.

    >
    > OK, but the <script> element has to be interpreted before elements
    > after it in the source are.


    Not at all. Actually, it need not be interpreted at all. Browsers may well
    ignore the content of <script> elements, and they often do, but they still
    need to _parse_ them (if not for anything else, in order to recognize the
    end of the element).

    >> That's irrelevant. The point is that the HTML markup _has been
    >> parsed_, and then you start doing something else. If you will then
    >> start parsing HTML again, it ain't no recursion. It's just another
    >> instance of parsing.

    >
    > You're presupposing an unnecessarily complicated implementation.


    No, I'm just describing what happens conceptually. A parser is a parser even
    if integrated into a grotesquely large program.

    > You're saying the program looks something like this:


    No, I'm not saying anything about timing, such as processing some part of an
    HTML document while the rest is still being parsed. Running a parser and a
    script interpreter in parallel does not imply that if the script interpreter
    invokes another instance of the parser, it would be some kind of recursion.

    So you _are_ confusing recursion with iteration, or actually mere new
    invocation - as many people do.

    > Yes I realize Microsoft invented innerHTML, but Opera/Firefox/Safari
    > implement it and they are not tag slurpers.


    They slurp tags more than you'd think. Check whether they are still
    sensitive to the presence or absence of the "optional" </p> tag as regards
    styling. Last time I checked, I was disappointed.

    > Yes everyone knows that. But it's normal when describing an algorithm
    > to say it is "recursive" even if when you come to implement it you
    > avoid actually writing a function that calls itself.


    There's no recursive algorithm involved in the handling of innerHTML.

    Yucca
    Jukka K. Korpela, Aug 28, 2008
    #11
  12. Ben C <> writes:

    > On 2008-08-27, Sherm Pendley <> wrote:
    >> Ben C <> writes:
    >>
    >>> How would you parse HTML more efficiently than by using recursive
    >>> parsing?

    >>
    >> I don't know about other parsers, but Expat uses callback functions
    >> that it calls when it finds an opening tag, closing tag, text node,
    >> comment, etc. It's event driven, not recursive - the parser function
    >> never calls itself.

    >
    > Indeed, and neither does the tree builder you implement in the
    > callbacks-- it has to either maintain an explicit stack or use parent
    > pointers on the tree nodes it is generating.
    >
    > But none of that is any more efficient than doing it recursively, it's
    > just one way of trying to separate things.


    It's not faster, but I'd say it's more memory-efficient. Instead of a
    deep call stack + your data tree, you have just the tree. And it's
    easier for a lot of programmers to understand - for some reason, a lot
    of people have trouble with recursion.

    sherm--

    --
    My blog: http://shermspace.blogspot.com
    Cocoa programming in Perl: http://camelbones.sourceforge.net
    Sherm Pendley, Aug 28, 2008
    #12
