XHTML user agent behavior regarding empty elements

Mikko Ohtamaa

From the XML specification:

[Definition: An element with no content is said to be empty.] The
representation of an empty element is either a start-tag immediately
followed by an end-tag, or an empty-element tag.

(This means that <foo></foo> is equal to <foo/>)
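
The equivalence can be checked with any conforming XML parser. Here is a minimal sketch, assuming Python 3.8 or later and its standard xml.etree.ElementTree module; the helper name same_xml is made up for illustration.

```python
import xml.etree.ElementTree as ET


def same_xml(a: str, b: str) -> bool:
    """Compare two XML documents after canonicalization (C14N 2.0).

    Canonical XML always writes empty elements as a start-tag/end-tag
    pair, so <foo/> and <foo></foo> end up as the same string.
    """
    return ET.canonicalize(a) == ET.canonicalize(b)


# An empty-element tag versus a start-tag immediately followed by an end-tag.
print(same_xml("<foo></foo>", "<foo/>"))

# A single space is character data, so this element is not empty.
print(same_xml("<foo> </foo>", "<foo/>"))
```

The first comparison should hold and the second should not, which is also why <p> </p> and <p /> are different things to an XML parser.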

From the XHTML specification:

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY
(for example, an empty title or paragraph) do not use the minimized form
(e.g. use <p> </p> and not <p />).

From the XML point of view, <div/> and <div></div> are equal. However, XHTML,
which should be valid XML, recommends(?) using <div></div> only. Should
XHTML browsers accept empty-element tags?

A little testing shows that they do not. Both IE 5.5 and Netscape
7.0 fail to render the following XHTML code correctly. They treat the
empty-element tag <div/> as equal to <div>.

This is a nuisance: when you produce XHTML from XML with an XSLT
transform, XSLT processors write empty elements using empty-element
tag notation. You must use an external postprocessor to change <div/>
elements to <div></div> pairs.
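
Such a postprocessor can be quite small. The following is only a sketch in Python, not a robust tool: the function name expand_empty_tags is made up, the list of elements is the set declared EMPTY in the XHTML 1.0 Strict DTD, and the regular expression assumes that no attribute value contains ">" and that the input has no comments or CDATA sections containing tags.

```python
import re

# Elements declared EMPTY in the XHTML 1.0 Strict DTD; these keep the
# minimized form, as the compatibility guidelines suggest.
EMPTY_ELEMENTS = {
    "area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param",
}

# Matches an empty-element tag such as <div/> or <div id="x" />.
# Assumption: attribute values never contain "<" or ">".
_EMPTY_TAG = re.compile(r"<([A-Za-z_][\w.-]*)((?:\s[^<>]*?)?)\s*/>")


def expand_empty_tags(markup: str) -> str:
    """Rewrite <div/> as <div></div>, leaving declared-EMPTY elements alone."""

    def repl(match):
        name = match.group(1)
        attrs = match.group(2).rstrip()
        if name in EMPTY_ELEMENTS:
            return match.group(0)  # <br/>, <img .../> and friends stay minimized
        return f"<{name}{attrs}></{name}>"

    return _EMPTY_TAG.sub(repl, markup)


print(expand_empty_tags('<div id="blaah"/><br/>'))
```

The script would be run over the XSLT output before it is sent to the browser.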

<?xml version="1.0" encoding="utf-8" ?>
<html>
<body>

<div style="margin-left: 10%; background: blue">
A working sample.
<div style="margin-left: 10%; background: red">
Lalihoo!
<div id="blaah"></div>
Am I red?
</div>
Am I blue?
</div>

<br/>

<div style="margin-left: 10%; background: blue">
Hiihoo!

<div style="margin-left: 10%; background: red">
Lalihoo!
<div id="blaah"/>
Am I red?
</div>
Am I blue? No, I am red because I am confused.
</div>
</body>
</html>
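
To see why the second block goes red, it helps to track which <div> elements are open at the "Am I blue?" text under each reading of <div/>. The sketch below is a toy model in Python, not a description of how any real browser parses. The function name background_at and the shortened sample string are made up for illustration.

```python
import re

TAG = re.compile(r"<(/?)(\w+)([^<>]*?)(/?)>")
BACKGROUND = re.compile(r"background:\s*(\w+)")


def background_at(markup, text, xml_mode):
    """Return the background of the innermost styled <div> open at `text`.

    xml_mode=True  : <div/> is an empty element (opens and closes at once).
    xml_mode=False : <div/> is read like <div>, as in the test above.
    """
    stack = []
    limit = markup.index(text)
    for match in TAG.finditer(markup):
        if match.start() >= limit:
            break
        closing, name, attrs, minimized = match.groups()
        if name != "div":
            continue
        if closing:
            if stack:
                stack.pop()
        elif minimized and xml_mode:
            continue  # empty element: nothing stays open
        else:
            found = BACKGROUND.search(attrs)
            stack.append(found.group(1) if found else None)
    # Innermost open <div> that actually sets a background.
    return next((bg for bg in reversed(stack) if bg), None)


sample = (
    '<div style="margin-left: 10%; background: blue">Hiihoo!'
    '<div style="margin-left: 10%; background: red">Lalihoo!'
    '<div id="blaah"/>'
    'Am I red?'
    '</div>'
    'Am I blue?'
    '</div>'
)

print(background_at(sample, "Am I blue?", xml_mode=True))
print(background_at(sample, "Am I blue?", xml_mode=False))
```

Under the XML reading the text sits in the blue <div>. Under the other reading the first </div> only closes "blaah", so the text is still inside the red one.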
 
Julian F. Reschke

Mikko Ohtamaa said:
From XML specification:

[Definition: An element with no content is said to be empty.] The
representation of an empty element is either a start-tag immediately
followed by an end-tag, or an empty-element tag.

(This means that <foo></foo> is equal to <foo/>)

From XHTML specification:

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY
(for example, an empty title or paragraph) do not use the minimized form
(e.g. use <p> </p> and not <p />).

From XML point of view <div/> and <div></div> are equal. However, XHTML,
which should be valid XML, recommends(?) to use <div></div> only. Should
XHTML browsers accept empty-element tags?

a) The quote in C.3 is from the (non-normative) appendix "HTML Compatibility
Guidelines".

b) They must.

A little testing shows that this is not the case. Both IE 5.5 and Netscape
7.0 fail to render following XHTML code correctly. They consider
empty-element tag <div/> equal to <div>.

IE is known not to support XHTML. For NS 7, this may be a bug that needs to
be fixed. Make sure, though, that you are serving the XHTML in a way that
makes the browser *aware* that this is not HTML.

Julian
 
Johannes Koch

Mikko said:
From XML specification:

[Definition: An element with no content is said to be empty.] The
representation of an empty element is either a start-tag immediately
followed by an end-tag, or an empty-element tag.

(This means that <foo></foo> is equal to <foo/>)

From XHTML specification:

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY
(for example, an empty title or paragraph) do not use the minimized form
(e.g. use <p> </p> and not <p />).

From XML point of view <div/> and <div></div> are equal.

From XML 1.0 Second Edition:
Empty-element tags may be used for any element which has no content,
whether or not it is declared using the keyword EMPTY. For
interoperability, the empty-element tag should be used, and should only
be used, for elements which are declared EMPTY.

However, XHTML,
which should be valid XML, recommends(?) to use <div></div> only. Should
XHTML browsers accept empty-element tags?

Yes, they should.

A little testing shows that this is not the case. Both IE 5.5 and Netscape
7.0 fail to render following XHTML code correctly.

IE 5.5 is not an XHTML browser, though maybe it can be called an XML browser.
In various browsers, XML rules are only applied when the content is known
to be XML (via an appropriate Content-Type HTTP header).

They consider
empty-element tag <div/> equal to <div>.

In tag soup mode.

No f'up2 set, because it may be interesting for both groups.
 
Jukka K. Korpela

From XHTML specification:

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not
EMPTY (for example, an empty title or paragraph) do not use the
minimized form (e.g. use <p> </p> and not <p />).

I think it needs to be mentioned that the HTML 4.01 specification
explicitly frowns upon empty paragraphs and says authors should not use
them and browsers should ignore them. It's not clear whether <p> </p> is
empty or not; a space character as the content is not the same as lack of

A little testing shows that this is not the case. Both IE 5.5 and
Netscape 7.0 fail to render following XHTML code correctly. They
consider empty-element tag <div/> equal to <div>.

No wonder. And rumors say that there are even some small browsers that
process the construct <div/> _correctly_ by HTML rules as valid up to and

This is nuisance, since when you are producing XHTML from XML with
XSLT transform, XSLT transformers present empty elements using
empty-element tag notation. You must use external postprocessor to
change <div/> elements to <div></div> pairs.

Why do you generate elements with empty content in the first place?
What is the meaning of a <div> element with no content, given that the
<div> element has no semantics except in the abstract sense that it
constitutes a block-level element?

Empty elements are extremely confusing, see
http://www.cs.tut.fi/~jkorpela/html/empty.html
 
Johannes Koch

Mikko said:
I am using MSXML (Microsoft XML engine) to transform XML data to XHTML
reports.

Why do you want to create _X_HTML reports when several browsers don't
know about _X_HTML? Produce HTML instead.

In XSLT it is too heavy to check if each element will be empty and
implement a wrapper for it.

<xsl:template match="foo">
<xsl:if test="normalize-space(.) != ''">
<div class="{local-name()}">
<xsl:value-of select="."/>
</div>
</xsl:if>
</xsl:template>

Is this really too heavy?
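
For readers who do not use XSLT, the same check can be sketched in Python with the standard xml.etree.ElementTree module. The function name foo_to_divs, the source element name "foo" and the <body> wrapper are assumptions made for illustration; the whitespace test plays the role of normalize-space(.) != '' in the template above.

```python
import xml.etree.ElementTree as ET


def foo_to_divs(source_xml: str) -> str:
    """Emit one <div class="foo"> per non-empty <foo>; skip the empty ones."""
    root = ET.fromstring(source_xml)
    body = ET.Element("body")
    for foo in root.iter("foo"):
        value = "".join(foo.itertext())      # like xsl:value-of select="."
        if " ".join(value.split()) != "":    # like normalize-space(.) != ''
            div = ET.SubElement(body, "div", {"class": "foo"})
            div.text = value
    return ET.tostring(body, encoding="unicode")


print(foo_to_divs("<data><foo>hello</foo><foo>  </foo><foo/></data>"))
```

Only the first <foo> produces a <div>, so no empty <div> ever reaches the output.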

xpost and f'up2 ctx
 
David Madore

Mikko Ohtamaa in litteris
From XML point of view <div/> and <div></div> are equal. However, XHTML,
which should be valid XML, recommends(?) to use <div></div> only. Should
XHTML browsers accept empty-element tags?

If the document is served with MIME content-type
"application/xhtml+xml", then <div /> _must_ be treated as equivalent
to <div></div>; on the other hand, if the document is served with MIME
content-type "text/html", then the browser is free to treat the
content as tag soup.

A little testing shows that this is not the case. Both IE 5.5 and Netscape
7.0 fail to render following XHTML code correctly. They consider
empty-element tag <div/> equal to <div>.

Mozilla (and Mozilla derivatives, such as Netscape 7) treats <div/> as
equivalent to <div> when parsing the document as HTML, but as
<div></div> when parsing it as XHTML. The difference is determined by
the MIME content-type as explained above, or, in the absence of
higher-level protocol information, by the extension.
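
The decision described here can be sketched as a small function. This is a hedged illustration in Python, not Mozilla's actual logic: the name parsing_mode, the exact set of XML media types and the extension list are assumptions.

```python
import os

# Media types treated as XML in this sketch (assumed list).
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}
# File extensions treated as XML when no content type is available (assumed list).
XML_EXTENSIONS = {".xhtml", ".xht", ".xml"}


def parsing_mode(content_type=None, filename=""):
    """Return "xml" or "tag soup" for a document.

    The Content-Type header wins when it names a known markup type;
    otherwise the file extension is used as a fallback.
    """
    if content_type:
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type in XML_TYPES:
            return "xml"
        if media_type == "text/html":
            return "tag soup"
    extension = os.path.splitext(filename)[1].lower()
    return "xml" if extension in XML_EXTENSIONS else "tag soup"


print(parsing_mode("text/html; charset=utf-8", "report.xhtml"))
print(parsing_mode(None, "report.xhtml"))
```

Note that in the first call the header overrides the extension, so an .xhtml file served as text/html still gets the tag soup treatment.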

Note that Mozilla is about the only browser which supports the
application/xhtml+xml content-type anyway.

This is nuisance, since when you are producing XHTML from XML with XSLT
transform, XSLT transformers present empty elements using empty-element
tag notation. You must use external postprocessor to change <div/>
elements to <div></div> pairs.

Simply use <xsl:comment> to create a comment inside the <div> element
if it has any chance of being empty: this will prevent it from being
minimized. I use "<!-- EMPTY -->" in this context.
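
The same trick works outside XSLT. Below is a minimal sketch using Python's standard xml.etree.ElementTree, where a comment child keeps the serializer from writing the minimized form. The function name pad_empty and the choice of element names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def pad_empty(root, names=("div", "span")):
    """Append an <!-- EMPTY --> comment to every empty element listed in names."""
    # Copy the iteration into a list first, since the tree is modified below.
    for elem in list(root.iter()):
        has_text = bool((elem.text or "").strip())
        if elem.tag in names and len(elem) == 0 and not has_text:
            elem.append(ET.Comment(" EMPTY "))
    return root


tree = ET.fromstring('<body><div id="blaah"/><br/></body>')
print(ET.tostring(pad_empty(tree), encoding="unicode"))
```

The <div> is written with separate start and end tags around the comment, while <br/> is left minimized.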
 
Alan J. Flavell

Simply use <xsl:comment> to create a comment inside the <div> element
if it has any chance of being empty: this will prevent it from being
minimized. I use "<!-- EMPTY -->" in this context.

The div element is designed to contain, well, "content". If there
isn't any content, then it's semantically meaningless (syntax or no
syntax). Surely the logical move would be to take it out, rather than
looking for other kinds of content-free clutter to stick into it?

(I did once have a program that ran faster by inserting a NOP, but
that's a different story entirely.)

all the best
 
David Madore

"Alan J. Flavell" in litteris
The div element is designed to contain, well, "content". If there
isn't any content, then it's semantically meaningless (syntax or no
syntax). Surely the logical move would be to take it out, rather than
looking for other kinds of content-free clutter to stick into it?

Generally speaking, I agree with you. There are rare cases, however,
where I find an empty <div> or <span> element to be useful and
appropriate. Here's one:

<div style="border: solid">
<img src="pornpicture.jpg" width="120" height="240"
alt="[Highly erotic image]" style="float: left" />
<p>To the left is a picture of me. Blah, blah, blah.</p>
<div style="clear: both"><!-- EMPTY --></div>
</div>

- in other words, the empty <div> is used to make sure that the border
of the outer <div> fully goes around the image even if the text is too
short for that.

Another case is when you want to style an element using the CSS
"content" property: sometimes there is nothing else to put in the
element. One interesting hack consists of using the CSS "content"
property on an empty <span> element as it seems to be the only way to
include foreign text in an HTML document without embedding it.
Similarly, using the Mozilla-invented XBL language it might turn out
to be useful to bind to empty <div> or <span> elements.

Another case is when the <div> or <span> element starts empty, but
receives dynamic content through the Document Object Model, e.g.,
via ECMAScript. Of course, the DOM might be used to create the <div>
or <span> element itself, but it might then be a major hassle to get
it in the right place, whereas an empty <div> or <span> element with a
correct id attribute is so simple to locate in the DOM!
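
The placeholder idea can be sketched without a browser. The example below uses Python's xml.etree.ElementTree as a stand-in for the DOM and ECMAScript; the page fragment, the id "result" and the function name fill_placeholder are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A page fragment with an empty placeholder element (hypothetical content).
page = ET.fromstring('<body><p>Latest result:</p><div id="result"></div></body>')


def fill_placeholder(root, element_id, text):
    """Find the element with the given id and put text into it."""
    target = root.find(f".//*[@id='{element_id}']")
    if target is None:
        raise KeyError(element_id)
    target.text = text
    return target


fill_placeholder(page, "result", "42 rows")
print(ET.tostring(page, encoding="unicode"))
```

Locating the element is one lookup by id. Creating it from scratch would also require finding its parent and the right position among its siblings.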

Speaking of which, of course, an empty <div> might be useful if you
want several anchors pointing to the same place in an HTML document.
It isn't very elegant, and I would advise against it in general, but
sometimes it seems to be the right thing to do.

But, again, in general, I agree with you: unless content generation
makes it very hard to tell in advance whether the <div> will be empty,
it is better to leave out empty <div>s.

Besides, I was using <div> just as an example: there are other
possibly empty tags to which the poster's question might validly
apply. <script> springs to my mind. (Unfortunately, as far as
<script> goes, there is the nasty problem of XML's PCDATA versus
SGML's CDATA content...)
 
Jukka K. Korpela

There are rare cases, however,
where I find an empty <div> or <span> element to be useful and
appropriate.

Let's see your examples:
<div style="clear: both"><!-- EMPTY --></div>

You should assign clear: both to the next element. If there is no next
element in the document, no clearing is needed.

Another case is when you want to style an element using the CSS
"content" property:

The content property applies to :before and :after pseudo-elements only,
so you just need to select whether you wish to have the text inserted
before or after some text in the document.

One interesting hack consists of using the CSS "content"
property on an empty <span> element as it seems to be the only way to
include foreign text in an HTML document without embedding it.

Would that really fall within the principle of using CSS for optional
presentational suggestions? It's hardly a good argument in favor of
something that it would be needed for a hack that shouldn't be used. But
even for such a hack, you can simply assign the content property to a
suitable pseudo-element (as you need to do anyway, but the point is that
the pseudo-element can be derived from a real element, as opposed to an
artificial element with empty content).

Similarly, using the Mozilla-invented XBL language it might turn out
to be useful to bind to empty <div> or <span> elements.

A similar case indeed, except that you're referring to a browser-specific
invention, it seems.

Another case is when the <div> or <span> element starts empty, but
receives dynamical content through the Document Object Model, e.g.,
via ECMAscript.

This is the kind of emptiness that potentially makes sense in SGML-based
markup, but whether it makes sense in authoring for the WWW is less clear.

Of course, the DOM might be used to create the <div>
or <span> element itself,

I think you just objected to your own example. If scripting is actually used
to change the document's structure by adding elements, why would you hide
this by making them technically static?

Speaking of which, of course, an empty <div> might be useful if you
want several anchors pointing to the same place in an HTML document.
It isn't very elegant, and I would advise against it in general, but
sometimes it seems to be the right thing to do.

The need still needs to be proven.
 
David Madore

"Jukka K. Korpela" in litteris
Let's see your examples:
<div style="clear: both"><!-- EMPTY --></div>

You should assign clear: both to the next element. If there is no next
element in the document, no clearing is needed.

Maybe you didn't read my example completely. I'm not using the
"clear" property to clear the next element, but to clear the border of
the surrounding <div>.

Here's an example (except that I didn't have a nice porn picture to
use, sorry): please compare

<URL: http://www.eleves.ens.fr:8080/home/madore/.test/float1.html >
and
<URL: http://www.eleves.ens.fr:8080/home/madore/.test/float2.html >

(the first uses an empty <div> as I suggest, and the second puts the
clear property on the next element).

All browsers I have at hand display them differently, and as I understand
the CSS spec, that is what should happen. And evidently
there are cases when the first presentation is wanted, not the second:

The content property applies to :before and :after pseudo-elements only,
so you just need to select whether you wish to have the text inserted
before or after some text in the document.

Sometimes the content is generated and it is extremely difficult to
get at the previous or next generated element.

Would that really fall within the principle of using CSS for optional
presentational suggestions? It's hardly a good argument in favor of
something that it would be needed for a hack that shouldn't be used.

I would very much prefer if the fathers and normalizers of HTML had
foreseen the usefulness of a tag to include plain text (or
inline-level HTML) from a foreign source within HTML (without creating
a block-level element for embedding). But given that this tag doesn't
exist, what else can I do? I agree that it's a hack to use CSS for
that, and most often contrary to the goals and principles of CSS
(though not always: sometimes the inserted text *is* optional and of
presentational nature), but until someone suggests a better
solution...

But
even for such a hack, you can simply assign the content property to a
suitable pseudo-element (as you need to do anyway, but the point is that
the pseudo-element can be derived from a real element, as opposite to an
artificial element with empty content).

See above: if the content is generated, it is not always easy, or even
possible, to get at the previous or next element.

Or it may be simply a matter of elegance. For example, consider this:

<p>Stylesheet name (if applicable): [<span
id="insert-stylesheet-name-here"><!-- EMPTY --></span>]</p>

with a CSS rule like

#insert-stylesheet-name-here:before { content: "Foobar"; }

in the "Foobar" stylesheet, and similarly in the others. Now it is
true that I might also write this as

<p>Stylesheet name (if applicable): [<span
id="insert-stylesheet-name-here">]</span></p>

I just happen to think it is more elegant to use an empty <span> tag,
because it avoids unbalancing the brackets.

(Of course, you might then point out that the <span> shouldn't be
empty, it should contain the word "none", and CSS should be used to
avoid displaying that word when a stylesheet is active. Right. We
could continue the byzantine discussion indefinitely in this line.)

A similar case indeed, except that you're referring to a browser-specific
invention, it seems.

Yes, and so? There's nothing wrong with browser-specific inventions
if they're useful and are employed in a way that gracefully degrades
on other browsers.

This is the kind of emptiness that potentially makes sense in SGML-based
markup, but whether it makes sense in authoring for the WWW is less clear.

I'm not sure I understand this comment.

I think you just objected your own example. If scripting is actually used
to change the document's structure by adding elements, why would you hide
this with making them technically static?

It's not a matter of hiding the fact that dynamic content will be
inserted. It's just that if there is a small (and optional) amount of
it, it is much simpler to dump it in an already existent, but empty,

The need still needs to be proven.

Why is the burden of the proof on my shoulders? Suppose you proved
that the need cannot arise?

It seems that in every case I've given (except the first, where I
still see no workaround) you've told me "this isn't absolutely
necessary" and I've answered "yes, but it's convenient". I hope we
can agree on this: that empty <div> or <span> elements are not
necessary, but they are sometimes convenient. Now suppose you told me
what is *wrong* about them?

If there is some kind of dogmatic reason ("Natura abhorret vacuum"?)
for not ever using empty <div> or <span> tags, then I will refrain
from further discussion. My religion doesn't forbid empty <div> or
<span> tags: it just frowns upon their *gratuitous* use, but allows
them when they make things simpler, or more convenient, and when no
other inconvenience results (and I'd like to know what inconvenience
can be caused by an empty tag). In that case, let us just let our
religions be at peace, and people can make up their own minds as to what
gospel they will follow. I do not intend to flame or debate endlessly
about what is The Right Thing.

On the other hand, if you have an important practical reason for not
using empty <div> and <span> tags (such as "this-or-that browser will
break to pieces upon encountering them" or "they cause a serious
accessibility problem for people with this-or-that disability"), then
I would certainly like to hear it.

Cheers,
 
Alan J. Flavell

Sometimes the content is generated and it is extremely difficult to
get at the previous or next generated element.

We are still free to discuss the quality of the end result, surely, no
matter what technique was used to generate it? If the tools then
prove inadequate to the task, we would have to decide which is more
important - to use the tools at hand, or to produce a quality product.

I have been known to pass the result through a post-filter where I
wasn't satisfied with the output of some tool that I needed to use for
other reasons; and no doubt I'll be doing the same again if/when a
similar situation arises.
 
David Madore

"Alan J. Flavell" in litteris
We are still free to discuss the quality of the end result, surely, no
matter what technique was used to generate it? If the tools then
prove inadequate to the task, we would have to decide which is more
important - to use the tools at hand, or to produce a quality product.

Certainly. But I still fail to see why having empty <div> or <span>
elements degrades the "quality" of an (X)HTML document, apart from the
dogmatic "you're not supposed to" which in my opinion is certainly not
a sufficient argument to justify going through the pains of
post-processing the document in order to remove these empty tags (and
somehow relocate their style properties).

Cheers,
 
Jukka K. Korpela

Maybe you didn't read my example completely. I'm not using the
"clear" property to clear the next element, but to clear the border of
the surrounding <div>.

The meaning of the clear property is to stop floating, so I cannot see why
you could not use it the way I suggested. It seems to me that you are
imitating <br clear="..."> in CSS, rather than making full use of CSS
possibilities. I don't see how you would "clear the border"; a border
property affects the element that it is assigned to, and you can assign a
height property to the element if you wish to make it taller than its
content needs.

Sometimes the content is generated and it is extremely difficult to
get at the previous or next generated element.

You're referring to content generated by server- or client-side scripting
or preprocessing, right? The content generated by the CSS 'content'
property is something different. In any case, the tools you use for
generating content e.g. server-side should be selected to match the needs,
not vice versa.

I would very much prefer if the fathers and normalizers of HTML had
foreseen the usefulness of a tag to include plain text (or
inline-level HTML) from a foreign source within HTML (without creating
a block-level element for embedding).

Well, they did in a sense - but browsers have not implemented the SGML way
of using entities (except in the trivial sense of supporting a predefined
set of entity references that expand to character references).

I agree with the idea that a simple markup system like HTML should have
had a simple include feature. But CSS is _not_ the solution to that. There
are several better approaches, as described in the c.i.w.a.h. FAQ.

(though not always: sometimes the inserted text *is* optional and of
presentational nature)

Then it should be something that accompanies the presentation of some
existing element. Besides, in WWW authoring the whole idea of CSS
generated content is mostly just theoretical, due to lack of support by
the current market leader among browsers.

<p>Stylesheet name (if applicable): [<span
id="insert-stylesheet-name-here"><!-- EMPTY --></span>]</p>

I fail to see what this relates to. Why would a document contain style
sheet names that way?

Yes, and so? There's nothing wrong with browser-specific inventions
if they're useful and are employed in a way that gracefully degrades
on other browsers.

The point is that you make arguments in favor of hacks, on the grounds
that some hacks need them.

It seems that in every case I've given (except the first, where I
still see no workaround) you've told me "this isn't absolutely
necessary" and I've answered "yes, but it's convenient".

I think that for every case, including the first, I have shown that
there is no need for using a

On the other hand, if you have an important practical reason for not
using empty <div> and <span> tags

First, there is no practical need for <div> and <span> elements with empty
content (to use the proper terms).

Second, we have the precedent of <p></p>, which has caused much confusion
- it has been used for layout, and the HTML specification explicitly says
that it should not be used, and that browsers should ignore such elements.
And browsers do not generally do that, so we really have a confusion.

Third, to take a simple example, such elements mess up the document
appearance when a user style sheet is used in order to make all <div>
elements bordered, so that the structure can be seen.

Followups trimmed - I think we are now so far from general XML that this
belongs to the HTML group only.
 
