Minidom empty script element bug

Derek Basch

Hello All,

I ran into a problem while dynamically constructing XHTML documents using
minidom. If you create a script tag such as:

script_node_0 = self.doc.createElement("script")
script_node_0.setAttribute("type", "text/javascript")
script_node_0.setAttribute("src", "../test.js")

minidom renders it as:

<script src='../test.js' type='text/javascript'/>

Which is incorrect because:

XHTML 1.0 specs, Appendix C
~~~~@~~~~
C.3 Element Minimization and Empty Element Content

Given an empty instance of an element whose content model is not EMPTY (for
example, an empty title or paragraph) do not use the minimized form (e.g.
use <p> </p> and not <p />)
~~~~@~~~~

reference for further explanation:
http://lists.evolt.org/archive/Week-of-Mon-20020304/105951.html

So, the rendered page completely fails on IE6 because it actually handles the
empty script element correctly. Mozilla handles the element incorrectly and
instantiates the javascript.

How do I get minidom to NOT render an empty script element? Should I submit a
bug report?
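
A minimal reproduction of the behaviour described above, assuming the standard-library xml.dom.minidom (the original post may have used the PyXML build, whose quoting and attribute order can differ). The variable names are illustrative only.

```python
from xml.dom import minidom

# Build the same element as in the post.
doc = minidom.Document()
script_node = doc.createElement("script")
script_node.setAttribute("type", "text/javascript")
script_node.setAttribute("src", "../test.js")

# An element with no child nodes is serialized in the minimized form,
# i.e. the output ends in "/>" and has no separate end tag.
minimized = script_node.toxml()
print(minimized)
```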

Thanks for the help,
Derek Basch






Martin v. Löwis

Derek said:
XHTML 1.0 specs, Appendix C
~~~~@~~~~
C.3 Element Minimization and Empty Element Content

Given an empty instance of an element whose content model is not EMPTY (for
example, an empty title or paragraph) do not use the minimized form (e.g.
use <p> </p> and not <p />)
~~~~@~~~~

I'd like to point out that this is *not* a minidom bug. minidom cannot
possibly know that the document type is XHTML, and that strange, non-XML
rules apply to XHTML (i.e. rules which are not present in XML itself).
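
A small sketch of why this is so, assuming the standard-library xml.dom.minidom: at the XML level the minimized and the expanded spelling parse to the same element, so the DOM keeps no record of which one an HTML user agent would prefer.

```python
from xml.dom import minidom

minimized = minidom.parseString('<script src="../test.js"/>').documentElement
expanded = minidom.parseString('<script src="../test.js"></script>').documentElement

# Same element name, same attribute value, and no child nodes in either tree.
same_name = minimized.tagName == expanded.tagName
same_src = minimized.getAttribute("src") == expanded.getAttribute("src")
child_counts = (len(minimized.childNodes), len(expanded.childNodes))

# Because the trees are identical, both serialize to the same string.
same_output = minimized.toxml() == expanded.toxml()
print(same_name, same_src, child_counts, same_output)
```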

I'd also like to point out that XHTML Appendix C is informative (i.e.
non-normative), meaning that failure to comply with it does not imply
non-compliance with XHTML. An XML file which uses the minimized form
for the script element is still proper, well-formed, valid XHTML.

Derek said:
How do I get minidom to NOT render an empty script element? Should I submit a
bug report?

That said, I think there is a simple solution: add an empty Text node
to the script element:

script_node_0.appendChild(doc.createText(u""))

[Disclaimer: this is untested; from reading the source, I think it
should work]

Regards,
Martin
 
Derek Basch

Martin said:
That said, I think there is a simple solution: add an empty Text node
to the script element:

script_node_0.appendChild(doc.createText(u""))

[Disclaimer: this is untested; from reading the source, I think it
should work]

Regards,
Martin


Thanks Martin. That fixed it. I had to change your code a bit to this:

script_node_0.appendChild(self.doc.createTextNode(""))

maybe you meant createTextNode?
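
A runnable sketch of the workaround, assuming the standard-library xml.dom.minidom and a bare Document in place of self.doc. It also shows one caveat: normalize() discards empty Text nodes, so the minimized form comes back if the tree is normalized before output.

```python
from xml.dom import minidom

doc = minidom.Document()
script_node = doc.createElement("script")
script_node.setAttribute("type", "text/javascript")
script_node.setAttribute("src", "../test.js")

# createTextNode is the DOM method name. The empty Text child makes
# minidom write separate start and end tags.
script_node.appendChild(doc.createTextNode(""))
with_end_tag = script_node.toxml()
print(with_end_tag)

# normalize() removes empty Text nodes, which restores the minimized form.
script_node.normalize()
after_normalize = script_node.toxml()
print(after_normalize)
```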

I started digging through the dom modules on this path:

XHTMLPrettyPrint -> XHTMLPrinter -> Printer

and found this comment:

try:
    #The following stanza courtesy Martin von Loewis
    import codecs # Python 1.6+ only
    from types import UnicodeType

So I guess you are pretty qualified to answer my question! You are
correct that this is not a minidom bug now that I think about it.

However, it seems proper that XHTMLPrinter (or some other module)
should allow the developer to use either normative or non-normative
XHTML design guidelines to achieve some sane degree of HTML user agent
compatibility. Maybe something like this in Printer.py:

def visitElement(self, node):
    ...........
    if len(node.childNodes):
        self._write('>')
        self._depth = self._depth + 1
        self.visitNodeList(node.childNodes)
        self._depth = self._depth - 1
        if not self._html or (node.tagName not in HTML_FORBIDDEN_END):
            not (self._inText and inline) and self._tryIndent()
            self._write('</%s>' % node.tagName)
    elif not self._html and node.tagName not in XHTML_NON_NORMATIVES:
        self._write('/>')
    elif node.tagName not in HTML_FORBIDDEN_END:
        self._write('></%s>' % node.tagName)
    else:
        self._write('>')

of course this would only take care of the "C.3. Element Minimization
and Empty Element Content" guideline but you get the general idea.
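
The same guideline can be approximated without touching Printer.py. The sketch below assumes the standard-library xml.dom.minidom rather than PyXML; XHTML_EMPTY and pad_empty_elements are hypothetical names, and the set lists the elements declared EMPTY in the XHTML 1.0 DTDs. Every other childless element gets an empty Text child, so only EMPTY elements are minimized.

```python
from xml.dom import minidom, Node

# Elements whose content model is EMPTY in the XHTML 1.0 DTDs
# (Strict, plus basefont/isindex from Transitional and frame from Frameset).
XHTML_EMPTY = frozenset({
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
})

def pad_empty_elements(node):
    """Add an empty Text child to each childless element that is not EMPTY."""
    for child in list(node.childNodes):
        if child.nodeType == Node.ELEMENT_NODE:
            pad_empty_elements(child)
    if (node.nodeType == Node.ELEMENT_NODE
            and not node.childNodes
            and node.tagName not in XHTML_EMPTY):
        node.appendChild(node.ownerDocument.createTextNode(""))

source = ('<html><head><title/><script src="a.js"/></head>'
          '<body><p/><br/></body></html>')
doc = minidom.parseString(source)
pad_empty_elements(doc.documentElement)
result = doc.documentElement.toxml()
print(result)
```

This only covers guideline C.3. Other Appendix C guidelines, such as the space before the trailing slash in C.2, would still need a custom printer.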

Anyways, thanks for the help again and feel free to shoot down my
suggestions :)

Derek Basch
 
Martin v. Löwis

Derek said:
maybe you meant createTextNode?

Yes, that's what I meant :)

Derek said:
However, it seems proper that XHTMLPrinter (or some other module)
should allow the developer to use either normative or non-normative
XHTML design guidelines to achieve some sane degree of HTML user agent
compatibility.

This is now PyXML, right? I also maintain PyXML...
Yes, XHtmlPrinter would be the right place to deal with XHTML
idiosyncrasies.

Derek said:
Anyways, thanks for the help again and feel free to shoot down my
suggestions :)

The general approach sounds good; feel free to submit a patch
to sf.net/projects/pyxml. I would recommend implementing Appendix C
to the letter, i.e. only avoid the minimized form if the content
model is not EMPTY.

Regards,
Martin
 
Derek Basch

Cross post from XML-SIG:

--- Walter Dörwald said:
Martin v. Löwis said:
Derek said:
[...]
How do I get minidom to NOT render an empty script element? Should I
submit a bug report?

That said, I think there is a simple solution: add an empty Text node to
the script element:

script_node_0.appendChild(doc.createText(u""))

[Disclaimer: this is untested; from reading the source, I think it should
work]

If this doesn't work, you might want to try XIST
(http://www.livinglogic.de/Python/xist) instead of minidom. XIST knows that
the script element is not EMPTY, and when the output is in HTML-compatible
XML an end tag will be produced:
src="../test.js").asBytes(xhtml=1)
<script src="../test.js" type="text/javascript"></script>

Using pure XML mode gives:
src="../test.js").asBytes(xhtml=2)
<script src="../test.js" type="text/javascript"/>

Bye,
Walter Dörwald

Wow! XIST is very elegant. Perfectly designed for what it is supposed
to do.

"XIST is an extensible HTML/XML generator written in Python."

I guess there isn't much point in "fixing" the PyXML XHTMLPrinter when
something as cool as XIST exists (pun intended).

Kid also seems really neat. I like the TAL-like features. However, it
seems less mature than XIST.

There seems to be a lot of functionality crossover between the two, but
it is good that there is enough demand for XML output functionality in
Python to support two distinct modules.

Thanks Everyone!,
Derek Basch
 
