Effect of DTD declaration on XSL processing?

Discussion in 'XML' started by Simon Brooke, Jul 24, 2003.

  1. Simon Brooke

    Simon Brooke Guest

    Test Test HTML file

    Test file to observe effect of DTD declaration on XSL processing.
     
    Simon Brooke, Jul 24, 2003
    #1

  2. Dean Tiegs

    Dean Tiegs Guest

    Simon Brooke <> writes:

    > (i) explain this behaviour and the reasons for it, and


    Namespaces. In your XHTML with DTD, the html element and all its
    descendants are in a namespace because of fixed-value attribute
    declarations in the DTD. In the XHTML without DTD, the elements are
    in no namespace.

    In your XSLT, <xsl:template match="html"> matches only html elements
    in no namespace.
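
    To make the mechanism concrete, here is a minimal, self-contained
    sketch. The internal DTD subset is a stand-in for the XHTML DTD,
    which declares xmlns on html as a #FIXED attribute in the same way;
    the simplified element declarations are assumptions for
    illustration, not the real DTD.

    ```xml
    <?xml version="1.0"?>
    <!DOCTYPE html [
      <!-- Simplified content model, for illustration only -->
      <!ELEMENT html (body)>
      <!ELEMENT body EMPTY>
      <!-- The fixed-value declaration: a parser that reads the DTD
           supplies this attribute even though the document omits it -->
      <!ATTLIST html xmlns CDATA #FIXED "http://www.w3.org/1999/xhtml">
    ]>
    <!-- No xmlns is written on html here -->
    <html>
      <body/>
    </html>
    ```

    A parser that reads the DTD hands the XSLT processor html and body
    in the http://www.w3.org/1999/xhtml namespace. Delete the DOCTYPE
    and the same two elements are in no namespace, which is the only
    case match="html" covers.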

    > (ii) suggest a recipe for XSLT stylesheets which work irrespective
    > of whether the DTD declaration is present or not?


    I think the only way is to change all your XPath expressions to match
    against both possibilities. For example:



    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
    xmlns:ht="http://www.w3.org/1999/xhtml">
    <xsl:template match="html | ht:html">


    or


    <xsl:template match="*[local-name() = 'html']">
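
    Putting the first variant together, a complete stylesheet might look
    like the sketch below. The body template and the text output are
    illustrative assumptions and are not taken from the original
    stylesheet; only the ht prefix and the union pattern come from the
    fragment above.

    ```xml
    <?xml version="1.0"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    version="1.0"
                    xmlns:ht="http://www.w3.org/1999/xhtml"
                    exclude-result-prefixes="ht">

      <xsl:output method="text"/>

      <!-- Matches html in no namespace OR in the XHTML namespace -->
      <xsl:template match="html | ht:html">
        <xsl:text>matched html&#10;</xsl:text>
        <!-- The same union is needed when selecting children -->
        <xsl:apply-templates select="body | ht:body"/>
      </xsl:template>

      <!-- Hypothetical second template, to show the pattern repeating -->
      <xsl:template match="body | ht:body">
        <xsl:text>matched body&#10;</xsl:text>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Note that the union has to be repeated in every match and select
    that names an element, not only on the root template. The
    local-name() form avoids the extra prefix, but it also matches html
    elements from any other namespace.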

    --
    Dean Tiegs, NE¼-20-52-25-W4
    “Confortare et esto robustus”
    http://telusplanet.net/public/dctiegs/
     
    Dean Tiegs, Jul 24, 2003
    #2

  3. Johannes Koch

    Simon Brooke wrote:
    > I thought I knew this stuff. I've been teaching XML since 1998. But
    > I've just stumbled on something which is so bizarre (to me) that I'm
    > beginning to think I don't understand anything at all.
    >
    > Consider the attached files. They comprise one very simple XSL
    > transform, one copy of a very simple valid XHTML file with the DTD
    > declaration (without which it would not be valid), and one copy of the
    > same XHTML file but with the DTD declaration removed. Note that
    > with-dtd.html has been tested against the W3C validator.


    According to the specification, what you have is not an XHTML document
    <http://www.w3.org/TR/xhtml1/#strict>:

    3.
    The root element of the document must contain an xmlns declaration for
    the XHTML namespace [XMLNS]. The namespace for XHTML is defined to be
    http://www.w3.org/1999/xhtml. An example root element might look like:

    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    --
    Johannes Koch
    In te domine speravi; non confundar in aeternum.
    (Te Deum, 4th cent.)
     
    Johannes Koch, Jul 24, 2003
    #3
  4. Simon Brooke

    Simon Brooke Guest

    Dean Tiegs <> writes:

    > Simon Brooke <> writes:
    >
    > > (i) explain this behaviour and the reasons for it, and

    >
    > Namespaces. In your XHTML with DTD, the html element and all its
    > descendants are in a namespace because of fixed-value attribute
    > declarations in the DTD. In the XHTML without DTD, the elements are
    > in no namespace.


    Many thanks for that, that's clear and straightforward...

    > In your XSLT, <xsl:template match="html"> matches only html elements
    > in no namespace.
    >
    > > (ii) suggest a recipe for XSLT stylesheets which work irrespective
    > > of whether the DTD declaration is present or not?

    >
    > I think the only way is to change all your XPath expressions to match
    > against both possibilities. For example:
    >
    > <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    > version="1.0"
    > xmlns:ht="http://www.w3.org/1999/xhtml">
    > <xsl:template match="html | ht:html">


    OK, this works. The effect on performance is quite surprising:

    -[simon]-> time xsltproc test.xsl with-dtd.html
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <!--
    matched html
    -->

    real 0m11.328s
    user 0m0.020s
    sys 0m0.020s

    -[simon]-> time xsltproc test.xsl without-dtd.html
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <!--
    matched html
    -->

    real 0m0.005s
    user 0m0.000s
    sys 0m0.000s

    Presumably the extra time is taken fetching and parsing the DTD. I
    *hate* generating code which isn't valid, but I'm going to have to
    make a choice here between valid code and reasonable performance!

    --
    (Simon Brooke) http://www.jasmine.org.uk/~simon/

    ;; my other religion is Emacs
     
    Simon Brooke, Jul 24, 2003
    #4
  5. Richard Tobin

    In article <>,
    Simon Brooke <> wrote:
    >real 0m11.328s
    >user 0m0.020s
    >sys 0m0.020s


    >real 0m0.005s
    >user 0m0.000s
    >sys 0m0.000s


    >Presumably the extra time is taken fetching and parsing the DTD.


    Fetching, since the cpu time is still only .04 seconds.

    Can't you use a catalog to get a local copy instead? Or failing that
    an http proxy?
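
    A catalog for this could be sketched as follows. It assumes the
    DOCTYPE uses the standard W3C system identifiers under
    http://www.w3.org/TR/xhtml1/DTD/, and that copies of the DTDs and
    the .ent entity files they reference have been saved side by side
    in a local directory. The local path is hypothetical.

    ```xml
    <?xml version="1.0"?>
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
      <!-- Redirect any system identifier under the W3C XHTML 1.0 DTD
           directory to a local copy (hypothetical path) -->
      <rewriteSystem
          systemIdStartString="http://www.w3.org/TR/xhtml1/DTD/"
          rewritePrefix="file:///usr/local/share/xml/xhtml1/"/>
    </catalog>
    ```

    libxml2-based tools such as xsltproc consult the catalogs named in
    the XML_CATALOG_FILES environment variable, so pointing that at a
    file like this should let the DTD be resolved without touching the
    network.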

    -- Richard
    --
    Spam filter: to mail me from a .com/.net site, put my surname in the headers.

    FreeBSD rules!
     
    Richard Tobin, Jul 25, 2003
    #5
  6. Simon Brooke

    Simon Brooke Guest

    (Richard Tobin) writes:

    > In article <>,
    > Simon Brooke <> wrote:
    > >real 0m11.328s
    > >user 0m0.020s
    > >sys 0m0.020s

    >
    > >real 0m0.005s
    > >user 0m0.000s
    > >sys 0m0.000s

    >
    > >Presumably the extra time is taken fetching and parsing the DTD.

    >
    > Fetching, since the cpu time is still only .04 seconds.
    >
    > Can't you use a catalog to get a local copy instead? Or failing that
    > an http proxy?


    Not reliably, in all the places this software runs. However, it turns
    out not to matter very much because in the real application the time
    hit occurs only the first time the DTD declaration is seen, and as
    this software tends to have uptimes of more than six months, a
    ten-second hit at startup time is not that painful. I'm just glad that I
    now understand what's going on!

    The only thing that bothers me is what happens if the software doesn't
    have access to the public internet at all and consequently can't fetch
    the DTD. Presumably it will barf horribly and I'll have to do
    something about that.

    The easiest thing, of course, would be to not generate the DTD
    declaration in the first place, but for purely aesthetic reasons I
    don't want to do that!

    --
    (Simon Brooke) http://www.jasmine.org.uk/~simon/

    ;; my other religion is Emacs
     
    Simon Brooke, Jul 25, 2003
    #6