[XSLT] Absolute URI of an unparsed entity and catalog

Discussion in 'XML' started by Vincent Lefevre, Sep 10, 2003.

  1. I would like to know whether the base URI used to resolve an unparsed
    entity declared with a relative URI should be the URI before or after it
    is rewritten by a catalog.

    Let's take an example. Here's my XML file:

    <?xml version="1.0"?>
    <!DOCTYPE para
      PUBLIC "-//Norman Walsh//DTD Website Full V2.4.0//EN"
      "http://docbook.sourceforge.net/release/website/2.4.0/website-full.dtd"
    [
      <!ENTITY % entities SYSTEM "http://www.vinc17.org/www.ent">
      %entities;
    ]>
    <para><olink targetdocent="local.index.en">test</olink></para>

    and my XSLT file:

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="olink">
        <a href="{unparsed-entity-uri(@targetdocent)}">
          <xsl:apply-templates/>
        </a>
      </xsl:template>
    </xsl:stylesheet>

    http://www.vinc17.org/www.ent is a file in which I define unparsed
    entities that are relative to http://www.vinc17.org/. For instance:

    <!ENTITY local.index.en SYSTEM "index.en.html" NDATA XML>

    As I don't want to connect to http://www.vinc17.org/ to generate the
    URI, I use a catalog with the following entry:

    <rewriteSystem systemIdStartString="http://www.vinc17.org/www.ent"
    rewritePrefix="file:///home/lefevre/wd/www-new/www.ent"/>

    (in fact, http://www.vinc17.org/www.ent doesn't even exist in reality,
    but the XSLT processor doesn't need to know that). But then, xsltproc
    generates the following file:

    <?xml version="1.0"?>
    <a href="file:///home/lefevre/wd/www-new/index.en.html">test</a>

    instead of:

    <?xml version="1.0"?>
    <a href="http://www.vinc17.org/index.en.html">test</a>

    Is that correct? I would have said that since the XSLT specification
    doesn't define the notion of a catalog, a catalog should be regarded only
    as a caching system (i.e. transparent to the XML generated by XSLT); in
    that case, I should have got the version with http://www.vinc17.org/.
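
To make the two candidate results concrete, here is a minimal Python sketch of the arithmetic involved. It is only an illustration, not what xsltproc does internally: `rewrite_system` is a hypothetical helper that mimics a `rewriteSystem` entry by prefix replacement (longest match wins), and `urljoin` stands in for the relative-URI resolution the processor performs.

```python
from urllib.parse import urljoin

# Values taken from the post above.
ORIGINAL_SYSTEM_ID = "http://www.vinc17.org/www.ent"
CATALOG = [
    # (systemIdStartString, rewritePrefix)
    ("http://www.vinc17.org/www.ent", "file:///home/lefevre/wd/www-new/www.ent"),
]


def rewrite_system(system_id, entries):
    """Apply the longest matching rewrite entry, or return system_id unchanged."""
    matches = [(start, prefix) for start, prefix in entries
               if system_id.startswith(start)]
    if not matches:
        return system_id
    start, prefix = max(matches, key=lambda entry: len(entry[0]))
    return prefix + system_id[len(start):]


rewritten = rewrite_system(ORIGINAL_SYSTEM_ID, CATALOG)

# The entity declaration uses the relative system identifier "index.en.html".
before = urljoin(ORIGINAL_SYSTEM_ID, "index.en.html")  # base = URI before rewriting
after = urljoin(rewritten, "index.en.html")            # base = URI after rewriting

print(before)
print(after)
```

The first value is the result I expected and the second is the one xsltproc produced, so the whole question comes down to which of the two URIs is treated as "the URI of the resource containing the entity declaration".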

    Failing that, I would be interested in a variant of the
    unparsed-entity-uri function that yields a relative URI. If I define
    all the URIs and filenames with relative names, xsltproc does generate
    a relative URI, but it is relative to the current directory rather than
    to the document defining the entity, so this is not acceptable.
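
As a rough illustration of what such a variant could return, the sketch below relativizes an absolute URI against the URI of the document that declares the entity. `relativize` is a hypothetical helper, not an existing XSLT or libxslt function, and it ignores query strings and fragments.

```python
import posixpath
from urllib.parse import urlsplit


def relativize(uri, base):
    """Express uri relative to the directory of base, if both share an origin."""
    u, b = urlsplit(uri), urlsplit(base)
    if (u.scheme, u.netloc) != (b.scheme, b.netloc):
        return uri  # different scheme or host: keep the absolute URI
    base_dir = posixpath.dirname(b.path) or "/"
    return posixpath.relpath(u.path, start=base_dir)


# Same relative result whichever base the processor picked.
via_http = relativize("http://www.vinc17.org/index.en.html",
                      "http://www.vinc17.org/www.ent")
via_file = relativize("file:///home/lefevre/wd/www-new/index.en.html",
                      "file:///home/lefevre/wd/www-new/www.ent")
print(via_http, via_file)
```

The point of the sketch is that a result relative to the declaring document would not depend on whether the catalog rewrote the base.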

    --
    Vincent Lefèvre - Web: <http://www.vinc17.org/> - 100%
    validated (X)HTML - Acorn Risc PC, Yellow Pig 17, Championnat International
    des Jeux Mathématiques et Logiques, TETRHEX, etc.
    Work: CR INRIA - computer arithmetic / SPACES project at LORIA
    Vincent Lefevre, Sep 10, 2003
    #1

  2. Bob Foster

    I believe the answer to all such questions should be "before". The catalog
    should not "show through" to the infoset in any way.

    Without any knowledge of how the catalog is implemented, I would guess the
    problem is in the entity resolver. It is probably returning the system id of
    the actual location it fetched the resource from rather than, as it should,
    the "virtual" system id it was handed.
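
A minimal sketch of that idea, using Python's `xml.sax` interfaces purely for illustration: the resolver reads from the rewritten location but labels the result with the system id it was handed. `LOCAL_FILES` and `REWRITES` are made-up stand-ins for the file system and the catalog, and whether a given parser then uses the reported id as the base URI is an assumption that would need checking for that parser.

```python
import io
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource

# Hypothetical stand-in for the local file system, so the sketch runs anywhere.
LOCAL_FILES = {
    "file:///home/lefevre/wd/www-new/www.ent":
        b'<!ENTITY local.index.en SYSTEM "index.en.html" NDATA XML>\n',
}

# (systemIdStartString, rewritePrefix) from the catalog entry in the question.
REWRITES = [
    ("http://www.vinc17.org/www.ent", "file:///home/lefevre/wd/www-new/www.ent"),
]


class TransparentCatalogResolver(EntityResolver):
    """Fetch from the rewritten location but report the original system id."""

    def resolveEntity(self, publicId, systemId):
        location = systemId
        for start, prefix in REWRITES:
            if systemId.startswith(start):
                location = prefix + systemId[len(start):]
                break
        if location not in LOCAL_FILES:
            # Not covered by the catalog: let the parser fetch it itself.
            return systemId
        source = InputSource(systemId)  # the "virtual" id, not the local path
        source.setByteStream(io.BytesIO(LOCAL_FILES[location]))
        return source


source = TransparentCatalogResolver().resolveEntity(
    None, "http://www.vinc17.org/www.ent")
print(source.getSystemId())
```

With a resolver like this, the content comes from the local copy while the system id visible to later stages is still the http one.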

    Bob Foster
    Bob Foster, Sep 10, 2003
    #2

  3. From the XSLT 1.0 Recommendation, section 3.3, Unparsed Entities:

    The root node has a mapping that gives the URI for each unparsed entity
    declared in the document's DTD. The URI is generated from the system
    identifier and public identifier specified in the entity declaration.
    The XSLT processor may use the public identifier to generate a URI for
    the entity instead of the URI specified in the system identifier. If the
    XSLT processor does not use the public identifier to generate the URI,
    it must use the system identifier; if the system identifier is a
    relative URI, it must be resolved into an absolute URI using the URI of
    the resource containing the entity declaration as the base URI [RFC2396].


    --
    Cordialement,

    ///
    (. .)
    -----ooO--(_)--Ooo-----
    | Philippe Poulard |
    -----------------------
    Philippe Poulard, Sep 10, 2003
    #3
  4. In article <bjn449$ape$>,
    Philippe Poulard <> wrote:

    > 3.3 Unparsed Entities


    > The root node has a mapping that gives the URI for each unparsed entity
    > declared in the document's DTD. The URI is generated from the system
    > identifier and public identifier specified in the entity declaration.
    > The XSLT processor may use the public identifier to generate a URI for
    > the entity instead of the URI specified in the system identifier. If the
    > XSLT processor does not use the public identifier to generate the URI,
    > it must use the system identifier; if the system identifier is a
    > relative URI, it must be resolved into an absolute URI using the URI of
    > the resource containing the entity declaration as the base URI [RFC2396].


    I know how to read the specs. :) But what if catalogs are used?
    My point was that this paragraph doesn't mention catalogs; thus, the
    XSLT processor should behave as if there were no catalogs (catalogs
    are just a transparent way of caching resources). If so, this would
    mean that there is a bug in xsltproc. Before reporting a bug, I'd like
    to know whether my interpretation is correct or not.

    Vincent Lefevre, Sep 10, 2003
    #4
  5. Bob Foster

    "Vincent Lefevre" <> wrote in message
    news:20030910215441$...
    > Before reporting a bug,
    > I'd like to know whether my interpretation is correct or not.


    Nobody is going to be able to give you chapter and verse on this (as the
    recent attempt illustrates). You can use "if the system identifier is a
    relative URI, it must be resolved into an absolute URI using the URI of the
    resource containing the entity declaration as the base URI" to support
    either side of this question. However, your point of view makes sense and
    the behavior you report is very counter-intuitive, so if I were you I'd file
    a bug report.

    If the response comes back, as I would expect, "We can't do anything because
    the catalog is reporting the wrong URI," then file a bug against the
    catalog. Or fix it yourself and contribute the fix.

    Bob Foster
    Bob Foster, Sep 11, 2003
    #5
  6. In article <kyO7b.214397$>,
    Bob Foster <> wrote:

    > Nobody is going to be able to give you chapter and verse on this (as
    > the recent attempt illustrates). You can use "if the system
    > identifier is a relative URI, it must be resolved into an absolute
    > URI using the URI of the resource containing the entity declaration
    > as the base URI" to support either side of this question. However,
    > your point of view makes sense and the behavior you report is very
    > counter-intuitive, so if I were you I'd file a bug report.


    OK, done: http://bugzilla.gnome.org/show_bug.cgi?id=122001

    Thanks,

    Vincent Lefevre, Sep 11, 2003
    #6
  7. Vincent Lefevre wrote:
    > I know how to read the specs. :) But what if catalogs are used?
    > My point was that this paragraph doesn't mention catalogs; thus, the
    > XSLT processor should behave as if there were no catalogs (catalogs
    > are just a transparent way of cacheing resources). In this case, this
    > would mean that there is a bug in xsltproc. Before reporting a bug,
    > I'd like to know whether my interpretation is correct or not.
    >


    Well, in my opinion, catalogs are just a convenient way to resolve
    entities, so the rules should be the same with or without catalogs. That
    is to say, since the caching relies on the local file system (in your
    case), the behaviour described in the spec will be applied.

    However, in Java it is possible for an external resource to be given a
    specific base URI (in the XSLT case) or system ID or public ID (in the
    XML case):

    javax.xml.transform.Source#setSystemId()
    org.xml.sax.InputSource#setPublicId()
    org.xml.sax.InputSource#setSystemId()

    I don't really know if catalogs can do that.
    See the specs, as you know how to read them :)
    Philippe Poulard, Sep 12, 2003
    #7