Re: Problems creating an automatic index for XHTML with XSLT

Discussion in 'XML' started by Dimitre Novatchev, Sep 10, 2003.

  1. Excuse me, but it is not clear what exactly you want to produce from your
    source xhtml -- how the output is related to the input (they seem
    essentially to have the same structure), how the output must be structured
    and what requirements it must satisfy.

    In other words, can you define what you mean by "index"?


    =====
    Cheers,

    Dimitre Novatchev.
    http://fxsl.sourceforge.net/ -- the home of FXSL



    "Alex Geller" <> wrote in message
    news:-ig.de...
    > Hi,
    > I am trying to add an index up front of an XHTML document.
    > The stylesheet should spot H1, H2 and H3 elements and build some sort of
    > index from them (currently I use nested OL/LI).
    > Example:
    > $cat test.html
    > <HTML>
    > <BODY>
    > <H1>H1 1</H1>
    > <H2>H2 1.1</H2>
    > <H3>H3 1.1.1</H3>
    > <H3>H3 1.1.2</H3>
    > <H3>H3 1.1.3</H3>
    > <H2>H2 1.2</H2>
    > <H3>H3 1.2.1</H3>
    > <H3>H3 1.2.2</H3>
    > <H3>H3 1.2.3</H3>
    > <H1>H1 2</H1>
    > <H2>H2 2.1</H2>
    > <H3>H3 2.1.1</H3>
    > <H3>H3 2.1.2</H3>
    > <H3>H3 2.1.3</H3>
    > <H2>H2 2.2</H2>
    > <H3>H3 2.2.1</H3>
    > <H3>H3 2.2.2</H3>
    > <H3>H3 2.2.3</H3>
    > </BODY>
    > </HTML>
    > $xalan -in test.html -xsl mkindex.xslt -out result.html
    > $cat result.html
    > <?xml version="1.0" encoding="UTF-8"?>
    > <HTML>
    > <BODY>
    > <H1>Index</H1>
    > <OL>
    > <LI><a href="#H1 1">H1 1</a>
    > <OL>
    > <LI><a href="#H2 1.1">H2 1.1</a>
    > <OL>
    > <LI><a href="#H3 1.1.1">H3 1.1.1</a></LI>
    > <LI><a href="#H3 1.1.2">H3 1.1.2</a>
    > </LI><LI><a href="#H3 1.1.3">H3 1.1.3</a></LI>
    > </OL>
    > </LI>
    > <LI><a href="#H2 1.2">H2 1.2</a>
    > <OL>
    > <LI><a href="#H3 1.2.1">H3 1.2.1</a></LI>
    > <LI><a href="#H3 1.2.2">H3 1.2.2</a></LI>
    > <LI><a href="#H3 1.2.3">H3 1.2.3</a></LI>
    > </OL>
    > </LI>
    > </OL>
    > </LI>
    > <LI><a href="#H1 2">H1 2</a>
    > <OL>
    > <LI><a href="#H2 2.1">H2 2.1</a>
    > <OL>
    > <LI><a href="#H3 2.1.1">H3 2.1.1</a></LI>
    > <LI><a href="#H3 2.1.2">H3 2.1.2</a>
    > </LI><LI><a href="#H3 2.1.3">H3 2.1.3</a></LI>
    > </OL>
    > </LI>
    > <LI><a href="#H2 2.2">H2 2.2</a>
    > <OL>
    > <LI><a href="#H3 2.2.1">H3 2.2.1</a></LI>
    > <LI><a href="#H3 2.2.2">H3 2.2.2</a></LI>
    > <LI><a href="#H3 2.2.3">H3 2.2.3</a></LI>
    > </OL>
    > </LI>
    > </OL>
    > </LI>
    > </OL>
    > <a name="H1 1"><H1>H1 1</H1></a>
    > <a name="H2 1.1"><H2>H2 1.1</H2></a>
    > <a name="H3 1.1.1"><H3>H3 1.1.1</H3></a>
    > <a name="H3 1.1.2"><H3>H3 1.1.2</H3></a>
    > <a name="H3 1.1.3"><H3>H3 1.1.3</H3></a>
    > <a name="H2 1.2"><H2>H2 1.2</H2></a>
    > <a name="H3 1.2.1"><H3>H3 1.2.1</H3></a>
    > <a name="H3 1.2.2"><H3>H3 1.2.2</H3></a>
    > <a name="H3 1.2.3"><H3>H3 1.2.3</H3></a>
    > <a name="H1 2"><H1>H1 2</H1></a>
    > <a name="H2 2.1"><H2>H2 2.1</H2></a>
    > <a name="H3 2.1.1"><H3>H3 2.1.1</H3></a>
    > <a name="H3 2.1.2"><H3>H3 2.1.2</H3></a>
    > <a name="H3 2.1.3"><H3>H3 2.1.3</H3></a>
    > <a name="H2 2.2"><H2>H2 2.2</H2></a>
    > <a name="H3 2.2.1"><H3>H3 2.2.1</H3></a>
    > <a name="H3 2.2.2"><H3>H3 2.2.2</H3></a>
    > <a name="H3 2.2.3"><H3>H3 2.2.3</H3></a>
    > </BODY>
    > </HTML>
    > I have found a solution that works for me but which is not very good. Maybe
    > it's helpful as a starting point or just to prove that I have tried a
    > little before posting.
    > The solution has at least the following problems:
    > - In order to detect all H2 siblings between the two H1 elements H1a and
    > H1b I create a node list of all siblings following H1a and then, by
    > conditional, check whether the previous H1 sibling of the current H2 is
    > H1a. The check is done by comparing the text of the H1 nodes. This breaks
    > as soon as two adjacent H1s have the same text.
    > - The template fails as soon as the Hn tags are not siblings in the same
    > list. Suppose we introduce a <div> in the document so that one or more Hn
    > elements become descendants of this element, then my scheme breaks.
    > $cat mkindex.xslt
    > <?xml version="1.0" encoding="ISO-8859-1"?>
    > <xsl:stylesheet
    > xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    > version="1.0">
    > <xsl:output method="xml"/>
    >
    > <xsl:template match="@*|node()">
    > <xsl:copy>
    > <xsl:apply-templates select="@*|node()"/>
    > </xsl:copy>
    > </xsl:template>
    > <xsl:template match="BODY">
    > <BODY>
    > <H1>Index</H1>
    > <OL>
    > <xsl:for-each select="H1">
    > <xsl:variable name="h1text" select="text()"/>
    > <LI>
    > <a href="#{text()}">
    > <xsl:value-of select="text()"/>
    > </a>
    > <OL>
    > <xsl:for-each select="following-sibling::H2">
    > <xsl:if
    > test="preceding-sibling::H1[position()=1]/text()=$h1text">
    > <xsl:variable name="h2text"
    > select="text()"/>
    > <LI>
    > <a href="#{text()}">
    > <xsl:value-of select="text()"/>
    > </a>
    > <OL>
    > <xsl:for-each
    > select="following-sibling::H3">
    > <xsl:if
    > test="preceding-sibling::H2[position()=1]/text()=$h2text">
    > <LI>
    > <a href="#{text()}">
    > <xsl:value-of
    > select="text()"/>
    > </a>
    > </LI>
    > </xsl:if>
    > </xsl:for-each>
    > </OL>
    > </LI>
    > </xsl:if>
    > </xsl:for-each>
    > </OL>
    > </LI>
    > </xsl:for-each>
    > </OL>
    > <xsl:apply-templates/>
    > </BODY>
    > </xsl:template>
    > <xsl:template match="H1">
    > <a name="{text()}">
    > <H1>
    > <xsl:apply-templates/>
    > </H1>
    > </a>
    > </xsl:template>
    > <xsl:template match="H2">
    > <a name="{text()}">
    > <H2>
    > <xsl:apply-templates/>
    > </H2>
    > </a>
    > </xsl:template>
    > <xsl:template match="H3">
    > <a name="{text()}">
    > <H3>
    > <xsl:apply-templates/>
    > </H3>
    > </a>
    > </xsl:template>
    > </xsl:stylesheet>
    > Thank you for your help
    > Regards,
    > Alex
     
    Dimitre Novatchev, Sep 10, 2003

  2. Hi Dimitre,
    Dimitre Novatchev wrote:

    > Excuse me, but it is not clear what exactly you want to produce from your
    > source xhtml -- how the output is related to the input (they seem
    > essentially to have the same structure),

    Well, not quite. The input H? elements have a flat structure (they are
    siblings), while in the output they are nested (each Hn+1 becomes a
    descendant of the preceding Hn). Maybe you were fooled by the indentation
    of the input HTML.
    > how the output must be structured

    Exactly as shown (it is best to view both the input and the output in a
    browser).
    > and what requirements it must satisfy.
    >
    > In other words, can you define what you mean by "index"?


    I want to automatically generate a table of contents up front of an
    arbitrary HTML document in which chapters and subchapters are denoted using
    H? tags. The stylesheet should search for these tags in the document, create
    the table of contents from them and then copy the document itself. The
    items in the table of contents should be linked via <a href="..."> to their
    respective chapters in the document. The table of contents should have a
    hierarchical structure using numbered lists as shown in the example. The
    rules for the structure could be defined as follows:
    Let v be a vector of all H? elements found in a pre-order traversal of the
    document tree.
    For example:
    v=H1,H2,H2,H3,H2,H3,H1,H1,H2,H3,H2,H3
    We call the n of an Hn element its hierarchy value.
    Create a result tree r that contains all nodes from the source vector v.
    In this result tree r, every node vn from the source vector v becomes a
    child of the nearest preceding node in v whose hierarchy value is lower
    than that of vn; a node with no such predecessor stays at the top level.
    r=H1(H2,H2(H3),H2(H3)),H1,H1(H2(H3),H2(H3))

    Thank you,
    Alex
     
    Alex Geller, Sep 11, 2003
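
One possible way to express the nesting rule from the post above in XSLT 1.0 is sketched below. It is not the solution posted in the thread, only an illustration under stated assumptions: the input uses un-namespaced, upper-case H1/H2/H3 elements as in test.html, and headings never contain other headings. The key names (h2-by-h1, h3-by-parent) and the named template toc-entry are made up for this sketch. Each heading is attached to the nearest preceding heading of a lower level via the preceding:: axis, so the headings no longer need to be siblings, and anchors use generate-id() instead of the heading text, so two headings with the same text do not collide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <!-- Hypothetical keys: index every H2 by the id of the nearest H1 that
       precedes it in document order, and every H3 by the id of the nearest
       preceding H1 or H2. -->
  <xsl:key name="h2-by-h1" match="H2"
      use="generate-id(preceding::H1[1])"/>
  <xsl:key name="h3-by-parent" match="H3"
      use="generate-id(preceding::*[self::H1 or self::H2][1])"/>

  <!-- Identity template: copy everything that has no more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Emit the table of contents first, then the original body content. -->
  <xsl:template match="BODY">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <H1>Index</H1>
      <OL>
        <xsl:apply-templates select=".//H1" mode="toc"/>
      </OL>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Table of contents entries: each level passes its children on. -->
  <xsl:template match="H1" mode="toc">
    <xsl:call-template name="toc-entry">
      <xsl:with-param name="children"
          select="key('h2-by-h1', generate-id())
                | key('h3-by-parent', generate-id())"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template match="H2" mode="toc">
    <xsl:call-template name="toc-entry">
      <xsl:with-param name="children"
          select="key('h3-by-parent', generate-id())"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template match="H3" mode="toc">
    <xsl:call-template name="toc-entry"/>
  </xsl:template>

  <!-- One list item linking to the current heading, followed by a nested
       list if the heading has children. The default is an empty node-set. -->
  <xsl:template name="toc-entry">
    <xsl:param name="children" select="/.."/>
    <LI>
      <a href="#{generate-id()}"><xsl:value-of select="."/></a>
      <xsl:if test="$children">
        <OL>
          <xsl:apply-templates select="$children" mode="toc"/>
        </OL>
      </xsl:if>
    </LI>
  </xsl:template>

  <!-- In the copied document, wrap each heading in a matching anchor. -->
  <xsl:template match="H1|H2|H3">
    <a name="{generate-id()}">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet can be run the same way as mkindex.xslt in the original post. Two limitations of this sketch: an H2 or H3 that appears before the first H1 has no parent and is left out of the index, and the preceding:: axis is re-evaluated for every heading, which may be slow on very large documents.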
