XSL: how to remove nodes from the XML tree? (advanced)

Discussion in 'XML' started by pavel.repkin@gmail.com, Apr 26, 2007.

  1. Guest

    Hey!
    How would you do the following task?

    Suppose you have an XML tree as input.
    There is a special kind of node you want to remove;
    say it is named "bad".
    Each "bad" node has a parent node, obviously.
    If the parent has no children left after all the "bad" nodes are
    removed, the parent must be removed too.
    This rule applies recursively to all the ancestors of the "bad"
    node.

    How would you do this in XSLT?
    I don't know :(

    Example input:
    <a>
      <b>
        <c>
          <bad/>
          <bad/>
        </c>
      </b>
      <d>
        <bad/>
      </d>
      <e/>
    </a>

    Desired output:
    <a>
      <e/>
    </a>

    Pasha
     
    , Apr 26, 2007
    #1

  2. Standard approach: Start with the identity transform, then add templates
    for anything that doesn't want to simply be copied over.

    You want to discard any node that contains <bad/> somewhere in its
    subtree. Those nodes can be expressed as *[.//bad]. Write a template
    that matches that and outputs nothing.

    Exception: You want to keep the top-level element. Write a template that
    explicitly matches it and always outputs it, recursively processing its
    contents. Or modify the "anything containing bad" pattern to explicitly
    not match the top-level element.

    Details are left as an exercise for the student.

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
     
    Joseph Kesselman, Apr 26, 2007
    #2
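
    A minimal sketch of the approach described above, assuming the sample document from the first post. It follows the description literally: the identity template copies everything, and one empty template drops every element that has a <bad/> descendant, except the top-level element (excluded here with a parent::* test, which is only one of several ways to write that exception). Note that this also drops elements that hold other content alongside <bad/>, and it leaves a <bad/> that is a direct child of the top-level element in place, so treat it as a starting point rather than a complete answer.

    ```xml
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Identity transform: copy every node and attribute by default -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- Output nothing for any element that has a bad descendant.
           parent::* is false for the top-level element (its parent is
           the document root, not an element), so that element is kept. -->
      <xsl:template match="*[.//bad][parent::*]"/>

    </xsl:stylesheet>
    ```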

  3. roy axenov Guest

    wrote:
    > Suppose, there is a special kind of node you want to
    > remove. Let it have "bad" name. Each "bad" node has a
    > parent node, obviously. In case there are no children left
    > in the parent after removal of all the "bad" nodes, the
    > parent must also be removed. And this rule is applied to
    > all the ancestors of the "bad" node recursively.
    >
    > How would you do this in XSLT?


    Try reading XPath/XSLT tutorials. Note that this is a bit
    tricky to implement in XSLT1, you would need some fairly
    evil XPath expressions to filter out unneeded nodes. XSLT2
    would make things much easier for you. Reading something
    about identity transformation and exclusion templates
    should be extremely useful.

    Just for the heck of it:

    <xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        version="1.0">
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
      <xsl:template
          match="*[..][descendant::bad]
                  [not(descendant::*[not(*)][not(self::bad)])]"/>
    </xsl:stylesheet>

    Hm, let's see...

    "bad.xml" 12L, 96C written
    > xsltproc bad.xsl bad.xml

    <?xml version="1.0"?>
    <a>


    <e/>
    </a>
    >


    Yep. It even seems to work on your sample document.

    Oh, and stop using the google groups. GG never worked all
    that well for posting on the usenet newsgroups, but it got
    beyond bad in the last few days--seems like their ng
    archives suddenly broke down in a fairly spectacular
    fashion, and no one even bothered to fix them.

    --
    roy axenov
     
    roy axenov, Apr 26, 2007
    #3
  4. M Guest

    Hi Roy,

    Not quite there...

    If you apply your stylesheet to...
    <a>
      <bad/>
      <b>
        <c>
          <bad/>
          <bad/>
        </c>
      </b>
      <d>
        <bad/>
      </d>
      <e/>
    </a>

    ...you are left with a <bad> element in the output.

    I think the empty template needs to be...
    <xsl:template match="*[descendant-or-self::bad and parent::*]"/>


    Cheers
    M
     
    M, Apr 27, 2007
    #4
  5. Re: how to remove nodes from the XML tree? (advanced)

    As replied in another newsgroup, here is one solution:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>

      <xsl:strip-space elements="*"/>

      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="bad"/>

      <xsl:template match=
        "*[* and not(descendant::*[not(*) and not(self::bad)])]"/>
    </xsl:stylesheet>

    Let's take a slightly more complex XML document, such as this one:

    <a>
      <b>
        <c>
          <bad/>
          <bad/>
        </c>
      </b>
      <d>
        <bad/>
      </d>
      <e/>
      <f>
        <bad/>
        <good/>
        <bad/>
      </f>
    </a>


    The transformation above produces the required result:

    <a>
      <e/>
      <f>
        <good/>
      </f>
    </a>


    Cheers,
    Dimitre Novatchev
     
    Dimitre Novatchev, Apr 29, 2007
    #5
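
    One case the thread does not specify is an element that holds both <bad/> and text, for example <h><bad/>note</h>. With the stylesheet in this post, the second empty template matches such an element (it has an element child and no non-bad leaf element), so the text is dropped along with it. Whether that is wanted depends on the requirements. A hedged variant, assuming non-whitespace text should count as content worth keeping and that <bad/> elements are empty, as in the thread's examples:

    ```xml
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <!-- Identity transform: copy everything by default -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <!-- Always drop the bad elements themselves -->
      <xsl:template match="bad"/>

      <!-- Drop an element that has element children only if its subtree
           contains no non-bad leaf element AND no non-whitespace text.
           The text() test is the assumption added in this sketch. -->
      <xsl:template match=
        "*[* and not(descendant::*[not(*) and not(self::bad)])
             and not(descendant::text()[normalize-space()])]"/>
    </xsl:stylesheet>
    ```

    With this variant, <h><bad/>note</h> no longer matches the last template, so it is copied by the identity template and only its <bad/> child is removed.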
  6. Pavel Lepin Guest

    M <> wrote in
    <1tpYh.1549$>:
    > "roy axenov" <> wrote in message
    > news:f0qpcn$i5h$...
    >> wrote:
    >> > Suppose, there is a special kind of node you want to
    >> > remove. Let it have "bad" name. Each "bad" node has a
    >> > parent node, obviously. In case there are no children
    >> > left in the parent after removal of all the "bad"
    >> > nodes, the parent must also be removed. And this rule
    >> > is applied to all the ancestors of the "bad" node
    >> > recursively.

    >>
    >> > xsltproc bad.xsl bad.xml

    >> <?xml version="1.0"?>
    >> <a>
    >>
    >>
    >> <e/>
    >> </a>
    >> >

    >>
    >> Yep. It even seems to work on your sample document.

    >
    > I think the empty template needs to be...
    > <xsl:template match="*[descendant-or-self::bad and
    > parent::*]"/>


    That would remove any element that has any <bad/>
    descendants. I don't believe that's what the OP was asking
    for.

    --
    Pavel Lepin
     
    Pavel Lepin, May 2, 2007
    #6
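
    To make the difference concrete, here is a small sketch, assuming the template from post #4 is combined with the identity template from post #3 and applied to an input where <bad/> has a sibling that should survive. The element names f and good are borrowed from post #5.

    ```xml
    <!-- Input:
           <a><f><bad/><good/></f><e/></a>
         Under the original rules, f still has a child (good) once bad
         is removed, so f should be kept. -->
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Identity transform: copy everything by default -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- Pattern from post #4: f has a bad descendant and an element
           parent, so f is dropped together with good. Only e is left
           inside a. -->
      <xsl:template match="*[descendant-or-self::bad and parent::*]"/>

    </xsl:stylesheet>
    ```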
  7. Guest

    Re: how to remove nodes from the XML tree? (advanced)

    Dimitre, thank you so much, the transformation works perfectly!
    Besides, I have even managed to understand how it works. :) I hope
    to use this technique in the future.

    Very professional, thanks!
     
    , May 2, 2007
    #7