Need help remaking an XSL transformation

Discussion in 'XML' started by sommarlov@gmail.com, Jun 19, 2006.

  1. Guest

    Hi everyone,
    One of our systems produces an XML file. I need to validate this file
    before we send it to an external system for a very lengthy process. I
    cannot change the XML file layout.
    The solution I have today is very slow, and I need help finding another
    one.

    Here is the XML file. It consists of a list of position ids (ESTOXX50
    INDEX_BM_E and FTSE INDEX_BM_E), and below that a list of tags for each
    position id. I want to check that each position id outside the
    <groupCustomBucketList> list has an entry under each tag type
    (customDimensionName) in the <groupCustomBucket> elements below. And
    vice versa: that each position id under each tag exists in the list of
    <equity> entries. See the XSL transformation below.

    <?xml version="1.0" encoding="utf-8"?>
    <?xml-stylesheet type="text/xsl" href="t.xsl"?>
    <positions>
    <equity>
    <positionId>ESTOXX50 INDEX_BM_E</positionId>
    </equity>
    <equity>
    <positionId>FTSE INDEX_BM_E</positionId>
    </equity>

    <groupCustomBucketList>
    <groupCustomBucket>
    <customDimensionName>Branch</customDimensionName>
    <customBucketValue>BENCHMARK</customBucketValue>
    <positionIdList>
    <positionId>BMK ZENIT ESTOXX50 INDEX_BM_E</positionId>
    <positionId>BMK ZENIT FTSE INDEX_BM_E</positionId>
    </positionIdList>
    </groupCustomBucket>
    <groupCustomBucket>
    <customDimensionName>Folder</customDimensionName>
    <customBucketValue>BZ_ESTOX50</customBucketValue>
    <positionIdList>
    <positionId>BMK ZENIT ESTOXX50 INDEX_BM_E</positionId>
    </positionIdList>
    </groupCustomBucket>
    <groupCustomBucket>
    <customDimensionName>Folder</customDimensionName>
    <customBucketValue>BZ_FTSE</customBucketValue>
    <positionIdList>
    <positionId>BMK ZENIT FTSE INDEX_BM_E</positionId>
    </positionIdList>
    </groupCustomBucket>
    <groupCustomBucket>
    <customDimensionName>Portfolio</customDimensionName>
    <customBucketValue>BMK_ZENIT</customBucketValue>
    <positionIdList>
    <positionId>BMK ZENIT ESTOXX50 INDEX_BM_E</positionId>
    <positionId>BMK ZENIT FTSE INDEX_BM_E</positionId>
    </positionIdList>
    </groupCustomBucket>
    <groupCustomBucket>
    <customDimensionName>CurrencyRegion</customDimensionName>
    <customBucketValue>EUR</customBucketValue>
    <positionIdList>
    <positionId>BMK ZENIT ESTOXX50 INDEX_BM_E</positionId>
    </positionIdList>
    </groupCustomBucket>
    </groupCustomBucketList>
    </positions>

    -----------------
    Here is the XSL file. I use loads of call-template calls, which I guess
    is the performance issue. The code below works, but it's really messy.
    And slow.
    I have two "functions", loop_position and loop_tag, that validate each
    tag type against the position ids.


    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:variable
    name="tagstoscan">Branch,Portfolio,Folder,CurrencyRegion,</xsl:variable>

    <xsl:template match="/">
    <xsl:element name="positions">
    <xsl:attribute name="nbofcolumns">
    <xsl:call-template name="count_nb_of_tags">
    <xsl:with-param name="tags"><xsl:value-of select="$tagstoscan"
    /></xsl:with-param>
    <xsl:with-param name="count">0</xsl:with-param>
    </xsl:call-template>
    </xsl:attribute>
    <!-- Find tags that are illegal -->
    <xsl:call-template name="loop">
    <xsl:with-param name="tags"><xsl:value-of select="$tagstoscan"
    /></xsl:with-param>
    </xsl:call-template>
    </xsl:element>
    </xsl:template>

    <!-- Count the number of tags we are processing -->
    <xsl:template name="count_nb_of_tags">
    <xsl:param name="tags" />
    <xsl:param name="tag" select="substring-before($tags, ',')" />
    <xsl:param name="count" />

    <xsl:if test="string-length($tag) = 0"><xsl:value-of select="$count"
    /> </xsl:if>

    <xsl:if test="string-length($tags) > 0">
    <xsl:call-template name="count_nb_of_tags">
    <xsl:with-param name="tags" select="substring-after($tags, ',')" />
    <xsl:with-param name="count" select="$count + 1" />
    </xsl:call-template>
    </xsl:if>
    </xsl:template>

    <!-- Loop all tags we are processing, parsing the xml. Check two
    directions: positions to tags, and reverse -->
    <xsl:template name="loop">
    <xsl:param name="tags" />
    <xsl:param name="tag" select="substring-before($tags, ',')" />

    <xsl:if test="string-length($tag) > 0">
    <xsl:element name="position">
    <xsl:attribute name="positionId"></xsl:attribute>
    <xsl:call-template name="loop_position">
    <xsl:with-param name="tags" select="$tag" />
    </xsl:call-template>
    <xsl:call-template name="loop_tag">
    <xsl:with-param name="tags" select="$tag" />
    </xsl:call-template>
    </xsl:element>
    </xsl:if>

    <xsl:if test="string-length($tags) > 0">
    <xsl:call-template name="loop">
    <xsl:with-param name="tags" select="substring-after($tags, ',')" />
    </xsl:call-template>
    </xsl:if>
    </xsl:template>

    <!-- Tag parsing -->
    <xsl:template name="loop_tag">
    <xsl:param name="tags" />
    <xsl:for-each select="positions/*/positionId">
    <xsl:call-template name="find_id_in_taglist">
    <xsl:with-param name="id" select="." />
    <xsl:with-param name="tag" select="$tags" />
    </xsl:call-template>
    </xsl:for-each>
    </xsl:template>

    <xsl:template name="find_id_in_taglist">
    <xsl:param name="id" />
    <xsl:param name="tag" />
    <xsl:if
    test="string-length(/positions/groupCustomBucketList/groupCustomBucket/customDimensionName[.
    = $tag]/../positionIdList/positionId[. = $id]) = 0">

    <xsl:attribute name="positionId"><xsl:value-of select="$id"
    /></xsl:attribute>
    <xsl:variable name="fixedid"><xsl:call-template
    name="remove_space"><xsl:with-param name="string" select="$tag"
    /></xsl:call-template></xsl:variable>
    <xsl:attribute name="{$fixedid}">1</xsl:attribute>
    </xsl:if>
    </xsl:template>

    <!-- Position parsing -->
    <xsl:template name="loop_position">
    <xsl:param name="tags" />
    <xsl:for-each
    select="/positions/groupCustomBucketList/groupCustomBucket/customDimensionName[.
    = $tags]/../positionIdList/positionId">
    <xsl:call-template name="find_id_in_positionlist">
    <xsl:with-param name="id" select="." />
    <xsl:with-param name="tag" select="$tags" />
    </xsl:call-template>
    </xsl:for-each>
    </xsl:template>

    <xsl:template name="find_id_in_positionlist">
    <xsl:param name="id" />
    <xsl:param name="tag" />
    <xsl:if test="string-length(/positions/*/positionId[. = $id]) = 0">
    <xsl:attribute name="positionId"><xsl:value-of select="$id"
    /></xsl:attribute>
    <xsl:variable name="fixedid"><xsl:call-template
    name="remove_space"><xsl:with-param name="string" select="$tag"
    /></xsl:call-template></xsl:variable>
    <xsl:attribute name="{$fixedid}">1</xsl:attribute>
    </xsl:if>
    </xsl:template>

    <!-- Remove spaces -->
    <xsl:template name="remove_space">
    <xsl:param name="string" />
    <xsl:choose>
    <xsl:when test="contains($string, ' ')">
    <xsl:call-template name="remove_space">
    <xsl:with-param name="string">
    <xsl:value-of select="substring-before($string, ' ')"
    /><xsl:value-of select="substring-after($string, ' ')" />
    </xsl:with-param>
    </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
    <xsl:value-of select="$string" />
    </xsl:otherwise>
    </xsl:choose>
    </xsl:template>

    <!-- Override default template rules -->

    <xsl:template match="*|/" mode="m">
    <!-- Do nothing. Override default rule -->
    </xsl:template>

    <xsl:template match="processing-instruction()|comment()" >
    <!-- Do nothing. Override default rule -->
    </xsl:template>

    <xsl:template match="text() | @*">
    <!-- Do nothing. Override default rule -->
    </xsl:template>

    </xsl:stylesheet>


    Regards,
    /Johan
    , Jun 19, 2006

  2. Convolving sets against each other is expensive. Try recasting the problem.

    For example: your second constraint is that the union of the two index
    lists is precisely equal to the list of entries, after duplicates are
    eliminated. That can be computed by collecting the sets, sorting them,
    ensuring no dupes exist, and then doing a comparison of the result. That
    may be faster (especially if you know a priori that some of these
    subsets are already sorted.)
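
One possible reading of this suggestion, sketched in Python rather than XSLT. The function names and sample ids are illustrative assumptions, not taken from the thread: collect both id lists, reduce each to a sorted list without duplicates, and compare the two results once instead of probing one list for every member of the other.

```python
def sorted_unique(ids):
    """Return the ids sorted, with duplicates removed."""
    return sorted(set(ids))


def same_ids(entry_ids, bucket_ids):
    """True when both collections hold exactly the same distinct ids."""
    return sorted_unique(entry_ids) == sorted_unique(bucket_ids)


# Hypothetical sample data, shaped like the positionId values in the post.
entries = ["FTSE INDEX_BM_E", "ESTOXX50 INDEX_BM_E"]
buckets = ["ESTOXX50 INDEX_BM_E", "FTSE INDEX_BM_E", "ESTOXX50 INDEX_BM_E"]
matches = same_ids(entries, buckets)
```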

    Establishing that the intersection of the two index sets is empty,
    similarly, might be run faster if you test it by establishing that the
    length of the sorted-unique union of the two is equal to the sum of the
    sorted-unique lengths of each index set.
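
The length comparison described here could look roughly like this in Python. It is a sketch only; `are_disjoint` is a made-up name, and Python sets stand in for the sorted-unique lists.

```python
def are_disjoint(first_ids, second_ids):
    """True when the two id collections share no member.

    The union of two sets has as many members as both sets combined
    only if nothing was counted twice, i.e. the intersection is empty.
    """
    first, second = set(first_ids), set(second_ids)
    return len(first | second) == len(first) + len(second)
```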

    But I suspect the fastest way to do this particular set of tests would
    be to drop down to a lower level and handle it in SAX or DOM, building
    hashtables or similar content-addressable retrieval mechanisms. The fact
    that XSLT is a complete programming language for manipulating XML
    doesn't necessarily mean it's the optimal one for all tasks.
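
As a rough illustration of the hashtable idea, here is a minimal Python sketch using the standard library's DOM-like ElementTree API. The element names follow the XML in the first post; the `validate` function, the shortened ids P1 to P3, and the wording of the problem labels are assumptions made for the example. Each id list is read once into a set, so every membership check is a hash lookup rather than a rescan of the document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical input, same layout as the file in the first post.
SAMPLE = """<positions>
  <equity><positionId>P1</positionId></equity>
  <equity><positionId>P2</positionId></equity>
  <groupCustomBucketList>
    <groupCustomBucket>
      <customDimensionName>Branch</customDimensionName>
      <customBucketValue>BENCHMARK</customBucketValue>
      <positionIdList>
        <positionId>P1</positionId>
        <positionId>P3</positionId>
      </positionIdList>
    </groupCustomBucket>
  </groupCustomBucketList>
</positions>"""


def validate(xml_text, dimensions):
    """Return (dimension, positionId, problem) tuples for every mismatch."""
    root = ET.fromstring(xml_text)

    # Ids listed directly under <positions>, e.g. in <equity> elements.
    position_ids = {
        (elem.text or "").strip() for elem in root.findall("./*/positionId")
    }

    # One set of ids per customDimensionName, merged across buckets.
    ids_by_dimension = defaultdict(set)
    for bucket in root.findall("./groupCustomBucketList/groupCustomBucket"):
        name = bucket.findtext("customDimensionName")
        for pid in bucket.findall("./positionIdList/positionId"):
            ids_by_dimension[name].add((pid.text or "").strip())

    problems = []
    for dim in dimensions:
        tagged = ids_by_dimension.get(dim, set())
        # Positions that have no entry under this dimension.
        for pid in sorted(position_ids - tagged):
            problems.append((dim, pid, "missing from dimension"))
        # Tagged ids that are not in the position list.
        for pid in sorted(tagged - position_ids):
            problems.append((dim, pid, "unknown position"))
    return problems


problems = validate(SAMPLE, ["Branch", "Folder"])
```

For the sample above, P2 is missing from Branch, P3 is tagged under Branch without being a known position, and both P1 and P2 are missing from Folder, since no Folder bucket exists.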

    --
    () ASCII Ribbon Campaign | Joe Kesselman
    /\ Stamp out HTML e-mail! | System architexture and kinetic poetry
    Joe Kesselman, Jun 22, 2006
