Q: using XSLT to "congeal" adjacent sections

Discussion in 'XML' started by Malcolm Dew-Jones, Feb 27, 2008.

  1. (First - my terminology may well be bogus, I hope you understand me
    though.)

    I have some MS Word documents that will be used as input for a
    different purpose in a database. To ease this process I want to take
    certain adjacent sections of the documents and "congeal" them into single
    sections.

    The format of Word's XML output is easy to follow, but I haven't
    used XSLT for a while (and not much even then), so I am looking for
    examples or suggestions of how to do the following.

    For example, I have the following two "sections":


    <w:r wsp:rsidR="00A5105E" wsp:rsidRPr="00EC0118">
    <w:rPr>
    <w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
    <wx:font wx:val="Arial" />
    <w:highlight w:val="yellow" />
    </w:rPr>
    <w:t>FIRST PART </w:t>
    </w:r>
    <w:r wsp:rsidR="00A61057">
    <w:rPr>
    <w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
    <wx:font wx:val="Arial" />
    <w:highlight w:val="yellow" />
    </w:rPr>
    <w:t>AND THE SECOND PART</w:t>
    </w:r>

    I want to end up with a single section that has the FIRST PART AND THE
    SECOND PART combined. I don't think I need to care about the id numbers,
    but even if I do, I will worry about that later. The result would then look
    like this:

    <w:r wsp:rsidR="*"> (the value of * doesn't matter to me yet)
    <w:rPr>
    <w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
    <wx:font wx:val="Arial" />
    <w:highlight w:val="yellow" />
    </w:rPr>
    <w:t>FIRST PART AND THE SECOND PART</w:t>
    </w:r>

    What makes the sections the same is that their <w:rPr>...</w:rPr>
    elements are identical in the adjacent sections, and I only care about
    sections that have the exact formatting shown above (i.e. w:ascii="Arial"
    etc.), so a transform could have those values hard-coded if that makes it
    easier.

    Anyway, as I said, examples or suggestions for setting up an XSLT
    stylesheet to do this would be appreciated.

    Thanks
     
    Malcolm Dew-Jones, Feb 27, 2008
    #1

  2. Off-the-cuff answer:


    The usual approach for this sort of thing is to write two templates to
    handle the two distinct cases.

    Start by figuring out a match pattern that selects all the elements
    you're interested in.

    Modify that to create two match patterns: one that matches the first
    such instance (one with no immediately preceding matching sibling) and
    one that matches all the others. (Or the last and all-the-rest; either
    way.)

    Make a template fired by the first pattern that gathers the contents of
    it and its adjacent matching siblings.

    Make a template fired by the second pattern which discards the elements
    which match it, since they were handled by the other template.

    Plug those two into a stylesheet which handles the rest of the document,
    typically the identity transformation.

    Done.
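
    As a rough illustration of those steps, here is a minimal XSLT 1.0
    sketch, not a tested solution for real Word files. It assumes the w:
    prefix is bound to the Word 2003 WordprocessingML namespace (check the
    root element of your own file), hard-codes the Arial/yellow test from the
    question, assumes one <w:t> per run, and treats runs as adjacent only
    when no other element sits between them. The mode name "gather" is just
    a label chosen for this example.

    ```xml
    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
      <!-- Assumption: the w: namespace URI above matches your document. -->

      <!-- Rest of the document: identity transformation. -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- First pattern: a matching run whose nearest preceding sibling
           element is NOT a matching run. It starts a group, so it emits
           one run holding the text of the whole group. -->
      <xsl:template priority="1"
          match="w:r[w:rPr[w:rFonts/@w:ascii='Arial'][w:highlight/@w:val='yellow']]
                    [not(preceding-sibling::*[1][self::w:r[
                         w:rPr[w:rFonts/@w:ascii='Arial'][w:highlight/@w:val='yellow']]])]">
        <xsl:copy>
          <!-- Keep the attributes and formatting of the first run. -->
          <xsl:apply-templates select="@*|w:rPr"/>
          <w:t>
            <xsl:apply-templates select="." mode="gather"/>
          </w:t>
        </xsl:copy>
      </xsl:template>

      <!-- Second pattern: every other matching run. Its text was already
           gathered by the template above, so it is discarded here. -->
      <xsl:template
          match="w:r[w:rPr[w:rFonts/@w:ascii='Arial'][w:highlight/@w:val='yellow']]"/>

      <!-- Gathering: output this run's text, then continue with the next
           sibling element only if it is also a matching run. -->
      <xsl:template match="w:r" mode="gather">
        <xsl:value-of select="w:t"/>
        <xsl:apply-templates mode="gather"
            select="following-sibling::*[1][self::w:r[
                    w:rPr[w:rFonts/@w:ascii='Arial'][w:highlight/@w:val='yellow']]]"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    The explicit priority on the first template is what lets the second one
    use the plain "all matching runs" pattern without the two conflicting.
    Walking one sibling at a time keeps separate groups separate, which a
    single following-sibling::w:r selection would not.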


    The XSLT FAQ website should have some examples. I suspect that, given the
    complexity of what you're matching on, you'll want to take advantage of
    keys.

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
     
    Joseph Kesselman, Feb 27, 2008
    #2