Flattening out an XML document

Discussion in 'XML' started by David Gersic, May 24, 2005.

    I'm working with an HR system and trying to deal with an XML document
    that contains a bunch of personal data (unique to the person) and one or
    more sets of job data (a person can be hired more than once), all
    expressed in a single XML document. I want to flatten out the multiple
    job data parts by building a much larger XML document. (This may not
    make sense by itself, but it's part of a larger project.)

    Dummying up an example to illustrate, a hire document could look like:

    <?xml version="1.0" encoding="UTF-8"?>
    <nds dtdversion="2.0">
    <input>
    <modify class-name="NIU_HR_EDIR_PERSON" event-id="PSDEVL+148" src-dn="101000">
    <association>NIU_HR_EDIR_PERSON/101000</association>
    <modify-attr attr-name="ASSOC_ID">
    <remove-all-values/>
    <add-value>
    <value>101000</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="RANK">
    <remove-all-values/>
    <add-value>
    <value>ASCP</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="FIRST_NAME">
    <remove-all-values/>
    <add-value>
    <value>Jane</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="MIDDLE_NAME">
    <remove-all-values/>
    <add-value>
    <value>S</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="LAST_NAME">
    <remove-all-values/>
    <add-value>
    <value>Doe</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="BIRTH_DATE">
    <remove-all-values/>
    <add-value>
    <value>01/14/1974</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="NATIONAL_ID">
    <remove-all-values/>
    <add-value>
    <value>999887777</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="COUNTRY">
    <remove-all-values/>
    <add-value>
    <value>USA</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="ADDRESS1">
    <remove-all-values/>
    <add-value>
    <value>1060 West Addison</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="CITY">
    <remove-all-values/>
    <add-value>
    <value>Chicago</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="STATE">
    <remove-all-values/>
    <add-value>
    <value>IL</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="POSTAL">
    <remove-all-values/>
    <add-value>
    <value>60613</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="MAIL_DROP">
    <remove-all-values/>
    </modify-attr>
    <modify-attr attr-name="NIU_HR_EDIR_JOB">
    <add-value>
    <value>
    <component name="EMPL_RCD">0</component>
    <component name="EFFDT">01/01/2005</component>
    <component name="JOB_INDICATOR">P</component>
    <component name="EMPL_STATUS">A</component>
    <component name="DEPT_ID">UB00000</component>
    <component name="DEPT_LONG_DESCR">Info Service</component>
    <component name="JOBCODE">1330</component>
    <component name="POS_DESCR">Director, Enterprise Info Sys</component>
    <component name="POSITION_NBR">00004958</component>
    <component name="LOCATION">SP 310</component>
    <component name="BUSINESS_UNIT">SPS03</component>
    <component name="SUPERVISOR_ID"/>
    <component name="REPORTS_TO"/>
    <component name="REG_TEMP">R</component>
    <component name="FULL_PART_TIME">P</component>
    <component name="EMPL_TYPE">S</component>
    </value>
    </add-value>
    <add-value>
    <value>
    <component name="EMPL_RCD">1</component>
    <component name="EFFDT">01/01/2005</component>
    <component name="JOB_INDICATOR">P</component>
    <component name="EMPL_STATUS">A</component>
    <component name="DEPT_ID">SL00000</component>
    <component name="DEPT_LONG_DESCR">Human Resource Services</component>
    <component name="JOBCODE">1330</component>
    <component name="POS_DESCR">Director, Enterprise Info Sys</component>
    <component name="POSITION_NBR">00004554</component>
    <component name="LOCATION">HRS LOBBY</component>
    <component name="BUSINESS_UNIT">SPS03</component>
    <component name="SUPERVISOR_ID"/>
    <component name="REPORTS_TO"/>
    <component name="REG_TEMP">R</component>
    <component name="FULL_PART_TIME">P</component>
    <component name="EMPL_TYPE">S</component>
    </value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="TransactionValue">
    <add-value>
    <value>Component: PERSONAL_DATA Page: PERSONAL_DATA1B Mode: C</value>
    </add-value>
    </modify-attr>
    </modify>
    </input>
    </nds>

    The interesting part here is that the personal data (first name,
    last name, birth date, etc.) is listed once. For this person, there
    are two job entries, each an <add-value> under the NIU_HR_EDIR_JOB
    attribute, made unique by its EMPL_RCD component (0, 1, etc.).

    What I want to do is match at the <modify> level and build one <modify>
    section for each NIU_HR_EDIR_JOB entry, containing a fully described
    personal+job entry, using the personal data from the source document
    and merging in each job entry, so that I'd have something like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <nds dtdversion="2.0">
    <input>
    <modify class-name="NIU_HR_EDIR_PERSON" event-id="PSDEVL+148" src-dn="101000">
    <association>NIU_HR_EDIR_PERSON/101000</association>
    <modify-attr attr-name="ASSOC_ID">
    <remove-all-values/>
    <add-value>
    <value>101000</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="RANK">
    <remove-all-values/>
    <add-value>
    <value>ASCP</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="FIRST_NAME">
    <remove-all-values/>
    <add-value>
    <value>Jane</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="MIDDLE_NAME">
    <remove-all-values/>
    <add-value>
    <value>S</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="LAST_NAME">
    <remove-all-values/>
    <add-value>
    <value>Doe</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="BIRTH_DATE">
    <remove-all-values/>
    <add-value>
    <value>01/14/1974</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="NATIONAL_ID">
    <remove-all-values/>
    <add-value>
    <value>999887777</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="COUNTRY">
    <remove-all-values/>
    <add-value>
    <value>USA</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="ADDRESS1">
    <remove-all-values/>
    <add-value>
    <value>1060 West Addison</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="CITY">
    <remove-all-values/>
    <add-value>
    <value>Chicago</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="STATE">
    <remove-all-values/>
    <add-value>
    <value>IL</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="POSTAL">
    <remove-all-values/>
    <add-value>
    <value>60613</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="MAIL_DROP">
    <remove-all-values/>
    </modify-attr>
    <modify-attr attr-name="NIU_HR_EDIR_JOB">
    <add-value>
    <value>
    <component name="EMPL_RCD">0</component>
    <component name="EFFDT">01/01/2005</component>
    <component name="JOB_INDICATOR">P</component>
    <component name="EMPL_STATUS">A</component>
    <component name="DEPT_ID">UB00000</component>
    <component name="DEPT_LONG_DESCR">Info Service</component>
    <component name="JOBCODE">1330</component>
    <component name="POS_DESCR">Director, Enterprise Info Sys</component>
    <component name="POSITION_NBR">00004958</component>
    <component name="LOCATION">SP 310</component>
    <component name="BUSINESS_UNIT">SPS03</component>
    <component name="SUPERVISOR_ID"/>
    <component name="REPORTS_TO"/>
    <component name="REG_TEMP">R</component>
    <component name="FULL_PART_TIME">P</component>
    <component name="EMPL_TYPE">S</component>
    </value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="TransactionValue">
    <add-value>
    <value>Component: PERSONAL_DATA Page: PERSONAL_DATA1B Mode: C</value>
    </add-value>
    </modify-attr>
    </modify>

    <modify class-name="NIU_HR_EDIR_PERSON" event-id="PSDEVL+148" src-dn="101000">
    <association>NIU_HR_EDIR_PERSON/101000</association>
    <modify-attr attr-name="ASSOC_ID">
    <remove-all-values/>
    <add-value>
    <value>101000</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="RANK">
    <remove-all-values/>
    <add-value>
    <value>ASCP</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="FIRST_NAME">
    <remove-all-values/>
    <add-value>
    <value>Jane</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="MIDDLE_NAME">
    <remove-all-values/>
    <add-value>
    <value>S</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="LAST_NAME">
    <remove-all-values/>
    <add-value>
    <value>Doe</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="BIRTH_DATE">
    <remove-all-values/>
    <add-value>
    <value>01/14/1974</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="NATIONAL_ID">
    <remove-all-values/>
    <add-value>
    <value>999887777</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="COUNTRY">
    <remove-all-values/>
    <add-value>
    <value>USA</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="ADDRESS1">
    <remove-all-values/>
    <add-value>
    <value>1060 West Addison</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="CITY">
    <remove-all-values/>
    <add-value>
    <value>Chicago</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="STATE">
    <remove-all-values/>
    <add-value>
    <value>IL</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="POSTAL">
    <remove-all-values/>
    <add-value>
    <value>60613</value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="MAIL_DROP">
    <remove-all-values/>
    </modify-attr>
    <modify-attr attr-name="NIU_HR_EDIR_JOB">
    <add-value>
    <value>
    <component name="EMPL_RCD">1</component>
    <component name="EFFDT">01/01/2005</component>
    <component name="JOB_INDICATOR">P</component>
    <component name="EMPL_STATUS">A</component>
    <component name="DEPT_ID">SL00000</component>
    <component name="DEPT_LONG_DESCR">Human Resource Services</component>
    <component name="JOBCODE">1330</component>
    <component name="POS_DESCR">Director, Enterprise Info Sys</component>
    <component name="POSITION_NBR">00004554</component>
    <component name="LOCATION">HRS LOBBY</component>
    <component name="BUSINESS_UNIT">SPS03</component>
    <component name="SUPERVISOR_ID"/>
    <component name="REPORTS_TO"/>
    <component name="REG_TEMP">R</component>
    <component name="FULL_PART_TIME">P</component>
    <component name="EMPL_TYPE">S</component>
    </value>
    </add-value>
    </modify-attr>
    <modify-attr attr-name="TransactionValue">
    <add-value>
    <value>Component: PERSONAL_DATA Page: PERSONAL_DATA1B Mode: C</value>
    </add-value>
    </modify-attr>
    </modify>
    </input>
    </nds>

    I believe my best bet for handling one or more NIU_HR_EDIR_JOB
    sections is <xsl:for-each>, but since the NIU_HR_EDIR_JOB info
    is buried within the <modify>, I'm having trouble copying the rest
    of the document while keeping only one of the NIU_HR_EDIR_JOB
    nodes in each copy. The closest I've come so far copies most
    things (though I'm losing the <modify> somewhere) but fails to strip
    out the extra NIU_HR_EDIR_JOB nodes as it does so. Several hours of
    Googling have not turned up a solution.

    Right now, my XSLT looks like:

    <xsl:template match="node()|@*">
    <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
    </xsl:template>

    <xsl:template match="modify">
    <xsl:for-each select="modify-attr[@attr-name='NIU_HR_EDIR_JOB']/add-value">
    <xsl:copy>
    <!-- copy through the element attributes -->
    <xsl:apply-templates select="/nds/input/modify/node()"/>
    <!-- copy through child elements except for this one -->
    <!-- (doesn't work...) <xsl:apply-templates select="node()[not(self::modify-attr[@attr-name='NIU_HR_EDIR_JOB']/add-value)]"/> -->
    </xsl:copy>
    </xsl:for-each>
    </xsl:template>

    I know that this is close, but I can't quite see it. I also know that
    I've been looking at it for too long, and I'm probably missing something
    obvious. I'm hoping that somebody can point out my error here. Solutions,
    suggestions, or ideas appreciated. Please post followups. The email
    address is valid, but I'd rather see followups posted here if possible.
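
    For reference, here is a minimal sketch of one way the flattening
    described above could be approached. It assumes XSLT 1.0 and a source
    document with no namespaces, as in the sample. Inside <xsl:for-each>
    the context node is the <add-value>, so <xsl:copy> there copies the
    <add-value> rather than the <modify>, which would explain the missing
    <modify>. The sketch keeps the <modify> in a variable and rebuilds it
    once per job entry. The variable names $person and $job are
    illustrative only, and this has not been tested against a real driver.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies anything no other template matches
       (here: <nds>, <input>, and any <modify> without job entries). -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A <modify> that carries at least one NIU_HR_EDIR_JOB value. -->
  <xsl:template
      match="modify[modify-attr[@attr-name='NIU_HR_EDIR_JOB']/add-value]">
    <!-- Remember the <modify>; for-each below changes the context node. -->
    <xsl:variable name="person" select="."/>

    <!-- Emit one complete <modify> per job entry. -->
    <xsl:for-each
        select="modify-attr[@attr-name='NIU_HR_EDIR_JOB']/add-value">
      <xsl:variable name="job" select="."/>
      <modify>
        <!-- class-name, event-id, src-dn, ... -->
        <xsl:copy-of select="$person/@*"/>

        <xsl:for-each select="$person/node()">
          <xsl:choose>
            <!-- The job attribute: keep only the current job entry. -->
            <xsl:when
                test="self::modify-attr[@attr-name='NIU_HR_EDIR_JOB']">
              <modify-attr>
                <xsl:copy-of select="@*"/>
                <!-- Keep children other than add-value, if any. -->
                <xsl:copy-of select="*[not(self::add-value)]"/>
                <xsl:copy-of select="$job"/>
              </modify-attr>
            </xsl:when>
            <!-- Everything else (the personal data) is copied as-is. -->
            <xsl:otherwise>
              <xsl:copy-of select="."/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each>
      </modify>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

    With the sample document, this should produce two <modify> elements
    inside <input>, each with the full personal data and exactly one
    NIU_HR_EDIR_JOB entry. Whether repeating the same event-id on both
    copies is acceptable downstream is an open question.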


    --
    | David Gersic www.zaccaria-pinball.com dgersic_@_niu.edu |
    | OSI Layers - People don't need to see Paula Abdul. |
    | Email address is munged to avoid spammers. Remove the underscores. |
     
    David Gersic, May 24, 2005