Flattening out an XML document

David Gersic

I'm working with an HR system and trying to deal with an XML document
that contains a bunch of personal data (unique to the person) and one or
more sets of job data (a person can be hired more than once), all
expressed in a single XML document. I want to flatten out the multiple
job data parts by building a much larger XML document. (This may not
make sense by itself, but it's part of a larger project.)

Here is a dummied-up example to illustrate what a hire document could look like:

<?xml version="1.0" encoding="UTF-8"?>
<nds dtdversion="2.0">
<input>
<modify class-name="NIU_HR_EDIR_PERSON" event-id="PSDEVL+148" src-dn="101000">
<association>NIU_HR_EDIR_PERSON/101000</association>
<modify-attr attr-name="ASSOC_ID">
<remove-all-values/>
<add-value>
<value>101000</value>
</add-value>
</modify-attr>
<modify-attr attr-name="RANK">
<remove-all-values/>
<add-value>
<value>ASCP</value>
</add-value>
</modify-attr>
<modify-attr attr-name="FIRST_NAME">
<remove-all-values/>
<add-value>
<value>Jane</value>
</add-value>
</modify-attr>
<modify-attr attr-name="MIDDLE_NAME">
<remove-all-values/>
<add-value>
<value>S</value>
</add-value>
</modify-attr>
<modify-attr attr-name="LAST_NAME">
<remove-all-values/>
<add-value>
<value>Doe</value>
</add-value>
</modify-attr>
<modify-attr attr-name="BIRTH_DATE">
<remove-all-values/>
<add-value>
<value>01/14/1974</value>
</add-value>
</modify-attr>
<modify-attr attr-name="NATIONAL_ID">
<remove-all-values/>
<add-value>
<value>999887777</value>
</add-value>
</modify-attr>
<modify-attr attr-name="COUNTRY">
<remove-all-values/>
<add-value>
<value>USA</value>
</add-value>
</modify-attr>
<modify-attr attr-name="ADDRESS1">
<remove-all-values/>
<add-value>
<value>1060 West Addison</value>
</add-value>
</modify-attr>
<modify-attr attr-name="CITY">
<remove-all-values/>
<add-value>
<value>Chicago</value>
</add-value>
</modify-attr>
<modify-attr attr-name="STATE">
<remove-all-values/>
<add-value>
<value>IL</value>
</add-value>
</modify-attr>
<modify-attr attr-name="POSTAL">
<remove-all-values/>
<add-value>
<value>60613</value>
</add-value>
</modify-attr>
<modify-attr attr-name="MAIL_DROP">
<remove-all-values/>
</modify-attr>
<modify-attr attr-name="NIU_HR_EDIR_JOB">
<add-value>
<value>
<component name="EMPL_RCD">0</component>
<component name="EFFDT">01/01/2005</component>
<component name="JOB_INDICATOR">P</component>
<component name="EMPL_STATUS">A</component>
<component name="DEPT_ID">UB00000</component>
<component name="DEPT_LONG_DESCR">Info Service</component>
<component name="JOBCODE">1330</component>
<component name="POS_DESCR">Director, Enterprise Info Sys</component>
<component name="POSITION_NBR">00004958</component>
<component name="LOCATION">SP 310</component>
<component name="BUSINESS_UNIT">SPS03</component>
<component name="SUPERVISOR_ID"/>
<component name="REPORTS_TO"/>
<component name="REG_TEMP">R</component>
<component name="FULL_PART_TIME">P</component>
<component name="EMPL_TYPE">S</component>
</value>
</add-value>
<add-value>
<value>
<component name="EMPL_RCD">1</component>
<component name="EFFDT">01/01/2005</component>
<component name="JOB_INDICATOR">P</component>
<component name="EMPL_STATUS">A</component>
<component name="DEPT_ID">SL00000</component>
<component name="DEPT_LONG_DESCR">Human Resource Services</component>
<component name="JOBCODE">1330</component>
<component name="POS_DESCR">Director, Enterprise Info Sys</component>
<component name="POSITION_NBR">00004554</component>
<component name="LOCATION">HRS LOBBY</component>
<component name="BUSINESS_UNIT">SPS03</component>
<component name="SUPERVISOR_ID"/>
<component name="REPORTS_TO"/>
<component name="REG_TEMP">R</component>
<component name="FULL_PART_TIME">P</component>
<component name="EMPL_TYPE">S</component>
</value>
</add-value>
</modify-attr>
<modify-attr attr-name="TransactionValue">
<add-value>
<value>Component: PERSONAL_DATA Page: PERSONAL_DATA1B Mode: C</value>
</add-value>
</modify-attr>
</modify>
</input>
</nds>

The interesting part here is that the document lists a bunch of personal
data (first name, last name, birth date, etc.) once. For this person,
there are two job entries, each an <add-value> under the NIU_HR_EDIR_JOB
attribute, made unique by its EMPL_RCD component (0, 1, etc.).

What I want to do is match at the <modify> level and build one <modify>
section for each of the NIU_HR_EDIR_JOB entries. Each section should be a
fully described personal+job entry, using the personal data from the source
document and merging in a single job entry, so that I'd have something like this:

<?xml version="1.0" encoding="UTF-8"?>
<nds dtdversion="2.0">
<input>
<modify class-name="NIU_HR_EDIR_PERSON" event-id="PSDEVL+148" src-dn="101000">
<association>NIU_HR_EDIR_PERSON/101000</association>
<modify-attr attr-name="ASSOC_ID">
<remove-all-values/>
<add-value>
<value>101000</value>
</add-value>
</modify-attr>
<modify-attr attr-name="RANK">
<remove-all-values/>
<add-value>
<value>ASCP</value>
</add-value>
</modify-attr>
<modify-attr attr-name="FIRST_NAME">
<remove-all-values/>
<add-value>
<value>Jane</value>
</add-value>
</modify-attr>
<modify-attr attr-name="MIDDLE_NAME">
<remove-all-values/>
<add-value>
<value>S</value>
</add-value>
</modify-attr>
<modify-attr attr-name="LAST_NAME">
<remove-all-values/>
<add-value>
<value>Doe</value>
</add-value>
</modify-attr>
<modify-attr attr-name="BIRTH_DATE">
<remove-all-values/>
<add-value>
<value>01/14/1974</value>
</add-value>
</modify-attr>
<modify-attr attr-name="NATIONAL_ID">
<remove-all-values/>
<add-value>
<value>999887777</value>
</add-value>
</modify-attr>
<modify-attr attr-name="COUNTRY">
<remove-all-values/>
<add-value>
<value>USA</value>
</add-value>
</modify-attr>
<modify-attr attr-name="ADDRESS1">
<remove-all-values/>
<add-value>
<value>1060 West Addison</value>
</add-value>
</modify-attr>
<modify-attr attr-name="CITY">
<remove-all-values/>
<add-value>
<value>Chicago</value>
</add-value>
</modify-attr>
<modify-attr attr-name="STATE">
<remove-all-values/>
<add-value>
<value>IL</value>
</add-value>
</modify-attr>
<modify-attr attr-name="POSTAL">
<remove-all-values/>
<add-value>
<value>60613</value>
</add-value>
</modify-attr>
<modify-attr attr-name="MAIL_DROP">
<remove-all-values/>
</modify-attr>
<modify-attr attr-name="NIU_HR_EDIR_JOB">
<add-value>
<value>
<component name="EMPL_RCD">0</component>
<component name="EFFDT">01/01/2005</component>
<component name="JOB_INDICATOR">P</component>
<component name="EMPL_STATUS">A</component>
<component name="DEPT_ID">UB00000</component>
<component name="DEPT_LONG_DESCR">Info Service</component>
<component name="JOBCODE">1330</component>
<component name="POS_DESCR">Director, Enterprise Info Sys</component>
<component name="POSITION_NBR">00004958</component>
<component name="LOCATION">SP 310</component>
<component name="BUSINESS_UNIT">SPS03</component>
<component name="SUPERVISOR_ID"/>
<component name="REPORTS_TO"/>
<component name="REG_TEMP">R</component>
<component name="FULL_PART_TIME">P</component>
<component name="EMPL_TYPE">S</component>
</value>
</add-value>
</modify-attr>
<modify-attr attr-name="TransactionValue">
<add-value>
<value>Component: PERSONAL_DATA Page: PERSONAL_DATA1B Mode: C</value>
</add-value>
</modify-attr>
</modify>

<modify class-name="NIU_HR_EDIR_PERSON" event-id="PSDEVL+148" src-dn="101000">
<association>NIU_HR_EDIR_PERSON/101000</association>
<modify-attr attr-name="ASSOC_ID">
<remove-all-values/>
<add-value>
<value>101000</value>
</add-value>
</modify-attr>
<modify-attr attr-name="RANK">
<remove-all-values/>
<add-value>
<value>ASCP</value>
</add-value>
</modify-attr>
<modify-attr attr-name="FIRST_NAME">
<remove-all-values/>
<add-value>
<value>Jane</value>
</add-value>
</modify-attr>
<modify-attr attr-name="MIDDLE_NAME">
<remove-all-values/>
<add-value>
<value>S</value>
</add-value>
</modify-attr>
<modify-attr attr-name="LAST_NAME">
<remove-all-values/>
<add-value>
<value>Doe</value>
</add-value>
</modify-attr>
<modify-attr attr-name="BIRTH_DATE">
<remove-all-values/>
<add-value>
<value>01/14/1974</value>
</add-value>
</modify-attr>
<modify-attr attr-name="NATIONAL_ID">
<remove-all-values/>
<add-value>
<value>999887777</value>
</add-value>
</modify-attr>
<modify-attr attr-name="COUNTRY">
<remove-all-values/>
<add-value>
<value>USA</value>
</add-value>
</modify-attr>
<modify-attr attr-name="ADDRESS1">
<remove-all-values/>
<add-value>
<value>1060 West Addison</value>
</add-value>
</modify-attr>
<modify-attr attr-name="CITY">
<remove-all-values/>
<add-value>
<value>Chicago</value>
</add-value>
</modify-attr>
<modify-attr attr-name="STATE">
<remove-all-values/>
<add-value>
<value>IL</value>
</add-value>
</modify-attr>
<modify-attr attr-name="POSTAL">
<remove-all-values/>
<add-value>
<value>60613</value>
</add-value>
</modify-attr>
<modify-attr attr-name="MAIL_DROP">
<remove-all-values/>
</modify-attr>
<modify-attr attr-name="NIU_HR_EDIR_JOB">
<add-value>
<value>
<component name="EMPL_RCD">1</component>
<component name="EFFDT">01/01/2005</component>
<component name="JOB_INDICATOR">P</component>
<component name="EMPL_STATUS">A</component>
<component name="DEPT_ID">SL00000</component>
<component name="DEPT_LONG_DESCR">Human Resource Services</component>
<component name="JOBCODE">1330</component>
<component name="POS_DESCR">Director, Enterprise Info Sys</component>
<component name="POSITION_NBR">00004554</component>
<component name="LOCATION">HRS LOBBY</component>
<component name="BUSINESS_UNIT">SPS03</component>
<component name="SUPERVISOR_ID"/>
<component name="REPORTS_TO"/>
<component name="REG_TEMP">R</component>
<component name="FULL_PART_TIME">P</component>
<component name="EMPL_TYPE">S</component>
</value>
</add-value>
</modify-attr>
<modify-attr attr-name="TransactionValue">
<add-value>
<value>Component: PERSONAL_DATA Page: PERSONAL_DATA1B Mode: C</value>
</add-value>
</modify-attr>
</modify>
</input>
</nds>

I believe that my best bet for handling one or more NIU_HR_EDIR_JOB
sections is using <xsl:for-each>, but since the NIU_HR_EDIR_JOB info
is buried within the <modify>, I'm having trouble getting the rest
of the document while getting to only one of the NIU_HR_EDIR_JOB
nodes at a time. The closest I've come so far makes copies of most
things (I'm losing the <modify> somewhere), but fails to strip
out the extra NIU_HR_EDIR_JOB nodes as it does so. Several hours of
Googling have not turned up a solution.

Right now, my XSLT looks like:

<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="modify">
  <xsl:for-each select="modify-attr[@attr-name='NIU_HR_EDIR_JOB']/add-value">
    <xsl:copy>
      <!-- copy through the element attributes -->
      <xsl:apply-templates select="/nds/input/modify/node()"/>
      <!-- copy through child elements except for this one -->
      <!-- (doesn't work...) <xsl:apply-templates select="node()[not(self::modify-attr[@attr-name='NIU_HR_EDIR_JOB']/add-value)]"/> -->
    </xsl:copy>
  </xsl:for-each>
</xsl:template>

I know that this is close, but I can't quite see it. I also know that
I've been looking at it for too long, and I'm probably missing something
obvious. I'm hoping that somebody can point out my error here. Solutions,
suggestions, or ideas appreciated. Please post followups. The email
address is valid, but I'd rather see followups posted here if possible.
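
For reference, here is one possible way to get the split described above, as a minimal XSLT 1.0 sketch rather than a tested solution for this driver. Inside the <xsl:for-each>, the context node is the job <add-value>, so <xsl:copy> copies that element rather than the <modify>, which would explain the lost <modify>. The sketch remembers the <modify> in a variable, writes a new <modify> per job entry, and passes the current job down as a parameter so that the NIU_HR_EDIR_JOB attribute is rebuilt with only that one entry. The names $person and $job are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy anything not handled more specifically. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Emit one <modify> for each job <add-value>. -->
  <xsl:template match="modify">
    <!-- $person (hypothetical name): the <modify> being split -->
    <xsl:variable name="person" select="."/>
    <xsl:for-each select="modify-attr[@attr-name='NIU_HR_EDIR_JOB']/add-value">
      <!-- $job (hypothetical name): the single job entry for this copy -->
      <xsl:variable name="job" select="."/>
      <modify>
        <!-- class-name, event-id, src-dn -->
        <xsl:copy-of select="$person/@*"/>
        <!-- all children of the original <modify>, with the job passed along -->
        <xsl:apply-templates select="$person/node()">
          <xsl:with-param name="job" select="$job"/>
        </xsl:apply-templates>
      </modify>
    </xsl:for-each>
  </xsl:template>

  <!-- Rebuild the job attribute so it holds only the job passed in. -->
  <xsl:template match="modify-attr[@attr-name='NIU_HR_EDIR_JOB']">
    <xsl:param name="job"/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:copy-of select="$job"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

One caveat with this approach: a <modify> that carries no NIU_HR_EDIR_JOB entries at all produces no output, so that case would need its own handling if it can occur.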
 
