merge two xml files based on common key

Luke Airig

I have two XML files that I need to merge on their common field
(date_time). The merged output file needs to contain the date_time field
and all fields from both input files. I am using the Saxon XSLT
processor. Can someone please help me get this to work? The two
XML input files follow, as well as my attempt at the merge stylesheet
(merge_on_date_time.xsl):

lrv_1_transaction_cid_1.xml:
----------------------------
<?xml version="1.0"?>
<root>
<record>
<date_time> 2003/12/10.16:08 </date_time>
<driver_id> TRANSIT_VEHICLE_ID1 </driver_id>
<vehicle_id> FARE_TYPE_CD1 </vehicle_id>
<duty_shift_id> DUTY_SHIFT_ID_1 </duty_shift_id>
<route_id> ROUTE_ID_1 </route_id>
<cid_terminal_id> CID_TERMINAL_ID </cid_terminal_id>
<tag_id> TAG_ID </tag_id>
</record>
</root>

lrv_1_gps_cid.xml:
------------------
<?xml version="1.0"?>
<root>
<record>
<stop_location_id> STOP_LOCATION_ID1 </stop_location_id>
<latitude> 39.814658 </latitude>
<longitude> -105.183682 </longitude>
<date_time> 2003/12/10.16:08 </date_time>
<vehicle_id> TRANSIT_VEHICLE_ID1 </vehicle_id>
<fare_type_cd> FARE_TYPE_CD1 </fare_type_cd>
<blacklist_cd> V </blacklist_cd>
</record>
</root>

merge_on_date_time.xsl:
-----------------------
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"/>

<!-- load the merge file -->
<xsl:variable name="transactions"
select="document('lrv_1_gps_cid.xml')"/>
<xsl:template match="/">
<root>
<xsl:for-each select="root/record">
<!-- cache the key: date_time -->
<xsl:variable name="date_time">
<xsl:value-of select="@date_time"/></xsl:variable>
<!-- copy the child nodes -->
<record>
<xsl:copy-of select="child::*"/>
<!-- copy the children of the matching record node from the
merge file -->
<xsl:copy-of
select="$transactions/root/record[@date_time=$date_time]/child::*"/>
</record>
</xsl:for-each>
</root>
</xsl:template>
</xsl:stylesheet>

TIA
 
Patrick TJ McPhee

% (date_time). The merged output file needs to have the date_time field
% and all fields from both of the two input files. I am using the Saxon

Note that the `date_time' field is an element.

[...]

% lrv_1_transaction_cid_1.xml:
[...]
% <date_time> 2003/12/10.16:08 </date_time>
[...]
% lrv_1_gps_cid.xml:
[...]
% <date_time> 2003/12/10.16:08 </date_time>

Also note that the contents of these two date_time elements are not
the same: they differ in the amount of whitespace.

[...]

% merge_on_date_time.xsl:

[...]

% <xsl:variable name="date_time">
% <xsl:value-of select="@date_time"/></xsl:variable>

What you're doing here is getting the value of the date_time
attribute, which doesn't exist.

[...]

% <xsl:copy-of
% select="$transactions/root/record[@date_time=$date_time]/child::*"/>

Which you then compare to a date_time attribute, which doesn't exist.
The first part of the solution is to lose the @s. The second part
is to account for the difference in the number of spaces. You can
use the XPath function normalize-space() to do that, changing those
two excerpts to

<xsl:variable name="date_time" select="normalize-space(date_time)"/>

and

<xsl:copy-of
select="$transactions/root/record[normalize-space(date_time)
=$date_time]/child::*"/>

This leads to a couple of other possible problems. The date_time
element is duplicated. You can filter it out by adding
[name() != 'date_time']
to the end of the select expression.

The other is that the output XML is not formatted too well. You might be
able to deal with that (assuming you care) by adding the attribute
indent='yes' to the xsl:output element.
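
Putting those changes together, the whole stylesheet might look
something like the sketch below. It assumes the file names and element
names from the original post, filters date_time out of the second
copy-of (the first would work equally well), and has not been run
against the real data:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- load the merge file; a relative name is resolved against the stylesheet -->
  <xsl:variable name="transactions"
      select="document('lrv_1_gps_cid.xml')"/>

  <xsl:template match="/">
    <root>
      <xsl:for-each select="root/record">
        <!-- cache the key: the date_time element, with whitespace normalized -->
        <xsl:variable name="date_time" select="normalize-space(date_time)"/>
        <record>
          <!-- copy the child elements of the current record -->
          <xsl:copy-of select="*"/>
          <!-- copy the children of the matching record from the merge
               file, leaving out its date_time so it is not duplicated -->
          <xsl:copy-of
              select="$transactions/root/record[normalize-space(date_time)
                      = $date_time]/*[name() != 'date_time']"/>
        </record>
      </xsl:for-each>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

Note that vehicle_id also appears in both input files, so with this
sketch each merged record would carry two vehicle_id elements. If that
matters, the same kind of name() test can be extended to drop one of
them.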
 
