Convert XML to CSV using xsltproc

Discussion in 'XML' started by loc, Feb 21, 2010.

  1. loc

    loc Guest

    I'm trying to convert an xml file into CSV using xsltproc.

    #file.xml
    <?xml-version ="1.0"standalone="no"?>
    <NAXML-POSJournal version="3.3">
    <TransmissionHeader>
    <StoreLocationID>207</StoreLocationID>
    </TransmissionHeader>
    <JournalReport>
    <JournalHeader>
    <ReportSequenceNumber>74</ReportSequenceNumber>
    <PrimaryReportPeriod>2</PrimaryReportPeriod>
    <SecondaryReportPeriod>1</SecondaryReportPeriod>
    <BeginDate>2010-02-11</BeginDate>
    <BeginTime>03:58:42</BeginTime>
    <EndDate>2100-01-01</EndDate>
    <EndTime>00:00:00</EndTime>
    </JournalHeader>
    <SaleEvent>
    <BusinessDate>2010-02-11</BusinessDate>
    <TransactionDetailGroup>
    <TransactionLine status="normal">
    <ItemLine>
    <ItemCode>
    <POSCodeFormat format="upcA"></POSCodeFormat>
    <POSCode>028400079037</POSCode>
    <POSCodeModifier name="pc">1</POSCodeModifier>
    </ItemCode>
    </ItemLine>
    </TransactionLine>
    <TransactionLine status="normal">
    <ItemLine>
    <ItemCode>
    <POSCodeFormat format="upcA"></POSCodeFormat>
    <POSCode>049000051148</POSCode>
    <POSCodeModifier name="pc">1</POSCodeModifier>
    </ItemCode>
    </ItemLine>
    </TransactionLine>
    </TransactionDetailGroup>
    </SaleEvent>
    </JournalReport>
    </NAXML-POSJournal>


    Here is the stylesheet:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
    <xsl:output method="text"/>
    <xsl:template match="NAXML-POSJournal/JournalReport/SaleEvent/
    TransactionDetailGroup">
    <xsl:for-each select="*">
    <xsl:value-of select="."/>
    <xsl:text>,</xsl:text>
    <xsl:if test="not(position() = last())">
    <xsl:text>
    </xsl:text>
    </xsl:if>
    </xsl:for-each>
    </xsl:template>
    </xsl:stylesheet>


    The output I'm looking for is the values for the following:

    <StoreLocationID>,<BusinessDate>,<POSCodeFormat>,<POSCode>

    I'd like to get a new line with that info for each <TransactionLine>.
    How can I make this work?
    loc, Feb 21, 2010
    #1

  2. loc wrote:
    > The output I'm looking for is the values for the following:
    >
    > <StoreLocationID>,<BusinessDate>,<POSCodeFormat>,<POSCode>
    >
    > I'd like to get a new line with that info for each <TransactionLine>
    > How can I make this work?


    Then process each 'TransactionLine' element and output what you want to
    output:

    <xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

    <xsl:strip-space elements="*"/>
    <xsl:output method="text"/>

    <xsl:template match="/">
    <xsl:apply-templates
    select="NAXML-POSJournal/JournalReport/SaleEvent/TransactionDetailGroup/TransactionLine"/>
    </xsl:template>

    <xsl:template match="TransactionLine">
    <xsl:value-of
    select="/NAXML-POSJournal/TransmissionHeader/StoreLocationID"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of
    select="/NAXML-POSJournal/JournalReport/SaleEvent/BusinessDate"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="ItemLine/ItemCode/POSCodeFormat/@format"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="ItemLine/ItemCode/POSCode"/>
    <xsl:text>&#10;</xsl:text>
    </xsl:template>

    </xsl:stylesheet>

    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
    Martin Honnen, Feb 21, 2010
    #2

  3. loc

    loc Guest

    Thanks, it works great, just what I wanted. One question, I'm just
    trying to get a better understanding of how this works, why isn't a
    for-each needed even though there are multiple matches for
    <TransactionLine> and the data under it?
    loc, Feb 21, 2010
    #3
  4. loc wrote:
    > Thanks, it works great, just what I wanted. One question, I'm just
    > trying to get a better understanding of how this works, why isn't a
    > for-each needed even though there are multiple matches for
    > <TransactionLine> and the data under it?


    The apply-templates
    select="NAXML-POSJournal/JournalReport/SaleEvent/TransactionDetailGroup/TransactionLine"
    selects all 'TransactionLine' elements for processing, and the template
    matching 'TransactionLine' is then applied to each of them in turn. That
    is the reason you do not need a for-each.

    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
    Martin Honnen, Feb 21, 2010
    #4
  5. loc

    loc Guest

    It's giving me the data I want, but there is an error: it doesn't like
    the first line of my XML file.

    bash$ xsltproc style.xsl sale.xml
    post.xml:1: parser warning : xmlParsePITarget: invalid name prefix 'xml'
    <?xml-version ="1.0"standalone="no"?>
    ^
    207,2010-02-11,upcA,028400079037
    207,2010-02-11,upcA,049000051148

    Should I just cut that line off? It then works without the error. I
    don't have control over how the XML file is generated, but I could
    modify it with `sed' or just delete that line.
    loc, Feb 21, 2010
    #5
  6. loc wrote:

    > It's giving me the data I want, but there is an error, it doesn't like
    > the first line of my xml file
    >
    > bash$ xsltproc style.xsl sale.xml
    > post.xml:1: parser warning : xmlParsePITarget: invalid name prefix
    > 'xml'
    > <?xml-version ="1.0"standalone="no"?>


    I am afraid that's not XML. A legal XML declaration is
    <?xml version="1.0" standalone="no"?>
    so you will need to fix that line if you want to parse the file as XML
    (which you must, in order to process it with an XSLT processor).


    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
    Martin Honnen, Feb 21, 2010
    #6
  7. loc wrote:
    > Thanks, it works great, just what I wanted. One question, I'm just
    > trying to get a better understanding of how this works, why isn't a
    > for-each needed even though there are multiple matches for
    > <TransactionLine> and the data under it?


    Apply-templates operates on all the nodes selected by its select=
    expression. Effectively, that's an implied for-each.

    (Actually, it may be better to reverse that and think of for-each as
    applying a private inline template, but it isn't obvious why that's true
    until you've worked with XSLT for a while.)
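
To make the comparison concrete, here is a minimal sketch of the same transformation written with an explicit xsl:for-each instead of a separate 'TransactionLine' template. The file names foreach.xsl and sample.xml, the temporary directory, and the cut-down input are assumptions made for illustration; the element names and XPath expressions are the ones already used in this thread.

```shell
# Sketch only: foreach.xsl and sample.xml are hypothetical file names.
workdir=$(mktemp -d)

# A cut-down input using the same element names as the file in this thread.
cat > "$workdir/sample.xml" <<'EOF'
<?xml version="1.0" standalone="no"?>
<NAXML-POSJournal version="3.3">
<TransmissionHeader><StoreLocationID>207</StoreLocationID></TransmissionHeader>
<JournalReport>
<SaleEvent>
<BusinessDate>2010-02-11</BusinessDate>
<TransactionDetailGroup>
<TransactionLine status="normal"><ItemLine><ItemCode>
<POSCodeFormat format="upcA"></POSCodeFormat>
<POSCode>028400079037</POSCode>
</ItemCode></ItemLine></TransactionLine>
<TransactionLine status="normal"><ItemLine><ItemCode>
<POSCodeFormat format="upcA"></POSCodeFormat>
<POSCode>049000051148</POSCode>
</ItemCode></ItemLine></TransactionLine>
</TransactionDetailGroup>
</SaleEvent>
</JournalReport>
</NAXML-POSJournal>
EOF

# The per-line body sits inline inside xsl:for-each rather than in a
# template that apply-templates dispatches to.
cat > "$workdir/foreach.xsl" <<'EOF'
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="NAXML-POSJournal/JournalReport/SaleEvent/TransactionDetailGroup/TransactionLine">
      <xsl:value-of select="/NAXML-POSJournal/TransmissionHeader/StoreLocationID"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="/NAXML-POSJournal/JournalReport/SaleEvent/BusinessDate"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="ItemLine/ItemCode/POSCodeFormat/@format"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="ItemLine/ItemCode/POSCode"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

xsltproc "$workdir/foreach.xsl" "$workdir/sample.xml"
```

Both forms walk the same node set. The template version tends to scale better once different kinds of lines need different handling, since each kind can get its own template.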

    --
    Joe Kesselman,
    http://www.love-song-productions.com/people/keshlam/index.html

    {} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
    /\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
    Joe Kesselman, Feb 22, 2010
    #7
  8. loc wrote:
    > <?xml-version ="1.0"standalone="no"?>
    > ^


    The space it's pointing to, after xml-version, is legal per the XML
    Recommendation (see http://www.w3.org/TR/REC-xml/#NT-XMLDecl,
    particularly production 25). Assuming it really is a space character
    rather than an &nbsp;.

    But the fact that you're missing a space before "standalone" is
    definitely an error. See http://www.w3.org/TR/REC-xml/#NT-SDDecl
    (production 32); note that it requires a leading whitespace character.

    So: If the latter is really in your file, whatever's producing that
    document is not generating well-formed XML. Fix it (preferable, since it
    will continue to upset everything that has to interface with it), or
    preprocess to fix this problem.

    If that doesn't cure the problem, and you're sure the space after
    xml-version really is an XML whitespace character, that would appear to
    be a bug in your XML parser.
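
If the generator cannot be fixed, the preprocessing route could look roughly like the sed sketch here. It rewrites only the first line, turning the malformed `<?xml-version ="1.0"standalone` prefix into a well-formed declaration and leaving everything else untouched. The helper name fix_decl and the temporary sample file are assumptions for illustration, and the pattern covers only the exact malformation shown in this thread.

```shell
# Sketch only: fix_decl is a hypothetical helper, and the sample file
# stands in for the generated sale.xml.
workdir=$(mktemp -d)
printf '%s\n' \
  '<?xml-version ="1.0"standalone="no"?>' \
  '<NAXML-POSJournal version="3.3"/>' > "$workdir/sale.xml"

# On line 1 only: replace "xml-version" with "xml version", normalise the
# spacing around "=", and insert the required space before "standalone".
fix_decl() {
  sed '1s/^<?xml-version *= *"\([^"]*\)" *standalone/<?xml version="\1" standalone/' "$1"
}

# Show the repaired first line.
fix_decl "$workdir/sale.xml" | head -n 1
```

The repaired stream can then be piped straight into the processor, for example `fix_decl sale.xml | xsltproc style.xsl -`, so the original file is never modified.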


    --
    Joe Kesselman,
    http://www.love-song-productions.com/people/keshlam/index.html

    {} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
    /\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
    Joe Kesselman, Feb 22, 2010
    #8
  9. On 21/02/2010 20:02, Martin Honnen wrote:
    > I am afraid that's not XML, a legal XML declaration is
    > <?xml version="1.0" standalone="no"?>
    > so you will need to fix that line if you want to parse it as XML (which
    > you need to process it with an XSLT processor).

    One "quick fix" for this line would be to simply drop it:

    bash$ awk 'NR>1' sale.xml | xsltproc style.xsl -

    Hermann
    Hermann Peifer, Feb 22, 2010
    #9