REXML advice - output

Discussion in 'Ruby' started by Stuart Clarke, Sep 16, 2010.

  1. Hey all,

    I would like to pick your brains about REXML and how to report from it.
    For example, I am reading an XML file using references to each XML tag
    like so:

    doc.root.each_element("/UserData/List/ItemInfo/Title") {|e|
      report.puts "Title: #{e.text}"
    }
    doc.root.each_element("/UserData/List/ItemInfo/Date") {|e|
      report.puts "Date: #{e.text}"
    }

    The 'report.puts' writes this data out to a CSV file. At present I get a
    list of all the titles in the XML file followed by a list of the dates.
    What I need is to get them side by side in a CSV file, like so:

    Title             Date
    Item1             20th Jan 2009
    Item2             12th Feb 2010

    Does anyone have any suggestions on a suitable workflow for this?

    Many thanks
    --
    Posted via http://www.ruby-forum.com/.
     
    Stuart Clarke, Sep 16, 2010
    #1

  2. On Thu, Sep 16, 2010 at 4:42 PM, Stuart Clarke
    <> wrote:
    > Hey all,
    >
    > I would like to pick your brains about REXML and how to report from it.
    > For example, I am reading an XML file using references to each XML tag
    > like so:
    >
    > doc.root.each_element("/UserData/List/ItemInfo/Title") {|e|
    >   report.puts "Title: #{e.text}"
    > }
    > doc.root.each_element("/UserData/List/ItemInfo/Date") {|e|
    >   report.puts "Date: #{e.text}"
    > }
    >
    > The 'report.puts' writes this data out to a CSV file. At present I get a
    > list of all the titles in the XML file followed by a list of the dates.
    > What I need is to get them side by side in a CSV file, like so:
    >
    > Title             Date
    > Item1             20th Jan 2009
    > Item2             12th Feb 2010
    >
    > Does anyone have any suggestions on a suitable workflow for this?


    Just iterate over all "ItemInfo" elements and print values from sub
    elements (which you can select via a relative XPath).
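
A minimal sketch of that idea, assuming each ItemInfo holds exactly one Title and one Date child. The sample XML and the `rows` array are made up for illustration, and the plain `join(",")` does no CSV quoting, so it only suits values without commas or quotes:

```ruby
require "rexml/document"

# Hypothetical sample data shaped like the XPath in the question.
xml = <<~XML
  <UserData>
    <List>
      <ItemInfo><Title>Item1</Title><Date>20th Jan 2009</Date></ItemInfo>
      <ItemInfo><Title>Item2</Title><Date>12th Feb 2010</Date></ItemInfo>
    </List>
  </UserData>
XML

doc = REXML::Document.new(xml)

# Start with a header row, then add one row per ItemInfo element.
rows = [%w[Title Date]]
doc.elements.each("/UserData/List/ItemInfo") do |item|
  # Relative XPaths: looked up inside the current ItemInfo only.
  title = item.elements["Title"]
  date = item.elements["Date"]
  # A missing child becomes nil, which join turns into an empty field.
  rows << [title && title.text, date && date.text]
end

rows.each { |row| puts row.join(",") }
```

Because both values are read inside the same loop iteration, each title stays paired with its own date.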

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Sep 16, 2010
    #2

  3. Robert Klemme wrote:
    > On Thu, Sep 16, 2010 at 4:42 PM, Stuart Clarke
    > <> wrote:
    >> report.puts "Date: #{e.text}"
    >> Does anyone have any suggestions on a suitable workflow for this?

    > Just iterate over all "ItemInfo" elements and print values from sub
    > elements (which you can select via a relative XPath).
    >
    > Kind regards
    >
    > robert

    Thanks for getting back to me. I will look into this and see how I get
    on.

    Thanks a lot Robert.
     
    Stuart Clarke, Sep 17, 2010
    #3
  4. Robert Klemme wrote:
    > On Thu, Sep 16, 2010 at 4:42 PM, Stuart Clarke
    > <> wrote:
    >> report.puts "Date: #{e.text}"
    >> Does anyone have any suggestions on a suitable workflow for this?

    > Just iterate over all "ItemInfo" elements and print values from sub
    > elements (which you can select via a relative XPath).
    >
    > Kind regards
    >
    > robert


    To confirm I am following you correctly, I have now got the following:

    info = doc.elements.to_a("//UserData/List/ItemInfo/")

    Printing out info gives a line per line entry of all children under the
    tag ItemInfo.

    First of all, is this what you meant? Am I correct to assume that at
    this point, you are suggesting I write this data to a CSV file stripping
    off the tags with a regex or something? Is this correct?

    Many thanks and apologies if I have misunderstood.

     
    Stuart Clarke, Sep 20, 2010
    #4
  5. Stuart Clarke

    Guest

    I managed to mess up clicking "Send" on Friday, so I'm trying again (-:

    > Stuart Clarke wrote:
    >> Robert Klemme wrote:
    >>> Stuart Clarke wrote:
    >>>> report.puts "Date: #{e.text}"
    >>>> Does anyone have any suggestions on a suitable workflow for this?
    >>>
    >>> Just iterate over all "ItemInfo" elements and print values from sub
    >>> elements (which you can select via a relative XPath).

    >>
    >> Thanks for getting back to me. I will look into this and see how I get
    >> on.

    >
    > I pulled this out of a script I use quite a bit and hacked your XPath
    > into it:
    >
    > require 'rexml/document'
    >
    > ARGV.each do |filename|
    >   doc = REXML::Document.new( File.new( filename ) )
    >
    >   # Visit each ItemInfo once and print its Title and Date on one line
    >   doc.elements.each("/UserData/List/ItemInfo") do |e|
    >     print e.elements["Title"].text, "\t"
    >     puts e.elements["Date"].text
    >   end
    > end
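
On the regex question: as far as I know, the array returned by `to_a` holds `REXML::Element` objects rather than strings, so there should be nothing to strip. A rough sketch, assuming the same Title/Date layout; the sample XML and the `lines` variable are invented for illustration:

```ruby
require "rexml/document"

# Hypothetical sample data; the element names follow the XPath used above.
xml = <<~XML
  <UserData>
    <List>
      <ItemInfo><Title>Item1</Title><Date>20th Jan 2009</Date></ItemInfo>
      <ItemInfo><Title>Item2</Title><Date>12th Feb 2010</Date></ItemInfo>
    </List>
  </UserData>
XML

doc = REXML::Document.new(xml)

# to_a with an XPath returns the matching elements as an Array.
info = doc.elements.to_a("//UserData/List/ItemInfo")

# Printing an element shows its markup, but calling .text on a child
# element gives the bare string, so no tags need to be removed by hand.
lines = info.map do |item|
  [item.elements["Title"].text, item.elements["Date"].text].join("\t")
end

puts lines
```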
     
    , Sep 20, 2010
    #5
  6. Could anybody help me with an issue I am having with some XML I am
    reading? I am using XPath to read two different parts of an XML file,
    which looks a lot like this:

    <Data>
    <DoneList><Vector><Count>84</Count>
    <FullItemInfo>
    <Count>0</Count>
    <ItemInfo>
    <Title>BLAH LAH</Title>
    <Id>12345</Id>
    </ItemInfo>
    </Vector></DoneList>
    <FullItemInfo>
    NEXT ITEM AS ABOVE

    Then I have further data, which is slightly different
    <NotDoneList><Vector><Count>84</Count>
    <FullItemInfo>
    <Count>0</Count>
    <ItemInfo>
    <Title>BLAH LAH</Title>
    <Id>12345</Id>
    </ItemInfo>
    </Vector></DoneList>
    <FullItemInfo>
    </Data>

    As you can see, the tags are the same but the first is DoneList and the
    second NotDoneList. I need to process each set separately and each set
    can contain more than one entry. My code to give a CSV file is:

    doc = REXML::Document.new(d) #call REXML to open the XML file
    #To get NotDoneList data
    doc.elements.each("//NotDoneList/Vector/Count/FullItemInfo") do |e|
    detail =
    (
    e.elements['ItemInfo/Title'].text << "," <<
    e.elements['ItemInfo/Id'].text
    )
    puts detail
    end

    #To get DoneList data
    doc.elements.each("//DoneList/Vector/Count/FullItemInfo") do |e|
    detail =
    (
    e.elements['ItemInfo/Title'].text << "," <<
    e.elements['ItemInfo/Id'].text
    )
    puts detail
    end

    When I run this, no data is extracted and no errors are given. In
    contrast, if I do
    doc.elements.each("//FullItemInfo") do |e|
    I am able to extract all the information for both the NotDoneList and
    DoneList, however this is not what I want. I want to address each data
    set separately. The eventual idea will be to produce a report of all
    items in the NotDoneList and another report for those in the DoneList.
    I guess I am doing something wrong but I cannot see it.

    Can anyone see what I am doing wrong with this? I would really
    appreciate any help as I cannot figure it out.

    Many thanks

     
    Stuart Clarke, Oct 26, 2010
    #6
  7. Stuart Clarke

    Guest

    > <Data>
    > <DoneList><Vector><Count>84</Count>
    > <FullItemInfo>
    > <Count>0</Count>
    > <ItemInfo>
    > <Title>BLAH LAH</Title>
    > <Id>12345</Id>
    > </ItemInfo>
    > </Vector></DoneList>
    > <FullItemInfo>
    > NEXT ITEM AS ABOVE


    Data
      DoneList
        Vector
          Count /Count
          FullItemInfo
            Count /Count
            ItemInfo
              Title /Title
              Id /Id
            /ItemInfo
        /Vector
      /DoneList
      FullItemInfo

    The XML example you provided seems to have mismatched tags?
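
One quick way to check a file before debugging the XPath is to let REXML parse it and see whether it complains. A small sketch; `well_formed?` is a hypothetical helper name, and the two strings are cut-down stand-ins for the real file. I would not rely on every REXML version catching every kind of breakage (a tag simply left open at the end of the input may slip through on some versions), but a close tag that does not match the innermost open tag should raise:

```ruby
require "rexml/document"

# Hypothetical helper: true if REXML can parse the string, false otherwise.
def well_formed?(xml)
  REXML::Document.new(xml)
  true
rescue REXML::ParseException
  false
end

good = "<Data><DoneList><Vector><FullItemInfo/></Vector></DoneList></Data>"
# FullItemInfo is still open when </Vector> arrives, as in the quoted snippet.
bad = "<Data><DoneList><Vector><FullItemInfo></Vector></DoneList></Data>"

puts well_formed?(good)
puts well_formed?(bad)
```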

    > Then I have further data, which is slightly different
    > <NotDoneList><Vector><Count>84</Count>
    > <FullItemInfo>
    > <Count>0</Count>
    > <ItemInfo>
    > <Title>BLAH LAH</Title>
    > <Id>12345</Id>
    > </ItemInfo>
    > </Vector></DoneList>
    > <FullItemInfo>
    > </Data>


      NotDoneList
        Vector
          Count /Count
          FullItemInfo
            Count /Count
            ItemInfo
              Title /Title
              Id /Id
            /ItemInfo
        /Vector
      /DoneList
      FullItemInfo
    /Data

    > doc.elements.each("//NotDoneList/Vector/Count/FullItemInfo")
    > doc.elements.each("//DoneList/Vector/Count/FullItemInfo") do |e|


    Can you verify and re-post a clean XML snippet? (That may help debug
    your XPath.) I'm going to guess:

    <Data>
    <DoneList>
    <Vector>
    <Count/>
    <FullItemInfo/>
    </Vector>
    </DoneList>
    </Data>

    In which case, the XPath might be: '//DoneList/Vector/FullItemInfo'?
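
To illustrate, here is a sketch against a cleaned-up document, assuming FullItemInfo is a sibling of Count inside Vector and wraps one ItemInfo. That structure is my guess, not something confirmed by the original file, and `item_rows` is a hypothetical helper name:

```ruby
require "rexml/document"

# Guessed, well-formed version of the structure; the values are made up.
xml = <<~XML
  <Data>
    <DoneList><Vector><Count>2</Count>
      <FullItemInfo><Count>0</Count>
        <ItemInfo><Title>First done</Title><Id>1</Id></ItemInfo>
      </FullItemInfo>
      <FullItemInfo><Count>0</Count>
        <ItemInfo><Title>Second done</Title><Id>2</Id></ItemInfo>
      </FullItemInfo>
    </Vector></DoneList>
    <NotDoneList><Vector><Count>1</Count>
      <FullItemInfo><Count>0</Count>
        <ItemInfo><Title>Still open</Title><Id>3</Id></ItemInfo>
      </FullItemInfo>
    </Vector></NotDoneList>
  </Data>
XML

doc = REXML::Document.new(xml)

# Collect "Title,Id" lines for one list. Count is not part of the path,
# because FullItemInfo sits next to Count rather than inside it.
def item_rows(doc, list_name)
  rows = []
  doc.elements.each("//#{list_name}/Vector/FullItemInfo") do |full|
    title = full.elements["ItemInfo/Title"].text
    id = full.elements["ItemInfo/Id"].text
    rows << "#{title},#{id}"
  end
  rows
end

done = item_rows(doc, "DoneList")
not_done = item_rows(doc, "NotDoneList")

puts "DoneList:", done
puts "NotDoneList:", not_done
```

The XPath name test matches whole element names, so `//DoneList` should not pick up `NotDoneList`, and the two reports stay separate.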
     
    , Oct 27, 2010
    #7
  8. My issue was actually due to the mismatched tags; it was a broken XML
    file.

    Thanks for identifying that.

     
    Stuart Clarke, Oct 29, 2010
    #8