REXML to extract only values from XML?


christopher.mcmahon

Say I have an XML record like

<Subscriber>
<name>
<firstName>CHRIS</firstName>
<lastName>MCMAHON</lastName>
</name>
<ssn>111223333</ssn>
</Subscriber>

I'd like to extract each value of each tag (without regard to
hierarchy) and add it to an array:

["CHRIS","MCMAHON","111223333"]

The REXML docs don't seem to address this. I've tried various
methods, but I can't seem to find the way to address only the contents
of each tag. Any suggestions would be welcome.
 

James Britt

require 'rexml/document'

xml = "<Subscriber>
<name>
<firstName>CHRIS</firstName>
<lastName>MCMAHON</lastName>
</name>
<ssn>111223333</ssn>
</Subscriber> "

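doc = REXML::Document.new( xml )

# Select every element that has a text node, strip each element's text,
# and drop the ones whose text is only whitespace.
p doc.elements.to_a( "//*[text()]" ).map { |e| e.text.strip }.reject { |s| s.empty? }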

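Note that elements that appear to contain only child elements also
contain whitespace text (the newlines between the tags), which you
probably don't want. That is why the code strips each value and drops
the empty ones.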

(There may be a better way to ignore that sort of white space.)
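
One possible alternative, offered as a sketch rather than a definitive answer: skip any element that has child elements, so the whitespace between tags never gets collected in the first place. The helper name leaf_values is made up for illustration. It relies on REXML's each_recursive and has_elements?, and it assumes the root element itself is a container, since each_recursive only visits descendants.

```ruby
require 'rexml/document'

# Hypothetical helper (name is illustrative): returns the stripped text of
# every element that has no child elements, in document order.
def leaf_values(xml)
  doc = REXML::Document.new(xml)
  values = []
  # each_recursive visits the descendants of the root, not the root itself.
  doc.root.each_recursive do |element|
    next if element.has_elements?        # container element: skip it
    text = element.text.to_s.strip       # to_s guards against empty elements
    values << text unless text.empty?
  end
  values
end

xml = "<Subscriber>
<name>
<firstName>CHRIS</firstName>
<lastName>MCMAHON</lastName>
</name>
<ssn>111223333</ssn>
</Subscriber> "

p leaf_values(xml)
```

For the sample record this prints ["CHRIS", "MCMAHON", "111223333"]. Elements with mixed content (text plus child elements) would be skipped entirely, which may or may not be what you want.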


James

 

William James


xml = "<Subscriber>
<name>
<firstName> CHRIS </firstName>
<lastName>MCMAHON</lastName>
</name>
<ssn>111223333</ssn>
</Subscriber> "

p xml.split( / <.*?> (?: \s* <.*?> )* /xm )[1 .. -2]

---> [" CHRIS ", "MCMAHON", "111223333"]
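
If the padding around " CHRIS " is not wanted, a scan-based variant of the same regexp idea might look like the sketch below. It assumes simple markup like the sample: no attributes containing ">", no CDATA sections, no comments, and no entities that need decoding. For anything beyond that, a real parser is the safer choice.

```ruby
xml = "<Subscriber>
<name>
<firstName> CHRIS </firstName>
<lastName>MCMAHON</lastName>
</name>
<ssn>111223333</ssn>
</Subscriber> "

# Capture every run of characters that sits between a ">" and the next "<",
# then trim each capture and discard the whitespace-only ones.
values = xml.scan(/>([^<]+)</).flatten.map(&:strip).reject(&:empty?)

p values
```

---> ["CHRIS", "MCMAHON", "111223333"]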
 
