REXML element reading <br /> error

John Butler

When I read the Site element from my XML file using REXML, the text
appears to be cut off after the first <br/>.

The value in the XML file is:
<Site>123 street<br/>amstown<br/>amserland</Site>

element = REXML::XPath.first(doc, '//Site')

puts element.text # shows "123 street"

How can I get the full data? Once I have it, I can remove the <br/> tags.
I can't find any information on this.

JB
 
Keith Fahlgren

John Butler wrote:
> When reading in the site element from my xml file using rexml it seems
> to be chopping the rest of the text off after the first <br/>
>
> The value in the XML file is below
> <Site>123 street<br/>amstown<br/>amserland</Site>
>
> element = REXML::XPath.first(doc, '//Site')

I'd suggest using a bit more XPath: combine text() with an each {} block
to iterate through the text nodes (which are distinct):

$ irb -r rexml/document --prompt xmp
a = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
# => <UNDEFINED> ... </>
REXML::XPath.first(a, '//Site').text
# => "123 street"
REXML::XPath.first(a, '//Site/text()').to_s
# => "123 street"
REXML::XPath.each(a, '//Site/text()') {|el| puts el}
123 street
amstown
amserland
# => ["123 street", "amstown", "amserland"]
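
If a single string is wanted rather than separate lines, the same text() nodes can be collected and joined. This is only a minimal sketch using the sample document from this thread; the names `parts` and `address` and the ", " separator are illustrative assumptions, not something from the original question.

```ruby
require "rexml/document"

doc = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")

# XPath.match returns an array of every node matching the expression;
# Text#value gives the plain string content of each text node.
parts = REXML::XPath.match(doc, "//Site/text()").map { |node| node.value }

# The separator is arbitrary here; use "\n" to keep the line breaks.
address = parts.join(", ")
puts address
# prints: 123 street, amstown, amserland
```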


HTH,
Keith
 
Nobuyoshi Nakada

Hi,

At Sat, 1 Sep 2007 05:18:48 +0900,
Keith Fahlgren wrote in [ruby-talk:266990]:
> I'd suggest using a bit more XPath, both text() and a each {} to
> iterate through the text nodes (which are distinct):
>
> $ irb -r rexml/document --prompt xmp
> a = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
> # => <UNDEFINED> ... </>
> REXML::XPath.first(a, '//Site').text
> # => "123 street"

It seems that just REXML::XPath.first(a, '//Site').to_s
returns the whole content, <br/> tags included.
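
Note that to_s on an element serializes the element itself, so the string also contains the <Site> and <br/> markup. A rough sketch of cleaning that up with string substitution follows; the regular expressions and the names `markup` and `plain` are assumptions for illustration, and this kind of pattern matching is fragile compared with walking the nodes (entities, for instance, stay encoded).

```ruby
require "rexml/document"

doc = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
site = REXML::XPath.first(doc, "//Site")

# Element#to_s serializes the element together with its tags and children.
markup = site.to_s

# Turn each <br/> into a newline, then drop the enclosing <Site> tags.
plain = markup.gsub(%r{<br\s*/>}, "\n").gsub(%r{</?Site>}, "")
puts plain
```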
 
not

John Butler wrote:
> When reading in the site element from my xml file using rexml it seems
> to be chopping the rest of the text off after the first <br/>

Not quite. It gives you the *first* text node.

> The value in the XML file is below
> <Site>123 street<br/>amstown<br/>amserland</Site>
>
> element = REXML::XPath.first(doc, '//Site')
>
> puts element.text #shows 123 Street
>
> How can i get the full data and once i have it i can remove the <br/> I
> cant find any information on this?????

You can't find any specific info because there isn't anything specific.
You have an XML element that contains a text node, an empty element named
br, another text node, another empty element named br and another text
node. In the XML world, <br/> is a node like any other.
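
To see that structure for yourself, you can walk the children of the Site element and print what each one is. This is just an illustrative sketch against the sample document; the `kinds` variable and the label strings are made up for the example.

```ruby
require "rexml/document"

doc = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")

# Each child of <Site> is either a text node or an element node.
kinds = doc.root.children.map do |child|
  case child
  when REXML::Text    then "text: #{child.value}"
  when REXML::Element then "element: #{child.name}"
  end
end

puts kinds
# prints:
# text: 123 street
# element: br
# text: amstown
# element: br
# text: amserland
```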

The REXML::Element#texts method is what you are looking for:

$ irb
irb(main):001:0> require "rexml/document"
=> true

irb(main):002:0> doc=REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
=> <UNDEFINED> ... </>

irb(main):003:0> doc.root.texts
=> ["123 street", "amstown", "amserland"]

irb(main):004:0> doc.root.texts.join " "
=> "123 street amstown amserland"


Enjoy!
 
