Get the content of an XML element using Hpricot

Bonita · Apr 13, 2007

Hi

I'm using hpricot to parse the following file.

<item
rdf:about="http://del.icio.us/url/50666d1a3fe2b942b20819ec2919d2b7#morwyn">
<title>[from morwyn] * HTML for the Conceptually Challenged</title>
<link>http://del.icio.us/url/50666d1a3fe2b942b20819ec2919d2b7#morwyn</link>
<description>HTML for the Conceptually Challenged. Very basic tutorial,
plainly worded for people who hate to read instructions.</description>
<dc:creator>morwyn</dc:creator>
<dc:date>2006-10-10T07:28:28Z</dc:date>
<dc:subject>html imported webpagedesign</dc:subject>
<taxo:topics>
<rdf:Bag>
<rdf:li resource="http://del.icio.us/tag/imported" />
<rdf:li resource="http://del.icio.us/tag/html" />
<rdf:li resource="http://del.icio.us/tag/webpagedesign" />
</rdf:Bag>
</taxo:topics>
</item>

I'm trying to get the content from <dc:subject> like this

doc = Hpricot.parse(File.read("965.xhtml"))

(doc/"item").each do |t|
  puts (t/"dc:subject").innerTEXT
end

but I got

<dc:subject>html internet tutorial web</dc:subject>

while I only need "html internet tutorial web"

Does anyone know the right method to call?

Thanks

kikijump · Apr 13, 2007

puts (t/'dc:subject').text

kikijump · Apr 13, 2007

puts (t/'dc:subject').text

Sorry for the double post, but I shouldn't have copied and pasted the result
directly from irb.

Billy Hsu · Apr 13, 2007

Sorry for deleting your text.

Maybe you can try:

puts (t/"dc:subject").text
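
For anyone who wants to see the difference between an element's markup and its text content without installing Hpricot, here is a minimal sketch of the same idea using REXML, which is bundled with Ruby. This is a swapped-in library, not Hpricot's API. The enclosing <rdf:RDF> element and its namespace URIs are assumptions added so the fragment parses on its own; the snippet in the question does not show them.

```ruby
require "rexml/document"

# Assumed wrapper: the question's snippet omits the root element, so the
# <rdf:RDF> element and xmlns declarations here are illustrative only.
xml = <<~XML
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
    <item rdf:about="http://del.icio.us/url/50666d1a3fe2b942b20819ec2919d2b7#morwyn">
      <title>[from morwyn] * HTML for the Conceptually Challenged</title>
      <dc:subject>html imported webpagedesign</dc:subject>
    </item>
  </rdf:RDF>
XML

doc = REXML::Document.new(xml)

subjects = [] # text content only
markup = []   # whole element, tags included

doc.elements.each("//item") do |item|
  subject = item.elements["dc:subject"]
  next if subject.nil?

  markup << subject.to_s
  subjects << subject.text
end

puts subjects
```

Here `to_s` gives the element with its tags, which is the kind of output the original post complained about, while `text` gives only the content between the tags, which is what the `.text` call suggested above is meant to return in Hpricot.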
