Hpricot and path of an element

Li Chen

Hi all,

I am using Hpricot to load a page, and I am trying to find the path of a
"font" element (<font face="courier" color="black">) in that page. Here is
the example from the tutorial
(http://code.whytheluckystiff.net/hpricot/wiki/HpricotBasics):

doc.at("#header").xpath
#=> "//div[@id='header']"

here is my code:
puts doc.at("#font").xpath

When I run the code, Ruby complains about an undefined method `xpath'. I
wonder if I have misunderstood the tutorial.


Thanks,

Li
 
David Masover

Li said:
I use hpricot to load a page. Then I try to find the path for an
element "font"(<font face="courier" color="black">) in the page.

So, you probably want:

(doc / 'font')

Li said:
doc.at("#header").xpath
#=> "//div[@id='header']"

Right.

Li said:
here is my code:
puts doc.at("#font").xpath

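And that's searching for an element whose id is "font", such as
<div id="font">. The page has no such element, so doc.at returns nil, and
calling xpath on nil is what raises the undefined method error.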

If you're following that example, you probably want:

puts doc.at('font').xpath
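
The difference between matching by id and matching by tag name can be sketched without Hpricot. The example below is only an illustration under stated assumptions: it uses REXML from Ruby's standard distribution as a stand-in for Hpricot, XPath expressions as rough analogues of the "#font" and "font" selectors, and a made-up page held in the hypothetical variable `page`. The path printed by REXML uses its own notation and will not match Hpricot's output.

```ruby
require 'rexml/document'

# A tiny, well-formed stand-in page (hypothetical, not the real page).
page = <<~HTML
  <html>
    <body>
      <div id="header">Title</div>
      <font face="courier" color="black">ATGC</font>
    </body>
  </html>
HTML

doc = REXML::Document.new(page)

# Match by id attribute: a rough analogue of the selector "#font".
by_id = REXML::XPath.first(doc, "//*[@id='font']")

# Match by tag name: a rough analogue of the selector "font".
by_tag = REXML::XPath.first(doc, "//font")

puts by_id.inspect   # prints nil, because no element has id="font"

# Calling a method on that nil reproduces the kind of error Li saw.
begin
  by_id.xpath
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end

puts by_tag.name     # the element found by tag name
puts by_tag.xpath    # REXML's own path notation for that element
```

Checking the result for nil before calling anything on it is a cheap way to tell "selector matched nothing" apart from "method does not exist".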

Now, first question: Why do you need the xpath? Usually, the idea is to try to
find that element, and then do something with it. So, for example:


# To return all text:
(doc / 'font').text

# To loop over each font element:
(doc / 'font').each { |tag|
  puts tag.inner_text
}
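
The same "find every font element, then read its text" idea can be tried without Hpricot. This is a minimal sketch, assuming REXML from Ruby's standard distribution as a stand-in and a made-up, well-formed page; the names `page` and `texts` are hypothetical, and real-world HTML that is not well-formed would need a more forgiving parser.

```ruby
require 'rexml/document'

# Hypothetical stand-in page with two font elements.
page = <<~HTML
  <html><body>
    <font face="courier" color="black">ATGC</font>
    <font face="courier" color="black">GGCC</font>
  </body></html>
HTML

doc = REXML::Document.new(page)

# Collect the text of every <font> element, wherever it sits in the page.
texts = REXML::XPath.match(doc, "//font").map { |el| el.text }

# Print one line of text per element.
texts.each { |t| puts t }
```

Collecting the strings into an array first keeps the extraction separate from whatever is done with the text afterwards.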


Second question: Why is there a font tag on this page? If you had any hand in
creating the page, shame on you -- go learn some CSS.

In fact, go learn some CSS anyway. Hpricot supports both CSS selectors and
XPath, and it's usually much easier to use the selectors. Years later, I
still remember, roughly, how selectors work -- but only a few months later,
I've almost completely forgotten XPath.

There are things XPath can do that selectors can't. But until you encounter
them, XPath is overkill.
 
Li Chen

David said:
Now, first question: Why do you need the xpath? Usually, the idea is to
try to find that element, and then do something with it. So, for example:
# To return all text:
(doc / 'font').text

# To loop over each font element:
(doc / 'font').each { |tag|
  puts tag.inner_text
}

I need to extract the text within this tag. I followed your code and found:
1) (doc/'font').text and (doc/'font').html return the same results
2) when I run (doc / 'font').each { |tag| puts tag.inner_text }
Ruby complains:
undefined method `inner_text' for #<Hpricot::Elem:0x2e9f9c4>
(NoMethodError)

So I changed it to tag.inner_html and it works. I checked the Hpricot
documentation and found that the method #inner_text is there, but I cannot
figure out why Ruby complains about it.

Second question: Why is there a font tag on this page? If you had any
hand in creating the page, shame on you -- go learn some CSS.

I am a newbie at HTML and website development. If you want to know why
there is a font tag in the page, please check this out:
http://www.ensembl.org/Homo_sapiens/exonview?db=core;transcript=ENST00000356766

What I am trying to do is extract some information I am interested in from
this page. I have no idea why they put this tag or that tag there, and I
don't think knowing so many whys is my priority right now. I am more
concerned about getting the job done.


Anyway, thank you very much for the tips.

Li
 
