General Nokogiri problem

  • Thread starter Srijayanth Sridhar

Srijayanth Sridhar

[Note: parts of this message were removed to make it a legal post.]

Hello,

On several sites (probably malformed HTML/JavaScript/XML, or general parsing
hell), I have the following problem.

For example:

moonwolf@trantor:~/ruby$ irb
irb(main):001:0> ['rubygems','nokogiri','hpricot','open-uri'].each { |r| require r }
=> ["rubygems", "nokogiri", "hpricot", "open-uri"]
irb(main):002:0> doc=Nokogiri(open("http://maps.google.com/"))
=> <?xml version="1.0"?>
<!DOCTYPE html>
<html/>

irb(main):003:0> doc/"a"
=>

Same with Nokogiri.Hpricot:

irb(main):004:0> doc=Nokogiri.Hpricot(open("http://maps.google.com/"))
=> <?xml version="1.0"?>
<!DOCTYPE html>
<html/>

However with regular Hpricot:

irb(main):009:0> (Hpricot(open("http://maps.google.com/"))/"a").size
=> 53
(The full output is of course too long, so I have just shown something simpler.)


Hpricot by itself of course works. I tried searching, but there's not much by
way of documentation or blog posts on something like this.

Any suggestions or explanations would be welcome, as I like Nokogiri's speed
very much.

I am using:

moonwolf@trantor:~/ruby$ gem list --local | grep -i nokogiri
nokogiri (1.2.3)
moonwolf@trantor:~/ruby$ ruby --version
ruby 1.8.6 (2008-03-03 patchlevel 114) [i686-linux]


Jayanth
 
