Nokogiri not getting html body sometimes

Discussion in 'Ruby' started by Jarmo Pertman, May 20, 2009.

  1. I'm using Mechanize to fetch an IMDb page and then Nokogiri's Node#search
    method to extract some info from it, but I've stumbled onto one
    special case where #search doesn't work properly. All other pages
    I've tried so far work as expected.

    It seems that some special characters are causing trouble for
    Nokogiri, because when I printed the document itself it output
    only half of the <head> tag and no <body> tag at all!

    Anyway, here is a code snippet which I'd expect to output "false" 4
    times. Instead, it outputs false, false, true, false. Try it with some
    other IMDb URL and it's OK.

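    require 'mechanize'

    # The alias must be a single-line string; a line break inside it breaks the lookup.
    mech = WWW::Mechanize.new {|agent| agent.user_agent_alias = 'Windows Mozilla'}
    mech.get("http://www.imdb.com/title/tt1092016/") do |page|
      # Each line prints true when the XPath query matches nothing.
      puts page.search("/html").empty?
      puts page.search("/html/head").empty?
      puts page.search("/html/body").empty?
      # The raw response body, before any parsing.
      puts page.body.empty?
    end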

    What could be the problem?

    I'm using ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
    --
    Posted via http://www.ruby-forum.com/.
     
    Jarmo Pertman, May 20, 2009
    #1

  2. I think you'd better set the encoding first.

    mech.get("http://www.imdb.com/title/tt1092016/") do |page|
      page.encoding = 'ISO-8859-1'
      # ... the rest of your code
    end
    Lui Core, May 21, 2009
    #2
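
A likely reason the encoding matters, sketched outside the thread's own code: bytes written as ISO-8859-1 are often not valid UTF-8, so a parser that guesses the wrong encoding can stop at the first such byte. The snippet below is only an illustration under stated assumptions. It uses the String encoding API from Ruby 2.0 or later (not the 1.8.6 mentioned above), and the page fragment is hypothetical, not taken from the IMDb page.

```ruby
# Hypothetical page fragment: "Amélie" stored as ISO-8859-1,
# where "é" is the single byte 0xE9.
latin1_bytes = "<title>Am\xE9lie</title>".b

# Read as UTF-8, the bytes are invalid: 0xE9 opens a multi-byte
# sequence that the following "l" never completes.
as_utf8 = latin1_bytes.dup.force_encoding(Encoding::UTF_8)
puts as_utf8.valid_encoding?   # => false

# Labelled correctly and then transcoded, the same bytes become valid UTF-8.
as_latin1 = latin1_bytes.dup.force_encoding(Encoding::ISO_8859_1)
converted = as_latin1.encode(Encoding::UTF_8)
puts converted.valid_encoding? # => true
puts converted
```

Setting page.encoding as suggested above plays the same role as the correct label here: it tells the parser how to interpret the bytes instead of leaving it to guess.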

  3. Thank you! It did the trick.

    Best regards,
    Jarmo

    Lui Core wrote:
    > i think you'd better set the encoding first.
    >
    > mech.get("http://www.imdb.com/title/tt1092016/") do |page|
    > page.encoding = 'ISO-8859-1'
    > #... the rest of ur code
    > end



     
    Jarmo Pertman, May 21, 2009
    #3
