help with mechanize

Jeremy Woertink · Aug 6, 2008

I'm using mechanize to log into this form. The redirects aren't going
where I would expect them to, though. I don't think I'm being logged in
properly, yet when I try it through the web browser I get logged in
normally.

I'm looking for a better way to do this, or any ideas anyone has.

require 'rubygems'
require 'mechanize'

@agent = WWW::Mechanize.new { |agent| agent.user_agent_alias = 'Windows Mozilla' }

@agent.get('http://smallbusiness.yahoo.com/ecommerce/') do |page|
  # Follow the "Small Business" link to reach the login page.
  @login_page = page.links.text("Small Business").click

  # Fill in the login form and submit it.
  temp_page = @login_page.form_with(:name => 'login_form') do |form|
    form['login'] = @login
    form['passwd'] = @password
  end.submit

  # A failed login lands back on a URL containing "login".
  if temp_page.uri.to_s.include?("login")
    puts "Not logged in."
    puts "An error occurred."
    exit
  else
    puts "Logged in"
  end
end

The only thing I could think of is that if the login fails, it returns me
back to a login page. This always says "Not logged in." even though I
know the @login and @password are correct.

On another note, are there any good sites with REALLY good docs on
mechanize and everything it can do? The main docs page seems to just
show the methods but not really what they do and how to use them.

Thanks,
~Jeremy

Aaron Patterson · Aug 6, 2008

Hi Jeremy,

Jeremy said:
I'm using mechanize to log into this form. The redirects aren't going
where I would expect them to, though. I don't think I'm being logged in
properly, yet when I try it through the web browser I get logged in
normally.

I'm looking for a better way to do this, or any ideas anyone has.

I tried out this script, and it looks like Yahoo sends a meta refresh
after you log in. Mechanize does not follow meta refreshes by default,
so you need to set that option.

Change this line:

@agent = WWW::Mechanize.new { |agent| agent.user_agent_alias = 'Windows Mozilla' }

To this:

@agent = WWW::Mechanize.new { |agent|
  agent.user_agent_alias = 'Windows Mozilla'
  agent.follow_meta_refresh = true
}

How did I know to do this? I examined the behavior in Firefox and made
Mechanize do the same thing. I typically recommend that people install
LiveHTTPHeaders in Firefox and examine the requests and responses.
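
For context, a meta refresh is a redirect written into the HTML of the response rather than sent as an HTTP redirect status, which is why a client has to opt in to following it. The sketch below pulls the target out of such a tag with plain Ruby. The helper name `meta_refresh_target` and the sample markup are made up for illustration, and this is not how Mechanize itself implements the option.

```ruby
# Hypothetical helper: find a <meta http-equiv="refresh"> tag in an HTML
# string and return [delay_in_seconds, url], or nil when there is none.
def meta_refresh_target(html)
  tag = html[/<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i]
  return nil unless tag

  # The content attribute looks like "0; url=http://...".
  content = tag[/content\s*=\s*"([^"]*)"/i, 1] || tag[/content\s*=\s*'([^']*)'/i, 1]
  return nil unless content

  delay, target = content.split(';', 2)
  url = target && target[/url\s*=\s*(.+)/i, 1]
  url = url.strip.gsub(/\A['"]|['"]\z/, '') if url
  [delay.to_i, url]
end

# Made-up sample page, not Yahoo's actual response.
sample = '<html><head><meta http-equiv="refresh" ' \
         'content="0; url=http://example.com/home"></head></html>'

delay, url = meta_refresh_target(sample)
puts "refresh after #{delay}s to #{url}"
```

A page like the sample keeps the login URL while asking the browser to move on, so a URL check made before following the refresh can still see "login".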

Jeremy said:
The only thing I could think of is if the login fails, it returns me
back to a login page. This always says "Not Logged in" even though I
know the @login and @password are correct.

Detecting whether or not you are logged in depends on the site you are
interacting with. Looking for a 'Not Logged in' string may be
appropriate in this case.
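
A rough sketch of one such check, combining the URL test from the original script with a look at the page body. The helper name `logged_in?`, the example URLs, and the "Sign Out" marker are all assumptions for illustration; the right marker depends on what the site actually shows after a successful login.

```ruby
require 'uri'

# Hypothetical helper: decide whether a response looks like a logged-in page.
# `success_marker` is an assumed piece of text that only appears after login.
def logged_in?(uri, body, success_marker)
  # Still on a login URL: treat as not logged in.
  return false if URI(uri.to_s).path.include?('login')

  # Otherwise require the marker text to be present in the page.
  body.include?(success_marker)
end

puts logged_in?('https://example.com/login?retry=1', '<h1>Sign in</h1>', 'Sign Out')
puts logged_in?('https://example.com/account', '<a href="/logout">Sign Out</a>', 'Sign Out')
```

With a Mechanize page object, the first two arguments would come from `page.uri` and `page.body`.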

Jeremy said:
On another note, are there any good sites with REALLY good docs on
mechanize and everything it can do? The main docs page seems to just
show the methods but not really what they do and how to use them.

I've tried to document the library as much as possible. "Good docs" is
very subjective. I'm doing my best.

That said, check EXAMPLES.txt, GUIDE.txt, and also the RDoc for each of
the main classes (Mechanize::Page, Mechanize::Form).

Hope that helps.

Jeremy Woertink · Aug 6, 2008

DUDE!!! Before I even try this out, I'm going to give you mad kudos! You
rock!

I will check it out and see what I come up with. I appreciate it.

As for another question, I'm trying to parse the HTML on this one page
after I get all logged in. I looked at the docs for Hpricot, but I don't
understand exactly how it works.

i.e.

doc.search("/html/body//p")

Why does the "p" need 2 slashes in front?

It works when I do this, but if I only have 1 slash, then it doesn't
work.

Thanks,
~Jeremy

Phlip · Aug 7, 2008

Jeremy said:
doc.search("/html/body//p")

Why does the "p" need 2 slashes in front?

It works when I do this, but if I only have 1 slash, then it doesn't
work.

'/html/body/p' matches only <p> elements that are immediate children of <body>,
so a <p> at body/div/p, for example, won't match. '//' matches descendants at
any depth.
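
The difference can be seen on a small document. This sketch uses REXML from Ruby's standard library instead of Hpricot, purely so that it runs without extra gems; the XPath expressions are the same ones discussed above, and the sample markup is made up.

```ruby
require 'rexml/document'

# Sample markup: one <p> directly under <body>, one nested inside a <div>.
html = <<~HTML
  <html>
    <body>
      <p>direct child</p>
      <div><p>nested</p></div>
    </body>
  </html>
HTML

doc = REXML::Document.new(html)

# Single slash: only <p> elements that are immediate children of <body>.
direct = REXML::XPath.match(doc, '/html/body/p').map(&:text)

# Double slash: <p> elements at any depth below <body>.
any_depth = REXML::XPath.match(doc, '/html/body//p').map(&:text)

puts direct.inspect
puts any_depth.inspect
```

The nested paragraph shows up only in the second result, which is the case where the single-slash query "doesn't work".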

Here are tutorials on XPath for unit tests. Hpricot will also support many of
their techniques for functional tests:

http://www.oreillynet.com/onlamp/blog/2007/08/xpath_checker_and_assert_xpath.html
http://www.oreillynet.com/onlamp/blog/2007/08/assert_hpricot_1.html

Jeremy Woertink · Aug 12, 2008

I have another issue. The fix you gave me worked, but now I'm getting
stuck on this form.

I can get logged in just fine, but sometimes I'm asked to put in an
additional password. When this form comes up, I find the form, but when
I call .submit or .click_button it just sits there. I let it sit there
while I went to lunch and came back an hour later and nothing had
happened. I turned the logging on, and I'm not seeing anything helpful.
I have to press Ctrl+C to make it stop, and when I do, I get this
error.

c:/ruby/lib/ruby/1.8/net/protocol.rb:133:in `sysread': Interrupt
from c:/ruby/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
from c:/ruby/lib/ruby/1.8/timeout.rb:56:in `timeout'
from c:/ruby/lib/ruby/1.8/timeout.rb:76:in `timeout'
from c:/ruby/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
from c:/ruby/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
from c:/ruby/lib/ruby/1.8/net/protocol.rb:126:in `readline'
from c:/ruby/lib/ruby/1.8/net/http.rb:2029:in `read_status_line'
from c:/ruby/lib/ruby/1.8/net/http.rb:2018:in `read_new'
... 8 levels...
from scrape.rb:87:in `each'
from scrape.rb:87
from c:/ruby/lib/ruby/gems/1.8/gems/mechanize-0.7.7/lib/www/mechanize.rb:217:in `get'
from scrape.rb:58
...
87 @store_manager.forms.each do |form|
88 form['passwd'] = @security_key
89 form.click_button
90 end
...

I can go through a normal browser and follow the steps normally and it
works fine.

Any ideas?

Thanks,
~Jeremy

Jeremy Woertink · Aug 12, 2008

I have also tried this code, and it does the same thing:

@store_index = @store_manager.form_with(:name => 'a') do |form|
  form['passwd'] = @security_key
end.submit

It just stops. Is there a way I can see what it is doing? I'm not sure
how to fix this problem. Could it have anything to do with the URL being
https://?

Thanks,
~Jeremy
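
The trace above shows the process blocked in a socket read inside Net::HTTP (`rbuf_fill`), the layer that mechanize-0.7.7 is calling into. As a sketch at that layer only, two settings help with a silent hang: a read timeout, so the call raises instead of waiting indefinitely, and debug output, so the traffic is visible. The helper name `debug_http` and the host are hypothetical, and whether the Mechanize version in use exposes equivalent settings is something to confirm in its RDoc.

```ruby
require 'net/http'

# Hypothetical helper: build a Net::HTTP client that gives up after
# `timeout` seconds and writes the HTTP conversation to `log`.
# Nothing is sent until a request is actually made.
def debug_http(host, port: 443, timeout: 30, log: $stderr)
  http = Net::HTTP.new(host, port)
  http.use_ssl = (port == 443)   # https:// URLs need SSL enabled
  http.open_timeout = timeout    # limit on opening the connection
  http.read_timeout = timeout    # limit on waiting for each read
  http.set_debug_output(log)     # dump requests and responses
  http
end

http = debug_http('example.com')
puts "read_timeout=#{http.read_timeout}s open_timeout=#{http.open_timeout}s ssl=#{http.use_ssl?}"
```

The debug output can include credentials and cookies, so it is better kept out of shared logs.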

