help with mechanize

  • Thread starter Jeremy Woertink

Jeremy Woertink

I'm using mechanize to log in through this form, but the redirects aren't
going where I would expect. I don't think I'm being logged in properly,
yet when I try it through a web browser I get logged in normally.

I'm looking for a better way to do this, or any ideas anyone has.

require 'rubygems'
require 'mechanize'

@agent = WWW::Mechanize.new { |agent| agent.user_agent_alias = 'Windows Mozilla' }

@agent.get('http://smallbusiness.yahoo.com/ecommerce/') do |page|
  # Follow the "Small Business" link to reach the login page.
  @login_page = page.links.text("Small Business").click

  # Fill in the login form and submit it.
  temp_page = @login_page.form_with(:name => 'login_form') do |form|
    form['login'] = @login
    form['passwd'] = @password
  end.submit

  # If we are still on a login URL, assume the login failed.
  if temp_page.uri.to_s.include?("login")
    puts "Not logged in."
    puts "An error occurred."
    exit
  else
    puts "Logged in"
  end
end

My only idea for detecting failure is that a failed login returns me to
a login page. This check always prints "Not logged in." even though I
know @login and @password are correct.

On another note, are there any sites with REALLY good docs on mechanize
and everything it can do? The main docs page seems to just list the
methods, not really what they do or how to use them.

Thanks,
~Jeremy
 

Aaron Patterson

Hi Jeremy,

Jeremy said:
I'm using mechanize to log in through this form, but the redirects aren't
going where I would expect. I don't think I'm being logged in properly,
yet when I try it through a web browser I get logged in normally.

I'm looking for a better way to do this, or any ideas anyone has.

I tried out this script, and it looks like Yahoo sends a meta refresh
after you log in. Mechanize does not follow meta refreshes by default,
so you need to set that option.

Change this line:
@agent = WWW::Mechanize.new { |agent| agent.user_agent_alias = 'Windows Mozilla' }

To this:

@agent = WWW::Mechanize.new { |agent|
  agent.user_agent_alias = 'Windows Mozilla'
  agent.follow_meta_refresh = true
}

How did I know to do this? I examined the behavior in Firefox and made
Mechanize do the same thing. I typically recommend that people install
LiveHTTPHeaders in Firefox and examine the requests and responses.
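
To make the meta refresh described above concrete, here is a rough sketch of how one might spot such a tag in a response body using only the Ruby standard library. The helper name meta_refresh_target and the example URL are made up for illustration, and this is not how Mechanize itself handles it; setting follow_meta_refresh is still the simpler route.

```ruby
# Hypothetical helper, for illustration only: returns [delay_seconds, url]
# for the first <meta http-equiv="refresh"> tag found in html, or nil.
def meta_refresh_target(html)
  # Find a <meta ...> tag whose http-equiv attribute is "refresh".
  tag = html[/<meta\b[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i]
  return nil unless tag

  # Pull out the quoted value of the content attribute, e.g. "0; URL=...".
  content = tag[/content\s*=\s*(["'])(.*?)\1/i, 2]
  return nil unless content

  # Split "delay; url=target" into its two parts.
  delay, url = content.split(/;\s*url\s*=\s*/i, 2)
  [delay.to_i, url && url.delete(%q{"'})]
end

page = '<html><head><meta http-equiv="Refresh" ' \
       'content="0; URL=http://example.invalid/home"></head></html>'
result = meta_refresh_target(page)
```

If a login response contains a tag like this, the page you get back is only a stepping stone, which would explain a URL check firing on the wrong page.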

Jeremy said:
My only idea for detecting failure is that a failed login returns me to
a login page. This check always prints "Not logged in." even though I
know @login and @password are correct.

Detecting whether or not you are logged in depends on the site you are
interacting with. Looking for a 'Not Logged in' string may be
appropriate in this case.
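
As one possible refinement of a URL-based check, the sketch below looks only at the path of the final URL rather than the whole string, so a query parameter that happens to contain "login" does not count as a failure. The helper name on_login_page? and the example URLs are hypothetical; what actually signals a failed login depends on the site.

```ruby
require 'uri'

# Hypothetical helper: true when the path of the given URL (not its
# query string) contains the fragment that marks a login page.
def on_login_page?(url, login_path_fragment: 'login')
  URI.parse(url).path.include?(login_path_fragment)
end

still_on_login = on_login_page?('https://example.invalid/config/login?done=x')
past_login     = on_login_page?('https://example.invalid/store?from=login')
```

With a Mechanize page, the same check could be applied to page.uri.to_s.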

Jeremy said:
On another note, are there any sites with REALLY good docs on mechanize
and everything it can do? The main docs page seems to just list the
methods, not really what they do or how to use them.

I've tried to document the library as much as possible. "Good docs" is
very subjective. I'm doing my best. :)

That said, check EXAMPLES.txt, GUIDE.txt, and also the RDoc for each of
the main classes (Mechanize::Page, Mechanize::Form).

Hope that helps.
 

Jeremy Woertink

DUDE!!! Before I even try this out, I'm going to give you mad kudos! You
rock!

I will check it out and see what I come up with. I appreciate it.

I have another question. I'm trying to parse the HTML on one page
after I get logged in. I looked at the docs for Hpricot, but I don't
understand exactly how it works.

For example:

doc.search("/html/body//p")

Why does the "p" need two slashes in front?

It works when I do this, but with only one slash it doesn't work.



Thanks,
~Jeremy
 

Phlip

Jeremy said:
doc.search("/html/body//p")

Why does the "p" need two slashes in front?

It works when I do this, but with only one slash it doesn't work.

'/html/body/p' will only match immediate children of <body>. Hence a body/div/p,
for example, won't match. // searches any descendant.
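
A small sketch of the difference, using REXML from the Ruby standard distribution instead of Hpricot (an assumption made so the example runs without extra gems; the sample markup is invented). The XPath expressions are the same two discussed above.

```ruby
require 'rexml/document'

# Invented sample page: one <p> directly under <body>, one nested in a <div>.
html = <<~HTML
  <html>
    <body>
      <p>direct child</p>
      <div><p>nested in a div</p></div>
    </body>
  </html>
HTML

doc = REXML::Document.new(html)

# Single slash: only <p> elements that are immediate children of <body>.
direct = REXML::XPath.match(doc, '/html/body/p').map(&:text)

# Double slash: <p> elements at any depth below <body>.
any_depth = REXML::XPath.match(doc, '/html/body//p').map(&:text)
```

Here direct holds only the first paragraph, while any_depth holds both. If the paragraphs on the real page sit inside divs or tables, that would explain why only the double-slash form finds them.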

Here are tutorials on XPath for unit tests. Hpricot will also support many of
their techniques for functional tests:

http://www.oreillynet.com/onlamp/blog/2007/08/xpath_checker_and_assert_xpath.html
http://www.oreillynet.com/onlamp/blog/2007/08/assert_hpricot_1.html
 

Jeremy Woertink

I have another issue. The fix you gave me worked, but now I'm getting
stuck on another form.

I can log in just fine, but sometimes I'm asked to enter an additional
password. When this form comes up, I can find the form, but when I call
.submit or .click_button it just sits there. I let it sit while I went
to lunch, and when I came back an hour later nothing had happened. I
turned logging on, but I'm not seeing anything helpful. I have to press
Ctrl+C to make it stop, and when I do, I get this error.

c:/ruby/lib/ruby/1.8/net/protocol.rb:133:in `sysread': Interrupt
    from c:/ruby/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
    from c:/ruby/lib/ruby/1.8/timeout.rb:56:in `timeout'
    from c:/ruby/lib/ruby/1.8/timeout.rb:76:in `timeout'
    from c:/ruby/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
    from c:/ruby/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
    from c:/ruby/lib/ruby/1.8/net/protocol.rb:126:in `readline'
    from c:/ruby/lib/ruby/1.8/net/http.rb:2029:in `read_status_line'
    from c:/ruby/lib/ruby/1.8/net/http.rb:2018:in `read_new'
    ... 8 levels...
    from scrape.rb:87:in `each'
    from scrape.rb:87
    from c:/ruby/lib/ruby/gems/1.8/gems/mechanize-0.7.7/lib/www/mechanize.rb:217:in `get'
    from scrape.rb:58
...
87 @store_manager.forms.each do |form|
88   form['passwd'] = @security_key
89   form.click_button
90 end
...

If I follow the same steps in a normal browser, it works fine.

Any ideas?

Thanks,
~Jeremy
 

Jeremy Woertink

I have also tried this code, and it does the same thing:

@store_index = @store_manager.form_with(:name => 'a') do |form|
  form['passwd'] = @security_key
end.submit

It just stops. Is there a way I can see what it is doing? I'm not sure
how to fix this problem. Could it have anything to do with the URL
being https://?

Thanks,
~Jeremy
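
The traceback above ends inside Net::HTTP's read_status_line, so the script was waiting for the server's response when it was interrupted. One way to get more visibility, sketched here with plain Net::HTTP rather than mechanize, is to set explicit timeouts and turn on wire-level debug output. The helper name build_http, the timeout values, and the URL are assumptions for illustration, the keyword arguments need a modern Ruby rather than 1.8, and nothing here is confirmed as the cause or the fix for the hang in this thread.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not open) a Net::HTTP connection
# with explicit timeouts and optional wire-level debug output.
def build_http(url, open_timeout: 10, read_timeout: 30, debug_io: nil)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')  # needed for https:// URLs
  http.open_timeout = open_timeout        # seconds to wait for the connection
  http.read_timeout = read_timeout        # seconds to wait for each read
  # Debug output prints the raw request and response, including any
  # submitted password, so it is for troubleshooting only.
  http.set_debug_output(debug_io) if debug_io
  http
end

# No request is sent until a method such as http.get or http.start is called.
http = build_http('https://example.invalid/secure-form', debug_io: $stderr)
```

With a read timeout in place, a stalled response should raise an exception instead of blocking indefinitely, and the debug output shows how far the exchange got.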

 