mechanize newbie

Colin Summers

Okay, Ruby in general newbie, but I did the whole shovell project for
RoR, so I felt I was getting somewhere...

I am fooling around trying to make a spider (scraper?) to pull content
off the Forum I read all the time so that I can read it offline.

Mechanize seemed like exactly what I want, but when I try this:

require 'rubygems'; require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get('http://dapo.org/forums/index.php')

pp page

puts "\n\n trying to login... \n\n"

# Fill out the login form
form = page.forms.first
form.vb_login_username = "username"
form.vb_login_md5password = "password"
form.do = "login"
form.s = ""

page = agent.submit(form)

pp page

# pull down a thread
page = agent.get('http://dapo.org/forums/archive/index.php?t-2293.html')

pp page

it doesn't log in (I get a blank page for that last get). Any clues?

Thanks,
--Colin
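
The thread does not record what the eventual fix was, so the following is only a guess at one possible cause. The field name vb_login_md5password in the script above suggests that the forum's login page normally hashes the password with JavaScript before submitting, and mechanize does not run JavaScript. A minimal sketch, assuming the server wants the hex MD5 digest in that field; the helper name md5_login_fields is hypothetical and is not part of mechanize or the forum software:

```ruby
require 'digest/md5'

# Hypothetical helper: builds the values a login form like the one above
# might expect when no JavaScript is available to hash the password.
def md5_login_fields(username, password)
  {
    'vb_login_username'    => username,
    # Assumption: the server expects the hex MD5 digest here, not plain text.
    'vb_login_md5password' => Digest::MD5.hexdigest(password),
    'do'                   => 'login'
  }
end

fields = md5_login_fields('username', 'password')
puts fields['vb_login_md5password']
```

Each value could then be copied onto the form before agent.submit(form), for example with form['vb_login_md5password'] = fields['vb_login_md5password']. It may also be worth checking that page.forms.first really is the login form and not, say, a search box.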
 
jfry

Hi Colin, I can't tell you how to do it in mechanize, but I can say
that what you are trying to do is super easy in Watir: http://openqa.org/watir

Watir (Web Application Testing In Ruby) is primarily used for driving
browser-based test automation, but it has a wonderful API that makes
what you describe very easy. Originally the only choice of browser to
drive was IE, but now the FireWatir and SafariWatir projects are
getting strong as well.

Best of luck, whatever solution you go with,
Jeff
 
Colin Summers

Nathan,

You are correct. I finally figured that part out (with some help from
someone who wrote the same sort of thing in .NET).

Thanks,
--Colin
 
