mechanize question


Peter Szinek

Hello,

I am using mechanize 0.6.3. On Aaron's blog I have found this example:

form.selectlist.options[2].select

However, for me, 'puts form.methods.sort' revealed that form does not
have a method 'selectlist'. What's up? Am I doing something wrong?

Here is the code I am using:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get 'http://www.some-page.com'
form = page.forms.with.name('formname').first

and this form does not have a selectlist method. Just in case: the
class of page.forms.with.name('formname').first is
WWW::Mechanize::Form, not nil or some other kind of nonsense :)


Thanks,
Peter

__
http://www.rubyrailways.com
 

Peter Szinek

Peter said:
Hello,

I am using mechanize 0.6.3. On Aaron's blog I have found this example:

form.selectlist.options[2].select

However, for me, 'puts form.methods.sort' revealed that form does not
have a method 'selectlist'. What's up? Am I doing something wrong?

Meanwhile, after browsing the RDoc of WWW::Mechanize::GlobalForm I have
found that indeed, Form does not have a selectlist method.

OK, but then how am I supposed to get the select list of a form?

The $1,000,000 question: In the same RDoc I read this:

'Class Form does not work in the case there is some invalid (unbalanced)
html involved...'

Well, on my page, the <form> tag is not even closed. Can this be fixed
somehow?

Thanks,
Peter

__
http://www.rubyrailways.com
 

Giles Bowkett

Hey, I don't know the answer to this question specifically, but I did
some work with Mechanize recently and I found that it was pretty much
doing everything we needed it to do, just sometimes returning things
in forms we didn't expect. Pretty much everything we stumbled on, we
solved by getting the return value and doing .class on it to find out
what it was coming back to us as.
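
The tip above (asking a return value for its class) can be sketched in plain Ruby. The `describe` helper is a hypothetical name used only for illustration, not part of Mechanize; it relies only on the core `Object#class` and `Object#methods` calls:

```ruby
# Hypothetical helper (not a Mechanize API): summarize an unknown return
# value by its class and a few of the methods it adds beyond plain Object.
def describe(value)
  own_methods = (value.methods - Object.instance_methods).sort
  "#{value.class}: #{own_methods.first(5).join(', ')}"
end

# Works on any object, e.g. whatever a scraping call handed back.
puts describe([1, 2, 3])
puts describe(nil)
```

Running the same helper on a value returned by Mechanize should show whether it is a single form, a list of forms, or nil.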

Going on the million-dollar question, which I actually only just
noticed: this was a consulting thing, so I don't have the code in front
of me (it's on somebody else's laptop), but I think the HTML we were
working against with Mechanize was pretty bad, totally noncompliant and
non-validating. We used Hpricot a lot, which is pretty great. We might
have actually given up on Mechanize for the HTML parsing and just used
it for setting and getting cookies, things like that. I don't quite
recall, but definitely have a look at Hpricot; I think it was written
by why the lucky stiff.

 

Gregory Brown

Going on the million-dollar question, which I actually only just
noticed: this was a consulting thing, so I don't have the code in front
of me (it's on somebody else's laptop), but I think the HTML we were
working against with Mechanize was pretty bad, totally noncompliant and
non-validating. We used Hpricot a lot, which is pretty great. We might
have actually given up on Mechanize for the HTML parsing and just used
it for setting and getting cookies, things like that. I don't quite
recall, but definitely have a look at Hpricot; I think it was written
by why the lucky stiff.

IIRC, Mechanize now has direct support for Hpricot (and is implemented on top of it).

Aaron is likely to have more info on this, of course.
 

Aaron Patterson

Peter said:
Hello,

I am using mechanize 0.6.3. On Aaron's blog I have found this example:

form.selectlist.options[2].select

However, for me, 'puts form.methods.sort' revealed that form does not
have a method 'selectlist'. What's up? Am I doing something wrong?

Meanwhile, after browsing the RDoc of WWW::Mechanize::GlobalForm I have
found that indeed, Form does not have a selectlist method.

OK, but then how am I supposed to get the select list of a form?

The select list is treated like a regular field. Say you have a select
list with name 'foo', you could find it like this:

form.fields.name('foo')

-or, with method missing magic-

form.foo
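
To illustrate the idea of a select list being looked up like any other named field, here is a minimal plain-Ruby sketch. The classes `SketchOption`, `SketchSelectList` and `SketchForm` are hypothetical stand-ins invented for this example; they are not Mechanize classes and do not reproduce Mechanize's actual implementation:

```ruby
# Hypothetical stand-ins, NOT Mechanize classes: they only illustrate how a
# form can expose a select list as a named field via method_missing.
SketchOption = Struct.new(:value, :selected)

class SketchSelectList
  attr_reader :name, :options

  def initialize(name, values)
    @name = name
    @options = values.map { |v| SketchOption.new(v, false) }
  end

  # Mark exactly one option (by index) as selected.
  def select(index)
    @options.each_with_index { |opt, i| opt.selected = (i == index) }
  end

  # Value of the selected option, or nil if nothing is selected.
  def value
    chosen = @options.find(&:selected)
    chosen && chosen.value
  end
end

class SketchForm
  def initialize(fields)
    @fields = fields
  end

  # Explicit lookup by field name.
  def field(name)
    @fields.find { |f| f.name == name.to_s }
  end

  # "Method missing magic": form.foo falls back to the field named 'foo'.
  def method_missing(name, *args)
    field(name) || super
  end

  def respond_to_missing?(name, include_private = false)
    !field(name).nil? || super
  end
end

form = SketchForm.new([SketchSelectList.new('foo', %w[red green blue])])
form.foo.select(2)
puts form.foo.value
```

The explicit `field('foo')` lookup and the `form.foo` shortcut return the same object here; unknown names still raise NoMethodError.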

The $1,000,000 question: In the same RDoc I read this:

'Class Form does not work in the case there is some invalid (unbalanced)
html involved...'

Well, on my page, the <form> tag is not even closed. Can this be fixed
somehow?

Mechanize uses Hpricot for its HTML parsing. If Hpricot can handle the
form tag that isn't closed, then you should be fine. If Hpricot cannot
handle the unbalanced form tag, you can write a pluggable parser to fix
up your HTML before it is run through Hpricot.
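
As a rough sketch of the "fix up your HTML" step only, the plain-Ruby function below appends missing `</form>` tags before the page is parsed. The name `close_unbalanced_forms` is hypothetical, the repair is a naive count of opening and closing tags rather than real parsing, and wiring it into a pluggable parser is not shown because that depends on the Mechanize version:

```ruby
# Hypothetical helper (not part of Mechanize or Hpricot): append closing
# </form> tags when a page opens more forms than it closes.
# This is a naive count-based repair, not a real HTML parser.
def close_unbalanced_forms(html)
  opened = html.scan(/<form\b/i).size
  closed = html.scan(%r{</form\s*>}i).size
  missing = opened - closed
  return html if missing <= 0

  closing = '</form>' * missing
  # Insert before </body> when present, otherwise append at the end.
  if html =~ %r{</body\s*>}i
    html.sub(%r{</body\s*>}i) { |tag| closing + tag }
  else
    html + closing
  end
end

broken = '<html><body><form name="formname"><select name="foo"></select></body></html>'
puts close_unbalanced_forms(broken)
```

Pages whose forms are already balanced are returned unchanged, so the function is safe to apply to every response body.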

Hope that helps.

--Aaron
 

_why

Mechanize uses Hpricot for its HTML parsing. If Hpricot can handle the
form tag that isn't closed, then you should be fine. If Hpricot cannot
handle the unbalanced form tag, you can write a pluggable parser to fix
up your HTML before it is run through Hpricot.

If Hpricot cannot handle the tag, please open a ticket[1] or mail me, so I can
fix it and add to my tests. This way we all get to benefit from these wild tags
you've captured.

_why

[1] https://code.whytheluckystiff.net/hpricot/
 

Peter Szinek

_why said:
Mechanize uses Hpricot for its HTML parsing. If Hpricot can handle the
form tag that isn't closed, then you should be fine. If Hpricot cannot
handle the unbalanced form tag, you can write a pluggable parser to fix
up your HTML before it is run through Hpricot.

If Hpricot cannot handle the tag, please open a ticket[1] or mail me, so I can
fix it and add to my tests. This way we all get to benefit from these wild tags
you've captured.

Sure :) I am just beginning with Mechanize, so I don't have a real-life
test case yet, but since I am going to scrape tens or maybe hundreds of
pages with Hpricot + Mechanize in the near future, I guess something
will pop up sooner or later...

Peter

__
http://www.rubyrailways.com
 
