[Mechanize.ClientForm] double reading from urllib2.urlopen

tiktak.hodiki

Hello, folks!
I use mechanize's ClientForm to parse HTML forms. Before parsing, I
check the response by calling response.read().find("..."). But when I
then pass the response to ClientForm.ParseResponse, it can't parse
anything, because response.read() now returns zero-length text. The
problem is that ClientForm.ParseResponse does not accept the text of
the response, only the response object.

Example:

import urllib
from ClientForm import ParseResponse
response = urllib.urlopen("http://yandex.ru")
if -1 != response.read().find("foobar"):
    pass
form = ParseResponse(response)[1]  # <-- raises IndexError
 
MRAB

It might be that read() is consuming the data, so there's none remaining
for the second read(). Try:

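from StringIO import StringIO  # Python 2, to match the code above
from ClientForm import ParseFile

response = urllib.urlopen("http://yandex.ru")
text = response.read()  # read the body once and keep it
if "foobar" in text: # preferred to find()
    pass
# ParseResponse needs a file-like object, so wrap the saved text
form = ParseFile(StringIO(text), response.geturl())[1]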
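
To see why the second read() comes back empty, here is a minimal sketch that needs no network access. It assumes an in-memory io.BytesIO stream as a stand-in for the HTTP response, and the HTML in it is made up for illustration. Any file-like object that is read to the end should behave the same way.

```python
import io

# Stand-in for the HTTP response (an assumption for illustration):
# a file-like object holding a small, made-up HTML body.
response = io.BytesIO(b"<html><form action='/search'></form></html>")

first = response.read()   # consumes the whole stream
second = response.read()  # nothing is left, so this is empty

# Keep the body from the first read and wrap it in a fresh
# file-like object for any parser that wants to call read() itself.
replay = io.BytesIO(first)
replayed = replay.read()

print(len(first), len(second), replayed == first)
```

The point is to read the body once, keep it in a variable, and give each consumer its own file-like wrapper around that saved copy.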
 
