Web-crawling

John Bradbury

I am trying to develop a special-purpose crawler using htmllib & urllib.
How do you tell the server application that you are a modern browser and can
handle frames?

Thanks,

John Bradbury
 
Rene Pijlman

John Bradbury:
> I am trying to develop a special-purpose crawler using htmllib & urllib.
> How do you tell the server application that you are a modern browser and can
> handle frames?

I don't know of any "I can handle frames" header and I don't see why the
server would care, but you could mimic the User-agent header sent by a
modern browser.
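
A minimal sketch of setting that header on a single request. It assumes Python 3, where `urllib.request` is the successor to the urllib2 module of the time; the URL and the User-Agent string are placeholders chosen for illustration, not values from this thread.

```python
import urllib.request

# Hypothetical values: substitute the site being crawled and a
# User-Agent string copied from a real browser.
URL = "http://www.example.com/"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request(URL)

# Request stores header names in capitalized form, hence "User-agent" here.
print(request.get_header("User-agent"))

# To actually fetch the page (needs network access):
# with urllib.request.urlopen(request) as response:
#     html = response.read()
```

Whether the server then sends the frames version of the page depends on how it sniffs browsers, so this may or may not be enough on its own.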
 
John Bradbury

I don't know what is causing the problem, but the site I am accessing is
sending out forms for a browser that has a low resolution and does not
support frames. Excuse my ignorance, but where do you set the User-agent
header you suggested?

Many thanks for your prompt reply.

John Bradbury
 
John J. Lee

John Bradbury said:
> Rene Pijlman said:
>> John Bradbury:
>>> I am trying to develop a special-purpose crawler using htmllib & urllib.
>>> How do you tell the server application that you are a modern browser
>>> and can handle frames?
>> [...]
>> server would care, but you could mimic the User-agent header sent by a
>> [...]
> I don't know what is causing the problem, but the site I am accessing is
> sending out forms for a browser that has a low resolution and does not
> support frames. Excuse my ignorance, but where do you set up the
> User-agent header you suggested.

For urllib2 (well, almost):

http://wwwsearch.sourceforge.net/ClientCookie/doc.html#headers
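
For a crawler that fetches many pages, the header can be set once on an opener instead of on every request. The following is only a sketch under assumptions: it uses Python 3's `urllib.request` rather than urllib2 or ClientCookie, and the `make_opener` helper and the User-Agent string are hypothetical names made up for the example.

```python
import urllib.request

# Hypothetical User-Agent string, for illustration only.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"


def make_opener(user_agent=BROWSER_UA):
    """Return an opener that sends the given User-Agent by default."""
    opener = urllib.request.build_opener()
    # addheaders initially holds the library's own "Python-urllib/x.y"
    # User-agent entry; replacing the list swaps in the browser-like value.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


opener = make_opener()

# Every page opened through this opener now carries the header
# (needs network access):
# with opener.open("http://www.example.com/") as response:
#     html = response.read()
```

A header set explicitly on an individual `Request` still takes precedence over the opener's default.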


John
 
