Trying to make a spider using mechanize

tedpottel

Hi,

I can read the home page using the mechanize lib. Is there a way to
load web pages using just filename.html instead of
servername/filename.html? Lots of times the links just have the file
name. I'm trying to read in the link names and then visit those pages.

Here is the sample code I am using.


import ClientForm
import mechanize


#get home page
request = mechanize.Request("http://www.activetechconsulting.com")
response = mechanize.urlopen(request)
print response.read()

#sub page (this does not work)
request = mechanize.Request("service.html")
response = mechanize.urlopen(request)
print response.read()

-Ted
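
One way this might be approached, as a minimal sketch: a bare file name such as service.html is a relative link, so it probably has to be joined with the URL of the page it was found on before it can be requested. The standard library's urljoin does that joining. The helper name resolve_link and the BASE_URL constant are made up for illustration and are not part of mechanize.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2, which the code in this post targets

# URL of the page the links were read from (taken from the post).
BASE_URL = "http://www.activetechconsulting.com"


def resolve_link(base_url, href):
    """Hypothetical helper: turn a link as written in a page into a full URL."""
    return urljoin(base_url, href)


# A bare file name is resolved against the page it came from.
print(resolve_link(BASE_URL, "service.html"))
# Links that are already absolute pass through unchanged.
print(resolve_link(BASE_URL, "http://example.com/other.html"))
```

The resolved string could then be passed to mechanize.Request in place of the bare "service.html". If I read the mechanize docs correctly, mechanize.Browser also exposes links() whose items carry an absolute_url attribute, which may make the manual joining unnecessary.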