webspider getting stuck

notnorwegian

I am writing a simple web spider.

How do I avoid getting stuck at a prompt like this?

Enter username for W3CACL at www.w3.org:

I can obviously add an if-clause for that specific site, but since I expect
there will be more sites like it, that is not a viable approach in the
long run.
 
John Nagle

> I am writing a simple web spider.
>
> How do I avoid getting stuck at a prompt like this?
>
> Enter username for W3CACL at www.w3.org:


It's a silly feature of urllib. See

http://docs.python.org/lib/module-urllib.html

where it says:

"Note: When performing basic authentication, a FancyURLopener instance calls its
prompt_user_passwd() method. The default implementation asks the users for the
required information on the controlling terminal. A subclass may override this
method to support more appropriate behavior if needed."

Yes, the default behavior when faced with a site that wants authentication
is to ask for a user name and password on standard input. This is
seldom what you want.

So subclass FancyURLopener and override prompt_user_passwd().

John Nagle
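
A minimal sketch of the subclass-and-override advice above, not a definitive implementation. The names NoPromptOpener and fetch are made up for illustration. It assumes that returning an empty (None, None) pair from prompt_user_passwd() is enough to make the opener give up on the protected page instead of prompting. The import is written to work on Python 2, where FancyURLopener lives in urllib, and on Python 3 releases before 3.14, where it lives in urllib.request (deprecated there, and removed in 3.14).

```python
try:
    # Python 3 before 3.14 (the class is deprecated there)
    from urllib.request import FancyURLopener
except ImportError:
    # Python 2, the version this thread is about
    from urllib import FancyURLopener


class NoPromptOpener(FancyURLopener):
    """Opener that never asks for credentials on the terminal."""

    def prompt_user_passwd(self, host, realm):
        # Supply no credentials instead of prompting, so the spider
        # cannot block waiting for keyboard input.
        return None, None


def fetch(url):
    """Return the page body, or None if the page could not be fetched."""
    opener = NoPromptOpener()
    try:
        page = opener.open(url)
    except IOError:
        # Network failures and HTTP errors raised by the opener end up here.
        return None
    if page is None:
        # Assumption: open() can hand back None when no credentials
        # were supplied for a protected page.
        return None
    try:
        return page.read()
    finally:
        page.close()
```

In the spider loop, call fetch(url) and simply skip the URL when it returns None, so password-protected pages are treated like any other page that cannot be retrieved.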
 
