urllib and sites that require passwds

bob_smith_17280

Hello,

I'm doing a small website survey as a consultant for a company that has
a large private LAN. Basically, I'm trying to determine how many web
sites there are on their network and what content the sites contain
(scary that they don't know this, but I suspect many companies are this
way).

Everything is going fine so far except for sites that require passwords
to be accessed. I don't want to view content on these sites; I only
want to note that they are password protected, make a list of them and
move on. The problem is that urllib hangs waiting for a username/password
to be entered. Is there a graceful way to deal with this?

Many thanks,
Bob
 
Fuzzyman

Use urllib2, which fails with an exception instead of hanging. You can
trap this exception and use the code attribute of the exception object
to determine why it failed. The error code for 'authentication required'
is 401.

Off the top of my head:

import urllib2

# theurl holds the address being checked
req = urllib2.Request(theurl)
try:
    handle = urllib2.urlopen(req)
except IOError, e:
    if not hasattr(e, 'code'):
        # No HTTP status at all, e.g. a bad host name
        print 'The url appears to be invalid.'
        print e.reason
    elif e.code == 401:
        print theurl, 'is protected with a password.'
    else:
        print 'We failed with error code', e.code

HTH

Regards,

Fuzzy
http://www.voidspace.org.uk/python/index.shtml
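
For anyone reading this on Python 3, where urllib2 was split into urllib.request and urllib.error, the same idea might look roughly like the sketch below. The names check_site and protected_sites and the opener parameter are made up for illustration (the opener argument is only there so the logic can be tried without touching a network), and the 10 second timeout is an arbitrary assumption.

```python
import urllib.error
import urllib.request


def check_site(url, opener=urllib.request.urlopen, timeout=10):
    """Return 'open', 'protected', 'http-error <code>' or 'unreachable'.

    check_site and its opener parameter are hypothetical helpers for this
    sketch, not part of any library.
    """
    try:
        # urlopen raises an exception rather than prompting for credentials.
        response = opener(url, timeout=timeout)
        response.close()
        return "open"
    except urllib.error.HTTPError as e:
        # HTTPError carries the HTTP status; 401 means authentication required.
        if e.code == 401:
            return "protected"
        return "http-error %d" % e.code
    except OSError:
        # URLError is an OSError subclass: bad host name, refused connection,
        # timeouts and similar failures that never produced an HTTP status.
        return "unreachable"


def protected_sites(urls, opener=urllib.request.urlopen):
    """Collect the URLs that answered with 401 (hypothetical helper)."""
    return [u for u in urls if check_site(u, opener) == "protected"]
```

Calling protected_sites with the list of addresses found on the LAN would give the list of password-protected sites without fetching any of their content. HTTPError has to be caught before the more general OSError, because it is a subclass of URLError.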
 
Fuzzyman

Damn... I'm losing my leading spaces. The indentation should be obvious
anyway (everything below the except line is indented at least one step).

Fuzzy
 
Ishwor

Fuzzyman said:
> damn... I'm losing my leading spaces.... indentation should be obvious
> anyway... (everything below except is indented at least one step).

We'll forgive you for that. It was from "top-of-your-head" ~;-)

It's nice that urllib2 returns an error code to process further. Doesn't
urllib do that?

Anyway, I wanted to know whether there is any website similar to the
CPAN library website. I mean, I want to be able to find modules and
stuff for Python. It would be really great to know.

Thanks.
 
Fuzzyman

Ishwor said:
> We'll forgive you for that. It was from "top-of-your-head" ~;-)

Hey - I put the indentation in there... it just got stripped out when
it was posted! :)

> It's nice that urllib2 returns an error code to process further.
> Doesn't urllib do that?

The OP is saying that it hangs rather than returning an error. I
haven't tested it. In general urllib2.urlopen is much better than
urllib.urlopen. urllib has some other useful functions though.

> Anyway, I wanted to know whether there is any website similar to the
> CPAN library website. I mean, I want to be able to find modules and
> stuff for Python. It would be really great to know.

There are PyPI and the Vaults of Parnassus. Neither is really like
CPAN. There has been lots of talk about it recently - everyone agrees
we need one... but no one is offering the bandwidth or the code.

There are lots of modules available though - and usually not too hard
to track down.

Regards,

Fuzzy
http://www.voidspace.org.uk/python/index.shtml
 
