urllib2 - 403 that _should_ not occur.

James Mills

Hey all,

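The following fails for me:

from urllib2 import urlopen
f = urlopen("http://groups.google.com/group/chromium-announce/feed/rss_v2_0_msgs.xml")
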
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/urllib2.py", line 124, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.6/urllib2.py", line 389, in open
response = meth(req, response)
File "/usr/lib/python2.6/urllib2.py", line 502, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.6/urllib2.py", line 427, in error
return self._call_chain(*args)
File "/usr/lib/python2.6/urllib2.py", line 361, in _call_chain
result = func(*args)
File "/usr/lib/python2.6/urllib2.py", line 510, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
However, that _same_ url works perfectly fine on the
same machine (and same network) using any of:
* curl
* wget
* elinks
* firefox

Any helpful ideas ?

cheers
James
 
Philip Semanchuk

Hey all,

The following fails for me:

Traceback (most recent call last): [...]
Any helpful ideas ?

Maybe raise a real bug @ bugs.python.org instead of just mentioning it
like I did: http://bugs.python.org/msg77889

I think at least some sites would be willing to add the new UA to
their whitelists.

I don't think I understand you clearly. Whether or not Google et al
whitelist the Python UA isn't a Python issue, is it?
 
Steve Holden

Philip said:
Hey all,

The following fails for me:

from urllib2 import urlopen
f = urlopen("http://groups.google.com/group/chromium-announce/feed/rss_v2_0_msgs.xml")


Traceback (most recent call last): [...]
Any helpful ideas ?

Maybe raise a real bug @ bugs.python.org instead of just mentioning it
like I did: http://bugs.python.org/msg77889

I think at least some sites would be willing to add the new UA to
their whitelists.

I don't think I understand you clearly. Whether or not Google et al
whitelist the Python UA isn't a Python issue, is it?
I'd say it's an issue relevant to Python users, which would seem to put
it pretty much in the mainstream for c.l.py - especially as the code
causing concern was written in Python.

regards
Steve
 
Philip Semanchuk

Philip said:
On Jan 11, 11:59 pm, "James Mills" <[email protected]>
wrote:
Hey all,

The following fails for me:

from urllib2 import urlopen
f = urlopen("http://groups.google.com/group/chromium-announce/feed/rss_v2_0_msgs.xml")


Traceback (most recent call last):
[...]
Any helpful ideas ?

Maybe raise a real bug @ bugs.python.org instead of just
mentioning it
like I did: http://bugs.python.org/msg77889

I think at least some sites would be willing to add the new UA to
their whitelists.

I don't think I understand you clearly. Whether or not Google et al
whitelist the Python UA isn't a Python issue, is it?
I'd say it's an issue relevant to Python users, which would seem to put
it pretty much in the mainstream for c.l.py - especially as the code
causing concern was written in Python.

I didn't mean to imply that the conversation didn't belong here. I
think that is perfectly appropriate. What I don't understand is the
suggestion that Google's server config should be raised as a bug
against Python. (i.e. "raise a real bug @ bugs.python.org...")
 
Steve Holden

Philip said:
Philip said:
On Jan 12, 2009, at 6:48 PM, ajaksu wrote:

On Jan 11, 11:59 pm, "James Mills" <[email protected]>
wrote:
Hey all,

The following fails for me:

from urllib2 import urlopen
f = urlopen("http://groups.google.com/group/chromium-announce/feed/rss_v2_0_msgs.xml")



Traceback (most recent call last):
[...]
Any helpful ideas ?

Maybe raise a real bug @ bugs.python.org instead of just mentioning it
like I did: http://bugs.python.org/msg77889

I think at least some sites would be willing to add the new UA to
their whitelists.

I don't think I understand you clearly. Whether or not Google et al
whitelist the Python UA isn't a Python issue, is it?
I'd say it's an issue relevant to Python users, which would seem to put
it pretty much in the mainstream for c.l.py - especially as the code
causing concern was written in Python.

I didn't mean to imply that the conversation didn't belong here. I think
that is perfectly appropriate. What I don't understand is the suggestion
that Google's server config should be raised as a bug against Python.
(i.e. "raise a real bug @ bugs.python.org...")
Oh, I see! Yes, it's hard to know what actions anyone could take on such
a bug report. I suppose the documentation could be modified to describe
how some services require specific agents, but that wouldn't help a huge
amount.

regards
Steve
 
Falcolas

Hey all,

The following fails for me:

For what it's worth, I've had a similar problem with the urlopen as
well. Using the library default urlopen results in an error, but if I
build an opener with the basic handlers, it works just fine.
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
f = urllib2.urlopen("http://localhost:8000")
File "C:\Python25\lib\urllib2.py", line 121, in urlopen
return _opener.open(url, data)
File "C:\Python25\lib\urllib2.py", line 380, in open
response = meth(req, response)
File "C:\Python25\lib\urllib2.py", line 491, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python25\lib\urllib2.py", line 418, in error
return self._call_chain(*args)
File "C:\Python25\lib\urllib2.py", line 353, in _call_chain
result = func(*args)
File "C:\Python25\lib\urllib2.py", line 499, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
'something relevant'
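
The post above does not show which handlers went into the working opener, so the following is only a sketch of one plausible variant, not a reconstruction of that code. It assumes Python 3's urllib.request (the successor to urllib2) and assumes the relevant difference is the User-Agent the opener sends. The helper name opener_with_agent and the agent string "example-client/0.1" are made up for illustration.

```python
import urllib.request


def opener_with_agent(user_agent):
    """Build an opener that sends `user_agent` instead of the library default."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs applied to every request
    # made through this opener; replacing it drops the default User-Agent.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


# The default opener advertises itself as "Python-urllib/<version>".
default_agent = dict(urllib.request.build_opener().addheaders)["User-agent"]
print(default_agent)

# Hypothetical agent string; opener.open(url) would then send it.
custom = opener_with_agent("example-client/0.1")
print(custom.addheaders)
```

Printing the default agent first is a quick way to see exactly which string a server-side filter would be matching against.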
 
ajaksu

I don't think I understand you clearly. Whether or not Google et al  
whitelist the Python UA isn't a Python issue, is it?

Hi, sorry for taking so long to reply :)

I imagine it's something akin to Firefox's 'Report broken website':
evangelism.

IMHO, if the PSF *cough* Steve *cough* or individual Python hackers
can contact key sites (as Wikipedia, groups.google, etc.) the issue
can be solved sooner.

Instead of waiting for each whitelist maintainer to find out we have
a new UA, go out and tell them. A template for such requests could
help those inside e.g. Google to bring the issue to the attention of
the whitelist admins. The community has lots of connections that could
be useful to pass the message along, if only 'led by the nose' to
achieve that :)

Hence, the suggestion to raise a bug.

Regards,
Daniel
 
Philip Semanchuk

Hi, sorry for taking so long to reply :)

I imagine it's something akin to Firefox's 'Report broken website':
evangelism.

IMHO, if the PSF *cough* Steve *cough* or individual Python hackers
can contact key sites (as Wikipedia, groups.google, etc.) the issue
can be solved sooner.

Instead of waiting for each whitelist maintainer to find out we have
a new UA, go out and tell them. A template for such requests could
help those inside e.g. Google to bring the issue to the attention of
the whitelist admins. The community has lots of connections that could
be useful to pass the message along, if only 'led by the nose' to
achieve that :)

Hence, the suggestion to raise a bug.

Gotcha.

In this case I think there is no whitelist. I think Google has a
default accept policy supplemented with a blacklist rather than a
default ban policy mitigated by a whitelist. As evidence I submit the
fact that my user agent of "funny fish" was accepted. In other words,
Google has taken explicit steps to ban agents sending the default
Python UA. Now, if the default UA changed in Python 3.0, maybe the
best thing to do is keep quiet and maybe it will fly under the Google
radar for a while. =)

Cheers
Philip
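
For readers who want to repeat the "funny fish" experiment described above, here is a minimal sketch of sending a custom User-Agent on a single request. It assumes Python 3's urllib.request; under Python 2 the equivalent class is urllib2.Request, which accepts the same headers argument. The helper name build_request is hypothetical, and the network call is left commented out because the server's current behaviour is unknown.

```python
import urllib.request

FEED_URL = "http://groups.google.com/group/chromium-announce/feed/rss_v2_0_msgs.xml"


def build_request(url, user_agent):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request(FEED_URL, "funny fish")

# Request normalises header names with str.capitalize(), hence "User-agent".
print(request.get_header("User-agent"))

# The actual fetch is omitted from this sketch:
# response = urllib.request.urlopen(request)
```

If a request with an arbitrary agent succeeds while the default one gets a 403, that points to a blacklist on the default string rather than a whitelist of known browsers, which is the argument made in the post above.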
 
Steve Holden

ajaksu said:
Hi, sorry for taking so long to reply :)

I imagine it's something akin to Firefox's 'Report broken website':
evangelism.

IMHO, if the PSF *cough* Steve *cough* or individual Python hackers
can contact key sites (as Wikipedia, groups.google, etc.) the issue
can be solved sooner.

Instead of waiting for each whitelist maintainer to find out we have
a new UA, go out and tell them. A template for such requests could
help those inside e.g. Google to bring the issue to the attention of
the whitelist admins. The community has lots of connections that could
be useful to pass the message along, if only 'led by the nose' to
achieve that :)

Hence, the suggestion to raise a bug.
OK, but be aware that the PSF doesn't monitor the bugs looking for
actions to take on behalf of the Python user community. In fact we
aren't overtly "political" in this way at all. This doesn't mean it
wouldn't be useful for the PSF to get involved in this role; just that
right now it isn't, and a bug report probably isn't the best way to get
action.

regards
Steve
 
ajaksu

ajaksu said:
[snip evangelism stuff]
OK, but be aware that the PSF doesn't monitor the bugs looking for
actions to take on behalf of the Python user community. In fact we
aren't overtly "political" in this way at all. This doesn't mean it
wouldn't be useful for the PSF to get involved in this role; just that
right now it isn't, and a bug report probably isn't the best way to get
action.

Acknowledged. I have posted a (pretty poor) support request @
http://groups.google.com/group/Google-Groups-Basics/ and suggest
others do the same for Wikipedia and other big sites that block 3.0 (I
might build a list of those later today). Maybe a wiki page, some blog
posts, etc.

Best regards,
Daniel

Request: http://groups.google.com/group/Google-Groups-Basics/browse_thread/thread/498a39a89d81b650#
"""
Hi,
As mentioned in a comp.lang.python thread[1], the new version of
Python (3.0) cannot open pages @ groups.google.com.

It seems the UA of Python 3.0 ("User-Agent: Python-urllib/3.1") is
actively blocked, while that of Python 2.5 ("User-Agent: Python-urllib/1.17")
isn't.

This message is a call for help so that we can get Python 3.0 working
with groups.google.com. Is this the right place to bring the issue to
the attention of those that can fix it? Does anyone have a contact
that could speed up getting Python 3.0 working?

Thanks in advance,
Daniel

[1] http://groups.google.com/group/comp.lang.python/browse_thread/thread/088491d5a0d86f1b
"""
 
Steve Holden

ajaksu said:
ajaksu said:
[snip evangelism stuff]
OK, but be aware that the PSF doesn't monitor the bugs looking for
actions to take on behalf of the Python user community. In fact we
aren't overtly "political" in this way at all. This doesn't mean it
wouldn't be useful for the PSF to get involved in this role; just that
right now it isn't, and a bug report probably isn't the best way to get
action.

Acknowledged. I have posted a (pretty poor) support request @
http://groups.google.com/group/Google-Groups-Basics/ and suggest
others do the same for Wikipedia and other big sites that block 3.0 (I
might build a list of those later today). Maybe a wiki page, some blog
posts, etc.

Best regards,
Daniel

Request: http://groups.google.com/group/Google-Groups-Basics/browse_thread/thread/498a39a89d81b650#
"""
Hi,
As mentioned in a comp.lang.python thread[1], the new version of
Python (3.0) cannot open pages @ groups.google.com.

It seems the UA of Python 3.0 ("User-Agent: Python-urllib/3.1") is
actively blocked, while that of Python 2.5 ("User-Agent: Python-urllib/1.17")
isn't.

This message is a call for help so that we can get Python 3.0 working
with groups.google.com. Is this the right place to bring the issue to
the attention of those that can fix it? Does anyone have a contact
that could speed up getting Python 3.0 working?

Thanks in advance,
Daniel

[1] http://groups.google.com/group/comp.lang.python/browse_thread/thread/088491d5a0d86f1b
"""
Thanks very much. It's good to see Python users taking action that will
lead to benefits for all. Congratulations on taking the initiative.

regards
Steve
 
