urllib2 weirdness when https_proxy environment variable is exported

Devraj

Hi Everyone,

I have been extensively using Python's urllib2 while developing a
project with the Google Data API. The Google Data API uses httplib to
place all of its requests. However I have been using urllib2 and some
handlers that I discovered in an ASPN article to handle HTTPS proxies
in my code.

The Google Data API relies on an environment variable called
https_proxy to get information about the proxy to be used. However
urllib2 starts spitting out the BadStatusLine exception if the
https_proxy environment variable is found.

Has anyone experienced similar things with urllib2? Is this a bug in
the urllib2 libraries or am I completely missing something here?

Attached is a dump of the error messages. Any experiences, information,
or findings are welcome.

Thanks for your time.

Regards,
Devraj

------

Notice the BadStatusLine exception at the end of the error messages.
The https_proxy variable is not even used by my code in any way. I
have also written simpler examples of the code that demonstrate the
same behaviour.

root@sidux:/data/gdatacopier# export https_proxy="http://proxy2:8080"
root@sidux:/data/gdatacopier# ./gdoc-cp.py --username (e-mail address removed) --list-all
gdoc-cp.py version 1.0, content copy & backup utility for Google documents & spreadsheets
Distributed under the GNU/GPL v2, Copyright (c) De Bortoli Wines <http://debortoli.com.au>
Password:
Logging into Google server as (e-mail address removed) ...Traceback (most recent call last):
  File "./gdoc-cp.py", line 352, in <module>
    main()
  File "./gdoc-cp.py", line 347, in main
    parse_user_options()
  File "./gdoc-cp.py", line 298, in parse_user_options
    handle_login(_username, _password)
  File "./gdoc-cp.py", line 137, in handle_login
    _copier.login(username, password)
  File "/media/disk/gdatacopier/gdatacopier.py", line 310, in login
    response = self._open_https_url(prepared_auth_url, login_data)
  File "/media/disk/gdatacopier/gdatacopier.py", line 555, in _open_https_url
    response = opener.open(target_url, urllib.urlencode(post_data))
  File "/usr/lib/python2.5/urllib2.py", line 381, in open
    response = self._open(req, data)
  File "/usr/lib/python2.5/urllib2.py", line 399, in _open
    '_open', req)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 675, in <lambda>
    meth(r, proxy, type))
  File "/usr/lib/python2.5/urllib2.py", line 698, in proxy_open
    return self.parent.open(req)
  File "/usr/lib/python2.5/urllib2.py", line 381, in open
    response = self._open(req, data)
  File "/usr/lib/python2.5/urllib2.py", line 399, in _open
    '_open', req)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 1107, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/media/disk/gdatacopier/gdatacopier.py", line 195, in do_open
    return urllib2.HTTPHandler.do_open(self, ProxyHTTPConnection, req)
  File "/usr/lib/python2.5/urllib2.py", line 1080, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.5/httplib.py", line 924, in getresponse
    response.begin()
  File "/usr/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.5/httplib.py", line 349, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine
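
The proxy_open frames in the traceback above suggest that urllib2 picked up the exported variable by itself, even though the script never reads it. A minimal sketch of that mechanism follows. It assumes nothing about gdatacopier.py. The try/except import is only there so the snippet also runs on Python 3, where the same functions live in urllib.request.

```python
import os

try:
    import urllib2 as urlrequest          # Python 2, as in the traceback
except ImportError:
    import urllib.request as urlrequest   # Python 3 home of the same handlers

# Simulate `export https_proxy="http://proxy2:8080"` from the shell session.
os.environ["https_proxy"] = "http://proxy2:8080"

# getproxies() is what a ProxyHandler consults when it is built
# without an explicit proxies mapping.
detected = urlrequest.getproxies()
print(detected.get("https"))

# A default ProxyHandler gains an <scheme>_open method for every scheme
# it found, so https requests get routed through proxy_open.
default_handler = urlrequest.ProxyHandler()
print(hasattr(default_handler, "https_open"))
```

Since build_opener() installs such a default ProxyHandler unless told otherwise, any opener built that way would see the variable regardless of what the calling code does.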

Exporting the http_proxy variable does what one would expect:

root@sidux:/data/gdatacopier# export http_proxy="http://proxy2:8080"
root@sidux:/data/gdatacopier# ./gdoc-cp.py --username (e-mail address removed) --list-all
gdoc-cp.py version 1.0, content copy & backup utility for Google documents & spreadsheets
Distributed under the GNU/GPL v2, Copyright (c) De Bortoli Wines <http://debortoli.com.au>
Password:
Logging into Google server as (e-mail address removed) ...Traceback (most recent call last):
  File "./gdoc-cp.py", line 352, in <module>
    main()
  File "./gdoc-cp.py", line 347, in main
    parse_user_options()
  File "./gdoc-cp.py", line 298, in parse_user_options
    handle_login(_username, _password)
  File "./gdoc-cp.py", line 137, in handle_login
    _copier.login(username, password)
  File "/media/disk/gdatacopier/gdatacopier.py", line 292, in login
    self._gd_client.ProgrammaticLogin()
  File "/usr/lib/python2.5/site-packages/gdata/service.py", line 307, in ProgrammaticLogin
    auth_connection.endheaders()
  File "/usr/lib/python2.5/httplib.py", line 856, in endheaders
    self._send_output()
  File "/usr/lib/python2.5/httplib.py", line 728, in _send_output
    self.send(msg)
  File "/usr/lib/python2.5/httplib.py", line 695, in send
    self.connect()
  File "/usr/lib/python2.5/httplib.py", line 1130, in connect
    sock.connect((self.host, self.port))
  File "<string>", line 1, in connect
socket.gaierror: (-2, 'Name or service not known')
 
John J. Lee

Devraj said:
I have been extensively using Python's urllib2 while developing a
project with the Google Data API. The Google Data API uses httplib to
place all of its requests. However I have been using urllib2 and some
handlers that I discovered in an ASPN article to handle HTTPS proxies
in my code.

The Google Data API relies on an environment variable called
https_proxy to get information about the proxy to be used. However
urllib2 starts spitting out the BadStatusLine exception if the
https_proxy environment variable is found.
[...]

This is because urllib2 does not support HTTPS proxies (and neither does
urllib). See the Python Cookbook for a hack to get it working.


John
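
If the script already carries its own HTTPS proxy handlers, one possible way to stop urllib2 from also applying the environment setting is to pass an empty mapping to ProxyHandler when building the opener. The sketch below is not the Cookbook recipe and makes no claim about how gdatacopier.py builds its opener. The try/except import only keeps it runnable on Python 3 as well.

```python
import os

try:
    import urllib2 as urlrequest          # Python 2
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent

# Same environment as in the failing session.
os.environ["https_proxy"] = "http://proxy2:8080"

# An empty mapping tells ProxyHandler to skip environment detection.
no_env_proxy = urlrequest.ProxyHandler({})

# build_opener leaves out its default ProxyHandler when one is supplied,
# so this opener does not route requests through $https_proxy on its own.
# Custom handlers (hypothetical here) would be passed alongside it.
opener = urlrequest.build_opener(no_env_proxy)

active = any(isinstance(h, urlrequest.ProxyHandler) for h in opener.handlers)
print(active)
```

The variable stays exported, so code that reads https_proxy directly is unaffected. Whether this is enough to avoid the BadStatusLine in the original script is an assumption that would need testing against the real proxy.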
 
Devraj

Hi John,

Thanks for that.

Do you have any URLs where I can see an example of the hack?


 
