urllib2 httplib.BadStatusLine exception while opening a page on an Oracle HTTP Server

ak

Hi everyone,

I have a problem with urllib2 on this particular url, hosted on an
Oracle HTTP Server

http://www.orange.sk/eshop/sk/portal/catalog.html?type=post&subtype=phone&null

which gets 302 redirected to https://www.orange.sk/eshop/sk/catalog/post/phones.html,
after setting a cookie through the Set-Cookie header field in the 302
reply. This works fine with Firefox.

However, with urllib2 and the following code snippet, it doesn't work:


--------
import cookielib
import urllib2

# Keep cookies between requests so the Set-Cookie from the 302 reply is resent.
cookiejar = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
url = 'http://www.orange.sk/eshop/sk/portal/catalog.html?type=post&subtype=phone&null'
req = urllib2.Request(url, None)
s = opener.open(req)
--------

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/urllib2.py", line 387, in open
    response = meth(req, response)
  File "/usr/lib/python2.5/urllib2.py", line 498, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.5/urllib2.py", line 419, in error
    result = self._call_chain(*args)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 582, in http_error_302
    return self.parent.open(new)
  File "/usr/lib/python2.5/urllib2.py", line 381, in open
    response = self._open(req, data)
  File "/usr/lib/python2.5/urllib2.py", line 399, in _open
    '_open', req)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 1115, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.5/urllib2.py", line 1080, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.5/httplib.py", line 928, in getresponse
    response.begin()
  File "/usr/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.5/httplib.py", line 349, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine

Trying the redirected URL directly doesn't work either. Firefox gives
an HTML error page, as the cookie is not set yet, but urllib2 raises
the same exception as before, whereas it should return the HTML error
page. This works correctly on other URLs on this website
(http(s)://www.orange.sk).

Am I doing anything wrong, or is this a bug in urllib2?

-- ak
 
ak

Actually, I was wrong on the last point: this does *not* work on
https://www.orange.sk (but does on http://www.orange.sk). IMHO, this
means that either urllib2 or the server implements HTTPS incorrectly.

Here's some output with debuglevel=1:
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Mon, 19 Jan 2009 21:44:03 GMT
header: Server: Oracle-Application-Server-10g/10.1.3.1.0 Oracle-HTTP-Server
header: Set-Cookie: JSESSIONID=0a19055a30d630c427bda71d4e26a37ca604b9f590dc.e3eNaNiRah4Pe3aSch8Sc3yOc40; path=/web
header: Expires: Mon, 19 Jan 2009 21:44:13 GMT
header: Surrogate-Control: max-age="10"
header: Content-Type: text/html; charset=ISO-8859-2
header: X-Cache: MISS from www.orange.sk
header: Connection: close
header: Transfer-Encoding: chunked
<addinfourl at 137417292 whose fp = <socket._fileobject object at 0x831348c>>
reply: ''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/urllib2.py", line 381, in open
    response = self._open(req, data)
  File "/usr/lib/python2.5/urllib2.py", line 399, in _open
    '_open', req)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 1115, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.5/urllib2.py", line 1080, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.5/httplib.py", line 928, in getresponse
    response.begin()
  File "/usr/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.5/httplib.py", line 349, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine

As you can see, the reply from the server seems to be empty (which
results in the BadStatusLine exception).
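
For reference, debug output like the above can be obtained by passing
debuglevel=1 to the HTTP and HTTPS handlers. The following is only a
minimal sketch, and it assumes the Python 3 module names
(urllib.request, http.cookiejar) rather than the Python 2.5
urllib2/cookielib used in this thread; build_debug_opener is a
hypothetical helper name, not something from the original code.

```python
import http.cookiejar
import urllib.request


def build_debug_opener(debuglevel=1):
    """Return (opener, cookie_jar) with wire-level debug output enabled.

    Hypothetical helper: the name and signature are illustrative only.
    """
    cookie_jar = http.cookiejar.LWPCookieJar()
    opener = urllib.request.build_opener(
        # Passing handler instances replaces the default handlers of the
        # same type, so each request goes through the debug-enabled ones.
        urllib.request.HTTPHandler(debuglevel=debuglevel),
        urllib.request.HTTPSHandler(debuglevel=debuglevel),
        urllib.request.HTTPCookieProcessor(cookie_jar),
    )
    return opener, cookie_jar
```

Calling opener.open(url) on the returned opener should then print the
"reply:" and "header:" lines for each request, including the HTTPS
request that follows the redirect.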

Any help greatly appreciated.

-- ak
 
Steven D'Aprano

Looking at the BadStatusLine exception raised, the server response line
is empty. Looking at the source for httplib suggests to me that the
server closed the connection early. Perhaps it doesn't like connections
from urllib2?

I ran a test pretending to be IE using this code:

import cookielib
import httplib
import urllib2

cookiejar = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
url = 'http://www.orange.sk/eshop/sk/portal/catalog.html?' \
    'type=post&subtype=phone&null'
# Pretend to be Internet Explorer 6.
agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; " \
    "NeosBrowser; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
headers = {'User-Agent': agent}
req = urllib2.Request(url, data=None, headers=headers)
try:
    s = opener.open(req)
except httplib.BadStatusLine, e:
    # e.line holds the status line received from the server.
    print e, e.line
else:
    print "Success"



but it failed. So the problem is not as simple as changing the user-agent
string.

Other than that, I'm stumped.
 
ak

Thanks a lot for confirming this. I also tried with different headers,
including sending *exactly* the same headers as Firefox (including
Connection: keep-alive, by modifying httplib), and it still doesn't work.
The only explanation I can see is that Python's httplib doesn't
handle SSL/TLS 'properly' (not necessarily in the sense of the TLS
spec, but in the sense that every other browser can connect properly
to this website and httplib can't).

If anyone knows of another Oracle HTTPS server on which to confirm this,
that would be nice...
 
Ahmed, Shakir

I am grabbing a few fields from a table, and one of the columns is in
date format. The output I am getting is "Wed Feb 09 00:00:00 2005", but
the data in that column is "02/09/2005", and I need the output in that
same format to insert those records into another table.

print my_service_DATE
Wed Feb 09 00:00:00 2005

Any help is highly appreciated.

sk
 
Tim Chase

If you are getting actual date/datetime objects, just use the
strftime() method to format them as you desire.

If you're getting back a *string*, then you should use
time.strptime() to parse the string into a time-object, and then
use the constituent parts to reformat as you see fit.

-tkc
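
As a minimal sketch of the string case described above: this assumes
the value really is a string in the form shown in the question and that
the process runs under an English (or C) locale, since %a and %b match
locale-specific day and month names. reformat_date is a hypothetical
helper name.

```python
from datetime import datetime


def reformat_date(text):
    """Convert e.g. 'Wed Feb 09 00:00:00 2005' to '02/09/2005'.

    Assumes an English/C locale for the %a and %b names.
    """
    parsed = datetime.strptime(text, "%a %b %d %H:%M:%S %Y")
    return parsed.strftime("%m/%d/%Y")


print(reformat_date("Wed Feb 09 00:00:00 2005"))  # 02/09/2005
```

If the column already comes back as a date or datetime object, only the
strftime("%m/%d/%Y") call is needed.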
 
O Peng

I'm running into a similar problem with BadStatusLine.
The relevant source code in httplib.py is as follows:

class HTTPResponse:
    ...
    def _read_status(self):
        line = self.fp.readline()
        ...
        if not line:
            # Presumably, the server closed the connection before
            # sending a valid response.
            raise BadStatusLine(line)

However, I found that right before the 'raise BadStatusLine(line)'
when I ran the following:

restOfResponse = self.fp.read()
print restOfResponse

restOfResponse is NOT empty. In fact, when I run self.fp.read() at
the beginning of the begin() function, it is not empty at all.
This leads me to believe there is a bug with the self.fp.readline()
(socket._fileobject.readline()) function. For me it only fails
sometimes.

This behavior is only observed on Windows, Python 2.5. Running it on
Mac OS X, Python 2.5 yielded no problems.
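
One way to check what a server actually sends, independently of
httplib's buffered readline(), is to read the raw bytes straight from
the socket. The sketch below assumes Python 3; read_raw_reply and
first_line are hypothetical helper names, and the code is a diagnostic
aid rather than a fix for the behaviour described above.

```python
import socket


def read_raw_reply(sock, request_bytes, chunk_size=4096):
    """Send request_bytes on sock and return everything received until EOF."""
    sock.sendall(request_bytes)
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # an empty read means the peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)


def first_line(raw):
    """Return the status line without its CRLF, or b'' if nothing was sent."""
    return raw.split(b"\r\n", 1)[0]
```

For a plain HTTP server, the socket could come from
socket.create_connection((host, 80)). For HTTPS it would additionally
need to be wrapped, for example with
ssl.create_default_context().wrap_socket(sock, server_hostname=host).
If first_line() returns a non-empty status line here while httplib still
raises BadStatusLine, that would point at the client side rather than
the server.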
 
