Re: Performance evaluation of HTTPS library

Discussion in 'Python' started by Antoine Pitrou, Oct 12, 2010.

  1. On Tue, 12 Oct 2010 05:40:43 -0700 (PDT)
    Ashish Vyas <> wrote:
    > Another observation that I have made is with 10 parallel HTTPS connection each
    > trying 1 transaction per second from 2 different machines (effectively same load
    > on server), the response time is again reducing to .17 secs.
    > However if I run two instances of the tool with 10 parallel HTTPS connection
    > each trying 1 transaction per second from the same machine, the response time
    > is again shooting up to 1.1 seconds.


    Is the client machine at 100% CPU when you do that?

    > So the question is does anyone here have any idea or some data about performance
    > limitation of HTTPS implementation in Python 3.1?


    Which API are you using? urlopen()?
    The HTTPS implementation is basically the same as the HTTP
    implementation, except for the additional SSL layer. So if indeed
    Python is responsible for the slowdown, it may be because of excessive
    overhead brought by the SSL layer.

    It would be nice if you tried the just-released Python 3.2 alpha,
    because some changes have been made to the SSL wrapper:
    http://python.org/download/releases/3.2/

    Also, there's a feature request to reduce overhead of SSL
    connections, but it needs implementing:
    http://bugs.python.org/issue8106

    Regards

    Antoine.
     
    Antoine Pitrou, Oct 12, 2010
    #1

  2. Ashish (Guest)

    On Oct 12, 6:33 pm, Antoine Pitrou <> wrote:
    > On Tue, 12 Oct 2010 05:40:43 -0700 (PDT)
    >
    > Ashish Vyas <> wrote:
    > > Another observation that I have made is with 10 parallel HTTPS connection each
    > > trying 1 transaction per second from 2 different machines (effectively same load
    > > on server), the response time is again reducing to .17 secs.
    > > However if I run two instances of the tool with 10 parallel HTTPS connection
    > > each trying 1 transaction per second from the same machine, the response time
    > > is again shooting up to 1.1 seconds.

    >
    > Is the client machine at 100% CPU when you do that?
    >

    With HTTP, I see client CPU at approx. 97%. However, with HTTPS, it
    stays at 53-55%.

    > > So the question is does anyone here have any idea or some data about performance
    > > limitation of HTTPS implementation in Python 3.1?

    >
    > Which API are you using? urlopen()?
    > The HTTPS implementation is basically the same as the HTTP
    > implementation, except for the additional SSL layer. So if indeed
    > Python is responsible for the slowdown, it may be because of excessive
    > overhead brought by the SSL layer.
    >

    I am doing something like this:-

    self.conn = AsyncHTTPSConnection(self.URL, HTTPS_PORT)

    self.conn.putrequest('POST', WEBSERVER_IP)
    self.conn.putheader('Cookie', cookie)
    self.conn.putheader('Content-Length', reqLen)
    ...
    self.conn.endheaders()
    self.conn.send(str.encode(reqest))


    and the AsyncHTTPSConnection class is something like this:

    class AsyncHTTPSConnection(client.HTTPConnection):
        default_port = HTTPS_PORT

        def __init__(self, host, port=HTTPS_PORT, key_file=None,
                     cert_file=None, strict=None,
                     timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
            """Init has the same parameters as HTTPSConnection."""
            client.HTTPConnection.__init__(self, host, port, strict,
                                           timeout)
            self.key_file = key_file
            self.cert_file = cert_file

        def connect(self):
            try:
                log.mjLog.LogReporter("Model", "info",
                    "AsyncHTTPSConnection::connect trying to connect... "
                    + str(self.host) + ":" + str(self.port))
                sock = socket.create_connection((self.host, self.port),
                                                self.timeout)
                sock2 = ssl.wrap_socket(sock, self.key_file,
                                        self.cert_file)
                self.sock = CBSocket(sock2)
            except:
                log.mjLog.LogReporter("Model", "critical",
                    "AsyncHTTPSConnection::connect Failed to connect to the GWS")


    > It would be nice if you tried the just-released Python 3.2 alpha,
    > because some changes have been made to the SSL wrapper: http://python.org/download/releases/3.2/
    >

    Let me try to use this, I will come back with my observations.

    > Also, there's a feature request to reduce overhead of SSL
    > connections, but it needs implementing: http://bugs.python.org/issue8106


    Good to know. Is there any date for when this will be available? I
    would like to contribute to it, but I am rather overcommitted with
    several other activities right now.
    >
    > Regards
    >
    > Antoine.


    Thanks a lot,
    Ashish
     
    Ashish, Oct 13, 2010
    #2

  3. Ashish (Guest)

    On Oct 13, 11:11 am, Ashish <> wrote:
    > On Oct 12, 6:33 pm, Antoine Pitrou <> wrote:
    > > On Tue, 12 Oct 2010 05:40:43 -0700 (PDT)
    >
    > > Ashish Vyas <> wrote:
    > > > Another observation that I have made is with 10 parallel HTTPS connection each
    > > > trying 1 transaction per second from 2 different machines (effectively same load
    > > > on server), the response time is again reducing to .17 secs.
    > > > However if I run two instances of the tool with 10 parallel HTTPS connection
    > > > each trying 1 transaction per second from the same machine, the response time
    > > > is again shooting up to 1.1 seconds.

    >
    > > Is the client machine at 100% CPU when you do that?

    >
    > With HTTP, I see client CPU at appx. 97%. However with HTTPS, it stays
    > at 53-55%.
    >
    > > > So the question is does anyone here have any idea or some data about performance
    > > > limitation of HTTPS implementation in Python 3.1?

    >
    > > Which API are you using? urlopen()?
    > > The HTTPS implementation is basically the same as the HTTP
    > > implementation, except for the additional SSL layer. So if indeed
    > > Python is responsible for the slowdown, it may be because of excessive
    > > overhead brought by the SSL layer.

    >
    > I am doing something like this:-
    >
    > self.conn = AsyncHTTPSConnection(self.URL, HTTPS_PORT)
    >
    > self.conn.putrequest('POST', WEBSERVER_IP)
    > self.conn.putheader('Cookie', cookie)
    > self.conn.putheader('Content-Length', reqLen)
    > ..
    > self.conn.endheaders()
    > self.conn.send(str.encode(reqest))
    >
    > and AsyncHTTPSConnection class is something like this:-
    >
    > class AsyncHTTPSConnection(client.HTTPConnection):
    >     default_port = HTTPS_PORT
    >     def __init__(self, host, port=HTTPS_PORT, key_file=None,
    > cert_file=None,
    >                      strict=None,
    > timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    >         """ Init has same eparameters as HTTPSConnection. """
    >         client.HTTPConnection.__init__(self, host, port, strict,
    > timeout)
    >         self.key_file = key_file
    >         self.cert_file = cert_file
    >
    >     def connect(self):
    >         try:
    >             log.mjLog.LogReporter ("Model", "info",
    > "AsyncHTTPSConnection::connect trying to connect... "+ str(self.host)
    > + ":"+ str(self.port))
    >             sock = socket.create_connection((self.host, self.port),
    > self.timeout)
    >             sock2 = ssl.wrap_socket(sock, self.key_file,
    > self.cert_file)
    >             self.sock = CBSocket(sock2)
    >         except:
    >             log.mjLog.LogReporter ("Model", "critical",
    > "AsyncHTTPSConnection::connect Failed to connect to the GWS")
    >
    > > It would be nice if you tried the just-released Python 3.2 alpha,
    > > because some changes have been made to the SSL wrapper: http://python.org/download/releases/3.2/

    >
    > Let me try to use this, I will come back with my observations.


    Well, I tried Python 3.2a2, and the average response time at 20 HTTPS
    transactions per second dropped from about 1.1 seconds to about 0.97
    seconds. That is a noticeable change, but I feel it is not enough.
    Also, when I ran the same test client on a Xeon machine, I saw an
    average response time of approx. 0.23 seconds.

    >
    > > Also, there's a feature request to reduce overhead of SSL
    > > connections, but it needs implementing: http://bugs.python.org/issue8106

    >
    > Well good to know this. Do we have any date when this will be
    > available? I feel like contributing to this but kind of over occupied
    > with several activities right now.
    >
    >
    >
    > > Regards

    >
    > > Antoine.

    >
    > Thanks a lot,
    > Ashish
     
    Ashish, Oct 13, 2010
    #3
  4. On Wed, 13 Oct 2010 02:12:21 -0700 (PDT)
    Ashish <> wrote:
    > >
    > > > Is the client machine at 100% CPU when you do that?

    > >
    > > With HTTP, I see client CPU at appx. 97%. However with HTTPS, it stays
    > > at 53-55%.


    And is the server at 100% CPU then?
    If the client is not at 100% CPU, it shouldn't be the bottleneck,
    unless you have something wrong in the client implementation.
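
One way to check this from inside the client is to compare CPU time with wall-clock time. The sketch below assumes the load generator can be wrapped in a single callable; `cpu_utilization` and `work` are hypothetical names and not part of the tool discussed in this thread. A ratio well below 1.0 suggests the process spends most of its time waiting rather than computing.

```python
import time


def cpu_utilization(work):
    """Return CPU seconds used by this process divided by wall-clock seconds."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()   # CPU time of the whole process, all threads
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return cpu / wall if wall > 0 else 0.0


if __name__ == "__main__":
    # A sleeping workload stands in for a client blocked on network I/O.
    ratio = cpu_utilization(lambda: time.sleep(0.5))
    print("utilization while waiting: %.2f" % ratio)
```

Because `time.process_time()` sums CPU time over all threads, the ratio can exceed 1.0 on a multicore machine when several threads run C code in parallel.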

    > >             sock = socket.create_connection((self.host, self.port),
    > > self.timeout)
    > >             sock2 = ssl.wrap_socket(sock, self.key_file,
    > > self.cert_file)
    > >             self.sock = CBSocket(sock2)


    What is CBSocket? What happens if you just write:
    self.sock = sock2

    > > > Also, there's a feature request to reduce overhead of SSL
    > > > connections, but it needs implementing: http://bugs.python.org/issue8106

    > >
    > > Well good to know this. Do we have any date when this will be
    > > available? I feel like contributing to this but kind of over occupied
    > > with several activities right now.


    Probably not in Python 3.2 anyway. But given your client isn't at 100%
    CPU when you launch your HTTPS test, it might not make a lot of
    difference.

    Regards

    Antoine.
     
    Antoine Pitrou, Oct 13, 2010
    #4
  5. Ashish (Guest)

    On Oct 13, 3:19 pm, Antoine Pitrou <> wrote:
    > On Wed, 13 Oct 2010 02:12:21 -0700 (PDT)
    >
    > Ashish <> wrote:
    >
    > > > > Is the client machine at 100% CPU when you do that?

    >
    > > > With HTTP, I see client CPU at appx. 97%. However with HTTPS, it stays
    > > > at 53-55%.

    >
    > And is the server at 100% CPU then?
    > If the client is not at 100% CPU, it shouldn't be the bottleneck,
    > unless you have something wrong in the client implementation.
    >
    > > >             sock = socket.create_connection((self.host, self.port),
    > > > self.timeout)
    > > >             sock2 = ssl.wrap_socket(sock, self.key_file,
    > > > self.cert_file)
    > > >             self.sock = CBSocket(sock2)

    >
    > What is CBSocket? What happens if you just write:
    >     self.sock = sock2
    >


    The server's Java process is taking 15% CPU.

    CBSocket is a socket implementation that calls my callback once it
    has data.
    Both my classes, AsyncHTTPSConnection and AsyncHTTPConnection, use it
    in the same way ( self.sock = CBSocket(sock2) ).
    The implementation of AsyncHTTPConnection differs from
    AsyncHTTPSConnection only in the connect method, where the latter
    adds: sock2 = ssl.wrap_socket(sock, self.key_file, self.cert_file)

    class CBSocket(asynchat.async_chat):
        """ This is a class that calls the callback when it has data and
        it read it."""
        def __init__(self, socket):
            asynchat.async_chat.__init__(self, socket)

            self._in_buffer = io.BytesIO()
            self._closed = False
            self._cb = None

        def handle_read(self):
            try:
                read = self.socket.recv(65536)
            except:
                read = 0
                raise
            if not read and not self._closed:
                self.handle_close()
                self.close()
                self._closed = True
                return

            self._in_buffer.write(read)

        def sendall(self, data):
            self.send(data)

        def makefile(self, mode, buffsize= 8192):
            self._in_buffer.seek(0)
            return self._in_buffer

        def set_cb(self, cb):
            self._cb = cb
            if self._closed:
                self._cb()
            else:
                pass


        def handle_close(self):
            if self._cb:
                self._cb()
            self._closed = True
            self.close()
            del self._in_buffer

    > > > > Also, there's a feature request to reduce overhead of SSL
    > > > > connections, but it needs implementing: http://bugs.python.org/issue8106

    >
    > > > Well good to know this. Do we have any date when this will be
    > > > available? I feel like contributing to this but kind of over occupied
    > > > with several activities right now.

    >
    > Probably not in Python 3.2 anyway. But given your client isn't at 100%
    > CPU when you launch your HTTPS test, it might not make a lot of
    > difference.
    >
    > Regards
    >
    > Antoine.
     
    Ashish, Oct 13, 2010
    #5
  6. On Wed, 13 Oct 2010 05:27:29 -0700 (PDT)
    Ashish <> wrote:
    >
    > Well, CBSocket is socket implementation that calls my callback on
    > data.
    > Both my classes AsyncHTTPSConnection and AsyncHTTPConnection use it
    > and use it the same way ( self.sock = CBSocket(sock2) ).
    > The implemetation of AsyncHTTPConnection differs from
    > AsyncHTTPSConnection only in connect method: sock2 =
    > ssl.wrap_socket(sock, self.key_file, self.cert_file)
    >
    > class CBSocket(asynchat.async_chat):

    [...]

    Ok, this won't work as expected. The first issue is that
    ssl.wrap_socket() is a blocking operation: your client sends data
    and waits for the server's reply (this is the SSL handshake)
    *before* asyncore has put the socket into non-blocking mode. This
    means your client sits idle much of the time, which explains why
    neither the client nor the server reaches 100% CPU utilization.
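
For illustration, here is a minimal sketch of a handshake that does not block inside the wrap call. It assumes a current Python 3, where `ssl.SSLContext.wrap_socket()` accepts `do_handshake_on_connect=False` and an unfinished handshake raises `ssl.SSLWantReadError` or `ssl.SSLWantWriteError`; that API is newer than the `ssl.wrap_socket()` call used in this thread. The function name `handshake_nonblocking` is hypothetical.

```python
import select
import socket
import ssl


def handshake_nonblocking(sock, context, server_hostname, timeout=5.0):
    """Wrap a connected socket and drive the TLS handshake step by step.

    Returns the wrapped socket, or raises TimeoutError if the peer stays
    silent for longer than `timeout` seconds at any step.
    """
    sock.setblocking(False)
    # Defer the handshake so that wrap_socket() itself returns immediately.
    tls = context.wrap_socket(sock, server_hostname=server_hostname,
                              do_handshake_on_connect=False)
    while True:
        try:
            tls.do_handshake()
            return tls
        except ssl.SSLWantReadError:
            # The handshake needs more data from the peer.
            ready = select.select([tls], [], [], timeout)[0]
        except ssl.SSLWantWriteError:
            # The handshake needs room to send more data.
            ready = select.select([], [tls], [], timeout)[1]
        if not ready:
            raise TimeoutError("TLS handshake timed out")
```

The `select()` calls here still wait in the calling thread; in an event loop, each "want read" or "want write" condition would instead hand control back to the loop until the socket is ready.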

    The second issue is that combining SSL and asyncore is more complicated
    than that; there are various situations to consider which your code
    doesn't address. The stdlib right now doesn't provide SSL support for
    asyncore (see http://bugs.python.org/issue10084 ), so you would have to
    do it yourself. I don't think it's worth the trouble, and would
    recommend switching your client to a simple thread-based approach,
    where you handle each HTTP(S) connection in a separate thread and stick
    to blocking I/O.
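
A minimal sketch of that thread-based approach, assuming the standard `http.client.HTTPSConnection` and plain `threading`. The helper names `timed_post` and `run_in_threads`, and the `make_conn` parameter (which lets another connection class be substituted), are hypothetical and not taken from the tool in this thread.

```python
import threading
import time
from http.client import HTTPSConnection


def timed_post(host, path, body, make_conn=HTTPSConnection, timeout=10):
    """Send one blocking POST; return (status, elapsed_seconds)."""
    start = time.perf_counter()
    conn = make_conn(host, timeout=timeout)
    try:
        conn.request("POST", path, body=body)
        response = conn.getresponse()
        response.read()                      # drain the body before timing stops
        return response.status, time.perf_counter() - start
    finally:
        conn.close()


def run_in_threads(n_threads, job):
    """Call job() once in each of n_threads threads; return the results."""
    results = []
    lock = threading.Lock()

    def worker():
        outcome = job()                      # blocking I/O happens here
        with lock:
            results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With a placeholder host, ten parallel requests would look like `run_in_threads(10, lambda: timed_post("gateway.example", "/service", b"payload"))`. A job that raises ends only its own thread, so the result list may then be shorter than `n_threads`.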

    Regards

    Antoine.
     
    Antoine Pitrou, Oct 13, 2010
    #6
  7. Ashish (Guest)

    On Oct 13, 6:12 pm, Antoine Pitrou <> wrote:
    > On Wed, 13 Oct 2010 05:27:29 -0700 (PDT)
    > Ashish <> wrote:
    >
    > > Well, CBSocket is socket implementation that calls my callback on
    > > data.
    > > Both my classes AsyncHTTPSConnection and AsyncHTTPConnection use it
    > > and use it the same way ( self.sock = CBSocket(sock2) ).
    > > The implemetation of AsyncHTTPConnection differs from
    > > AsyncHTTPSConnection only in connect method: sock2 =
    > > ssl.wrap_socket(sock, self.key_file, self.cert_file)

    >
    > > class CBSocket(asynchat.async_chat):

    >
    > [...]
    >
    > Ok, this won't work as expected. The first issue is that
    > ssl.wrap_socket() is a blocking operation, where your client will send
    > data and wait for the server reply (it's the SSL's handshake),
    > *before* the socket has been set in non-blocking mode by asyncore. It
    > means that your client will remain idle a lot of time, and explains
    > that neither the client nor the server reach 100% CPU utilization.
    >
    > The second issue is that combining SSL and asyncore is more complicated
    > than that; there are various situations to consider which your code
    > doesn't address. The stdlib right now doesn't provide SSL support for
    > asyncore (see http://bugs.python.org/issue10084), so you would have to
    > do it yourself. I don't think it's worth the trouble, and would
    > recommend switching your client to a simple thread-based approach,
    > where you handle each HTTP(S) connection in a separate thread and stick
    > to blocking I/O.
    >
    > Regards
    >
    > Antoine.


    I am impressed by your knowledge, and thankful to you for helping
    me out.

    I thought threads would be costly to use, and that going for, say,
    200 parallel connections with 200 threads in total (plus a few more
    I already have in my tool) might not be efficient either. Let me
    change the implementation to use threads and blocking I/O, and I
    will get back with the results.

    One more question: if I run the tool on a multicore machine, will
    Python 3.1 or 3.2 actually be able to use multiple cores, or will it
    run on only one core?

    Thanks
    Ashish.
     
    Ashish, Oct 14, 2010
    #7
  8. On Thu, 14 Oct 2010 05:06:30 -0700 (PDT)
    Ashish <> wrote:
    >
    > One more question: If I run the tool from multicore machine, will
    > python3.1 or 3.2 be able to actually use multicore? or it will be
    > running only on one core?


    Only partly. Pure Python code is serialized (by the Global Interpreter
    Lock), but some internal C code, such as SSL and socket routines, can
    run in parallel with other code.
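
A small sketch of the second half of that statement, using only the standard library: several threads that block inside a socket call wait concurrently, because the interpreter releases the GIL while a thread waits on the socket. The function names are hypothetical, and the timing is only indicative.

```python
import socket
import threading
import time


def wait_on_socket(delay):
    """Block in recv() until the timeout expires; the GIL is released meanwhile."""
    a, b = socket.socketpair()
    try:
        a.settimeout(delay)
        try:
            a.recv(1)            # nothing is ever sent, so this just waits
        except socket.timeout:
            pass
    finally:
        a.close()
        b.close()


def elapsed_with_threads(n_threads, delay):
    """Return the wall-clock time for n_threads concurrent socket waits."""
    threads = [threading.Thread(target=wait_on_socket, args=(delay,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

If the waits were serialized, four threads with a 0.2 second timeout would need about 0.8 seconds; run concurrently, `elapsed_with_threads(4, 0.2)` should take roughly as long as a single wait.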

    Regards

    Antoine.
     
    Antoine Pitrou, Oct 14, 2010
    #8
