xmlrpclib/server not reusing connections

Roger Binns

It appears that xmlrpclib and/or SimpleXMLRPCServer always use new
connections for each request. I have been trying to make them
reuse the existing connection but can't find where. Is there any
particular reason they go to such great lengths to call shutdown/close/finish
as well as making new connection objects in xmlrpclib?

I am actually using M2Crypto so the connection ultimately ends up
inside SSL. Having a new connection established for every single
request is insane, especially as most of my requests will be lightweight.
I expect the clients and servers to be on lower bandwidth connections
and usually on opposite ends of the US.

My attempts so far to make reuse happen have led to a twisty maze
of deep inheritance hierarchies and intertwined transports, connections
and request handlers hard-coding each other's classes. No amount of
debugging and commenting out shutdown/close/finish prevents new
connections from being used.

Roger
 
Skip Montanaro

Roger> It appears that xmlrpclib and/or SimpleXMLRPCServer always use
Roger> new connections for each request. I have been trying to make
Roger> them reuse the existing connection but can't find where. Is
Roger> there any particular reason they go to such great lengths to call
Roger> shutdown/close/finish as well as making new connection objects in
Roger> xmlrpclib?

Connection reuse would be done at a lower level I think. Take a look at the
httplib module. Take a look at xmlrpclib's MultiCall class for another way
to improve performance in some situations.
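A minimal sketch of the MultiCall idea, assuming a current Python 3 where xmlrpclib is available as xmlrpc.client and SimpleXMLRPCServer as xmlrpc.server. The `demo_multicall` helper and the `add` function are made up for illustration. It batches two calls into one HTTP request, which only helps when the calls do not depend on each other's results.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def demo_multicall():
    """Send two independent calls to a local server in one HTTP request."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(pow)
    server.register_function(lambda a, b: a + b, "add")  # illustrative only
    server.register_multicall_functions()  # exposes system.multicall
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    try:
        with xmlrpc.client.ServerProxy(url) as proxy:
            batch = xmlrpc.client.MultiCall(proxy)
            batch.add(2, 3)       # queued locally, nothing is sent yet
            batch.pow(2, 10)      # queued locally
            return list(batch())  # one round trip carries both calls
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo_multicall()` returns the two results in the order the calls were queued.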

Skip
 
Roger Binns

Skip said:
Connection reuse would be done at a lower level I think. Take a look at the
httplib module. Take a look at xmlrpclib's MultiCall class for another way
to improve performance in some situations.


MultiCall is of no use to me because each of my requests depends on the results
of the preceding ones. SSL is really important since the traffic is going
over the open Internet. I needed to do some certificate validation which
the Python 2.3 library has no facilities for.
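For comparison, a rough sketch of how certificate validation can be requested with the standard library of a current Python 3 (not the Python 2.3 or M2Crypto setup discussed here). The helper name `make_verified_proxy` and the example URL are hypothetical. Nothing connects until a method is actually called on the proxy.

```python
import ssl
import xmlrpc.client


def make_verified_proxy(url, ca_file=None):
    """Return (proxy, context) for an HTTPS XML-RPC endpoint.

    ca_file is an optional path to a CA bundle; when it is None the
    platform's default trust store is used.
    """
    # create_default_context() turns on certificate and hostname checks.
    context = ssl.create_default_context(cafile=ca_file)
    proxy = xmlrpc.client.ServerProxy(url, context=context)
    return proxy, context


# Hypothetical endpoint, used only to show the call shape.
proxy, context = make_verified_proxy("https://rpc.example.invalid/RPC2")
```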

On the server side, things were hopeless as well. Since it would be
listening on the open Internet, and had to work on Windows as well as
Linux and Mac, the threading server initially seemed appealing, but
sadly it has no facilities for limiting the number of threads spawned.
It also makes a new thread per request, rather than the more normal
design of having a thread pool to deal with requests/connections.
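The thread-pool design described above could look roughly like this on a current Python 3. It is a sketch under assumptions: `PooledXMLRPCServer`, its `workers` and `backlog` parameters, and `demo_pool` are invented names, not part of the standard library. A fixed set of worker threads pulls accepted connections from a bounded queue, and connections are refused when the queue is full rather than spawning more threads.

```python
import queue
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


class PooledXMLRPCServer(SimpleXMLRPCServer):
    """XML-RPC server whose connections are served by a fixed thread pool."""

    def __init__(self, addr, workers=4, backlog=16, **kwargs):
        super().__init__(addr, **kwargs)
        self._jobs = queue.Queue(maxsize=backlog)  # bounded work queue
        self._threads = [threading.Thread(target=self._worker, daemon=True)
                         for _ in range(workers)]
        for thread in self._threads:
            thread.start()

    def process_request(self, request, client_address):
        # Called by the accept loop: queue the socket, or drop it if full.
        try:
            self._jobs.put_nowait((request, client_address))
        except queue.Full:
            self.shutdown_request(request)

    def _worker(self):
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel pushed by server_close()
                return
            request, client_address = job
            try:
                self.finish_request(request, client_address)
            except Exception:
                self.handle_error(request, client_address)
            finally:
                self.shutdown_request(request)

    def server_close(self):
        super().server_close()
        for _ in self._threads:
            self._jobs.put(None)
        for thread in self._threads:
            thread.join()


def demo_pool(clients=6, workers=2):
    """Return (results, number of distinct worker threads that served them)."""
    seen = set()

    def double(n):
        seen.add(threading.current_thread().name)
        return n * 2

    server = PooledXMLRPCServer(("127.0.0.1", 0), workers=workers,
                                logRequests=False)
    server.register_function(double, "double")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    results = []
    for n in range(clients):
        with xmlrpc.client.ServerProxy(url) as proxy:
            results.append(proxy.double(n))
    server.shutdown()
    server.server_close()
    return results, len(seen)
```

The bounded queue is a design choice: it caps both the number of threads and the backlog, which is a crude form of the "light" firewalling a server on the open Internet needs.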

xmlrpclib is coded against the Python 1.5 compatible httplib names (and so
lacks HTTP/1.1 and persistent connections). SimpleXMLRPCRequestHandler is
hard-coded to close connections after one request (see last line of
do_POST). The M2Crypto code that added SSL to both inherited those
implementations and design.

Basically the existing xmlrpclib and SimpleXMLRPCServer are hard coded
(and IMHO go out of their way) to do one request per connection, and
then shut things down. Throw in that I needed to do SSL on both ends, with
HTTP authentication and some "light" firewalling to prevent DoS attacks,
and I had to spend several days mashing the various classes together
with M2Crypto.

You can see the results here:

http://cvs.sf.net/viewcvs.py/bitpim/bitpim/bitfling/xmlrpcstuff.py?view=markup

Roger
 
Andrew Bennetts

Roger said:
Basically the existing xmlrpclib and SimpleXMLRPCServer are hard coded
(and IMHO go out of their way) to do one request per connection, and
then shut things down. Throw in that I needed to do SSL on both ends, with
HTTP authentication and some "light" firewalling to prevent DoS attacks,
and I had to spend several days mashing the various classes together
with M2Crypto.

You might have better luck with Twisted <http://twistedmatrix.com/>.

The XML-RPC server support in Twisted does support SSL, and HTTP/1.1
persistent connections, and it's pretty easy to use:

http://twistedmatrix.com/documents/examples/xmlrpc.py

I don't have any experience with Twisted's XML-RPC client (although you can
see an example at
http://twistedmatrix.com/documents/examples/xmlrpcclient.py), but glancing
at the code it doesn't support HTTP/1.1 persistent connections out of the
box... you could probably add that in relatively easily, though.

Glancing at that, it looks like using Twisted to implement that would be a
lot easier, and shorter.

-Andrew.
 
Roger Binns

Andrew said:
You might have better luck with Twisted http://twistedmatrix.com/.

I did look at twisted but really don't like it. In particular I really
don't like the event driven code, that has to add callbacks and
deal with partial state when called.

That is certainly the right way to do things if you want to avoid
using threads. It is however complicated and convoluted. See this
example:

http://twistedmatrix.com/documents/howto/tutorial#auto20

The model I far prefer is to use multiple threads (available on all
major Python platforms) and use worker threads with work queues.
They turn out to be simpler since they don't have to queue callbacks
or effectively be a glorified state machine scattered across
several functions.

Andrew said:
Glancing at that, it looks like using Twisted to implement that would be a
lot easier, and shorter.

Except it wouldn't unless twisted already had the necessary functionality
which it doesn't. I would have to go through a similar exercise with
twisted, which is far more complicated.

Roger
 
Chris Foote

Roger said:
You can see the results here:

http://cvs.sf.net/viewcvs.py/bitpim/bitpim/bitfling/xmlrpcstuff.py?view=markup

Excellent - I've been wanting these features for some time ;-)


Chris Foote <[email protected]>
Director - INETD PTY LTD
Level 2, 132 Franklin St
Adelaide SA 5000
Web: http://www.inetd.com.au
Phone: (08) 8410 4566
 
Dave Brueck

Roger said:
I did look at twisted but really don't like it. In particular I really
don't like the event driven code, that has to add callbacks and
deal with partial state when called.

That is certainly the right way to do things if you want to avoid
using threads. It is however complicated and convoluted. See this
example:

http://twistedmatrix.com/documents/howto/tutorial#auto20

The model I far prefer is to use multiple threads (available on all
major Python platforms) and use worker threads with work queues.
They turn out to be simpler since they don't have to queue callbacks
or effectively be a glorified state machine scattered across
several functions.

Yes, but the main drawback is of course scalability. But you can have the best
of both worlds with Stackless Python's tasklets. A few years ago I wrote an
event-based I/O framework for my company using normal Python, and it's nice and
fast, but programs built on it are ugly (logic split across many callback
functions, state is maintained in an awkward way as a result, etc). Over the
past week or so I've started writing a replacement framework using Stackless
and it's been *so* nice... the applications that use it _look_ like they are
threaded, but you get the performance and scalability of "traditional" async
I/O frameworks (the core of the framework still uses a loop to call
select.poll).

Here's one of my test cases, a server that sends back a dummy HTTP response:

def ConnHandler(sock):
    # Read (and ignore) the request, then send a fixed 5-byte response.
    header = sock.Read(4096)
    resp = ('HTTP/1.0 200 Ok\r\nContent-length: 5\r\n'
            'Content-type: text/html\r\n\r\n12345')
    sock.Write(resp)
    sock.Close()

engine = Engine()
s = ListenSocket('127.0.0.1', 7777, ConnHandler)
s.AttachToEngine(engine)
engine.Run()

On a 900MHz P3 Linux box the above server easily handles 2000 requests per
second, which is nice. :)

Although obviously a very trivial example, it's a good enough proof-of-concept
to encourage me to continue. I haven't tested high levels of concurrent
connections yet - but a similar test that sent back larger responses didn't
have any problems serving well over 1000 simultaneous connections, so in the
worst case using Stackless won't have any performance drawbacks compared to
other approaches.

-Dave
 
Fredrik Lundh

Roger said:
Basically the existing xmlrpclib and SimpleXMLRPCServer are hard coded
(and IMHO go out of their way) to do one request per connection, and
then shut things down.

if you don't like the xmlrpclib default Transport, use your own. you probably
only have to override 'make_connection' and 'parse_response' to get the
behaviour you're looking for.
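For readers on a current Python 3, a hedged sketch of what connection reuse looks like today. As far as I know, the stock xmlrpc.client.Transport there already caches its HTTP connection, so what remains is having the server's request handler speak HTTP/1.1. This is not the Python 2.3 code under discussion. `KeepAliveHandler`, its `connections` counter, and `demo_reuse` are illustrative names.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer, SimpleXMLRPCRequestHandler


class KeepAliveHandler(SimpleXMLRPCRequestHandler):
    # HTTP/1.1 lets the handler keep the socket open between requests.
    protocol_version = "HTTP/1.1"
    connections = 0  # illustrative counter, only for the demonstration

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        KeepAliveHandler.connections += 1
        super().setup()


def demo_reuse(calls=3):
    """Make dependent calls; return (results, connections accepted)."""
    KeepAliveHandler.connections = 0
    server = SimpleXMLRPCServer(("127.0.0.1", 0),
                                requestHandler=KeepAliveHandler,
                                logRequests=False)
    server.register_function(lambda x: x + 1, "increment")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    results = []
    value = 0
    # Leaving the with block closes the cached connection, which lets the
    # single-threaded server finish its handler and shut down.
    with xmlrpc.client.ServerProxy(url) as proxy:
        for _ in range(calls):
            value = proxy.increment(value)  # each call needs the last result
            results.append(value)
    server.shutdown()
    server.server_close()
    return results, KeepAliveHandler.connections
```

Each call feeds on the previous result, the case MultiCall cannot cover, yet all of them travel over a single connection.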

</F>
 
Roger Binns

Fredrik said:
if you don't like the xmlrpclib default Transport, use your own. you probably
only have to override 'make_connection' and 'parse_response' to get the
behaviour you're looking for.

Yes, I pointed at the code I had to write. In particular I had to do the following:

- My own Transport
  - reimplementation of parse_response method (existing one shuts down
    connection)
  - reimplementation of request method (existing one uses Python 1.5 httplib
    function names and semantics)
  - reimplementation of making a connection to reuse existing one if
    appropriate
- My own handler
  - reimplementation of do_POST to get authentication headers (and not close
    connections)
  - reimplementation of finish to do proper SSL shutdown sequence
- work around bug in M2Crypto makefile method that doesn't actually do a dup
  and breaks semantics if called twice
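To illustrate just the "do_POST that looks at authentication headers" item, here is a small sketch for a current Python 3. It is not the BitPim code linked earlier. The `AuthHandler` class, the `roger:secret` credentials, the realm string, and `demo_auth` are all made up for the example, and plain HTTP Basic auth like this only makes sense underneath SSL.

```python
import base64
import hmac
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer, SimpleXMLRPCRequestHandler

# Made-up credentials, hard-coded only for the demonstration.
EXPECTED = base64.b64encode(b"roger:secret")


class AuthHandler(SimpleXMLRPCRequestHandler):
    def do_POST(self):
        header = self.headers.get("Authorization", "")
        scheme, _, token = header.partition(" ")
        supplied = token.strip().encode("utf-8", "replace")
        if scheme == "Basic" and hmac.compare_digest(supplied, EXPECTED):
            return super().do_POST()
        # Drain the request body so the client can read the reply cleanly.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(401)
        self.send_header("WWW-Authenticate", 'Basic realm="xmlrpc"')
        self.send_header("Content-Length", "0")
        self.end_headers()


def demo_auth():
    """Return (result with good credentials, HTTP status with bad ones)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), requestHandler=AuthHandler,
                                logRequests=False)
    server.register_function(lambda a, b: a + b, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        # Credentials in the URL become an Authorization: Basic header.
        good_url = "http://roger:secret@127.0.0.1:%d/" % port
        with xmlrpc.client.ServerProxy(good_url) as good:
            total = good.add(2, 3)
        bad_url = "http://roger:wrong@127.0.0.1:%d/" % port
        with xmlrpc.client.ServerProxy(bad_url) as bad:
            try:
                bad.add(2, 3)
                status = None
            except xmlrpc.client.ProtocolError as err:
                status = err.errcode
        return total, status
    finally:
        server.shutdown()
        server.server_close()
```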

It isn't as simple as it would first appear, and the many layers of indirection
and sub-classing made it even harder to figure out what was going on, and to
override behaviour.

Roger
 
Christian Tismer

Roger said:
I did look at twisted but really don't like it. In particular I really
don't like the event driven code, that has to add callbacks and
deal with partial state when called.

That is certainly the right way to do things if you want to avoid
using threads. It is however complicated and convoluted.

True and false at the same time.
While it is internally the right thing to do (better to say, it
is what actually happens, internally), it is not the best way
to write it.

This is why I developed Stackless Python.
It behaves in a similar way, acting like many callbacks,
but you *write* your code in the most natural way possible.
In a way, Stackless takes all the clumsy state keeping stuff
away from the programmer and frees his mind to write simple
top-down programs with no callbacks.
This approach has even been tested with Zope. You can write
a simple loop that sends Web pages to the user, without leaving
the loop. Internally, of course it is left, the whole program is
pickled, and restored on the next request. But you write a single
script, with no methods which have to look up state, do something, and
encode state again.

I'm thinking of giving a talk about Stackless at EuroPython 2004:
"Stackless Python and the Death of the Reactive Pattern". :)

Note that Christopher Armstrong is working on
a Stackless Reactor for Twisted!

I think, when that thing is ready, it will give both Twisted
and Stackless a boost of popularity and usability.

ciao - chris
--
Christian Tismer :^) <mailto:[email protected]>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/
 
Andrew Bennetts

Roger said:
I did look at twisted but really don't like it. In particular I really
don't like the event driven code, that has to add callbacks and
deal with partial state when called.

I was initially skeptical too, but I found the small cost of using
callbacks to be well worth the benefit of avoiding threads most of the time.
Threads invite all sorts of difficult to reproduce and debug problems like
race conditions and deadlocks.

That said, event-driven programming doesn't suit every problem. Twisted
does try fairly hard to reduce the burden of asynchronous code, though:
Deferreds make chains of callbacks easy to handle, and the twisted.flow
package uses python 2.2's generators to allow synchronous-looking code to
yield to the event loop:
http://twistedmatrix.com/documents/howto/flow
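To make "synchronous-looking code that yields to the event loop" concrete, here is a toy sketch in plain Python. It uses neither twisted.flow's nor Stackless's actual API, and the `handler` and `run` names are invented. Each task is a generator that reads top-down and uses `yield` wherever it would otherwise register a callback, while a tiny round-robin loop stands in for the reactor.

```python
from collections import deque


def handler(name, chunks, log):
    """One connection's logic, written top-down with no callbacks."""
    received = []
    for chunk in chunks:
        received.append(chunk)
        log.append((name, chunk))
        yield  # pretend we are waiting for more input; let others run
    log.append((name, "done:" + "".join(received)))


def run(tasks):
    """Round-robin scheduler: resume each task until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished, drop it
        ready.append(task)      # not finished, queue it again


log = []
run([handler("a", ["x", "y"], log), handler("b", ["1"], log)])
```

The state that a callback design would have to stash between calls (`received` here) simply lives in the generator's local variables.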

And of course Twisted's splitting of network code into reactor, factory,
protocol and transport objects is a really nice abstraction, but that would
probably translate just fine into blocking code too (except perhaps the
reactor would be largely irrelevant?).

Roger said:
That is certainly the right way to do things if you want to avoid
using threads. It is however complicated and convoluted. See this
example:

http://twistedmatrix.com/documents/howto/tutorial#auto20

That example looks fine to me, but perhaps I've been using Twisted too long!
:)

I certainly have no problems reading that code and immediately understanding
what every part of it does.

Roger said:
The model I far prefer is to use multiple threads (available on all
major Python platforms) and use worker threads with work queues.
They turn out to be simpler since they don't have to queue callbacks
or effectively be a glorified state machine scattered across
several functions.

That's fine. Twisted doesn't prevent you from using threads!
reactor.callInThread(func, args...) will run a function inside a thread pool
(or you can just start and manage threads yourself), and
reactor.callFromThread provides a safe way for a thread to run some code in
the main event loop. Twisted itself does this in e.g. the
twisted.enterprise.adbapi module, to provide an asynchronous interface
around synchronous DB-API modules.

Roger said:
Except it wouldn't unless twisted already had the necessary functionality
which it doesn't. I would have to go through a similar exercise with
twisted, which is far more complicated.

Well, Twisted already has persistent HTTP connections and SSL. So e.g. the
vast bulk of your XMLRPCRequestHandler.do_POST would disappear, because
twisted.protocols.http already does that.

Perhaps my point wasn't clear: I wasn't saying that Twisted will inherently
make any network code much much shorter[1], I was saying that Twisted
already has sufficient HTTP and SSL support to meet your needs, and that you
could re-use that. Of course, now that you've written what you need without
Twisted, you probably don't care very much :)

-Andrew.

[1] Although I think on average it probably does, and makes it easier to
write, too. The wide variety of protocols available in twisted.protocols
suggests to me that I'm at least part right about this...
 
Roger Binns

Christian said:
This is why I developed Stackless Python.
It behaves in a similar way, acting like many callbacks,
but you *write* your code in the most natural way possible.
In a way, Stackless takes all the clumsy state keeping stuff
away from the programmer and frees his mind to write simple
top-down programs with no callbacks.

I certainly agree with that and really like your examples.

However my code also has to use wxPython, win32all, pySerial, M2Crypto
and libusb. I have no idea if all those are integrated correctly with
Stackless (and work correctly on Linux, Windows and Mac), but I was
certainly not going to be the first person to find out. And I would
still have had to fix the xmlrpc client/server portions to reuse
connections properly anyway.

Roger
 
Christian Tismer

Roger said:
I certainly agree with that and really like your examples.

However my code also has to use wxPython, win32all, pySerial, M2Crypto
and libusb. I have no idea if all those are integrated correctly with
Stackless (and work correctly on Linux, Windows and Mac), but I was
certainly not going to be the first person to find out. And I would
still have had to fix the xmlrpc client/server portions to reuse
connections properly anyway.

Sure. Some of my experience:
(yes, I've been using Stackless for a year now :)

It works just *great* with wxPython. There are a few objects
which need a little care since they only live on the C stack
(mouse events for instance), but all in all it is wonderful
to use wxPython + Stackless (+ Boa Constructor + PIL + ...).
I have multiple dynamic widgets with animated graphical
content in my GUI, it is all running in a single thread,
and all my tasklets can update the GUI at any time, since
it is a single thread...

Let me know if you need sample code. Like a mouse handler,
written like a single, main program. No callbacks...

ciao - chris

 
Roger Binns

Christian said:
It works just *great* with wxPython. There are a few objects
which need a little care since they only live on the C stack
(mouse events for instance),

I don't understand how that is relevant. Do I have to recompile
wxPython to use stackless? You also didn't mention anything
about the other libraries (win32all, pySerial, M2Crypto, libusb).

Also, does stackless work with threads? I use a separate
thread to do serial port stuff, and need to have the user
interface running at the same time. The serial port stuff
consists of reads with 5 second timeouts (the GIL is released
during the read).

Roger
 
Christian Tismer

Roger said:
I don't understand how that is relevant. Do I have to recompile
wxPython to use stackless? You also didn't mention anything
about the other libraries (win32all, pySerial, m2crypto, libusb)

I answered about the stuff I tried. Ok, I forgot win32all,
it works fine, too.

No, wxPython doesn't need to get recompiled, but mouse events
are dead when the stack is moved away, so you have to create
extra objects before handling the event in a different tasklet.

Roger said:
Also, does stackless work with threads? I use a separate
thread to do serial port stuff, and need to have the user
interface running at the same time. The serial port stuff
consists of reads with 5 second timeouts (the GIL is released
during the read).

Yes it does. At the moment, threads are just ignored, and each
has its independent list of tasklets. This is going to change, since
I will write channels for inter-thread communication, and tasklets
which can migrate across threads.

ciao - chris
 
