Standard Asynchronous Python

Dustin J. Mitchell

After seeing David Mertz's talk at PyCon 2012, "Coroutines, event
loops, and the history of Python generators" [1], I got thinking again
about Python's expressive power for asynchronous programming.
Generators, particularly with the addition of 'yield from' and
'return' in PEP 380 [2], allow us to write code that is executed "bit
by bit" but still reads naturally. There are a number of frameworks
that take advantage of this ability, but each is a little different --
enough so that there's negligible code re-use between these
frameworks. I think that's a shame.
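
To make the "bit by bit" point concrete, here is a minimal sketch, assuming Python 3.3 or later so that PEP 380 is available. The names collect, handler and run are made up for illustration and do not come from any framework. A sub-generator suspends at each bare yield, hands its result back with 'return', and the caller picks that result up through 'yield from':

```python
def collect(n):
    """Sub-generator: suspend until n items have been sent in, then return them."""
    items = []
    while len(items) < n:
        item = yield          # suspend here; the driver resumes us with send()
        items.append(item)
    return items              # PEP 380: a generator may return a value


def handler():
    """Reads like sequential code, but runs one step per item sent in."""
    header = yield from collect(2)   # delegate; receives collect()'s return value
    body = yield from collect(3)
    return {"header": header, "body": body}


def run(gen, inputs):
    """Hypothetical driver: feed inputs one at a time until the generator finishes."""
    next(gen)                 # prime the generator up to its first yield
    try:
        for value in inputs:
            gen.send(value)
    except StopIteration as stop:
        return stop.value     # the generator's return value rides on StopIteration
    raise RuntimeError("generator wanted more input than was supplied")


result = run(handler(), [1, 2, 3, 4, 5])
print(result)
```

Every framework has its own equivalent of run(), which is exactly where the incompatibilities creep in.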

I proposed a design PEP a while back [3] with the intent of defining a
standard way of writing asynchronous code, with the goal of allowing
code re-use and bringing users of the frameworks closer together.
Ideally, we could have libraries to implement network protocols,
database wrappers, subprocess execution, and so on, that would work in
any of the available asynchronous frameworks.

My proposal met with near-silence, and I didn't pursue it. Instead, I
did what any self-respecting hacker would do - I wrote up a framework,
uthreads [4], that implemented my idea. This was initially a simple
trampoline scheduler, but I eventually refactored it to run atop
Twisted, since that's what I use. To my knowledge, it's never been
used.
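
For anyone unfamiliar with the term, a trampoline scheduler can be sketched in a few lines. This is not the uthreads code, just an illustration under the assumption of Python 3.3+ generators; run_all and countdown are hypothetical names:

```python
from collections import deque


def run_all(tasks):
    """Round-robin trampoline: resume each generator in turn until all finish.

    Returns the tasks' return values in the order the tasks were given.
    """
    ready = deque(enumerate(tasks))
    results = {}
    while ready:
        index, task = ready.popleft()
        try:
            next(task)                    # run the task up to its next yield
        except StopIteration as stop:
            results[index] = stop.value   # task finished; keep its return value
        else:
            ready.append((index, task))   # still running; back of the queue
    return [results[i] for i in range(len(results))]


def countdown(name, n, log):
    """Example task: record each step, yielding control between steps."""
    while n > 0:
        log.append((name, n))
        yield
        n -= 1
    return name + " done"


log = []
print(run_all([countdown("a", 2, log), countdown("b", 1, log)]))
print(log)
```

The scheduler never blocks on any one task; it simply bounces control from one generator to the next, which is where the name comes from.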

I'm considering re-drafting the PEP with the following changes:

* De-emphasize the thread emulation aspects, and focus on
  code-portability issues:
  * callbacks vs. "blocking" calls (e.g., when accepting incoming
    connections on a socket, how is my code invoked?)
  * consistent access to primitives, regardless of framework (e.g.,
    where's the function I call to branch execution?)
  * nested asynchronous methods
* Account for PEP 380 (by making the StopIteration workarounds match
  PEP 380, and explicitly deprecating them after Python 3.3)
* Look forward to a world with software transactional memory [5] by
  matching that API where appropriate

As I get to work on the PEP, I'd like to hear any initial reactions to the idea.

Dustin

[1] https://us.pycon.org/2012/schedule/presentation/104/
[2] http://www.python.org/dev/peps/pep-0380
[3] http://code.google.com/p/uthreads/source/browse/trunk/microthreading-pep.txt
[4] http://code.google.com/p/uthreads/
[5] https://bitbucket.org/pypy/pypy/raw/stm-thread/pypy/doc/stm.rst
 
Steven D'Aprano

After seeing David Mertz's talk at PyCon 2012, "Coroutines, event loops,
and the history of Python generators" [1], I got thinking again about
Python's expressive power for asynchronous programming. [...]
I'm considering re-drafting the PEP with the following changes:

* De-emphasize the thread emulation aspects, and focus on
  code-portability issues:
  * callbacks vs. "blocking" calls (e.g., when accepting incoming
    connections on a socket, how is my code invoked?)
  * consistent access to primitives, regardless of framework (e.g.,
    where's the function I call to branch execution?)
  * nested asynchronous methods
* Account for PEP 380 (by making the StopIteration workarounds match
  PEP 380, and explicitly deprecating them after Python 3.3)
* Look forward to a world with software transactional memory [5] by
  matching that API where appropriate

As I get to work on the PEP, I'd like to hear any initial reactions to
the idea.

My reaction is that until your framework gets some significant real-world
use, it probably doesn't belong in the standard library. To some degree,
the standard library is where good code goes to die -- the API should be
mature and well-tested, not just the code, because we'll be stuck with
the API (essentially) forever. Code can change, bugs can be fixed, but
we're stuck with "referer" forever :)

At least, you should be prepared to justify why your library uthreads
should be considered mature enough for the std lib despite the lack of
real-world use.
 
Bryan

Dustin said:
After seeing David Mertz's talk at PyCon 2012, "Coroutines, event
loops, and the history of Python generators" [1], I got thinking again
about Python's expressive power for asynchronous programming.

I lament the confusion of generators and coroutines. Generators are no
substitute for threads; they're a substitute for temporary memory-
wasting lists and for those extraneous classes that just implement
another class's __iter__.

[big snip]
As I get to work on the PEP, I'd like to hear any initial reactions to the idea.

Be warned that I'm frequently chastised for negativity...
Your motivating examples motivate me in the opposite direction.
Computing Fibonacci numbers with a generator? The problem with your
first support app is that threads don't scale?

I'm happy with the direction Python has chosen: real, platform-
provided threads. There are some fine projects taking other courses,
but most of the standard stuff is sequential code with some notion of
thread-safety. In this area Python has benefited for free as the
popular platforms have steadily improved.

Properly exploiting event-driven programming would be a bigger change
than you propose. Python tends to polymorphize I/O via "file like"
objects, whose requirements have traditionally been under-specified and
have never included event-driven behavior. Also, while Python is proudly
cross-platform, Microsoft Windows has the edge on asynchronous
facilities but Python treats Windows like a Unix wanna-be.

-Bryan
 
Dustin J. Mitchell

The responses have certainly highlighted some errors in emphasis in my approach.

* My idea is to propose a design PEP. (Steven, Dennis) I'm not at
*all* suggesting including uthreads in the standard library. It's a
toy implementation I used to develop my ideas. I think of this as a
much smaller idea in the same vein as the DBAPI (PEP 249): a common
set of expectations that allows portability.
* I'd like to set aside the issue of threads vs. event-driven
programming. There are legitimate reasons to do both, and the healthy
ecosystem of frameworks for the latter indicates at least some people
are interested. My idea is to introduce a tiny bit of coherence
across those frameworks.
* (Bryan) The Fibonacci example is a simple example of, among other
things, a CPU-bound, recursive task -- something that many async
frameworks don't handle fairly right now. I will add some text to
call that out explicitly.
* Regarding generators vs. coroutines (Bryan), I use the terms
generator and generator function in the PEP carefully, as that's what
the syntactic and runtime concepts are called in Python. I will
include a paragraph distinguishing the two.
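
To illustrate the Fibonacci point, here is a rough sketch, assuming Python 3.3+ and not taken from the PEP text. The recursive computation gives up control at every call, so a scheduler could interleave other work with it; fib and drive are illustrative names, and drive stands in for whatever loop a real framework would provide:

```python
def fib(n):
    """Naive recursive Fibonacci that yields control once per call."""
    yield                          # cooperative yield point for the scheduler
    if n < 2:
        return n
    a = yield from fib(n - 1)      # nested call; its yields propagate upward
    b = yield from fib(n - 2)
    return a + b


def drive(gen):
    """Hypothetical stand-in for a scheduler: step the task until it finishes.

    Returns (result, steps), where steps counts how often control came back.
    """
    steps = 0
    try:
        while True:
            next(gen)
            steps += 1             # a real scheduler could run other tasks here
    except StopIteration as stop:
        return stop.value, steps


value, steps = drive(fib(10))
print(value, steps)
```

The task stays CPU-bound and recursive, but no single call to next() runs for long, which is the fairness property I mean.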

I will need to take up the details of the idea with the developers of
the async frameworks themselves, and get some agreement before
actually proposing the PEP. However, among this group I'm interested
to know whether this is an appropriate use of a design PEP. That's
why I posted my old and flawed PEP text, rather than re-drafting
first.

Thanks for the responses so far!
Dustin
 
Steven D'Aprano

The responses have certainly highlighted some errors in emphasis in my
approach.

* My idea is to propose a design PEP. (Steven, Dennis) I'm not at *all*
suggesting including uthreads in the standard library. It's a toy
implementation I used to develop my ideas. I think of this as a much
smaller idea in the same vein as the DBAPI (PEP 249): a common set of
expectations that allows portability.

Okay, point taken, I misunderstood your proposal.

But my point still stands: since nobody except (possibly) you has used
your uthreads library, what gives you confidence that the API you suggest
is any good? Not just good, but good enough to impose that API on every
other async framework in the standard library and possibly beyond it?

If you have a good answer to that question, then it might be appropriate
to propose such an API.

(For what it's worth, consensus among the major async frameworks that
your approach was a good idea would be a pretty good answer to that
question.)
 
Terry Reedy

The responses have certainly highlighted some errors in emphasis in my approach.

* My idea is to propose a design PEP. (Steven, Dennis) I'm not at
*all* suggesting including uthreads in the standard library. It's a
toy implementation I used to develop my ideas. I think of this as a
much smaller idea in the same vein as the DBAPI (PEP 249): a common
set of expectations that allows portability.

That has been very successful.

* I'd like to set aside the issue of threads vs. event-driven
programming. There are legitimate reasons to do both, and the healthy
ecosystem of frameworks for the latter indicates at least some people
are interested. My idea is to introduce a tiny bit of coherence
across those frameworks.

I think many developers recognize that some improvement in coherence
would be a good idea. I occasionally read that *someone* is working on
a common event loop approach, though it has not materialized yet.

I will need to take up the details of the idea with the developers of
the async frameworks themselves, and get some agreement before
actually proposing the PEP. However, among this group I'm interested
to know whether this is an appropriate use of a design PEP.

I think so.

That's why I posted my old and flawed PEP text, rather than re-drafting
first.

I think you should do a bit of editing now, even if not a full redraft.
 
Dustin J. Mitchell

Thanks for the second round of responses. I think this gives me some
focus - concentrate on the API, talk to the framework developers, and
start redrafting the PEP sooner rather than later.

Thanks!
Dustin
 
Bryan

Dustin said:
Thanks for the second round of responses.  I think this gives me some
focus - concentrate on the API, talk to the framework developers, and
start redrafting the PEP sooner rather than later.

That's mostly what you came in with, but talking to the framework
developers is unarguably a good idea. So what frameworks?

There is one great framework for asynchronous Python and that's
Twisted. I'm not a Twisted guy... well not in the sense at issue here.
I can tell you why Twisted is "massive": no workable alternative. An
event-driven framework cannot simply call methods in a mainstream
Python library because they often block for I/O. The good and gifted
developers of Twisted have risen to the challenge by re-writing
important libraries around their deferreds which they regard as
beautiful.

Our fellow scripting language programmers in the young node.js
community work in a system that is event-driven from the ground up.
They had the advantage of no important blocking sequential libraries
to re-write. As a programming language JavaScript is grossly inferior
to Python, yet the node.js guys have put out some useful code and
impressive benchmarks with their bad-ass rock-star tech.

The obvious exemplars among Python frameworks are web frameworks.
Developers of the most popular Python web framework, Django, famously
stated their goal of making web programming "stupidly easy".
That registered with me. Django did spectacularly well, then Web2py
did even better on that particular criterion. There are several-to-
many other excellent Python web frameworks, and there's a clear
consensus among the major players on how to handle simultaneous
requests: threads.

Dustin, I hope you carry on with your plan. I request, please, report
back here what you find. As law professor James Duane said in pre-
introduction of police officer George Bruch, "I'm sure [you'll] have a
lot to teach all of us, including myself."

-Bryan
 
