Is Unladen Swallow dead?

laspi

There has been little or no activity in this project in the last few
months, and the last comments on its mailing list seem to confirm that
its future is uncertain.
The lack of updates, news, or discussion is also very strange,
especially considering that the merging plan has been approved. Or
hasn't it?
 
swapnil

There has been little or no activity in this project in the last few
months, and the last comments on its mailing list seem to confirm that
its future is uncertain.
The lack of updates, news, or discussion is also very strange,
especially considering that the merging plan has been approved. Or
hasn't it?

AFAIK, the merging plan was approved by Guido early this year. I guess
Google is expecting the community to drive the project from here on.
That was the whole idea behind merging it into mainline. From my last
conversation with Collin, they are targeting Python 3.3.
 
John Nagle

AFAIK, the merging plan was approved by Guido early this year. I guess
Google is expecting the community to drive the project from here on.
That was the whole idea behind merging it into mainline. From my last
conversation with Collin, they are targeting Python 3.3.

I think it's dead. They're a year behind on quarterly releases.
The last release was Q3 2009. The project failed to achieve its
stated goal of a 5x speedup. Not even close. More like 1.5x
(http://www.python.org/dev/peps/pep-3146)

The Google blog at
"http://groups.google.com/group/unladen-swallow/browse_thread/thread/f2011129c4414d04"
says, as of November 8, 2010:

"Jeffrey and I have been pulled on to other projects of higher
importance to Google. Unfortunately, no-one from the Python
open-source community has been interested in picking up the merger
work, and since none of the original team is still full-time on the
project, it's moving very slowly. Finishing up the merger into the
py3k-jit branch is a high priority for me this quarter, but what
happens then is an open question."

So Google has pulled the plug on Unladen Swallow. It looks
like they underestimated the difficulty of speeding up the CPython
model. The performance improvement achieved was so low
that cluttering up CPython with a JIT system and LLVM is probably
a net loss.

John Nagle
 
John Nagle

No, it's just resting.

For those who don't get that, The Monty Python reference:
"http://www.mtholyoke.edu/~ebarnes/python/dead-parrot.htm"

Owner: Oh yes, the, uh, the Norwegian Blue... What's, uh... What's wrong
with it?

Mr. Praline: I'll tell you what's wrong with it, my lad. 'E's dead,
that's what's wrong with it!

Owner: No, no, 'e's uh,...he's resting.

Mr. Praline: Look, matey, I know a dead parrot when I see one, and I'm
looking at one right now.

Owner: No no he's not dead, he's, he's restin'! Remarkable bird, the
Norwegian Blue, idn'it, ay? Beautiful plumage!

Mr. Praline: The plumage don't enter into it. It's stone dead.

Owner: Nononono, no, no! 'E's resting!

Mr. Praline: All right then, if he's restin', I'll wake him up!
(shouting at the cage) 'Ello, Mister Polly Parrot! I've got a lovely
fresh cuttle fish for you if you show...

(owner hits the cage)

Owner: There, he moved!

Mr. Praline: No, he didn't, that was you hitting the cage!

Owner: I never!!

Mr. Praline: Yes, you did!

Owner: I never, never did anything...

Mr. Praline: (yelling and hitting the cage repeatedly) 'ELLO POLLY!!!!!
Testing! Testing! Testing! Testing! This is your nine o'clock alarm call!

(Takes parrot out of the cage and thumps its head on the counter. Throws
it up in the air and watches it plummet to the floor.)

Mr. Praline: Now that's what I call a dead parrot.

(There's more, but you get the idea.)

John Nagle
 
Robert Kern

Thank you, John, for making my already light wallet even lighter: now I
have to go and buy the original English version. It seems the German
translation sucks (it misses a lot) and my copy lacks the original
English audio.

They're all (legitimately) on YouTube now.


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Martin Gregorie

Thank you, John, for making my already light wallet even lighter: now I
have to go and buy the original English version. It seems the German
translation sucks (it misses a lot) and my copy lacks the original
English audio.

While you're at it, pick up the video of "Monty Python and the Holy
Grail". The project name, Unladen Swallow, is a reference to the film.
 
BartC

John Nagle said:

I think it's dead. They're a year behind on quarterly releases.
The last release was Q3 2009. The project failed to achieve its
stated goal of a 5x speedup. Not even close. More like 1.5x
(http://www.python.org/dev/peps/pep-3146)

There must have been good reasons to predict a 5x increase. But why did it
take so long to find out the approach wasn't going anywhere?

Assuming the 5x speedup was shown to be viable (i.e. performing the same
benchmarks, on the same data, can be done that quickly in any other
language, and allowing for the overheads associated with Python's dynamic
nature), then what went wrong?

(I've had a look at the benchmarks, with a view to trying some on other
languages, and they seem an extraordinarily difficult bunch to work with.)

So Google has pulled the plug on Unladen Swallow. It looks
like they underestimated the difficulty of speeding up the CPython
model. The performance improvement achieved was so low
that cluttering up CPython with a JIT system and LLVM is probably
a net loss.

LLVM. OK, that explains a lot. (LLVM is a huge, complex system.)
 
John Nagle

There must have been good reasons to predict a 5x increase.

For Java, adding a JIT improved performance by much more than that.
Native-code compilers for LISP have done much better than 5x. The
best Java and LISP compilers approach the speed of C, while CPython
is generally considered to be roughly 60 times slower than C. So
5x probably looked like a conservative goal. For Google, a company
which buys servers by the acre, a 5x speedup would have a big payoff.
Assuming the 5x speedup was shown to be viable (i.e. performing the
same benchmarks, on the same data, can be done that quickly in any
other language, and allowing for the overheads associated with
Python's dynamic nature), then what went wrong?

Python is defined by what a naive interpreter with late binding
and dynamic name lookups, like CPython, can easily implement. Simply
emulating the semantics of CPython with generated code doesn't help
all that much.

Because you can "monkey patch" Python objects from outside the
class, a local compiler, like a JIT, can't turn name lookups into hard
bindings. Nor can it make reliable decisions about the types of
objects. That adds a sizable performance penalty. Short of global
program analysis, the compiler can't tell when code for the hard cases
needs to be generated. So the hard-case code, where you figure out at
run-time, for every use of "+", whether "+" is addition or concatenation,
has to be generated every time. Making that decision is far slower
than doing an add.
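
A small illustration of the problem (a toy example of mine, not from
the Unladen Swallow code): even "+" on a user-defined class can be
redefined from outside the class while the program runs, so compiled
code can never assume the binding is fixed:

    class Money:
        def __init__(self, cents):
            self.cents = cents
        def __add__(self, other):
            return Money(self.cents + other.cents)

    a, b = Money(100), Money(250)
    print((a + b).cents)   # 350 -- "+" means addition here

    # Monkey patch from outside the class: from now on, every "+"
    # on Money builds a string instead of adding.
    Money.__add__ = lambda self, other: "%s+%s" % (self.cents, other.cents)
    print(a + b)           # '100+250' -- same operator, new meaning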

Shed Skin, which analyzes the entire program, including libraries,
on every compilation, can figure out the types of objects and generate
much faster code. Shed Skin has some heavy restrictions, many of which
could be lifted if more work went into that effort. That's one
approach that might work.
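
To give a feel for it (again a toy example of mine, not from the Shed
Skin docs): in a program like the one below, every name only ever holds
one concrete type, so a whole-program analyzer can bind the arithmetic
to native machine operations at compile time:

    def dot(xs, ys):
        # xs and ys are only ever lists of ints in this program,
        # so the loop can compile to native integer arithmetic.
        total = 0
        for x, y in zip(xs, ys):
            total += x * y
        return total

    print(dot([1, 2, 3], [4, 5, 6]))   # 32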

I've referred to this problem as "gratuitous hidden dynamism".
Most things which could be changed dynamically in a Python program
usually aren't.

This has been pointed out many times by many people. There's
even a PhD thesis on the topic. Without a few restrictions, so
that a compiler can at least tell when support for the hard cases
is needed, Python cannot be compiled well.

John Nagle
 
Jean-Paul Calderone

     For Java, adding a JIT improved performance by much more than that.
Native-code compilers for LISP have done much better than 5x.  The
best Java and LISP compilers approach the speed of C, while CPython
is generally considered to be roughly 60 times slower than C.  So
5x probably looked like a conservative goal.  For Google, a company
which buys servers by the acre, a 5x speedup would have a big payoff.


      Python is defined by what a naive interpreter with late binding
and dynamic name lookups, like CPython, can easily implement.  Simply
emulating the semantics of CPython with generated code doesn't help
all that much.

      Because you can "monkey patch" Python objects from outside the
class, a local compiler, like a JIT, can't turn name lookups into hard
bindings.  Nor can it make reliable decisions about the types of
objects.  That adds a sizable performance penalty. Short of global
program analysis, the compiler can't tell when code for the hard cases
needs to be generated.  So the hard-case code, where you figure out at
run-time, for every use of "+", whether "+" is addition or concatenation,
has to be generated every time.  Making that decision is far slower
than doing an add.

This isn't completely accurate. It *is* possible to write a JIT
compiler for a Python runtime which has fast-path code for the common
case, the case where the meaning of "+" doesn't change between every
opcode. PyPy has produced some pretty good results with this approach.

For those who haven't seen it yet, http://speed.pypy.org/ has some
graphs which reflect fairly well on PyPy's performance for benchmarks
that are not entirely dissimilar to real-world code.

Jean-Paul
 
Mark Wooding

John Nagle said:
Python is defined by what a naive interpreter with late binding
and dynamic name lookups, like CPython, can easily implement. Simply
emulating the semantics of CPython with generated code doesn't help
all that much.

Indeed.

Because you can "monkey patch" Python objects from outside the
class, a local compiler, like a JIT, can't turn name lookups into hard
bindings. Nor can it make reliable decisions about the types of
objects.

But it /can/ make guesses. A dynamic runtime doesn't have to predict
everything right in advance; it only has to predict most things sort of
well enough, and fix up the things it got wrong before anyone notices.
For example, a Python compiler could inline a function call if it makes
a note to recompile the calling function if the called function is
modified. Most functions aren't redefined, so this is probably a pretty
good guess.
That adds a sizable performance penalty. Short of global program
analysis, the compiler can't tell when code for the hard cases needs
to be generated.

The right approach is to guess that things are going to be done the easy
way, and then detect when the guess is wrong.
So the hard-case code, where you figure out at run-time, for every use
of "+", whether "+" is addition or concatenation, has to be generated
every time. Making that decision is far slower than doing an add.

There's an old trick here called `inline caching'. The first time a
function is called, compile it so as to assume that types of things are
as you found this time: inline simple methods, and so on. Insert some
quick type checks at the top: is this going to work next time? If not,
take a trap back into the compiler. The traditional approach is to
replace the mispredictions with full dispatches (`monomorphic inline
caching'); the clever approach tolerates a few different types,
dispatching to optimized code for each (`polymorphic inline caching'),
unless there are just too many decision points and you give up.
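
As a toy sketch of the idea in pure Python (mine, purely illustrative;
a real JIT bakes the cache into generated machine code rather than a
dict): remember the handler for each receiver type seen at a call site,
and only fall back to the full lookup on a miss:

    class CallSite:
        def __init__(self):
            self.cache = {}   # observed type -> specialized handler

        def add(self, a, b):
            handler = self.cache.get(type(a))
            if handler is None:
                # Miss: do the slow generic lookup once, then memoize.
                handler = type(a).__add__
                self.cache[type(a)] = handler
            return handler(a, b)

    site = CallSite()
    print(site.add(1, 2))       # slow path; caches a handler for int
    print(site.add(3, 4))       # fast path: cached handler, no lookup
    print(site.add("a", "b"))   # new type: one more miss, then cached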

There are time/space tradeoffs to be made here too. Fortunately, you
don't have to compile everything super-optimized from the get-go: you
can dynamically identify the inner loops which need special attention,
and get the compiler to really stare hard at them. The rest of the
program might plausibly be left interpreted much of the time for all
anyone will care.
I've referred to this problem as "gratuitous hidden dynamism".
Most things which could be changed dynamically in a Python program
usually aren't.

This is one of the crucial observations for making a dynamic language go
fast; the other is that you still have the compiler around if you
guessed wrong.

An aggressively dynamic runtime has two enormous advantages over batch
compilers such as are traditionally used for C: it gets the entire
program in one go, and it gets to see the real live data that the
program's meant to run against. Given that, I'd expect it to be able to
/beat/ a batch compiler in terms of performance.
This has been pointed out many times by many people. There's
even a PhD thesis on the topic. Without a few restrictions, so
that a compiler can at least tell when support for the hard cases
is needed, Python cannot be compiled well.

This assumes static compilation. It's the wrong approach for a dynamic
language like Python.

-- [mdw]
 
Carl Banks

This isn't completely accurate. It *is* possible to write a JIT
compiler for a Python runtime which has fast-path code for the common
case, the case where the meaning of "+" doesn't change between every
opcode. PyPy has produced some pretty good results with this approach.

Right. The key is to be able to dispatch on the type of object once
for a given chunk of code, which is possible if you do some kind of
flow path analysis on the function/chunk.

PyPy is starting to look much better of late. I kind of thought their
first approach was wrong (or at least too much), but they seem to have
pushed through it.


Carl Banks
 
John Nagle

But it /can/ make guesses. A dynamic runtime doesn't have to predict
everything right in advance; it only has to predict most things sort of
well enough, and fix up the things it got wrong before anyone notices.
For example, a Python compiler could inline a function call if it makes
a note to recompile the calling function if the called function is
modified. Most functions aren't redefined, so this is probably a pretty
good guess.


The right approach is to guess that things are going to be done the easy
way, and then detect when the guess is wrong.

That's been done successfully for Self and JavaScript. It's not
easy. See this talk on JaegerMonkey:

http://blog.cdleary.com/2010/09/picing-on-javascript-for-fun-and-profit/

The effort needed to do that for JavaScript is justified by the size
of the installed base.

The Unladen Swallow people had plans to go in that direction, but they
underestimated the size of the job.

John Nagle
 
Stefan Behnel

Mark Wooding, 19.11.2010 02:35:
This assumes static compilation. It's the wrong approach for a dynamic
language like Python.

Cython does a pretty good job at that, though. It also optimistically
optimises a couple of things even during static compilation; e.g.
"x.append(y)" hints that "x" is likely a list, even if static analysis
can't prove it.
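
Roughly like this, in plain Python (my own rendering of the idea, not
Cython's actual generated code): emit a cheap type guard plus a fast
path, and keep the fully generic dispatch as the fallback:

    def append_fast(x, y):
        if type(x) is list:    # guard: the optimistic guess
            list.append(x, y)  # direct call, no attribute lookup
        else:
            x.append(y)        # generic path if the guess was wrong

    items = []
    append_fast(items, 1)      # guard passes, fast path taken
    print(items)               # [1]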

Stefan
 
