Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic f


Steve Howell

Steve Howell said:
But frankly, although there's no reason that you _have_ to name the
content at each step, I find it a lot more readable if you do:
def print_numbers():
    tuples = [(n*n, n*n*n) for n in (1,2,3,4,5,6)]
    filtered = [ cube for (square, cube) in tuples if square!=25 and
                 cube!=64 ]
    for f in filtered:
        print f
The names you give to the intermediate results here are
terse--"tuples" and "filtered"--so your code reads nicely.

But that example makes tuples and filtered into completely expanded
lists in memory.  I don't know Ruby so I've been wondering whether the
Ruby code would run as an iterator pipeline that uses constant memory.

That's a really good question. I don't know the answer. My hunch is
that you could implement generators using Ruby syntax, but it's
probably not implemented that way.

The fact that Python allows you to turn the intermediate results into
generator expressions is a very powerful feature, of course.
http://haskell.org/ghc/docs/6.10.4/html/users_guide/syntax-extns.html...

might be of interest.  Maybe Ruby and/or Python could grow something similar.

Can you elaborate?
 
Steve Howell

But frankly, although there's no reason that you _have_ to name the
content at each step, I find it a lot more readable if you do:
def print_numbers():
    tuples = [(n*n, n*n*n) for n in (1,2,3,4,5,6)]
    filtered = [ cube for (square, cube) in tuples if square!=25 and
                 cube!=64 ]
    for f in filtered:
        print f
The names you give to the intermediate results here are
terse--"tuples" and "filtered"--so your code reads nicely.
But that example makes tuples and filtered into completely expanded
lists in memory.  I don't know Ruby so I've been wondering whether the
Ruby code would run as an iterator pipeline that uses constant memory.

I don't know how Ruby works, either.  If it does use constant memory,
getting the same behavior in Python is simply a matter of switching to
generator expressions, i.e. turning the square brackets into
parentheses:

def print_numbers():
    tuples = ((n*n, n*n*n) for n in (1,2,3,4,5,6))
    filtered = ( cube for (square, cube) in tuples if square!=25 and
                 cube!=64 )
    for f in filtered:
        print f

Replace (1,2,3,4,5,6) with xrange(100000000) and memory usage still
stays constant.
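
To make the laziness visible, here is a minimal sketch in Python 3 (the code in this thread is Python 2; `range` here plays the role of `xrange`). The generator functions and the `trace` list are hypothetical helpers added only to record the order in which the stages run. Under those assumptions, each value passes through both stages before the next one is computed, and nothing beyond what the consumer asks for is ever produced:

```python
from itertools import islice

trace = []  # hypothetical helper: records the order in which stages run

def squares_and_cubes(numbers):
    for n in numbers:
        trace.append(("compute", n))
        yield (n * n, n * n * n)

def keep_wanted_cubes(pairs):
    for square, cube in pairs:
        trace.append(("filter", square))
        if square != 25 and cube != 64:
            yield cube

# range() is lazy in Python 3, so this huge input is never materialized.
pipeline = keep_wanted_cubes(squares_and_cubes(range(1, 100000000)))

# Pull only the first four results; the rest of the range is never touched.
first_four = list(islice(pipeline, 4))
print(first_four)   # [1, 8, 27, 216]
print(trace[:4])    # compute and filter alternate, one value at a time
```

Only six inputs are ever computed here, because `islice` stops asking once it has four results.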

Though for this particular example, I prefer a strict looping solution
akin to what Jonathan Gardner had upthread:

for n in (1,2,3,4,5,6):
    square = n*n
    cube = n*n*n
    if square == 25 or cube == 64: continue
    print cube

I don't think the assertion that the names would be ridiculously long
is accurate, either.

Something like:

departments = blah
ny_depts = blah(departments)
non_bonus_depts = blah(ny_depts)
non_bonus_employees = blah(non_bonus_depts)
employee_names = blah(non_bonus_employees)

If the code is at all well-structured, it'll be just as obvious from
the context that each list/generator/whatever is building from the
previous one as it is in the anonymous block case.

I agree that the names don't have to be as ridiculously long as my
examples, but using intermediate locals forces you to come up with
consistent abbreviations between adjacent lines, which adds to the
maintenance burden. When the requirements change so that bonuses
apply to NY and PA departments, you would have to change three places
in the code instead of one.

To the extent that each of your transformations were named functions,
you'd need to maintain the names there as well (something more
descriptive than "blah").
 
Paul Rubin

Steve Howell said:
Can you elaborate?

List comprehensions are a Python feature you're probably familiar with,
and I think Ruby has something like them too. They originally came from
Haskell. GHC (the main Haskell implementation) now supports an extended
list comprehension syntax with SQL-like features. I haven't used it
much yet, but here's an example from a poker ranking program
(http://www.rubyquiz.com/quiz24.html) that I did as a Haskell exercise:

let (winners:others) =
      [zip c ls | ls <- lines cs
                , let {h = mkHand ls; c=classify h}
                , then group by c
                , then sortWith by Down c]

It's reasonably evocative and doing the same thing with the older
syntax would have been a big mess. "Down" basically means sort
in reverse order.
 
Steve Howell

Steve Howell said:
But frankly, although there's no reason that you _have_ to name the
content at each step, I find it a lot more readable if you do:
def print_numbers():
    tuples = [(n*n, n*n*n) for n in (1,2,3,4,5,6)]
    filtered = [ cube for (square, cube) in tuples if square!=25 and
                 cube!=64 ]
    for f in filtered:
        print f
The names you give to the intermediate results here are
terse--"tuples" and "filtered"--so your code reads nicely.

But that example makes tuples and filtered into completely expanded
lists in memory.  I don't know Ruby so I've been wondering whether the
Ruby code would run as an iterator pipeline that uses constant memory.

Running the following code would probably answer your question. At
least in the case of Array.map and Array.reject, under my version of
Ruby, each block transforms the entire array before passing control to
the next block.

def print_numbers()
    [1, 2, 3, 4, 5, 6].map { |n|
        puts 'first block', n
        [n * n, n * n * n]
    }.reject { |square, cube|
        puts 'reject', square
        square == 25 || cube == 64
    }.map { |square, cube|
        cube
    }.each { |cube|
        puts cube
    }
end

print_numbers()

But I'm running only version 1.8.7. Version 1.9 of Ruby apparently
introduced something more akin to generators and Unix pipelines:

http://pragdave.blogs.pragprog.com/pragdave/2007/12/pipelines-using.html

I haven't tried them myself.
 
Kurt Smith

    def print_numbers()
        [1, 2, 3, 4, 5, 6].map { |n|
            [n * n, n * n * n]
        }.reject { |square, cube|
            square == 25 || cube == 64
        }.map { |square, cube|
            cube
        }.each { |n|
            puts n
        }
    end

If this style of programming were useful, we would all be writing Lisp
today. As it turned out, Lisp is incredibly difficult to read and
understand, even for experienced Lispers. I am pleased that Python is
not following Lisp in that regard.

for n in range(1,7):
    square = n*n
    cube = n*n*n
    if square == 25 or cube == 64: continue
    print cube

There's definitely a cognitive dissonance between imperative
programming and functional programming.  It's hard for programmers
used to programming in an imperative style to appreciate a functional
approach, because functional solutions often read "upside down" in the
actual source code and common algebraic notation:

   def compute_squares_and_cubes(lst):
       return [(n * n, n * n * n) for n in lst]

   def reject_bad_values(lst):
       return [(square, cube) for (square, cube) \
           in lst if not (square == 25 or cube == 64)]

   def cubes_only(lst):
       return [cube for square, cube in lst]

   def print_results(lst):
       # 1. compute_squares_and_cubes
       # 2. reject_bad_values
       # 3. take cubes_only
       # 4. print values
       for item in \
           cubes_only( # 3
               reject_bad_values( # 2
                   compute_squares_and_cubes(lst))): # 1
           print item # 4

You can, of course, restore the natural order of operations to read
top-down with appropriate use of intermediate locals:

   def print_results(lst):
       lst2 = compute_squares_and_cubes(lst)
       lst3 = reject_bad_values(lst2)
       lst4 = cubes_only(lst3)
       for item in lst4:
           print item

# sent the original to the wrong place -- resending to python-list.

Somewhat off topic, but only somewhat: you could use coroutines to
get a pipeline effect.

#--------------8<-----------------------------
# Shamelessly lifted from David Beazley's
# http://www.dabeaz.com/coroutines/

def coroutine(co):
    def _inner(*args, **kwargs):
        gen = co(*args, **kwargs)
        gen.next()
        return gen
    return _inner

def squares_and_cubes(lst, target):
    for n in lst:
        target.send((n * n, n * n * n))

@coroutine
def reject_bad_values(target):
    while True:
        square, cube = (yield)
        if not (square == 25 or cube == 64):
            target.send((square, cube))

@coroutine
def cubes_only(target):
    while True:
        square, cube = (yield)
        target.send(cube)

@coroutine
def print_results():
    while True:
        print (yield)

squares_and_cubes(range(10),
    reject_bad_values(
        cubes_only(
            print_results()
        )
    )
)
#--------------8<-----------------------------
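
For anyone trying this on Python 3, here is a rough port of the same push-style pipeline. The only changes assumed necessary are `next(gen)` in place of `gen.next()` and a sink that appends to a list instead of printing; the `collect` sink is a hypothetical stand-in for `print_results`, used so the output can be inspected afterwards:

```python
def coroutine(func):
    """Decorator that advances a generator to its first yield."""
    def _inner(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # Python 3 spelling of gen.next()
        return gen
    return _inner

def squares_and_cubes(numbers, target):
    # Source stage: pushes (square, cube) pairs into the pipeline.
    for n in numbers:
        target.send((n * n, n * n * n))

@coroutine
def reject_bad_values(target):
    while True:
        square, cube = (yield)
        if not (square == 25 or cube == 64):
            target.send((square, cube))

@coroutine
def cubes_only(target):
    while True:
        _square, cube = (yield)
        target.send(cube)

@coroutine
def collect(results):
    # Hypothetical sink: stores each value instead of printing it.
    while True:
        results.append((yield))

results = []
squares_and_cubes(range(10), reject_bad_values(cubes_only(collect(results))))
print(results)  # [0, 1, 8, 27, 216, 343, 512, 729]
```

Each value is pushed all the way to the sink before the source produces the next one, so no intermediate list is built.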
 
Steven D'Aprano

The names you give to the intermediate results here are terse--"tuples"
and "filtered"--so your code reads nicely.

In a more real world example, the intermediate results would be
something like this:

departments
departments_in_new_york
departments_in_new_york_not_on_bonus_cycle
employees_in_departments_in_new_york_not_on_bonus_cycle
names_of_employee_in_departments_in_new_york_not_on_bonus_cycle

Those last two could be written more concisely as:

serfs_in_new_york
names_of_serfs_in_new_york_as_if_we_cared

But seriously... if you have a variable called "departments_in_new_york",
presumably you also have variables called "departments_in_washington",
"departments_in_los_angeles", "departments_in_houston",
"departments_in_walla_walla", and so forth. If so, this is a good sign
that you are doing it wrong and you need to rethink your algorithm.
 
Gregory Ewing

Steve said:
Python may not support the broadest notion of anonymous functions, but
it definitely has anonymous blocks. You can write this in Python:

for i in range(10):
    print i
    print i * i
    print i * i * i

There's a clear difference between this and a Ruby block,
however. A "block" in Ruby is implemented by passing a
callable object to a method. There is no callable object
corresponding to the body of a for-loop in Python.

The important thing about Ruby blocks is not that they're
anonymous, but that they're concrete objects that can
be manipulated.

The Ruby approach has the advantage of making it possible
to implement user-defined control structures without
requiring a macro facility. You can't do that in Python.

However, there's something that Python's iterator protocol
makes possible that you can't do with a block-passing
approach. You can have multiple iterators active at once,
and pull values from them as and when required in the
calling code. Ruby's version of the iterator protocol
can't handle that, because once an iterator is started
it retains control until it's finished.
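
As an illustration of the Python side of that point, here is a small Python 3 sketch that keeps two iterators active at once and pulls from whichever one has the smaller next value. The function name `merge_sorted` and the sample inputs are made up for the example:

```python
def merge_sorted(left, right):
    """Merge two already-sorted iterables, pulling from each only as needed."""
    left_iter, right_iter = iter(left), iter(right)
    done = object()  # sentinel marking an exhausted iterator
    a, b = next(left_iter, done), next(right_iter, done)
    while a is not done and b is not done:
        if a <= b:
            yield a
            a = next(left_iter, done)
        else:
            yield b
            b = next(right_iter, done)
    # One side is exhausted; drain whatever remains on the other.
    while a is not done:
        yield a
        a = next(left_iter, done)
    while b is not done:
        yield b
        b = next(right_iter, done)

squares = (n * n for n in range(1, 6))    # 1, 4, 9, 16, 25
cubes = (n * n * n for n in range(1, 4))  # 1, 8, 27
merged = list(merge_sorted(squares, cubes))
print(merged)  # [1, 1, 4, 8, 9, 16, 25, 27]
```

The calling code decides which iterator to advance at each step; neither generator ever takes over control of the loop.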

Also, most people who advocate adding some form of
block-passing facility to Python don't seem to have
thought through what would happen if the block contains
any break, continue, return or yield statements.

This issue was looked into in some detail back when there
was consideration of implementing the with-statement
by passing the body as a function. Getting these
statements to behave intuitively inside the body
turned out to be a very thorny problem -- thorny enough
to cause the block-passing idea to be abandoned in
favour of the current implementation.
 
Steven D'Aprano

The Ruby approach has the advantage of making it possible to implement
user-defined control structures without requiring a macro facility. You
can't do that in Python. [...]
Also, most people who advocate adding some form of block-passing
facility to Python don't seem to have thought through what would happen
if the block contains any break, continue, return or yield statements.

That is the only time I ever wanted blocks: I had a series of functions
containing for loops that looked something vaguely like this:

for x in sequence:
    code_A
    try:
        something
    except some_exception:
        code_B

where code_B was different in each function, so I wanted to pull it out
as a code block and do this:


def common_loop(x, block):
    code_A
    try:
        something
    except some_exception:
        block

for x in sequence:
    common_loop(x, block)


The problem was that the blocks contained a continue statement, so I was
stymied.
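
One possible workaround, sketched here in Python 3 under assumptions of my own: move the loop into the shared helper and let the handler signal "continue" through its return value. The names `process_all`, `work` and `on_error` are hypothetical, and `ValueError` stands in for `some_exception`:

```python
def process_all(sequence, work, on_error):
    """Apply work(x) to each item; on ValueError, call on_error(x, exc).

    If on_error returns True, the rest of the iteration is skipped,
    which emulates a `continue` inside the handler block.
    """
    results = []
    for x in sequence:
        try:
            value = work(x)
        except ValueError as exc:
            if on_error(x, exc):
                continue  # the handler asked to skip this item
            value = None  # otherwise fall through with a placeholder
        results.append(value)
    return results

skipped = process_all(["1", "x", "3"], int, lambda x, exc: True)
kept = process_all(["1", "x", "3"], int, lambda x, exc: False)
print(skipped)  # [1, 3]
print(kept)     # [1, None, 3]
```

This only covers `continue`; a handler that needs `break` or `return` would need a richer protocol, such as distinct sentinel return values or exceptions.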
 
Steve Howell

Those last two could be written more concisely as:

serfs_in_new_york
names_of_serfs_in_new_york_as_if_we_cared

But seriously... if you have a variable called "departments_in_new_york",
presumably you also have variables called "departments_in_washington",
"departments_in_los_angeles", "departments_in_houston",
"departments_in_walla_walla", and so forth. If so, this is a good sign
that you are doing it wrong and you need to rethink your algorithm.

Sure, but it could also be that you're launching a feature that is
only temporarily limited to New York departments, and any investment
in coming up with names for the New York filter function or
intermediate local variables becomes pointless once you go national:

# version 1
emps = [
    ['Bob Rich', 'NY', 55],
    ['Alice Serf', 'NY', 30],
    ['Joe Peasant', 'MD', 12],
    ['Mary Pauper', 'CA', 13],
]

emps.select { |name, state, salary|
    salary < 40
}.select { |name, state, salary|
    # limit bonuses to NY for now...reqs
    # may change!
    state == 'NY'
}.each { |name, state, salary|
    new_salary = salary * 1.1
    puts "#{name} gets a raise to #{new_salary}!"
}

# version 2
emps = [
    ['Bob Rich', 'NY', 55],
    ['Alice Serf', 'NY', 30],
    ['Joe Peasant', 'MD', 12],
    ['Mary Pauper', 'CA', 13],
]

emps.select { |name, state, salary|
    salary < 40
}.each { |name, state, salary|
    new_salary = salary * 1.1
    puts "#{name} gets a raise to #{new_salary}!"
}
 
Steve Howell

    def print_numbers()
        [1, 2, 3, 4, 5, 6].map { |n|
            [n * n, n * n * n]
        }.reject { |square, cube|
            square == 25 || cube == 64
        }.map { |square, cube|
            cube
        }.each { |n|
            puts n
        }
    end
If this style of programming were useful, we would all be writing Lisp
today. As it turned out, Lisp is incredibly difficult to read and
understand, even for experienced Lispers. I am pleased that Python is
not following Lisp in that regard.
for n in range(1,7):
    square = n*n
    cube = n*n*n
    if square == 25 or cube == 64: continue
    print cube
There's definitely a cognitive dissonance between imperative
programming and functional programming.  It's hard for programmers
used to programming in an imperative style to appreciate a functional
approach, because functional solutions often read "upside down" in the
actual source code and common algebraic notation:
   def compute_squares_and_cubes(lst):
       return [(n * n, n * n * n) for n in lst]
   def reject_bad_values(lst):
       return [(square, cube) for (square, cube) \
           in lst if not (square == 25 or cube == 64)]
   def cubes_only(lst):
       return [cube for square, cube in lst]
   def print_results(lst):
       # 1. compute_squares_and_cubes
       # 2. reject_bad_values
       # 3. take cubes_only
       # 4. print values
       for item in \
           cubes_only( # 3
               reject_bad_values( # 2
                   compute_squares_and_cubes(lst))): # 1
           print item # 4
You can, of course, restore the natural order of operations to read
top-down with appropriate use of intermediate locals:
   def print_results(lst):
       lst2 = compute_squares_and_cubes(lst)
       lst3 = reject_bad_values(lst2)
       lst4 = cubes_only(lst3)
       for item in lst4:
           print item

# sent the original to the wrong place -- resending to python-list.

Somewhat off topic, but only somewhat:  you could use coroutines to
get a pipeline effect.

#--------------8<-----------------------------
# Shamelessly lifted from David Beazley's
#  http://www.dabeaz.com/coroutines/

def coroutine(co):
   def _inner(*args, **kwargs):
       gen = co(*args, **kwargs)
       gen.next()
       return gen
   return _inner

def squares_and_cubes(lst, target):
   for n in lst:
       target.send((n * n, n * n * n))

@coroutine
def reject_bad_values(target):
   while True:
       square, cube = (yield)
       if not (square == 25 or cube == 64):
           target.send((square, cube))

@coroutine
def cubes_only(target):
   while True:
       square, cube = (yield)
       target.send(cube)

@coroutine
def print_results():
   while True:
       print (yield)

squares_and_cubes(range(10),
       reject_bad_values(
           cubes_only(
               print_results()
               )
           )
       )

Wow! It took me a while to get my head around it, but that's pretty
cool.
 
Steve Howell

There's a clear difference between this and a Ruby block,
however. A "block" in Ruby is implemented by passing a
callable object to a method. There is no callable object
corresponding to the body of a for-loop in Python.

The important thing about Ruby blocks is not that they're
anonymous, but that they're concrete objects that can
be manipulated.

Agreed.

The Ruby approach has the advantage of making it possible
to implement user-defined control structures without
requiring a macro facility. You can't do that in Python.

However, there's something that Python's iterator protocol
makes possible that you can't do with a block-passing
approach. You can have multiple iterators active at once,
and pull values from them as and when required in the
calling code. Ruby's version of the iterator protocol
can't handle that, because once an iterator is started
it retains control until it's finished.

Is this still true of Ruby today?

http://pragdave.blogs.pragprog.com/pragdave/2007/12/pipelines-using.html
Also, most people who advocate adding some form of
block-passing facility to Python don't seem to have
thought through what would happen if the block contains
any break, continue, return or yield statements.

For sure. It's certainly not clear to me how Ruby handles all those
cases, although I am still quite new to Ruby, so it's possible that I
just haven't stumbled upon the best explanations yet.
This issue was looked into in some detail back when there
was consideration of implementing the with-statement
by passing the body as a function. Getting these
statements to behave intuitively inside the body
turned out to be a very thorny problem -- thorny enough
to cause the block-passing idea to be abandoned in
favour of the current implementation.

I found these links in the archive...were these part of the discussion
you were referring to?

http://mail.python.org/pipermail/python-dev/2005-April/052907.html

http://mail.python.org/pipermail/python-dev/2005-April/053055.html

http://mail.python.org/pipermail/python-dev/2005-April/053123.html
 
Anh Hai Trinh

Wow!  It took me a while to get my head around it, but that's pretty
cool.


This pipeline idea has actually been implemented further, see
<http://blog.onideas.ws/stream.py>.

from stream import map, filter, cut
range(10) >> map(lambda x: [x**2, x**3]) >> filter(lambda t: t[0]!=25 and t[1]!=64) >> cut[1] >> list
[0, 1, 8, 27, 216, 343, 512, 729]
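
For readers wondering how a `>>` pipeline can work at all, here is a minimal Python 3 sketch of the general idea. It is not how stream.py is implemented; the `Stage` class and the helpers `mapping`, `keeping`, `column` and `to_list` are hypothetical names invented for this example. It relies only on the reflected operator method `__rrshift__`, which Python calls when the left operand does not handle `>>` itself:

```python
class Stage:
    """Wraps a function from iterable to iterable so it can follow `>>`."""
    def __init__(self, transform):
        self.transform = transform

    def __rrshift__(self, upstream):
        # Invoked for `upstream >> stage` because ranges, lists and
        # generators do not define `>>` themselves.
        return self.transform(upstream)

def mapping(func):
    return Stage(lambda items: (func(item) for item in items))

def keeping(predicate):
    return Stage(lambda items: (item for item in items if predicate(item)))

def column(index):
    return Stage(lambda items: (item[index] for item in items))

to_list = Stage(list)  # sink stage that materializes the results

result = (range(10)
          >> mapping(lambda x: (x ** 2, x ** 3))
          >> keeping(lambda t: t[0] != 25 and t[1] != 64)
          >> column(1)
          >> to_list)
print(result)  # [0, 1, 8, 27, 216, 343, 512, 729]
```

Every stage except the sink returns a generator, so the chain stays lazy until `to_list` consumes it.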
 
Steve Howell

This pipeline idea has actually been implemented further, see
<http://blog.onideas.ws/stream.py>.
from stream import map, filter, cut
range(10) >> map(lambda x: [x**2, x**3]) >> filter(lambda t: t[0]!=25 and t[1]!=64) >> cut[1] >> list
[0, 1, 8, 27, 216, 343, 512, 729]

Wow, cool!

Just to show that you can easily add the iterator.map(f).blabla-syntax  
to Python:

     from __future__ import print_function

     class rubified(list):
         map    = lambda self, f: rubified(map(f, self))
         filter = lambda self, f: rubified(filter(f, self))
         reject = lambda self, f: rubified(filter(lambda x: not f(x), self))
         # each = lambda self, f: rubified(reduce(lambda x, y: print(y), self, None))
         def each(self, f):
             for x in self: f(x)

         def __new__(cls, value):
             return list.__new__(cls, value)

     def print_numbers():
         rubified([1, 2, 3, 4, 5, 6]).map(lambda n:
             [n * n, n * n * n]).reject(lambda (square, cube):
             square == 25 or cube == 64).map(lambda (square, cube):
             cube).each(lambda n:
             print(n))

Sure, that definitely achieves the overall sequential structure of
operations that I like in Ruby. A couple of other examples have been
posted as well now, which also mimic something akin to a Unix
pipeline.

A lot of Ruby that I see gets spelled like this:

list.select { |arg1, arg2|
    expr
}.reject { |arg|
    expr
}.collect { |arg|
    expr
}

With your class you can translate into Python as follows:

list.select(lambda arg1, arg2:
    expr
).reject(lambda arg:
    expr
).collect(lambda arg:
    expr
)

So for chaining transformations based on filters, the difference
really just comes down to syntax (and how much sugar is built into the
core library).
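
As a concrete version of that translation, here is a hedged Python 3 sketch. Tuple-unpacking lambda parameters such as `lambda (square, cube):` were removed in Python 3, so the lambdas below index into the pair instead. The `Chain` class is a hypothetical name for this example, not something from the posts above:

```python
class Chain(list):
    """Hypothetical list subclass with Ruby-style chainable methods."""

    def map(self, func):
        return Chain(func(item) for item in self)

    def select(self, predicate):
        return Chain(item for item in self if predicate(item))

    def reject(self, predicate):
        return Chain(item for item in self if not predicate(item))

    def each(self, action):
        for item in self:
            action(item)
        return self  # returning self lets the chain continue

cubes = (Chain([1, 2, 3, 4, 5, 6])
         .map(lambda n: (n * n, n * n * n))
         .reject(lambda pair: pair[0] == 25 or pair[1] == 64)
         .map(lambda pair: pair[1]))
cubes.each(print)  # prints 1, 8, 27 and 216 on separate lines
```

Like Ruby's Array methods as described earlier in the thread, each step here builds a complete intermediate list before the next step runs.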

The extra expressiveness of Ruby comes from the fact that you can add
statements within the block, which I find useful sometimes just for
debugging purposes:

debug = true
data = strange_dataset_from_third_party_code()
data.each { |arg|
    if debug and arg > 10000
        puts arg
    end
    # square the values
    arg * arg
}
 
Steven D'Aprano

The extra expressiveness of Ruby comes from the fact that you can add
statements within the block, which I find useful sometimes just for
debugging purposes:

debug = true
data = strange_dataset_from_third_party_code()
data.each { |arg|
    if debug and arg > 10000
        puts arg
    end
    # square the values
    arg * arg
}

How is that different from this?

debug = True
data = strange_dataset_from_third_party_code()
for i, arg in enumerate(data):
    if debug and arg > 10000:
        print arg
    # square the values
    data[i] = arg * arg


I don't see the extra expressiveness. What I see is that the Ruby snippet
takes more lines (even excluding the final brace), and makes things
implicit which in my opinion should be explicit. But since I'm no Ruby
expert, perhaps I'm misreading it.
 
Steve Howell

The extra expressiveness of Ruby comes from the fact that you can add
statements within the block, which I find useful sometimes just for
debugging purposes:
    debug = true
    data = strange_dataset_from_third_party_code()
    data.each { |arg|
        if debug and arg > 10000
            puts arg
        end
        # square the values
        arg * arg
    }

How is that different from this?

debug = True
data = strange_dataset_from_third_party_code()
for i, arg in enumerate(data):
    if debug and arg > 10000:
        print arg
    # square the values
    data[i] = arg * arg

I don't see the extra expressiveness. What I see is that the Ruby snippet
takes more lines (even excluding the final brace), and makes things
implicit which in my opinion should be explicit. But since I'm no Ruby
expert, perhaps I'm misreading it.


You are reading the example out of context.

Can you re-read the part you snipped?

The small piece of code can obviously be written imperatively, but the
point of the example was not to print a bunch of squares.
 
Lie Ryan

In a more real world example, the intermediate results would be
something like this:

departments
departments_in_new_york
departments_in_new_york_not_on_bonus_cycle
employees_in_departments_in_new_york_not_on_bonus_cycle
names_of_employee_in_departments_in_new_york_not_on_bonus_cycle

I fared better, with less than ten seconds' thinking:

departments
eligible_departments
eligible_departments
eligible_employees
eligible_employee_names

As a bonus, they would be much more resilient to changes in the
eligibility requirements.

Names don't have to describe exactly what's in them; in fact, if your
names are way too descriptive, they may take significantly more brain
cycles to parse. A good name abstracts the objects contained in it.
 
Lawrence D'Oliveiro

They are both lambda forms in Python. As a Python expression, they
evaluate to (they “return”) a function object.

So there is no distinction between functions and procedures, then?
 
Lawrence D'Oliveiro

In message <84166541-c10a-47b5-ae5b-
Some people make the definition of function more restrictive--"if it
has side effects, it is not a function."

Does changing the contents of CPU cache count as a side-effect?
 
