Proposed new collection methods


Mike Meyer

Another thread pointed out a couple of methods that would be nice to
have on Python collections: find and inject. These are taken from
<URL: http://martinfowler.com/bliki/CollectionClosureMethod.html >.

find can be defined as:

def find(self, test=None):
    for item in self:
        if test:
            if test(item):
                return item
        elif item:
            return item
    raise ValueError('%s.index(): no matching items in list'
                     % self.__class__.__name__)

find with no arguments is identical to the any function <URL:
http://www.artima.com/forums/flat.jsp?forum=106&thread=98196 > that
Guido has already accepted <URL:
http://mail.python.org/pipermail/python-dev/2005-March/052010.html >,
except it's a method.

find raises a ValueError so that you can search for any value. This
also keeps its behavior similar to index.
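A minimal sketch of how that definition behaves, written as a plain function so it runs on its own. The free-function form, the error text, and the sample data are assumptions for illustration, and the syntax is modern Python 3 rather than the Python of this thread. Because a failed search raises ValueError instead of returning a sentinel, a match that is itself 0 or None can be told apart from "no match".

```python
def find(iterable, test=None):
    """Return the first item passing test, or the first truthy item if test is None."""
    for item in iterable:
        if test is not None:
            if test(item):
                return item
        elif item:
            return item
    raise ValueError('find(): no matching items')


# With a test, falsy values such as 0 or None can still be found.
first_zero = find([3, 0, 5], lambda x: x == 0)
first_none = find([1, None, 2], lambda x: x is None)

# With no test, the first truthy item comes back.
first_truthy = find([0, '', 'spam', 7])

# A failed search is reported by an exception, never by a return value.
try:
    find([1, 3, 5], lambda x: x % 2 == 0)
    missing_raises = False
except ValueError:
    missing_raises = True
```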

An alternative to adding a new method to collections would be adding a
new keyword argument - test - to the list index method. This would
remove the overlap of functionality with any. However, it would also
mean you couldn't use it on anything but lists (and presumably
strings). I'm not sure how serious a restriction that is.

inject is basically an OO version of reduce. You can define it in
terms of reduce:

def inject(self, op, init=None):
    return reduce(op, self, init)

The arguments against reduce probably apply to it as well. But it
makes the pain from removing reduce vanish.
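One caveat with the two-line definition: passing init=None straight through makes None the starting value of the fold, which fails for most operators. A hedged sketch that distinguishes "no initial value" with a sentinel follows. The free-function form and the name _MISSING are assumptions for illustration, and in Python 3 reduce has to be imported from functools.

```python
from functools import reduce  # reduce is no longer a builtin in Python 3
import operator

_MISSING = object()  # hypothetical sentinel meaning "no initial value given"


def inject(iterable, op, init=_MISSING):
    """Fold iterable with the two-argument function op, like reduce."""
    if init is _MISSING:
        return reduce(op, iterable)
    return reduce(op, iterable, init)


total = inject([1, 2, 3, 4], operator.add)        # ((1 + 2) + 3) + 4
product = inject([1, 2, 3, 4], operator.mul, 10)  # the fold starts from 10
empty_sum = inject([], operator.add, 0)           # init is returned for empty input
```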

These have probably been proposed before, but under other names. If
so, I'd appreciate pointers to the discussion.

<mike
 

Christopher Subich

Mike said:
Another thread pointed out a couple of methods that would be nice to
have on Python collections: find and inject. These are taken from
<URL: http://martinfowler.com/bliki/CollectionClosureMethod.html >.

find can be defined as:

def find(self, test=None):
    for item in self:
        if test:
            if test(item):
                return item
        elif item:
            return item
    raise ValueError('%s.index(): no matching items in list'
                     % self.__class__.__name__)

Dear Zeus no. Find can be defined as:
def find(self, test=lambda x: 1):
    try:
        item = (s for s in iter(self) if test(s)).next()
    except StopIteration:
        raise ValueError('No matching items in list')
    return item

Let's use generators, people. And given the simplicity of writing this,
I'm not sure we need it in the standard library -- especially since the
default test is arbitrary. This recipe can also be useful for
dictionaries, where the syntax would be slightly different, and
lists-of-immutables, in which case returning the index in the list might
be better. Too much customization.

Mike said:
inject is basically an OO version of reduce. You can define it in
terms of reduce:

Except that if it's exactly reduce, we don't need to call it inject.
The problem with reduce in python isn't that it's functional rather than
OO, it's that it's limited to a function call or lambda -- one
expression rather than an anonymous block.
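To illustrate the limitation, here is a sketch of a fold whose step needs several statements, so it cannot be written as a lambda and has to be a named function defined ahead of the call. The function name and the sample string are made up for the example, and the import reflects Python 3.

```python
from functools import reduce


def longest_run(state, item):
    """Multi-statement fold step: track the longest run of equal items."""
    previous, run, best = state
    run = run + 1 if item == previous else 1
    return item, run, max(best, run)


# state is (previous item, current run length, best run length so far)
_, _, best = reduce(longest_run, 'aabbbbcc', (None, 0, 0))
```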
 

Robert Kern

Christopher said:
Dear Zeus no. Find can be defined as:
def find(self, test=lambda x: 1):
    try:
        item = (s for s in iter(self) if test(s)).next()
    except StopIteration:
        raise ValueError('No matching items in list')
    return item

I would prefer that a find() operation return as soon as it locates an
item that passes the test. This generator version tests every item.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 

Robert Kern

Robert said:
I would prefer that a find() operation return as soon as it locates an
item that passes the test. This generator version tests every item.

Pardon me, I am retarded.

 

Jeff Schwab

Robert said:
Pardon me, I am retarded.

Why are you retarded? Isn't the above code O(n)?

Forgive me for not understanding, I'm still awfully new to Python
(having come from Perl & C++), and I didn't see an explanation in the FAQ.
 

Robert Kern

Jeff said:
Why are you retarded? Isn't the above code O(n)?

Forgive me for not understanding, I'm still awfully new to Python
(having come from Perl & C++), and I didn't see an explanation in the FAQ.

(s for s in iter(self) if test(s)) is a generator expression. It is
roughly equivalent to

def g(self, test=lambda x: True):
    for s in iter(self):
        if test(s):
            yield s

Now, if I were to do

item = g(self, test).next()

the generator would execute the code until it reached the yield
statement which happens when it finds the first item that passes the
test. That item will get returned, and execution does not return to the
generator again.

That implementation does indeed return as soon as it locates the first
item, so yes, I was being retarded.
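The early exit can be checked directly by recording which items the test is called on. This is a sketch in modern Python 3, where the builtin next() replaces the .next() method call. The tested list, the is_even helper, and the free-function form of find are assumptions added for the demonstration.

```python
def find(iterable, test=lambda x: True):
    """Return the first item that passes test, using a generator expression."""
    try:
        # next() is the Python 3 spelling of the .next() method call.
        return next(s for s in iterable if test(s))
    except StopIteration:
        raise ValueError('No matching items') from None


tested = []  # records every item the test function is called on


def is_even(x):
    tested.append(x)
    return x % 2 == 0


result = find([1, 3, 4, 5, 6], is_even)
# Items after the first match (5 and 6) are never passed to is_even.
```

After the call, tested holds only the items up to and including the first match, which is the lazy behavior described above.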

 

Jeff Schwab

Robert said:
(s for s in iter(self) if test(s)) is a generator expression. It is
roughly equivalent to

def g(self, test=lambda x: True):
    for s in iter(self):
        if test(s):
            yield s

Now, if I were to do

item = g(self, test).next()

the generator would execute the code until it reached the yield
statement which happens when it finds the first item that passes the
test. That item will get returned, and execution does not return to the
generator again.

That implementation does indeed return as soon as it locates the first
item, so yes, I was being retarded.

Wow, that's cool! Very reminiscent of C++ input iterators, but a lot
cleaner and shorter. Thanks.
 

Christopher Subich

(s for s in iter(self) if test(s)) is a generator expression. It is
roughly equivalent to [snip]
That implementation does indeed return as soon as it locates the first
item, so yes, I was being retarded.

Thank you for the best programming-language related laugh I've had
today. :) I know just the kind of synapse-misfires that lead to
completely obvious, yet also completely wrong conclusions like that --
I've done well more than my own share.

For the grandparent poster: generators and their baby brother, generator
expressions, are the kind of really neat feature that you're never told
about in CS101. Generators themselves are relatively well-covered in
the Python documentation, which should serve as a decent introduction to
the topic.
 

Christopher Subich

Jeff said:
Robert said:
Now, if I were to do

item = g(self, test).next()

the generator would execute the code until it reached the yield
statement which happens when it finds the first item that passes the
test. That item will get returned, and execution does not return to
the generator again.
[snip]
Wow, that's cool! Very reminiscent of C++ input iterators, but a lot
cleaner and shorter. Thanks.

Read up on the __iter__ and __next__ methods that can be implemented by
objects; Python objects have a very nice way of becoming (and returning)
iterators. Generator functions and expressions are magic ways of doing
some Really Common Things as iterators.
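As a rough sketch of that relationship, here is a hand-written iterator next to a generator function that produces the same values. The Countdown classes and the countdown function are hypothetical names for the example. Note that __next__ is the Python 3 spelling; in the Python 2 of this thread's era the method was named next.

```python
class Countdown:
    """Hypothetical iterable that counts down from start to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call hands back a fresh iterator over the same range.
        return CountdownIterator(self.start)


class CountdownIterator:
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self  # iterators return themselves

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator function producing the same values with far less code."""
    while start > 0:
        yield start
        start -= 1


from_class = list(Countdown(3))
from_generator = list(countdown(3))
```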
 

Terry Reedy

Mike Meyer said:
Another thread pointed out a couple of methods that would be nice to
have on Python collections: find and inject.

Since Python does not have a collections superclass, I am puzzled as to
what you are really proposing.

Mike Meyer said:
find with no arguments is identical to the any function,
except it's [find is] a method.

Again, a method of what?

The virtue of functions of iterables is that once written, the
functionality immediately becomes available to *EVERY* iterable -- for
free. Moreover, anybody can write one of the infinite number of such
functions and use it *without* begging Guido (or Mats) to please add it to
a perhaps already overly long list of methods of the collection superclass.

Mike Meyer said:
An alternative to adding a new method to collections would be adding a
new keyword argument - test - to the list index method. This would
remove the overlap of functionality with any. However, it would also
mean you couldn't use it on anything but lists (and presumably
strings). I'm not sure how serious a restriction that is.

Compared to the infinite universe of possible iterables, the restriction is
pretty severe. Index should perhaps be turned into a universal iterable
function. You could suggest that any() get an optional keyword or 2nd
param.
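A sketch of what index as a universal iterable function might look like. The signature and error text are assumptions, not an existing builtin. Written once as a function, it works on lists, strings, and generators alike, including an endless one.

```python
def index(iterable, test):
    """Return the position of the first item for which test(item) is true."""
    for position, item in enumerate(iterable):
        if test(item):
            return position
    raise ValueError('index(): no matching items')


def squares():
    """An endless generator, to show that no list is required."""
    n = 0
    while True:
        yield n * n
        n += 1


in_list = index(['a', 'bb', 'ccc'], lambda s: len(s) == 2)
in_string = index('hello', lambda ch: ch == 'l')
in_generator = index(squares(), lambda sq: sq > 50)
```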

In Python, generic functions are functions. Methods are functions properly
restricted to one or a few classes and perhaps subclasses.

Mike Meyer said:
def inject(self, op, init=None):
    return reduce(op, self, init)

You aren't seriously suggesting that reduce be renamed to something more
obscure, even contradictory, are you? Inject means to add something into,
while reduce usually means to pull a summary out of. Perhaps I missed
something.

Terry J. Reedy
 
