Recurring patterns: Am I missing it, or can we get these added to the language?

Erich

Hello all,

Today I found myself once again defining two functions that I use all
the time: nsplit and iterable. These little helper functions of mine
get used all the time when I work. I'm sick of having to define them
(but am very good at it these days, less than 1 typo per function!).
It leads me to the following questions:

1. Is this functionality already built in and I'm just missing it?
2. Is there some well-known, good technique for these that I missed?
3. Insert question I need to ask here (with a response)

These are the functions, with explanation:

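def nsplit(s, p, n):
    # Split at most n - 1 times, which gives at most n pieces.
    l = s.split(p, n - 1)
    # Pad with empty strings if there were fewer than n pieces.
    if len(l) < n:
        l.extend([''] * (n - len(l)))
    return l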

This is like split() but returns a list of exactly length n. This is
very useful when using unpacking, e.g.:
x, y = nsplit('foo,bar,baz', ',', 2)

def iterable(item, count_str=False):
    # Strings are iterable, but are usually wanted as single values.
    if not count_str and isinstance(item, str):
        return False
    try:
        iter(item)
    except TypeError:
        return False
    return True

This is just a simple boolean test for whether or not an object is
iterable. I would like to see this in builtins, to mirror callable.
The optional count_str adds flexibility for string handling, since
sometimes I need to iterate over a string, but usually not. I
frequently use it to simplify my case handling in this type of
construct:

def foo(bar):
    bar = bar if iterable(bar) else [bar]
    for x in bar:
        ....

Thanks for feedback,
Erich
 
Mike Driscoll

Erich said:
def nsplit(s, p, n):
    # Split at most n - 1 times, which gives at most n pieces.
    l = s.split(p, n - 1)
    # Pad with empty strings if there were fewer than n pieces.
    if len(l) < n:
        l.extend([''] * (n - len(l)))
    return l

The split() method has a maxsplit parameter that I think does the same
thing. For example:
>>> 'foo,bar,baz'.split(',', 1)
['foo', 'bar,baz']

See the docs for more info:

http://docs.python.org/lib/string-methods.html

Not sure about the other one, but you might look at itertools.

Mike
 
Mike Driscoll

Erich said:
This is just a simple boolean test for whether or not an object is
iterable. I would like to see this in builtins, to mirror callable.

Just found this thread on the subject:

http://mail.python.org/pipermail/python-list/2006-July/394487.html

That might answer your question.

Mike
 
Robin Stocker

Erich said:
This is like split() but returns a list of exactly length n. This is
very useful when using unpacking, e.g.:
x, y = nsplit('foo,bar,baz', ',', 2)

You could use the second argument of split:

x, y = 'foo,bar,baz'.split(',', 1)

Note that the number has the meaning "only split n times" as opposed to
"split into n parts".
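
To make the difference concrete, here is a small sketch (the variable names are only for illustration). It shows that the second argument counts splits, so you get at most one more piece than that number, and that split() never pads when the string has fewer delimiters:

```python
s = 'foo,bar,baz'

# maxsplit=1 means "split once", which gives at most two pieces.
two_parts = s.split(',', 1)
print(two_parts)   # ['foo', 'bar,baz']

# With fewer delimiters than expected, split() just returns what it
# finds, so unpacking into a fixed number of names can fail.
short = 'foo'.split(',', 1)
print(short)       # ['foo']
try:
    x, y = short
except ValueError as exc:
    print('unpacking failed:', exc)
```

So the plain split() call is enough when you know the delimiter is present, but not when pieces may be missing.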

Cheers,
Robin
 
David

Erich said:
Today I found myself once again defining two functions that I use all
the time: nsplit and iterable. These little helper functions of mine
get used all the time when I work. I'm sick of having to define them
(but am very good at it these days, less than 1 typo per function!).

How about creating an erichtools module?
 
Tim Chase

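Erich said:
def nsplit(s, p, n):
    # Split at most n - 1 times, which gives at most n pieces.
    l = s.split(p, n - 1)
    # Pad with empty strings if there were fewer than n pieces.
    if len(l) < n:
        l.extend([''] * (n - len(l)))
    return l

Mike said:
The split() method has a maxsplit parameter that I think does the same
thing. For example:
>>> 'foo,bar,baz'.split(',', 1)
['foo', 'bar,baz']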


The OP's code *does* use the maxsplit parameter of split().

The important (and missing) aspect of the OP's code in your
example is exercised when there are *fewer* delimited pieces than
"n":
>>> "a,b,c".split(',', 5)
['a', 'b', 'c']
>>> nsplit("a,b,c", ',', 5)
['a', 'b', 'c', '', '']

A few things I noticed that might "improve" the code:

- cache len(l), though my understanding is that len() is an O(1)
operation, so it may not make a difference

- using "delim", "maxsplit", "results" instead of "p", "n", "l" to
make it easier to read

- setting default values to match split()

def nsplit(s, delim=None, maxsplit=None):
    if maxsplit:
        results = s.split(delim, maxsplit)
        result_len = len(results)
        if result_len < maxsplit:
            results.extend([''] * (maxsplit - result_len))
        return results
    else:
        return s.split(delim)


My suggestion would just be to create your own utils.py module
that holds your commonly used tools so you can re-use them.

-tkc
 
Mike Driscoll

Tim said:
My suggestion would just be to create your own utils.py module
that holds your commonly used tools so you can re-use them.

-tkc

Well, I almost said that, but I was trying to find some "battery"
included that he could use since the OP seemed to want one. I'm sure
there's a geeky way to do this with lambdas or list comprehensions
too.

Thanks for the feedback though.

Mike
 
John Krukoff

As far as I know there is no built-in function that does exactly what
you want. You can certainly simplify your nsplit function a bit, but as
mentioned, it's probably best just to create your own package and keep
your utility functions there.

It's worth noting that you almost certainly want to be doing
isinstance( item, basestring ) in your iterable function instead of
isinstance( item, str ), or things will get very surprising for you as
soon as you have to deal with a unicode string.
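
A rough sketch of that check follows. It assumes Python 3, where basestring is gone and the text and byte string types are str and bytes; on Python 2 you would pass basestring instead. The string_types parameter and the as_list helper are made up for this example and are not part of the original function.

```python
def iterable(item, count_str=False, string_types=(str, bytes)):
    """Return True if item can be iterated over.

    Strings count as non-iterable unless count_str is true.
    string_types is a hypothetical extra parameter for this sketch.
    """
    if not count_str and isinstance(item, string_types):
        return False
    try:
        iter(item)
    except TypeError:
        return False
    return True


def as_list(bar):
    # Hypothetical helper: wrap a lone value so callers can always loop.
    return list(bar) if iterable(bar) else [bar]


print(as_list('abc'))    # ['abc']
print(as_list((1, 2)))   # [1, 2]
print(as_list(5))        # [5]
```

Checking against a tuple of string types keeps every kind of string on the "single value" side of the test.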

If you don't want the hassle of creating a separate package, and you're
only interested in having these functions be handy on your local Python
install, you could also add them into your sitecustomize file as
described here:
http://docs.python.org/lib/module-site.html

On Linux, that's as easy as creating a file
named /usr/lib/python2.5/sitecustomize.py that inserts whatever you want
into the __builtin__ module, and it'll be automatically imported
whenever you run Python.
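
As a rough illustration, such a file could look like the sketch below. It assumes Python 3, where the module is named builtins (on Python 2.5 it is __builtin__), and the nsplit body is only a stand-in for whatever helpers you want to expose.

```python
# sitecustomize.py (sketch)
import builtins


def nsplit(s, p, n):
    """Split s on p into exactly n pieces, padding with ''."""
    pieces = s.split(p, n - 1)
    pieces.extend([''] * (n - len(pieces)))
    return pieces


# After this assignment, nsplit can be used without an import in any
# code run by this interpreter.
builtins.nsplit = nsplit
```

Keep in mind that this changes the namespace for every script run by that install, so code relying on it will not work on another machine.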

I doubt there's a case for getting this functionality added to the
language, as your use case seems pretty specific, and it's just not that
hard to write the function that does what you want to do.
 
John Krukoff

Tim said:
def nsplit(s, delim=None, maxsplit=None):
    if maxsplit:
        results = s.split(delim, maxsplit)
        result_len = len(results)
        if result_len < maxsplit:
            results.extend([''] * (maxsplit - result_len))
        return results
    else:
        return s.split(delim)

I'll add a couple more suggestions:

1. Delay the test for maxsplit, since str.split() can always be called
first. It needs an integer, though, so pass -1 (its "no limit" value)
when maxsplit is None.

2. Use a generator to pad the list, to avoid interim list creation. This
works fine, because list.extend() accepts any iterable. This also shortens
the code a bit, because xrange() does the right thing in this case with
negative numbers. For example:

def nsplit(s, delim=None, maxsplit=None):
    # str.split() rejects None for maxsplit; -1 means "no limit".
    results = s.split(delim, -1 if maxsplit is None else maxsplit)
    if maxsplit is not None:
        results.extend('' for i in xrange(maxsplit - len(results)))
    return results


Jeffrey

Neither of these quite matches what the OP's nsplit function did, as his n
parameter (maxsplit here) actually specified the number of list items in
the result, not the number of splits to perform. That makes matching
the default split parameters kind of pointless: why bother doing all
this work to return a 0-item list in the default maxsplit = None case?
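
Putting these remarks together, a version where n means the number of items in the result might look like the sketch below. It is written for Python 3 (range instead of xrange) and makes n a required argument; both choices are my assumptions, not something taken from the code quoted above.

```python
def nsplit(s, delim, n):
    """Split s on delim into exactly n pieces (n >= 1), padding with ''."""
    if n < 1:
        raise ValueError('n must be at least 1')
    # n pieces need at most n - 1 splits.
    results = s.split(delim, n - 1)
    # range() over a non-positive count is empty, so no extra test is
    # needed before padding.
    results.extend('' for _ in range(n - len(results)))
    return results


print(nsplit('foo,bar,baz', ',', 2))  # ['foo', 'bar,baz']
print(nsplit('a,b,c', ',', 5))        # ['a', 'b', 'c', '', '']
```

With n required there is no default case to worry about, and unpacking into n names always succeeds.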
 
