suggestion for a small addition to the Python 3 list class

Robert Yacobellis

Greetings,

I'm an instructor of Computer Science at Loyola University, Chicago, and I and Dr. Harrington (copied on this email) teach sections of COMP 150, Introduction to Computing, using Python 3. One of the concepts we teach students is the str methods split() and join(). I have a suggestion for a small addition to the list class: add a join() method to lists. It would work in a similar way to how join works for str's, except that the object and method parameter would be reversed: <list object>.join(<str object>).

Rationale: When I teach students about split(), I can intuitively tell them split() splits the string on its left on white space or a specified string. Explaining the current str join() method to them doesn't seem to make as much sense: use the string on the left to join the items in the list?? If the list class had a join method, it would be more intuitive to say "join the items in the list using the specified string (the method's argument)." This is similar to Scala's List mkString() method.

I've attached a proposed implementation in Python code which is a little more general than what I've described. In this implementation the list can contain elements of any type, and the separator can also be any data type, not just str.

I've noticed that the str join() method takes an iterable, so in the most general case I'm suggesting adding a join() method to every Python-provided iterable (however, for split() vs. join() it would be sufficient to just add a join() method to the list class).

Please let me know your ideas, reactions, and comments on this suggestion.

Thanks and regards,
Dr. Robert (Bob) Yacobellis
 
Steven D'Aprano

> Greetings,
>
> I'm an instructor of Computer Science at Loyola University, Chicago, and
> I and Dr. Harrington (copied on this email) teach sections of COMP 150,
> Introduction to Computing, using Python 3. One of the concepts we teach
> students is the str methods split() and join(). I have a suggestion for
> a small addition to the list class: add a join() method to lists. It
> would work in a similar way to how join works for str's, except that the
> object and method parameter would be reversed: <list object>.join(<str
> object>).

That proposed interface doesn't make much sense to me. You're performing
a string operation ("make a new string, using this string as a
separator") not a list operation, so it's not really appropriate as a
list method. It makes much more sense as a string method.

It is also much more practical as a string method. This way, only two
objects need a join method: strings, and bytes (or if you prefer, Unicode
strings and byte strings). Otherwise, you would need to duplicate the
method in every possible iterable object:

- lists
- tuples
- dicts
- OrderedDicts
- sets
- frozensets
- iterators
- generators
- every object that obeys the sequence protocol
- every object that obeys the iterator protocol

(with the exception of iterable objects such as range objects that cannot
contain strings). Every object would have to contain code that does
exactly the same thing in every detail: walk the iterable, checking that
the item is a string, and build up a new string with the given separator:

class list:  # also tuple, dict, set, frozenset, etc...
    def join(self, separator):
        ...


Not only does that create a lot of duplicated code, but it also increases
the burden on anyone creating an iterable class, including iterators and
sequences. Anyone who writes their own iterable class has to write their
own join method, which is actually trickier than it seems at first
glance. (See below.)

Any half-decent programmer would recognise the duplicated code and factor
it out into an external function that takes a separator and an iterable
object:

def join(iterable, separator):
    # common code goes here... it's *all* common code, every object's
    # join method is identical
    ...


That's exactly what Python already does, except it swaps the order of the
arguments:

def join(separator, iterable):
    ...


and promotes it to a method on strings instead of a bare function.
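
As a rough illustration of that point, the sketch below feeds several kinds of iterable to the one existing str.join method. The sample words and the `Letters` class are made up for this example; they are not from the original proposal.

```python
# One join method on str handles any iterable of strings.
words = ["spam", "eggs", "ham"]

from_list = ", ".join(words)
from_tuple = ", ".join(tuple(words))
from_generator = ", ".join(w for w in words)
from_dict_keys = ", ".join(dict.fromkeys(words))  # iterating a dict yields its keys

# Calling the method through the class shows the "bare function" shape:
# separator first, iterable second.
as_function = str.join(", ", words)


class Letters:
    """Hypothetical user-defined iterable; it needs no join method of its own."""

    def __iter__(self):
        return iter("abc")


from_custom = "-".join(Letters())
```

None of the iterables above had to implement anything beyond iteration itself.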

> Rationale: When I teach students about split(), I can intuitively tell
> them split() splits the string on its left on white space or a specified
> string. Explaining the current str join() method to them doesn't seem
> to make as much sense: use the string on the left to join the items in
> the list??

Yes, exactly. Makes perfect sense to me.

> If the list class had a join method, it would be more
> intuitive to say "join the items in the list using the specified string
> (the method's argument)."

You can still say that. You just have to move the parenthetical aside:

"Join the items in the list (the method's argument) using the specified
string."


> This is similar to Scala's List mkString() method.


This is one place where Scala gets it wrong. In my opinion, as a list
method, mkString ought to operate on the entire list, not its individual
items. The nearest equivalent in Python would be converting a list to a
string using the repr() or str() functions:

py> str([1, 2, 3])
'[1, 2, 3]'


(which of course call the special methods __repr__ or __str__ on the
list).
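
For contrast, a small sketch of the two operations side by side: converting the whole list with str(), versus joining its items with a separator. The variable names are chosen for illustration only.

```python
numbers = [1, 2, 3]

# Whole-list conversion: the list's own textual representation.
whole_list = str(numbers)

# Item-wise join: convert each item to str first, then join with a separator.
joined_items = ", ".join(map(str, numbers))

# str() on a list uses repr() for each item, so string items keep their quotes.
quoted = str(["a", "b"])
```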

> I've attached a proposed implementation in Python code which is a little
> more general than what I've described. In this implementation the list
> can contain elements of any type, and the separator can also be any data
> type, not just str.

Just for the record, the implementation you provide will be O(N**2) due
to the repeated string concatenation, which means it will be *horribly*
slow for large enough lists. It's actually quite difficult to efficiently
join a lot of strings without using the str.join method. Repeated string
concatenation will, in general, be slow due to the repeated copying of
intermediate results.
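
The attached implementation is not reproduced in this thread, so the following is only a hypothetical sketch of the repeated-concatenation pattern being described, next to the str.join equivalent. The function names `slow_join` and `fast_join` are invented for this example.

```python
def slow_join(items, sep):
    """Hypothetical join by repeated concatenation (not the attached code)."""
    result = ""
    first = True
    for item in items:
        if not first:
            result = result + sep  # each + may copy everything built so far
        result = result + item
        first = False
    return result


def fast_join(items, sep):
    """Delegate to str.join instead of concatenating piece by piece."""
    return sep.join(items)


words = ["a", "b", "c", "d"]
slow_result = slow_join(words, "-")
fast_result = fast_join(words, "-")
```

Both functions return the same string; the difference is in how much copying happens along the way. CPython can sometimes optimise this kind of concatenation in place, but that is an implementation detail and should not be relied on.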

By shifting the burden of writing a join method onto everyone who creates
a sequence type, we would end up with a lot of slow code.

If you must have a convenience (inconvenience?) method on lists, the
right way to do it is like this:

class list2(list):
    def join(self, sep=' '):
        if isinstance(sep, (str, bytes)):
            return sep.join(self)
        raise TypeError
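
A short usage sketch of such a subclass, with the class repeated so the snippet runs on its own. The sample data is made up for illustration.

```python
class list2(list):
    def join(self, sep=' '):
        # Delegate to the separator's own join method (str or bytes).
        if isinstance(sep, (str, bytes)):
            return sep.join(self)
        raise TypeError


words = list2(["to", "be", "or", "not"])
default_sep = words.join()        # falls back to a single space
comma_sep = words.join(", ")
bytes_joined = list2([b"ab", b"cd"]).join(b"-")

# A separator that is neither str nor bytes is rejected.
try:
    words.join(0)
except TypeError:
    rejected = True
else:
    rejected = False
```

Because the work is delegated to sep.join, the items still have to match the separator's type: str items for a str separator, bytes items for a bytes separator.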
 
