new string method in 2.5 (partition)

John Salerno

Forgive my excitement, especially if you are already aware of this, but
this seems like the kind of feature that is easily overlooked (yet could
be very useful):


Both 8-bit and Unicode strings have new partition(sep) and
rpartition(sep) methods that simplify a common use case.
The find(S) method is often used to get an index which is then used to
slice the string and obtain the pieces that are before and after the
separator. partition(sep) condenses this pattern into a single method
call that returns a 3-tuple containing the substring before the
separator, the separator itself, and the substring after the separator.
If the separator isn't found, the first element of the tuple is the
entire string and the other two elements are empty. rpartition(sep) also
returns a 3-tuple but starts searching from the end of the string; the
"r" stands for 'reverse'.

Some examples:

('', '', 'www.python.org')

(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)
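A minimal sketch of the behaviour described above, written in Python 3 syntax (an assumption; the thread itself is about 2.5). The variable names are only for illustration:

```python
# partition: split at the first occurrence of the separator.
url = 'http://www.python.org'
print(url.partition('://'))     # ('http', '://', 'www.python.org')

# Separator not found: the whole string comes first, then two empty strings.
path = 'file:/usr/share/doc/index.html'
print(path.partition('://'))    # ('file:/usr/share/doc/index.html', '', '')

# rpartition: split at the last occurrence instead.
host = 'www.python.org'
print(host.rpartition('.'))     # ('www.python', '.', 'org')

# rpartition with no match puts the whole string in the last slot.
print(host.rpartition(':'))     # ('', '', 'www.python.org')
```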
 
metaperl

sweet thanks for the heads up.

John said:
Forgive my excitement, especially if you are already aware of this, but
this seems like the kind of feature that is easily overlooked (yet could
be very useful):


Both 8-bit and Unicode strings have new partition(sep) and
rpartition(sep) methods that simplify a common use case.
The find(S) method is often used to get an index which is then used to
slice the string and obtain the pieces that are before and after the
separator. partition(sep) condenses this pattern into a single method
call that returns a 3-tuple containing the substring before the
separator, the separator itself, and the substring after the separator.
If the separator isn't found, the first element of the tuple is the
entire string and the other two elements are empty. rpartition(sep) also
returns a 3-tuple but starts searching from the end of the string; the
"r" stands for 'reverse'.

Some examples:


('', '', 'www.python.org')

(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)
 
John Salerno

I'm confused.
What's the difference between this and string.split?
['hello', ' world']
('hello', ',', ' world')


split returns a list of the substrings on either side of the specified
argument.

partition returns a tuple of the substring on the left of the argument,
the argument itself, and the substring on the right. rpartition reads
from right to left.


But you raise a good point. Notice this:
['hello', ' world', ' how are you']
('hello', ',', ' world, how are you')

split will return all substrings. partition (and rpartition) only return
the substrings before and after the first occurrence of the argument.
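For anyone who wants to try the comparison, here is a small sketch (Python 3 syntax assumed; the sample string is just an illustration) that puts split, partition and rpartition side by side:

```python
text = 'hello, world, how are you'

# split breaks at every comma and drops the separator.
all_parts = text.split(',')
print(all_parts)    # ['hello', ' world', ' how are you']

# partition breaks only at the first comma and keeps it.
first_cut = text.partition(',')
print(first_cut)    # ('hello', ',', ' world, how are you')

# rpartition breaks only at the last comma.
last_cut = text.rpartition(',')
print(last_cut)     # ('hello, world', ',', ' how are you')
```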
 
John Salerno

Bruno said:
Err... is it me being dumb, or is it a perfect use case for str.split ?

Hmm, I suppose you could get nearly the same functionality as using
split(':', 1), but with partition you also get the separator returned as
well.
There are IMVHO much more exciting new features in 2.5 (enhanced generators,
try/except/finally, ternary operator, with: statement etc...)

I definitely agree, but I figure everyone knows about those already.
There are also the startswith() and endswith() string methods that are
new and seem neat as well.
 
Tim Chase

partition(sep) condenses this pattern into a single method
I'm confused. What's the difference between this and
string.split?

(please don't top-post...I've inverted and trimmed for the sake
of readability)

I too am a bit confused but I can see uses for it, and there
could be a good underlying reason to do as much. Split doesn't
return the separator. Partition is also guaranteed to return a 3-tuple. E.g.
2

which could make a difference when doing tuple-assignment:
[traceback]

whereas one could consistently do something like

without fear of a traceback to deal with.
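A rough sketch of the tuple-assignment point, in Python 3 syntax (an assumption) and with made-up names, in case it helps:

```python
line = 'no separator here'

# Unpacking split() into two names fails when the separator is missing,
# because the resulting list has only one element.
try:
    key, value = line.split(':', 1)
except ValueError as exc:
    error = exc
else:
    error = None

# partition() always yields exactly three items, so this unpacking is safe.
key, sep, value = line.partition(':')
```

After this runs, `error` holds the ValueError from the failed unpacking, while `key` is the whole line and `sep` and `value` are empty strings.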

Just a few thoughts...

-tkc
 
Tim Chase

But you raise a good point. Notice this:
['hello', ' world', ' how are you']
('hello', ',', ' world, how are you')

split will return all substrings. partition (and rpartition) only return
the substrings before and after the first occurrence of the argument.

The split()/rsplit() functions do take an optional argument for
the maximum number of splits to make, just FYI...
Help on built-in function split:

split(...)
S.split([sep [,maxsplit]]) -> list of strings

Return a list of the words in the string S, using sep as the
delimiter string. If maxsplit is given, at most maxsplit
splits are done. If sep is not specified or is None, any
whitespace string is a separator.



(as I use this on a regular basis when mashing up various text
files in a data conversion process)
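A quick sketch of the maxsplit argument in action (Python 3 syntax assumed, sample string invented for illustration):

```python
line = 'hello, world, how are you'

# maxsplit=1 stops after the first separator, counting from the left.
left_once = line.split(',', 1)
print(left_once)     # ['hello', ' world, how are you']

# rsplit with maxsplit=1 stops after the first separator from the right.
right_once = line.rsplit(',', 1)
print(right_once)    # ['hello, world', ' how are you']

# With no separator present, the result shrinks to a one-element list.
no_sep = 'hello'.split(',', 1)
print(no_sep)        # ['hello']
```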

-tkc
 
Bruno Desthuilliers

John Salerno wrote:
Forgive my excitement, especially if you are already aware of this, but
this seems like the kind of feature that is easily overlooked (yet could
be very useful):


Both 8-bit and Unicode strings have new partition(sep) and
rpartition(sep) methods that simplify a common use case.
The find(S) method is often used to get an index which is then used to
slice the string and obtain the pieces that are before and after the
separator.

Err... is it me being dumb, or is it a perfect use case for str.split ?
partition(sep) condenses this pattern into a single method
call that returns a 3-tuple containing the substring before the
separator, the separator itself, and the substring after the separator.
If the separator isn't found, the first element of the tuple is the
entire string and the other two elements are empty. rpartition(sep) also
returns a 3-tuple but starts searching from the end of the string; the
"r" stands for 'reverse'.

Some examples:


('', '', 'www.python.org')

I must definitely be dumb, but so far I fail to see how it's better
than split and rsplit:
>>> 'http://www.python.org'.split('://')
['http', 'www.python.org']
>>> 'file:/usr/share/doc/index.html'.split('://')
['file:/usr/share/doc/index.html']
>>> u'Subject: a quick question'.split(': ')
[u'Subject', u'a quick question']
>>> u'Subject: a quick question'.rsplit(': ')
[u'Subject', u'a quick question']
>>> 'www.python.org'.rsplit('.', 1)
['www.python', 'org']
>>>

There are IMVHO much more exciting new features in 2.5 (enhanced generators,
try/except/finally, ternary operator, with: statement etc...)
 
Thomas Heller

John said:
Hmm, I suppose you could get nearly the same functionality as using
split(':', 1), but with partition you also get the separator returned as
well.

Well, x.split(":", 1) returns a list of one or two elements, depending on x,
while x.partition(":") always returns a three-tuple.

Thomas
 
George Sakkis

Bruno said:
I must definitely be dumb, but so far I fail to see how it's better
than split and rsplit:

I fail to see it too. What's the point of returning the separator since
the caller passes it anyway* ?

George

* unless the separator can be a regex, but I don't think so.
 
Larry Bates

John said:
Hmm, I suppose you could get nearly the same functionality as using
split(':', 1), but with partition you also get the separator returned as
well.


I definitely agree, but I figure everyone knows about those already.
There are also the startswith() and endswith() string methods that are
new and seem neat as well.

FYI- .startswith() and .endswith() string methods aren't new in 2.5.
They have been around since at least 2.3.

Larry Bates
 
John Salerno

Larry said:
FYI- .startswith() and .endswith() string methods aren't new in 2.5.
They have been around since at least 2.3.

Larry Bates

Oops, just a slight change in their functionality:

The startswith() and endswith() methods of string types now accept
tuples of strings to check for.

def is_image_file(filename):
    return filename.endswith(('.gif', '.jpg', '.tiff'))

(Implemented by Georg Brandl following a suggestion by Tom Lynn.)
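The same tuple form works for startswith(). A small sketch in Python 3 syntax (an assumption), where `is_web_url` is a hypothetical helper invented for illustration:

```python
def is_web_url(text):
    # True if text begins with any one of the listed prefixes.
    return text.startswith(('http://', 'https://'))

print(is_web_url('https://www.python.org'))   # True
print(is_web_url('ftp://ftp.python.org'))     # False
```

Note that the argument has to be a tuple; passing a list raises a TypeError.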
 
Jack Diederich

Hmm, I suppose you could get nearly the same functionality as using
split(':', 1), but with partition you also get the separator returned as
well.


I definitely agree, but I figure everyone knows about those already.
There are also the startswith() and endswith() string methods that are
new and seem neat as well.

Partition is much, much nicer than index() or find() for many
(but not all) applications.

diff for cgi.py parsing "var=X"
- i = p.find('=')
- if i >= 0:
-     name = p[:i]
-     value = p[i+1:]
+ (name, sep_found, value) = p.partition('=')

Notice that preserving the separator makes for a nice boolean
to test if the partition was successful. Partition raises an
error if you pass an empty separator.
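A minimal sketch of both points, in Python 3 syntax (an assumption). `parse_pair` is a hypothetical helper, not the actual cgi.py code:

```python
def parse_pair(p):
    """Hypothetical helper: split 'name=value', or return None if there is no '='."""
    name, sep_found, value = p.partition('=')
    # sep_found is '=' when the separator was present and '' otherwise,
    # so it works directly as a truth test.
    if not sep_found:
        return None
    return name, value

print(parse_pair('var=X'))     # ('var', 'X')
print(parse_pair('novalue'))   # None
print(parse_pair('a=b=c'))     # ('a', 'b=c')

# An empty separator is rejected with a ValueError.
try:
    'var=X'.partition('')
except ValueError as exc:
    empty_sep_error = exc
```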

partition also has the very desirable feature of returning the original
string as the first element when the separator isn't found

ex/

script = 'foo.cgi?a=7'
script, sep, params = script.partition('?')

"script" will be "foo.cgi" even if there are no params. With
find or index you have to slice the string by hand and with split
you would do something like.

try:
    script, params = script.split('?')
except ValueError:
    pass

or

parts = script.split('?', 1)
script = parts[0]
params = ''.join(parts[1:])
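To check that the split-based workaround and partition agree, here is a small sketch in Python 3 syntax (an assumption). Both function names are hypothetical and exist only for this comparison:

```python
def split_query_partition(script):
    # partition version: one call, no special casing for a missing '?'.
    name, _sep, params = script.partition('?')
    return name, params

def split_query_split(script):
    # split version: take the first piece, join whatever is left (possibly nothing).
    parts = script.split('?', 1)
    return parts[0], ''.join(parts[1:])

for candidate in ('foo.cgi?a=7', 'foo.cgi', 'foo.cgi?a=7?b=8'):
    assert split_query_partition(candidate) == split_query_split(candidate)

print(split_query_partition('foo.cgi?a=7'))   # ('foo.cgi', 'a=7')
print(split_query_partition('foo.cgi'))       # ('foo.cgi', '')
```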


Grep your source for index, find, and split and try rewriting
the code with partition. Not every instance will turn out cleaner
but many will.

Long-live-partition-ly,

-Jack
 
Duncan Booth

George Sakkis said:
I fail to see it too. What's the point of returning the separator since
the caller passes it anyway* ?

The separator is only returned if it was found; otherwise you get back an
empty string. Concatenating the elements of the tuple that is returned
always gives you the original string.

It is quite similar to using split(sep,1), but reduces the amount of
special case handling for cases where the separator isn't found.
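The round-trip property is easy to check. A short sketch in Python 3 syntax (an assumption), with sample strings made up for illustration:

```python
samples = ['key=value', 'no separator', '=leading', 'trailing=', 'a=b=c']

for s in samples:
    before, sep, after = s.partition('=')
    # The three pieces always reassemble into the original string.
    assert before + sep + after == s
    # The middle element is the separator if found, else the empty string.
    assert sep in ('', '=')

    head, rsep, tail = s.rpartition('=')
    # The same holds when searching from the right.
    assert head + rsep + tail == s
```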
 
Terry Reedy

Bruno Desthuilliers said:
Err... is it me being dumb, or is it a perfect use case for str.split ?

s.partition() was invented and its design settled on as a result of looking
at some awkward constructions in the standard library and other actual use
cases. Sometimes it replaces s.find or s.index instead of s.split. In
some cases, it is meant to be used within a loop. I was not involved and
so would refer you to the pydev discussions.
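I don't know which loops the pydev discussion had in mind, so the following is only a guess at the general shape, in Python 3 syntax (an assumption). `iter_fields` is a hypothetical helper that consumes a string one piece at a time:

```python
def iter_fields(text, sep):
    """Hypothetical helper: yield the pieces of text between occurrences of sep."""
    rest = text
    while True:
        # Peel off everything up to the next separator; keep the remainder.
        field, found, rest = rest.partition(sep)
        yield field
        if not found:
            # No separator left, so field was the final piece.
            break

print(list(iter_fields('a=1&b=2&c=3', '&')))   # ['a=1', 'b=2', 'c=3']
```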

tjr
 
MonkeeSage

s = "There should be one -- and preferably only one -- obvious way to do it".partition('only one')
print s[0]+'more than one'+s[2]

;)

Regards,
Jordan
 
Irmen de Jong

Terry said:
s.partition() was invented and its design settled on as a result of looking
at some awkward constructions in the standard library and other actual use
cases. Sometimes it replaces s.find or s.index instead of s.split. In
some cases, it is meant to be used within a loop. I was not involved and
so would refer you to the pydev discussions.

While there is the functional aspect of the new partition method, I was
wondering about the following /technical/ aspect:

Because the result of partition is an immutable tuple type containing
three substrings of the original string, is it perhaps also the case
that partition works without allocating extra memory for 3 new string
objects and copying the substrings into them?
I can imagine that the tuple type returned by partition is actually
a special object that contains a few internal pointers into the
original string to point at the locations of each substring.
Although a quick type check of the result object revealed that
it was just a regular tuple type, so I don't think the above is true...
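The type check mentioned above can be repeated like this (Python 3 syntax assumed; it only inspects the types and says nothing about how memory is handled internally):

```python
result = 'www.python.org'.partition('.')

# The result is an ordinary tuple, not a special view type.
result_is_plain_tuple = type(result) is tuple

# Each element is an ordinary string object.
parts_are_plain_str = all(type(part) is str for part in result)

print(result_is_plain_tuple, parts_are_plain_str)   # True True
```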

--Irmen
 
Steve Holden

Irmen said:
While there is the functional aspect of the new partition method, I was
wondering about the following /technical/ aspect:

Because the result of partition is an immutable tuple type containing
three substrings of the original string, is it perhaps also the case
that partition works without allocating extra memory for 3 new string
objects and copying the substrings into them?
I can imagine that the tuple type returned by partition is actually
a special object that contains a few internal pointers into the
original string to point at the locations of each substring.
Although a quick type check of the result object revealed that
it was just a regular tuple type, so I don't think the above is true...

It's not.

regards
Steve
 
Bruno Desthuilliers

John Salerno wrote:
Hmm, I suppose you could get nearly the same functionality as using
split(':', 1), but with partition you also get the separator returned as
well.

Well, you already know it since you use it to either split() or
partition the string !-)

Not to say these two new methods are necessarily useless - sometimes a
small improvement to an API greatly simplifies a lot of common use cases.

I definitely agree, but I figure everyone knows about those already.
There are also the startswith() and endswith() string methods that are
new

Err... 'new' ???
 
