The inverse of .join


MRAB

Neil said:
What's the best way to do the inverse operation of the .join
function?
.split, possibly, although there will be problems if the string contains
other occurrences of the separator.
 

MRAB

Neil said:
split is perfect except for what happens with an empty string.
I see what you mean.

This is consistent:
>>> ','.join([''])
''
>>> ''.split(',')
['']

but this isn't:
>>> ','.join([])
''
>>> ''.split(',')
['']

An empty string could be the result of .join(['']) or .join([]).

Should .split grow an additional keyword argument to specify the desired
behaviour? (Although it's simple enough to define your own function.)
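
For illustration, here is a minimal sketch of what such a function might look like. The name split_list and its sep parameter are hypothetical, not anything in the standard library, and it assumes that an empty string should always mean an empty list:

```python
def split_list(text, sep=","):
    """Split text on sep, treating the empty string as an empty list.

    Hypothetical helper: the name and signature are assumptions made
    for this example, not an existing str method.
    """
    return text.split(sep) if text else []


print(split_list(""))       # []
print(split_list("a,b,c"))  # ['a', 'b', 'c']
print(split_list(","))      # ['', '']
```

The trade-off is that a list containing a single empty string can no longer be recovered: both ','.join([]) and ','.join(['']) give '', and this sketch always picks [].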
 

Robert Kern

split is perfect except for what happens with an empty string.

Why don't you try it and find out?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 

Stephen Hansen

Neil said:
split is perfect except for what happens with an empty string.
I see what you mean.

This is consistent:
>>> ','.join([''])
''
>>> ''.split(',')
['']

but this isn't:
>>> ','.join([])
''
>>> ''.split(',')
['']

An empty string could be the result of .join(['']) or .join([]).

Should .split grow an additional keyword argument to specify the desired
behaviour? (Although it's simple enough to define your own function.)

Guido finds keyword-arguments-to-change-behavior to be unPythonic, IIRC.
It generally means 'make a new API'. But, the question is-- is it worth
the mental strain of a new API?

This is such an extreme edge case that having to do:

    if blah:
        result = blah.split(',')
    else:
        result = []

is really not asking too much, I think.

--

Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/


 

Neil Cerutti

Why don't you try it and find out?

I'm currently using the following without problems, while reading
a data file. One of the fields is a comma separated list, and may
be empty.

    f = rec['codes']
    if f == "":
        f = []
    else:
        f = f.split(",")

I just wondered if something smoother was available.
 

Robert Kern

I would like to apologize. I read that sentence as a question for some reason.

That said, it always helps for you to show the results that you are getting (and
the code that gives those results) and state what results you were expecting.

 

Steven D'Aprano

What's the best way to do the inverse operation of the .join function?

str.join is a many-to-one function, and so it doesn't have an inverse.
You can't always get the input back unchanged:
>>> L = ["a", "b", "c|d", "e"]
>>> s = '|'.join(L)
>>> s
'a|b|c|d|e'
>>> s.split('|')
['a', 'b', 'c', 'd', 'e']


There's no general way of getting around this -- if split() takes input
"a|b|c", there is no way even in principle for it to know which of these
operations it should reverse:

"|".join(["a", "b", "c"])
"|".join(["a|b", "c"])
"|".join(["a", "b|c"])
"|".join(["a|b|c"])
"b".join(["a|", "|c"])

The behaviour with the empty string is just a special case of this.
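
As a quick check of the claim above, the following sketch runs the five join calls and collects the distinct results. The variable names candidates and results are just for this example:

```python
# Each pair is (separator, list of parts), taken from the five calls above.
candidates = [
    ("|", ["a", "b", "c"]),
    ("|", ["a|b", "c"]),
    ("|", ["a", "b|c"]),
    ("|", ["a|b|c"]),
    ("b", ["a|", "|c"]),
]

# Join every candidate and keep only the distinct output strings.
results = {sep.join(parts) for sep, parts in candidates}

print(results)  # {'a|b|c'}
```

Five different inputs collapse to one output, so no function of the output alone can tell them apart.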
 

Steven D'Aprano

I'm currently using the following without problems, while reading a data
file. One of the fields is a comma separated list, and may be empty.

    f = rec['codes']
    if f == "":
        f = []
    else:
        f = f.split(",")

I just wondered if something smoother was available.

Seems pretty smooth to me. What's wrong with it? I assume you've put it
into a function for ease of use and reduction of code duplication.

You could also use the ternary operator, in which case it's a mere
two-liner and short enough to inline wherever you need it:

f = rec['codes']
f = f.split(",") if f else []
 

Neil Cerutti

I'm currently using the following without problems, while
reading a data file. One of the fields is a comma separated
list, and may be empty.

    f = rec['codes']
    if f == "":
        f = []
    else:
        f = f.split(",")

I just wondered if something smoother was available.

Seems pretty smooth to me. What's wrong with it? I assume
you've put it into a function for ease of use and reduction of
code duplication.

The part that's wrong with it, and it's probably my fault, is
that I can never think of it. I had to go dig it out of my code
to remember what the special case was.

You could also use the ternary operator, in which case it's a
mere two-liner and short enough to inline wherever you need it:

f = rec['codes']
f = f.split(",") if f else []

That's pretty cool.

Thanks to everybody for their thoughts.
 

Jon Clements

Why don't you try it and find out?

I'm currently using the following without problems, while reading
a data file. One of the fields is a comma separated list, and may
be empty.

    f = rec['codes']
    if f == "":
        f = []
    else:
        f = f.split(",")

I just wondered if something smoother was available.

In terms of behaviour and 'safety', I'd go for:
>>> import csv
>>> rec = {'code1': '1,2,3', 'code2': ''}
>>> next(csv.reader([rec['code1']]))
['1', '2', '3']
>>> next(csv.reader([rec['code2']]))
[]

hth
Jon.
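
For anyone wanting to reuse the csv approach above, one possible way to wrap it is sketched below. The function name parse_codes is hypothetical, and the [] default passed to next() is an added safeguard rather than something the snippet above relies on:

```python
import csv


def parse_codes(field):
    """Parse one comma-separated field into a list using csv.reader.

    Hypothetical wrapper for illustration. An empty field gives [],
    and quoted values may contain the separator.
    """
    # csv.reader expects an iterable of lines, so wrap the field in a list.
    return next(csv.reader([field]), [])


print(parse_codes("1,2,3"))    # ['1', '2', '3']
print(parse_codes(""))         # []
print(parse_codes('"a,b",c'))  # ['a,b', 'c']
```

The last call shows the 'safety' side: a quoted value that contains a comma stays in one piece, which plain str.split would not do.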
 
