.split() Question

eschneider92 · Aug 14, 2013

How can I use the '.split()' method (am I right in calling it a method?) without instead of writing each comma between words in the pie list in the following code? Also, is there a way to use .split instead of typing the apostrophes? Thank you.

import random
pie=['keylime', 'peach', 'apple', 'cherry', 'pecan']
print(random.choice(pie))

Eric

Gary Herron · Aug 14, 2013

I think you are referring to this:
pie = 'keylime peach apple cherry pecan'.split()

While it's easier to type, and does save a few characters, I think the
original list is clearer to a reader of your program.

Gary Herron

Dave Angel · Aug 14, 2013

I can't make any sense out of the first sentence. But maybe I can guess
what you're looking for.

The split() method is indeed a method of the str class. It takes an
optional argument for the separator character. By default it uses
whitespace. So if you're trying to specify a series of items, none of
which contain any whitespace, you can readily use split to build your
list:

pie = "keylime peach apple cherry pecan".split()

However, there'd be no way to specify an item called "chocolate
marshmallow". If you need to include whitespace in any item, then you'd
have to use some other separator, like a comma:

pie = "keylime,chocolate marshmallow,peach,apple,cherry,pecan".split(",")
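
As a minimal sketch of the two approaches described above (the variable names pie_with_spaces and pie_stripped are made up for illustration), the difference between the default whitespace split and an explicit comma separator looks roughly like this:

```python
# Default separator: any run of whitespace. No item may contain a space.
pie = "keylime peach apple cherry pecan".split()

# Explicit comma separator: items may now contain spaces.
pie_with_spaces = "keylime,chocolate marshmallow,peach,apple,cherry,pecan".split(",")

# If the commas are followed by spaces, strip each item after splitting.
pie_stripped = [name.strip() for name in "keylime, chocolate marshmallow, peach".split(",")]

print(pie)
print(pie_with_spaces)
print(pie_stripped)
```

Note that split(",") splits on the exact separator string only, so stray spaces around the commas stay in the items unless they are stripped.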

Krishnan Shankar · Aug 14, 2013

Hi,

How can I use the '.split()' method (am I right in calling it a method?)

Yes, .split() is a built-in method of string objects in Python. Any
string defined in Python can call it, as the list of string attributes
shows:
['__add__', '__class__', '__contains__', '__delattr__', '__doc__',
'__eq__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__',
'__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__',
'__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__',
'__str__', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith',
'expandtabs', 'find', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower',
'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip',
'replace', 'rfind', 'rindex', 'rjust', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate',
'upper', 'zfill']

>>> var.split()
['Hello', 'how', 'r', 'u?']

writing each comma between words in the pie list in the following code?

Also, is there a way to use .split instead of typing the apostrophes?
Thank you.

import random
pie=['keylime', 'peach', 'apple', 'cherry', 'pecan']
print(random.choice(pie))

If you are talking about a predefined list like pie above, with a limited
number of elements, it is fine to code them directly with apostrophes;
others will see that it is a predefined list.

If the elements of the list arrive as a line in a file or as a string, it
is better to use the split() method to form the list. Gary has already
provided an example of this:

pie = 'keylime peach apple cherry pecan'.split()

I hope this clarifies your doubt.

Regards,
Krishnan

eschneider92 · Aug 14, 2013

It's obvious that the word 'without' in my first sentence was meant to be omitted, and it's a simple question. Thanks, Gary!

Peter Otten · Aug 14, 2013

Joshua said:

I would agree with the last statement.
Please write list definitions as lists rather than taking a short-cut to
save a few key presses

That's true with this example, but is:

lines = [
"Developments in high-speed rail, and high-speed", ....
"same problems the latter was designed to solve."
]

really more readable than:

lines = """\
Developments in high-speed rail, and high-speed ....
same problems the latter was designed to solve.
"""[1:-1].split("\n")

?

It's definitely more correct -- unless you meant to strip the "D" from the
first line

I would use

lines = """\
Developments in high-speed rail, and high-speed
....
same problems the latter was designed to solve.
""".splitlines()

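The point about the stripped "D" can be checked with a small sketch. The two-line text here is a made-up stand-in for the elided paragraph, and the names sliced and lines are only illustrative:

```python
text = """\
first line
second line
"""

# The backslash after the opening quotes already suppresses the leading
# newline, so [1:-1] removes the "f" of "first" along with the final "\n".
sliced = text[1:-1].split("\n")

# splitlines() needs no slicing and leaves no trailing empty string.
lines = text.splitlines()

print(sliced)  # ['irst line', 'second line']
print(lines)   # ['first line', 'second line']
```

Without the slice, text.split("\n") would keep every character but end with an empty string for the final newline, which is presumably what the [1:-1] was meant to avoid.
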
Joshua Landau · Aug 14, 2013

Thanks, I didn't actually know about .splitlines()!

wxjmfauth · Aug 14, 2013

On Wednesday, 14 August 2013 13:55:23 UTC+2, Joshua Landau wrote:

Thanks, I didn't actually know about .splitlines()!

a = ['==\r**', '==\n**', '==\r\n**', '==\u0085**',
     '==\u000b**', '==\u000c**', '==\u2028**', '==\u2029**']
for e in a:
    print(e.splitlines())

['==', '**']
['==', '**']
['==', '**']
['==', '**']
['==', '**']
['==', '**']
['==', '**']
['==', '**']

Do not confuse these NLF's (new line functions) in the Unicode
terminology, with the end of line *symbols* (pilcrow, \u2424, ...)
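
For comparison, here is a rough sketch of how split("\n") and splitlines() treat a few of these boundaries in Python 3. The sample strings and the names by_newline and by_lines are chosen only for illustration:

```python
samples = ["a\nb", "a\r\nb", "a\rb", "a\u2028b"]

# split("\n") only knows about the line feed character.
by_newline = [s.split("\n") for s in samples]

# splitlines() recognises every line boundary shown above.
by_lines = [s.splitlines() for s in samples]

for s, n, l in zip(samples, by_newline, by_lines):
    print(repr(s), n, l)
```

With split("\n"), a '\r\n' ending leaves a stray '\r' on the first piece, and '\r' or '\u2028' on their own are not split at all.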

I'm always, and will still be, surprised by the number of hard-coded
'\n' one can find in Python code when the portable line separator
(here, on Windows, '\r\n') exists.

jmf

random832 · Aug 14, 2013

I'm always, and will still be, surprised by the number of hard-coded
'\n' one can find in Python code when the portable line separator
(here, on Windows, '\r\n') exists.

Because high-level code isn't supposed to use the os module directly.
Text-mode streams automatically convert newlines you write to them.

Chris Angelico · Aug 14, 2013

Because high-level code isn't supposed to use the os module directly.
Text-mode streams automatically convert newlines you write to them.

I'm always, and will still be, surprised by the number of hard coded
decimal integers one can find in Python code, when the portable way to
do it is to use ctypes and figure out whether your literals should be
big-endian or little-endian, 32-bit or 64-bit, etc. Yet people
continue to just put decimal literals in their code! It can't be
portable.

ChrisA

Tim Chase · Aug 14, 2013

I'm always, and will still be, surprised by the number of hard coded
decimal integers one can find in Python code, when the portable way
to do it is to use ctypes and figure out whether your literals
should be big-endian or little-endian, 32-bit or 64-bit, etc. Yet
people continue to just put decimal literals in their code! It
can't be portable.

No, no, no...you want

from sys.platform.integers import 0, 1, 2, 3, 14, 42

to be portable against endian'ness and bit-width. Granted, one might
confuse them with regular numeric literals, so it would be best to
clarify them by namespace:

import sys
answer_to_life = sys.platform.integers.42
print(sum(range(sys.platform.integers.0, sys.platform.integers.14)))

That way you ensure platform independence, and *much* clearer! ;-)

-tkc

Skip Montanaro · Aug 14, 2013

Because high-level code isn't supposed to use the os module directly.

That seems a bit extreme. One would hope that Guido and the rest of
the crew created the os module so people would use it instead of
resorting to other lower level hacks. A quick find/grep of my own
code suggests that I import os more than sys. I use it mostly for
os.path.* and os.environ. I'm not sure there's a higher level way to
access them without putting more layers between your code and those
objects, which code would obviously have to call them anyway.

Did I just misread your comment?

Skip

Chris Angelico · Aug 14, 2013

No, no, no...you want

from sys.platform.integers import 0, 1, 2, 3, 14, 42

to be portable against endian'ness and bit-width.

Oh! I didn't know about sys.platform.integers. All this time I've been
doing it manually, usually copying and pasting a block of integer
definitions from the re module. (I used to copy them from
adamant.princess.ida but some of them were buggy. 2+2 made 5, or 3, or
7, or 25, depending on need.)

ChrisA

Terry Reedy · Aug 14, 2013

Because high-level code isn't supposed to use the os module directly.

This is a bit extreme, but definitely true for os.linesep and *much* of
os other than os.path and maybe os.environ.

Text-mode streams automatically convert newlines you write to them.

By default, <any possible linesep> to \n when reading files; \n to
os.linesep when writing. Windows is the only major OS for which
os.linesep is not \n.

The full details, from the builtin 'open' entry:
"
newline controls how universal newlines mode works (it only applies to
text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

When reading input from the stream, if newline is None, universal
newlines mode is enabled. Lines in the input can end in '\n', '\r', or
'\r\n', and these are translated into '\n' before being returned to the
caller. If it is '', universal newlines mode is enabled, but line
endings are returned to the caller untranslated. If it has any of the
other legal values, input lines are only terminated by the given string,
and the line ending is returned to the caller untranslated.

When writing output to the stream, if newline is None, any '\n'
characters written are translated to the system default line separator,
os.linesep. If newline is '' or '\n', no translation takes place. If
newline is any of the other legal values, any '\n' characters written
are translated to the given string.
"

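The rules quoted above can be tried out with a short sketch. It passes newline="\r\n" explicitly when writing so that the result does not depend on the platform's os.linesep; the temporary file and the names raw, translated and untranslated are only for illustration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Writing with newline="\r\n": every "\n" written becomes "\r\n" on disk.
with open(path, "w", newline="\r\n") as f:
    f.write("abc\ndef\n")

# Binary mode shows exactly what is stored on disk.
with open(path, "rb") as f:
    raw = f.read()

# Reading with newline=None (the default): "\r\n" is translated to "\n".
with open(path, "r") as f:
    translated = f.read()

# Reading with newline="": line endings are returned untranslated.
with open(path, "r", newline="") as f:
    untranslated = f.read()

print(raw)
print(repr(translated))
print(repr(untranslated))
```

So code that writes a plain '\n' to a text-mode stream opened with the default newline=None gets the platform's line separator on disk without mentioning os.linesep.
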
Steven D'Aprano · Aug 15, 2013

Because high-level code isn't supposed to use the os module directly.

Say what? My brain hurts. The os module is full of lots of platform
independent goodies. It is the right way to split and combine pathnames,
test environment variables, manipulate file permissions, and other useful
tasks.

Text-mode streams automatically convert newlines you write to them.

Maybe they do, maybe they don't. It depends on whether Python is built
with universal newline support, and what sort of text-mode stream you are
using, and a bunch of other rules that make it a little more complicated
than just "automatically convert".

wxjmfauth · Aug 15, 2013

On Wednesday, 14 August 2013 19:14:59 UTC+2, Chris Angelico wrote:

I'm always, and will still be, surprised by the number of hard coded
decimal integers one can find in Python code, when the portable way to
do it is to use ctypes and figure out whether your literals should be
big-endian or little-endian, 32-bit or 64-bit, etc. Yet people
continue to just put decimal literals in their code! It can't be
portable.

ChrisA

------

As a stupid scientist, I have a habit of comparing
things of the same nature with the same units.

This *string* containing one *character*
26

consumes 26 *bytes*.

This *string* containing one *character*
40

consumes 40 *bytes*.

and the difference is

40 [bytes] - 26 [bytes] = 14 [bytes] .

—————

Python seems to consider os.linesep as a
str.
True

—————

PS A "mole" is not a number.

jmf

wxjmfauth · Aug 15, 2013

A technical aspect of triple-quoted strings is
that the "end of lines" are not respected.
... r = fo.read()
...
b'"""abc\r\ndef\r\n"""\r\n'

Now, one can argue...

jmf

Chris Angelico · Aug 15, 2013

A technical aspect of triple-quoted strings is
that the "end of lines" are not respected.

... r = fo.read()
...
b'"""abc\r\ndef\r\n"""\r\n'

Now, one can argue...

Actually, they are respected. Triple-quoted strings are parsed after
the file is read in, and newline handling is dealt with at an earlier
level, same as encodings are. You wouldn't expect that string to
retain information about the source file's encoding, and nor do you
expect it to retain the source file's newline style. A newline is
represented in the file on disk as \r\n or \n (or something else,
even), and in the string as \n. It's that simple.

ChrisA
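
One way to see the behaviour described above, as a sketch only: compile source bytes that use \r\n line endings and inspect the resulting string. The bytes mirror the file contents shown in the quoted example, with an assignment to s added so the value can be inspected; the "<crlf-demo>" filename and the namespace dict are made up for illustration.

```python
# Source code as it might be stored on disk on Windows, with \r\n endings.
source = b's = """abc\r\ndef\r\n"""\r\n'

namespace = {}
exec(compile(source, "<crlf-demo>", "exec"), namespace)

# The string object holds plain "\n", whatever the source file used.
print(repr(namespace["s"]))
```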

Steven D'Aprano · Aug 15, 2013

A technical aspect of triple-quoted strings is that the "end of lines"
are not respected.

'abc\ndef\n'

You are misinterpreting what you are seeing. You are not reading lines of
text from a file. You are importing a module, and then accessing its
__doc__ attribute. The relationship between the module object and text
from a file is tenuous, at best:

- the module's file could use \n line endings, or \r, or \r\n, or even
something else, depending on the platform;

- the module might be a compiled .pyc file, and there are no lines of
text to read, just byte code;

- or a .dll or .so library, again, no lines of text, just compiled code;

- or there might not even be a file, like the sys module, which is
entirely built into the interpreter;

- or it might not even be a module object, you can put anything into
sys.modules. It might be an instance with docstrings computed on the fly.

So you can't conclude *anything* about text files from the fact that
module docstrings typically contain only \n rather than \r\n line
endings. Modules are not necessarily text files, and even when they are,
once you import them, what you get is *not text*, but Python objects.

... r = fo.read()
...
b'"""abc\r\ndef\r\n"""\r\n'

And again, you are misinterpreting what you are seeing. By opening the
file in binary mode, you are instructing Python to treat it as binary
bytes, and return *exactly* what is stored on disk. If you opened the
file in text mode, you would (likely, but not necessarily) get a very
different result: the string would contain only \n line endings.

Python is not a low-level language like C. If you expect it to behave
like a low-level language like C, you will be confused and upset.

But to prove that you are mistaken, we can do this:

py> s = """Triple-quote string\r
... containing carriage-return+newline line\r
... endings."""
py> s
'Triple-quote string\r\ncontaining carriage-return+newline line\r\nendings.'

wxjmfauth · Aug 15, 2013

--------

I know perfectly well what Python does.
I am misinterpreting nothing.
I opened my example in binary mode just to show the real
line endings.
The fact remains that """...""" has its own EOL and one
has to be aware of it.
No more, no less.

("""...""" and tokenize.py is funny)

jmf
