slicing the end of a string in a list

John Salerno

Here's the code I wrote:

file = open('C:\switches.txt', 'r')
switches = file.readlines()
i = 0

for line in switches:
    line = switches[i][:-1]
    i += 1

print switches


You can probably tell what I'm doing. Read a list of lines from a file,
and then I want to slice off the '\n' character from each line. But
after this code runs, the \n is still there. I thought it might have
something to do with the fact that strings are immutable, but a test
such as:

switches[0][:-1]

does slice off the \n character. So I guess the problem lies in the
assignment or somewhere in there.

Also, is this the best way to index the list?
 
Ben Cartwright

John said:
You can probably tell what I'm doing. Read a list of lines from a file,
and then I want to slice off the '\n' character from each line. But
after this code runs, the \n is still there. I thought it might have
something to do with the fact that strings are immutable, but a test
such as:

switches[0][:-1]

does slice off the \n character.

Actually, it creates a new string instance with the \n character
removed, then discards it. The original switches[0] string hasn't
changed.

>>> foo = 'Hello world!'
>>> foo[:-1]
'Hello world'
>>> foo
'Hello world!'

John said:
So I guess the problem lies in the
assignment or somewhere in there.

Yes. You are repeatedly assigning a new string instance to "line", which
is then never referenced again. If you want to update the switches
list, then instead of assigning to "line" inside the loop, you need:

switches[i] = switches[i][:-1]

John said:
Also, is this the best way to index the list?

No, since the line variable is unused. This:

i = 0
for line in switches:
    line = switches[i][:-1]
    i += 1

Would be better written as:

for i in range(len(switches)):
    switches[i] = switches[i][:-1]

For most looping scenarios in Python, you shouldn't have to manually
increment a counter variable.
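
A minimal sketch of the difference described above, written for Python 3 (the thread itself uses Python 2 syntax) and using a made-up list in place of the file contents. Rebinding the loop variable leaves the list alone, while assigning through the index replaces the element:

```python
# Hypothetical sample data standing in for file.readlines()
switches = ['alpha\n', 'beta\n', 'gamma\n']

# Rebinding the loop variable only changes what the name "line" refers to;
# the strings stored in the list are untouched.
for line in switches:
    line = line[:-1]
unchanged = list(switches)  # snapshot taken after the first loop

# Assigning through the index replaces the list element itself.
for i in range(len(switches)):
    switches[i] = switches[i][:-1]

print(unchanged)  # still has the trailing newlines
print(switches)   # newlines removed
```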

--Ben

PS - actually, you can accomplish all of the above in a single line of
code:
print [line[:-1] for line in open('C:\\switches.txt')]
 
John Salerno

Ben said:
Actually, it creates a new string instance with the \n character
removed, then discards it. The original switches[0] string hasn't
changed.
Yes. You are repeatedly assigning a new string instance to "line", which
is then never referenced again.

Ah, thank you!
PS - actually, you can accomplish all of the above in a single line of
code:
print [line[:-1] for line in open('C:\\switches.txt')]

Wow, that just replaced 7 lines of code! So *this* is why Python is
good. :)
 
John Salerno

Ben said:
print [line[:-1] for line in open('C:\\switches.txt')]

Hmm, I just realized in my original code that I didn't escape the
backslash. Why did it still work properly?

By the way, this whole 'one line' thing has blown me away. I wasn't
thinking about list comprehensions when I started working on this, but
just the fact that it can all be done in one line is amazing. I tried
this in C# and of course I had to create a class first, and open the
file streams, etc. :)

And do I not need the 'r' parameter in the open function?
 
Paul Rubin

John Salerno said:
print [line[:-1] for line in open('C:\\switches.txt')]

Hmm, I just realized in my original code that I didn't escape the
backslash. Why did it still work properly?

The character after the backslash isn't special: \s doesn't get
converted into a code like \n, so the backslash is passed through.
Best not to rely on that.
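
A small sketch of two safer spellings, assuming Python 3. The second path, `C:\new.txt`, is a hypothetical name chosen only to show what goes wrong when the character after the backslash does form an escape:

```python
escaped = 'C:\\switches.txt'  # backslash escaped explicitly
raw = r'C:\switches.txt'      # raw string: backslashes are kept as written
risky = 'C:\new.txt'          # hypothetical path: "\n" here becomes a newline

print(escaped == raw)   # both spell the same path
print('\n' in risky)    # the path silently contains a newline character
```

Either the doubled backslash or the raw-string prefix avoids depending on which letters happen to follow the backslash.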

The preferred way to remove the newline is more like:

for line in open('C:\\switches.txt'):
    print line.rstrip()

The rstrip method removes all trailing whitespace, including the line
ending, which might be \n on some systems, \r\n on other systems, etc.
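
As a rough illustration of how rstrip compares with slicing, here is a sketch in Python 3 using made-up sample lines rather than the real file. Passing '\r\n' to rstrip is an optional variation, not something suggested in the thread, for cases where trailing spaces should be kept:

```python
# Hypothetical sample lines covering the cases discussed above
lines = ['unix line\n', 'windows line\r\n', 'trailing spaces  \n', 'no newline']

stripped_all = [line.rstrip() for line in lines]        # all trailing whitespace
stripped_eol = [line.rstrip('\r\n') for line in lines]  # only CR/LF characters
sliced = [line[:-1] for line in lines]                  # always drops one character

print(stripped_all)
print(stripped_eol)
print(sliced)
```

Slicing leaves a stray '\r' on the second line and eats the last letter of the final line, which has no newline to remove.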

John Salerno said:
And do I not need the 'r' parameter in the open function?

No, you get 'r' by default. If you want to write to the file, you need
to pass the mode parameter.
 
John Salerno

Paul said:
The preferred way to remove the newline is more like:
for line in open('C:\\switches.txt'):
    print line.rstrip()

Interesting. So I would say:

[line.rstrip() for line in open('C:\\switches.txt')]
 
John Salerno

John said:
Paul said:
The preferred way to remove the newline is more like:
for line in open('C:\\switches.txt'):
    print line.rstrip()

Interesting. So I would say:

[line.rstrip() for line in open('C:\\switches.txt')]

That seems to work. And on a related note, it seems to allow me to end
my file on the last line, instead of having to add a newline character
at the end of it so it will get sliced properly too.
 
Paul Rubin

John Salerno said:
Interesting. So I would say:

[line.rstrip() for line in open('C:\\switches.txt')]

Yes, you could do that. Note that it builds up an in-memory list of
all those lines, instead of processing the file one line at a time.
If the file is very large, that might be a problem.

If you use parentheses instead:

(line.rstrip() for line in open('C:\\switches.txt'))

you get what's called a generator expression that you can loop
through. That's a bit complicated to explain, though, so it's probably
better to get used to other parts of Python before worrying about it.
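
For the curious, a minimal sketch of the generator expression in Python 3. An in-memory `io.StringIO` object (an assumption made so the example runs without the real file) stands in for the opened file:

```python
import io

# Hypothetical in-memory file standing in for open('C:\\switches.txt')
fake_file = io.StringIO('first\nsecond\nthird\n')

# Parentheses create a generator: no line is stripped until it is requested.
stripped = (line.rstrip() for line in fake_file)

first = next(stripped)  # processes exactly one line
rest = list(stripped)   # consumes whatever is left

print(first)
print(rest)
```

Unlike the list comprehension, only one stripped line needs to exist in memory at a time, and the generator can be looped over only once.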
 
Leif K-Brooks

Ben said:
No, since the line variable is unused. This:

i = 0
for line in switches:
    line = switches[i][:-1]
    i += 1

Would be better written as:

for i in range(len(switches)):
    switches[i] = switches[i][:-1]


This is better, IMHO:

for i, switch in enumerate(switches):
    switches[i] = switch[:-1]
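
A short sketch of what enumerate supplies to that loop, in Python 3 with made-up sample data: each iteration receives the index and the element together, so no separate counter or repeated indexing is needed.

```python
# Hypothetical sample data standing in for the lines read from the file
switches = ['alpha\n', 'beta\n']

# enumerate yields (index, element) pairs
pairs = list(enumerate(switches))

# Use the index to store the trimmed string back into the list
for i, switch in enumerate(switches):
    switches[i] = switch[:-1]

print(pairs)
print(switches)
```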
 
Peter Otten

John said:
You can probably tell what I'm doing. Read a list of lines from a file,
and then I want to slice off the '\n' character from each line.

If you are not concerned about memory consumption there is also

open(filename).read().splitlines()
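
A quick sketch of how splitlines behaves, in Python 3, applied to plain strings (hypothetical stand-ins for what read() would return) so that it runs without a file:

```python
# Hypothetical file contents as one string, as read() would return them
text = 'one\ntwo\n'

with_splitlines = text.splitlines()  # line endings removed, no empty tail
with_split = text.split('\n')        # leaves an empty string after the last \n
mixed = 'a\r\nb'.splitlines()        # \r\n is also recognised as a line ending

print(with_splitlines)
print(with_split)
print(mixed)
```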

Peter
 
P Boy

One-liners are cool. Personally, however, I would not promote one-liners
in Python. Python code is meant to be read. Cryptic coding belongs in
Perl's world.

The code below is intuitive; almost a three-year-old would understand it.

for line in open('C:\\switches.txt'):
    print line.rstrip()

BTW, if the file is huge, one may want to consider using
open('c:\\switches.txt', 'rb') instead.
 
P Boy

I had some issues a while ago trying to open a large binary file.

Anyway, from the file() man page:

If mode is omitted, it defaults to 'r'. When opening a binary file, you
should append 'b' to the mode value for improved portability. (It's
useful even on systems which don't treat binary and text files
differently, where it serves as documentation.) The optional bufsize
argument specifies the file's desired buffer size: 0 means unbuffered,
1 means line buffered, any other positive value means use a buffer of
(approximately) that size. A negative bufsize means to use the system
default, which is usually line buffered for tty devices and fully
buffered for other files. If omitted, the system default is used.
 
Steven D'Aprano

P Boy said:
I had some issues a while ago trying to open a large binary file.

The important term there is BINARY, not large. Many problems *reading*
(not opening) binary files will go away if you use 'rb', regardless of
whether they are small, medium or large.

P Boy said:
Anyway, from the file() man page:

If mode is omitted, it defaults to 'r'. When opening a binary file, you
should append 'b' to the mode value for improved portability. (It's
useful even on systems which don't treat binary and text files
differently, where it serves as documentation.)

Which does not suggest that using 'rb' is better for large files and 'r'
for small. It suggests that using 'rb' is better for binary files and 'r'
for text.

P Boy said:
The optional bufsize
argument specifies the file's desired buffer size: 0 means unbuffered,
1 means line buffered, any other positive value means use a buffer of
(approximately) that size. A negative bufsize means to use the system
default, which is usually line buffered for tty devices and fully
buffered for other files. If omitted, the system default is used.

If you are having problems with large files, changing the buffering will
help far more than changing the mode.
 
John Salerno

Steven said:
The important term there is BINARY, not large. Many problems *reading*
(not opening) binary files will go away if you use 'rb', regardless of
whether they are small, medium or large.

Is 'b' the proper parameter to use when you want to read/write a binary
file? I was wondering about this, because the book I'm reading doesn't
talk about dealing with binary files.
 
Steven D'Aprano

John said:
Is 'b' the proper parameter to use when you want to read/write a binary
file? I was wondering about this, because the book I'm reading doesn't
talk about dealing with binary files.

The interactive interpreter is your friend. Call help(file), and you will
get:

class file(object)
| file(name[, mode[, buffering]]) -> file object
|
| Open a file. The mode can be 'r', 'w' or 'a' for reading (default),
| writing or appending. The file will be created if it doesn't exist
| when opened for writing or appending; it will be truncated when
| opened for writing. Add a 'b' to the mode for binary files.

plus extra information.

Take note that the mode is NOT "b". It is "rb".
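
A small sketch of that point, assuming Python 3, where the built-in open() takes the place of file(). It uses a temporary file so nothing on disk is assumed; the byte values are arbitrary sample data:

```python
import os
import tempfile

# Create a temporary file to stand in for a real binary file.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # 'wb' = write + binary; the bytes are stored exactly as given.
    with open(path, 'wb') as f:
        f.write(b'\x00\x01\r\n\xff')

    # 'rb' = read + binary; the bytes come back unchanged.
    with open(path, 'rb') as f:
        data = f.read()

    # A bare 'b' is not a complete mode; Python 3 rejects it.
    try:
        open(path, 'b')
        bare_b_error = None
    except ValueError as exc:
        bare_b_error = exc
finally:
    os.remove(path)

print(data)
print(bare_b_error)
```

The 'b' is a modifier added to 'r', 'w' or 'a', so writing a binary file uses 'wb' in the same way.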
 
John Salerno

Steven said:
Is 'b' the proper parameter to use when you want to read/write a binary
file? I was wondering about this, because the book I'm reading doesn't
talk about dealing with binary files.

The interactive interpreter is your friend. Call help(file), and you will
get:

class file(object)
| file(name[, mode[, buffering]]) -> file object
|
| Open a file. The mode can be 'r', 'w' or 'a' for reading (default),
| writing or appending. The file will be created if it doesn't exist
| when opened for writing or appending; it will be truncated when
| opened for writing. Add a 'b' to the mode for binary files.

plus extra information.

Take note that the mode is NOT "b". It is "rb".

Awesome! I'm trying to push away thoughts of C#'s binary reader and
writer classes now. :)
 
John Salerno

Paul said:
John Salerno said:
Interesting. So I would say:

[line.rstrip() for line in open('C:\\switches.txt')]


How would I manually close a file that's been opened this way? Or is it
not possible in this case? Is it necessary?
 
Steve Holden

John said:
Paul said:
John Salerno said:
Interesting. So I would say:

[line.rstrip() for line in open('C:\\switches.txt')]



How would I manually close a file that's been opened this way? Or is it
not possible in this case? Is it necessary?

It's not possible to perform an explicit close if, as in this case, you
don't have an explicit reference to the file object.

In CPython it's not strictly necessary to close the file, but other
implementations don't guarantee that a file will be closed after the
last reference is deleted.

So for fullest portability it's better to explicitly close the file.
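
A sketch of two ways to close the file explicitly, assuming Python 3. The `with` statement shown second was not available when this thread was written, and the temporary file with its two sample lines is a stand-in for switches.txt:

```python
import os
import tempfile

# Temporary file standing in for C:\switches.txt
fd, path = tempfile.mkstemp(text=True)
os.close(fd)
try:
    with open(path, 'w') as out:
        out.write('-a\n-b\n')

    # Option 1: keep a reference to the file object and close it yourself.
    f = open(path)
    try:
        switches = [line.rstrip() for line in f]
    finally:
        f.close()
    closed_explicitly = f.closed

    # Option 2: a with block closes the file when the block exits.
    with open(path) as g:
        switches_with = [line.rstrip() for line in g]
    closed_by_with = g.closed
finally:
    os.remove(path)

print(switches, closed_explicitly)
print(switches_with, closed_by_with)
```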

regards
Steve
 
John Salerno

Steve said:
It's not possible to perform an explicit close if, as in this case, you
don't have an explicit reference to the file object.

In CPython it's not strictly necessary to close the file, but other
implementations don't guarantee that a file will be closed after the
last reference is deleted.

So for fullest portability it's better to explicitly close the file.

regards
Steve

Thanks!
 
