strip module bug

Poppy

I'm using versions 2.5.2 and 2.5.1 of Python and have encountered a
potential bug. I'm not sure if I'm misunderstanding the usage of the strip
function, but here's my example.

var = "detail.xml"
print var.strip(".xml") ### expect to see 'detail', but get 'detai'
var = "overview.xml"
print var.strip(".xml") ### expect and get 'overview'

I have a workaround using the replace function, which happens to be the
better choice for my script anyhow, but I am curious about the strip module.
Any thoughts? Is it removing the 'l' in detail because the strip function
text ends in 'l'?
 
Peter Otten

The behaviour you intend, stripping a suffix, is achieved by
>>> if var.endswith(suffix):
...     var = var[:-len(suffix)]
...

The var.strip(chars) /method/ does something different. It treats chars as a
set of characters and removes any of these characters from the end and the
beginning of the var string, i.e.

"detail.xml"   d is not in ".xml" -> remove no further chars from the beginning
"detail.xml"   l is in ".xml", remove it
"detail.xm"    m is in ".xml", remove it
"detail.x"     x is in ".xml", remove it
"detail."      . is in ".xml", remove it
"detail"       l is in ".xml", remove it
"detai"        i is not in ".xml", we're done

Peter
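
The trace above can be reproduced with a small re-implementation. This is only a sketch of the documented behaviour, not how the interpreter implements it; the helper name strip_chars is made up for illustration, and print() is called with a single argument so the snippet runs on both Python 2 and Python 3.

```python
def strip_chars(s, chars):
    """Mimic s.strip(chars): drop leading/trailing characters found in chars."""
    start, end = 0, len(s)
    # Advance from the left while the character belongs to the set.
    while start < end and s[start] in chars:
        start += 1
    # Retreat from the right while the character belongs to the set.
    while end > start and s[end - 1] in chars:
        end -= 1
    return s[start:end]

print(strip_chars("detail.xml", ".xml"))    # detai
print(strip_chars("overview.xml", ".xml"))  # overview
```

"overview.xml" only looks right because 'w' is not one of the characters in ".xml", so the scan from the right stops exactly where the extension begins.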
 
bearophileHUGS

Python V.2.5 is not flawless, but you can't find bugs so easily. I've
found only two bugs in about three years of continuous usage (while I
have found about 27 new different bugs in the DMD compiler in about
one year of usage).

str.strip() isn't a module; in Python it's generally called a
"method", in this case a method of str (or unicode).

The optional argument of str.strip() isn't the string you want to
remove, but a string that represents the set of chars you want to strip
away from both ends, so:
>>> "detail.xml".strip("dl")
'etail.xm'

To strip the ending ".xml" you can use the str.partition() method, or
an alternative solution:
>>> if var.endswith(suffix):
...     var2 = var[:-len(suffix)]
...
>>> var2
'detail'
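
A bare slice cuts off len(suffix) characters whether or not they match, so it helps to check for the suffix first. The following is a minimal sketch of both approaches mentioned above; the function name remove_suffix is hypothetical. (Python 3.9 and later also offer str.removesuffix() for the same job.)

```python
def remove_suffix(s, suffix):
    """Return s without suffix if s ends with it, otherwise s unchanged."""
    # An empty suffix is skipped: s[:-0] would return an empty string.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s

print(remove_suffix("detail.xml", ".xml"))  # detail
print(remove_suffix("detail.txt", ".xml"))  # detail.txt
# str.partition() splits at the first occurrence of the separator.
print("detail.xml".partition(".xml")[0])    # detail
```

Note that partition() cuts at the first ".xml" it finds, so a name that contains ".xml" somewhere in the middle would be truncated early.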

Bye,
bearophile
 
Tim Chase

.strip() takes a *set* of characters to remove, as many as are found:
>>> 'abcbayxxycab'.strip('abc')
'yxxy'

Using .replace() may work, but might have some odd side effects:
>>> 'this file has .xml in its name.xml'.replace('.xml', '')
'this file has  in its name'

This also has problems if your filename and extension differ in
case (removing ".xml" from "file.XML")
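
Both pitfalls can be seen side by side in a short sketch. The helper strip_extension is hypothetical and assumes plain ASCII file names, where lower-casing does not change the length of the string.

```python
def strip_extension(fname, ext):
    """Remove ext from the end of fname, ignoring case (hypothetical helper)."""
    if ext and fname.lower().endswith(ext.lower()):
        return fname[:-len(ext)]
    return fname

fname = "this file has .xml in its name.XML"
# replace() removes the inner ".xml" and misses the upper-case extension.
print(fname.replace(".xml", ""))       # this file has  in its name.XML
# The helper only looks at the end of the name and ignores case.
print(strip_extension(fname, ".xml"))  # this file has .xml in its name
```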

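If you want to remove the extension, then I recommend using the
os.path.splitext() function from the standard library:
>>> import os
>>> os.path.splitext('this file has .xml in its name.xml')[0]
'this file has .xml in its name'
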
Or, if you know what the extension is:
>>> fname = 'this file has .xml in its name.xml'
>>> ext = '.xml'
>>> fname[:-len(ext)]
'this file has .xml in its name'

Just a few ideas that are hopefully more robust.

-tkc
 
Steven D'Aprano

I got bitten by this once too. Most embarrassingly, I already knew the
right behaviour but when somebody suggested it was a bug I got confused
and convinced myself it was a bug. It's not.

You have misunderstood what strip() does. It does NOT mean "remove this
string from the string if it is a suffix or prefix".

Consider:
>>> '123abc321'.strip('213')
'abc'

strip() removes *characters*, not substrings. It doesn't matter what
order it sees them.

See help(''.strip) in the interactive interpreter for more detail.
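
A few quick checks make the point concrete. This is just an illustrative sketch; the sample strings are chosen for the demonstration, and rstrip() and lstrip() are the one-sided variants of the same method.

```python
name = "detail.xml"
# The argument is a set of characters, so its order is irrelevant.
print(name.strip(".xml") == name.strip("lmx."))  # True
# rstrip() only touches the right-hand end, lstrip() only the left.
print("xml.detail.xml".rstrip(".xml"))  # xml.detai
print("xml.detail.xml".lstrip(".xml"))  # detail.xml
```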


By the way, the right way to deal with file extensions is:
>>> import os
>>> os.path.splitext('detail.xml')
('detail', '.xml')
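
It may help to see how os.path.splitext() treats a few less obvious names. The file names below are made up for illustration; the results shown are those of current Python versions.

```python
import os.path

for name in ["detail.xml", "archive.tar.gz", "detail", ".bashrc"]:
    root, ext = os.path.splitext(name)
    print((root, ext))
# ('detail', '.xml')
# ('archive.tar', '.gz')   only the last extension is split off
# ('detail', '')           no extension gives an empty string
# ('.bashrc', '')          a leading dot is not treated as an extension
```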
 
Poppy

Thanks Steven and Tim, I understand the strip method a lot better today. Also,
for some reason I had decided against using the path functions, but I have now
tried them and implemented them. My script reads one file and writes a new
file with a different extension.

So based on your suggestions I wrote this line.

import sys, os
xmlfile = sys.argv[1]
filout = os.path.splitext(xmlfile)[0] + ".xmlparse" ### here is the new line
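
The same idea can be wrapped in a small function so it is easy to check without command-line arguments. This is only a sketch; the name output_name and its default argument are assumptions for illustration, not part of the script above.

```python
import os.path

def output_name(path, new_ext=".xmlparse"):
    """Swap the extension of path for new_ext (hypothetical helper)."""
    root, _old_ext = os.path.splitext(path)
    return root + new_ext

print(output_name("detail.xml"))         # detail.xmlparse
print(output_name("data/overview.XML"))  # data/overview.xmlparse
```

Because splitext() does not care which extension it finds, this also works for upper-case extensions or for input files that are not named ".xml" at all.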
 
