startswith( prefix[, start[, end]]) Query

cjt22

Hi

startswith( prefix[, start[, end]]) States:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of suffixes to look for. However, when I try
to pass a tuple of prefixes I get the following error:

TypeError: expected a character buffer object

For example:

    file = f.readlines()
    for line in file:
        if line.startswith(("abc", "df")):
            pass  # CODE

This generates the above error.

To work around this problem, I am currently just chaining individual
startswith calls,
i.e. if line.startswith("if") or line.startswith("df")
but I know there must be a way to define all my prefixes in one tuple.

Thanks in advance
 
Tim Golden

Hi

startswith( prefix[, start[, end]]) States:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of suffixes to look for.

That particular aspect of the functionality (the multiple
prefixes in a tuple) was only added in Python 2.5. If you're
using <= 2.4 you'll need to use "or" or some other approach,
e.g. looping over a sequence of prefixes.
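
A minimal sketch of the looping approach, for interpreters where startswith does not accept a tuple. The helper name starts_with_any is made up for illustration; it uses a plain loop rather than any(), since any() also only arrived in 2.5.

```python
def starts_with_any(text, prefixes):
    """Return True if text starts with at least one of the given prefixes."""
    for prefix in prefixes:
        # One plain str.startswith call per prefix; no tuple argument needed.
        if text.startswith(prefix):
            return True
    return False


# Hypothetical sample data standing in for the lines of a file.
lines = ["abc first\n", "df second\n", "xyz third\n"]
matching = [line for line in lines if starts_with_any(line, ("abc", "df"))]
```

On 2.5 and later, line.startswith(("abc", "df")) should give the same answer directly.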

TJG
 
attn.steven.kuo

Hi

startswith( prefix[, start[, end]]) States:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of suffixes to look for. However when I try
and add a tuple of suffixes I get the following error:

Type Error: expected a character buffer object

For example:

file = f.readlines()
for line in file:
if line.startswith(("abc","df"))
CODE

It would generate the above error

(snipped)

You seem to be using an older version of Python.
For me it works as advertised with 2.5.1,
but runs into the problem you described with 2.4.4:

Python 2.5.1c1 (r251c1:54692, Apr 17 2007, 21:12:16)
[GCC 4.0.0 (Apple Computer, Inc. build 5026)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
....
foobar
.... print line
....
foobar


VS.

Python 2.4.4 (#1, Oct 18 2006, 10:34:39)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.....
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: expected a character buffer object
 
Bruno Desthuilliers

(e-mail address removed) wrote:
Hi

startswith( prefix[, start[, end]]) States:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of suffixes to look for. However when I try
and add a tuple of suffixes I get the following error:

Type Error: expected a character buffer object

For example:

file = f.readlines()
for line in file:

Slightly OT, but:
1/ you should not use 'file' as an identifier, it shadows the builtin
file type
2/ FWIW, it's also a pretty bad naming choice for a list of lines - why
not just name this list 'lines' ?-)
3/ anyway, unless you need to store this whole list in memory, you'd be
better off using the iterator idiom (Python files are iterables):

    f = open('some_file.ext')
    for line in f:
        print line
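
As a rough sketch of that idiom combined with the prefix test, assuming Python 2.5 or later so that the tuple form is available. The function name matching_lines and its parameters are made up for illustration.

```python
def matching_lines(path, prefixes):
    """Yield the lines of the file at path that start with any prefix.

    prefixes must be a tuple of strings (tuple form needs Python 2.5+).
    """
    with open(path) as f:      # the file is closed even if an error occurs
        for line in f:         # reads one line at a time, not the whole file
            if line.startswith(prefixes):
                yield line
```

Because this is a generator, nothing is read until the caller starts iterating, e.g. for line in matching_lines('some_file.ext', ('abc', 'df')).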

if line.startswith(("abc","df"))
CODE

It would generate the above error

May I suggest that you read the appropriate version of the doc ? That
is, the one corresponding to your installed Python version ?-)

Passing a tuple to str.startswith is new in 2.5. I bet you're trying it
on a 2.4 or older version.
To overcome this problem, I am currently just joining individual
startswith methods
i.e. if line.startswith("if") or line.startswith("df")
but know there must be a way to define all my suffixes in one tuple.

You may want to try with a regexp, but I'm not sure it's worth it (hint:
the timeit module is great for quick small benchmarks).
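
A minimal sketch of the regexp route, assuming the prefixes are known up front. The names prefixes, pattern and starts_with_regex are illustrative only.

```python
import re

prefixes = ("abc", "df")

# re.escape guards against prefixes that contain regex metacharacters.
# The alternation is compiled once and reused for every line.
pattern = re.compile("|".join(re.escape(p) for p in prefixes))


def starts_with_regex(line):
    # re.match only matches at the beginning of the string,
    # so no explicit '^' anchor is needed.
    return pattern.match(line) is not None
```

Whether this beats a handful of startswith calls is exactly what timeit would have to tell you.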

Else, you could as well write your own testing function:

    def str_starts_with(astring, *prefixes):
        startswith = astring.startswith
        for prefix in prefixes:
            if startswith(prefix):
                return True
        return False

    for line in f:
        if str_starts_with(line, 'abc', 'de', 'xxx'):
            pass  # CODE HERE

HTH
 
Carsten Haese

Hi

startswith( prefix[, start[, end]]) States:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of suffixes to look for. However when I try
and add a tuple of suffixes I get the following error:

Type Error: expected a character buffer object

You are probably looking at the documentation for Python 2.5, but you're
using Python 2.4 or older:

#######################################################################
Python 2.5 (r25:51908, Oct 28 2006, 12:26:14)
[GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
True
#######################################################################
Python 2.4.3 (#1, Oct 23 2006, 14:19:47)
[GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: expected a character buffer object
#######################################################################

HTH,
 
Tim Williams

You may want to try with a regexp, but I'm not sure it's worth it (hint:
the timeit module is great for quick small benchmarks).

Else, you could as well write your own testing function:

    def str_starts_with(astring, *prefixes):
        startswith = astring.startswith
        for prefix in prefixes:
            if startswith(prefix):
                return True
        return False

    for line in f:
        if str_starts_with(line, 'abc', 'de', 'xxx'):
            pass  # CODE HERE

Isn't slicing still faster than startswith? As you mention timeit,
then you should probably add slicing to the pot too :)

    if astring[:len(prefix)] == prefix:
        do_stuff()

:)
 
TheFlyingDutchman

Else, you could as well write your own testing function:

    def str_starts_with(astring, *prefixes):
        startswith = astring.startswith
        for prefix in prefixes:
            if startswith(prefix):
                return True
        return False

What is the reason for
startswith = astring.startswith
startswith(prefix)

instead of
astring.startswith(prefix)
 
Steve Holden

TheFlyingDutchman said:
What is the reason for
startswith = astring.startswith
startswith(prefix)

instead of
astring.startswith(prefix)
It's an optimization: the assignment creates a "bound method" (i.e. a
method associated with a specific string instance) and avoids having to
look up the startswith method of astring for each iteration of the inner
loop.

Probably not really necessary, though, and they do say that premature
optimization is the root of all evil ...
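
A small sketch of what that assignment produces, using made-up sample values. It only shows that the name refers to a method object tied to one particular string; it makes no claim about how much time is saved.

```python
astring = "abcdef"
prefixes = ("xyz", "abc")

# Looking up the attribute once yields a bound method object tied to astring.
startswith = astring.startswith
bound_to_same_string = startswith.__self__ is astring

# Inside the loop, each call reuses that object instead of
# looking up 'startswith' on astring again.
results = [startswith(p) for p in prefixes]
```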

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
--------------- Asciimercial ------------------
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
----------- Thank You for Reading -------------
 
Bruno Desthuilliers

Steve Holden wrote:
It's an optimization: the assignment creates a "bound method" (i.e. a
method associated with a specific string instance) and avoids having to
look up the startswith method of astring for each iteration of the inner
loop.

Probably not really necessary, though, and they do say that premature
optimization is the root of all evil ...

I wouldn't call this one "premature" optimization, since it doesn't
change the algorithm, doesn't introduce (much) complication, and is
proven to really save on lookup time.

Now I do agree that unless you have quite a lot of prefixes to test, it
might not be that necessary in this particular case...
 
Duncan Booth

Tim Williams said:
Isn't slicing still faster than startswith? As you mention timeit,
then you should probably add slicing to the pot too :)

Possibly, but there are so many other factors that affect the timing
that writing it clearly should be your first choice.

Some timings:

@echo off
setlocal
cd \python25\lib
echo "startswith"
...\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s.startswith(t)
...\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s.startswith(t)
echo "prebound startswith"
...\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2';startswith=s.startswith" startswith(t)
...\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1';startswith=s.startswith" startswith(t)
echo "slice with len"
...\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s[:len(t)]==t
...\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s[:len(t)]==t
echo "slice with magic number"
...\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s[:12]==t
...\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s[:12]==t

and typical output from this is:

"startswith"
1000000 loops, best of 3: 0.542 usec per loop
1000000 loops, best of 3: 0.514 usec per loop
"prebound startswith"
1000000 loops, best of 3: 0.472 usec per loop
1000000 loops, best of 3: 0.474 usec per loop
"slice with len"
1000000 loops, best of 3: 0.501 usec per loop
1000000 loops, best of 3: 0.456 usec per loop
"slice with magic number"
1000000 loops, best of 3: 0.34 usec per loop
1000000 loops, best of 3: 0.315 usec per loop

So for these particular strings, the naive slice wins if the comparison is
true, but loses to the pre-bound method if the comparison fails. The slice is
taking a hit from calling len every time, so pre-calculating the length
(which should be possible in the same situations as pre-binding startswith)
might be worthwhile, but I would still favour using startswith unless I knew
the code was time critical.
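
For anyone who would rather run the comparison from Python than from a batch file, here is a rough equivalent using the timeit module's function interface. It also includes the pre-calculated length variant mentioned above. The names SETUP, STATEMENTS and run_benchmarks are made up for this sketch, and the figures it reports will differ from those posted here depending on machine and Python version.

```python
import timeit

# %r is filled in with the prefix under test, e.g. t = 'abracadabra2'.
SETUP = "s = 'abracadabra1' * 1000; t = %r; sw = s.startswith; n = len(t)"

STATEMENTS = {
    "startswith": "s.startswith(t)",
    "prebound startswith": "sw(t)",
    "slice with len": "s[:len(t)] == t",
    "slice with precomputed length": "s[:n] == t",
}


def run_benchmarks(number=100000):
    """Return {(label, prefix): total seconds} for a failing and a matching prefix."""
    results = {}
    for prefix in ("abracadabra2", "abracadabra1"):
        for label, stmt in STATEMENTS.items():
            results[(label, prefix)] = timeit.timeit(
                stmt, setup=SETUP % prefix, number=number
            )
    return results


if __name__ == "__main__":
    for (label, prefix), seconds in sorted(run_benchmarks().items()):
        print("%-30s %s  %.4f s" % (label, prefix, seconds))
```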
 
Steve Holden

Bruno said:
Steve Holden wrote: [...]
Probably not really necessary, though, and they do say that premature
optimization is the root of all evil ...

I wouldn't call this one "premature" optimization, since it doesn't
change the algorithm, doesn't introduce (much) complication, and is
proven to really save on lookup time.

Now I do agree that unless you have quite a lot of prefixes to test, it
might not be that necessary in this particular case...

The defense rests.

regards
Steve
 
rzed

I went through your example to get timings for my machine, and I
ran into an issue I didn't expect.

My bat file did the following 10 times in a row:
(the command line wraps in this post)

call timeit -s "s='abracadabra1'*1000;t='abracadabra2';
startswith=s.startswith" startswith(t)
.... giving me these times:

1000000 loops, best of 3: 0.483 usec per loop
1000000 loops, best of 3: 0.49 usec per loop
1000000 loops, best of 3: 0.489 usec per loop
1000000 loops, best of 3: 0.491 usec per loop
1000000 loops, best of 3: 0.488 usec per loop
1000000 loops, best of 3: 0.492 usec per loop
1000000 loops, best of 3: 0.49 usec per loop
1000000 loops, best of 3: 0.493 usec per loop
1000000 loops, best of 3: 0.486 usec per loop
1000000 loops, best of 3: 0.489 usec per loop

Then I thought that a shorter name for the lookup might affect the
timings, so I changed the bat file, which now did the following 10
times in a row:

timeit -s "s='abracadabra1'* 1000;t='abracadabra2';
sw=s.startswith" sw(t)

.... giving me these times:
1000000 loops, best of 3: 0.516 usec per loop
1000000 loops, best of 3: 0.512 usec per loop
1000000 loops, best of 3: 0.514 usec per loop
1000000 loops, best of 3: 0.517 usec per loop
1000000 loops, best of 3: 0.515 usec per loop
1000000 loops, best of 3: 0.518 usec per loop
1000000 loops, best of 3: 0.523 usec per loop
1000000 loops, best of 3: 0.513 usec per loop
1000000 loops, best of 3: 0.514 usec per loop
1000000 loops, best of 3: 0.515 usec per loop

In other words, the shorter name did seem to affect the timings,
but in a negative way. Why it would actually change at all is
beyond me, but it is consistently this way on my machine.

Can anyone explain this?
 
Bruno Desthuilliers

Steve Holden wrote:
Bruno said:
Steve Holden wrote:
[...]
Probably not really necessary, though, and they do say that premature
optimization is the root of all evil ...


I wouldn't call this one "premature" optimization, since it doesn't
change the algorithm, doesn't introduce (much) complication, and is
proven to really save on lookup time.

Now I do agree that unless you have quite a lot of prefixes to test,
it might not be that necessary in this particular case...


The defense rests.

Sorry, I don't understand this one (please bear with a poor French boy).
 
Dennis Lee Bieber

Steve Holden wrote:

Sorry, I don't understand this one (please bear with a poor French boy).

Ah well... Under Napoleonic justice, the defense can't afford to
rest.

I believe the sense here is:

"The defense rests" implies that the defending attorney in the
arguments believes his case has been proven -- used in a message like
this, it implies that the prosecution made a statement that proves the
defense side.
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
