string.count issue (i'm stupid?)

Matteo Rattotti

Hi all,

I've noticed some strange behaviour of string.count.

To my mind, this code should work like this:

str = "a_a_a_a_"
howmuch = str.count("_a_")
print(howmuch)  # expected: 3

but count returns only 2.

OK, this may be fine, but why? The docstring says that count will
return the number of occurrences of the substring in the master string,
and if we are talking about substrings, I count 3 of them...

Can someone explain this to me? And how can I count all the
occurrences of a substring in a master string? (Yes, all occurrences,
reusing already-counted characters if needed.)

Thanks a lot

Matteo Rattotti www.rknet.it
Powered by:
- MacOsX
- Gnu / Linux Debian Sarge
- Amiga Os 3.9
- Milk
 
Dirk Hagemann

I think I can tell you WHY this happens, but I don't know a work-around
at the moment.
It seems as if only the following "_a_" (A) are counted: a_A_a_A_
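
The marking above can be reproduced with a small sketch. It assumes that count behaves like a left-to-right scan that resumes after the end of each match; the helper name non_overlapping_positions is hypothetical and is not part of the str API.

```python
def non_overlapping_positions(text, sub):
    """Start indices found by a scan that resumes after each whole match.

    Hypothetical helper for illustration; assumes sub is non-empty.
    """
    if not sub:
        raise ValueError("sub must be non-empty")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume after the whole match, so its characters are never reused.
        start = text.find(sub, start + len(sub))
    return positions


text = "a_a_a_a_"
print(non_overlapping_positions(text, "_a_"))  # [1, 5]
print(text.count("_a_"))                       # 2
```

The candidate starting at index 3 is skipped because its leading "_" was already consumed by the match starting at index 1.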

regards
Dirk
 
Diez B. Roggisch

Matteo said:
Hi all,

I've noticed some strange behaviour of string.count.

To my mind, this code should work like this:

str = "a_a_a_a_"
howmuch = str.count("_a_")
print(howmuch)  # expected: 3

but count returns only 2.

OK, this may be fine, but why? The docstring says that count will
return the number of occurrences of the substring in the master string,
and if we are talking about substrings, I count 3 of them...

It appears to be a documentation bug. It should say something about
non-overlapping occurrences.

Better, of course, would be to introduce a parameter that defines the
behavior, either overlapping or not.
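
To make the idea concrete, here is a minimal sketch of such a switch, written as a standalone function. The name count_substring and the overlapping keyword are assumptions for illustration only; str.count itself takes no such parameter.

```python
def count_substring(text, sub, overlapping=False):
    """Count occurrences of sub in text.

    Hypothetical helper: with overlapping=False it mirrors str.count,
    with overlapping=True matches may share characters.
    """
    if not sub:
        raise ValueError("sub must be non-empty")
    # After a match, advance by one character (overlapping)
    # or past the whole match (non-overlapping).
    step = 1 if overlapping else len(sub)
    count = 0
    index = text.find(sub)
    while index != -1:
        count += 1
        index = text.find(sub, index + step)
    return count


print(count_substring("a_a_a_a_", "_a_"))                    # 2
print(count_substring("a_a_a_a_", "_a_", overlapping=True))  # 3
```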

Diez
 
Alexandre Fayolle

On 22-05-2006, Matteo said:
Hi all,

I've noticed some strange behaviour of string.count.

To my mind, this code should work like this:

str = "a_a_a_a_"
howmuch = str.count("_a_")
print(howmuch)  # expected: 3

but count returns only 2.

OK, this may be fine, but why? The docstring says that count will
return the number of occurrences of the substring in the master string,
and if we are talking about substrings, I count 3 of them...

Can someone explain this to me? And how can I count all the
occurrences of a substring in a master string? (Yes, all occurrences,
reusing already-counted characters if needed.)

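Use the optional start argument of find or index in a loop, such as:

# The def line was lost from the original post; the function name is assumed.
def count_overlapping(string, substring):
    index = 0
    count = 0
    while True:
        # Search again, starting at the current position.
        index = string.find(substring, index)
        if index < 0:
            return count
        else:
            count += 1
            index += 1  # step one character only, so matches may overlap

print(count_overlapping("a_a_a_a_", "_a_"))  # 3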
 
Alexander Schmolck

Dirk Hagemann said:
I think I can tell you WHY this happens, but I don't know a work-around
at the moment.

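import re

# Match the first character, then require the rest via a lookahead,
# which does not consume it.
len(re.findall('_(?=a_)', '_a_a_a_a_'))

# untested
def countWithOverlaps(s, pat):
    return len(re.findall("%s(?=%s)" % (re.escape(pat[0]), re.escape(pat[1:])), s))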

'as
 
bruno at modulix

Matteo said:
Hi all,

I've noticed some strange behaviour of string.count.

To my mind, this code should work like this:

str = "a_a_a_a_"

Don't use 'str' as an identifier; it shadows the builtin str type.

Matteo said:
howmuch = str.count("_a_")
print(howmuch)  # expected: 3

but count returns only 2.

OK, this may be fine, but why? The docstring says that count will
return the number of occurrences of the substring in the master string,
and if we are talking about substrings, I count 3 of them...

It depends on how you define "number of substrings", I mean, overlapping or
not. FWIW, I agree that this may be somewhat unintuitive, and would at
least require a little more precision in the docstring.

Matteo said:
Can someone explain this to me?

It seems obvious that str.count counts non-overlapping substrings.

Matteo said:
And how can I count all the
occurrences of a substring in a master string? (Yes, all occurrences,
reusing already-counted characters if needed.)

Look at the re module.
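
One possible way to follow this pointer, as a sketch rather than the only approach: wrap the whole substring in a zero-width lookahead, so the regex engine reports every position where the substring begins without consuming it. The helper name overlapping_starts is hypothetical.

```python
import re


def overlapping_starts(text, sub):
    """Return every index at which sub begins in text, overlaps included.

    Hypothetical helper; (?=...) is a zero-width lookahead, so each match
    is empty and the scan moves on one character at a time.
    """
    pattern = re.compile("(?=%s)" % re.escape(sub))
    return [m.start() for m in pattern.finditer(text)]


starts = overlapping_starts("a_a_a_a_", "_a_")
print(starts)       # [1, 3, 5]
print(len(starts))  # 3
```

re.escape keeps characters such as "." or "*" in the substring from being read as regex syntax.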
 
Tim Chase

I agree the docstring is a bit confusing and could be clarified
as to what's happening.

Matteo said:
Can someone explain this to me? And how can I count all
the occurrences of a substring in a master string? (Yes, all
occurrences, reusing already-counted characters if needed.)


You should be able to use something like

s = "a_a_a_a_"
count = len([i for i in range(len(s)) if s.startswith("_a_", i)])

which will count the way you wanted, rather than the currently
existing count() behavior.
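
As a quick sanity check, the same startswith idea can be wrapped in a function and compared against the built-in count on a few inputs. The function name count_with_startswith and the extra test strings are assumptions added for illustration.

```python
def count_with_startswith(text, sub):
    """Count positions where sub begins, overlaps included (hypothetical helper)."""
    # A generator avoids building the intermediate list of indices.
    return sum(1 for i in range(len(text)) if text.startswith(sub, i))


for text, sub in [("a_a_a_a_", "_a_"), ("aaaa", "aa"), ("abc", "x")]:
    # Columns: text, substring, built-in count, overlapping count
    print(text, sub, text.count(sub), count_with_startswith(text, sub))
```

The two counts differ only when occurrences share characters, as in the first two rows.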

-tkc
 
BartlebyScrivener

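We were doing something like this last week:

thestring = "a_a_a_a_"
# The loop line was lost from the original post; this reconstruction is assumed.
for x in range(len(thestring)):
    try:
        # Count within a 3-character window: 1 if "_a_" starts at x, else 0.
        print(thestring.count("_a_", x, x + 3))
    except ValueError:
        pass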
 
