passing multiple strings to string.find()

hokiegal99

How do I say:

x = string.find(files, 'this', 'that', 'the-other')

currently I have to write it like this to make it work:

x = string.find(files, 'this')
y = string.find(files, 'that')
z = string.find(files, 'the-other')
 
Raymond Hettinger

hokiegal99 said:
How do I say:

x = string.find(files, 'this', 'that', 'the-other')

currently I have to write it like this to make it work:

x = string.find(files, 'this')
y = string.find(files, 'that')
z = string.find(files, 'the-other')

Try this:

x, y, z = map(files.find, ['this', 'that', 'the-other'])

or, if you're just trying to find the first match:

re.search('this|that|the-other', files).start()
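
For anyone trying both suggestions on a current Python, here is a minimal sketch. The contents of `files` are made-up sample data, and the `None` check is an addition: `re.search` returns `None` when nothing matches, so calling `.start()` directly would raise an error in that case.

```python
import re

files = "this.txt that.txt the-other.txt"  # hypothetical sample data

# One find() per search string; each result is an index, or -1 if absent.
x, y, z = map(files.find, ["this", "that", "the-other"])

# First match of any alternative; search() returns None when nothing matches.
pattern = re.compile("|".join(map(re.escape, ["this", "that", "the-other"])))
m = pattern.search(files)
first = m.start() if m else -1
```

`re.escape` is used here so that characters with a special meaning in a regex are matched literally.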


OTOH, you've hinted at an application that may not be
appropriate for multiple string searches. Instead, look
at building a dictionary or list of files -- they are more
easily searched and better suited for associating other
data such as file sizes, etc.
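
A rough sketch of the dictionary idea, assuming the goal is to look up whole file names rather than substrings. The names and sizes below are invented for illustration.

```python
# Hypothetical file names and sizes in bytes, for illustration only.
file_sizes = {"this.txt": 120, "that.txt": 2048, "the-other.txt": 7}

wanted = ["this.txt", "the-other.txt", "missing.txt"]

# Membership tests on a dict are exact-name lookups, not substring searches.
found = {name: file_sizes[name] for name in wanted if name in file_sizes}
missing = [name for name in wanted if name not in file_sizes]
```

Each lookup is independent of how many files are stored, and the associated data (here, the size) comes along with the match.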




Raymond Hettinger
 
Bengt Richter

hokiegal99 said:
How do I say:

x = string.find(files, 'this', 'that', 'the-other')

currently I have to write it like this to make it work:

x = string.find(files, 'this')
y = string.find(files, 'that')
z = string.find(files, 'the-other')

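You might try the re module, e.g.,

>>> import re
>>> rxo = re.compile('this|that|the-other')  # assumed pattern; the original setup lines are missing
>>> pos = 0
>>> while True:
...     m = rxo.search(' Find this or the-other or that and this.', pos)
...     if not m: break
...     print('%4s: %s' % (m.start(), m.group()))
...     pos = m.end()
...
   6: this
  14: the-other
  27: that
  36: this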

If some search strings have a common prefix, you'll have to put
the longest first in the regex, since re grabs the first match it sees.
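
A small sketch of that prefix issue, using made-up search strings that share the prefix "the". Ordering by length is one way to put the longest first; it is an assumption here, not something the post prescribes.

```python
import re

text = "then the theme"
words = ["the", "then", "theme"]  # hypothetical search strings sharing a prefix

# Alternatives are tried left to right, so "the" wins at every position.
naive = re.compile("|".join(map(re.escape, words)))
naive_hits = [m.group() for m in naive.finditer(text)]

# Listing longer strings first lets "then" and "theme" match in full.
ordered = sorted(words, key=len, reverse=True)
longest_first = re.compile("|".join(map(re.escape, ordered)))
longest_hits = [m.group() for m in longest_first.finditer(text)]
```

With the naive ordering every hit is "the"; with the longest-first ordering the hits are "then", "the" and "theme".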

Regards,
Bengt Richter
 
François Pinard

Fredrik Lundh said:
Francois Pinard wrote:
Given the above,

build_regexp(['this', 'that', 'the-other'])

yields the string 'th(?:is|at|e\\-other)', which one may choose to
`re.compile' before use.

the SRE compiler looks for common prefixes, so "th(?:is|at|e\\-other)" is
no different from "this|that|the-other" on the engine level.

Thanks for the note. So the `build_regexp' function is not useful after
all. It was indirectly written around a speed problem in the GNU regexp
engine, but seemingly, the Python regexp engine knows better already. As I
wrote earlier, I first saw Emacs Lisp `regexp-opt' used within `enscript'.

A speed comparison between both methods shows that they are fairly
equivalent. A small difference is that `build_regexp', given that one of
the words is a prefix of another, automatically recognises the longest one,
while a naive regexp of '|'.join(words) recognises whatever happens to be
listed first. Of course, this is easily solved by sorting, then reversing
the word list before producing the naive regexp.
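
The sort-then-reverse step can be sketched as follows. The helper name `naive_regexp` and the word list are hypothetical, and `re.escape` is added so that the hyphen and similar characters are matched literally; this is not the `build_regexp' function discussed above.

```python
import re

def naive_regexp(words):
    """Join words into one alternation, longer words before their prefixes."""
    # In reverse lexicographic order a word always precedes any of its prefixes.
    ordered = sorted(words, reverse=True)
    return "|".join(re.escape(w) for w in ordered)

words = ["this", "that", "the-other", "the"]  # hypothetical word list
pattern = re.compile(naive_regexp(words))
hits = [m.group() for m in pattern.finditer("the-other and the and this")]
```

Here "the-other" is listed ahead of its prefix "the", so it is matched whole at the start of the sample text.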
 
