Can I rely on...

Emanuele D'Arrigo · Mar 19, 2009

Sorry for the double-post, the first one was sent by mistake before
completion.

Hi everybody,

I just had a bit of a shiver for something I'm doing often in my code
but that might be based on a wrong assumption on my part. Take the
following code:

import re

pattern = "aPattern"

compiledPatterns = []
compiledPatterns.append(re.compile(pattern))

if re.compile(pattern) in compiledPatterns:
    print("The compiled pattern is stored.")

As you can see I'm effectively assuming that every time re.compile()
is called with the same input pattern it will return the exact same
object rather than a second, identical, object. In interactive tests
via the Python shell this seems to be the case, but... can I rely on it
-always- being the case?

If the answer is no, am I right to state that in the case portrayed
above the only way to be safe is to use the following code instead?

for item in compiledPatterns:
    if item.pattern == pattern:
        print("The compiled pattern is stored.")
        break

And what about any other function or class/method? Is there a way to
discriminate between methods and functions that when invoked twice
with the same arguments will return the same object and those that in
the same circumstances will return two identical objects? Or is it one
of those implementation-specific issues?

Manu

MRAB · Mar 19, 2009

Emanuele D'Arrigo wrote:
[snip]

If the answer is no, am I right to state that in the case portrayed
above the only way to be safe is to use the following code instead?

for item in compiledPatterns:
    if item.pattern == pattern:
        print("The compiled pattern is stored.")
        break

Correction to my last post: this isn't the same as using 'in'.

It should work, but remember that it compares only the pattern and not
any flags you might have used in the original re.compile().
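One way to sidestep both the identity question and the flags caveat is to key the stored patterns by the source string and flags rather than searching for the compiled object itself. The following is only a minimal sketch under that assumption; the names `compiled_patterns`, `get_compiled` and `is_stored` are hypothetical and not from the original code.

```python
import re

# Hypothetical registry: maps (pattern string, flags) to the compiled object.
compiled_patterns = {}


def get_compiled(pattern, flags=0):
    """Compile the pattern once per (pattern, flags) pair and reuse it."""
    key = (pattern, flags)
    if key not in compiled_patterns:
        compiled_patterns[key] = re.compile(pattern, flags)
    return compiled_patterns[key]


def is_stored(pattern, flags=0):
    """Membership test on the key, not on the compiled object."""
    return (pattern, flags) in compiled_patterns


get_compiled("aPattern")
if is_stored("aPattern"):
    print("The compiled pattern is stored.")
```

With this layout the membership test never depends on whether re.compile() hands back the same object twice, and the same pattern compiled with different flags is stored as a separate entry.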

Emanuele D'Arrigo · Mar 19, 2009

Thank you everybody for the informative replies.

I'll have to comb my code for all the instances of "item in sequence"
statement because I suspect some of them are as unsafe as my first
example. Oh well. One more lesson learned.

Thank you again.

Manu

R. David Murray · Mar 19, 2009

Emanuele D'Arrigo said:
Thank you everybody for the informative replies.

I'll have to comb my code for all the instances of "item in sequence"
statement because I suspect some of them are as unsafe as my first
example. Oh well. One more lesson learned.

You may have far fewer unsafe cases than you think, depending
on how you understood the answers you got, some of which
were a bit confusing. Just to make sure it is clear
what is going on in your example....

From the documentation of 'in':

x in s    True if an item of s is equal to x, else False

(http://docs.python.org/library/stdtypes.html#sequence-types-str-unicode-list-tuple-buffer-xrange)

Note the use of 'equal' there. So for lists and tuples,

if x in s: dosomething

is the same as

for item in s:
    if item == x:
        dosomething
        break

So:

>>> s = ['sdb*&', 'uuyh', 'foo']
>>> x = 'sdb*&'
>>> x is s[0]
False
>>> x in s
True

(I used a string with special characters in it to avoid Python's
interning of identifier-like strings so that x and s[0] would not be
the same object).
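The same point can be shown without depending on string interning at all. This is a small sketch with two hypothetical classes, `Version` and `Token`, that are not part of the thread: one defines `__eq__`, the other keeps the default comparison, which falls back to object identity.

```python
class Version:
    """Hypothetical class that defines equality by value."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return (self.major, self.minor) == (other.major, other.minor)

    def __hash__(self):
        return hash((self.major, self.minor))


class Token:
    """Hypothetical class with no __eq__: instances compare by identity."""

    def __init__(self, text):
        self.text = text


stored_versions = [Version(2, 6), Version(3, 0)]
probe = Version(3, 0)

# A distinct object that compares equal is found by 'in'.
probe_is_same_object = probe is stored_versions[1]
probe_found = probe in stored_versions

stored_tokens = [Token("foo")]

# Without __eq__, a look-alike object is not found, but the stored one is.
lookalike_found = Token("foo") in stored_tokens
original_found = stored_tokens[0] in stored_tokens

print(probe_is_same_object, probe_found, lookalike_found, original_found)
```

The `Token` case is the situation the regex example falls into whenever the objects involved do not define a value-based equality.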

Your problem with the regex example is that re makes no promise that
patterns compiled from the same source string will compare equal to
each other. Thus their _equality_ is not guaranteed. Switching to
using an equals comparison won't help you avoid your problem in
the example you showed.

Now, if you have a custom sequence type, 'in' and an '==' loop
might produce different results, since 'in' is evaluated by the special
method __contains__ if it exists (and list iteration with equality if
it doesn't). But the _intent_ of __contains__ is that comparison be
by equality, not object identity, so if the two are not the same something
weird is going on and there'd better be a good reason for it.
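To make the difference concrete, here is a sketch of the "something weird" case, using two hypothetical container classes, `Bag` and `IdentityBag`. `Bag` defines only `__iter__`, so 'in' falls back to iterating and comparing; `IdentityBag` overrides `__contains__` to test identity, which goes against the usual intent.

```python
class Bag:
    """Hypothetical container with no __contains__: 'in' iterates over it."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


class IdentityBag(Bag):
    """Hypothetical container whose __contains__ checks identity only."""

    def __contains__(self, x):
        return any(item is x for item in self._items)


a = [1, 2]
b = [1, 2]  # equal to a, but a different object

in_bag = b in Bag([a])                    # fallback iteration, equality
in_identity_bag = b in IdentityBag([a])   # custom __contains__, identity
same_object_in_identity_bag = a in IdentityBag([a])

# The explicit '==' loop over the same items agrees with Bag, not IdentityBag.
found_by_loop = any(item == b for item in IdentityBag([a]))

print(in_bag, in_identity_bag, same_object_in_identity_bag, found_by_loop)
```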

In summary, 'in' is the thing to use if you want to know if your
sample object is _equal to_ any of the objects in the container.
As long as equality is meaningful for the objects involved, there's
no reason to switch to a loop.
