Can I rely on...

Emanuele D'Arrigo

Sorry for the double-post, the first one was sent by mistake before
completion.

Hi everybody,

I just had a bit of a shiver for something I'm doing often in my code
but that might be based on a wrong assumption on my part. Take the
following code:

import re

pattern = "aPattern"

compiledPatterns = []
compiledPatterns.append(re.compile(pattern))

if re.compile(pattern) in compiledPatterns:
    print("The compiled pattern is stored.")

As you can see I'm effectively assuming that every time re.compile()
is called with the same input pattern it will return the exact same
object rather than a second, identical, object. In interactive tests
via python shell this seems to be the case but... can I rely on it
-always- being the case?

If the answer is no, am I right to state that in the case portrayed
above the only way to be safe is to use the following code instead?

for item in compiledPatterns:
    if item.pattern == pattern:
        print("The compiled pattern is stored.")
        break

And what about any other function or class/method? Is there a way to
discriminate between methods and functions that when invoked twice
with the same arguments will return the same object and those that in
the same circumstances will return two identical objects? Or is it one
of those implementation-specific issues?

Manu
 
MRAB

Emanuele D'Arrigo wrote:
[snip]
If the answer is no, am I right to state that in the case portrayed
above the only way to be safe is to use the following code instead?

for item in compiledPatterns:
    if item.pattern == pattern:
        print("The compiled pattern is stored.")
        break

Correction to my last post: this isn't the same as using 'in'.

It should work, but remember that it compares only the pattern and not
any flags you might have used in the original re.compile().
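
One way to take the flags into account as well, sketched here under the assumption that the patterns can live in a dict instead of a list: key the dict by the (pattern, flags) pair passed to re.compile(), so the lookup relies on tuple equality and never on comparing compiled objects. The names compiled_patterns and get_compiled are made up for this illustration.

```python
import re

# Hypothetical registry: maps (pattern source, flags) to the compiled object.
compiled_patterns = {}

def get_compiled(pattern, flags=0):
    """Return the stored compiled pattern for (pattern, flags), compiling on first use."""
    key = (pattern, int(flags))
    if key not in compiled_patterns:
        compiled_patterns[key] = re.compile(pattern, flags)
    return compiled_patterns[key]

first = get_compiled("aPattern")
second = get_compiled("aPattern")
case_insensitive = get_compiled("aPattern", re.IGNORECASE)

print(first is second)                       # same key, so the same stored object
print(first is case_insensitive)             # different flags, so a separate entry
print(("aPattern", 0) in compiled_patterns)  # membership test on the key, not the object
```

The key uses the flags as they were passed in rather than the compiled object's .flags attribute, because the latter may include flags that re adds on its own.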
 
Emanuele D'Arrigo

Thank you everybody for the informative replies.

I'll have to comb my code for all the instances of "item in sequence"
statement because I suspect some of them are as unsafe as my first
example. Oh well. One more lesson learned.

Thank you again.

Manu
 
R. David Murray

Emanuele D'Arrigo said:
Thank you everybody for the informative replies.

I'll have to comb my code for all the instances of "item in sequence"
statement because I suspect some of them are as unsafe as my first
example. Oh well. One more lesson learned.

You may have far fewer unsafe cases than you think, depending
on how you understood the answers you got, some of which
were a bit confusing. Just to make sure it is clear
what is going on in your example....
From the documentation of 'in':

x in s    True if an item of s is equal to x, else False

(http://docs.python.org/library/stdtypes.html#sequence-types-str-unicode-list-tuple-buffer-xrange)

Note the use of 'equal' there. So for lists and tuples,

if x in s:
    do_something()

is the same as

for item in s:
    if item == x:
        do_something()
        break

So:
>>> s = ['sdb*&', 'uuyh', 'foo']
>>> x = 'sdb*&'
>>> x is s[0]
False
>>> x in s
True

(I used a string with special characters in it to avoid Python's
interning of identifier-like strings so that x and s[0] would not be
the same object).
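
The same point can be illustrated with user-defined objects. The sketch below uses a made-up Point class that defines __eq__, and a made-up Plain class that does not, to show that 'in' on a list finds an equal object even when it is not the identical one, and that without a meaningful __eq__ equality falls back to identity.

```python
class Point:
    """Small value type, used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Keep hashing consistent with the equality defined above.
        return hash((self.x, self.y))


class Plain:
    """No __eq__ defined, so two instances are equal only if they are the same object."""


stored = [Point(1, 2), Point(3, 4)]
probe = Point(1, 2)

same_object = any(probe is item for item in stored)  # identity check
found = probe in stored                              # equality check via __eq__
plain_found = Plain() in [Plain()]                   # default equality is identity

print(same_object, found, plain_found)
```

Objects like Plain are the ones worth looking for when auditing "item in sequence" tests: the test is only as meaningful as the equality the type defines.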

Your problem with the regex example is that re makes no promise that
patterns compiled from the same source string will compare equal to
each other. Thus their _equality_ is not guaranteed. Switching to
using an equals comparison won't help you avoid your problem in
the example you showed.

Now, if you have a custom sequence type, 'in' and an '==' loop
might produce different results, since 'in' is evaluated by the special
method __contains__ if it exists (and list iteration with equality if
it doesn't). But the _intent_ of __contains__ is that comparison be
by equality, not object identity, so if the two are not the same something
weird is going on and there'd better be a good reason for it :)
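
As a rough sketch of how the two can diverge, here is a hypothetical container, CaseInsensitiveNames, whose __contains__ deliberately ignores case. It is invented for this example and is not taken from any library.

```python
class CaseInsensitiveNames:
    """Hypothetical container whose membership test ignores letter case."""

    def __init__(self, names):
        self._names = list(names)

    def __iter__(self):
        return iter(self._names)

    def __contains__(self, name):
        # 'in' is routed here instead of falling back to iteration with ==.
        wanted = name.casefold()
        return any(stored.casefold() == wanted for stored in self._names)


names = CaseInsensitiveNames(["Manu", "MRAB"])

via_in = "manu" in names                          # uses __contains__
via_loop = any(item == "manu" for item in names)  # plain equality on each item

print(via_in, via_loop)
```

Here the "good reason" is that the container defines its own notion of equality for names; a __contains__ that compared by identity instead would be the surprising case.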

In summary, 'in' is the thing to use if you want to know if your
sample object is _equal to_ any of the objects in the container.
As long as equality is meaningful for the objects involved, there's
no reason to switch to a loop.
 
