Set & Frozenset?

Hans Larsen

Could you help me?
How could I "take" an element from a set or a frozenset? .-)

From a string (or unicode, in Python < 3), a tuple, or
a list: element by index or slice.
From a dict: by key.
But what about a set or frozenset?

Hope somebody can help!
 
Diez B. Roggisch

Hans said:
Could you help me?
How could I "take" an element from a set or a frozenset? .-)

From a string (or unicode, in Python < 3), a tuple, or
a list: element by index or slice.
From a dict: by key.
But what about a set or frozenset?

Hope somebody can help!

You iterate over them. If you only want one value, use


iter(the_set).next()
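
For readers on a newer interpreter, here is a minimal sketch of the same idea. It assumes Python 2.6 or later, where the builtin next() exists (in Python 3 the .next() method was renamed __next__). The helper name take_one is hypothetical and only for illustration.

```python
def take_one(items, default=None):
    """Return an arbitrary element of a set or frozenset, or default if empty."""
    # next() with a second argument returns it instead of raising StopIteration.
    return next(iter(items), default)

the_set = frozenset(["spam"])
print(take_one(the_set))        # the only element there is
print(take_one(set(), "n/a"))   # empty set, so the default comes back
```

Which element comes back from a larger set is arbitrary, so this only suits cases where any member will do.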

Diez
 
Alan G Isaac

Diez said:
You iterate over them. If you only want one value, use
iter(the_set).next()


I recall a claim that

for result in myset: break

is the most efficient way to get one result.
Is this right? (It seems nearly the same.)
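
A small sketch of the loop form, with one addition that is not in the claim above: an else clause so that an empty set yields a default instead of leaving the name unbound. The function name first_by_loop is hypothetical.

```python
def first_by_loop(items, default=None):
    """Grab one element by starting a for loop and leaving it immediately."""
    for result in items:
        break                # result now holds the first element yielded
    else:
        result = default     # the loop body never ran: items was empty
    return result

print(first_by_loop({"only"}))
print(first_by_loop(frozenset(), "empty"))
```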

Alan Isaac
 
Matt Nordhoff

Alan said:
I recall a claim that

for result in myset: break

is the most efficient way to get one result.
Is this right? (It seems nearly the same.)

Alan Isaac

Checking Python 2.5 on Linux, your solution is much faster, but seeing
as they both come in under a microsecond, it hardly matters.
 
Lie Ryan

Matt said:
Checking Python 2.5 on Linux, your solution is much faster, but seeing
as they both come in under a microsecond, it hardly matters.


It's unexpected...
... myset=iter(myset)')
0.49165520000002516
...
0.32933007999997699

I'd never expect that for-loop assignment is even faster than a
precreated iter object (the second test)... but I don't think this
for-loop variable leaking behavior is guaranteed, is it?

Note: the second one exhausts the iter object.
 
Terry Reedy

Lie Ryan said:
I'd never expect that for-loop assignment is even faster than a
precreated iter object (the second test)... but I don't think this
for-loop variable leaking behavior is guaranteed, is it?

It is an intentional, documented feature:

"Names in the target list are not deleted when the loop is finished, but
if the sequence is empty, it will not have been assigned to at all by
the loop."
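
Both halves of that sentence can be seen in a short sketch. The function name last_target is hypothetical; inside a function, the "not assigned at all" case shows up as UnboundLocalError (a subclass of NameError).

```python
def last_target(iterable):
    """Return whatever the loop target is bound to after the loop ends."""
    for item in iterable:
        pass
    # item is still bound here, unless the iterable was empty.
    return item

print(last_target([1, 2, 3]))   # the target keeps the last value assigned

try:
    last_target([])
except UnboundLocalError as exc:
    print("empty iterable:", exc)
```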
 
Paul Rubin

Terry Reedy said:
It is an intentional, documented feature: ...

I prefer thinking of it as a documented bug. The list comprehension
case is fixed in 3.x (a plain for loop still leaves its target bound).
I usually avoid the [... for x in xiter] listcomp syntax in favor of
list(... for x in xiter) just as an effort to be a bit less bug-prone.
 
R. David Murray

Lie Ryan said:
It's unexpected...

0.32933007999997699

I'd never expect that for-loop assignment is even faster than a
precreated iter object (the second test)... but I don't think this
for-loop variable leaking behavior is guaranteed, is it?

My guess would be that what's controlling the timing here is
name lookup. Three in the first example, two in the second,
and one in the third.
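
The three/two/one count can be checked mechanically. This is only a rough proxy, assuming the three statements are the ones named later in the thread: it counts name loads and attribute accesses in the syntax tree and says nothing about how much each lookup costs. The helper count_lookups is hypothetical.

```python
import ast

def count_lookups(source):
    """Count name loads plus attribute accesses in a code snippet."""
    nodes = list(ast.walk(ast.parse(source)))
    # Names that are read (not assigned to), e.g. iter and myset.
    names = sum(
        isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load) for n in nodes
    )
    # Dotted accesses such as .next
    attrs = sum(isinstance(n, ast.Attribute) for n in nodes)
    return names + attrs

for snippet in (
    "res = iter(myset).next()",
    "res = myset.next()",
    "for res in myset: break",
):
    print(count_lookups(snippet), snippet)
```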
 
Lie Ryan

R. David Murray said:
My guess would be that what's controlling the timing here is
name lookup. Three in the first example, two in the second,
and one in the third.

You got it:
... myset=iter(myset).next')
0.26465903999999796


----------------------

The following is a complete benchmark (number=10000000 each):
'res=iter(myset).next()': 8.5145002000000432
'res=myset.next()' (setup ends with myset=iter(myset)): 4.5509802800000898
'for res in myset: break': 2.9994213600000421
'res=myset()' (setup ends with myset=iter(myset).next): 2.2228832400001011
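
For anyone who wants to rerun the comparison, here is a sketch using timeit under Python 3. The setup string is an assumption (the original one did not survive intact), next(iter(myset)) stands in for iter(myset).next(), and only the two statements that do not exhaust an iterator are included. Absolute numbers will differ by machine and interpreter.

```python
import timeit

# Assumed setup; the thread's own setup string is not recoverable.
SETUP = "myset = set(range(10))"

STATEMENTS = (
    "res = next(iter(myset))",    # Python 3 spelling of iter(myset).next()
    "for res in myset: break",
)

def run_benchmark(number=100000):
    """Return {statement: total seconds for `number` executions}."""
    return {
        stmt: timeit.timeit(stmt, setup=SETUP, number=number)
        for stmt in STATEMENTS
    }

for stmt, seconds in run_benchmark().items():
    print("%r: %.4f" % (stmt, seconds))
```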

----------------------
I also performed additional timing for overhead:

Local name lookup: 1.1086400799999865
Global name lookup: 1.8149410799999259
Attribute lookup (setup ends with myset=iter(myset)): 3.3011333999997987
Combined multiple name lookup that troubled the first test: 6.5599374800000305
Creating iterables: 4.259406719999788

----------------------
So adjusting the overheads:

Attribute lookup: 3.3011333999997987
The timing for attribute lookup also includes a local name lookup
(myset), so the real attribute lookup time should be:
3.3011333999997987 - 1.1086400799999865 = 2.1924933199998122

Creating iterables: 4.259406719999788
Creating an iterable involves a global name lookup, so the real time
should be:
4.259406719999788 - 1.8149410799999259 = 2.4444656399998621

----------------------
To summarize the adjusted overheads:

Local name lookup: 1.1086400799999865
Global name lookup: 1.8149410799999259
Attribute lookup: 2.1924933199998122
Creating iterables: 2.4444656399998621
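
The two subtractions behind this summary can be replayed directly. The figures are the ones quoted in this post; the variable names are only for illustration.

```python
# Raw overhead timings quoted in the post (seconds per 10**7 runs).
local_lookup = 1.1086400799999865
global_lookup = 1.8149410799999259
attribute_raw = 3.3011333999997987    # still includes one local lookup
iterable_raw = 4.259406719999788      # still includes one global lookup

# Remove the lookup that each raw measurement carries along.
attribute_lookup = attribute_raw - local_lookup
creating_iterables = iterable_raw - global_lookup

print("Attribute lookup:  ", attribute_lookup)
print("Creating iterables:", creating_iterables)
```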

----------------------
Back to the problem: now we adjust the timing of each statement:
'res=iter(myset).next()': 8.5145002000000432
Adjusting with the "Combined multiple name lookup"
8.5145002000000432 - 6.5599374800000305 = 1.9545627200000126
Another way to do the adjustment:
Adjusting global name lookup (iter):
8.5145002000000432 - 1.8149410799999259 = 6.6995591200001172
Adjusting iterable creation:
6.6995591200001172 - 2.4444656399998621 = 4.2550934800002551
Adjusting attribute lookup:
4.2550934800002551 - 2.1924933199998122 = 2.0626001600004429

'res=myset.next()': 4.5509802800000898
Adjusting with |unadjusted| attribute lookup:
4.5509802800000898 - 3.3011333999997987 = 1.2498468800002911
Another way to do the adjustment:
Adjusting with local name lookup:
4.5509802800000898 - 1.1086400799999865 = 3.4423402000001033
Adjusting with attribute lookup:
3.4423402000001033 - 2.1924933199998122 = 1.2498468800002911

'for res in myset: break': 2.9994213600000421
Adjusting for local name lookup (myset):
2.9994213600000421 - 1.1086400799999865 = 1.8907812800000556

'res=myset()': 2.2228832400001011
Adjusting for local name lookup
2.2228832400001011 - 1.1086400799999865 = 1.1142431600001146

----------------------

To summarize:
'res=iter(myset).next()': 1.9545627200000126 / 2.0626001600004429
'res=myset.next()': 1.2498468800002911 / 1.2498468800002911
'for res in myset: break': 1.8907812800000556
'res=myset()': 1.1142431600001146
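
The step-by-step adjustments can be written as one small calculation. This sketch reuses the numbers from this post and follows the "another way" variant for the first statement; the helper adjust is hypothetical.

```python
# Adjusted overheads from the summary earlier in this post.
LOCAL, GLOBAL = 1.1086400799999865, 1.8149410799999259
ATTRIBUTE, CREATE_ITER = 2.1924933199998122, 2.4444656399998621

def adjust(total, *overheads):
    """Subtract every listed overhead from a raw timing."""
    return total - sum(overheads)

adjusted = {
    "res=iter(myset).next()": adjust(8.5145002000000432,
                                     GLOBAL, CREATE_ITER, ATTRIBUTE),
    "res=myset.next()": adjust(4.5509802800000898, LOCAL, ATTRIBUTE),
    "for res in myset: break": adjust(2.9994213600000421, LOCAL),
    "res=myset()": adjust(2.2228832400001011, LOCAL),
}

for stmt, seconds in adjusted.items():
    print("%-26r %.4f" % (stmt, seconds))
```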

----------------------

To conclude, 'for res in myset: break' is actually not much faster than
'res=iter(myset).next()', except that the former saves a lot of name
lookups. The problem with 'res=iter(myset).next()' is too many name
lookups, plus the cost of creating the iter() object.

The fastest method is 'res=myset()', which eliminates the name lookups.
After all the overheads are subtracted it is roughly 1.7 to 1.9 times
as fast as the for loop and 'res=iter(myset).next()', though only about
12% faster than 'res=myset.next()'.
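
For reference, a sketch of the bound-method trick in Python 3 spelling, where the iterator method is __next__ rather than next. The name make_taker is hypothetical. Note that every call advances the same iterator, so repeated calls return different elements and finally raise StopIteration.

```python
def make_taker(items):
    """Return a zero-argument callable bound to one iterator over items."""
    # Looking the method up once here is what saves the per-call lookups.
    return iter(items).__next__

myset = {"a", "b", "c"}
take = make_taker(myset)

first = take()               # some element of myset
rest = {take(), take()}      # the remaining two elements
print(first, sorted(rest))
```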

DISCLAIMER: I cannot guarantee there aren't any mistakes.

PS: The results of the benchmark must be taken with a grain of salt. The
differences are only apparent after 10000000 (10**7) iterations, which
means a one-second difference in the totals is only a 10**-7 second
difference per iteration.
 
