frozenset question

W

Will McGugan

Hi,

Are there any benefits in using a frozenset over a set, other than it
being immutable?


Will McGugan
 
Q

Qiangning Hong

Hi,

Are there any benefits in using a frozenset over a set, other than it
being immutable?

A frozenset can be used as a key of a dict:

..>> s1 = set([1])
..>> s2 = frozenset([2])
..>> {s1: 1}
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: set objects are unhashable
..>> {s2:1}
{frozenset([2]): 1}
 
W

Will McGugan

Qiangning said:
A frozenset can be used as a key of a dict:

Thanks, but I meant to imply that.

I was wondering if frozenset was faster or more efficient in some way.
Thinking back to the dark ages of C++, you could optimize things that
you knew to be constant.


Will
 
S

Steven D'Aprano

I was wondering if frozenset was faster or more efficient in some way.

Thinking back to the dark ages of C++, you could optimize things that
you knew to be constant.

Why would you want to?

py> import sets
py> import time
py> bigset = sets.Set(range(500000))
py> bigimmutableset = sets.ImmutableSet(range(500000))
py> assert len(bigset) == len(bigimmutableset)
py>
py> def tester(S, L):
.... """Test if items from L are in S, and time the process."""
.... t = time.time()
.... for i in range(100):
.... for item in L:
.... item in S
.... return time.time() - t # time returned is for 100 loops
....
py>

Time some successful tests:

py> tester(bigset, range(100, 500))
0.11539506912231445
py> tester(bigimmutableset, range(100, 500))
0.12014198303222656

Practically no difference when doing 100*400 checks of whether an integer
is in the set. But let's try again, just in case:

py> tester(bigset, range(100, 500))
0.10998892784118652
py> tester(bigimmutableset, range(100, 500))
0.11114096641540527

The difference is insignificant. How about unsuccessful checks?

py> tester(bigset, range(-100, -500, -1))
0.12070298194885254
py> tester(bigset, range(-100, -500, -1))
0.11681413650512695
py> tester(bigimmutableset, range(-100, -500, -1))
0.11313891410827637
py> tester(bigimmutableset, range(-100, -500, -1))
0.11315703392028809

There is no significant speed difference between immutable and mutable
sets, at least for queries. Regardless of whether it is successful or
unsuccessful, mutable or immutable, it takes about 0.0000025 second to do
each test of item in set. Why would you need to optimize that?

If you tell us what you are trying to do, and under what circumstances it
is too slow, we'll see if we can suggest some ways to optimize it.

But if you are just trying to optimize for the sake of optimization,
that's a terrible idea. Get your program working first. Then when it
works, measure how fast it runs. If, and ONLY if, it is too slow,
identify the parts of the program that make it too slow. That means
profiling and timing. Then optimize those parts, and nothing else.

Otherwise, you will be like the car designer trying to speed up his sports
cars by making the seatbelts aerodynamic.
 
M

Michael Hudson

Will McGugan said:
Thanks, but I meant to imply that.

I was wondering if frozenset was faster or more efficient in some
way.

No, the 'usable as a dict key' is the main motivation for frozenset's
existence.

Cheers,
mwh
 
W

Will McGugan

Steven said:
There is no significant speed difference between immutable and mutable
sets, at least for queries. Regardless of whether it is successful or
unsuccessful, mutable or immutable, it takes about 0.0000025 second to do
each test of item in set. Why would you need to optimize that?

If you tell us what you are trying to do, and under what circumstances it
is too slow, we'll see if we can suggest some ways to optimize it.

But if you are just trying to optimize for the sake of optimization,
that's a terrible idea. Get your program working first. Then when it
works, measure how fast it runs. If, and ONLY if, it is too slow,
identify the parts of the program that make it too slow. That means
profiling and timing. Then optimize those parts, and nothing else.

Otherwise, you will be like the car designer trying to speed up his sports
cars by making the seatbelts aerodynamic.

No need for the 'premature optimization is the root of all evil' speech.
I'm not trying to optimize anything - just enquiring about the nature of
frozenset. If typing 'frozenset' over 'set' gave me a saving in time or
memory (even if tiny) I would favour immutable sets, where appropriate.

Thanks for the info.

Will
 
R

Raymond Hettinger

Will said:
Are there any benefits in using a frozenset over a set, other than it
being immutable?

No. The underlying implementation is identical with set. The only
difference is the addition of a hash method and absence of mutating
methods. Everything else is the same.


Raymond
 
S

Steven D'Aprano

No need for the 'premature optimization is the root of all evil' speech.
I'm not trying to optimize anything - just enquiring about the nature of
frozenset. If typing 'frozenset' over 'set' gave me a saving in time or
memory (even if tiny) I would favour immutable sets, where appropriate.

Well, obviously the "premature optimization" speech just went in one ear
and out the other. "Saving in time or memory" is what optimization is
about. What did you think optimization means?

set and frozenset have different functionality (one is mutable, the other
is not) and use cases (you use one where you need to dynamically add and
remove items from a set, and the other where you need to use a set as a
key in a dictionary). In most cases, they aren't interchangable. While
you're spending time worrying about shaving a thousandth of a millisecond
off a program that takes five seconds to run, I'll get on with development
using the right object for the functionality needed.
 
G

George Sakkis

Steven D'Aprano said:
Well, obviously the "premature optimization" speech just went in one ear
and out the other. "Saving in time or memory" is what optimization is
about. What did you think optimization means?

set and frozenset have different functionality (one is mutable, the other
is not) and use cases (you use one where you need to dynamically add and
remove items from a set, and the other where you need to use a set as a
key in a dictionary). In most cases, they aren't interchangable.

Well, they *may* be interchangable under some conditions, and that was
the OP's point you apparently missed. If you don't need to dynamically
modify a set and don't add sets as keys in dictionaries, you currently
have a choice; both frozenset and set will work. So all other things
being equal, it makes sense to ask about the relative performance if
the only 'evil' it may introduce can be rectified in a single import.

George
 
S

Steven D'Aprano

Well, they *may* be interchangable under some conditions, and that was
the OP's point you apparently missed.

I didn't miss anything of the sort. That's why I spent 15 minutes actually
producing test cases to MEASURE if there was any detectable speed
differences between accessing set and frozenset instead of wasting time
worrying about "tiny" differences in performance that are almost certainly
lost in the noise in real code.

If, over a thousand runs of the program, you save a millisecond of time in
total, but it costs you two seconds to type the comment in the code
explaining why you used frozenset instead of the more natural set, then
your "optimization" is counter-productive. Even just considering the
question is a waste of valuable developer time! THAT is the lesson of
premature optimization: don't even THINK about optimizing code until you
have it WORKING and you have MEASURED that it is too slow.
 
G

George Sakkis

Steven D'Aprano said:
I didn't miss anything of the sort. That's why I spent 15 minutes actually
producing test cases to MEASURE if there was any detectable speed
differences between accessing set and frozenset instead of wasting time
worrying about "tiny" differences in performance that are almost certainly
lost in the noise in real code.

If, over a thousand runs of the program, you save a millisecond of time in
total, but it costs you two seconds to type the comment in the code
explaining why you used frozenset instead of the more natural set, then
your "optimization" is counter-productive. Even just considering the
question is a waste of valuable developer time! THAT is the lesson of
premature optimization: don't even THINK about optimizing code until you
have it WORKING and you have MEASURED that it is too slow.

I fully agree with your last sentence, first profile and then consider
if it's worth optimizing. Still, you either missed the point again or
you are arguing for the sake of the argument. Neither me nor the OP
claimed that it's worthwhile saving a millisecond or two. The OP hadn't
done the timing measurements you did, so he didn't know in advance if
the difference is 1 millisecond or 500. After you replied with evidence
that it's not worth it, he made clear that he was just enquiring, not
optimizing prematurely as you took for granted and ranted against it
from the beginning.

Hope it's clear now,

George
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,754
Messages
2,569,527
Members
44,998
Latest member
MarissaEub

Latest Threads

Top