Sets in C++

A

Aaron Gray

I want to represent a set in C++ that is indexed by number. I need to be
able to do unions and stuff between two existing sets forming a new set. I
have looked at 'bitset' but its a bit lame then theres vector<bool> but I
cannot find ant good documentation on either.

Help and suggestions most welcome,

Aaron
 
K

Kai-Uwe Bux

Aaron said:
I want to represent a set in C++ that is indexed by number.

Indexed? Do you mean that the elements of the set are integers?
I need to be
able to do unions and stuff between two existing sets forming a new set. I
have looked at 'bitset' but its a bit lame then theres vector<bool> but I
cannot find ant good documentation on either.

The standard documents both.
Help and suggestions most welcome,

There is a fundamental difference between sets of (comparatively small)
known maximum size and sets of arbitrary large size. The container
std::bitset<> is for the former, the container std::vector<bool> is of
questionable value, and std::set<int> is the standard choice when the size
is unknown. You could also use std::vector<int> (or std::deque<int>) and
keep the sequence sorted. It is easy to use binary_search<> to test for
elements and you can find the union or intersection of two such sets in
time linear in the total number of elements (probably best possible). The
algorithms std::set_union, std::set_difference, std::intersection, and
std::set_symmetric_difference make that fairly easy. Complements are more
tricky with all dynamic data structures.

I would recommend devising a class int_set with a minimal interface (just
enough for your application) and a straight forward initial implementation.
Make sure that it works by writing a bunch of tests. Then, you can replace
the implementation (i.e., replace the internal data structure) and try to
make it more efficient if a profiler shows the need. Without further
information, it is impossible to tell a priory which data structure is best
for your application.


Best

Kai-Uwe Bux
 
E

Erik Wikström

std::set ?

And the std::set_* functions from <algorithm>.

To find documentation try googling for 'bitset c++'. If you use a bitset
(I would not use vector<bool> since it is kind of special) the set
operations can be performed using bit operations, or is the union, and
is the intersection, xor the symmetric difference.
 
J

James Kanze

Indexed? Do you mean that the elements of the set are
integers?
The standard documents both.

But finding where isn't always that obvious:).
There is a fundamental difference between sets of
(comparatively small) known maximum size and sets of arbitrary
large size. The container std::bitset<> is for the former,
the container std::vector<bool> is of questionable value, and
std::set<int> is the standard choice when the size is unknown.

I think it's just a wording issue, and maybe I'm
misunderstanding your English, but don't you mean that the
difference depends on the cardinality of universe over which the
sets are formed. Even if the maximum size of the set is 10,
std::bitset<> won't work without a lot of external support if it
is a set of cities in the world.

Generally, if the universe is that of small natural numbers, or
can easily be mapped to small natural numbers (countries in the
European Union, for example), then something like std::bitset<>
is appropriate. Note, however, that this requires that the
upper bound be known at compile time. I still use my
pre-standard BitVector and DynamicBitVector classes (which
support the set operations as overloaded operators); the latter
has no fixed upper bound, but the internal implementation is
still that of a bit vector. I find it useful for things like
parser states, etc. where realistically, you have a small set of
natural numbers, but you don't know the actual upper bound until
you're finished. For more general use, of course, either
std::set or a sorted std::vector are the answer. (I find my use
split more or less evenly between the two.)
 
K

Kai-Uwe Bux

James said:
But finding where isn't always that obvious:).



I think it's just a wording issue, and maybe I'm
misunderstanding your English, but don't you mean that the
difference depends on the cardinality of universe over which the
sets are formed.

I guess that is what I wrote. But I meant just what you are saying latter:
the universe should map easily to small numbers and the bound on those
numbers has to be known at compile time.
Even if the maximum size of the set is 10,
std::bitset<> won't work without a lot of external support if it
is a set of cities in the world.

Nice example. It illustrates that the universe is what counts.
Generally, if the universe is that of small natural numbers, or
can easily be mapped to small natural numbers (countries in the
European Union, for example), then something like std::bitset<>
is appropriate. Note, however, that this requires that the
upper bound be known at compile time.

Right.


[snip]

Best

Kai-Uwe Bux
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,797
Messages
2,569,647
Members
45,380
Latest member
LatonyaEde

Latest Threads

Top