map or set for handling struct with a key?

Digital Puer · Aug 8, 2007

Hi, I was wondering whether to use a map or a set for
my purposes.

I am reading rows from a database and am populating a
struct (or class) for each row. It has a key and ancillary
data, e.g.:

struct Foo {
    int key;
    float data1;
    int data2;
    string data3;
};

Now I want to keep all the rows in either a map or a set.
If I use a set, I can keep the struct the way it is; the only
thing I need is to provide a comparison function for the set
so that the ordering is based on the key.

On the other hand, if I use a map, I am inclined to break
the key out and use the key as the first part in the map,
e.g.:

struct Foo {
    float data1;
    int data2;
    string data3;
};
map<int, Foo> mydata;

Which approach is preferable for the following cases:

1. My only operations on the data structure are to iterate
over it and to use find().

2. I will iterate over as well as update the data.
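For reference, a minimal sketch of the two layouts being compared, assuming C++11 or later. The names FooWithKey, ByKey, FooData, find_in_set and find_in_map are hypothetical and exist only to let both versions sit side by side; the question itself calls both structs Foo.

```cpp
#include <map>
#include <set>
#include <string>

// Option A: keep the key inside the struct and order a std::set by it.
struct FooWithKey {
    int key;
    float data1;
    int data2;
    std::string data3;
};

// Comparison functor (hypothetical name): orders rows by key only.
struct ByKey {
    bool operator()(const FooWithKey& a, const FooWithKey& b) const {
        return a.key < b.key;
    }
};

// Option B: break the key out and let std::map hold it.
struct FooData {
    float data1;
    int data2;
    std::string data3;
};

// Look up data2 by key in the set. std::set::find takes a whole element,
// so a probe object carrying only the key is built first.
// Returns -1 if the key is absent (an arbitrary sentinel for this sketch).
int find_in_set(const std::set<FooWithKey, ByKey>& rows, int key) {
    FooWithKey probe{key, 0.0f, 0, ""};
    auto it = rows.find(probe);
    return it == rows.end() ? -1 : it->data2;
}

// Same lookup against the map; the key is passed directly.
int find_in_map(const std::map<int, FooData>& rows, int key) {
    auto it = rows.find(key);
    return it == rows.end() ? -1 : it->second.data2;
}
```

Both lookups are logarithmic; the visible difference is that the set version needs a dummy Foo to search with, while the map version searches by the int alone.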

Victor Bazarov · Aug 8, 2007

If you don't update the keys, ever, then you're better off with
a sorted vector/deque.

V

PicO · Aug 8, 2007

To use a set to sort objects, you need to provide a comparison function
or a comparison operator such as <.

So I suggest you add a bool operator to the struct:

struct Foo {
    int key;
    float data1;
    int data2;
    string data3;

    // Order Foo objects by key only.
    bool operator<(const Foo& x) const
    {
        return key < x.key;
    }
};

That operator will make the set sort these objects according to the
key.

Digital Puer · Aug 8, 2007

Yes, I already understood that part. I mentioned that in
my first post. My question pertains to choosing between a map
or a set.

Digital Puer · Aug 8, 2007

If you don't update the keys, ever, then you're better off with
a sorted vector/deque.

V

Please explain why a sorted vector or deque is better than
map or set. A find() will be O(lg n) with any of them.

redfloyd · Aug 8, 2007

Disregarding whether vector/deque may be more suitable (sorry,
Victor), your second criterion, namely the updating of the data, would
point to a map.
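A minimal sketch of why updating points to a map, assuming C++11 or later: the mapped value can be modified in place, both during iteration and after find(), because only the key part of each entry is const. The helper names bump_all and set_data3 are hypothetical.

```cpp
#include <map>
#include <string>

struct Foo {
    float data1;
    int data2;
    std::string data3;
};

// Add `delta` to data2 of every row. entry.first (the key) is const,
// but entry.second can be changed freely while iterating.
void bump_all(std::map<int, Foo>& rows, int delta) {
    for (auto& entry : rows) {
        entry.second.data2 += delta;
    }
}

// Update a single row found by key. Returns false if the key is absent.
bool set_data3(std::map<int, Foo>& rows, int key, const std::string& text) {
    auto it = rows.find(key);
    if (it == rows.end()) {
        return false;
    }
    it->second.data3 = text;
    return true;
}
```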

PicO · Aug 8, 2007

Please explain why a sorted vector or deque is better than
map or set. A find() will be O(lg n) with any of them.

If you are going to update data after insertion, you have to use a map,
not a set: you can't update data in the set, since its elements can't
be assigned to, while a map lets you update the "value" (data1, data2
and data3).

I suggest a map, as searching is easier. I don't suggest a vector, as
erase and insertion will take a long time (O(n) for each erase).

Mark P · Aug 8, 2007

Digital said:
Please explain why a sorted vector or deque is better than
map or set. A find() will be O(lg n) with any of them.

In general, it will be faster to read all of the data into a vector and
sort it once rather than reading the data into a set or map and sorting
"online". A vector will also use less space in general.
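A minimal sketch of the load-then-sort-once approach, assuming C++11 or later. The helper names sort_by_key and find_by_key are hypothetical; the lookup uses std::lower_bound, which needs the vector to already be sorted by key.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Foo {
    int key;
    float data1;
    int data2;
    std::string data3;
};

// Sort once, after all rows have been read into the vector.
void sort_by_key(std::vector<Foo>& rows) {
    std::sort(rows.begin(), rows.end(),
              [](const Foo& a, const Foo& b) { return a.key < b.key; });
}

// Binary search by key. Precondition: rows is sorted by key.
// Returns a pointer to the matching row, or nullptr if there is none.
const Foo* find_by_key(const std::vector<Foo>& rows, int key) {
    auto it = std::lower_bound(
        rows.begin(), rows.end(), key,
        [](const Foo& row, int k) { return row.key < k; });
    return (it != rows.end() && it->key == key) ? &*it : nullptr;
}
```

The lookup is still O(log n), as with a map or set, but the rows sit in one contiguous block with no per-node overhead.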

Pete Becker · Aug 8, 2007

If you are going to update data after insertion, you have to use a map,
not a set: you can't update data in the set, since its elements can't
be assigned to, while a map lets you update the "value" (data1, data2
and data3).

You can update an element in a set by removing it and re-inserting it.

PicO · Aug 8, 2007

You can update an element in a set by removing it and re-inserting it.

ya ya you can but it's not efficient ... you have ( n log n ) cost for
every time you do this ....

Victor Bazarov · Aug 8, 2007

PicO said:
ya ya you can but it's not efficient ... you have ( n log n ) cost for
every time you do this ....

Ya ya... Nope. Record the iterator you expect to delete, increment
and store the incremented one, remove using the iterator, insert again
with the incremented one as the hint. Should be constant time.

V
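A minimal sketch of the erase-and-reinsert-with-hint technique described above, assuming C++11 or later, where a hinted insert is amortized constant when the new element lands just before the hint. The names ByKey, FooSet and replace_at are hypothetical. The sketch assumes the caller already holds an iterator to the element (for example while iterating); finding it by key first would still cost O(log n).

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <string>

struct Foo {
    int key;
    float data1;
    int data2;
    std::string data3;
};

// Orders rows by key only, so the other fields do not affect position.
struct ByKey {
    bool operator()(const Foo& a, const Foo& b) const { return a.key < b.key; }
};

using FooSet = std::set<Foo, ByKey>;

// Replace the element at `it` with `updated`, which must carry the same key.
// Returns an iterator to the reinserted element.
FooSet::iterator replace_at(FooSet& rows, FooSet::iterator it, const Foo& updated) {
    assert(updated.key == it->key);      // the sort position must not change
    FooSet::iterator hint = std::next(it);  // successor stays valid after erase
    rows.erase(it);                      // erase by iterator, no search needed
    return rows.insert(hint, updated);   // new element goes just before hint
}
```

A typical use would copy the element (`Foo changed = *it;`), modify the non-key fields of the copy, and pass both to replace_at.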

tony_in_da_uk · Aug 8, 2007

[ should I have a set of...]
struct Foo {
    int key;
    float data1;
    int data2;
    string data3;
};
[...or a map of int keys to...]
struct Foo {
    float data1;
    int data2;
    string data3;
};

Which approach is preferable for the following cases:

1. My only operations on the data structure are to iterate
over it and to use find().

2. I will iterate over as well as update the data.

In general, you should use a map when part of the struct data forms
the key, and a set when the entire data is the key. Whatever operations
your program starts off doing, programs evolve, and choosing a natural
match for your data is a better approach than choosing something
that's perhaps slightly better for the usage you start off with, then
having to rewrite everything.

The fact that you want to use find suggests an associative container
is simply a more natural fit than, say, a sorted vector and a binary
search algorithm, and unless profiling shows you need fast population
or iteration, there's no reason to consider the latter.

Tony

PicO · Aug 9, 2007

Ya ya... Nope. Record the iterator you expect to delete, increment
and store the incremented one, remove using the iterator, insert again
with the incremented one as the hint. Should be constant time.

V

of course it'll not be a constant time .. for every set insertion
it'll cost ( log n ) ..

Juha Nieminen · Aug 9, 2007

PicO said:
ya ya you can but it's not efficient ... you have ( n log n ) cost for
every time you do this ....

Removing a value from a set is O(log n). Inserting a value in a set is
O(log n). Where do you get O(n log n)?

Victor Bazarov · Aug 9, 2007

PicO said:
of course it'll not be a constant time .. for every set insertion
it'll cost ( log n ) ..

You just like to disagree, don't you?

If you don't change the data affecting the sorting order (but only the
other, "irrelevant" data), and then reinsert it in the same place, two
comparisons should be made and that's all. If you don't believe me,
try it yourself.

V

joe · Aug 9, 2007

Please explain why a sorted vector or deque is better than
map or set. A find() will be O(lg n) with any of them.

What people often forget in O() notation is that there is a
coefficient k which goes in front of it and reflects the cost of the
overhead for specific implementations of an algorithm. This value can
be important. A vector has the following benefits over a map/set:

1) Much less overhead per item. A map/set is usually a red/black tree
and each node has left, right, and parent pointers plus data for the
item.

2) Fewer allocations. Each item in a map/set is kept as an
individually allocated node whereas a vector is usually allocated in
chunks which can hold several nodes.

3) Quicker movement from one item to the next. After a comparison to
get to the next node, a pointer has to be read from memory, with a
vector you often just add two registers together.

4) Locality of reference. Since vectors are allocated in large chunks
of contiguous memory, you are much likelier to get a cache hit while
navigating through the vector than you would in navigating a set/map.

Maps/sets have the following benefits over a vector:

1) Existing iterators don't get invalidated by inserts and only the
iterators directly involved get invalidated by deletes.

2) Vectors may possibly require a lot of data shifting during inserts/
deletes.

3) More natural interface. You have to use the std algorithms to
manipulate the vectors whereas the map/set containers have them built-
in.

These are the kinds of factors that often make a sorted vector better
than a map or set if the data is mostly static. How much the data
changes is where the trade off occurs. If you have insertions and
deletions occurring frequently, then the cost of constantly keeping
the vector sorted will make a map or set more attractive. If
insertions/deletions can be batched and/or kept infrequent then a
vector is generally faster.

joe
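A minimal sketch of the batched-insert case mentioned above, assuming C++11 or later: rather than inserting rows one at a time into a sorted vector (each insert shifting data), the new rows are sorted separately and merged in one pass. The name add_batch is hypothetical, and std::inplace_merge is just one way to do the merge.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct Foo {
    int key;
    float data1;
    int data2;
    std::string data3;
};

// Merge a batch of new rows into a vector that is already sorted by key.
// Precondition: sorted_rows is sorted by key. The batch may be in any order.
void add_batch(std::vector<Foo>& sorted_rows, std::vector<Foo> batch) {
    auto by_key = [](const Foo& a, const Foo& b) { return a.key < b.key; };
    std::sort(batch.begin(), batch.end(), by_key);

    // Append the sorted batch, then merge the two sorted halves in place.
    const std::ptrdiff_t old_size =
        static_cast<std::ptrdiff_t>(sorted_rows.size());
    sorted_rows.insert(sorted_rows.end(), batch.begin(), batch.end());
    std::inplace_merge(sorted_rows.begin(),
                       sorted_rows.begin() + old_size,
                       sorted_rows.end(), by_key);
}
```

Note that the append can reallocate the vector, so any iterators or pointers into it are invalidated, which is the first map/set benefit listed above.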

Pete Becker · Aug 9, 2007

ya ya you can but it's not efficient ... you have ( n log n ) cost for
every time you do this ....

Surely that's O(log n). Even so, "you can't update data in the set" is
wrong. And without knowing how often this is going to happen, you
simply can't rule it out categorically.

Pete Becker · Aug 9, 2007

2) Fewer allocations. Each item in a map/set is kept as an
individually allocated node whereas a vector is usually allocated in
chunks which can hold several nodes.

Just a slight correction: a vector is allocated in a single chunk,
which holds all of the stored objects. A deque can be allocated in
multiple chunks.

James Kanze · Aug 9, 2007

On 2007-08-08 21:49:04 -0400, PicO said:

Surely that's O(log n). Even so, "you can't update data in the set" is
wrong. And without knowing how often this is going to happen, you
simply can't rule it out categorically.

And as Victor pointed out, you can save the iterator behind the
one used to erase, so the insert can be constant time.

In fact, the constant factor involved in insert can be fairly
high, since there will typically be an allocation. On the other
hand, I've done exactly this in one very time critical
application, with no real problems. As you say, it depends on
how often you are doing this, and what else is going on.

PicO · Aug 9, 2007

Removing a value from a set is O(log n). Inserting a value in a set is
O(log n). Where do you get O(n log n)?

I meant (log n) for each one, so for n of them it'll be (n log n), and
I corrected it later to log n, if you read my reply. But of course it
won't be constant like he says.
