STL list or map?

Mike Copeland · May 29, 2011

I have the following structure defined, and I'm trying to decide the
best container to use for accessing specific elements: "best" meaning
the fastest way. I will be searching on the "sBibNo" field, and it will
likely contain values in a high range (20000-50000). The elements
stored will not be contiguous. I can't allocate an array of elements
because doing so would waste a lot of memory space...so I can't use a
direct index to find the elements.
I'm thinking I could use either an STL list or map for this
application, but I fear the list processing would be prohibitively slow
and the map might be quite wasteful in other ways. The actual number of
elements used in this application probably won't exceed 3000.
Please advise. TIA

struct SHADOWS
{
char sRCode, aRCode;
int sBibNo, aBibNo;
}

Stefan Ram · May 29, 2011

Please advise.

Hide the decision behind an interface and then implement
this first focussing only on correctness and readability.

Kai-Uwe Bux · May 29, 2011

Mike said:
I have the following structure defined, and I'm trying to decide the
best container to use for accessing specific elements: "best" meaning
the fastest way. I will be searching on the "sBibNo" field, and it will
likely contain values in a high range (20000-50000). The elements
stored will not be contiguous. I can't allocate an array of elements
because doing so would waste a lot of memory space...so I can't use a
direct index to find the elements.
I'm thinking I could use either an STL list or map for this
application, but I fear the list processing would be prohibitively slow
and the map might be quite wasteful in other ways.

Memory: A list node contains the data and typically two pointers. Very
likely, the underlying allocator is also putting in a hidder pointer. That
is three pointers in addition to your data. A map node typically contains
three pointers and the hidden pointer in addition to the data. Taking into
account that the allocator is likely to round sizes to multiples of 16 or 32
bytes, you are somewhat unlikely to see a difference.

Cost of iteration: incremening / decrementing an iterator is constant time
for a list and ammortized constant time for a map.

The actual number of
elements used in this application probably won't exceed 3000.
Please advise. TIA

[...]

Elsethread, the suggestion was made to hide this decision as an
implementation detail of some class. That is a good idea.

If you choose a list representation, then a lot will depend on which pattern
of searches predominates in your application. E.g., if there is a
comparatively small set of keys that you will search for very often, then a
"self organizing file" might be a good idea: you always search from front to
tail and when you dfind a key, you move the entry to the head of the list.
This way, frequently searched entries gather near the front and will be
found quickly.

If your searches are more or less evenly distributed among the keys, a
better data structure would probably improve performance of searching. The
tradeoff is usually a slightly greater complexity for inserting an entry.
Whether changing the data structures pays off, therefore, also depends on
whether your application will search way more often than it will modify the
underlying data.

As the influencing factors are hard to predict, you should write your
program so that it will be easy to change the data structure. Then,
implement one that you can easily prove correct. Create unit tests. When you
change the data structure, those tests will catch bugs. Once the other data
structure is up and running, you can compare performance.

Best,

Kai-Uwe Bux

Rune Allnor · May 30, 2011

I have the following structure defined, and I'm trying to decide the
best container to use for accessing specific elements: "best" meaning
the fastest way. I will be searching on the "sBibNo" field, and it will
likely contain values in a high range (20000-50000). The elements
stored will not be contiguous. I can't allocate an array of elements
because doing so would waste a lot of memory space...so I can't use a
direct index to find the elements.
I'm thinking I could use either an STL list or map for this
application, but I fear the list processing would be prohibitively slow
and the map might be quite wasteful in other ways. The actual number of
elements used in this application probably won't exceed 3000.

That's the trade-off you are faced with: Memory vs speed.

If you want to search inside the container, forget the list.
Lists can only be searched sequentially. Maps probably give
the fastest search results, at the expense of some internal
data structures that require memory.

Rune

Joshua Maurice · May 30, 2011

I have the following structure defined, and I'm trying to decide the
best container to use for accessing specific elements: "best" meaning
the fastest way. I will be searching on the "sBibNo" field, and it will
likely contain values in a high range (20000-50000). The elements
stored will not be contiguous. I can't allocate an array of elements
because doing so would waste a lot of memory space...so I can't use a
direct index to find the elements.
I'm thinking I could use either an STL list or map for this
application, but I fear the list processing would be prohibitively slow
and the map might be quite wasteful in other ways. The actual number of
elements used in this application probably won't exceed 3000.
Please advise. TIA

struct SHADOWS
{
char sRCode, aRCode;
int sBibNo, aBibNo;
}

How often do you insert, and how often do you do finds? Specifically
the relative frequencies.

std::list is /usually/ the wrong answer to any performance problem. If
you're doing lookups and you're concerned about memory, it is
definitely the wrong answer. std::map would be better.

std::set, equivalently std::map, is the good default choice.

If the number of finds greatly outweigh the number of inserts, a
std::vector kept in sorted order using binary searches is probably the
best approach for performance.

Alternatively, maybe a B-tree with the right branching factor.
http://en.wikipedia.org/wiki/B-tree
I don't know offhand if this is a good idea. I would implement and
measure.

Specifically, if your problem is that important, I would implement and
measure all reasonable alternatives.

Michael Doubez · May 30, 2011

I have the following structure defined, and I'm trying to decide the
best container to use for accessing specific elements: "best" meaning
the fastest way. I will be searching on the "sBibNo" field, and it will
likely contain values in a high range (20000-50000). The elements
stored will not be contiguous. I can't allocate an array of elements
because doing so would waste a lot of memory space...so I can't use a
direct index to find the elements.
I'm thinking I could use either an STL list or map for this
application, but I fear the list processing would be prohibitively slow
and the map might be quite wasteful in other ways. The actual number of
elements used in this application probably won't exceed 3000.
Please advise. TIA

struct SHADOWS
{
char sRCode, aRCode;
int sBibNo, aBibNo;
};

In addition to what others said, you can also consider unorder_map
(hash_map extension on some compilers) but it depends on the
distribution of sBibNo or if you can find a good hash function.

Juha Nieminen · May 30, 2011

Mike Copeland said:
I have the following structure defined, and I'm trying to decide the
best container to use for accessing specific elements: "best" meaning
the fastest way. I will be searching on the "sBibNo" field, and it will
likely contain values in a high range (20000-50000). The elements
stored will not be contiguous. I can't allocate an array of elements
because doing so would waste a lot of memory space...so I can't use a
direct index to find the elements.

Is the container initialized once at the start of the program and then
never modified again (fast reading being the only operation done to it
for the rest of the program)?

If the answer is yes, then the most efficient way is to put the elements
in a vector, sort it, and then use binary searching when you need to read
an element. The binary searching can be done with a std::lower_bound() call.

If the container needs to be updated (elements added or removed) during
the execution of the program, and this updating must be as fast as possible,
then this solution won't work. (In that case you need to either use std::set
or std::unordered_set.)

Mike Copeland · May 30, 2011

I have the following structure defined, and I'm trying to decide the

Is the container initialized once at the start of the program and then
never modified again (fast reading being the only operation done to it
for the rest of the program)?

Yes. Furthermore, the (many) searches will attempt to find virtually
all of the elements in the container - which is why I'm most interested
in speed of searching.

If the answer is yes, then the most efficient way is to put the elements
in a vector, sort it, and then use binary searching when you need to read
an element. The binary searching can be done with a std::lower_bound() call.

That seems to be most worth exploring. Thanks for the thorough
answer.

If the container needs to be updated (elements added or removed) during
the execution of the program, and this updating must be as fast as possible,
then this solution won't work. (In that case you need to either use std::set
or std::unordered_set.)

This has no bearing on my application. Data is stored once (at the
program's start) and there will be many accesses of all elements (as I
say above).
Thanks to all who responded!

Jeff Flinn · May 31, 2011

Mike said:
Yes. Furthermore, the (many) searches will attempt to find virtually
all of the elements in the container - which is why I'm most interested
in speed of searching.

That seems to be most worth exploring. Thanks for the thorough
answer.

This has no bearing on my application. Data is stored once (at the
program's start) and there will be many accesses of all elements (as I
say above).
Thanks to all who responded!

Sounds like a sorted vector and using upper/lower_bound would be the
best bet.

Jeff

MikeP · Jun 6, 2011

Kai-Uwe Bux said:
Elsethread, the suggestion was made to hide this decision as an
implementation detail of some class. That is a good idea.

I don't think that it is. This is the perfect opportunity for the OP to
learn about applying the correct container/algo by taking the required
amount of time to do so. Unnecessary abstraction just delays the decision
and gives the application unnatural semantics and stifles the
programmer's level of capability. Use the right tool for the job. Don't
just use a hammer for expediency. The way to proceed is in recognizing
the application's required semantics first. Second, pick the correct
container/algo. If need be (and usually it need not be), perform
optimization (time and/or space and/or size) steps (but do avoid this
unless
it is REALLY necessary, for it can unnecessarily result in "bad" code).

If you choose a list representation, then a lot will depend on which
pattern of searches predominates in your application. E.g., if there
is a comparatively small set of keys that you will search for very
often, then a "self organizing file" might be a good idea: you always
search from front to tail and when you dfind a key, you move the
entry to the head of the list. This way, frequently searched entries
gather near the front and will be found quickly.

That is a good example of the thought process required to arrive at a
solution.

If your searches are more or less evenly distributed among the keys, a
better data structure would probably improve performance of
searching. The tradeoff is usually a slightly greater complexity for
inserting an entry. Whether changing the data structures pays off,
therefore, also depends on whether your application will search way
more often than it will modify the underlying data.

Example of optimization concerns.

As the influencing factors are hard to predict, you should write your
program so that it will be easy to change the data structure. Then,
implement one that you can easily prove correct. Create unit tests.
When you change the data structure, those tests will catch bugs. Once
the other data structure is up and running, you can compare
performance.

Again, I disagree (see above).

MikeP · Jun 6, 2011

Rune said:
That's the trade-off you are faced with: Memory vs speed.

If you want to search inside the container, forget the list.
Lists can only be searched sequentially.

The other responder noted that a "priority list" (for lack of a better
designation in my mind right now) may be in order, which may or may not
be. A list that moves recently accessed elements to the front may be just
what the application requires. Do not reject a list out of only the above
characteristic of a simple list. Let the application semantics govern.

Ian Collins · Jun 6, 2011

I don't think that it is. This is the perfect opportunity for the OP to
learn about applying the correct container/algo by taking the required
amount of time to do so. Unnecessary abstraction just delays the decision
and gives the application unnatural semantics and stifles the
programmer's level of capability. Use the right tool for the job.

Requirements aren't always invariant. If we lived in a perfect world,
your observations would be correct. But we don't!

Again, I disagree (see above).

Again, we don't all live in a world of unchanging requirements.

Rune Allnor · Jun 6, 2011

The other responder noted that a "priority list" (for lack of a better
designation in my mind right now) may be in order, which may or may not
be. A list that moves recently accessed elements to the front may be just
what the application requires. Do not reject a list out of only the above
characteristic of a simple list. Let the application semantics govern.

Is that a list? I have seen descriptions of such functionality
in Cormen & al. The description seemed like some sort of tree
structure, that was updated after each access, to me.

Rune

Alain Ketterlin · Jun 6, 2011

[...]

Is that a list? I have seen descriptions of such functionality
in Cormen & al. The description seemed like some sort of tree
structure, that was updated after each access, to me.

It's called a priority queue, and usually uses a tree to guarantee
O(log(N)) insertion and extraction of the highest priority element. It
is included in th STL (priority_queue).

-- Alain.

MikeP · Jun 6, 2011

Ian said:
Requirements aren't always invariant. If we lived in a perfect world,
your observations would be correct. But we don't!

You seem to be trying to make the exception the rule. The rule is to only
code the exceptional case only in the exceptional case and not aberrate
the common case with the exceptional. If you extrapolate further from
your suggestion, you will see it is not practical nor is it good
engineering.

In the OP's case shown above, doing the abstraction of the container/algo
could be R&D but the final product should be devoid of test harnesses.

Again, we don't all live in a world of unchanging requirements.

Non-sequitor at best, strawman at worst. Your "case" is moot. (You are
welcome to try again though but I am not into have overly-general or
philosophical discussion about it, for it is a well-defined area, at
least to me it is).

MikeP · Jun 6, 2011

Alain said:
[...]

Is that a list? I have seen descriptions of such functionality
in Cormen & al. The description seemed like some sort of tree
structure, that was updated after each access, to me.

Click to expand...

It's called a priority queue, and usually uses a tree to guarantee
O(log(N)) insertion and extraction of the highest priority element. It
is included in th STL (priority_queue).

No. A list is a list. A queue is a queue. (Notwithstanding, a queue can
be made from a list). I didn't mean to confuse by using "priority list"
(and parentheically tried to preclude potential confusion and also
attempted to avoid by putting it in quotes) but apparently I did. I was
trying to conjure up a name for a list that relinks last-accessed
elements to the front of the list for I don't know if there is a formal
name for such a beast while there indeed is a name for the different
animal called "priority queue".

Rune Allnor · Jun 6, 2011

Alain said:
Alain said:

[...]
The other responder noted that a "priority list" (for lack of a
better designation in my mind right now) may be in order, which may
or may not be. A list that moves recently accessed elements to the
front may be just what the application requires. Do not reject a
list out of only the above characteristic of a simple list. Let the
application semantics govern.
Is that a list? I have seen descriptions of such functionality
in Cormen & al. The description seemed like some sort of tree
structure, that was updated after each access, to me.

Click to expand...

Click to expand...

It's called a priority queue, and usually uses a tree to guarantee
O(log(N)) insertion and extraction of the highest priority element. It
is included in th STL (priority_queue).

Click to expand...

No. A list is a list. A queue is a queue. (Notwithstanding, a queue can
be made from a list). I didn't mean to confuse by using "priority list"
(and parentheically tried to preclude potential confusion and also
attempted to avoid by putting it in quotes) but apparently I did. I was
trying to conjure up a name for a list that relinks last-accessed
elements to the front of the list for I don't know if there is a formal
name for such a beast while there indeed is a name for the different
animal called "priority queue".- Hide quoted text -

- Show quoted text -

OK, then why not try and untangle the confusion:

1) How is the list you describe functionally and semantically
different from a priority queue?

2) What would the advantages be, if any, with this list compared
to a priority queue?

Rune

MikeP · Jun 6, 2011

Rune said:
Alain said:

Rune Allnor <[email protected]> writes:

Click to expand...

[...]
The other responder noted that a "priority list" (for lack of a
better designation in my mind right now) may be in order, which
may or may not be. A list that moves recently accessed elements
to the front may be just what the application requires. Do not
reject a list out of only the above characteristic of a simple
list. Let the application semantics govern.

Click to expand...

Is that a list? I have seen descriptions of such functionality
in Cormen & al. The description seemed like some sort of tree
structure, that was updated after each access, to me.

Click to expand...

It's called a priority queue, and usually uses a tree to guarantee
O(log(N)) insertion and extraction of the highest priority element.
It is included in th STL (priority_queue).

Click to expand...

No. A list is a list. A queue is a queue. (Notwithstanding, a queue
can be made from a list). I didn't mean to confuse by using
"priority list" (and parentheically tried to preclude potential
confusion and also attempted to avoid by putting it in quotes) but
apparently I did. I was trying to conjure up a name for a list that
relinks last-accessed elements to the front of the list for I don't
know if there is a formal name for such a beast while there indeed
is a name for the different animal called "priority queue".- Hide
quoted text -

- Show quoted text -

Click to expand...

OK, then why not try and untangle the confusion:

1) How is the list you describe functionally and semantically
different from a priority queue?

2) What would the advantages be, if any, with this list compared
to a priority queue?

Think about what a queue is and what a list is and how their behavior is
different and think of some examples where each is appropriately applied
from an application point of view. Save the Big-O stuff and the like for
exotic situations or at least AFTER thinking about the problem from the
use case perspective. The web and search is your friend. I recommend
Cormen's book as the best on the subject.

Michael Doubez · Jun 6, 2011

Rune said:
Rune said:

Alain Ketterlin wrote:
[...]
The other responder noted that a "priority list" (for lack of a
better designation in my mind right now) may be in order, which
may or may not be. A list that moves recently accessed elements
to the front may be just what the application requires. Do not
reject a list out of only the above characteristic of a simple
list. Let the application semantics govern.
Is that a list? I have seen descriptions of such functionality
in Cormen & al. The description seemed like some sort of tree
structure, that was updated after each access, to me.
It's called a priority queue, and usually uses a tree to guarantee
O(log(N)) insertion and extraction of the highest priority element.
It is included in th STL (priority_queue).
No. A list is a list. A queue is a queue. (Notwithstanding, a queue
can be made from a list). I didn't mean to confuse by using
"priority list" (and parentheically tried to preclude potential
confusion and also attempted to avoid by putting it in quotes) but
apparently I did. I was trying to conjure up a name for a list that
relinks last-accessed elements to the front of the list for I don't
know if there is a formal name for such a beast while there indeed
is a name for the different animal called "priority queue".- Hide
quoted text -
- Show quoted text -

Click to expand...

Click to expand...

OK, then why not try and untangle the confusion:

Click to expand...

1) How is the list you describe functionally and semantically
different from a priority queue?

Click to expand...

2) What would the advantages be, if any, with this list compared
to a priority queue?

Click to expand...

Think about what a queue is and what a list is and how their behavior is
different and think of some examples where each is appropriately applied
from an application point of view.

I don't understand your point. A list is a data structure while a
queue is a way to organize/prioritize data (the FIFO abstraction).
Thus, in the C++ standard, a std::list<> is a container while a
std::queue<> is a container adapter.

I would be interest on how you can compare them ?

Save the Big-O stuff and the like for
exotic situations or at least AFTER thinking about the problem from the
use case perspective. The web and search is your friend. I recommend
Cormen's book as the best on the subject.

From the reviews I have seen, it is tagged as a bit too academic.

MikeP · Jun 6, 2011

Michael said:
Rune said:

Alain Ketterlin wrote:
Rune Allnor <[email protected]> writes:

Click to expand...

[...]
The other responder noted that a "priority list" (for lack of a
better designation in my mind right now) may be in order, which
may or may not be. A list that moves recently accessed elements
to the front may be just what the application requires. Do not
reject a list out of only the above characteristic of a simple
list. Let the application semantics govern.

Click to expand...

Is that a list? I have seen descriptions of such functionality
in Cormen & al. The description seemed like some sort of tree
structure, that was updated after each access, to me.

Click to expand...

It's called a priority queue, and usually uses a tree to guarantee
O(log(N)) insertion and extraction of the highest priority
element. It is included in th STL (priority_queue).

Click to expand...

No. A list is a list. A queue is a queue. (Notwithstanding, a queue
can be made from a list). I didn't mean to confuse by using
"priority list" (and parentheically tried to preclude potential
confusion and also attempted to avoid by putting it in quotes) but
apparently I did. I was trying to conjure up a name for a list that
relinks last-accessed elements to the front of the list for I don't
know if there is a formal name for such a beast while there indeed
is a name for the different animal called "priority queue".- Hide
quoted text -

Click to expand...

- Show quoted text -

Click to expand...

OK, then why not try and untangle the confusion:

Click to expand...

1) How is the list you describe functionally and semantically
different from a priority queue?

Click to expand...

2) What would the advantages be, if any, with this list compared
to a priority queue?

Click to expand...

Think about what a queue is and what a list is and how their
behavior is different and think of some examples where each is
appropriately applied from an application point of view.

Click to expand...

I don't understand your point.

My point is that discussion about fundamentals of ADTs is noise in a NG
about C++. As such, I gave suggestion for further investigation to the
poster.

A list is a data structure while a
queue is a way to organize/prioritize data (the FIFO abstraction).

The are both ADTs. In the simplest case, I think of a priority queue as a
queue that allows adding to the front of the queue in addition to just
the back of the queue. That one additional capability can distinguish a
priority queue from a regular queue. Here's a good "homework" question:
Should a queue allow iteration over it's elements just like a list does?
(Good homework or test question in general. I wasn't assigning you
homework!).

But see, now you are sucking me into the discussion I didn't want to get
into! Web/search is your friend.

Thus, in the C++ standard, a std::list<> is a container while a
std::queue<> is a container adapter.

You seem to be confusing the abstraction with a particular
implementation. Just because you CAN build a queue from a list does not
mean you HAVE to build it that way, nor is it necessarily the best way.

I would be interest on how you can compare them ?

I compare them according to their behavioral capabilities and by their
applicability to a given scenario.

From the reviews I have seen, it is tagged as a bit too academic.

Well that is exactly what seems to be the need: fundamental principles of
the different ADTs, divorced from any particular implementation in any
programming language. If you don't like Cormen, there are plenty of other
books on the subject. My red-black tree is based on the pseudo-code in
the Cormen book, but an early version. I will be rewriting based on their
latest edition which is more stable in regards to not invalidating
iterators. So, I obviously like the book and that is why I recommended
it.

Sorting an STL map	1	Oct 19, 2013
STL Map Problem	3	Mar 30, 2013
STL: map vs vector	13	May 18, 2006
STL Map iterator compilation error	2	Nov 29, 2007
STL map or hash map using struct as data and find it	7	Dec 31, 2007
map vs list performance	15	Dec 30, 2009
STL linked list w/multiple entries?	2	Apr 8, 2008
Modifying STL list Objects	2	May 23, 2009

STL list or map?

Mike Copeland

Stefan Ram

Kai-Uwe Bux

Rune Allnor

Joshua Maurice

Michael Doubez

Juha Nieminen

Mike Copeland

Jeff Flinn

MikeP

MikeP

Ian Collins

Rune Allnor

Alain Ketterlin

MikeP

MikeP

Rune Allnor

MikeP

Michael Doubez

MikeP

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads