priority_queue<> underlying container

D

Dave

Hello all,

I'm pondering why the default underlying container for std::priority_queue<>
is std::vector<>. It would seem that inserts are liable to happen anywhere,
which would make std::list<> a superior alternative. Why factors am I not
considering here? Why in the general case would std::vector<> be best?

Thanks,
Dave
 
J

Joe Gottman

Dave said:
Hello all,

I'm pondering why the default underlying container for
std::priority_queue said:
is std::vector<>. It would seem that inserts are liable to happen anywhere,
which would make std::list<> a superior alternative. Why factors am I not
considering here? Why in the general case would std::vector<> be best?

A priority_queue is generally implemented as a heap, which requires
random-access iterators.

Joe Gottman
 
J

John Carson

Dave said:
Hello all,

I'm pondering why the default underlying container for
std::priority_queue<> is std::vector<>. It would seem that inserts
are liable to happen anywhere, which would make std::list<> a
superior alternative. Why factors am I not considering here? Why in
the general case would std::vector<> be best?

Thanks,
Dave


According to Josuttis (The C++ Standard Library), priority queues are sorted
using the heap sorting algorithm. A heap sort is only a partial sort (which
allows it to be O(n), in contrast to full sorting algorithms which tend to
be O(n*log n)). A heap sort requires that the container offer a random
access iterator, which list does not do. The fact that the vector only needs
to be partially sorted also reduces the cost of insertion.
 
C

Claudio Puviani

John Carson said:
According to Josuttis (The C++ Standard Library), priority queues are sorted
using the heap sorting algorithm. A heap sort is only a partial sort (which
allows it to be O(n), in contrast to full sorting algorithms which tend to
be O(n*log n)). A heap sort requires that the container offer a random
access iterator, which list does not do. The fact that the vector only needs
to be partially sorted also reduces the cost of insertion.

You're confusing two completely different things. std::priority_queue uses a
heap DATA STRUCTURE, not a heap sort ALGORITHM. The heap sort algorithm, by the
way, works in O(n log n), not O(n) and does NOT sort partially; the result of a
heap sort is a fully sorted sequence. The heap sort algorithm USES a heap data
structure to perform its work, but the heap data structure has uses other than
sorting... such as priority queues.

You can read up on the details in any good book on data structures and
algorithms (Knuth if you're techically inclined, or Sedgewick if you prefer a
didactic approach).

Claudio Puviani
 
J

John Carson

Claudio Puviani said:
You're confusing two completely different things. std::priority_queue
uses a heap DATA STRUCTURE, not a heap sort ALGORITHM. The heap sort
algorithm, by the way, works in O(n log n), not O(n) and does NOT
sort partially; the result of a heap sort is a fully sorted sequence.
The heap sort algorithm USES a heap data structure to perform its
work, but the heap data structure has uses other than sorting... such
as priority queues.

You can read up on the details in any good book on data structures and
algorithms (Knuth if you're techically inclined, or Sedgewick if you
prefer a didactic approach).

Claudio Puviani



Thanks for that. You are right. I was basing my remarks on the
half-remembered discussion in Eric Roberts, Programming Abstractions in C,
which, now that I check, does indeed deal with a heap data structure (in
which data is partially sorted) rather than a heap sort. I transferred that
hazy memory into my reading of Josuttis. Apologies to all concerned for the
misinformation.
 
H

Howard Hinnant

Claudio Puviani said:
You're confusing two completely different things. std::priority_queue uses a
heap DATA STRUCTURE, not a heap sort ALGORITHM. The heap sort algorithm, by
the
way, works in O(n log n), not O(n) and does NOT sort partially; the result of
a
heap sort is a fully sorted sequence. The heap sort algorithm USES a heap
data
structure to perform its work, but the heap data structure has uses other
than
sorting... such as priority queues.

You can read up on the details in any good book on data structures and
algorithms (Knuth if you're techically inclined, or Sedgewick if you prefer a
didactic approach).

Actually std::priority_queue uses any data structure which supports
random access iterators:
23.2.3.2 - Template class priority_queue [lib.priority.queue]

-1- Any sequence with random access iterator and supporting operations
front(), push_back() and pop_back() can be used to instantiate
priority_queue.

And the description of the std::priority_queue member functions
specifically mentions the heap algorithms, for example:
void push(const value_type& x);

-1- Effects:
c.push_back(x);
push_heap(c.begin(), c.end(), comp)

-Howard
 
C

Claudio Puviani

Howard Hinnant said:
Claudio Puviani said:
You're confusing two completely different things. std::priority_queue
uses a heap DATA STRUCTURE, not a heap sort ALGORITHM.
The heap sort algorithm, by the way, works in O(n log n), not O(n)
and does NOT sort partially; the result of a heap sort is a fully sorted
sequence. The heap sort algorithm USES a heap data structure to
perform its work, but the heap data structure has uses other than
sorting... such as priority queues.

You can read up on the details in any good book on data structures and
algorithms (Knuth if you're techically inclined, or Sedgewick if you prefer a
didactic approach).

Actually std::priority_queue uses any data structure which supports
random access iterators:
23.2.3.2 - Template class priority_queue [lib.priority.queue]

-1- Any sequence with random access iterator and supporting operations
front(), push_back() and pop_back() can be used to instantiate
priority_queue.

That describes the underlying storage mechanism, not the way the data is
organized. For example, the underlying storage could be a non-contiguous
sequence, but the data organization would still be a heap.
And the description of the std::priority_queue member functions
specifically mentions the heap algorithms, for example:

'push_heap()' is a heap management function, just as I described above. It is
NOT performing a heap sort (which only uses heap management internally and
temporarily). Heap algorithms are NOT the heap sort algorithm but building
blocks thereof.

Claudio Puviani
 
J

Jerry Coffin

Dave said:
Hello all,

I'm pondering why the default underlying container for std::priority_queue<>
is std::vector<>. It would seem that inserts are liable to happen anywhere,
which would make std::list<> a superior alternative. Why factors am I not
considering here? Why in the general case would std::vector<> be best?

This exemplifies why std::list (and linked lists in general) are
rarely as useful as they initially appear.

Linked lists allow O(k) insertion or deletion anywhere in the list,
which sounds good. Unfortunately, in typical situations, you first
have to figure out where in the list to to do that insertion or
deletion -- and traversing the list to that point is O(N). As such,
the O(k) insertion/deletion is really O(N) unless you can find the
right spot in the list without actually traversing it (cf. skip
lists).

In the case of a heap, things are a really even worse: you never
really have to insert in the middle of the heap anyway -- you make the
addition at one end of the heap, and then swap the new element with
roughly lg(N) elements to get it to the right spot. Since you can add
an item to one end of a vector (or either end of a deque) in
(amortized) constant time, the total complexity is O(lg N).

Each insertion in the list would involve O(N) traversal steps along
with the O(lg N) comparisons, and finally an O(k) insertion at the
right spot. As such, the only hope for a list would be if we could
count on those N traversal steps being _extraordinarily_ fast (it
clearly loses asymtotically, but might remain better in practical
ones).

Unfortunately, that's exactly the opposite of reality: in a vector,
the memory is contiguous, leading to high locality of reference.
Along with that, we have only the data we care about being stored --
i.e. data density is 100%. Both of these help with cache utilization.

In a list, each node is normally allocated individually, and includes
not only the data itself, but a couple of pointers as well. Therefore
we have relatively poor locality of reference, and lower density of
data (i.e. the cache ends up holding lots of pointers along with the
data proper). The result is poorer use of the cache, so not only is
the complexity higher, but the constants are too...
Later,
Jerry.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,057
Latest member
KetoBeezACVGummies

Latest Threads

Top