distance between two iterators in map

J

John

In STL's map implementation, the distance between two
iterators it1, it2 takes O(K) to compute where K is the
actual distance between the two iterators.

In theory, a red black tree can do this in O(log K) time.
Anyone knows if there is a way to do this in map<> or
if map<> can be inherited/modified to do this?

Thanks,
--j


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
V

Victor Bazarov

John said:
In STL's map implementation, the distance between two
iterators it1, it2 takes O(K) to compute where K is the
actual distance between the two iterators.

Where did you get this information?
In theory, a red black tree can do this in O(log K) time.
Anyone knows if there is a way to do this in map<> or
if map<> can be inherited/modified to do this?

First of all, are you sure you have to modify 'std::map'?

Second, yes, you can and may inherit from 'std::map'. However,
you will then need to also modify/inherit 'std::map::iterator'.

Third, perhaps you should contact your library implementors and
suggest any improvements you want them to make to the library.

V
 
V

Victor Bazarov

Kai-Uwe Bux said:
Victor said:
Where did you get this information?

Maybe from the standard [24.3.4/1].

Maybe. Maybe not. 24.3.4/1 essentially sets the _lowest_ standard
for bidirectional iterators. While 'std::map's iterator is of
'bidirectional' category, nothing really prevents its real-world
implementation from providing better than linear 'advance' and
'distance', does it? John says "implementation ... takes O(K)".
Which implementation? Whose implementation? Catch my drift?

V
 
K

Kai-Uwe Bux

Victor said:
Kai-Uwe Bux said:
Victor said:
John wrote:
In STL's map implementation, the distance between two
iterators it1, it2 takes O(K) to compute where K is the
actual distance between the two iterators.

Where did you get this information?

Maybe from the standard [24.3.4/1].

Maybe. Maybe not. 24.3.4/1 essentially sets the _lowest_ standard
for bidirectional iterators.

Where in the wording do you find support for that opinion? The standard
says:

[...], the library provides two function templates advance and distance.
...; for input, forward, and bidirectional iterators they use ++ to
provide linear time implementations.

Seems to be pretty specific in saying what std::distance() is supposed to
do.
While 'std::map's iterator is of
'bidirectional' category, nothing really prevents its real-world
implementation from providing better than linear 'advance' and
'distance', does it?

Not sure, but maybe you can direct me to the point in the standard where it
allows to do better. There could be some clause the standard stating that
implementations are always allowed to provide specializations for their own
containers/iterators that do better than the generic implementation
specified for some algorithm. However, I am not aware of such language and
would be grateful for a pointer.

John says "implementation ... takes O(K)". Which implementation? Whose
implementation? Catch my drift?


Best

Kai-Uwe Bux
 
J

John

Victor said:
Where did you get this information?

I guessed it :) That is the easy way out for coders,
and I'm almost sure STL implementors at SGI took it
(Prove me wrong! :) I wud be glad)
First of all, are you sure you have to modify 'std::map'?

Unless someone tells me otherwise that distance can be
calculated in O(log K) time on map.
Second, yes, you can and may inherit from 'std::map'. However,
you will then need to also modify/inherit 'std::map::iterator'.

I think it will be a mess if I try to do that, I'll have to modify
everything...inserts deletes...what is stored in the map structure...
Thats why I asked, is there an easy way out...
Third, perhaps you should contact your library implementors and
suggest any improvements you want them to make to the library.

You mean SGI? Since when did they start to listen to common folks?
And not everyone needs fast "distance" computation between iterators.
So not sure if this is worth the effort in the STL.

Thanks,
--j


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
J

Jerry Coffin

@p10g2000cwp.googlegroups.com>, (e-mail address removed)
says...

[ ... ]
In theory, a red black tree can do this in O(log K) time.
Anyone knows if there is a way to do this in map<> or
if map<> can be inherited/modified to do this?

I can see how you could make this proportional to
something close to O(log n) in a perfectly balanced tree.
I can see how you could derive some sort of rough
estimate in O(Log n) on an RB tree as well, but I can't
see how you'd do an accurate distance in O(log N) on an
RB tree (or any other somewhat-imbalanced tree such as
AVL).

--
Later,
Jerry.

The universe is a figment of its own imagination.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
D

Dietmar Kuehl

Kai-Uwe Bux said:
Victor said:
Maybe. Maybe not. 24.3.4/1 essentially sets the _lowest_ standard
for bidirectional iterators.

Where in the wording do you find support for that opinion? The standard
says:

[...], the library provides two function templates advance and distance.
...; for input, forward, and bidirectional iterators they use ++ to
provide linear time implementations.

Seems to be pretty specific in saying what std::distance() is supposed to
do.

An implementation is always free to do better than the specification.
This is covered by the "as if"-rule: a user cannot detect whether the
implementation really used 'operator++()' on the bidirectional
iterator. Thus, an implementation can do whatever it wants as long as
it meets the explicit constraints like being no worse than linear.
Actually, some operations are specified by code snippets but the
implementer of the library is not at all bound to use code which is
even similar to these!
 
K

Kai-Uwe Bux

Dietmar said:
Kai-Uwe Bux said:
Victor said:
Maybe. Maybe not. 24.3.4/1 essentially sets the _lowest_ standard
for bidirectional iterators.

Where in the wording do you find support for that opinion? The standard
says:

[...], the library provides two function templates advance and
[distance.
...; for input, forward, and bidirectional iterators they use ++ to
provide linear time implementations.

Seems to be pretty specific in saying what std::distance() is supposed to
do.

An implementation is always free to do better than the specification.
This is covered by the "as if"-rule: a user cannot detect whether the
implementation really used 'operator++()' on the bidirectional
iterator. Thus, an implementation can do whatever it wants as long as
it meets the explicit constraints like being no worse than linear.

I pondered for a while whether the as if rule covers this case. However, I
think that, as an interpretation of the standard, it is stretching the as
if rule a little too far. The as if rule is concerned with the "semantic
descriptions" and not with performance requirements. If you can quote the
as if rule to do better than linear (because performance is not observable
by definition of the standard), then why can't you take it as license to do
worse than linear (because performance is not observable)?

I should add that I certainly would want the standard to allow for doing
better and for ruling out doing worse; also I would consider anything else
a bug in the standard. Moreover, I am pretty sure that the intend of the
standard (or its authors) goes in this direction. It just so happens that I
cannot match the intent with the actual wording of the standard.

Anyway, this discussion should be purely academic since (a) no sensible
person would complain if an implementation does better than the standard
requires/permits and (b) nobody would want to change the wording of the
standard just to license better implementations running the risk of
introducing worse inconsistencies.


Best

Kai-Uwe Bux
 
D

Dietmar Kuehl

Kai-Uwe Bux said:
I pondered for a while whether the as if rule covers this case. However, I
think that, as an interpretation of the standard, it is stretching the as
if rule a little too far. The as if rule is concerned with the "semantic
descriptions"
Right.

and not with performance requirements.

Well, not exactly. It is pretty clear that if there is performance
requirement, it has to be met for an implementation. The "as if" rule
does not touch this.
If you can quote the
as if rule to do better than linear (because performance is not observable
by definition of the standard), then why can't you take it as license to
do worse than linear (because performance is not observable)?

The performance requirements are always of the form "no worse than".
They are not of the form "exactly this performance characteristic".
The notation O(f(n)) just states that the asymptotic growth is less
than 'c * f(n)'. If 'f(n)' is a linear function, e.g. 'n', the
requirement is also met if the actual function is 'ln n'.

IMO this settles the case.
 
P

Pete Becker

Kai-Uwe Bux said:
I should add that I certainly would want the standard to allow for doing
better and for ruling out doing worse; also I would consider anything else
a bug in the standard. Moreover, I am pretty sure that the intend of the
standard (or its authors) goes in this direction. It just so happens that I
cannot match the intent with the actual wording of the standard.

17.3.1.3/5: "Complexity requirements specified in the library clauses
are upper bounds, and implementations that provide better complexity
guarantees satisfy the requirements."
 
K

kanze

I guessed it :) That is the easy way out for coders, and I'm
almost sure STL implementors at SGI took it (Prove me wrong!
:) I wud be glad)

You don't have to guess -- in fact, in cases like these, you
shouldn't guess. The standard says that the iterators for
std::map are bidirectional iterators. Bidirectional iterators
don't support addition or subtraction, and the distance
algorithm is specified to be O(k) for them.
Unless someone tells me otherwise that distance can be
calculated in O(log K) time on map.

I don't think it can be, off hand. Or rather, support for doing
so would increase the memory footprint of the map, and increase
times for other operations, like insertion or deletion. Which
isn't necessarily acceptable.
I think it will be a mess if I try to do that, I'll have to
modify everything...inserts deletes...what is stored in the
map structure... Thats why I asked, is there an easy way
out...

I can't see anything less than implementing the whole thing
yourself. You need a more complex data structure for the nodes,
which affects just about everything else.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
K

kevin cline

You can subtract tree iterators in O(log N) time if you store the left
subtree size at each node. This will allow the ordinal position of a
node to be computed by adding the sizes of subtrees that precede the
node. For example, here is an in-order traversal of a tree, with the
left-subtree sizes shown.
A (left = 0)
C (left = 1)
E (left = 0)
F (left = 0)
G (left = 4)
H (left = 0)
J (left = 1)
K (left = 0)
M (left = 1)
N (left = 3)
Q (left = 0)
R (left = 1)
S (left = 0)
T (left = 0)

To locate node "M" we pass through nodes G and J, and compute the index
of "M" as: 4 (G's left subtree) + 1 (G) + 1 (J's left subtree) + 1 (J)
+ 1 (M's left subtree) = 8. This computation is O(log N). Similarly,
the left subtree sizes can be updated in O(log N) time as nodes are
inserted or removed.


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
K

Kai-Uwe Bux

Pete said:
17.3.1.3/5: "Complexity requirements specified in the library clauses
are upper bounds, and implementations that provide better complexity
guarantees satisfy the requirements."

Thanks, that settles it.


Best

Kai-Uwe Bux
 
J

Jerry Coffin

@z34g2000cwc.googlegroups.com>, (e-mail address removed)
says...
You can subtract tree iterators in O(log N) time if you store the left
subtree size at each node.

Yes, storing extra data allows you to do it. That, of
course, raises the question of whether it's worth it.
This data would have to be updated at essentially every
rotation, so you'd have to know that distance was going
to be used quite a bit to justify it.

Considering the number of times I've (personally) seen
distance used on maps, I doubt it's justified in any but
a few relatively special cases -- though I may be in the
minority.

--
Later,
Jerry.

The universe is a figment of its own imagination.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
R

Richard Smith

James said:
You don't have to guess -- in fact, in cases like these, you
shouldn't guess. The standard says that the iterators for
std::map are bidirectional iterators. Bidirectional iterators
don't support addition or subtraction, and the distance
algorithm is specified to be O(k) for them.

Of course the library is perfectly entitled to make std::distance
*better* than O(k) if it can. It already has to handle the special
case that the iterators are random access iterators, so adding another
special case for std::map iterators ought to be easy.

I don't think it can be, off hand. Or rather, support for doing
so would increase the memory footprint of the map, and increase
times for other operations, like insertion or deletion. Which
isn't necessarily acceptable.

Agreed -- I can't see how inserting with a (correct) hint and erasing a
single iterator can be amortized constant time. I *think* the rest of
the requirements can be satisfied, though, as the obvious
implementation simply adds another O(log N) term to an already O(log N)
operation.

The standard aside, though, the benefits of O(1) deletion and, in some
circumstances, insertion, with O(N) iterator differences certainly
seems a better general purpose design than O(log N) insertion,
deletion, and iterator differences. I can't think of many situations
when I've wanted to call std::distance on map iterators -- the size of
the range returned by equal_range is the obvious case, and in this case
the distance is often small.

--
Richard Smith


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
N

n2xssvv g02gfr12930

John said:
In STL's map implementation, the distance between two
iterators it1, it2 takes O(K) to compute where K is the
actual distance between the two iterators.

In theory, a red black tree can do this in O(log K) time.
Anyone knows if there is a way to do this in map<> or
if map<> can be inherited/modified to do this?

Thanks,
--j

It's my understanding that most STL implementations use Red Black
trees to implement map. Naturally the time taken to calculate
the distance between 2 iterators is dependent on how the container
is implemented. Now, as I understand it std::list normally uses a
linked list making it fast when doing a lot of insertions and
deletions of elements, but slow for calculating the distance between
2 iterators. Whereas std::vector normally effectively uses an array
which gives fast random access and iterator distance calculation but
slow insertion and deletion operations. So any design should pick the
STL container best suited for the task.
Finally, as you probably already know re balancing a Red Black tree
on insertion and deletion operations is quicker than an AVL tree.

JB
 
P

Pete Becker

Richard said:
Of course the library is perfectly entitled to make std::distance
*better* than O(k) if it can. It already has to handle the special
case that the iterators are random access iterators, so adding another
special case for std::map iterators ought to be easy.

Random access iterators can be recognized because
iterator_traits<iter>::iterator_category is random_access_iterator_tag.
There is no such easy identification of a map iterator. It's just a
bidirectional iterator.

--

Pete Becker
Roundhouse Consulting, Ltd.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
D

Dietmar Kuehl

Pete said:
Random access iterators can be recognized because
iterator_traits<iter>::iterator_category is
random_access_iterator_tag.
There is no such easy identification of a map iterator. It's just a
bidirectional iterator.

.... and still the library implementation can [probably] detect its
iterator type used for 'std::map's: if the iterator type is not
nested within the 'std::map' class template it is fairly easy to
detect and a special overload for the 'std::distance()' function
template can take advantage of the known structure, e.g.:

namespace std {
// ...
template <typename _K, typename _V, typename _C, typename _A>
struct _Map_iterator
{
// funny stuff dealing with the iterator
};

template <typename _K, typename _V, typename _C, typename _A>
struct map
{
typedef _Map_iterator<_K, _V, _C, _A> iterator;
};

template <typename _K, typename _V, typename _C, typename _A>
std::size_t // just a short-hand; might use iterator_traits
distance(_Map_iterator<_K, _V, _C, _A> begin,
_Map_iterator<_K, _V, _C, _A> end)
{
// do fancy ln n algorithm to determine the distance
}
}

There is no need to have special machinery for detecting the
iterators. It is actually not even a really generic algorithm
although it is a template: it just operates on a fixed data
structure and could possibly even work on a non-templatized
base class of the 'std::map's nodes which represent the links
in the tree (although I'm not sure I would realize things this
way because it sacrifices some type-safety; of course, if profiling
shows that it is indicated, I would do it).

That said, I'm not sure that any optimization of computing the
distance or moving a pointer on 'std::map's is warranted: it is
a rare use and to prepare the data structure for this would cost
some performance even if the operations are not used. Thus, people
would pay for something they don't really use. An optimization like
this might be reasonable for a special purpose map where it is
common that iterators are advanced in large step or the distance
between iterators is frequently computed.
--
<mailto:[email protected]> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
R

Richard Smith

Pete said:
Random access iterators can be recognized because
iterator_traits<iter>::iterator_category is random_access_iterator_tag.
There is no such easy identification of a map iterator. It's just a
bidirectional iterator.

But I'm talking about what the library implementation might choose to
do. If the implementation of std::map happens to allow a fast
implemention of std::distance, then it would be easy to recognise the
map's iterator type and overload accordingly: just overload one of its
internal implementation functions and rely on partial ordering to find
it. The library implementators can reliably do this even if an end
user cannot.

--
Richard Smith


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,020
Latest member
GenesisGai

Latest Threads

Top