std::set

M

ma740988

The position returned via the STL std::set container never made much
sense to me. When you insert elements within the container, the
position returned - via find - does not reflect the actual position of
the value within the container. std::string - for instance - does.
How then do I get the actual position of the value when using
std::set?

# include <iostream>
using std::cout;
using std::cin;
using std::endl;

# include <fstream>
using std::ifstream;
using std::eek:fstream;

# include <vector>
using std::vector;

# include <iomanip>
using std::setw;

# include <string>
using std::string;

# include <limits>
using std::numeric_limits;

# include <algorithm>
using std::copy;

# include <iterator>
using std::istream_iterator;
using std::eek:stream_iterator;
using std::back_inserter;

# include <set>
using std::set;

int main()
{
cout << "Testing std::set " << '\n';
set<int> cc;
cc.insert(10);
cc.insert(30);
cc.insert(20);
if (cc.insert(20).second)
cout << "success" << '\n';
else
cout << "failure -> set doesn't allow duplicates " << '\n';
cc.insert(40);

set<int>::const_iterator pos = cc.begin();
pos = cc.find(40);
if (pos != cc.end())
cout << "Found " << *pos << " within the set " << '\n' << '\n';
else
cout << " not found " << '\n';

// for (; pos != cc.end(); ++pos)
// cout << *pos << '\n';
cout << "Testing std::string " << '\n';

string my_string ("abcdefg");
string::size_type idx = my_string.find('c');

if (idx == string::npos)
cout << "not found " << '\n';
else
cout << "String found at position: " << static_cast<int>(idx) <<
'\n';
}

my_string.find ('c') returns 2 for the position which makes sense.
cc.find(40) on the other hand 'returns' the contents at the position
not the actual position.

Consider:
std::set<int, greater<int> > mySet;
mySet.insert (10);
mySet.insert (40);
mySet.insert (50);
mySet.insert (35);
mySet.insert (30);

Based on the sort criteria, could I get a 'write up' - of sorts on
how/when the actual 'sorting' happens?

I envision per my reading of the C++ Std Library.
10 gets inserted.
40 gets inserted. Since 40 is greater than 10, 40 gets moved to the
'top'.
50 gets inserted. Since 50 is greater than 10 && 50 greater than 40,
50 gets moved to the top. This describes the 'transitive' behavior of
strick weak ordering.
and so on ...?

At some point I suspect I'll peruse the source (if avaibable) of
std::set since I'm also not sure (but curious) of how the 'motor
operates' - if you will.
Take the case again where 50 gets inserted. This much is true,
there's an operator() for the criteria, however, is it safe to state
that operator() is called twice? Once to compare 50 against 10, and a
second time to compare 50 against 40, except I now feel like I'm
headed towards logarithimic complexity versus linear versus .. can't
recall the other one.


Thanks for your time.
 
V

Victor Bazarov

ma740988 said:
The position returned via the STL std::set container never made much
sense to me. When you insert elements within the container, the
position returned - via find - does not reflect the actual position of
the value within the container. std::string - for instance - does.
How then do I get the actual position of the value when using
std::set?

(a) There is no "actual position" because std::set is not a linear
container, it's actually a tree. Well, there is a position, it's
just not what you probably think.

(b) Why do you think you need one?

(c) You could always calculate std::distance(resultoffind, cc.begin())
to learn how many elements you'd have to step through to get to
the found element from the beginning of the set. Again, why do
you think you need that?

(d) Yes, insertion into a set has logarithmic complexity.

All above considered, why don't you get a decent book on C++ Standard
Library and see what it says instead of collecting bits and pieces from
the newsgroup or from your experiments?

If I didn't answer your questions, it's probably because I didn't
understand them. Sorry, then.

Victor
 
T

tom_usenet

The position returned via the STL std::set container never made much
sense to me. When you insert elements within the container, the
position returned - via find - does not reflect the actual position of
the value within the container.

It is an iterator to the element - it reflects the exact position of
the element in the container.
std::string - for instance - does.
How then do I get the actual position of the value when using
std::set?

What do you mean by "position"? I think you are trying to say "index",
in which case, index doesn't make all that much sense for a set, since
it isn't a sequence container. If you want to know where the element
lies in the sort order of the container, you can find out using
std::distance. e.g.

pos = s.insert(40).first;
int sortPosition = std::distance(s.begin(), pos); //O(pos) operation.
my_string.find ('c') returns 2 for the position which makes sense.
cc.find(40) on the other hand 'returns' the contents at the position
not the actual position.

No, it returns an iterator, which *is* a position.
Consider:
std::set<int, greater<int> > mySet;
mySet.insert (10);
mySet.insert (40);
mySet.insert (50);
mySet.insert (35);
mySet.insert (30);

Based on the sort criteria, could I get a 'write up' - of sorts on
how/when the actual 'sorting' happens?

The complexity and iterator requirements of std::set means that it is
always implemented as some kind of balanced tree structure, usually a
red-black tree. No sorting happens, the container stores inserts the
element into the appropriate position in the tree to maintain the
invariants of the tree, jiggling the nodes if necessary (which is
usually called "rebalancing").
I envision per my reading of the C++ Std Library.
10 gets inserted.
40 gets inserted. Since 40 is greater than 10, 40 gets moved to the
'top'.

What do you mean by "top"? The root of the tree? Or are you thinking
of a list?
50 gets inserted. Since 50 is greater than 10 && 50 greater than 40,
50 gets moved to the top. This describes the 'transitive' behavior of
strick weak ordering.
and so on ...?

Elements in sets never get moved (unless you count rebalancing as
moving). It is just that new elements get inserted in the correct
place.
At some point I suspect I'll peruse the source (if avaibable) of
std::set since I'm also not sure (but curious) of how the 'motor
operates' - if you will.
Take the case again where 50 gets inserted. This much is true,
there's an operator() for the criteria, however, is it safe to state
that operator() is called twice? Once to compare 50 against 10, and a
second time to compare 50 against 40, except I now feel like I'm
headed towards logarithimic complexity versus linear versus .. can't
recall the other one.

Insertion is linear. Basically, the code walks down the tree towards
the bottom until it finds the correct location. The depth of the tree
is O(log size()), so insertion takes only O(log size() ) comparisons.

Read up on red-black trees. Implementing your own would be an
educational experience.

Tom
 
A

Andre Kostur

(c) You could always calculate std::distance(resultoffind, cc.begin())
to learn how many elements you'd have to step through to get to
the found element from the beginning of the set. Again, why do
you think you need that?

I don't have my C++ references in front of me, but shouldn't the iterators
in your call to distance be reversed? Isn't it a requirement that the
second iterator be reachable from the first by repeated calls to operator++
?
 
V

Victor Bazarov

Andre said:
@ord-read.news.verio.net:




I don't have my C++ references in front of me, but shouldn't the iterators
in your call to distance be reversed? Isn't it a requirement that the
second iterator be reachable from the first by repeated calls to operator++
?

Probably. Since I suggested the OP to find a good book about the library,
I didn't care much to verify the 'distance' synopsis.

V
 
A

Andrey Tarasevich

ma740988 said:
The position returned via the STL std::set container never made much
sense to me. When you insert elements within the container, the
position returned - via find - does not reflect the actual position of
the value within the container. std::string - for instance - does.
How then do I get the actual position of the value when using
std::set?

What kind of "position" are you talking about? An index? 'std::set' is
not an random-access container and it doesn't provide index access. You
can simulate index access to 'std::set' by using 'std::advance' and
'std::distance' but that will be highly inefficient.
 
M

ma740988

(c) You could always calculate std::distance(resultoffind, cc.begin())
to learn how many elements you'd have to step through to get to
the found element from the beginning of the set. Again, why do
you think you need that?

The issue isn't isolated to a 'need', but a mis-understanding of
'find' when viewed from .. say std::string etc. The string example
pointed out that, find returns an 'index'. For std::set the solution
as you pointed out is to use 'std::distance', more importantly I
overlooked the fact that std::set is predicated upon a tree and not a
linear container
(d) Yes, insertion into a set has logarithmic complexity.

All above considered, why don't you get a decent book on C++ Standard
Library and see what it says instead of collecting bits and pieces from
the newsgroup or from your experiments?
I've got what in the ++ community is a decent book Josuttis and except
for the authors name I mentioned the text in my orignal post. I'm
reading chapter 6 on containers, cruising along until i encountered
std::set. When confusion looms large I oft try 'stepping through' a
sample program. In this case I was still confused so I turn to
comp.lang.c++
If I didn't answer your questions, it's probably because I didn't
understand them.
Coupled with toms post, I now see the light.

Thanks for your time.
 
M

ma740988

tom_usenet said:
What do you mean by "position"? I think you are trying to say "index",
in which case, index doesn't make all that much sense for a set, since
it isn't a sequence container. If you want to know where the element
lies in the sort order of the container, you can find out using
std::distance. e.g.
Thats correct, I should have said 'index'.
pos = s.insert(40).first;
int sortPosition = std::distance(s.begin(), pos); //O(pos) operation.
my_string.find ('c') returns 2 for the position which makes sense.
cc.find(40) on the other hand 'returns' the contents at the position
not the actual position.

No, it returns an iterator, which *is* a position.
Consider:
std::set<int, greater<int> > mySet;
mySet.insert (10);
mySet.insert (40);
mySet.insert (50);
mySet.insert (35);
mySet.insert (30);

Based on the sort criteria, could I get a 'write up' - of sorts on
how/when the actual 'sorting' happens?

The complexity and iterator requirements of std::set means that it is
always implemented as some kind of balanced tree structure, usually a
red-black tree. No sorting happens, the container stores inserts the
element into the appropriate position in the tree to maintain the
invariants of the tree, jiggling the nodes if necessary (which is
usually called "rebalancing"). Got it!!!
[...]

Insertion is linear. Basically, the code walks down the tree towards
the bottom until it finds the correct location. The depth of the tree
is O(log size()), so insertion takes only O(log size() ) comparisons.

Read up on red-black trees. Will do
Implementing your own would be aneducational experience.

Truly appreaciate it. Thanks
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,053
Latest member
BrodieSola

Latest Threads

Top