why does STL binary-serach twice when deleting a set member?

J

JDT

Hi,

The last statement at the end of this post is to delete a set member
7.7. I set a break point inside ltstr::eek:perator(). I found out that
STL binary-seach the set twice. How come? I have another problem that
a member in a set cannot be deleted. I think I may be able to solve
that problem if I can figure out the problem posted here. Thanks for
your help.

The sequence of (f1, f2) STL goes through is (7.7, 4.4), (7.7, 6.6),
(7.7, 8.8), (7.7, 7.7), (4.4, 7.7), (6.6, 7.7), (8.8, 7.7), and finally
(7.7, 7.7) again.


// *****************************
struct ltstr // less than
{
bool operator()(const float f1, const float f2) const
{
return f1 > f2;
}
};

int main(int argc, char* argv[])
{

set<float, ltstr > s;
s.insert(1.1);
s.insert(2.2);
s.insert(3.3);
s.insert(4.4);
s.insert(5.5);
s.insert(6.6);
s.insert(7.7);
s.insert(8.8);
s.insert(9.9);

s.erase(7.7);
....

JD
 
M

Marcus Kwok

JDT said:
The last statement at the end of this post is to delete a set member
7.7. I set a break point inside ltstr::eek:perator(). I found out that
STL binary-seach the set twice. How come?

This is probably just an implementation detail, but from what I
understand many implementations of std::set internall use some sort of
tree structure to keep the set sorted. When you delete a member, the
first search is probably to find the element to be deleted, and the
second may be some sort of rebalancing of the tree.
I have another problem that
a member in a set cannot be deleted. I think I may be able to solve
that problem if I can figure out the problem posted here. Thanks for
your help.

This problem may be a consequence of the fact that floating point
arithmetic is not exact; see

http://www.parashift.com/c++-faq-lite/newbie.html#faq-29.17
 
J

JDT

Hi Marcus,

Thanks for the explanation. But an element cannot be deleted even after
I pay attention to your 2nd point. Basically I feed the following
customized less-than operator into a set. (I hope that the fTolerance
can be as small as possible) One in thousands of elements cannot be
deleted. I think the operator is only used for sorting/searching, but
not for the final matching. To match an element to be deleted at the
end of searching, pointers to CSetMember will be used for comparison.
Am I right? The operator shouldn't cause this type of problem as long
as it always returns a consistent bool value. It really puzzles me ....

struct lt
{
bool operator() (const CSetMember *s1, const CSetMember *s2) const
{
float f1 = s1->a * s1->b / s1->c;
float f2 = s2->a * s2->b / s2->c;
if (abs(f1-f2) > fTolerance)
return f1 < f2;
else
return s1->id < s2->id; // id is a unique integer
}
};

JD
 
P

P.J. Plauger

Thanks for the explanation. But an element cannot be deleted even after I
pay attention to your 2nd point. Basically I feed the following
customized less-than operator into a set. (I hope that the fTolerance can
be as small as possible) One in thousands of elements cannot be deleted.
I think the operator is only used for sorting/searching, but not for the
final matching. To match an element to be deleted at the end of
searching, pointers to CSetMember will be used for comparison. Am I right?
The operator shouldn't cause this type of problem as long as it always
returns a consistent bool value. It really puzzles me ....

struct lt
{
bool operator() (const CSetMember *s1, const CSetMember *s2) const
{
float f1 = s1->a * s1->b / s1->c;
float f2 = s2->a * s2->b / s2->c;
if (abs(f1-f2) > fTolerance)
return f1 < f2;
else
return s1->id < s2->id; // id is a unique integer
}
};

If this function ever returs true for both lt(s1, s2) and lt(s2, s1)
(and I strongly suspect it does) then it is *not* a strict weak
ordering, and set will not behave properly.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
J

JDT

Hi P.J.,

Since I didn't change both s1's and s2's a, b, c, and id, the same
machine and same compile will always produce the same values of f1 and
f2. It's a strict weak ordering unless "(abs(f1-f2) > fTolerance)"
produces different bools for lt(s1, s2) and lt(s2, s1). Is it possible?

JD
 
P

P.J. Plauger

Hi P.J.,

Since I didn't change both s1's and s2's a, b, c, and id, the same machine
and same compile will always produce the same values of f1 and f2. It's a
strict weak ordering unless "(abs(f1-f2) > fTolerance)" produces different
bools for lt(s1, s2) and lt(s2, s1). Is it possible?

A strict weak ordering unless... is not a strict weak ordering.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
P

P.J. Plauger

Hi P.J.,

Since I didn't change both s1's and s2's a, b, c, and id, the same machine
and same compile will always produce the same values of f1 and f2. It's a
strict weak ordering unless "(abs(f1-f2) > fTolerance)" produces different
bools for lt(s1, s2) and lt(s2, s1). Is it possible?

A strict weak ordering unless... is not a strict weak ordering.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
B

Bernd Strieder

Hello,

P.J. Plauger said:
If this function ever returs true for both lt(s1, s2) and lt(s2, s1)
(and I strongly suspect it does) then it is *not* a strict weak
ordering, and set will not behave properly.

Even worse. It's not symmetry, that is causing the biggest problems with
that functor, but transitivity.

Take 3 objects of type CSetMember c1, c2, c3, where ->a and ->c are 1.0
and c1->b == 0.25*fTolerance c2->b == fTolerance c3->b ==
1.75*fTolerance. Then let c1->id==3, c2->id==2 and c3->id==1.

Then we have lt(c1,c2)==false, lt(c2,c1)== true, lt(c2,c3)==false,
lt(c3,c2)== true, so by transitivity lt(c3,c1) should be true, but
lt(c3,c1) == false and lt(c1,c3)==true. This cannot work in general.

If you want to study the effects of a wrong ordering functor on
std::set, you could for example check the order of the elements after
each insert or erase, but not only the neighbouring elements, try all
kinds of distances between the indexes.

Another problem above is using the absolute error to decide. This made
it really easy to find a counterexample. A better way to compare, if
floating point es involved, is comparing values rounded to the accuracy
the input suggests. Imagine a round(d,n) function rounding d to n
digits after decimal point.

// f1, f2 as above
float c1 = round(f1,4);
float c2 = rounf(f2,4);

if (c1 < c2) return true;
if (c1 == c2) return s1->id < s2->id;
return false;

The OP version cuts the range of numbers into pieces of length
fTolerance, but the points, where the pieces end are different with
every starting point. This must not happen. The discretization should
be uniform over all inputs. If you round your input in a sensible
manner, then you might be able to leave out the rounding inside.

Quite some years ago, I had problems with the qsort function of the
standard C library. It crashed, because it was optimized in a way that
a comparision function, not representing an order as required, would
make it including elements outside the array when sorting. A thrilling
experience.

Bernd Strieder
 
M

Mark P

JDT said:
Hi,

The last statement at the end of this post is to delete a set member
7.7. I set a break point inside ltstr::eek:perator(). I found out that
STL binary-seach the set twice. How come?

This is apparently not uncommon. Certain implementations will perform
an erase( key) by first calling lower_bound( key) and upper_bound( key),
and then removing the range of iterators returned by these two calls.
Seems wasteful, but typically set, multiset, map, and multimap are built
upon the same underlying framework, usually a red black tree. The range
search then works for all of these containers although it's less than
optimal for the non multi- versions.

FWIW, my version of gcc which seems to be based upon HP and SGI
implementations, does this. To get around this, you can perform a find
operation to return an iterator, and then erase the iterator directly.
If you feel like being fancy, you could even write a templatized
function to do this automatically-- cf. below.

Mark

// nb. untested code!

/* Erases element k, if present, from set s. Returns the number of
* erased elements (0 or 1).
*/
template< typename SetType, typename KeyType>
int setErase( SetType& s, const KeyType& k)
{
typename SetType::iterator it = s.find( k);
if( it == s.end())
return 0;
s.erase( it);
return 1;
}
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,768
Messages
2,569,574
Members
45,050
Latest member
AngelS122

Latest Threads

Top