comparison operator for STL sets

S

srp113

Hello,
The comparator functor one specifies when declaring a stl set should
follow strict weak ordering semantics. This means it should return
true if x is strictly less than y, otherwise false. And equivalance is
tested as x == y iff. !(x < y) and ! (y < x). If this is met then
set.insert() will fail since there is such an element already. What
about the case where it turns out (x < y) and ( y < x) is true,
should such an element be inserted? Logically this should never
happen, but I ran into this in code below:

struct MyClass {
MyClass(int num,char *name): _num(num){
strlcpy(_name,name,sizeof _name);
}

int _num;
char _name[NAME_LEN+1];

};

class MyClassCompare{
public:
bool operator()(const MyClass& lhs,const MyClass& rhs) const{
if(lhs._num < rhs._num)
return true;
else {
if(strncmp(lhs._name,rhs._name,NAME_LEN+1) < 0)
return true;
}
return false;
}

};

int main(int argc,char* argv[]) {
typedef std::set<MyClass,MyClassCompare> MyClassSet;
typedef MyClassSet::iterator MyClassSetItr;
MyClassSet mcSet;
MyClass mc1(10,"Hello"),mc2(10,"Hellm"),mc3(10,"Hellp"),mc4
(20,"Hello"),mc5(10,"Hello");
pair<MyClassSetItr,bool> retVal;
retVal = mcSet.insert(mc1);
cout << "Return code from inserting mc1=" << retVal.second << endl;
retVal = mcSet.insert(mc2);
cout << "Return code from inserting mc2=" << retVal.second << endl;
retVal = mcSet.insert(mc3);
cout << "Return code from inserting mc3=" << retVal.second << endl;
retVal = mcSet.insert(mc4);
cout << "Return code from inserting mc4=" << retVal.second << endl;
retVal = mcSet.insert(mc5);
cout << "Return code from inserting mc5=" << retVal.second << endl;
for(MyClassSetItr itr=mcSet.begin();itr != mcSet.end();itr++) {
cout << " " << (*itr)._num << "," << (*itr)._name << endl;
}
}
the output from this code was:
Return code from inserting mc1=1
Return code from inserting mc2=1
Return code from inserting mc3=1
Return code from inserting mc4=1
Return code from inserting mc5=0
Printing the set:
10,Hellm
10,Hello
20,Hello
10,Hellp
should object(20,Hello) have been allowed to be inserted into set?
Why is that element not at end of the list when I am displaying the
set? I see that when comparing object(20,Hello) with object(10,Hellp)
operator < will return true for both x<y and y<x. Is there something
wrong with the way I have setup the comparator here? It would be more
straightforward if set allowed === in which I can unambiguously test
for equivalance of two MyClass objects...
Thanks Much for your help,
Sunil
 
S

srp113

Hi Pete,
Thanks a lot, that makes sense. Can you tell me in general
what are the steps to come up with a comparator operator like this.
Are there simple steps to follow to derive operator < from opeartor
==. For instance I am working on a more complex class that has
following attributes:
bool a;
string b;
struct {
int d;
int e;
int f;
}c;
and test for equality would be
operator==(const MyClass& lhs,const MyClass& rhs){
if ( lhs.a == true && rhs.a == true) {
if( lhs.b == rhs.b && ((lhs.c.d == rhs.c.d)&&(lhs.c.e == rhs..c.e) &&
(lhs.c.f == rhs.c.f))
return true;
}
else if(lhs.a == false && rhs.a == false && lhs.b == rhs.b)
return true;
return false; // in all other cases
}
Coming up operator < in such a case seems bit involved... From my
point of view, its unique if operator == above returns false and I
want it to go into set.
Thanks,
Sunil

srp113 said:
Hello,
The comparator functor one specifies when declaring a stl set should
follow strict weak ordering semantics. This means it should return
true if x is strictly less than y, otherwise false. And equivalance is
tested as x == y iff.  !(x < y) and ! (y < x). If this is met then
set.insert() will fail since there is such an element already. What
about the case where  it turns out  (x < y) and ( y < x)  is true,

Then the operator does not provide a strict weak ordering. The behavior
is undefined.




should such an element be inserted?  Logically this should never
happen, but I ran into this in code below:
struct MyClass {
  MyClass(int num,char *name): _num(num){
    strlcpy(_name,name,sizeof _name);
   }
  int _num;
  char _name[NAME_LEN+1];

class MyClassCompare{
public:
  bool operator()(const MyClass& lhs,const MyClass& rhs) const{
    if(lhs._num < rhs._num)
      return true;
    else {
      if(strncmp(lhs._name,rhs._name,NAME_LEN+1) < 0)
   return true;
    }
    return false;
  }

The predicate is wrong.

        if (lhs._num < rhs._num)
                return true;
        else if (rhs._num == lsh._num
                && strncmp(lhs._name, rhs._name, NAME_LEN_1) < 0)
                return true;
        else return false;

--
   Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)- Hide quoted text -

- Show quoted text -
 
N

Neelesh

Hi Pete,
        Thanks a lot, that makes sense. Can you tell me in general
what are the steps to come up with a comparator operator like this.
Are there simple steps to follow to derive operator < from opeartor
==. For instance I am working on a more complex class that has
following attributes:
 bool a;
 string b;
 struct {
  int d;
  int e;
  int f;
  }c;
 and test for equality would be
operator==(const MyClass& lhs,const MyClass& rhs){
 if ( lhs.a == true && rhs.a == true) {
   if( lhs.b == rhs.b && ((lhs.c.d == rhs.c.d)&&(lhs.c.e == rhs.c.e) &&
(lhs.c.f == rhs.c.f))
     return true;
 }
 else if(lhs.a == false && rhs.a == false && lhs.b == rhs.b)
  return true;
 return false; // in all other cases}

Coming up operator < in such a case seems bit involved...

Typically, defining "equality" and defining "ordering" are two
different problems, often independent of each other. For example, we
can appropriately define operator== for a complex number class, but we
From my point of view, its unique if operator == above returns false

Not when we are dealing with associative containers. In fact, 23.1.2p
(3) of the C++ standard explicitly says:
"The phrase 'equivalence of keys' means the equivalence relation
imposed by the comparison and not the operator== on keys. That is, two
keys k1 and k2 are considered to be equivalent if for the comparison
object comp, comp(k1, k2) == false && comp(k2, k1) == false."

The associative container "set" isn't a place for storing "distinct"
keys in the sense that those return false for operator==. Rather, its
a place for storing "distinct keys that can be ordered based on
certain strict weak ordering". In other words, "distinctness" is not
defined as a!=b, but it is defined as exactly one of "a<b" or "b<a" .

want it to go into set.
Thanks,
Sunil

Please do not top-post.
 
J

Juha Nieminen

Pete said:
The predicate is wrong.

if (lhs._num < rhs._num)
return true;
else if (rhs._num == lsh._num
&& strncmp(lhs._name, rhs._name, NAME_LEN_1) < 0)
return true;
else return false;

In general, when you perform the comparison using several values (one
of them being the primary comparison value, another the secondary, etc.)
it should be done like:

if(primaryValue != rhs.primaryValue)
return primaryValue < rhs.primaryValue;
else if(secondaryValue != rhs.secondaryValue)
return secondaryValue < rhs.secondaryValue;
else if(tertiaryValue != rhs.tertiaryValue)
return tertiaryValue < rhs.tertiaryValue;
// etc...
else return leastSignificantValue < rhs.leastSignificantValue;

(If the values in question are eg. in an array, this can be more
easily done with a proper loop construct.)
 
C

Christopher Dearlove

Juha Nieminen said:
In general, when you perform the comparison using several values (one
of them being the primary comparison value, another the secondary, etc.)
it should be done like:

if(primaryValue != rhs.primaryValue)
return primaryValue < rhs.primaryValue;
else if(secondaryValue != rhs.secondaryValue)
return secondaryValue < rhs.secondaryValue;
else if(tertiaryValue != rhs.tertiaryValue)
return tertiaryValue < rhs.tertiaryValue;
// etc...
else return leastSignificantValue < rhs.leastSignificantValue;

(If the values in question are eg. in an array, this can be more
easily done with a proper loop construct.)

That uses a mixture of < and !=. That may be fine, but in other
cases (and certainly if writing generic code) you may only have
< available, maybe not even >. (You also don't want to use the
above if != is available but implemented using two uses of <.)

I'm not claiming the following is the best organisation, though
I'd like to see one that's better (other than the loop of course),
especially if it can avoid doubling the number of lines and using
true and false explicitly.

if (primaryValue < rhs.primaryValue)
return true;

else if (rhs.primaryValue < primaryValue)
return false;
else if (secondaryValue < rhs.secondaryValue)
return true;
else if (rhs.secondaryValue < secondaryValue)
return false;
else if (tertiaryValue < rhs.tertiaryValue)
return true;
else if (rhs.tertiaryValue < tertiaryValue)
return false;
// etc...
else
return leastSignificantValue < rhs.leastSignificantValue;

(I actually ran into this problem using an STL algorithm supplied
with I think it was an older Borland compiler. That used != when
it shouldn't - I think however in a case where it should only have
used == rather than only using <. That however costs nothing, or
next to nothing, to get right efficiently.)
 
B

Bart van Ingen Schenau

srp113 said:
Hi Pete,
Thanks a lot, that makes sense. Can you tell me in general
what are the steps to come up with a comparator operator like this.
Are there simple steps to follow to derive operator < from opeartor
==.

First you have to remember that for ordering a bunch of structs on
multiple fields, you MUST specify an order of importance for the fields.
If your order of importance has a "these X fields are equally important
and must be compared" statement, then it is impossible to specify a
less-than operation.
As the order of importance can not be fully derived from an equality
relation, you can also not derive a full less-than relation from
equality. The equality relation may give hints to which fields have to
be checked before others for semantic reasons and which fields can be
ignored in specific cases.

For instance I am working on a more complex class that has
following attributes:
bool a;
string b;
struct {
int d;
int e;
int f;
}c;
and test for equality would be
operator==(const MyClass& lhs,const MyClass& rhs){
if ( lhs.a == true && rhs.a == true) {
if( lhs.b == rhs.b && ((lhs.c.d == rhs.c.d)&&(lhs.c.e == rhs.c.e)
&&
(lhs.c.f == rhs.c.f))
return true;
}
else if(lhs.a == false && rhs.a == false && lhs.b == rhs.b)
return true;
return false; // in all other cases
}
Coming up operator < in such a case seems bit involved... From my
point of view, its unique if operator == above returns false and I
want it to go into set.

But the set also needs to determine *where* to insert that new item.
This the equality relation can't tell.

A sample operator< for this class would be:

bool operator <(const MyClass& lhs, const MyClass& rhs)
{
/* must test a first for semantic reasons */
if (lhs.a != rhs.a)
return lhs.a < rhs.a;

if (lhs.a)
{ /* lhs.a and lhs.b both true */
/* Compare all other fields. IN WHICH ORDER? chosen arbitrarily */
if (lhs.c.d < rhs.c.d)
return true;
else if (rhs.c.d < lhs.c.d)
return false;
else
if (lhs.c.f < rhs.c.f)
return true;
else if (rhs.c.f < lhs.c.f)
return false;
else
if (lhs.b < rhs.b)
return true;
else if (rhs.b < lhs.b)
return false;
else
return lhs.c.e < rhs.c.e;
}
else
{ /* lhs.a and lhs.b both false */
/* Only compare b. Other fields are meaningless */
return lhs.b < rhs.b;
}
}
Thanks,
Sunil

Bart v Ingen Schenau
 
J

Juha Nieminen

Pete said:
if (lhs.a < rhs.a)
return true;
if (rhs.a < lhs.a)
return false;
if (lhs.b < rhs.b)
return true;
if (rhs.b < rhs.a)
return false;
return true;

There's a bug in there.

Besides, wouldn't it be simpler to do:

if (lhs.a < rhs.a)
return true;
if (rhs.a < lhs.a)
return false;
return lhs.b < rhs.b;
 
J

James Kanze

"Juha Nieminen" <[email protected]> wrote in message

[...]
I'm not claiming the following is the best organisation,
though I'd like to see one that's better (other than the loop
of course), especially if it can avoid doubling the number of
lines and using true and false explicitly.
if (primaryValue < rhs.primaryValue)
return true;
else if (rhs.primaryValue < primaryValue)
return false;
else if (secondaryValue < rhs.secondaryValue)
return true;
else if (rhs.secondaryValue < secondaryValue)
return false;
else if (tertiaryValue < rhs.tertiaryValue)
return true;
else if (rhs.tertiaryValue < tertiaryValue)
return false;
// etc...
else
return leastSignificantValue < rhs.leastSignificantValue;

Avoiding true and false is simple. In theory, anyway: the
results aren't always that pleasant. Basically, the rule for
any given element it:

return lhs.element < rhs.element
|| (! (rhs.element < lhs.element) &&
lhs.remainingElements < rhs.remainingElements ) ;

For the last element, it becomes simply:

return lhs.lastElement < rhs.lastElement ;

The problem is that when you expand this into a single function,
the alternating || and && imply that none of the parentheses can
be eliminated (or at least, I've not found a way to do so). So
you end up with some very hairy nesting.

Ideally, there's be some way of generating this using template
metaprogramming with template varargs---you'd give the list of
fields to be compared, and the template would do the rest. In
practice, I'm not sure---if nothing else, it's clear that you'd
have to give the list in the form of pointers to members (since
templates can't use just the name of a member as an argument),
which is a bit painful for starters. But other than that, it
should be workable.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,009
Latest member
GidgetGamb

Latest Threads

Top