memcmp for <

rajkumar

I have a struct like

struct MyStruct
{
    int a;
    int b;
    int c;
    bool d;
    bool e;
};

I want to insert such a struct in a map. I understand I can declare the
< operator for such a struct for lexicographical compare like

x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c
.............

or a simple one like this

memcmp(&x,&y,sizeof(MyStruct)) < 0;

This seems to work if I memset the whole sizeof(MyStruct) to zeroes
in the constructor before I assign a, b, c, etc. This should take
care of any padding that the compiler adds.

My question is whether the second approach is portable. If so, do I
really need the memset? Does the standard say anything about
initializing the padding bits?

Raj
 
Victor Bazarov

1) Yes. 2) Yes. 3) They are uninitialised unless your object has static
storage duration. Beware, though, that as soon as your MyStruct ceases
to be a POD (because you added a private section or a virtual function or
something of the sort), use of memset and memcpy on it becomes undefined.

V
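
A minimal sketch of how the POD concern above could be checked at compile time. It assumes C++11 or later, where the single POD notion is split into "trivially copyable" (what memset/memcpy need) and "standard layout" (what the access-specifier rule affects). MixedAccess and WithVirtual are hypothetical variants invented for illustration; they are not from the original post.

```cpp
#include <cassert>
#include <type_traits>

struct MyStruct
{
    int a;
    int b;
    int c;
    bool d;
    bool e;
};

// Hypothetical variant: data members under two different access specifiers.
struct MixedAccess
{
    int a;
    int get_b() const { return b; }
private:
    int b;
};

// Hypothetical variant: a virtual function adds hidden bookkeeping data.
struct WithVirtual
{
    int a;
    virtual ~WithVirtual() {}
};

// Fail the build if MyStruct ever stops being safe to memset/memcpy.
static_assert(std::is_trivially_copyable<MyStruct>::value,
              "MyStruct must stay trivially copyable");
static_assert(std::is_standard_layout<MyStruct>::value,
              "MyStruct must stay standard layout");
```

With guards like these, adding a virtual function to MyStruct would break the build instead of silently turning the byte-level tricks into undefined behaviour.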
 
Malte Starostik

Sorry, can't tell you about the padding bits, but it's still not
portable because of endianness issues:

struct Foo
{
int a;
};

Foo f1 = { 1 };
Foo f2 = { 256 };

On big-endian machines a memcmp() compare will work correctly. On a
little-endian machine with 32-bit ints, f1 will contain the byte
sequence 0x01 0x00 0x00 0x00 (minus padding) and f2 will contain 0x00
0x01 0x00 0x00. memcmp() will report f1 as greater than f2.

Cheers,
Malte
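
The f1/f2 example can be tried out with a sketch like the following. The helper names is_little_endian and bytewise_less are made up for illustration, and the sketch assumes 8-bit bytes, an int of at least 16 bits, and a machine that is purely big- or little-endian.

```cpp
#include <cassert>
#include <cstring>

struct Foo
{
    int a;
};

// Runtime check: true if the lowest-addressed byte of an unsigned int
// holds its least significant bits.
inline bool is_little_endian()
{
    const unsigned int one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Ordering as memcmp sees it: byte by byte, lowest address first.
inline bool bytewise_less(const Foo& x, const Foo& y)
{
    return std::memcmp(&x, &y, sizeof(Foo)) < 0;
}
```

For Foo f1 = { 1 } and Foo f2 = { 256 }, f1.a < f2.a holds everywhere, while bytewise_less(f1, f2) should come out true only where is_little_endian() returns false.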
 
rajkumar

You mentioned something about a private section. Could you elaborate on
how that would change things?

If the struct carried a vtable pointer or had non-POD members, could I
just overload new and memset before I call the constructor?

Raj
 
Ivan Vecerina

rajkumar said:
I have a struct like

struct MyStruct
{
    int a;
    int b;
    int c;
    bool d;
    bool e;
};

I want to insert such a struct in a map. I understand I can declare the
< operator for such a struct for lexicographical compare like

x.a < y.a || !(y.a < x.a) && ( x.b < y.b || !(y.b < x.b) && x.c < y.c

Note that you can also use the following (IMO better) form:

return (x.a!=y.a) ? x.a<y.a
     : (x.b!=y.b) ? x.b<y.b
     : (x.c!=y.c) ? x.c<y.c
     : (x.d!=y.d) ? x.d<y.d : x.e < y.e;

rajkumar said:
or a simple one like this

memcmp(&x,&y,sizeof(MyStruct)) < 0;

This seems to work if I memset and fill the sizeof(MyStruct) with
zeroes in the constructor before I assign a b and c etc. This will take
care of any padding that the compiler adds.

But not of endianness and other binary representation issues.
Really, I don't think that saving a few statements is worth the
loss of portability. Plus the explicit form gives you much more
flexibility. So why bother?
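
For comparison, a sketch of the explicit memberwise form written with std::tie. This assumes C++11 or later, which the thread itself does not; count_distinct is a hypothetical helper added only to show the operator working with std::set.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <tuple>

struct MyStruct
{
    int a;
    int b;
    int c;
    bool d;
    bool e;
};

// Memberwise lexicographical comparison. std::tie builds a tuple of
// references, and the tuple's operator< compares element by element.
inline bool operator<(const MyStruct& x, const MyStruct& y)
{
    return std::tie(x.a, x.b, x.c, x.d, x.e)
         < std::tie(y.a, y.b, y.c, y.d, y.e);
}

// Hypothetical helper: insert three values, two of them equal, and
// report how many distinct elements the set keeps.
inline std::size_t count_distinct()
{
    std::set<MyStruct> s;
    s.insert(MyStruct{1, 2, 3, true, false});
    s.insert(MyStruct{1, 2, 3, true, false}); // equivalent key, not inserted
    s.insert(MyStruct{1, 2, 4, false, false});
    return s.size();
}
```

A new member still has to be added to both std::tie calls by hand, but the ordering no longer depends on padding, endianness, or how bool and floating-point values are represented.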
 
Victor Bazarov

rajkumar said:
You mentioned something about private section. Could you elaborate how
that would change things ?

The layout of an object is only mandated within the same access specifier
section. So, as soon as you introduce private or protected non-static
data members, the struct is not a POD any more. I am not really sure
why that is, but the Standard makes a point of defining POD-struct that
way.

rajkumar said:
If the struct carried a vtable pointer or had NON POD could i just
overload new and memset before i call the constructor ?

I am not sure what you mean by "overload memset", but yes, essentially,
your task would be to gain control over the "padding bytes" by, for
example, eliminating them using compiler-specific means.

Let me ask a rhetorical question, though. If you are prepared to give it
overloaded 'new' and 'memset' (let's suppose it's possible somehow), why
don't you just overload the operator < ?

V
 
Victor Bazarov

Malte said:
[...]
Sorry, can't tell you about the padding bits, but it's still not
portable because of endianness issues:

struct Foo
{
int a;
};

Foo f1 = { 1 };
Foo f2 = { 256 };

On big-endian machines a memcmp() compare will work correctly. On a
little-endian machine with 32-bit ints, f1 will contain the byte
sequence 0x01 0x00 0x00 0x00 (minus padding) and f2 will contain 0x00
0x01 0x00 0x00. memcmp() will report f1 as greater than f2.

But won't it report f1 consistently greater than f2? The purpose of
using memcmp (as I understood it) was to forgo the real operator < and
the memberwise comparison just to see if they were different.

V
 
Andrew Koenig

Victor said:
1) Yes. 2) Yes. 3) They are uninitialised unless your object has static
storage duration.

Beg pardon? Memcmp portable? I don't see why. As a simple example, I
can't think of any place in the standard that requires all equal bool values
to have the same representation. In other words, I don't see anything wrong
with an implementation that stores a byte in a bool and considers zero to be
false and any nonzero value to be true. Under such an implementation,
memcmp might yield unequal for two values that should be considered equal.
 
rajkumar

I don't care about that, as I just want to keep them in a set. If A < B, I
just want to make sure A < B all the time.

Raj
 
rajkumar

Victor said:
Let me ask a rhetorical question, though. If you are prepared to give it
overloaded 'new' and 'memset' (let's suppose it's possible somehow), why
don't you just overload the operator < ?

It's some legacy code. The idea is that if you add a new member, it will
work automatically. If you overload <, you will have to manually update
it for the new member.

Raj
 
Victor Bazarov

Andrew said:
Beg pardon? Memcmp portable? I don't see why. As a simple example, I
can't think of any place in the standard that requires all equal bool values
to have the same representation. In other words, I don't see anything wrong
with an implementation that stores a byte in a bool and considers zero to be
false and any nonzero value to be true. Under such an implementation,
memcmp might yield unequal for two values that should be considered equal.

Beg pardon? How are the internal representations of 'true' or 'false'
relevant in this case? Whether 'true' is 0 or 1, two 'true's will
compare equal and so will two internal representations of 'false'. One
can't really expect two different internal representations from two
different architectures to compare equal, but who cares about that?
The program runs on a virtual machine that cannot have two distinctly
different representations for 'true' during the same run of the program,
can it?

V
 
Victor Bazarov

rajkumar said:
It's some legacy code. The idea being if you add a new member it will
work automatically. If you overload <
you will have to manually update it for the new member

Maintenance is maintenance. You gotta do it right or you shouldn't be
doing it at all. Doing half-a-job is not really going to buy you much.

V
 
Rolf Magnus

Victor said:
Beg pardon? How is the internal representations of 'true' or 'false'
relevant in this case? Whether 'true' is 0 or 1, two 'true's will
compare equal and so will two internal representations of 'false'.

Read Andrew's response again. His point was that this (e.g. true always
comparing equal to true in memcmp) might not be the case.

Victor said:
One can't really expect two different internal representations from two
different architectures to compare equal, but who cares about that?
The program runs on a virtual machine that cannot have two distinctly
different representations for 'true' during the same run of the program,
can it?

What makes you think it can't?
 
Andrew Koenig

Victor said:
Beg pardon? How is the internal representations of 'true' or 'false'
relevant in this case? Whether 'true' is 0 or 1, two 'true's will
compare equal and so will two internal representations of 'false'.

I don't think anything in the standard prohibits two values, both of which
are "true", from having different internal representations. Please read
again what I said in my previous post:

In other words, I don't see anything wrong with an implementation that
stores a byte in a bool and considers zero to be false and any nonzero
value to be true.

On such an implementation, two variables might both have the same value but
different representations. Of course the implementation would have to
change representation appropriately if the value were to be treated as an
integer, but I can see no particular difficulty in doing so.

As a historical note, I am quite certain that C and C++ implementations have
existed under which two pointers can compare equal but nevertheless have
different representations. And I am entirely certain that on most modern
computers, two floating-point values with different representations can
compare equal--namely +0 and -0.
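
The +0 and -0 case is easy to reproduce. A small sketch, assuming double follows IEEE 754 (which std::numeric_limits<double>::is_iec559 reports); same_value and same_bytes are illustrative names, not standard functions.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

// Equality as the language defines it for double.
inline bool same_value(double x, double y)
{
    return x == y;
}

// Equality as memcmp sees it: identical object representations.
inline bool same_bytes(double x, double y)
{
    return std::memcmp(&x, &y, sizeof(double)) == 0;
}
```

On an IEEE 754 implementation, same_value(0.0, -0.0) is true while same_bytes(0.0, -0.0) is false, because the two zeros differ in the sign bit.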
 
Victor Bazarov

I wasn't paying attention apparently. Sorry.
On a historical note, was there ever an implementation that did that?

V
 
Andrew Koenig

Victor said:
I wasn't paying attention apparently. Sorry.
On a historical note, was there ever an implementation that did that?

Not to my knowledge for bool. But definitely for pointers and
floating-point values.

Then there's this issue:

struct X { char a; int b; };

void foo()
{
    X x1 = { '?', 42 };
    X x2 = x1;
    // ...
}

If there's padding between X::a and X::b, I don't think that the
implementation is obligated to copy that padding. In other words, I don't
think there's any guarantee that memcmp will show x1 and x2 as being equal
if executed at the comment.
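
Whether a copy preserves padding cannot be shown portably, but the underlying hazard can: two objects whose members are all equal may still differ in their padding bytes. The sketch below forces that situation by writing to the first padding byte through unsigned char*. All helper names are hypothetical, and the experiment reports false on a platform where X has no padding before b.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct X
{
    char a;
    int b;
};

// Number of bytes between the end of X::a and the start of X::b.
inline std::size_t padding_before_b()
{
    return offsetof(X, b) - sizeof(char);
}

// Comparison that looks at the members only and ignores padding.
inline bool memberwise_equal(const X& l, const X& r)
{
    return l.a == r.a && l.b == r.b;
}

// Hypothetical experiment: equal members, deliberately different padding.
inline bool bytes_differ_despite_equal_members()
{
    X x1, x2;
    std::memset(&x1, 0, sizeof x1);
    std::memset(&x2, 0, sizeof x2);
    x1.a = x2.a = '?';
    x1.b = x2.b = 42;
    if (padding_before_b() == 0)
        return false; // no padding here, nothing to demonstrate
    // The byte right after 'a' is padding; set it differently in each object.
    reinterpret_cast<unsigned char*>(&x1)[sizeof(char)] = 0x00;
    reinterpret_cast<unsigned char*>(&x2)[sizeof(char)] = 0xFF;
    return memberwise_equal(x1, x2)
        && std::memcmp(&x1, &x2, sizeof(X)) != 0;
}
```

On a typical implementation with a 4-byte int aligned to 4 bytes, padding_before_b() is 3 and the experiment returns true.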
 
Jack Klein

Actually, it is extremely non-portable, and error-prone as well. As
others have pointed out, endianness can be a killer. If int is four
octet-sized bytes and little-endian, as on Intel and others, consider
x.a = 256 and y.a = 1. Then they begin with the byte sequences:

x 0x00 0x01 0x00 0x00 ...
y 0x01 0x00 0x00 0x00 ...

So which one will memcmp() find greater?

Also there are real widely used compilers where padding can certainly
trip you up.

GNU ports for x86, for example, use the Intel 80-bit extended
precision real format for long double, and sizeof(long double) is 12,
so they always start aligned to a 4-byte address. The format itself
occupies only 10 bytes, which leaves two bytes of padding.

You can assign two long doubles the same value, then, using a union or
pointer punning, change the final two bytes of one of them. They will
still compare as equal with ==, but not with memcmp().
 
