Rationale for struct assignment and no struct comparison

Noob

Hello,

I don't understand why assignment of structs is supported,
but comparison is not. It seems to me that the two operations
would be of similar complexity, no?

struct foo { int i; };
int main(void)
{
struct foo s1, s2;
s1.i = 42;
s2 = s1; /* OK */
if (s1 == s2) return 0; /* syntax error */
return 0;
}

$ gcc -Wall -Wextra -std=c89 -pedantic temp.c
temp.c: In function 'main':
temp.c:7: error: invalid operands to binary == (have 'struct foo' and 'struct foo')

What was the rationale for allowing assignment and not comparison?

Assignment might be implemented via memcpy, comparison via memcmp
(although padding may cause major headache).

Regards.
 
Noob

Noob said:
I don't understand why assignment of structs is supported,
but comparison is not. It seems to me that the two operations
would be of similar complexity, no?

struct foo { int i; };
int main(void)
{
struct foo s1, s2;
s1.i = 42;
s2 = s1; /* OK */
if (s1 == s2) return 0; /* syntax error */
return 0;
}

$ gcc -Wall -Wextra -std=c89 -pedantic temp.c
temp.c: In function 'main':
temp.c:7: error: invalid operands to binary == (have 'struct foo' and 'struct foo')

What was the rationale for allowing assignment and not comparison?

Assignment might be implemented via memcpy, comparison via memcmp
(although padding may cause major headache).

Additional question:
After an assignment s2=s1;
is s2 bit-for-bit identical to s1
or does the padding have unspecified values?

Regards.
 
gwowen

I don't understand why assignment of structs is supported,
but comparison is not.
....

(although padding may cause major headache).

Actually, you do understand :)
 
Keith Thompson

Noob said:
I don't understand why assignment of structs is supported,
but comparison is not. It seems to me that the two operations
would be of similar complexity, no?
[...]

What was the rationale for allowing assignment and not comparison?

Assignment might be implemented via memcpy, comparison via memcmp
(although padding may cause major headache).

Assignment can be implemented via (the equivalent of) memcpy.
For small structures, it can be even easier than that; the code for
a structure the same size as int might be implemented by the same
code as an int assignment, likely a single instruction. (Or the
compiler can generate code to copy each member, ignoring padding,
if there's some reason to do so.)
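A minimal sketch of the two ways to copy, for illustration only (struct pair and the function names are made up for this example). It also bears on the additional question above: as far as I can tell, the standard only promises that the members match after an assignment and leaves the padding bytes unspecified, whereas memcpy() copies every byte, padding included.

```c
#include <string.h>

/* Hypothetical struct; a char followed by a long usually leaves a hole. */
struct pair {
    char tag;
    long value;
};

/* Plain assignment: all members are copied. The padding bytes of *dst
   have unspecified values as far as the standard is concerned. */
void copy_by_assignment(struct pair *dst, const struct pair *src)
{
    *dst = *src;
}

/* memcpy(): every byte of the object representation is copied,
   padding included. */
void copy_by_memcpy(struct pair *dst, const struct pair *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Member-wise check, which is all a portable program can rely on
   after a plain assignment. */
int pair_members_equal(const struct pair *a, const struct pair *b)
{
    return a->tag == b->tag && a->value == b->value;
}
```

After either copy, pair_members_equal() reports equality; only the memcpy() version is guaranteed to pass a memcmp() against the source as well.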

For comparison, yes, padding is a big problem. You can easily have
two structure objects all of whose members are equal, but which are
not bitwise identical. Furthermore, bitwise comparison might not
work for floating-point or pointer members. And defining the meaning
of equality for a structure type with a member that's a union would
be another fun task (you could say that the first declared member of
the union is compared, but it's not clear that that would be useful).
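The padding problem can be sketched as follows. The struct and helper names are hypothetical, and the trick of pre-filling the object with a junk byte relies on typical compiler behaviour rather than on anything the standard guarantees.

```c
#include <string.h>

struct padded {
    char c;   /* typically followed by padding so that i is aligned */
    int  i;
};

/* Fill the whole object with a junk byte, then set the members.
   On common implementations the junk survives in the padding bytes,
   though the standard does not promise this. */
void fill(struct padded *p, int junk, char c, int i)
{
    memset(p, junk, sizeof *p);
    p->c = c;
    p->i = i;
}

/* What the programmer usually means by "equal". */
int members_equal(const struct padded *a, const struct padded *b)
{
    return a->c == b->c && a->i == b->i;
}

/* What a naive memcmp()-based == would compute. */
int bytes_equal(const struct padded *a, const struct padded *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

With fill(&a, 0x00, 'x', 42) and fill(&b, 0xFF, 'x', 42), members_equal() reports equality, while bytes_equal() is likely to report a difference whenever sizeof(struct padded) is larger than sizeof(char) + sizeof(int).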

And consider something like this:

struct flex_array {
size_t len;
double data[MAX];
};

How is the compiler to know that only the first len elements of
the data array are meaningful? And what about the struct hack or
flexible array members?
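For a struct like this, a hand-written comparison can take len into account, which a compiler-generated one could not know to do. A sketch, assuming a made-up value for MAX and that len never exceeds it:

```c
#include <stddef.h>

#define MAX 16  /* hypothetical capacity; the post leaves MAX undefined */

struct flex_array {
    size_t len;        /* number of meaningful elements in data */
    double data[MAX];
};

/* Equal when the lengths match and the first len elements match;
   anything stored beyond len is ignored. Assumes len <= MAX. */
int flex_array_equal(const struct flex_array *a, const struct flex_array *b)
{
    size_t i;

    if (a->len != b->len)
        return 0;
    for (i = 0; i < a->len; i++) {
        if (a->data[i] != b->data[i])
            return 0;
    }
    return 1;
}
```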

Struct equality comparison *could* have been defined rigorously,
but it would have been a lot of work with a number of special cases,
the generated code could be large and inefficient, and the result
likely wouldn't match the kind of comparison *you* want to do with
*your* struct type.
 
Ben Bacarisse

Noob said:
I don't understand why assignment of structs is supported,
but comparison is not. It seems to me that the two operations
would be of similar complexity, no?

struct foo { int i; };
int main(void)
{
struct foo s1, s2;
s1.i = 42;
s2 = s1; /* OK */
if (s1 == s2) return 0; /* syntax error */
return 0;
}

$ gcc -Wall -Wextra -std=c89 -pedantic temp.c
temp.c: In function 'main':
temp.c:7: error: invalid operands to binary == (have 'struct foo' and 'struct foo')

What was the rationale for allowing assignment and not comparison?

Assignment might be implemented via memcpy, comparison via memcmp
(although padding may cause major headache).

I think you've answered your own question! Either s1 == s2 can be
permitted to have odd results if two otherwise identical structs have
different bits in the padding bytes (or, indeed, in the padding bits
within the values) or s1 == s2 has to turn into a whole big chunk of
code.

If you favour the first, well, memcmp is not hard to write. If you
favour the second, I suspect it is seen as contrary to the spirit of C
for a small, apparently innocent, operation to turn into a possibly
huge block of machine code.
 
Eric Sosman

Noob said:
Hello,

I don't understand why assignment of structs is supported,
but comparison is not. It seems to me that the two operations
would be of similar complexity, no?

You're exactly right: "No." From the Rationale:

The C89 Committee considered, on more than one
occasion, permitting comparison of structures for
equality. Such proposals foundered on the problem of
holes in structures. A byte-wise comparison of two
structures would require that the holes assuredly be
set to zero so that all holes would compare equal, a
difficult task for automatic or dynamically allocated
variables. The possibility of union-type elements in
a structure raises insuperable problems with this
approach. Without the assurance that all holes were
set to zero, the implementation would have to be
prepared to break a structure comparison into an
arbitrary number of member comparisons; a seemingly
simple expression could thus expand into a substantial
stretch of code, which is contrary to the spirit of C.
 
bartc

Eric Sosman said:
You're exactly right: "No." From the Rationale:

The C89 Committee considered, on more than one
occasion, permitting comparison of structures for
equality. Such proposals foundered on the problem of
holes in structures. A byte-wise comparison of two
structures would require that the holes assuredly be
set to zero so that all holes would compare equal, a
difficult task for automatic or dynamically allocated
variables.

Interesting. I developed a pretty much parallel language to C in the 1980's.

That /did/ have struct compares, and it seemed to work!

However, my structs didn't have 'holes', as I knew nothing about C's method
of automatic padding (I did this manually as needed.)

Even with holes, I can imagine various schemes to compare two structs that
would not require an arbitrarily large amount of code (such as looping
through a bitmap showing the bytes that need to match, or finding out
exactly what the difficulty is in making sure padding bytes are zero).

Eric Sosman said:
The possibility of union-type elements in
a structure raises insuperable problems with this
approach.

How would you compare two union values anyway? Unless the union members are
all the same size, this could just result in a compilation error.

Eric Sosman said:
Without the assurance that all holes were
set to zero, the implementation would have to be
prepared to break a structure comparison into an
arbitrary number of member comparisons; a seemingly
simple expression could thus expand into a substantial
stretch of code, which is contrary to the spirit of C.

But a substantial stretch of user-code (which must be maintained as the
struct changes) is acceptable...

I think having a default struct-compare feature in the language would have
been useful.
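The byte-map scheme mentioned above could look roughly like this. It is only a sketch: struct rec and the function names are invented, and a byte array stands in for a true bitmap to keep the code short. It skips padding bytes but still compares member bytes blindly, so it does not address padding bits inside members or values with more than one representation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record with likely holes after kind and code. */
struct rec {
    char  kind;
    int   count;
    short code;
};

/* Mark the bytes occupied by one member as significant. */
static void mark(unsigned char *mask, size_t offset, size_t size)
{
    memset(mask + offset, 1, size);
}

/* Build the byte map once: 1 for member bytes, 0 for padding. */
void rec_build_mask(unsigned char mask[sizeof(struct rec)])
{
    memset(mask, 0, sizeof(struct rec));
    mark(mask, offsetof(struct rec, kind),  sizeof(char));
    mark(mask, offsetof(struct rec, count), sizeof(int));
    mark(mask, offsetof(struct rec, code),  sizeof(short));
}

/* Compare only the bytes the mask marks as significant. */
int rec_equal_masked(const struct rec *a, const struct rec *b,
                     const unsigned char *mask)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;
    size_t i;

    for (i = 0; i < sizeof(struct rec); i++) {
        if (mask[i] && pa[i] != pb[i])
            return 0;
    }
    return 1;
}
```

The loop is a fixed, small amount of code per struct type, but the mask has to be rebuilt by hand whenever the struct changes.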
 
Keith Thompson

And even without padding bytes, or with an assurance that padding
bytes are set to zero, you might still have to worry about members
for which bitwise equality comparison doesn't work. Is +0 equal
to -0? What about pointers on a system where the same address can
have more than one representation? What about floating-point NaNs,
which aren't equal to themselves?
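The floating-point cases can be checked with a small sketch like the one below. It assumes IEEE 754 doubles, where 0.0/0.0 yields a NaN and +0 and -0 differ in the sign bit; the function names are made up for illustration.

```c
#include <string.h>

/* Numeric comparison, as == on a double member would do. */
int double_numeric_equal(double a, double b)
{
    return a == b;
}

/* Byte-wise comparison of the object representations. */
int double_bytes_equal(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Produce a NaN at run time. Assumes IEEE 754 arithmetic, where
   0.0 / 0.0 gives a quiet NaN; volatile keeps the division from
   being folded away at compile time. */
double make_nan(void)
{
    volatile double zero = 0.0;
    return zero / zero;
}
```

On such an implementation, 0.0 and -0.0 are numerically equal but byte-wise different, and a NaN is byte-wise equal to a copy of itself but numerically unequal, so neither definition of struct equality falls out of the other.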

bartc said:
But a substantial stretch of user-code (which must be maintained as the
struct changes) is acceptable...

I think having a default struct-compare feature in the language would have
been useful.

How often is it really useful?

I've worked on a compiler for a language (Ada) that does require
support for record (struct) equality comparison. In one version
of the compiler, we initialized the entire object to all-bits-zero
(which could impose significant overhead even when no comparison
is ever performed). In a later version, we generated code to
compare each member. And I don't remember ever actually *needing*
to compare two records/structs for equality. Logical equivalence
is very often more complicated than member-by-member equality.
 
Eric Sosman

bartc said:
Interesting. I developed a pretty much parallel language to C in the
1980's.

That /did/ have struct compares, and it seemed to work!

However, my structs didn't have 'holes', as I knew nothing about C's method
of automatic padding (I did this manually as needed.)

Even with holes, I can imagine various schemes to compare two structs that
would not require an arbitrarily large amount of code (such as looping
through a bitmap showing the bytes that need to match, or finding out
exactly what the difficulty is in making sure padding bytes are zero) .

Padding bytes (and padding bits, for bit-fields) mightn't
be the only problem. For example:

struct { char c[100]; }
x = { "Hello" }, y = { "Hello" };
x.c[99] = 'x';
y.c[99] = 'y';
assert (strcmp(x.c, y.c) == 0);
assert (x == y); /* ??? */

The structs x,y are "equal" if their c arrays are thought of
as containing strings, "unequal" if they're thought of as
holding a hundred characters each. A related issue:

char xdata[] = "Hello", ydata[] = "Hello";
struct { char *p; }
x = { xdata }, y = { ydata };
assert (strcmp(x.p, y.p) == 0);
assert (x.p != y.p);
assert (x == y); /* ??? */

It would certainly be possible to define struct equality
as meaning "corresponding elements' values are equal," meaning
that the final assert() in each example would fail. But would
this be useful? Sometimes, perhaps, for "pure value" structs
like `struct point { int x, y; }'. But even for a tiny step
beyond simplicity it seems to me the programmer wants finer
control over which elements do and do not count towards "equal."
In `struct point { int x,y; struct point *next; }', for example,
it's likely that x,y would participate in the test but that the
linked list pointer would not.
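A hand-written comparison for that list node might look like this (point_equal is a hypothetical name). The point is simply that the programmer, not the compiler, decides that next stays out of the test.

```c
#include <stddef.h>

struct point {
    int x, y;
    struct point *next;   /* list linkage, not part of the value */
};

/* Two nodes are "equal" when their coordinates match,
   regardless of where they sit in a list. */
int point_equal(const struct point *a, const struct point *b)
{
    return a->x == b->x && a->y == b->y;
}
```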

bartc said:
How would you compare two union values anyway? Unless the union members are
all the same size, this could just result in a compilation error.

There's trouble even if they're the same size. Just suppose
float and int are the same size (as in a recent thread):

union { float f; int i; } x, y;
x.f = 42;
y.i = 42;
assert (x.f == y.i);
assert (x == y); /* ??? */

The compiler has no way to know which element it should check
for equality, since a union can contain different elements at
different times.
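The usual way out is for the program to record which member is live and to compare accordingly. A sketch with invented names follows; note that treating 42 and 42.0f as unequal when the kinds differ is a design choice made here, not something the language dictates.

```c
enum num_kind { NUM_FLOAT, NUM_INT };

struct number {
    enum num_kind kind;             /* records which member is live */
    union { float f; int i; } u;
};

int number_equal(const struct number *a, const struct number *b)
{
    if (a->kind != b->kind)
        return 0;   /* design choice: different kinds never compare equal */
    switch (a->kind) {
    case NUM_FLOAT: return a->u.f == b->u.f;
    case NUM_INT:   return a->u.i == b->u.i;
    }
    return 0;       /* unknown kind */
}
```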

bartc said:
But a substantial stretch of user-code (which must be maintained as the
struct changes) is acceptable...

Since it gives the programmer something of value -- namely,
a way to control which elements participate in the test, and what
notion of "equality" applies to them -- yes, I think so.

bartc said:
I think having a default struct-compare feature in the language would have
been useful.

For self-contained "value" structs, maybe. For others, I'm
not convinced. But try the experiment for yourself: Look through
a pile of code and find the places where structs are tested for
some kind of "equality." See if you think an automatically-
generated test would have worked instead of the hand-coded kind.
(The experiment isn't perfect, of course, but may be informative.)
 
Antoninus Twink

The structs x,y are "equal" if their c arrays are thought of as
containing strings, "unequal" if they're thought of as holding a
hundred characters each.

Do you conclude that C should permit overloading of the equality
operator (==)?
 
bartc

Keith Thompson said:
How often is it really useful?

I've worked on a compiler for a language (Ada) that does require
support for record (struct) equality comparison. In one version
of the compiler, we initialized the entire object to all-bits-zero
(which could impose significant overhead even when no comparison
is ever performed). In a later version, we generated code to
compare each member.

In another project, I had to compare two records (containing variant fields)
using a recursive method. Structs (as in C) were blindly compared
bitwise/bytewise as I suggested in my other reply.

Keith Thompson said:
And I don't remember ever actually *needing*
to compare two records/structs for equality.

It happens from time to time. In recent code I have a 4-byte composite value
(an array, but it could have been a struct) which I have to compare with
another.

It's not a big deal in C to interpret each composite as an integer, but it
would have been neater to have allowed == directly.
 
bartc

Eric Sosman said:
bartc said:
Even with holes, I can imagine various schemes to compare two structs
that
would not require an arbitrarily large amount of code (such as looping
through a bitmap showing the bytes that need to match, or finding out
exactly what the difficulty is in making sure padding bytes are zero) .

Padding bytes (and padding bits, for bit-fields) mightn't
be the only problem. For example:

struct { char c[100]; }
x = { "Hello" }, y = { "Hello" };
x.c[99] = 'x';
y.c[99] = 'y';
assert (strcmp(x.c, y.c) == 0);
assert (x == y); /* ??? */

The structs x,y are "equal" if their c arrays are thought of
as containing strings, "unequal" if they're thought of as
holding a hundred characters each. A related issue:

char xdata[] = "Hello", ydata[] = "Hello";
struct { char *p; }
x = { xdata }, y = { ydata };
assert (strcmp(x.p, y.p) == 0);
assert (x.p != y.p);
assert (x == y); /* ??? */

It would certainly be possible to define struct equality
as meaning "corresponding elements' values are equal," meaning
that the final assert() in each example would fail. But would
this be useful? Sometimes, perhaps, for "pure value" structs
like `struct point { int x, y; }'.

My idea is just to do a byte-wise comparison, such as memcmp() might do,
with the programmer being aware that the issues you raise mean results
may be wrong if the odd non-padding byte in either struct is not an expected
value.

And many structs I think are simple value types just like that.

(Now you're going to tell me the C standard can't guarantee a byte-wise
comparison between even ordinary values such as ints, floats and pointers)

Eric Sosman said:
But even for a tiny step
beyond simplicity it seems to me the programmer wants finer
control over which elements do and do not count towards "equal."
In `struct point { int x,y; struct point *next; }', for example,
it's likely that x,y would participate in the test but that the
linked list pointer would not.

Possibly you wouldn't use such a struct with a == compare then, unless you
go to the trouble of wrapping the 'payload' in its own struct.

Eric Sosman said:
There's trouble even if they're the same size. Just suppose
float and int are the same size (as in a recent thread):

union { float f; int i; } x, y;
x.f = 42;
y.i = 42;
assert (x.f == y.i);
assert (x == y); /* ??? */

The compiler has no way to know which element it should check
for equality, since a union can contain different elements at
different times.

This means that usually they can only compare equal if both contain the
same type. So 42.0 and 42 will not be equal. This is the same as if:

int i=42;
float f=42.0;
memcmp(&i, &f, sizeof(int));

was used.
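As a compilable sketch of that comparison (the function name is made up, and the result for 42 versus 42.0f assumes the usual two's complement int and IEEE 754 float):

```c
#include <string.h>

/* Returns 1 if the two objects have identical bytes, 0 if they differ,
   and -1 if int and float have different sizes on this implementation. */
int int_float_same_bytes(int i, float f)
{
    if (sizeof i != sizeof f)
        return -1;
    return memcmp(&i, &f, sizeof i) == 0;
}
```

On typical platforms int_float_same_bytes(42, 42.0f) does not return 1, even though 42 == 42.0f is true as a numeric comparison.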
 
Kenny McCormack

The *definition* of the two operations are not even of similar
complexity.

I don't think anyone doubts that it can be done - both in "shallow" mode
and in "deep" mode. I believe .NET does it. But it clearly requires a
Microsoft philosophy (Megabytes of code? No problem.) rather than a C
philosophy.

And, it is also true that it is easy enough to roll your own for this
should the need arise, and you'll probably be happier with one you did
yourself than a built-in one, anyway.
 
Keith Thompson

Noob said:
I don't understand why assignment of structs is supported,
but comparison is not. It seems to me that the two operations
would be of similar complexity, no?

The *definition* of the two operations are not even of similar
complexity.

If you try to use structure comparison, you will constantly be
running into problems:
[snip]
- The fields are rarely in the right order for a > or < comparison.
- Structure assignment can use memcpy(). For a > or < comparison,
you need to know the types involved, as signedness and endianness
messes up comparing ints (or worse, doubles) by looking at the
representation as unsigned char. Little-endian multi-byte integers
will be a problem here.
[...]

I don't think anyone was suggesting that "<" and ">" be supported
for structures. I suppose you could define them to compare members
in order, but that's far less likely to be useful than "==".

I note that operator overloading would allow you to define your own
operators for structures with whatever semantics you like. And if
you wanted to get fancy, you could even have a rule that generates
an implicit "==" operator based on any "==" operators for members.

Since standard C doesn't have operator overloading, you can always
define your own equal() function.
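Such functions compose by hand in much the way an implicit rule would compose them. A small sketch with hypothetical types, where the comparison for the outer struct is built from the comparison for its members:

```c
struct point   { int x, y; };
struct segment { struct point from, to; };

/* Equality for the member type. */
int point_equal(const struct point *a, const struct point *b)
{
    return a->x == b->x && a->y == b->y;
}

/* Equality for the containing type, defined in terms of the
   members' equality function. */
int segment_equal(const struct segment *a, const struct segment *b)
{
    return point_equal(&a->from, &b->from)
        && point_equal(&a->to, &b->to);
}
```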

[I've re-inserted an attribution line for what "Noob" wrote.
Permission to quote this article is granted if and only if any
quoted text is properly attributed.]
 
Willem

bartc wrote:
) It happens from time to time. In recent code I have a 4-byte composite value
) (an array, but it could have been a struct) which I have to compare with
) another.
)
) It's not a big deal in C to interpret each composite as an integer, but it
) would have been neater to have allowed == directly.

It's even simpler to just use memcmp().
And almost as neat. And arguably clearer.
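A byte-wise comparison of a 4-byte composite can be sketched with memcmp() as below. The quad typedef and quad_equal() are invented stand-ins for the array mentioned in the quoted post.

```c
#include <string.h>

/* Hypothetical 4-byte composite, standing in for the array in the post. */
typedef unsigned char quad[4];

/* Arrays of unsigned char have no padding, so memcmp() is a
   faithful equality test here. */
int quad_equal(const quad a, const quad b)
{
    return memcmp(a, b, sizeof(quad)) == 0;
}
```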


SaSW, Willem
 
Eric Sosman

bartc said:
[...]
(Now you're going to tell me the C standard can't guarantee a byte-wise
comparison between even ordinary values such as ints, floats and pointers)

Testing, one, two, three. <Tap tap tap> Is this thing
on? Testing, one -- oh, good. Ahem. "The C standard can't
guarantee a byte-wise comparison between even ordinary values
such as ints, floats, and pointers."

<Applause>

Thank you, thank you, you've been a wonderful audience!
Now, for an encore: "The IEEE floating-point standard can't
guarantee a byte-wise comparison between even ordinary
values such as plus and minus zero (numerically equal but
byte-wise unequal) or between identical NaNs (byte-wise equal
but numerically unequal)."

<Applause. Off-stage, sotto voce: "Damn, but he's a hard
act to follow!">
 
Peter Nilsson

Keith Thompson said:
If s1 and s2 don't have the same type, the assignment
is a constraint violation.

What constraint is violated in the code below?

int s2;
long s1 = 42;
s2 = s1;
 
Keith Thompson

Peter Nilsson said:
What constraint is violated in the code below?

int s2;
long s1 = 42;
s2 = s1;

Sorry, I fumbled the context; I was assuming that s1 and s2 were both
structs.
 
