Copying a struct to a larger struct?

hermes_917

I want to use memcpy to copy the contents of one struct to another
which is a superset of the original struct (the second struct has extra
members at the end). I wrote a small program to test this, and it
seems to work fine, but are there any cases where doing something like
this could cause any problems?

Here's the small program I wrote to test this:

#include <stdio.h>
#include <string.h>   /* strcpy, memcpy */

int main(void)
{
    struct simplestruct
    {
        int i;
        char str[8];
    };

    struct extendedstruct
    {
        int i;
        char str[8];
        int j;
    };

    struct simplestruct foostruct;
    struct extendedstruct barstruct;

    /* Populate the members of the simplestruct instance */
    foostruct.i = 7;
    strcpy(foostruct.str, "Test");

    /* Copy the contents of the simplestruct instance to the
       extendedstruct instance */
    memcpy(&barstruct, &foostruct, sizeof(foostruct));

    /* Populate remaining member of the extendedstruct instance */
    barstruct.j = 13;

    /* Print values of members of the extendedstruct instance */
    printf("i\t%d\nstr\t%s\nj\t%d\n", barstruct.i, barstruct.str,
           barstruct.j);

    return 0;
}

Thanks in advance for any advice.
 
tedu

struct simplestruct
{
int i;
char str[8];
};

struct extendedstruct
{
int i;
char str[8];
int j;
};

A lower-maintenance, less error-prone way to do this is:
struct simplestruct { ... };
struct extendedstruct { struct simplestruct s; ... };
 
Michael Mair

It is possible that sizeof foostruct == sizeof barstruct;
take another example:

struct simplestruct
{
    int i;
    char str[7];
};

struct extendedstruct
{
    int i;
    char str[7];
    unsigned char j;
};

On typical implementations with 8-bit char, 16-bit or 32-bit int, and
minimal padding, it is now probable that sizeof foostruct == sizeof
barstruct, so copying sizeof foostruct bytes writes the padding of
simplestruct over the useful j member of extendedstruct.
So, either you memcpy() (offsetof(struct simplestruct, str) + sizeof
foostruct.str) bytes (the offsetof macro comes from <stddef.h>) or
you follow the hint given in another reply, namely
struct extendedstruct
{
    struct simplestruct mySimple;
    unsigned char j;
};
Then, you do not even need memcpy():
    barstruct.mySimple = foostruct;
suffices.
(In addition, you can abuse the address of barstruct as a
struct simplestruct*, if necessary -- but I have not said that ;-))
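For the first option, here is a minimal sketch of the offsetof-bounded copy, using the str[7] / unsigned char j variant. The helper name copy_common and the COMMON_BYTES macro are made up for illustration. The copy stops at the end of the last shared member, so whatever tail padding struct simplestruct has never reaches j. It still assumes that i and str sit at the same offsets in both structs, which the assert checks rather than takes for granted.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

struct simplestruct {
    int i;
    char str[7];
};

struct extendedstruct {
    int i;
    char str[7];
    unsigned char j;
};

/* Bytes from the start of the struct to the end of its last shared
   member; tail padding is deliberately left out. */
#define COMMON_BYTES \
    (offsetof(struct simplestruct, str) + \
     sizeof(((struct simplestruct *)0)->str))

/* Hypothetical helper: copy only the shared members from src to dst. */
static void copy_common(struct extendedstruct *dst,
                        const struct simplestruct *src)
{
    /* Assumption, checked at run time: the shared members are laid
       out identically in both structs. */
    assert(offsetof(struct simplestruct, i) ==
           offsetof(struct extendedstruct, i));
    assert(offsetof(struct simplestruct, str) ==
           offsetof(struct extendedstruct, str));
    memcpy(dst, src, COMMON_BYTES);
}
```

If bar.j is set before calling copy_common(&bar, &foo), it should still hold its value afterwards, even on an implementation where sizeof foo == sizeof bar.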


Cheers
Michael
 
Barry Schwarz

I want to use memcpy to copy the contents of one struct to another
which is a superset of the original struct (the second struct has extra
members at the end). I wrote a small program to test this, and it
seems to work fine, but are there any cases where doing something like
this could cause any problems?

There is no guarantee that the padding in the common part of the two
structs is the same. It is also possible for the shorter struct to
have padding at the end but in the longer struct there could be an
actual member at the same offset.


 
CBFalconer

I want to use memcpy to copy the contents of one struct to another
which is a superset of the original struct (the second struct has
extra members at the end). I wrote a small program to test this,
and it seems to work fine, but are there any cases where doing
something like this could cause any problems?

Why do snaky things in the first place? Just write down what you
want:

struct smaller {
....
} smallguy;

struct bigger {
struct smaller small;
....
} bigguy;

....

bigguy.small = smallguy;

and I find it hard to imagine what could go wrong.
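Spelled out with the members from the original question, the embedding approach might look like the sketch below. The helper names extend and as_smaller are invented for illustration. The second helper relies on the rule that a pointer to a struct, suitably converted, points to its first member.

```c
#include <assert.h>
#include <string.h>

struct smaller {
    int i;
    char str[8];
};

struct bigger {
    struct smaller small;   /* shared part, embedded as the first member */
    int j;
};

/* Hypothetical helper: build a struct bigger from a struct smaller. */
static struct bigger extend(struct smaller s, int j)
{
    struct bigger b;
    b.small = s;   /* plain struct assignment: no memcpy, no size arithmetic */
    b.j = j;
    return b;
}

/* Hypothetical helper: view a struct bigger as its embedded struct smaller. */
static const struct smaller *as_smaller(const struct bigger *b)
{
    /* The first member starts at the address of the struct itself. */
    assert((const void *)b == (const void *)&b->small);
    return &b->small;
}
```

Because the compiler does the copy, any padding inside struct smaller is its problem and not the programmer's.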
 
Alexei A. Frounze

Barry Schwarz said:
On 27 Jul 2005 12:32:08 -0700, "(e-mail address removed)" ....
There is no guarantee that the padding in the common part of the two
structs is the same.

If my memory serves me, the standard guarantees that. But, I'd prefer to turn
that common part into a type used inside both structs. memcpy() will be able
to do bad things silently...

Alex
 
Netocrat

If my memory serves me, the standard guarantees that.

Perhaps you aren't using error correction on your memory. :)

Barry is right, the standard doesn't guarantee it. The alignment of
members within a struct is "implementation-defined" (I can't imagine why
in practice a compiler would not pad them identically, but it isn't
prohibited from padding differently).

The only guarantees about member positions in structs are that no hole
will occur at the beginning and that members will occupy increasing
storage addresses.

Padding also may be added at the end to ensure correct alignment for
arrays of the struct (but this is again at the implementation's discretion
- i.e. not of predictable / guaranteed / repeatable size).

But, I'd prefer to turn
that common part into a type used inside both structs. memcpy() will be able
to do bad things silently...

memcpy() may do bad things if the common part is not padded identically,
as well as if the smaller struct has padding at the end in space at which
the larger struct has a member.

You're right that your preferred solution is safe though.
 
Keith Thompson

Netocrat said:
Perhaps you aren't using error correction on your memory. :)

Barry is right, the standard doesn't guarantee it. The alignment of
members within a struct is "implementation-defined" (I can't imagine why
in practice a compiler would not pad them identically, but it isn't
prohibited from padding differently).

The only guarantees about member positions in structs are that no hole
will occur at the beginning and that members will occupy increasing
storage addresses.

But there is an additional guarantee. C99 6.5.2.3p5 says:

One special guarantee is made in order to simplify the use of
unions: if a union contains several structures that share a common
initial sequence (see below), and if the union object currently
contains one of these structures, it is permitted to inspect the
common initial part of any of them anywhere that a declaration of
the complete type of the union is visible. Two structures share a
common initial sequence if corresponding members have compatible
types (and, for bit-fields, the same widths) for a sequence of one
or more initial members.

There's no explicit guarantee *unless* the two structures are members
of the same union, but it's hard to imagine an implementation other
than the DS9K meeting this requirement for union members without
meeting it for all structures with common initial sequences.

Padding also may be added at the end to ensure correct alignment for
arrays of the struct (but this is again at the implementation's discretion
- i.e. not of predictable / guaranteed / repeatable size).

memcpy() may do bad things if the common part is not padded identically,
as well as if the smaller struct has padding at the end in space at which
the larger struct has a member.

Yes, that can be a problem even given the additional guarantee above.
For example, given:

struct foo {
    int x;
    char y;
};

struct foobar {
    int x;
    char y;
    char z;
};

it's effectively guaranteed that the x and y members will have the
same size and offset in both structures, but the z member of struct
foobar could very well occupy some of the padding of struct foo.

You're right that your preferred solution is safe though.

Agreed.
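To check the struct foo / struct foobar case on a particular implementation, something like the following sketch can help. The function names are invented for illustration. It compares the offset of z with the full size of struct foo. On common ABIs with a 4-byte, 4-byte-aligned int the overlap is likely to be reported, but that is an observation about those ABIs, not something the standard promises either way.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

struct foo {
    int x;
    char y;
};

struct foobar {
    int x;
    char y;
    char z;
};

/* Returns 1 if z in struct foobar lies within the first
   sizeof(struct foo) bytes, i.e. where struct foo can only have
   tail padding; returns 0 otherwise. */
static int z_overlaps_foo_padding(void)
{
    return offsetof(struct foobar, z) < sizeof(struct foo);
}

/* The risky copy: all of struct foo, tail padding included. */
static void copy_whole(struct foobar *dst, const struct foo *src)
{
    assert(sizeof *src <= sizeof *dst);
    memcpy(dst, src, sizeof *src);
}
```

When z_overlaps_foo_padding() returns 1, copy_whole() replaces z with whatever bytes happened to be in the padding of the source struct.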
 
Alexei A. Frounze

....
But there is an additional guarantee. C99 6.5.2.3p5 says:

One special guarantee is made in order to simplify the use of
unions: if a union contains several structures that share a common
initial sequence (see below), and if the union object currently
contains one of these structures, it is permitted to inspect the
common initial part of any of them anywhere that a declaration of
the complete type of the union is visible. Two structures share a
common initial sequence if corresponding members have compatible
types (and, for bit-fields, the same widths) for a sequence of one
or more initial members.

There's no explicit guarantee *unless* the two structures are members
of the same union, but it's hard to imagine an implementation other
than the DS9K meeting this requirement for union members without
meeting it for all structures with common initial sequences.

That was what I was referring to. It's really hard to imagine it would work
with unions only.
I think this would fail only if the compiler were so broken that it would
even treat the same structure/type definition from a .h header file
differently when it is included by several different .c files. Say the
files are h.h, a.c and b.c. Admittedly, the compiler need not remember
how it compiled a.c when compiling b.c. But if it not only failed to
remember but also treated the contents of h.h differently in a.c and
b.c, it would be useless for compiling any code...

Yes, that can be a problem even given the additional guarantee above.
For example, given:

struct foo {
    int x;
    char y;
};

struct foobar {
    int x;
    char y;
    char z;
};

it's effectively guaranteed that the x and y members will have the
same size and offset in both structures, but the z member of struct
foobar could very well occupy some of the padding of struct foo.

That's fine with z. But still, isn't what you're saying about x and y in
both structures enough to memcpy between them? Maybe it should be (I'm
writing pseudo code here):
memcpy (&foobar, &foo, offsetof(foo.y)+1)
instead of:
memcpy (&foobar, &foo, sizeof(foo))
?

Alex
 
Michael Mair

Alexei said:
That's fine with z. But still, isn't what you're saying about x and y in
both structures enough to memcpy between them? Maybe it should be (I'm
writing pseudo code here):
memcpy (&foobar, &foo, offsetof(foo.y)+1)
instead of:
memcpy (&foobar, &foo, sizeof(foo))
?

The offsetof macro takes a type and a member name, so that should be
offsetof(struct foo, y); see my reply to the OP.

Cheers
Michael
 
Chris Croughton

I think this would fail only if the compiler were so broken that it would
even treat the same structure/type definition from a .h header file
differently when it is included by several different .c files. Say the
files are h.h, a.c and b.c. Admittedly, the compiler need not remember
how it compiled a.c when compiling b.c. But if it not only failed to
remember but also treated the contents of h.h differently in a.c and
b.c, it would be useless for compiling any code...

No, that's not necessarily broken. In fact it can happen easily if the
compiler is invoked with different options for each file (easy to do in
some build systems, or if some of the code is a separate library).

However, provided you haven't done something silly like that (which will
break all structures passed between the translation units) you are
right. If something is in a header file in one translation unit then it
must be compiled the same in all other translation units otherwise the
system is broken.

That's fine with z. But still, isn't what you're saying about x and y in
both structures enough to memcpy between them? Maybe it should be (I'm
writing pseudo code here):
memcpy (&foobar, &foo, offsetof(foo.y)+1)
instead of:
memcpy (&foobar, &foo, sizeof(foo))
?

Yes, you could, although the +1 should be +sizeof(foo.y) to be general
(in your case y is a char, but if it were anything else then it could be
a different size). I would prefer (in C99 supporting compilers) to use
an extra variable, preferably zero length:

struct Common
{
    int x;
    char y;
    /*...*/
    char endOfCommon[];
};

struct A
{
    int x;
    char y;
    /*...*/
};

etc. And then do
...
memcpy(&a, &b, offsetof(struct Common, endOfCommon));

If your compiler won't take that syntax then make endOfCommon a char (it
won't matter anyway since you are just using it as a marker and won't
actually declare a variable of type struct Common so it won't take up
any real space). That is effectively the "struct hack", you would just
be doing it with declared variables rather than the more common malloc'd
memory. I would bury the memcpy inside a macro or function to avoid
people mistyping it, COPY_COMMON(a, b) for instance. I would also be
tempted to put the common declarations into a macro so that I always got
those right, so it would come out something like:

#define COMMON int x; char y; /*...*/

struct COMMON_s
{
    COMMON
    char end_of_COMMON;
};

#define COPY_COMMON(a,b) \
    memcpy((a), (b), offsetof(struct COMMON_s, end_of_COMMON))

/*...*/

struct A
{
    COMMON
    int astuff;
};

struct B
{
    COMMON
    void *bstuff;
};

/*...*/

void funct(struct A *aPar)
{
    struct B bbb;
    COPY_COMMON(&bbb, aPar);
    /*...*/
}

Yes, it might fail on a DS9000 which was deliberately trying to break
it, but I can't believe that any real compiler would do it (not least
because so much real world code would break, anyone using such a
compiler would throw it back on Quality of Implementation grounds, the
DS9000 is not a useful production compiler).

Chris C
 
Netocrat

No, that's not necessarily broken. In fact it can happen easily if the
compiler is invoked with different options for each file (easy to do in
some build systems, or if some of the code is a separate library).

Only if those options don't include invoking standards-compliant mode or
we're being really pedantic...

As Keith pointed out the standard provides a virtual guarantee that the
common members must be padded in the same way - it's not an actual
guarantee but a compiler writer would have to go out of their way to meet
the requirements for structs with common initial members to be padded
identically when part of a union but not in the general case.

Yes, you could, although the +1 should be +sizeof(foo.y) to be general
(in your case y is a char, but if it were anything else then it could be
a different size). I would prefer (in C99 supporting compilers) to use
an extra variable, preferably zero length:

struct Common
{
int x;
char y;
/*...*/
char endOfCommon[];
};

struct A
{
int x;
char y;
/*...*/
};

etc. And then do
...
memcpy(&a, &b, offsetof(struct Common, endOfCommon));

You're forgetting something here...

What if there's padding added before endOfCommon?

<snip rest of example>
 
Chris Croughton

Only if those options don't include invoking standards-compliant mode or
we're being really pedantic...

No, the options can be perfectly standards compliant but only if they
are used consistently. An option which says that structures are packed,
for instance, would be perfectly compliant unless you missed it off
somewhere. But that would cause all structures passed from one to the
other to fail.

As Keith pointed out the standard provides a virtual guarantee that the
common members must be padded in the same way - it's not an actual
guarantee but a compiler writer would have to go out of their way to meet
the requirements for structs with common initial members to be padded
identically when part of a union but not in the general case.

And what about the case when they are accessed via a pointer to the
union? Yugh. I wouldn't want to try to write such a compiler...

You're forgetting something here...

What if there's padding added before endOfCommon?

Hmm, you mean if the compiler puts more padding before a char than
before an int, say? Is it allowed to do that, since a char is by
definition the smallest type? Is the padding allowed to be more than
that needed to satisfy the alignment? I think that comes back to the
Quality of Implementation argument, a compiler which adds more padding
than it needs won't last long in the wild. Think of it as evolution in
action...
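For anyone who would rather check than trust evolution, the marker approach can be guarded with a couple of assertions. This is only a sketch: marker_is_tight, COMMON_SIZE and copy_common are names invented here, and the /*...*/ members from the earlier example are left out so that it compiles. It confirms that the marker starts right where y ends and that the copied range stops short of the first non-shared member in both structs.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

#define COMMON int x; char y;

struct COMMON_s {
    COMMON
    char end_of_COMMON;   /* marker only, never used as data */
};

struct A { COMMON int astuff; };
struct B { COMMON void *bstuff; };

/* Bytes from the start of the struct up to the marker. */
#define COMMON_SIZE offsetof(struct COMMON_s, end_of_COMMON)

/* Returns 1 if no padding sits between y and the marker. */
static int marker_is_tight(void)
{
    return COMMON_SIZE ==
           offsetof(struct COMMON_s, y) + sizeof(((struct COMMON_s *)0)->y);
}

/* Hypothetical checked version of COPY_COMMON for struct A to struct B. */
static void copy_common(struct B *dst, const struct A *src)
{
    assert(marker_is_tight());
    /* The copied range must not reach the first non-shared member. */
    assert(COMMON_SIZE <= offsetof(struct A, astuff));
    assert(COMMON_SIZE <= offsetof(struct B, bstuff));
    memcpy(dst, src, COMMON_SIZE);
}
```

On an implementation that did pad before the marker, the first assert would fire instead of the copy silently spilling into astuff or bstuff.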

Chris C
 
Eric Sosman

Chris said:
No, the options can be perfectly standards compliant but only if they
are used consistently. An option which says that structures are packed,
for instance, would be perfectly compliant unless you missed it off
somewhere. But that would cause all structures passed from one to the
other to fail.

Saw that happen once, in a vendor-supplied header
no less. Paraphrasing from dim memory:

#pragma push(_struct_packing_mode)
#define _struct_packing_mode 1
struct something { ... };
#pragma pop(struct_packing_mode)

The effect was that `struct something' was laid out
consistently and correctly in all translation units,
but that other structs were laid out inconsistently
depending on whether their declarations came before
or after the header that contained this sequence.
 
Chris Croughton

Saw that happen once, in a vendor-supplied header
no less. Paraphrasing from dim memory:

#pragma push(_struct_packing_mode)
#define _struct_packing_mode 1
struct something { ... };
#pragma pop(struct_packing_mode)

Let me guess, a Richmond WA based company's media APIs? That's where
I've found similar...

The effect was that `struct something' was laid out
consistently and correctly in all translation units,
but that other structs were laid out inconsistently
depending on whether their declarations came before
or after the header that contained this sequence.

And inconsistent from ones which didn't include the header at all. Yup.
Easily done with implementation options. Borland's compilers did it a
lot, because you could (still can) turn 'command line' parameters on and
off within source files and it's very easy to forget to restore the
default in a header...

GCC's method of putting 'attributes' only on the specific object is a
lot more manageable in that way (still non-standard, but at least it
rarely breaks things outside the specific cases where it's used).
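A small sketch of the per-type approach, assuming GCC or Clang: the __attribute__((packed)) spelling is a compiler extension, not standard C, and other compilers may reject it. The struct names and check_layout are made up for illustration. The point is that only the tagged struct changes layout; structs declared before and after it are laid out as usual because no global packing mode was touched.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Ordinary layout: the implementation may pad after c to align n. */
struct plain_before {
    char c;
    int n;
};

/* GCC/Clang extension: pack this one struct only. */
struct packed_one {
    char c;
    int n;
} __attribute__((packed));

/* Declared afterwards, with nothing to "pop" or restore. */
struct plain_after {
    char c;
    int n;
};

/* Hypothetical self-check of the layouts described above. */
static void check_layout(void)
{
    /* The packed struct has no padding at all. */
    assert(sizeof(struct packed_one) == sizeof(char) + sizeof(int));
    /* Its neighbours are laid out the same as each other. */
    assert(sizeof(struct plain_before) == sizeof(struct plain_after));
    assert(offsetof(struct plain_before, n) ==
           offsetof(struct plain_after, n));
}
```

Compare this with the push/pop pragma sequence quoted earlier, where a misspelled pop left the packing mode switched on for everything that followed.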

Chris C
 
Ben Pfaff

Eric Sosman said:
#pragma push(_struct_packing_mode)
#define _struct_packing_mode 1
struct something { ... };
#pragma pop(struct_packing_mode)

The effect was that `struct something' was laid out
consistently and correctly in all translation units,
but that other structs were laid out inconsistently
depending on whether their declarations came before
or after the header that contained this sequence.

I've seen this problem avoided by putting the needed prologue and
epilogue in separate header files and then #include'ing them
instead of putting them inline. If you screw up the spelling of
a header file name, you get an error message.
 
Michael Wojcik

No, the options can be perfectly standards compliant but only if they
are used consistently. An option which says that structures are packed,
for instance, would be perfectly compliant unless you missed it off
somewhere. But that would cause all structures passed from one to the
other to fail.

Compiling two source files with two different sets of options is,
by definition, compiling them with two different implementations:

[#1] implementation
a particular set of software, running in a particular
translation environment under particular control options
(n869 3.10)

Since structure padding is implementation-defined, the situation you
invoke is moot. When compiling two source files using the same
implementation, common initial members of structure types *must* use
the same padding unless the implementation can determine that they
never appear together in a union. (If you did ever run into such a
perverse implementation, you could easily create such a union to
force this behavior, of course.)
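As a sketch of what creating such a union looks like with the structs from the original question: the union tag either and the helpers common_i and extend are invented for illustration. With the union declared, reading the common initial part through either member is covered by the common-initial-sequence clause of C99 6.5.2.3p5.

```c
#include <assert.h>
#include <string.h>

struct simplestruct { int i; char str[8]; };
struct extendedstruct { int i; char str[8]; int j; };

/* Declaring both types in one union brings the common initial
   sequence rule into play for i and str. */
union either {
    struct simplestruct simple;
    struct extendedstruct extended;
};

/* Hypothetical helper: read i whichever member the union holds. */
static int common_i(const union either *u)
{
    assert(u != NULL);
    return u->simple.i;
}

/* Hypothetical helper: build an extendedstruct from a simplestruct by
   inspecting the common initial part through the union. */
static struct extendedstruct extend(const struct simplestruct *s, int j)
{
    union either u;
    struct extendedstruct e;

    u.simple = *s;                /* the union now contains a simplestruct */
    e.i = u.extended.i;           /* permitted: common initial part */
    memcpy(e.str, u.extended.str, sizeof e.str);
    e.j = j;
    return e;
}
```

Copying member by member like this never touches padding, so it sidesteps the tail-padding problem discussed earlier in the thread as well.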

When compiling under two different implementations, all bets are off;
the clause does not apply.
 
Eric Sosman

Chris said:
Let me guess, a Richmond WA based company's media APIs? That's where
I've found similar...

No; the header that fouled up my project came from a
company based in Maynard MA.

That company no longer exists, and neither does the
company that bought their remnants -- and the company that
in turn bought the buyer has fired its CEO and is laying
off more than fourteen thousand other employees. The
consequences of Undefined Behavior truly can be dire ...
 
