Can a C compiler do this - <related to Padding in Structures>?

AjayB

Hi All,

Sorry for this silly question. But all the google answers I got were
vague and could not make up my mind.
My question is can the padding in structure vary between different
"instances" of that same structure??

For example I have a structure say

struct ABC{
int i;
char c;
flaot f;
double d;
int *p_i ;
};

Now lets say in one part of the source file i have an object
v_instance1:

<<file1.c>>>
struct ABC v_instance1 ;
....
....
some code...
....

and another object v_instance2:
<<file2.c>>
struct ABC v_instance2;
....
....
some code...
....

well now sizeof(v_instance1) always equal to sizeof(v_instance2) with
or without padding.
My "silly" doubts:

1)But is it possible that a compiler does the internal padding for
v_instance1 differently from v_instance2 and still satisfy the
sizeof(v_instance1)==sizeof(v_instance2) ??

IOW is the padding between elements of v_instance1 is exactly same as
padding between elements of v_instance2 such that my custom funtions
similar to memcpy() or memset() don't messup anything later(as the
compiler is not aware of what i intend to do unlike stdlib
functions)..?

Which paragraph of the C standard mentions this explicitly?


2)Also after doing, v_instance1 = v_instance2 , memcmp() can still
return Non-Zero right?
anyway to make sure the copy is done in such a way that memcmp()
return 0 and never go into undefined behavior

3)what is the best portable way to copy v_instance1 into v_instance2
(assume pointer members exist) if i plan to use memcmp() and memset()
later on those two variables ,without leading to undefined behaviour
or traps??

4)if v_instance1 and v_instance2 are not initialized, does expr
v_instance1=v_instance2 give undefined behavior?? for example let int
be member of the struct ,then while copying uninitialized int between
two instances can it not generate trap?? so then
v_instance1=v_instance2 is not a good thing todo always?

Thanks in advance..
Regards
Ajay
 
Tom St Denis

1)But is it possible that a compiler does the internal padding for
v_instance1 differently from v_instance2 and still satisfy the
sizeof(v_instance1)==sizeof(v_instance2) ??

The sizeof will always be the same for the given compiler options.
*Alignment* might be different depending on where in your code you
declare variables of that type. In the typical .bss section as a
global it might be aligned on a 32-bit word, whereas on the stack [as
an auto variable] it might be aligned on 64 or 128 bits, for instance.

IOW is the padding between elements of v_instance1 is exactly same as
padding between elements of v_instance2 such that my custom funtions
similar to memcpy() or memset() don't messup anything later(as the
compiler is not aware of what i intend to do unlike stdlib
functions)..?

They should be the same, given the same compiler options. Note that you
can assign structures too: struct foo bar1, bar2; bar1 = bar2; sort of
thing.

Which paragraph of the C standard mentions this explicitly?

Read the spec yourself?
2)Also after doing, v_instance1 = v_instance2 , memcmp() can still
return Non-Zero right?
anyway to make sure the copy is done in such a way that memcmp()
return 0 and never go into undefined behavior

Typically if you memcpy() it then memcmp will work, but the more
portable solution is to write a function to compare your structs, e.g.
int foocmp(foo *one, foo *two), and drop calls to foocmp() wherever
you need to compare the two.

3)what is the best portable way to copy v_instance1 into v_instance2
(assume pointer members exist) if i plan to use memcmp() and memset()
later on those two variables ,without leading to undefined behaviour
or traps??

You can memcpy() it. If foo1 is a valid instance then memcpy(&foo2,
&foo1, sizeof foo1) is fine and foo2 will memcmp equal to foo1.

4)if v_instance1 and v_instance2 are not initialized, does expr
v_instance1=v_instance2 give undefined behavior?? for example let int
be member of the struct ,then while copying uninitialized int between
two instances can it not generate trap?? so then
v_instance1=v_instance2 is not a good thing todo always?

? If you fully initialize the structure, assignment is always valid.
Your question is like asking if this causes UB:

int a, b;
a = b;

Of course it does, you haven't initialized b yet.

Tom
 
AjayB

1)But is it possible that a compiler does the internal padding for
v_instance1 differently from v_instance2 and still satisfy the
sizeof(v_instance1)==sizeof(v_instance2) ??

The sizeof will always be the same for the given compiler options.
*alignment* might be different depending on where in your code you
declare variables of that type.  in the typical .bss section as a
global it might be aligned on a 32-bit word whereas on the stack [as
an auto variable] it might be 64 or 128-bit for instance.

Thanks ,That clears some things up. Sorry for being naive, can you
provide links where I can read more about this.
They should be given the same compiler options.  Note that you can
assign structures too.  struct foo bar1, bar2; bar1 = bar2; sort of
thing.

I know, thanks for clearing doubts.
Read the spec yourself?

will do now..
Typically if you memcpy() it then memcmp will work, but the more
portable solution is to write a function to compare your structs then
you can do int foocmp(foo *one, foo *two) and drop calls to foocmp()
wherever you need to compare the two.

That is what am doing but desired to know any better alternatives.
You can memcpy() it, if foo1 is a valid instance then memcpy(foo2,
foo1, sizeof foo1) is fine and foo2 will memcmp with foo1.


? If you full initialize the structure assignment is always valid.
Your question is like asking if this causes UB

int a, b;
a = b;

Of course it does, you haven't initialized b yet.

oops..yeah of-course...i did not see it that way.

Thanks a lot

Ajay
 
Ben Bacarisse

AjayB said:
Sorry for this silly question. But all the google answers I got were
vague and could not make up my mind.
My question is can the padding in structure vary between different
"instances" of that same structure??

For example I have a structure say

struct ABC{
int i;
char c;
flaot f;
double d;
int *p_i ;
};
s/flaot/float/

Now lets say in one part of the source file i have an object
v_instance1:

<<file1.c>>>
struct ABC v_instance1 ;
and another object v_instance2:
<<file2.c>>
struct ABC v_instance2;
well now sizeof(v_instance1) always equal to sizeof(v_instance2) with
or without padding.
My "silly" doubts:

1)But is it possible that a compiler does the internal padding for
v_instance1 differently from v_instance2 and still satisfy the
sizeof(v_instance1)==sizeof(v_instance2) ??
No.

IOW is the padding between elements of v_instance1 is exactly same as
padding between elements of v_instance2 such that my custom funtions
similar to memcpy() or memset() don't messup anything later(as the
compiler is not aware of what i intend to do unlike stdlib
functions)..?

Which paragraph of the C standard mentions this explicitly?

The offsetof macro (7.17 p3) could not do its job unless offsets within
all instances of a struct type were the same. This does not cover bit
fields but the wording of how bit fields are allocated does not, I
think, allow their location to vary except between different struct
types. This may also be true of non-bit field members -- it's just that
the offsetof argument is simpler.
2)Also after doing, v_instance1 = v_instance2 , memcmp() can still
return Non-Zero right?

Yes, it can.
anyway to make sure the copy is done in such a way that memcmp()
return 0 and never go into undefined behavior

3)what is the best portable way to copy v_instance1 into v_instance2
(assume pointer members exist) if i plan to use memcmp() and memset()
later on those two variables ,without leading to undefined behaviour
or traps??

The cases are different. memset (presumably to zero the objects) is not
guaranteed to work with pointers or floating types. If you worry about
these cases (and it is often simplest not to worry -- just document the
non-portability of using memset) you have to set the members explicitly.
One way to do this is to keep a "zero object" around for copying by
assignment. Static objects are correctly zeroed by default and
automatic ones can be zeroed by using = {0} as the initialiser.

In C99 you can use a compound literal: '(struct xxx){0}'.

As for memcmp, the real question is about what you are trying to do.
memcmp compares the representation (including padding bytes) and it does
so entirely reliably, so if you want to detect changes in the bits of an
object it is the function to use. If you want to portably detect changes
in value it is not useful, and this lack of utility comes from a range
of issues of which padding is probably the least of your worries.

Only a member-by-member compare is going to be truly portable.
4)if v_instance1 and v_instance2 are not initialized, does expr
v_instance1=v_instance2 give undefined behavior?? for example let int
be member of the struct ,then while copying uninitialized int between
two instances can it not generate trap??

I don't think so, but why do you care? What reason can there be to copy
an indeterminate object like this?
so then
v_instance1=v_instance2 is not a good thing todo always?

When v_instance2 is indeterminate, all you are doing is propagating this
indeterminate data further. That's a bad thing to do whether you do it
by assignment or by memcpy. I'd work hard to remove any chance of this
happening but for reasons quite unrelated to your concern about traps.
 
Ben Bacarisse

AjayB said:
1)But is it possible that a compiler does the internal padding for
v_instance1 differently from v_instance2 and still satisfy the
sizeof(v_instance1)==sizeof(v_instance2) ??

The sizeof will always be the same for the given compiler options.
*alignment* might be different depending on where in your code you
declare variables of that type.  in the typical .bss section as a
global it might be aligned on a 32-bit word whereas on the stack [as
an auto variable] it might be 64 or 128-bit for instance.

Thanks ,That clears some things up. Sorry for being naive, can you
provide links where I can read more about this.

I hope you see that Tom is talking about the overall alignment of the
struct. In other words what he's said does not address the question you
asked (as I understand it). What Tom said may be useful to you, but it
seems unrelated to your worries.

oops..yeah of-course...i did not see it that way.

I think Tom's analogy is flawed. struct objects have a special
dispensation. They can never be trap representations on their own. The
individual members might be, of course, but that does not seem to have
anything to do with what you asked.
 
Tom St Denis

I think Tom's analogy is flawed.  struct objects have a special
dispensation.  They can never be trap representations on their own.  The
individual members might be, of course, but that does not seem to have
anything to do with what you asked.

Well if you cause UB in the assignment (because one of its internal
members is not correctly initialized) then your program has UB. Even
if the assignment on its face looks valid.

Tom
 
AjayB

Well if you cause UB in the assignment (because one of its internal
members is not correctly initialized) then your program has UB.  Even
if the assignment on its face looks valid.

Tom

Even I was thinking on the above lines. Is this wrong?

And thanks everyone for clearing these doubts.

Well the reason I was asking this is more like, not to burn my hand in
"one of those never heard of" micro-controller compilers and also just
some academic interest. Just wanted explore a way to make comparing
and copying structs of large elements as fast and portable as
possible(very low freq processor). But it seems its good to keep it
simple.

Thanks again to all who replied.
Regards
Ajay
 
Eric Sosman

Well if you cause UB in the assignment (because one of its internal
members is not correctly initialized) then your program has UB. Even
if the assignment on its face looks valid.

No; the assignment *is* valid, even if the source is junk.

struct s { SomeTypeWithTraps data; };
struct s uninitialized; // "random garbage"
struct s target; // "more random garbage"
target = uninitialized; // valid assignment

The assignment is valid, and there is no undefined behavior, not
even if `uninitialized.data' happens to hold a trap representation.

Of course, using either `target.data' or `uninitialized.data'
could get you into trouble -- but that's another story.
 
AjayB

On 5/6/2010 7:08 AM, AjayB wrote:
snipped
3)what is the best portable way to copy v_instance1 into v_instance2
(assume pointer members exist) if i plan to use memcmp() and memset()
later on those two variables ,without leading to undefined behaviour
or traps??

     The obvious `v_instance2 = v_instance1;' seems satisfactory:
Portable, trouble-free, and as efficient as the compiler can manage.
Your reference to "pointer members" makes me wonder if you're thinking
about what's sometimes called a "deep copy," where you don't copy
the pointer, but the thing it points at (so the "copied" pointer aims
at the newly-copied target).  C doesn't do that for you, neither with
simple assignment nor with memcpy(), because C doesn't know what kind
of meaning you attach to your data structures.

     As for memset() and memcmp(), you probably don't (or shouldn't)
want to use them.  memset() is fine for arrays of bytes, but not for
much else.  As mentioned earlier, memcmp() on structs will go astray
if padding bytes disagree.  Also, its notion of "equality" may not
match yours, particularly when pointers are involved:

        struct person { char *name; };

        char programmer[] = "Ajay";
        struct person p1 = { programmer };

        char questioner[] = "Ajay";
        struct person p2 = { questioner };

Now `memcmp(&p1, &p2, sizeof p1)' will declare the structs unequal
(because their `name' elements are different, if nothing else), even
though you might well think them equal because both point to "Ajay".

Great..Thanks for this snippet..

Regards
Ajay
 
Ben Bacarisse

AjayB said:
Even I was thinking on the above lines. Is this wrong?

Yes, this is wrong -- it's not UB. Maybe what Tom means is that you are
propagating the *potential* for UB later, but there is no UB caused by
assigning one struct from another of the same type.

<snip>
 
Tom St Denis

Yes, this is wrong -- it's not UB.  Maybe what Tom means is that you are
propagating the *potential* for UB later, but there is no UB caused by
assigning one struct from another of the same type.

Um, UB tags along for the ride.

We agree that

int a, b;
a = b;

is UB right?

struct foo { int a, b; };

struct foo bar1, bar2;
bar2.a = 1;
bar1 = bar2;

Why isn't this UB as well? bar2.b is not initialized, so any use of
bar2.b or bar1.b at this point is UB.

The UB propagates through all reads from the uninitialized variable.

Tom
 
Seebs

Hi All,

Sorry for this silly question. But all the google answers I got were
vague and could not make up my mind.
My question is can the padding in structure vary between different
"instances" of that same structure??

In general, no.
1)But is it possible that a compiler does the internal padding for
v_instance1 differently from v_instance2 and still satisfy the
sizeof(v_instance1)==sizeof(v_instance2) ??

I don't think so.

In practice, you probably can't make an implementation where compilation
with a given set of options doesn't produce the same results for equivalent
declarations.

-s
 
Keith Thompson

Tom St Denis said:
Um, UB tags along for the ride.

We agree that

int a, b;
a = b;

is UB right?

Right, because b might contain a trap representation.

struct foo { int a, b; };

struct foo bar1, bar2;
bar2.a = 1;
bar1 = bar2;

Why isn't this UB as well? bar2.b is not initialized, so any use of
bar2.b or bar1.b at this point is UB.

Because of N1256 6.2.6.1p6:

    The value of a structure or union object is never a trap
    representation, even though the value of a member of the structure
    or union object may be a trap representation.

The UB propagates through all reads from the uninitialized variable.

Nope.

Consider the implications if that were true. Suppose you have a
structure that holds up to 10 double values:

#define MAX 10
struct foo {
int count;
double values[MAX];
};

Then this would invoke undefined behavior:

struct foo x, y;
x.count = 3;
x.values[0] = 0.0;
x.values[1] = 1.0;
x.values[2] = 2.0;
y = x;

In fact, it would be impossible to safely assign values of type
"struct tm", since the standard permits it to have extra members.

When copying struct values, any uninitialized members harmlessly go
along for the ride, even if they contain trap representations.

This was a post-C99 change. See
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_222.htm>.
 
Eric Sosman

[...]
struct foo { int a, b; }

struct foo bar1, bar2;
bar2.a = 1;
bar1 = bar2;

Why isn't this UB as well? bar2.b is not initialized, so any use of
bar2.b or bar1.b at this point is UB.

The UB propagates through all reads from the unitialized variable.

I'll retract my previous claim that structs have no trap
representations, for the simple reason that when I went looking
for chapter and verse to support it, lo! the C&V had gone missing.

There's language that rules out trap representations in a
particular circumstance: If you store to some element of a struct,
or to the whole struct as a unit, any padding bytes or padding
bits or unnamed elements take on unspecified values. The Standard
says that these unspecified values shall not cause the struct as a
whole to have a trap representation, and I mis-remembered this as
"structs don't have traps."

But the language I can find says nothing about structs that
contain indeterminate-valued elements, nor about structs whose
padding bytes/bits have values not resulting from storing to an
element or to the struct. As far as I can see, nothing forbids
an uninitialized or partially initialized struct from holding a
trap representation -- in which case, your example *is* U.B.
despite my earlier claims to the contrary.

Unless, of course, I'm wrong. ("Again!?")
 
Tom St Denis

[...]
struct foo { int a, b; }
struct foo bar1, bar2;
bar2.a = 1;
bar1 = bar2;
Why isn't this UB as well?  bar2.b is not initialized, so any use of
bar2.b or bar1.b at this point is UB.
The UB propagates through all reads from the unitialized variable.

     I'll retract my previous claim that structs have no trap
representations, for the simple reason that when I went looking
for chapter and verse to support it, lo! the C&V had gone missing.

     There's language that rules out trap representations in a
particular circumstance: If you store to some element of a struct,
or to the whole struct as a unit, any padding bytes or padding
bits or unnamed elements take on unspecified values.  The Standard
says that these unspecified values shall not cause the struct as a
whole to have a trap representation, and I mis-remembered this as
"structs don't have traps."

     But the language I can find says nothing about structs that
contain indeterminate-valued elements, nor about structs whose
padding bytes/bits have values not resulting from storing to an
element or to the struct.  As far as I can see, nothing forbids
an uninitialized or partially initialized struct from holding a
trap representation -- in which case, your example *is* U.B.
despite my earlier claims to the contrary.

     Unless, of course, I'm wrong.  ("Again!?")

I'd be really surprised that using bar1.b doesn't invoke UB.

In most practical circumstances though structa = structb is really
just a memcpy behind the scenes and most platforms don't have "trap"
representations for integer types. So strictly speaking all you'll
get is a garbage value in the int, not a program fault (e.g. trap,
segfault, whatever).

That being said, while I think it's "valid" to assign structures where
only some of the fields have been initialized, it DOES propagate the UB
if uninitialized source elements are then read from the target
structure.

Tom
 
Eric Sosman

[...]
struct foo { int a, b; }
struct foo bar1, bar2;
bar2.a = 1;
bar1 = bar2;
Why isn't this UB as well? bar2.b is not initialized, so any use of
bar2.b or bar1.b at this point is UB.
The UB propagates through all reads from the unitialized variable.

I'll retract my previous claim that structs have no trap
representations, for the simple reason that when I went looking
for chapter and verse to support it, lo! the C&V had gone missing.

There's language that rules out trap representations in a
particular circumstance: If you store to some element of a struct,
or to the whole struct as a unit, any padding bytes or padding
bits or unnamed elements take on unspecified values. The Standard
says that these unspecified values shall not cause the struct as a
whole to have a trap representation, and I mis-remembered this as
"structs don't have traps."

But the language I can find says nothing about structs that
contain indeterminate-valued elements, nor about structs whose
padding bytes/bits have values not resulting from storing to an
element or to the struct. As far as I can see, nothing forbids
an uninitialized or partially initialized struct from holding a
trap representation -- in which case, your example *is* U.B.
despite my earlier claims to the contrary.

Unless, of course, I'm wrong. ("Again!?")

I'd be really surprised that using bar1.b doesn't invoke UB.

I don't think anyone doubts (or doubted) that.

In most practical circumstances though structa = structb is really
just a memcpy behind the scenes and most platforms don't have "trap"
representations for integer types. So strictly speaking all you'll
get is a garbage value in the int, not a program fault (e.g. trap,
segfault, whatever).

If it's a memcpy() work-alike, there'll be no trap because memcpy()
works as if by copying unsigned chars, and the unsigned char type has
no trap representations. (No, really it doesn't; this time I'm sure.)
But struct assignment isn't required to be memcpy()-like; for example,
padding bytes need not be copied.
That being said, while I think it's "valid" to assign structures where
only some of the fields have been initialized it DOES propagate the UB
if unitialized source elements are then read from the target
structure.

I used to think the assignment was safe but subsequent use of the
suspect elements of the target could make trouble. I now think that
even the assignment is U.B., although most implementations define it
benignly, as you observe.
 
Ersek, Laszlo

On 5/6/2010 1:01 PM, Tom St Denis wrote:

[...]
struct foo { int a, b; }

struct foo bar1, bar2;
bar2.a = 1;
bar1 = bar2;
[snip]

I used to think the assignment was safe but subsequent use of the
suspect elements of the target could make trouble. I now think that
even the assignment is U.B., although most implementations define it
benignly, as you observe.

See also the thread starting with

From: Seebs <[email protected]>
Subject: struct assignment and indeterminate values
Message-ID: <[email protected]>
Date: 08 Apr 2010 21:21:43 GMT

http://groups.google.com/group/comp.lang.c/browse_thread/thread/c15233bd771f1091

Cheers,
lacos
 
Keith Thompson

Eric Sosman said:
[...]
struct foo { int a, b; }

struct foo bar1, bar2;
bar2.a = 1;
bar1 = bar2;

Why isn't this UB as well? bar2.b is not initialized, so any use of
bar2.b or bar1.b at this point is UB.

The UB propagates through all reads from the unitialized variable.

I'll retract my previous claim that structs have no trap
representations, for the simple reason that when I went looking
for chapter and verse to support it, lo! the C&V had gone missing. [...]
Unless, of course, I'm wrong. ("Again!?")

Let me guess: You checked a copy of the C99 standard.

Try N1256. The relevant text was added in TC2. See N1256 6.2.6.1p6 and
DR #222 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_222.htm>.
 
Michael Tiomkin

Hi All,

Sorry for this silly question. But all the google answers I got were
vague and could not make up my mind.
My question is can the padding in structure vary between different
"instances" of that same structure??

For example I have a structure say

struct ABC{
        int i;
        char c;
        flaot f;
        double d;
        int *p_i ;

};

Now lets say in one part of the source file i have an object
v_instance1:

<<file1.c>>>
struct ABC v_instance1 ;
...
...
some code...
...

and another object v_instance2:
<<file2.c>>
struct ABC v_instance2;
...
...
some code...
...

well  now sizeof(v_instance1) always equal to sizeof(v_instance2) with
or without padding.
My "silly" doubts:

1)But is it possible that a compiler does the internal padding for
v_instance1 differently from v_instance2 and still satisfy the
sizeof(v_instance1)==sizeof(v_instance2) ??

IOW is the padding between elements of v_instance1 is exactly same as
padding between elements of v_instance2 such that my custom funtions
similar to memcpy() or memset() don't messup anything later(as the
compiler is not aware of what i intend to do unlike stdlib
functions)..?

Which paragraph of the C standard mentions this explicitly?

2)Also after doing, v_instance1 = v_instance2 , memcmp() can still
return Non-Zero right?
anyway to make sure the copy is done in such a way that memcmp()
return 0 and never go into undefined behavior

3)what is the best portable way to copy v_instance1 into v_instance2
(assume pointer members exist) if i plan to use memcmp() and memset()
later on those two variables ,without leading to undefined behaviour
or traps??

4)if v_instance1 and v_instance2 are not initialized, does expr
v_instance1=v_instance2 give undefined behavior?? for example let int
be member of the struct ,then while copying uninitialized int between
two instances can it not generate trap?? so then
v_instance1=v_instance2 is not a good thing todo always?

Thanks in advance..
Regards
Ajay

Well, one possibility to obtain this behaviour is to compile the "two"
source files with different padding options of the compiler (i.e.
alignment).

Well, you can always buy a rope and hang yourself successfully, but
it can have some consequences in the near future!-?
 
