Viewing the padded bytes

James Kuyper · Jan 15, 2009

David said:
stop programming in C then.

Why? [Actually I did, many years ago - I do it in C++ these days.]

Because all C structs can have a different size from what you "see on
the page". This applies even more so to C++; the requirements C++
imposes on POD structs are pretty much the same as for C structs. For
non-POD structs, C++ imposes even fewer restrictions on implementations
than it does for POD structs.

You'll therefore continue to be nervous for essentially the entire time
that you continue programming in either C or C++. Of course, some people
work better when they're nervous, in which case that won't be a
disadvantage.

Ian Collins · Jan 15, 2009

David said:
So I would always write the above as

struct check_struct
{
double a;
char b;
char reserved[7];

};

If you do that you can see the padded bytes.

what happens when they change?

Should I care?

What happens on a machine with different alignment and or sizes? What
you have built is unnecessary non-portability.

Rainer Weikusat · Jan 15, 2009

James Kuyper said:
David said:

I don't like having structures a different size from what I see on
the page.
It makes me nervous.

stop programming in C then.

Why? [Actually I did, many years ago - I do it in C++ these days.]

Because all C structs can have a different size from what you "see on
the page".

This is actually more subtle: The C-standard allows 'unnamed padding'
between any two members of a structure and at the end of it (6.7.2.1,
paragraphs 13 and 15), but doesn't say anything about why an implementation
would insert such padding. Enter platform dependencies: Usually (this
is supposed to refer to anything remotely reasonable, ie NOT DOS),
output produced by different C compilers for a particular platform is
supposed to be binary-compatible, ie it is supposed to be possible to
link object files compiled with different compilers into a common
binary. Often, this is a requirement, because fundamental libraries,
like the C-library for UNIX(*), will have been compiled with some
'vendor compiler'. In order to achieve this, an ABI (application
binary interface) needs to be defined and documented and 'memory
layout of composite types' is one of the things described by it.
Assuming knowledge of the rules defined by the ABI, it is possible to
infer the memory layout of any C structure from its declaration.

Rainer Weikusat · Jan 15, 2009

Ian Collins said:
David said:

So I would always write the above as

struct check_struct
{
double a;
char b;
char reserved[7];

};

If you do that you can see the padded bytes.

what happens when they change?

Should I care?

What happens on a machine with different alignment and or sizes?

The size of char is by definition 1. As 'by definition', double is an 8
byte 'double precision floating point number'. This means that 'the
sizes' must not differ for a conforming C implementation. Usual
alignment requirements are either

- no requirement
- natural alignment, ie 8 byte quantities need to start
at addresses which are integral multiples of 8

This could actually differ among different C implementations. But
people still don't run FTP servers on DSPs (or if they do, that's
entirely their problem), and for practical purposes, such machines
would first need to be identified and their alignment requirements
determined. If they were (for some reason) part of a set of
'interesting target architectures', conformance would be necessary.
Otherwise, that would just be a waste of both programmer and processor
time.

James Kuyper · Jan 15, 2009

Rainer said:
Ian Collins said:

David said:

On 14 Jan, 13:00, "David Webber" wrote:
So I would always write the above as

struct check_struct
{
double a;
char b;
char reserved[7];

};

If you do that you can see the padded bytes.
what happens when they change?
Should I care?

What happens on a machine with different alignment and or sizes?

The size of char is by definition 1. ...
True.

... As 'by definition', double is an 8
byte 'double precision floating point number'.

This is not in fact defined, at least not by the C standard. Unless a
C99 implementation predefines the __STDC_IEC_559__ macro, the C
standard's requirements for double can in fact be met by a floating
point format substantially smaller than 64 bits, and such formats do
exist and are in use; I just recently had to write a program for
processing data which was stored in a 48-bit floating point format that,
if my calculations are correct, does in fact meet all of those requirements.

There are also systems where bytes are larger than octets, so even a
64-bit double can be stored in less than 8 bytes.

... This means that 'the
sizes' must not differ for a conforming C implementation.
Incorrect.

... Usual
alignment requirements are either

- no requirement
- natural alignment, ie 8 byte quantities need to start
at addresses which are integral multiples of 8

This could actually differ among different C implementations. But
people still don't run FTP servers on DSPs (or if they do, that's
entirely their problem), and for practical purposes, such machines
would first need to be identified and their alignment requirements
determined. If they were (for some reason) part of a set of
'interesting target architectures', conformance would be necessary.
Otherwise, that would just be a waste of both programmer and processor
time.

I think what you're basically saying is that there's a particular range
of the systems that you consider important, and this works as you expect
it to on all of those systems. That's nice to know, but many people live
in a world where systems you dismiss as unimportant are very important
to them; and this code won't do what you expect it to do on many of
those systems.

Rainer Weikusat · Jan 15, 2009

James Kuyper said:
Rainer said:

Ian Collins said:

David Webber wrote:
On 14 Jan, 13:00, "David Webber" wrote:
So I would always write the above as

struct check_struct
{
double a;
char b;
char reserved[7];

};

If you do that you can see the padded bytes.
what happens when they change?
Should I care?
What happens on a machine with different alignment and or sizes?

The size of char is by definition 1. ...
True.

... As 'by definition', double is an 8
byte 'double precision floating point number'.

This is not in fact defined, at least not by the C standard.
Unless a C99 implementation predefines the __STDC_IEC_559__ macro, the C
standard's requirements for double can in fact be met by a floating
point format substantially smaller than 64 bits, and such formats do
exist and are in use;

All kinds of things 'exist and are in use'. I arguably didn't read the
standard text carefully enough and skipped over the exception clause.
But practically, this is again either obsolete or special purpose
hardware.

[...]

There are also systems where bytes are larger than octets, so even a
64-bit double can be stored in less than 8 bytes.

... and these systems are either special purpose processors for
digital signal processing or museum equipment.

[...]

I think what you're basically saying is that there's a particular
range of the systems that you consider important, and this works as
you expect it to on all of those systems.

And I "think" that this is not what I wrote. If YOU (please note that
this is not a statement about ME, cf the difference in spelling) write
C code which is supposed to be executed on, say, old mainframes with
9-bit bytes, this is a problem you will have to deal with (or anyone
else who happens to target such a platform for whatever
reasons). It is not a problem someone has to deal with or should deal
with who isn't targeting such platforms.

You are, of course, free to argue that software should generally be
written to the least common denominator of all computers which ever
existed or even all computers you could imagine to ever exist.

But I would really like to see a logical reason for that. As long as
you target me instead of the content of my text, a safe assumption is
that you haven't any.

David Webber · Jan 15, 2009

....
stop programming in C then.

Why? [Actually I did, many years ago - I do it in C++ these days.]

Because all C structs can have a different size from what you "see on the
page". This applies even more so to C++; the requirements C++ imposes on
POD structs are pretty much the same as for C structs. For non-POD
structs, C++ imposes even fewer restrictions on implementations than it
does for POD structs.

Indeed, but my point is that with care you *can* get your data to have the
size you see on the page, that there is sometimes a good case for making
sure you do that, and...

You'll therefore continue to be nervous for essentially the entire time
that you continue programming in either C or C++. Of course, some people
work better when they're nervous, in which case that won't be a
disadvantage.

....and that this nervousness is indeed an advantage, not just because one
works better, but because it leads one to think about what the machine is
doing, rather than just taking code on face value. And C/C++ are languages
where taking code on face value without thinking about what the machine is
doing is rather dangerous.

In fact I'd say that anyone who programs C/C++ without my particular brand
of nervousness is heading for a fall

Dave
--
David Webber
Author of 'Mozart the Music Processor'
http://www.mozart.co.uk
For discussion/support see
http://www.mozart.co.uk/mozartists/mailinglist.htm

David Webber · Jan 15, 2009

What happens on a machine with different alignment and or sizes? What
you have built is unnecessary non-portability.

I write software for Windows (and am viewing this cross-posted thread on
Microsoft's vc.language group). Portability?

But in fact the structures are perfectly portable to any compiler where you
can specify the n-byte packing alignment, which may not be a language
standard, but is surely not uncommon.

To be explicit: I would never dream of assuming that any given element is so
many bits after the start of the structure: that would be foolish. But
when you have arrays of thousands (or more) of them, control over the size
is bloody useful.

Dave
--
David Webber
Author of 'Mozart the Music Processor'
http://www.mozart.co.uk
For discussion/support see
http://www.mozart.co.uk/mozartists/mailinglist.htm

James Kuyper · Jan 15, 2009

Rainer said:
All kinds of things 'exist and are in use'. I arguably didn't read the
standard text carefully enough and skipped over the exception clause.

The "exception clause" is the one that mandates IEEE-conforming double
precision format when predefining that macro - it's new in C99. It's the
general rule that allows for other formats.

But practically, this is again either obsolete or special purpose
hardware.

Most of the world's hardware can be dismissed as "special purpose", if
you're so inclined. Embedded processors, for instance, greatly exceed
desktop CPUs in their number and variety, even if you only consider the
multiple embedded processors that come with every desktop CPU. I can't
personally vouch for the accuracy of the claim that more new C code is
being written for them than for all desktop CPUs combined, but I
wouldn't be surprised to find that it's true.

My own personal experience with a non-IEEE 48 bit format that meets C's
requirements for double is MIL-STD-1750A, which I've mentioned in other
recent messages on comp.lang.c. It's nowhere near to being obsolete. In
1996 it was declared inactive for use in new military projects in the
USA; but the project I'm working on is non-military, and it's still in
active use here. According to Wikipedia,
<http://en.wikipedia.org/wiki/MIL-STD-1750A>, the Indian and Chinese
space programs also continue to use it.

James Kuyper · Jan 15, 2009

David said:
....
stop programming in C then.

Why? [Actually I did, many years ago - I do it in C++ these days.]

Because all C structs can have a different size from what you "see on
the page". This applies even more so to C++; the requirements C++
imposes on POD structs are pretty much the same as for C structs. For
non-POD structs, C++ imposes even fewer restrictions on
implementations than it does for POD structs.

Indeed, but my point is that with care you *can* get your data to have
the size you see on the page, that there is sometimes a good case for
making sure you do that, and...

You can do it only in an implementation-specific way. There's no general
way to do so. I'm not inclined to write unnecessarily
implementation-specific code. YMMV

...and that this nervousness is indeed an advantage, not just because
one works better, but because it leads one to think about what the
machine is doing, rather than just taking code on face value. And
C/C++ are languages where taking code on face value without thinking
about what the machine is doing is rather dangerous.

In fact I'd say that anyone who programs C/C++ without my particular
brand of nervousness is heading for a fall

The only quasi-legitimate reason I'm aware of for being concerned about
the parts of a struct that you don't use is if you plan to pass it
through a binary interface. If that's what's going on, then you should
be nervous, but for a very different reason; you shouldn't be using
structs for that purpose in the first place. You should be packing the
data into (or unpacking it from) an array (usually of type unsigned
char, unless all of the data is of a single data type).

Rainer Weikusat · Jan 15, 2009

James Kuyper said:
The "exception clause" is the one that mandates IEEE-conforming double
precision format when predefining that macro - it's new in C99. It's
the general rule that allows for other formats.

The 'exception clause' in the text I was referring to is the one which
defines the exception, namely, that support for the 'type mappings'
described in F.2 (which I read) is optional, as stated in F.1.

That it is possible to define 'exception clause' as something
different, especially when referring to other parts of the C-standard,
does not imply anything for my usage of this term.

Most of the world's hardware can be dismissed as "special purpose", if
you're so inclined.

But I am not 'so inclined', as I already wrote, and insofar as you are
'so inclined', as the sentence above suggests, please speak about yourself
instead of pretending to speak about me.

James Kuyper · Jan 15, 2009

Rainer said:
The 'exception clause' in the text I was referring to is the one which
defines the exception, namely, that support for the 'type mappings'
described in F.2 (which I read) is optional, as stated in F.1.

That's the same clause I'm referring to. It doesn't make support for
those type mappings optional. Those type mappings were already optional
before the __STDC_IEC_559__ macro was ever added to C99. What it
specifies is that systems have the option of predefining that macro, in
which case those mappings become mandatory.

But I am not 'so inclined', as I already wrote, and insofar as you are
'so inclined', as the sentence above suggests, please speak about yourself
instead of pretending to speak about me.

You're misreading the sentence if you think it suggests that I'm so
inclined. The dismissal of floating point formats as "obsolete or
special purpose hardware" was yours, not mine.

David Webber · Jan 15, 2009

The only quasi-legitimate reason I'm aware of for being concerned about
the parts of a struct that you don't use is if you plan to pass it
through a binary interface...

No. Consider

struct A
{
char c;
double d;
char ca[7];
};

struct B
{
double d;
char ca[7];
char c;
};

A a[1000000];
B b[1000000];

with packing on 4 byte boundaries.

Dave
--
David Webber
Author of 'Mozart the Music Processor'
http://www.mozart.co.uk
For discussion/support see
http://www.mozart.co.uk/mozartists/mailinglist.htm

jameskuyper · Jan 15, 2009

David said:
The only quasi-legitimate reason I'm aware of for being concerned about
the parts of a struct that you don't use is if you plan to pass it
through a binary interface...

No. Consider

struct A
{
char c;
double d;
char ca[7];
};

struct B
{
double d;
char ca[7];
char c;
};

A a[1000000];
B b[1000000];

with packing on 4 byte boundaries.

If there's any real need for packing on 4-byte boundaries, the
compiler should pad the struct to an exact multiple of 4 bytes, and
insert padding between c and d where needed, whether or not you insert
the declaration of 'ca'. If it doesn't, that's a QoI issue that should
lead you to look for a new compiler, not something you should work
around by inserting useless fields.

Rainer Weikusat · Jan 15, 2009

James Kuyper said:
That's the same clause I'm referring to. It doesn't make support for
those type mappings optional.

It does. F.2 states that

1 The C floating types match the IEC 60559 formats as follows:

-- The float type matches the IEC 60559 single format.

-- The double type matches the IEC 60559 double format.

and the annex itself is classified as 'normative'. Which means that a
conforming implementation has to behave accordingly, except that F.1
says that 'An implementation that defines __STDC_IEC_559__ shall
conform to the specifications in this annex.' Which defines an
exception relieving implementations from the unqualified requirement
in the following section.

Instead of presenting arguments supporting your (at least to me
somewhat unintelligible) thesis, you are still busy with 'fabricating
mock contradictions by suitable redefinition of terms used in the
original (that is in mine) text for a different purpose'. That's a
perfect way to "win" every "game" (change the rules such that you won
"by definition") and I congratulate you on your obvious political
talents. They are not helpful anyhow, though, insofar as the intention is
to determine the meaning of some text (and they are probably not even
helpful in winning followers among a practically inclined audience,
because this type of 'talk for the purpose of talking' tends -
pardon my french - to piss people off who are simply not interested in
policy). Like me, for instance.

HAND.

jameskuyper · Jan 15, 2009

Rainer Weikusat wrote:
....

"by definition") and I congratulate you to your obvious political
talents.

Believe me, I have none; otherwise people wouldn't get as upset with
me, the way you are, as often as they do.

There was really little point in my discussing whether or not that
clause was exceptional. It was just my way of expressing my surprise
about the fact that you were aware of an obscure new feature of C99,
which allows programmers to determine whether or not an implementation
supports IEEE 754/IEC 60559, read the section which defines what it means to
provide such support, missed the fact that such support was optional
(which is stated in a moderately prominent location near the beginning
of that annex), and thereby reached the mistaken conclusion that it
was mandatory.

Simply reading the main portion of the standard, specifically sections
5.2.4.2.2 and 6.10.8p2, might have quickly led you to the suspicion,
at least, that IEEE double precision is not mandatory.

Ian Collins · Jan 15, 2009

David said:
I write software for Windows (and am viewing this cross-posted thread on
Microsoft's vc.language group). Portability?

That explains a lot.

But in fact the structures are perfectly portable to any compiler where
you can specify the n-byte packing alignment, which may not be a
language standard, but is surely not uncommon.

They'd be just as portable without the unnecessary noise at the end.

To be explicit: I would never dream of assuming that any given element
is so many bits after the start of the structure: that would be
foolish. But when you have arrays of thousands (or more) of them,
control over the size is bloody useful.

Nothing you have added controls the size.

Flash Gordon · Jan 15, 2009

David said:
struct check_struct
{
double a;
char b;
char reserved[7];

};

If you do that you can see the padded bytes.

I don't think so. It will always show you the reserved member's
contents but I think the implementation is still free to insert
padding in any amount and at any location in this structure it
pleases, except at the beginning.

I dare say, in principle, but I use compiler options which specify the
padding properties, and it would be unusual for 8 byte boundaries not to
be safe.

[I have large arrays of structures with many bit fields, and it is in my
interest, to arrange them so that padding is unnecessary, and so I
control very carefully where the bit fields fit in relation to byte and
word boundaries. It's a useful principle of "defensive programming".]

Your adding in a "reserved" field at the end is not controlling it. In
particular, it is highly unlikely to decrease the size but could well
increase it on some implementations (there could be speed benefits to
having an array start on a specific boundary on some implementations).
So for large arrays of structures (which you refer to in other posts) I
would say it was better in general to NOT have the reserved field.

Of course, you are correct in suggesting that you should bear in mind
likely alignment restrictions when deciding on the order of fields and I
would agree with you that the following is BAD

struct bad {
double a;
char b;
double c;
};

sasha · Jan 15, 2009

David said:
[snip]

Definitely not. The best approach is to make life easy for the
compiler and check that you are doing it optimally. It isn't hard and
you end up with much more efficient code.

Sorry, but LOL! Make it easy for the compiler? And here for the last 25
years I was, thinking compilers were to make it easy for people.

Chris M. Thomasson · Jan 15, 2009

karthikbalaguru said:
Hi ,

I have the below structure.
struct check_struct{
double a;
char b;
};
My system reports the size of the above structure as 16 bytes.

I understand that there is some padding at the end of the above
structure and hence the size gets calculated to 16 bytes.
But, how to view the data/info that are padded using a debugger
like gdb or visual c++ debugger ?

I used watch windows but, it did not show the padded data .
Any ideas ?

You can try something like the following nasty hack:
___________________________________________________________
#include <stddef.h>
#include <stdio.h>

struct foo {
    double m1;
    char meof;
};

int main(void) {
    struct foo f;
    size_t i;

    /* First byte past the last member of f. */
    unsigned char const* head = (unsigned char const*)&f +
        offsetof(struct foo, meof) + sizeof(f.meof);

    /* One past the last byte of f. */
    unsigned char const* const tail = (unsigned char const*)(&f + 1);

    size_t const size = (size_t)(tail - head);

    printf("there seem to be %lu bytes of padding at "
           "the end of `struct foo'...\n",
           (unsigned long int)size);

    if (size) {
        /* Padding bytes hold unspecified values, not text, so dump
           them in hex instead of printing them with %s. */
        printf("pad bytes:");
        for (i = 0; i < size; ++i)
            printf(" %02x", (unsigned)head[i]);
        printf("\n\n");
    }

    return 0;
}
___________________________________________________________
