sizeof shows different size for a structure

Anand Buddhdev

Hi everyone,

I'm a C newbie, so please be gentle. I have a program that defines the
following things:

typedef union
{
unsigned int I;
unsigned char b[4];
} dword;

/* Structure for record 0 of the DAWG. */
typedef struct
{
unsigned char magic[5];
unsigned char name[11];
unsigned char title[41];
unsigned char desc[41];
unsigned char author[41];
unsigned char extra[41];
unsigned char ch[64];
unsigned char numchars;
unsigned char catsym[8];
unsigned char catname[8][11];
unsigned char catinclude[8][8];
unsigned char numcategories;
dword numnodes;
} dawghdr;

I expect the size of the dawghdr structure to be 410 bytes. However,
on compiling this program with gcc 3.3.2 on linux, dawghdr uses 412
bytes. I've examined the memory space of this structure, and it seems
to me that 2 extra bytes appear between the numcategories and numnodes
members. Could anyone be kind enough to explain why this might be
happening?

The program writes this structure to a disk file as a record, and I
later read this output file with a python script, but the script falls
over, as it fails to read the correct value of numnodes, since it has
no way of knowing about the 2 extra bytes.
 
Christian Haselbach

Anand Buddhdev said:
I expect the size of the dawghdr structure to be 410 bytes. However,
on compiling this program with gcc 3.3.2 on linux, dawghdr uses 412
bytes. I've examined the memory space of this structure, and it seems
to me that 2 extra bytes appear between the numcategories and numnodes
members. Could anyone be kind enough to explain why this might be
happening?

The compiler is allowed to put padding between the elements.
This question is covered in the FAQ:
http://www.eskimo.com/~scs/C-faq/q2.13.html
The program writes this structure to a disk file as a record, and I
later read this output file with a python script, but the script falls
over, as it fails to read the correct value of numnodes, since it has
no way of knowing about the 2 extra bytes.

This "solution" is highly compiler-dependent. You should write your own
routine for storing the record, and don't forget to take care of endianness.
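
One way to pin the byte order down, as a minimal sketch: store the 32-bit count one byte at a time in a fixed order instead of dumping the in-memory unsigned int. The names put_u32_le and get_u32_le are made up for illustration, and little-endian is an arbitrary choice here; the thread does not say which order the file should use.

```c
#include <stdint.h>  /* uint32_t (C99) */

/* Store v into out[0..3], least significant byte first. */
void put_u32_le(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    out[2] = (unsigned char)((v >> 16) & 0xFFu);
    out[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Rebuild the value from four bytes stored least significant byte first. */
uint32_t get_u32_le(const unsigned char in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Because the shifts work on values rather than on memory, the same four bytes come out on big-endian and little-endian hosts. A Python reader could then decode them with struct.unpack('<I', ...) no matter which machine wrote the file.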

Ciao Chriss
 
Darksun4

Anand Buddhdev said:
I expect the size of the dawghdr structure to be 410 bytes. However,
on compiling this program with gcc 3.3.2 on linux, dawghdr uses 412
bytes. I've examined the memory space of this structure, and it seems
to me that 2 extra bytes appear between the numcategories and numnodes
members. Could anyone be kind enough to explain why this might be
happening?

Compiler padding. Check your compiler's documentation.
The program writes this structure to a disk file as a record, and I
later read this output file with a python script, but the script falls
over, as it fails to read the correct value of numnodes, since it has
no way of knowing about the 2 extra bytes.

Try writing (and reading) the structure element by element. That way
you don't have to care about padding.
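
A rough sketch of the element-by-element idea, using a cut-down hypothetical header (minihdr) rather than the full dawghdr to keep it short. The function names are invented for illustration.

```c
#include <stdio.h>

/* Hypothetical cut-down header; the real dawghdr has more members. */
typedef struct {
    unsigned char magic[5];
    unsigned char numcategories;
    unsigned int  numnodes;      /* compilers usually pad before this */
} minihdr;

/* Write each member separately, so padding never reaches the file.
   Returns 1 on success, 0 if any write came up short. */
int minihdr_write(const minihdr *h, FILE *fp)
{
    return fwrite(h->magic, 1, sizeof h->magic, fp) == sizeof h->magic
        && fwrite(&h->numcategories, 1, 1, fp) == 1
        && fwrite(&h->numnodes, sizeof h->numnodes, 1, fp) == 1;
}

/* Read the members back in the same order. Returns 1 on success. */
int minihdr_read(minihdr *h, FILE *fp)
{
    return fread(h->magic, 1, sizeof h->magic, fp) == sizeof h->magic
        && fread(&h->numcategories, 1, 1, fp) == 1
        && fread(&h->numnodes, sizeof h->numnodes, 1, fp) == 1;
}
```

Each record then occupies exactly 5 + 1 + sizeof(unsigned int) bytes in the file, whatever sizeof(minihdr) happens to be. Note that numnodes is still written in the host's byte order, so this removes the padding problem but not the endianness one.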
 
Martin Ambuhl

Anand Buddhdev wrote:

....
I expect the size of the dawghdr structure to be 410 bytes. However,
on compiling this program with gcc 3.3.2 on linux, dawghdr uses 412
bytes.
....

For crying out loud, doesn't *anyone* check the FAQ before posting?
 
Rich Gibbs

Anand Buddhdev said the following, on 07/21/04 07:19:
I expect the size of the dawghdr structure to be 410 bytes. However,
on compiling this program with gcc 3.3.2 on linux, dawghdr uses 412
bytes. I've examined the memory space of this structure, and it seems
to me that 2 extra bytes appear between the numcategories and numnodes
members. Could anyone be kind enough to explain why this might be
happening?

The program writes this structure to a disk file as a record, and I
later read this output file with a python script, but the script falls
over, as it fails to read the correct value of numnodes, since it has
no way of knowing about the 2 extra bytes.

The compiler is allowed to insert padding bytes between elements of a
structure, in order to meet alignment restrictions of the
implementation. This is covered in the FAQ:
http://www.eskimo.com/~scs/C-faq/

The solution to your specific problem is going to be
implementation-dependent.

[OT]
Sometimes you can avoid the requirement for padding between members by
placing the structure elements in descending order of size (more
precisely, of alignment requirement). YMMV.
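
To illustrate the reordering idea, here is a small sketch with two made-up structs (scattered and ordered) that hold the same members in different orders. The actual sizes are implementation-dependent.

```c
#include <stddef.h>  /* size_t */

/* Members in an order that tends to force padding around b. */
struct scattered {
    char a;
    int  b;
    char c;
};

/* Same members, most strictly aligned member first. */
struct ordered {
    int  b;
    char a;
    char c;
};

/* Bytes saved by the reordering on this implementation (may be 0). */
size_t bytes_saved(void)
{
    return sizeof(struct scattered) - sizeof(struct ordered);
}
```

With gcc on x86 Linux the two sizes commonly come out as 12 and 8. For dawghdr, moving numnodes to the front would likely remove the gap before it, but the compiler may still round sizeof(dawghdr) up with trailing padding, so the file format question remains.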
 
Gordon Burditt

I expect the size of the dawghdr structure to be 410 bytes. However,

Well, stop expecting that. The size of the structure is whatever the
compiler decides it will be (sizeof(dawghdr), not any particular fixed
integer), including padding, if it feels like it.

Gordon L. Burditt
 
Old Wolf

Martin Ambuhl said:
Anand Buddhdev wrote:

...
...

For crying out loud, doesn't *anyone* check the FAQ before posting?

It's hard to find a specific issue in the online FAQ, except for the
small subset of issues which are easy to find, and also difficult to
find it in the flat text file if it isn't something you can easily
search for. The only time I've gotten something useful out of it was
when I read it from start to finish. Even then, I only remembered
things because I knew most of it already; for someone new to C
programming it may well be in one ear and out the other.

The HTML FAQ would be much more useful if it:
- had a single page with a hyperlinked summary-line of each entry
- had a text search facility
- was updated (I've heard the HTML ver is lagging behind the text ver)
- had an introductory page with the 20 or so most commonly asked
questions (in the format of all the hyperlinked summary lines at
the top, and all the answers at the bottom). It's a FAQ, not an EAQ
(Ever Asked Questions). Most of its entries are things that haven't
come up for years. It's annoying to wade through all these things.

(Note: this tirade was aimed at explaining why people ask FAQ questions
so often; it wasn't an attack on the FAQ maintainer.)
 
Sam Halliday

Martin said:
For crying out loud, doesn't *anyone* check the FAQ before posting?

well, clearly not you either mr troll. and i quote:

"This is a large and heavy document, so don't assume that
everyone on the net has managed to read all of it in
detail, and please don't roll it up and thwack people
over the head with it just because they missed their
answer in it."
ftp://ftp.eskimo.com/u/s/scs/C-faq/faq.gz
 
Barry Schwarz

Anand Buddhdev said:
I expect the size of the dawghdr structure to be 410 bytes. However,
on compiling this program with gcc 3.3.2 on linux, dawghdr uses 412
bytes. I've examined the memory space of this structure, and it seems
to me that 2 extra bytes appear between the numcategories and numnodes
members. Could anyone be kind enough to explain why this might be
happening?

dword most likely requires alignment on a 4-byte boundary. To ensure
that numnodes meets this requirement in every element of an array of
dawghdr, sizeof(dawghdr) must be a multiple of 4, and 412 is the first
multiple of 4 >= 410. Furthermore, when dawghdr starts on a 4-byte
boundary, numnodes would naturally fall at offset 406, which is not a
multiple of 4, so the compiler moved it up two bytes to offset 408.
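
The offsets can be checked rather than guessed. A small sketch using the standard offsetof macro; the helper name pad_before_numnodes is made up, and the numbers it reports depend on the implementation.

```c
#include <stddef.h>  /* offsetof, size_t */

typedef union {
    unsigned int I;
    unsigned char b[4];
} dword;

/* Structure for record 0 of the DAWG, as posted. */
typedef struct {
    unsigned char magic[5];
    unsigned char name[11];
    unsigned char title[41];
    unsigned char desc[41];
    unsigned char author[41];
    unsigned char extra[41];
    unsigned char ch[64];
    unsigned char numchars;
    unsigned char catsym[8];
    unsigned char catname[8][11];
    unsigned char catinclude[8][8];
    unsigned char numcategories;
    dword numnodes;
} dawghdr;

/* Padding bytes the compiler inserted between numcategories and numnodes. */
size_t pad_before_numnodes(void)
{
    return offsetof(dawghdr, numnodes)
         - (offsetof(dawghdr, numcategories) + sizeof(unsigned char));
}
```

On an implementation with a 4-byte, 4-byte-aligned unsigned int (gcc on x86 Linux, for instance), numcategories sits at offset 405, numnodes at 408, and the function returns 2, which accounts for the 412 reported by sizeof.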
The program writes this structure to a disk file as a record, and I
later read this output file with a python script, but the script falls
over, as it fails to read the correct value of numnodes, since it has
no way of knowing about the 2 extra bytes.

If you need to have a buffer with everything at a specific location, I
suggest you create a 410 byte array of unsigned char and use
strcpy/memcpy to transfer data between the buffer and the struct. You
have to handle each member of the struct individually since the
compiler is allowed to put padding anywhere except before the first
member (you cannot assume magic and name are adjacent).
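
A sketch of that buffer approach, assuming a 4-byte count stored in host byte order. DAWGHDR_FILE_SIZE, put and dawghdr_pack are hypothetical names introduced here, not part of the original program.

```c
#include <stddef.h>  /* size_t */
#include <string.h>  /* memcpy */

typedef union {
    unsigned int I;
    unsigned char b[4];
} dword;

typedef struct {
    unsigned char magic[5];
    unsigned char name[11];
    unsigned char title[41];
    unsigned char desc[41];
    unsigned char author[41];
    unsigned char extra[41];
    unsigned char ch[64];
    unsigned char numchars;
    unsigned char catsym[8];
    unsigned char catname[8][11];
    unsigned char catinclude[8][8];
    unsigned char numcategories;
    dword numnodes;
} dawghdr;

enum { DAWGHDR_FILE_SIZE = 410 };  /* hypothetical name for the on-disk size */

/* Copy n bytes to p and return the position just past them. */
static unsigned char *put(unsigned char *p, const void *src, size_t n)
{
    memcpy(p, src, n);
    return p + n;
}

/* Lay the members out back to back, ignoring any padding in *h. */
void dawghdr_pack(unsigned char buf[DAWGHDR_FILE_SIZE], const dawghdr *h)
{
    unsigned char *p = buf;

    p = put(p, h->magic, sizeof h->magic);
    p = put(p, h->name, sizeof h->name);
    p = put(p, h->title, sizeof h->title);
    p = put(p, h->desc, sizeof h->desc);
    p = put(p, h->author, sizeof h->author);
    p = put(p, h->extra, sizeof h->extra);
    p = put(p, h->ch, sizeof h->ch);
    p = put(p, &h->numchars, sizeof h->numchars);
    p = put(p, h->catsym, sizeof h->catsym);
    p = put(p, h->catname, sizeof h->catname);
    p = put(p, h->catinclude, sizeof h->catinclude);
    p = put(p, &h->numcategories, sizeof h->numcategories);
    put(p, h->numnodes.b, sizeof h->numnodes.b);  /* 4 bytes, host byte order */
}
```

The packed buffer can then be written with a single fwrite of DAWGHDR_FILE_SIZE bytes, and numnodes always lands at offset 406 in the file. Unpacking is the mirror image, copying from the buffer into each member in the same order.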


 
Richard Bos

It's hard to find a specific issue in the online FAQ, except for the
small subset of issues which are easy to find,

Oh, for heaven's sake!

Open the FAQ page.
Notice that section two - _two_, mind you, not sixty-one or a hundred and
seven - is called "Structures, Unions and Enumerations". Gosh, what
_would_ that be about?
Click on that link.
Scroll down a bit.
Notice that question 2.13 - yes, you may have to scroll, but it's
_thirteen_, not ninety-and-a-half - is titled "Why does sizeof report a
larger size than I expect for a structure type?" - a wording, you will
agree, which is not very dissimilar from this thread's original subject.
Read that question.

Now, was _that_ hard to find?

Richard
 
Dan Pop

Sam Halliday said:
well, clearly not you either mr troll. and i quote:

"This is a large and heavy document, so don't assume that
everyone on the net has managed to read all of it in
                                ^^^^^^^^^^^^^^^^^^^^
detail, and please don't roll it up and thwack people
^^^^^^
over the head with it just because they missed their
                                   ^^^^^^^^^^^^^^^^^
answer in it."
^^^^^^^^^^^^
ftp://ftp.eskimo.com/u/s/scs/C-faq/faq.gz

It applies *only* to people who give any sign of having perused the FAQ,
not to the ones who didn't bother checking it before posting. But this
is a far too subtle point for people like you.

The FAQ having a nice table of contents, its size is irrelevant to most
questions. All one had to do to find the answer to the question under
discussion was to read the list of sections, up to the second entry:

1. Declarations and Initializations
2. Structures, Unions, and Enumerations

and then the TOC of Section 2 up to:

2.1:  What's the difference between struct x1 { ... }; and
      typedef struct { ... } x2; ?
2.2:  Why doesn't "struct x { ... }; x thestruct;" work?
2.3:  Can a structure contain a pointer to itself?
2.4:  How can I implement opaque (abstract) data types in C?
2.4b: Is there a good way of simulating OOP-style inheritance in C?
2.6:  I came across some code that declared a structure with the last
      member an array of one element, and then did some tricky
      allocation to make it act like the array had several elements.
      Is this legal or portable?
2.8:  Is there a way to compare structures automatically?
2.10: Can I pass constant values to functions which accept structure
      arguments?
2.11: How can I read/write structures from/to data files?
2.12: How can I turn off structure padding?
2.13: Why does sizeof report a larger size than I expect for a
      structure type?

Someone whose intellectual capabilities are exceeded by this exercise
has no business programming computers, in *any* language.

Dan
 
Sam Halliday

Dan said:
But this is a far too subtle point for people like you.

do you ever get any work done, dan? you just seem to spend all day
and night on c.l.c, flaming people.
 
Dan Pop

Sam Halliday said:
do you ever get any work done, dan? you just seem to spend all day
and night on c.l.c, flaming people.

I couldn't expect you to notice the rest...

Dan
 
Default User

Old Wolf wrote:
It's hard to find a specific issue in the online FAQ, except for the
small subset of issues which are easy to find, and also difficult to
find it in the flat text file if it isn't something you can easily
search for.


For searching the HTML one, I go to the "all questions" page and use the
browser search feature. That works pretty well. In the given case, a
search on "struct" or "sizeof" would quickly find a relevant answer.

The point is not whether the person could find it, it's whether any
attempt was made.



Brian Rodenborn
 
Dan Pop

In said:
Old Wolf said:
[...]
The HTML FAQ would be much more useful if it:
- had a single page with a hyperlinked summary-line of each entry

It has exactly that. See

http://www.eskimo.com/~scs/C-faq/questions.html

I suspect the wish was for a FAQ on a single html page, plus the
hyperlinked summary lines. It would make it searchable.

I merge the plain text version of the FAQ with the plain text table
of contents (I have yet to figure out why Steve keeps them separate) and
have a plain text document with a TOC and perfectly searchable with
any pager.

Dan
 
