variable size structs and diminishing returns

David Mathog

In the EMF graphics file format there are records (for instance,
EMR_EXTRECREATEPEN) which correspond to structs like this:

typedef struct {
/* bunch of fields */
uint32_t numEntries; /* number of members in Entries */
SOMETYPE Entries[1];
} FIRSTSTRUCT;

typedef struct {
/* bunch of other fields */
uint32_t offBmi; /* offset to bitmapinfo from start of record */
uint32_t cbBmi; /* size of bitmapinfo in bytes */
uint32_t offBits; /* offset to bitmap from start of record */
uint32_t cbBits; /* size of bitmap in bytes */
FIRSTSTRUCT fieldname;
} SECONDSTRUCT;

The bitmapinfo and bitmap data (both also structs) follow it in the
file record like this:
SECONDSTRUCT <x> bitmapinfo <x> bitmap <x>
where <x> is optional space which is ignored.

As far as I can tell the intended benefit of having the Entries[1]
array within the defining struct is to allow
the programmer to reference it by name, like:

FIRSTSTRUCT *data;
memcpy(data->Entries, src, data->numEntries * sizeof(SOMETYPE)); /* src: wherever the entries come from */

However, the flip side of that is that it makes calculating the
offsets for the (optional) bitmap fields in
the file record a PITA. One cannot just use:

offBmi = sizeof(SECONDSTRUCT);

because the SECONDSTRUCT part of the record, courtesy of FIRSTSTRUCT, is
rarely if ever sizeof(SECONDSTRUCT) bytes long. This results in code like:

offBmi = sizeof(SECONDSTRUCT) + (data->numEntries-1)*sizeof(SOMETYPE);
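
For comparison, here is a minimal sketch of that calculation next to an offsetof-based variant that avoids the "minus one" bookkeeping. The concrete SOMETYPE and the "style" field are placeholders invented for illustration, not the real EMF definitions.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t x, y; } SOMETYPE;  /* placeholder element type (assumption) */

typedef struct {
    uint32_t style;       /* stands in for "bunch of fields" (assumption) */
    uint32_t numEntries;  /* number of members in Entries */
    SOMETYPE Entries[1];  /* really numEntries elements in the record */
} FIRSTSTRUCT;

typedef struct {
    uint32_t offBmi, cbBmi, offBits, cbBits;
    FIRSTSTRUCT fieldname;
} SECONDSTRUCT;

/* The calculation from the post: sizeof already counts one entry. */
static size_t end_via_sizeof(uint32_t n)
{
    return sizeof(SECONDSTRUCT) + ((size_t)n - 1) * sizeof(SOMETYPE);
}

/* Same result for this layout, counted from where Entries actually starts.
   The two can differ if the compiler adds padding after Entries[1]. */
static size_t end_via_offsetof(uint32_t n)
{
    return offsetof(SECONDSTRUCT, fieldname.Entries) + (size_t)n * sizeof(SOMETYPE);
}
```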

Is this really an improvement over:

typedef struct {
/* bunch of fields */
uint32_t numEntries; /* number of members in Entries */
/* SOMETYPE Entries[1]; follows in file record, but is not
explicitly in the struct */
} ALTSTRUCT;

(then define SECONDSTRUCT with ALTSTRUCT, leaving the array out of the
structs completely, as is
already the case for the bitmapinfo and bitmaps)

where

offBmi = sizeof(SECONDSTRUCT) + data->numEntries*sizeof(SOMETYPE);

and the variable Entries[] array is to be found at: recordinmemory +
sizeof(ALTSTRUCT);

?
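
A rough sketch of what the ALTSTRUCT variant would look like in use, assuming the same placeholder SOMETYPE and "style" field as stand-ins for the real definitions. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t x, y; } SOMETYPE;  /* placeholder element type (assumption) */

typedef struct {
    uint32_t style;       /* stands in for "bunch of fields" (assumption) */
    uint32_t numEntries;  /* entries follow the record header, outside the struct */
} ALTSTRUCT;

typedef struct {
    uint32_t offBmi, cbBmi, offBits, cbBits;
    ALTSTRUCT fieldname;
} SECONDSTRUCT;

/* The entries start immediately after the fixed-size header.
   The cast is only safe if that address is suitably aligned for SOMETYPE. */
static SOMETYPE *alt_entries(SECONDSTRUCT *rec)
{
    return (SOMETYPE *)((unsigned char *)rec + sizeof(SECONDSTRUCT));
}

/* Offset of the bitmapinfo: header plus all entries, no "minus one". */
static size_t alt_offBmi(const SECONDSTRUCT *rec)
{
    return sizeof(SECONDSTRUCT) + (size_t)rec->fieldname.numEntries * sizeof(SOMETYPE);
}
```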

Why include variable array(s) within the struct using the ARRAY[1]
notation? Is there some compelling reason to do it that way?

Thanks,

David Mathog
 
Heikki Kallasjoki

In the EMF graphics file format there are records (for instance,
EMR_EXTRECREATEPEN) which correspond to structs like this:

typedef struct {
/* bunch of fields */
uint32_t numEntries; /* number of members in Entries */
SOMETYPE Entries[1];
} FIRSTSTRUCT; ....
Is this really an improvement over:

typedef struct {
/* bunch of fields */
uint32_t numEntries; /* number of members in Entries */
/* SOMETYPE Entries[1]; follows in file record, but is not
explicitly in the struct */
} ALTSTRUCT; ....
Why include variable array(s) within the struct using the ARRAY[1]
notation? Is there some compelling reason to do it that way?

Besides the obvious thing of being able to refer to it by name, there is
at least the reason that the compiler will make sure that the "Entries"
member is properly aligned for SOMETYPE. That is not necessarily the
case for malloc(...)+sizeof(ALTSTRUCT).
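
The alignment point can be checked with a small sketch. It assumes, purely for illustration, that SOMETYPE is double and that the header consists of three uint32_t fields. Whether the two offsets actually differ depends on the platform's alignment rules.

```c
#include <stddef.h>
#include <stdint.h>

typedef double SOMETYPE;  /* assumption: an element type that may need stricter alignment */

typedef struct {
    uint32_t a, b;        /* stand-ins for "bunch of fields" (assumption) */
    uint32_t numEntries;
    SOMETYPE Entries[1];  /* compiler inserts padding before this if needed */
} FIRSTSTRUCT;

typedef struct {
    uint32_t a, b;
    uint32_t numEntries;  /* array deliberately left out */
} ALTSTRUCT;

/* Alignment requirement of SOMETYPE, derived without C11 _Alignof. */
struct align_probe { char c; SOMETYPE member; };
#define SOMETYPE_ALIGN (offsetof(struct align_probe, member))

/* Always true: the compiler places Entries on a suitable boundary. */
static int entries_offset_is_aligned(void)
{
    return offsetof(FIRSTSTRUCT, Entries) % SOMETYPE_ALIGN == 0;
}

/* May be false: sizeof(ALTSTRUCT) knows nothing about SOMETYPE. */
static int alt_offset_is_aligned(void)
{
    return sizeof(ALTSTRUCT) % SOMETYPE_ALIGN == 0;
}
```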

A C99 "flexible array member" (which does not count in sizeof of the
structure type) would possibly be the optimal solution, were C99 support
universal. (Or even existed at the time the structures above were
defined.)
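
As a sketch of what the flexible array member version might look like (the names FLEXSTRUCT and flex_alloc, the "style" field and the concrete SOMETYPE are all made up for illustration):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t x, y; } SOMETYPE;  /* placeholder element type (assumption) */

typedef struct {
    uint32_t style;       /* stands in for "bunch of fields" (assumption) */
    uint32_t numEntries;  /* number of members in Entries */
    SOMETYPE Entries[];   /* C99 flexible array member, not counted by sizeof */
} FLEXSTRUCT;

/* Allocate a header plus room for n entries; returns NULL on failure. */
static FLEXSTRUCT *flex_alloc(uint32_t n)
{
    FLEXSTRUCT *p = malloc(sizeof *p + (size_t)n * sizeof p->Entries[0]);
    if (p != NULL) {
        p->style = 0;
        p->numEntries = n;
    }
    return p;
}
```

Entries can still be referenced by name, and no "minus one" correction is needed when sizing the allocation.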
 
David Mathog

Besides the obvious thing of being able to refer to it by name, there is
at least the reason that the compiler will make sure that the "Entries"
member is properly aligned for SOMETYPE.  That is not necessarily the
case for malloc(...)+sizeof(ALTSTRUCT).

OK, alignment is a good reason. Referring by name may not even work,
though. For instance, there are other structs, like the one for
EMR_POLYPOLYLINE (not sure how stable this link will be):

http://msdn.microsoft.com/en-us/library/dd162568(v=vs.85).aspx

where the struct ends in:
DWORD aPolyCounts[1];
POINTL aptl[1];
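
To make that layout concrete, here is a sketch with a trimmed stand-in for the record. POLYREC and PT are hypothetical names, and the leading header and bounds fields of the real struct are omitted. It computes where the points really start once aPolyCounts holds nPolys elements.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t x, y; } PT;  /* stand-in for POINTL (assumption) */

typedef struct {
    uint32_t nPolys;          /* number of polylines */
    uint32_t cptl;            /* total number of points */
    uint32_t aPolyCounts[1];  /* really nPolys elements */
    PT       aptl[1];         /* really cptl elements, after ALL the counts */
} POLYREC;

/* Byte offset at which the point data actually begins in the record. */
static size_t aptl_offset(uint32_t nPolys)
{
    return offsetof(POLYREC, aPolyCounts) + (size_t)nPolys * sizeof(uint32_t);
}
```

Only for nPolys == 1 does this agree with offsetof(POLYREC, aptl), the location the compiler would use for the name aptl.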

Hard to imagine the situation where aptl could ever be successfully
referenced by name when aPolyCounts is also used.

A C99 "flexible array member" (which does not count in sizeof of the
structure type) would possibly be the optimal solution, were C99 support
universal.

I thought that structs including flexible array members were not
allowed to be members of other structs (or used in arrays, not that
that applies here). Otherwise flexible array members would be good
here because sizeof() would ignore the variable array on the end.

Basically the problem is that since the run time data structure in
memory (or on disk) looks like:

<struct1><variable length field(s)><struct2><variable length
field(s)>

there really is no way at compile time to reliably reference anything
but the members of struct1 by name using a pointer to the beginning of
this data assembly. (And then only if there aren't 2 arrays on the
end of struct1!) The compiler can generate name references to members
of struct2, but only with reference to a new memory pointer to the
beginning of that struct, and that pointer must be constructed at run
time.
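
A minimal sketch of that run-time step, assuming a trimmed stand-in struct (BMIHEAD, not the real bitmapinfo) and a hypothetical helper name. It checks the offset and size taken from the record against the record length, then copies the bytes into a properly aligned local struct rather than casting the pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Trimmed, hypothetical stand-in for the struct that follows the variable part. */
typedef struct {
    uint32_t biSize;
    int32_t  biWidth;
    int32_t  biHeight;
} BMIHEAD;

/* Copy the struct found at byte offset off (declared size cb) into *out.
   Returns 0 on success, -1 if the record cannot contain it. */
static int read_struct_at(const unsigned char *record, size_t recordSize,
                          uint32_t off, uint32_t cb, BMIHEAD *out)
{
    if (cb < sizeof *out)
        return -1;  /* declared size too small for the struct */
    if (off > recordSize || cb > recordSize - off)
        return -1;  /* would run past the end of the record */
    memcpy(out, record + off, sizeof *out);  /* no alignment assumptions */
    return 0;
}
```

In the records discussed here, off and cb would come from fields such as offBmi and cbBmi.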

Thanks,

David Mathog
 
