Padding bits and struct assignment

  • Thread starter Hallvard B Furuseth

Hallvard B Furuseth

Does struct assignment copy padding bytes? Some compilers do, but
I couldn't find anything in the standard which says they must.

What I need is for any padding bytes to contain initialized values
before fwrite(), to shut up memory debuggers like Valgrind about
writing uninitialized data to the file.

Simplified code:
static const struct S default_value = ...;
struct S s, t;
t = default_value;
for (...) {
s = t;
...modify s and t...;
fwrite(&s, sizeof(s), 1, f);
}

If padding bytes are not copied, I guess I could wrap the struct in
union {
struct S s;
char bytes[sizeof(struct S)];
};
and copy that instead of struct S, or memset both s and t after
the declarations. (When Valgrind copies an uninitialized byte
from t to s, it remembers that that byte in s contains an
uninitialized value.)
 

Eric Sosman

Hallvard said:
Does struct assignment copy padding bytes? Some compilers do, but
I couldn't find anything in the standard which says they must.

It is not required that padding bytes be copied.

6.2.6 Representations of types
6.2.6.1 General
[...]
6/ When a value is stored in an object of structure or
union type [...] the bytes of the object representation that
correspond to any padding bytes take unspecified values. 42)

42) Thus, for example, structure assignment may be
implemented element-at-a-time or via memcpy.
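To make the quoted clause concrete, here is a minimal sketch of how one might see where padding sits in a struct. The struct S below and its members e1, e2, e3 are hypothetical, chosen only for illustration; the amount of padding is implementation-defined and may be zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct; member names follow the e1/e2 style used in the thread. */
struct S {
    char e1;
    long e2;
    char e3;
};

/* Number of bytes in struct S that belong to no named member. */
static size_t total_padding(void)
{
    size_t used = sizeof(char) + sizeof(long) + sizeof(char);
    assert(sizeof(struct S) >= used);
    return sizeof(struct S) - used;
}

/* Size of the gap, if any, between the end of e1 and the start of e2. */
static size_t gap_after_e1(void)
{
    return offsetof(struct S, e2) - (offsetof(struct S, e1) + sizeof(char));
}
```

On many common ABIs both functions return nonzero values for this layout, and those are exactly the bytes whose contents the standard leaves unspecified after a store to the struct.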
What I need is for any padding bytes to contain initialized values
before fwrite(), to shut up memory debuggers like Valgrind about
writing uninitialized data to the file.

Simplified code:
static const struct S default_value = ...;
struct S s, t;
t = default_value;
for (...) {
s = t;
...modify s and t...;
fwrite(&s, sizeof(s), 1, f);
}

Hmmm. All the named elements of default_value are initialized,
either explicitly by the initializer or implicitly by virtue of the
static storage duration, but I don't think the guarantee extends to
its padding bytes. Even if the padding bytes are initialized (as
seems likely on common implementations), the assignment to t need
not copy them and may leave the padding bytes in t uninitialized.
You could do memset(&t, 42, sizeof t) first, but even then I don't
think you're completely safe: if the padding bytes of default_value
are uninitialized and the assignment *does* copy them, then the
padding bytes of t wind up "uninitialized," too.
If padding bytes are not copied, I guess I could wrap the struct in
union {
struct S s;
char bytes[sizeof(struct S)];
};
and copy that instead of struct S, or memset both s and t after
the declarations. (When Valgrind copies an uninitialized byte
from t to s, it remembers that that byte in s contains an
uninitialized value.)

This may silence the memory monitor, but from C's point of
view I don't think it makes a difference. *Every* assignment
to a struct/union or to any of its members may pollute whatever
padding bytes are present. The memory monitor may no longer see
them as "uninitialized," but they are "potentially garbage"
nonetheless.
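For reference, the memset-and-copy-bytes idea under discussion might look roughly like the sketch below. The struct layout and the helper names init_zeroed and copy_bytes are assumptions made for illustration. As the paragraph above says, this is aimed at quieting a memory checker: later member assignments are still allowed to disturb the padding.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct with likely padding after tag. */
struct S {
    char tag;
    int value;
};

/* Give every byte, padding included, a defined value, then set the members. */
static void init_zeroed(struct S *dst, char tag, int value)
{
    assert(dst != NULL);
    memset(dst, 0, sizeof *dst);
    dst->tag = tag;       /* in principle this store may alter padding bytes */
    dst->value = value;
}

/* Copy the whole object representation instead of using struct assignment. */
static void copy_bytes(struct S *dst, const struct S *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

Unlike `s = t;`, the memcpy call is required to copy every byte, so whatever state the padding of t is in carries over to s unchanged.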

The only foolproof method I can think of is unfortunately
icky, and won't work for structs with bit-fields:

struct S s = ...;
unsigned char b[sizeof s];
memset (b, 42, sizeof b);
memcpy (b + offsetof(struct S, e1), &s.e1, sizeof s.e1);
memcpy (b + offsetof(struct S, e2), &s.e2, sizeof s.e2);
...
fwrite (b, sizeof b, 1, f);

... which would probably be improved by a little table of sizes
and offsets. Given such a table, though, you might choose not
to write the padding bytes out at all.
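A table-driven version of the approach above could be sketched as follows. The struct S, its members, the FIELD macro, and pack_s are all hypothetical names introduced here; like the original, it does not handle bit-fields, since offsetof cannot be applied to them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct; e1..e3 stand in for the real members. */
struct S {
    char e1;
    double e2;
    short e3;
};

struct field {
    size_t offset;
    size_t size;
};

#define FIELD(type, member) \
    { offsetof(type, member), sizeof(((type *)0)->member) }

/* One entry per named member of struct S. */
static const struct field s_fields[] = {
    FIELD(struct S, e1),
    FIELD(struct S, e2),
    FIELD(struct S, e3)
};

/* Fill out with a known byte, then copy only the named members into place.
   out must have room for sizeof(struct S) bytes. */
static void pack_s(const struct S *s, unsigned char *out, int fill)
{
    size_t i;

    assert(s != NULL && out != NULL);
    memset(out, fill, sizeof(struct S));
    for (i = 0; i < sizeof s_fields / sizeof s_fields[0]; i++)
        memcpy(out + s_fields[i].offset,
               (const unsigned char *)s + s_fields[i].offset,
               s_fields[i].size);
}
```

The caller would then pass the buffer to fwrite in place of the struct itself, e.g. `fwrite(out, sizeof(struct S), 1, f)`. With the same table one could instead write each member separately and skip the padding entirely.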
 

Hallvard B Furuseth

Eric said:
It is not required that padding bytes be copied.

6.2.6 Representations of types
(...)

Thanks. That's what I was not hoping for:)
Hmmm. All the named elements of default_value are initialized,
either explicitly by the initializer or implicitly by virtue of the
static storage duration, but I don't think the guarantee extends to
its padding bytes. (...)

Probably true, but Valgrind & co treat static storage as initialized -
at least on the few hosts I have seen. I guess they have no way to tell
the difference between an uninitialized static byte and a static
variable with no initializer.

I think whether I'll use a static will depend on how cumbersome it
gets not to, and how safe the rest of the code gets. Which after
your reply doesn't look too hopeful...
If padding bytes are not copied, I guess I could wrap the struct in
union {
struct S s;
char bytes[sizeof(struct S)];
};
and copy that instead of struct S, or memset both s and t after
the declarations. (When Valgrind copies an uninitialized byte
from t to s, it remembers that that byte in s contains an
uninitialized value.)

This may silence the memory monitor, but from C's point of
view I don't think it makes a difference. *Every* assignment
to a struct/union or to any of its members may pollute whatever
padding bytes are present. The memory monitor may no longer see
them as "uninitialized," but they are "potentially garbage"
nonetheless.

Oh, of course. I stuff nicely initialized data into the union,
then assign to a bit-field or char member in the struct, and maybe
the padding bytes in the same word get scrambled :-(

Hmm. That's a problem with security too, if one wants to fwrite()
structs but be sure to not write other internal data in the program.
The only foolproof method I can think of is unfortunately
icky, and won't work for structs with bit-fields:

struct S s = ...;
unsigned char b[sizeof s];
memset (b, 42, sizeof b);
memcpy (b + offsetof(struct S, e1), &s.e1, sizeof s.e1);
memcpy (b + offsetof(struct S, e2), &s.e2, sizeof s.e2);
...
fwrite (b, sizeof b, 1, f);

... which would probably be improved by a little table of sizes
and offsets. Given such a table, though, you might choose not
to write the padding bytes out at all.

Ouch. I don't think I'll be going there just to be Valgrind-safe,
not until it actually complains anyway. After all, the code is
correct as C code; it's just Valgrind that gets noisy.

Thanks for the help.
 

Eric Sosman

Hallvard said:
[about writing "uninitialized" padding bytes with fwrite]

Hmm. That's a problem with security too, if one wants to fwrite()
structs but be sure to not write other internal data in the program.

I've seen a related problem that didn't even need padding
bytes. Roughly, the scenario was:

struct {
char first[SIZE];
char last[SIZE];
} name;
strcpy (name.first, "Hallvard");
strcpy (name.last, "Furuseth");
fwrite (&name, sizeof name, 1, stream);

Each strcpy() writes to the first nine bytes of its target
array, but leaves the tail ends untouched (assuming SIZE > 9).
Whatever happens to be lying around in those locations gets
written to the output stream; it may or may not be something
that ought to be revealed. In the particular case I ran across,
the telephone numbers of some of my co-workers were visible in
the disk file.
 

CBFalconer

Sounds like an excellent opportunity for strncpy.

strncpy(name.first, "Hallvard", SIZE);
strncpy(name.last, "Furuseth", SIZE);
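The reason strncpy helps here is that it pads the destination with null bytes up to the given count when the source is shorter. A minimal sketch, assuming SIZE is 16 and using a hypothetical set_name helper:

```c
#include <assert.h>
#include <string.h>

#define SIZE 16   /* assumed value; the thread does not give one */

struct name {
    char first[SIZE];
    char last[SIZE];
};

/* Fill both arrays completely: the string, then null bytes up to SIZE. */
static void set_name(struct name *n, const char *first, const char *last)
{
    /* strncpy does not terminate the result if the source has SIZE or
       more characters, so reject such input here. */
    assert(strlen(first) < SIZE && strlen(last) < SIZE);
    strncpy(n->first, first, SIZE);
    strncpy(n->last, last, SIZE);
}
```

After set_name, every byte of both arrays has been written, so a following fwrite of the struct no longer exposes whatever the arrays held before.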
 

Eric Sosman

Yes, that was my quick and dirty fix. "Dirty" because a
better solution would have been to devise a file format that
(1) didn't depend on non-portable details like struct layout
and (2) didn't waste time writing and reading useless bytes.
However, I also needed "quick," and since there was a large
number of old-format files kicking around in our own and our
customers' computers ... The desire for "quick" trumped the
distaste for "dirty."
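As an illustration of points (1) and (2), one possible layout-independent encoding is a length-prefixed string: a 16-bit length in a fixed byte order followed by exactly that many characters. This is only a sketch of one such format, not the one used in the case described; encode_string and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode s as a 2-byte little-endian length followed by the characters,
   without terminator or filler. Returns the number of bytes written to
   out, or 0 if the string is too long or does not fit in cap bytes. */
static size_t encode_string(unsigned char *out, size_t cap, const char *s)
{
    size_t len = strlen(s);

    assert(out != NULL);
    if (len > 0xFFFF || cap < len + 2)
        return 0;
    out[0] = (unsigned char)(len & 0xFF);
    out[1] = (unsigned char)(len >> 8);
    memcpy(out + 2, s, len);
    return len + 2;
}
```

The encoded bytes can be handed to fwrite with the returned count, so the file contains only bytes the program explicitly produced, regardless of struct layout or array size.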
 
