std::vector and structs of PODs

Simon Elliott

I have some new code which has to work with some legacy code which does
a lot of memset's and memcmp's on structs of PODs. This leads me to
want to do stuff like this:

struct foo
{
unsigned char c1_;
unsigned short us1_;
};

std::vector<foo> fooVect;

foo f;
memset(&f, 0, sizeof(f));
fooVect.push_back(f);

The compiler I've tested this with uses 4 byte alignment by default and
hence puts in some padding bytes. Looking at the copy of foo which is
now at fooVect[0], the padding bytes have random values. The zero
setting of the padding bytes has not been preserved when foo has been
copied to the vector.

I'd expect that what happens to padding bytes is undefined, but it
would be useful to have them preserved as zero. Any sensible,
non-kludgy, portable way of doing this?
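
To make the layout question concrete, here is a minimal sketch that reports where the compiler placed the members of foo and how many bytes are left over as padding. The helper names padding_bytes_in_foo and print_foo_layout are made up for this illustration; the numbers printed are implementation-defined, so none are promised here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct foo
{
    unsigned char  c1_;
    unsigned short us1_;
};

// Number of bytes in foo that belong to no member (padding of any kind).
// Hypothetical helper, not part of the original post.
inline std::size_t padding_bytes_in_foo()
{
    return sizeof(foo) - sizeof(unsigned char) - sizeof(unsigned short);
}

// Print the layout the current compiler chose.
inline void print_foo_layout()
{
    // us1_ cannot start before the end of c1_.
    assert(offsetof(foo, us1_) >= sizeof(unsigned char));
    std::printf("sizeof(foo)         = %lu\n",
                static_cast<unsigned long>(sizeof(foo)));
    std::printf("offsetof(foo, us1_) = %lu\n",
                static_cast<unsigned long>(offsetof(foo, us1_)));
    std::printf("padding bytes       = %lu\n",
                static_cast<unsigned long>(padding_bytes_in_foo()));
}
```

Calling print_foo_layout() from main shows whether the padding sits between c1_ and us1_, after us1_, or both, which matters for the suggestions later in the thread.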
 
Victor Bazarov

Simon said:
I have some new code which has to work with some legacy code which does
a lot of memset's and memcmp's on structs of PODs. This leads me to
want to do stuff like this:

struct foo
{
unsigned char c1_;
unsigned short us1_;
};

std::vector<foo> fooVect;

foo f;
memset(&f, 0, sizeof(f));
fooVect.push_back(f);

The compiler I've tested this with uses 4 byte alignment by default and
hence puts in some padding bytes. Looking at the copy of foo which is
now at fooVect[0], the padding bytes have random values. The zero
setting of the padding bytes has not been preserved when foo has been
copied to the vector.

I'd expect that what happens to padding bytes is undefined, but it
would be useful to have them preserved as zero.

How would it be useful? I am genuinely curious.

Simon said:
Any sensible, non-kludgy, portable way of doing this?

I don't think so. If you know what the size of 'foo' is with your current
alignment requirements, you could add

char padding[known_size_of_foo - sizeof(short) - 1];

To know what the size of 'foo' is, you need to replicate it

struct foo_replica {
char c; short s;
};

and use *its* size instead. Basically it should be

struct foo_replica
{
char c; short s;
};

struct foo
{
unsigned char c1_;
unsigned short us1_;
char padding[sizeof(foo_replica) - 1 - sizeof(short)];
};

Victor
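
A possible variation on the explicit-padding idea, sketched under assumptions: it places the padding array between the two members (at the offset taken from foo_replica) rather than at the end, and it uses C++11 static_assert to refuse to compile if any implicit padding remains. The names foo_padded, pad_ and copy_keeps_all_bytes are hypothetical. The sketch also assumes unsigned short has an alignment greater than 1 on the target, since otherwise the array would have zero length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Same member types as foo, used only to ask the compiler for its layout.
struct foo_replica
{
    unsigned char c; unsigned short s;
};

// Hypothetical variant of foo with the padding spelled out as a member.
struct foo_padded
{
    unsigned char  c1_;
    unsigned char  pad_[offsetof(foo_replica, s) - sizeof(unsigned char)];
    unsigned short us1_;
};

// If the compiler still adds padding of its own, fail at compile time.
static_assert(sizeof(foo_padded) ==
                  offsetof(foo_replica, s) + sizeof(unsigned short),
              "foo_padded still contains implicit padding");

// Copy an object into a vector and compare the raw bytes.
inline bool copy_keeps_all_bytes()
{
    foo_padded a;
    std::memset(&a, 0, sizeof a);
    a.c1_  = 1;
    a.us1_ = 2;

    std::vector<foo_padded> v;
    v.push_back(a);  // ordinary member-wise copy; pad_ is copied as a member
    return std::memcmp(&v[0], &a, sizeof a) == 0;
}
```

Because every byte of foo_padded belongs to some member, the ordinary copy constructor carries the zeroed bytes along, and memcmp() in the legacy code sees identical objects.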
 
Karl Heinz Buchegger

Simon said:
I have some new code which has to work with some legacy code which does
a lot of memset's and memcmp's on structs of PODs. This leads me to
want to do stuff like this:

struct foo
{
unsigned char c1_;
unsigned short us1_;
};

std::vector<foo> fooVect;

foo f;
memset(&f, 0, sizeof(f));
fooVect.push_back(f);

The compiler I've tested this with uses 4 byte alignment by default and
hence puts in some padding bytes. Looking at the copy of foo which is
now at fooVect[0], the padding bytes have random values. The zero
setting of the padding bytes has not been preserved when foo has been
copied to the vector.

I'd expect that what happens to padding bytes is undefined, but it
would be useful to have them preserved as zero. Any sensible,
non-kludgy, portable way of doing this?

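#include <string.h>  // memset, memcpy

struct foo
{
// Zero every byte of the object, padding included.
foo() { memset( this, 0, sizeof( foo ) ); }
// Copy byte for byte so the padding travels with the object.
foo( const foo& Arg ) { memcpy( this, &Arg, sizeof( foo ) ); }

unsigned char c1_;
unsigned short us1_;
};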

But note: this works only as long as the data members of that struct would
qualify it as a POD (strictly, it is no longer one, since a POD doesn't have
a ctor). The moment somebody introduces a std::string or any other
class with dynamic management in it, it will fail miserably.
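
One way to catch that failure mode at compile time, sketched here assuming a C++11 compiler (static_assert and <type_traits> are not part of the original suggestion): assert that foo stays standard-layout and trivially destructible. A std::string member has a non-trivial destructor, so adding one would break the build instead of silently corrupting memory. The copy assignment operator and the helper copies_compare_equal are additions made for this illustration.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

struct foo
{
    // The casts to void* only silence compiler warnings about raw
    // memory operations on a class with user-provided constructors.
    foo() { std::memset(static_cast<void*>(this), 0, sizeof(foo)); }
    foo(const foo& arg)
    {
        std::memcpy(static_cast<void*>(this), &arg, sizeof(foo));
    }
    foo& operator=(const foo& arg)
    {
        // memcpy must not be given overlapping ranges, so skip self-assignment.
        if (this != &arg)
            std::memcpy(static_cast<void*>(this), &arg, sizeof(foo));
        return *this;
    }

    unsigned char  c1_;
    unsigned short us1_;
};

// Guards: these stop compiling if a member such as std::string is added.
static_assert(std::is_standard_layout<foo>::value,
              "foo must stay standard-layout for memset/memcpy");
static_assert(std::is_trivially_destructible<foo>::value,
              "foo must not own resources (e.g. no std::string members)");

// Hypothetical check: a copy stored in a vector matches the original bytewise.
inline bool copies_compare_equal()
{
    foo a;          // all bytes zero, padding included
    a.c1_  = 1;
    a.us1_ = 2;

    std::vector<foo> v;
    v.push_back(a); // goes through the memcpy-based copy constructor
    return std::memcmp(&v[0], &a, sizeof(foo)) == 0;
}
```

The guards are deliberately conservative: they do not prove that raw copying is safe, they only reject the most common ways of making it unsafe.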
 
Dietmar Kuehl

Simon said:
The compiler I've tested this with uses 4 byte alignment by default and
hence puts in some padding bytes. Looking at the copy of foo which is
now at fooVect[0], the padding bytes have random values.

Sure but who cares? You can neither access these bytes nor take
any advantage of them: they are an implementation detail of the
compiler, nothing more. They might highlight some programming
errors, though: e.g. they may cause unterminated C-strings to
become terminated.

Simon said:
I'd expect that what happens to padding bytes is undefined, but it
would be useful to have them preserved as zero.

Why?
 
Simon Elliott

Victor Bazarov said:
How would it be useful? I am genuinely curious.

Down in the legacy code, memcmp() is used extensively to compare these
things. Unfortunately, memcmp() will compare things like padding bytes,
trailing bytes in C strings etc. So either I find a way of doing this,
or I have to make some significant changes to legacy code.

Simon said:
Any sensible, non-kludgy, portable way of doing this?

Victor Bazarov said:
I don't think so. If you know what the size of 'foo' is with your
current alignment requirements, you could add

char padding[known_size_of_foo - sizeof(short) - 1];

To know what the size of 'foo' is, you need to replicate it

struct foo_replica {
char c; short s;
};

and use its size instead. Basically it should be

struct foo_replica
{
char c; short s;
};

struct foo
{
unsigned char c1_;
unsigned short us1_;
char padding[sizeof(foo_replica) - 1 - sizeof(short)];
};

I'm not convinced that this will work because the compiler inserts
padding bytes between c1_ and us1_ so that they align optimally for the
target processor (4 byte boundaries I think in the case of the compiler
I'm using at the moment.) However, perhaps I could do something along
similar lines:

struct foo_holder
{
union
{
foo foo_;
char padding_[sizeof(foo)];
};
};

Then my vector would be a vector of foo_holder, and I'd pass
foo_holder::foo_ down to the legacy code. I expect I could have a
templated holder struct which could hold all the legacy structs.

Or I could push an undefined foo onto the vector, and do something
like:
memset(&fooVect.back(), 0, sizeof(foo));
then fill in foo's fields as referenced by fooVect.back()
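
A minimal sketch of that last idea, with hypothetical helper names (append_zeroed, equals_zeroed_reference). It relies on the observation that assigning to individual members leaves the surrounding padding bytes alone on common compilers; the standard does not promise this, and a later reallocation of the vector may copy elements in whatever way the implementation chooses.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

struct foo
{
    unsigned char  c1_;
    unsigned short us1_;
};

// Append an element, zero it in place inside the vector, then set the fields.
inline foo& append_zeroed(std::vector<foo>& v,
                          unsigned char c, unsigned short us)
{
    v.push_back(foo());                  // contents do not matter yet
    foo& slot = v.back();
    std::memset(&slot, 0, sizeof(foo));  // zero every byte, padding included
    slot.c1_  = c;
    slot.us1_ = us;
    return slot;
}

// Compare against a reference object built the same way outside the vector.
inline bool equals_zeroed_reference(const foo& f,
                                    unsigned char c, unsigned short us)
{
    foo ref;
    std::memset(&ref, 0, sizeof(foo));
    ref.c1_  = c;
    ref.us1_ = us;
    return std::memcmp(&f, &ref, sizeof(foo)) == 0;
}
```

Calling reserve() on the vector before appending avoids reallocation while it is being filled, which keeps the zeroed elements where they were written.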
 
Dietmar Kuehl

Simon said:
Down in the legacy code, memcmp() is used extensively to compare these
things. Unfortunately, memcmp() will compare things like padding bytes,
trailing bytes in C strings etc. So either I find a way of doing this,
or I have to make some significant changes to legacy code.

Hm, OK. I don't think this is the best way to do it, but it seems like a
reasonable request. Do you need to pass (pointers to) arrays of your
type? If not, you could use something like this:

/**/ struct wrapper
/**/ {
/**/ wrapper() { std::memset(&m_foo, 0, sizeof(foo)); }
/**/ wrapper(wrapper const& w) {
/**/ std::memcpy(&m_foo, &w.m_foo, sizeof(foo));
/**/ }
/**/ foo m_foo;
/**/ };

Even if you need arrays of 'foo's, you might be able to use this
wrapper technique, if 'sizeof(wrapper) == sizeof(foo)': you could
just reinterpret_cast to the appropriate type...
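
The size condition can be checked by the compiler rather than by eye. The following is only a sketch, assuming C++11: the static_asserts, the copy assignment operator and the helper as_foo_array are additions for illustration. Indexing through the returned pointer as if it were an array of foo works on common implementations but is not something the standard guarantees.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

struct foo
{
    unsigned char  c1_;
    unsigned short us1_;
};

struct wrapper
{
    wrapper() { std::memset(&m_foo, 0, sizeof(foo)); }
    wrapper(wrapper const& w) { std::memcpy(&m_foo, &w.m_foo, sizeof(foo)); }
    wrapper& operator=(wrapper const& w)
    {
        if (this != &w)  // memcpy must not see overlapping ranges
            std::memcpy(&m_foo, &w.m_foo, sizeof(foo));
        return *this;
    }
    foo m_foo;
};

// The cast below only makes sense if the two types occupy the same space.
static_assert(sizeof(wrapper) == sizeof(foo),
              "wrapper must not be larger than foo");
static_assert(std::is_standard_layout<wrapper>::value,
              "wrapper must be standard-layout so m_foo sits at offset 0");

// Hypothetical helper: view the vector's storage as the foo array that
// the legacy code expects. Returns a null pointer for an empty vector.
inline foo* as_foo_array(std::vector<wrapper>& v)
{
    return v.empty() ? nullptr : reinterpret_cast<foo*>(&v[0]);
}
```

A std::vector<wrapper> filled this way can then be handed to legacy functions as as_foo_array(v) together with v.size().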

Simon said:
However, perhaps I could do something along similar lines:

struct foo_holder
{
union
{
foo foo_;
char padding_[sizeof(foo)];
};
};

Why use a union? If you use a wrapper, use it with appropriate
constructors...
 
Llewelly

Dietmar Kuehl said:
Simon said:
The compiler I've tested this with uses 4 byte alignment by default and
hence puts in some padding bytes. Looking at the copy of foo which is
now at fooVect[0], the padding bytes have random values.

Sure but who cares? You can neither access these bytes nor take
any advantage of them: they are an implementation detail of the
compiler, nothing more. They might highlight some programming
errors, though: e.g. they may cause unterminated C-strings to
become terminated.

Simon said:
I'd expect that what happens to padding bytes is undefined, but it
would be useful to have them preserved as zero.

Dietmar Kuehl said:
Why?

It can help hide programming errors. Remember, the longer a
programming error goes undetected, the more chances it has to cause
an embarrassing crash during a demo.
 
