gcc alignment options

BartC

I know this is considered off-topic but perhaps someone's got some ideas.

I have a struct like this with elements all packed:

#pragma pack(1)

typedef struct s {
short int a;
char b;
char c;
int d;
} V;

V *p;

When using gcc to access the field p->d, then it seems to assume that it's
an unaligned access (and generates atrocious code which loads or stores the
field a byte at a time). The offset of 'd' is 4 bytes.

The problem goes away when pack(1) is removed (even though the possibility
of p being misaligned, if that's what it's worried about, is the same).

This is for an ARM processor where alignment seems to be important, but
which already has performance issues. Perhaps there is a gcc option to turn
off this behaviour?

(I think a workaround is to not use pack(1) since I already define the
fields optimally - in my view. But I need to ensure no extra padding is
inserted in the actual struct which is much more complex than the
example.)

Thanks.
 
Ian Collins

I know this is considered off-topic but perhaps someone's got some ideas.

I have a struct like this with elements all packed:

#pragma pack(1)

typedef struct s {
short int a;
char b;
char c;
int d;
} V;

V *p;

When using gcc to access the field p->d, then it seems to assume that it's
an unaligned access (and generates atrocious code which loads or stores the
field a byte at a time). The offset of 'd' is 4 bytes.

The problem goes away when pack(1) is removed (even though the possibility
of p being misaligned, if that's what it's worried about, is the same).

This is for an ARM processor where alignment seems to be important, but
which already has performance issues. Perhaps there is a gcc option to turn
off this behaviour?

(I think a workaround is to not use pack(1) since I already define the
fields optimally - in my view. But I need to ensure no extra padding is
inserted in the actual struct which is much more complex than the
example.)

It sounds like you are fighting the compiler. Why don't you use a
compile time assert to verify the structure size?
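
A minimal sketch of such a compile-time check, using C11's static_assert together with offsetof. The expected numbers assume a 2-byte short and a 4-byte int (as on typical 32-bit ARM ABIs); they are assumptions, not something stated in the thread, so adjust them for the real struct and target.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Same fields as the struct in the question, but without #pragma pack. */
typedef struct s {
    short int a;
    char b;
    char c;
    int d;
} V;

/* Compilation fails if the compiler inserted padding anywhere.
   The values assume sizeof(short) == 2 and sizeof(int) == 4. */
static_assert(offsetof(V, b) == 2, "unexpected padding before b");
static_assert(offsetof(V, c) == 3, "unexpected padding before c");
static_assert(offsetof(V, d) == 4, "unexpected padding before d");
static_assert(sizeof(V) == 8, "unexpected padding in V");
```

If any assumption about the layout is wrong, the build stops with the given message instead of silently producing a struct that does not match the assembler offsets.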
 
Stephen Sprunk

When using gcc to access the field p->d, then it seems to assume
that it's an unaligned access (and generates atrocious code which
loads or stores the field a byte at a time). The offset of 'd' is 4
bytes.

The problem goes away when pack(1) is removed (even though the
possibility of p being misaligned, if that's what it's worried
about, is the same).

Normally, GCC will lay out the struct in memory so that all of the
fields are aligned, by inserting padding where necessary, so there is no
need to worry about misaligned accesses. If you manually pack the
struct, then you don't need #pragma pack, and GCC will generate code for
aligned loads like you want it to.

Using #pragma pack tells GCC to allow the fields to be misaligned, so it
will generate code to handle potentially misaligned loads--even if they
don't happen to be misaligned in practice.

S
 
Keith Thompson

Stephen Sprunk said:
Normally, GCC will lay out the struct in memory so that all of the
fields are aligned, by inserting padding where necessary, so there is no
need to worry about misaligned accesses. If you manually pack the
struct, then you don't need #pragma pack, and GCC will generate code for
aligned loads like you want it to.

Using #pragma pack tells GCC to allow the fields to be misaligned, so it
will generate code to handle potentially misaligned loads--even if they
don't happen to be misaligned in practice.

But gcc *could* have noticed that d happens to be aligned properly,
and generated simpler and faster code to access it.

Note that gcc's "#pragma pack" can be unsafe in some cases. Taking
the address of a misaligned member loses the alignment information.
See <http://stackoverflow.com/q/8568432/827263>.
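
A small sketch of the hazard and one common way around it. The struct packed_rec and the helper names are made up for illustration. The idea is that direct member access is safe because the compiler knows the member is packed, while a plain int* to the member no longer carries that knowledge, so the bytes are copied out with memcpy instead.

```c
#include <assert.h>
#include <string.h>

#pragma pack(push, 1)
struct packed_rec {   /* hypothetical example type */
    char tag;
    int value;        /* offset 1: misaligned for int on most targets */
};
#pragma pack(pop)

static_assert(sizeof(struct packed_rec) == 1 + sizeof(int),
              "struct is not packed");

/* Read an int from an address that may be misaligned.
   memcpy places no alignment requirement on its arguments. */
static int load_int(const void *p)
{
    int v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Direct member access: the compiler knows the member is packed. */
static int get_value_direct(const struct packed_rec *r)
{
    return r->value;
}

/* Passing the member's address on: go through load_int() rather than
   dereferencing an int* that has lost the "may be misaligned" information. */
static int get_value_via_pointer(const struct packed_rec *r)
{
    return load_int(&r->value);
}
```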

To steer this back to some semblance of topicality, the behavior of
a #pragma other than the language-defined ones is *very* loosely
defined. N1570 6.10.6:

A preprocessing directive of the form

# pragma pp-tokens[opt] new-line

where the preprocessing token "STDC" does not immediately follow
"pragma" in the directive (prior to any macro replacement)
causes the implementation to behave in an implementation-defined
manner. The behavior might cause translation to fail or
cause the translator or the resulting program to behave in a
non-conforming manner. Any such "pragma" that is not recognized
by the implementation is ignored.

(In the above, I used quotation marks to denote bold type.)
 
Kaz Kylheku

The problem goes away when pack(1) is removed (even though the possibility
of p being misaligned, if that's what it's worried about, is the same).

This is for an ARM processor where alignment seems to be important, but
which already has performance issues. Perhaps there is a gcc option to turn
off this behaviour?

Now I'm a little rusty about details of the ARM architecture, but how this
works on some other architectures such as MIPS is that if you get gcc to
generate misaligned accesses, the result will be far worse. Because these will
trigger exceptions, and then it's up to an exception handler in the kernel to
make those accesses work. That is orders of magnitude slower than the
atrocious code which requires no traps.

It's a time-space tradeoff. You save space by packing, but then there are extra
cycles to do the access. If you think you can write better code to work with
the packed structure, you can make your own accessors in inline assembly
language.

(I think a workaround is to not use pack(1) since I already define the
fields optimally - in my view. But I need to ensure no extra padding is
inserted in the actual struct which is much more complex than the
example.)

Is that for conforming to some network or file format?

You know, you could have two structures for that: the internal one and external
one. Do a field-for-field copy between them at the system boundaries where the
structure enters your program or is output. Then, in the guts of the program,
work with the unpacked one.

This kind of thing is commonly done (even without the packed structures: just
marshalling routines which pack the fields of a structure into a buffer, or
byte stream).
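
A rough sketch of such marshalling routines. The struct rec, the function names and the little-endian 8-byte external layout are all assumptions chosen for illustration; a real format dictates its own byte order and field sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Internal representation: natural alignment, fast field access. */
struct rec {
    int16_t a;
    int8_t  b;
    int8_t  c;
    int32_t d;
};

enum { REC_WIRE_SIZE = 8 };  /* 2 + 1 + 1 + 4 bytes, no padding */

/* Pack the fields into buf, least significant byte first (assumed format). */
static void rec_marshal(const struct rec *r, unsigned char buf[REC_WIRE_SIZE])
{
    assert(r && buf);
    uint16_t a = (uint16_t)r->a;
    uint32_t d = (uint32_t)r->d;
    buf[0] = (unsigned char)(a & 0xFF);
    buf[1] = (unsigned char)(a >> 8);
    buf[2] = (unsigned char)r->b;
    buf[3] = (unsigned char)r->c;
    buf[4] = (unsigned char)(d & 0xFF);
    buf[5] = (unsigned char)((d >> 8) & 0xFF);
    buf[6] = (unsigned char)((d >> 16) & 0xFF);
    buf[7] = (unsigned char)((d >> 24) & 0xFF);
}

/* Rebuild the internal struct from the byte buffer.
   The unsigned-to-signed conversions rely on two's complement,
   which is what gcc implements. */
static void rec_unmarshal(struct rec *r, const unsigned char buf[REC_WIRE_SIZE])
{
    assert(r && buf);
    uint16_t a = (uint16_t)(buf[0] | (buf[1] << 8));
    uint32_t d = (uint32_t)buf[4] | ((uint32_t)buf[5] << 8)
               | ((uint32_t)buf[6] << 16) | ((uint32_t)buf[7] << 24);
    r->a = (int16_t)a;
    r->b = (int8_t)buf[2];
    r->c = (int8_t)buf[3];
    r->d = (int32_t)d;
}
```

Only byte accesses touch the external buffer, so no packing pragma is needed and the buffer itself may sit at any address.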
 
Alan Curry

I know this is considered off-topic but perhaps someone's got some ideas.

I have a struct like this with elements all packed:

#pragma pack(1)

typedef struct s {
short int a;
char b;
char c;
int d;
} V;

V *p;

When using gcc to access the field p->d, then it seems to assume that it's
an unaligned access (and generates atrocious code which loads or stores the
field a byte at a time). The offset of 'd' is 4 bytes.

The single-byte access is necessary because the whole struct is packed,
meaning that the compiler no longer feels free to assume that the struct
itself is aligned on a 4-byte boundary.

Here's a sample program which demonstrates the difference between a packed
struct and a non-packed struct with packed members. I don't have an ARM
compiler so I can't test the effect on loads and stores, but I'm betting it
will make a difference.

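#include <stdio.h>

/* s1: the whole struct is declared while pack(1) is in effect,
   so the struct type itself ends up with alignment 1. */
#pragma pack(push)
#pragma pack(1)
struct s1 {
    short int a;
    char b;
    char c;
    int d;
};
#pragma pack(pop)

/* s2: pack(1) covers only the member declarations and is popped
   before the closing brace of the struct. */
struct s2 {
#pragma pack(push)
#pragma pack(1)
    short int a;
    char b;
    char c;
    int d;
#pragma pack(pop)
};

struct s1 *p_s1;
struct s2 *p_s2;

int main(void)
{
    /* __alignof__ is a gcc extension; its result has type size_t,
       so print it with %zu rather than %lu. */
    printf("%zu\n", __alignof__(struct s1));
    printf("%zu\n", __alignof__(struct s2));
    printf("%zu\n", __alignof__(*p_s1));
    printf("%zu\n", __alignof__(*p_s2));
    return 0;
}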
 
Keith Thompson

Keith Thompson said:
But gcc *could* have noticed that d happens to be aligned properly,
and generated simpler and faster code to access it.

As Alan Curry points out, gcc's #pragma pack affects the alignment of
the structure itself, not just the layout of its members.

Which probably implies that BartC shouldn't be using #pragma pack
in this case. As Ian Collins says, you can just verify that the
struct has the size you want.
 
BartC

Alan Curry said:
The single-byte access is necessary because the whole struct is packed,
meaning that the compiler no longer feels free to assume that the struct
itself is aligned on a 4-byte boundary.

Here's a sample program which demonstrates the difference between a packed
struct and a non-packed struct with packed members. I don't have an ARM
compiler so I can't test the effect on loads and stores, but I'm betting it
will make a difference.
#pragma pack(push)
#pragma pack(1)
struct s1 { ....
struct s2 {
#pragma pack(push)
#pragma pack(1)
short int a;

....

OK, I see the difference. The program gives 1,4,1,4. So pack(1) also means
the following struct type may itself be unaligned, if the target of a
pointer.

That also explains why, when copying such a struct from one pointer to another,
it used a memcpy() call, rather than inline code.
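
One possible middle ground, sketched here as an assumption rather than something tested in this thread: GCC and Clang accept the packed and aligned attributes together, which keeps the members packed while declaring the struct as a whole to be 4-byte aligned. The name s3 is made up, and the checked values assume a 2-byte short and a 4-byte int. Whether the compiler then uses word loads for d on a given ARM target is worth confirming in the generated assembly.

```c
#include <assert.h>
#include <stddef.h>

/* GCC/Clang-specific: no padding between members, but the struct
   type itself is declared to be 4-byte aligned. */
struct __attribute__((packed, aligned(4))) s3 {
    short int a;
    char b;
    char c;
    int d;
};

/* Assumes sizeof(short) == 2 and sizeof(int) == 4. */
static_assert(__alignof__(struct s3) == 4, "struct should be 4-byte aligned");
static_assert(offsetof(struct s3, d) == 4, "d should sit at offset 4");
static_assert(sizeof(struct s3) == 8, "no padding expected");
```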
 
BartC

BartC said:
I have a struct like this with elements all packed:

#pragma pack(1)

typedef struct s { ....
When using gcc to access the field p->d, then it seems to assume that it's
an unaligned access (and generates atrocious code which loads or stores the
field a byte at a time). The offset of 'd' is 4 bytes.

Thanks for the replies. It looks like I'll be able to get away without using
pack(1), which is useful since it made my brief tests run up to 3 times as
fast! That's partly due to the compiler being able to use fast inline code
to copy these structs.

(This code is ported from one of my languages which never adds any struct
padding. The fields are also accessed in all sorts of dubious ways which
would raise eyebrows if I mentioned them. They are also accessed from
assembler code using a set of defines for the offsets which need to match
the expected layout. Hence the need to control exactly where the fields are
placed.)
 
Ben Kibbey

BartC said:
I know this is considered off-topic but perhaps someone's got some ideas.

I have a struct like this with elements all packed:

#pragma pack(1)

typedef struct s {
short int a;
char b;
char c;
int d;
} V;

V *p;

Is there a difference between GCC's #pragma pack(1) and
__attribute__((packed))?
 
BartC

Is there a difference between GCC's #pragma pack(1) and
__attribute__((packed))?

I don't know, but the former is easier to remember. It's more universal
(only gcc seems to know about __attribute__). And the syntax is
self-contained (I believe the attribute has to be part of a declaration,
whether before, somewhere in the middle, or after the declaration, I can't
remember either). It also seems to allow a choice of alignment.

So there's no contest really...
 
Ben Bacarisse

BartC said:
I don't know, but the former is easier to remember. It's more
universal (only gcc seems to know about __attribute__). And the syntax
is self-contained (I believe the attribute has to be part of a
declaration, whether before, somewhere in the middle, or after the
declaration, I can't remember either). It also seems to allow a choice
of alignment.

So there's no contest really...

Yes, __attribute__ is clearly superior!

I know it's not that clear-cut in all situations but there's no contest
for me: __attribute__ is much more flexible. The main advantage being
that you can use #define to remove it altogether or to simplify the usage
as much as you want. You avoid a nest of often repeated #if's.

In GCC, you can't use it to push/pop the settings and you can't specify
a pack alignment, but I've never found either to be particularly
useful. If you do need either then, I agree, no contest the other way!
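
A short sketch of the #define approach described above. PACKED and struct header are made-up names; the point is only that the attribute hides behind one macro that can expand to nothing on compilers without it.

```c
#include <assert.h>

/* PACKED is a hypothetical project macro, not a standard name. */
#if defined(__GNUC__)
#define PACKED __attribute__((packed))
#else
#define PACKED   /* expands to nothing: natural layout */
#endif

struct PACKED header {
    char tag;
    int length;
};

#if defined(__GNUC__)
/* With the attribute active there is no padding after tag. */
static_assert(sizeof(struct header) == 1 + sizeof(int),
              "header is not packed");
#endif
```

The struct definition itself stays free of #if blocks; only the macro definition varies per compiler.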
 
Ben Kibbey

Ben Bacarisse said:
Yes, __attribute__ is clearly superior!

I know it's not that clear-cut in all situations but there's no contest
for me: __attribute__ is much more flexible. The main advantage being
that you can use #define to remove it altogether or to simplify the usage
as much as you want. You avoid a nest of often repeated #if's.

In GCC, you can't use it to push/pop the settings and you can't specify
a pack alignment, but I've never found either to be particularly
useful. If you do need either then, I agree, no contest the other
way!

But then there's the portability question. I'm not sure what compilers
support __attribute__ for packing. I've only used GCC and Clang (which
support both).
 
Ben Bacarisse

Ben Kibbey said:
But then there's the portability question. I'm not sure what compilers
support __attribute__ for packing. I've only used GCC and Clang (which
support both).

Something that can be #defined is more flexible for portability. If all
C systems support something similar, all you need is one #define per
system. MS compilers have, I believe, __pragma(...) and I'd hope others
have followed this pattern.

I agree that there will appear to be no practical advantage if all the
systems you intend to use support the same #pragma syntax, but I
consider that just a coincidence. The functional syntax is still better
even if there is no practical advantage on some particular set of
implementations.
 
Ben Kibbey

Ben Bacarisse said:
Something that can be #defined is more flexible for portability. If all
C systems support something similar, all you need is one #define per
system. MS compilers have, I believe, __pragma(...) and I'd hope others
have followed this pattern.

I agree that there will appear to be no practical advantage if all the
systems you intend to use support the same #pragma syntax, but I
consider that just a coincidence. The functional syntax is still better
even if there is no practical advantage on some particular set of
implementations.

I guess what I need to know is if #pragma pack() is standard/portable?
What about #pragma itself?
 
Eric Sosman

[...]
I agree that there will appear to be no practical advantage if all the
systems you intend to use support the same #pragma syntax, but I
consider that just a coincidence. The functional syntax is still better
even if there is no practical advantage on some particular set of
implementations.

A problem I haven't seen mentioned is the meaning of the number
in `#pragma pack'. It appears the Microsoft compilers treat the
number as a count of bytes, so 1,2,4 call for alignment to one-, two-,
and four-byte boundaries. I've seen a compiler whose analogous
construct used the number as the count of low-order zero bits in the
addresses, so 1,2,4 meant two-, four-, and sixteen-byte alignment;
for one-, two-, and four-byte alignments you'd specify 0,1,2 as the
number in the #pragma.

Some folks take comfort from the Standard's requirement that
unrecognized #pragmas be ignored, but I don't. The problem remains
that if you write a #pragma for Compiler X, a different Compiler Y
might "recognize" it but attach an entirely different meaning ...
 
Ben Bacarisse

Ben Kibbey said:
I guess what I need to know is if #pragma pack() is standard/portable?
What about #pragma itself?

The pack pragma is not standard in the sense of being in the language
standard, but the notion of having pragmas is standard. C defines three
pragmas (they all start #pragma STDC ...) but they are all about
floating-point or complex arithmetic.
 
Eric Sosman

I guess what I need to know is if #pragma pack() is standard/portable?

It's not standard. "Portable" is a fuzzier notion.

What about #pragma itself?

The Standard describes #pragma (and _Pragma) and describes the
effect of `#pragma STDC ...' forms. Other forms are implementation-
defined.
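
Because _Pragma is an operator rather than a directive, it can appear in a macro definition, which gives the pack pragma the same #define-ability discussed earlier for __attribute__. A minimal sketch, assuming GCC or Clang; PACK_PUSH, PACK_POP and the struct names are hypothetical.

```c
#include <assert.h>

/* Hypothetical wrapper macros; on other compilers they expand to nothing. */
#if defined(__GNUC__)
#define PACK_PUSH _Pragma("pack(push, 1)")
#define PACK_POP  _Pragma("pack(pop)")
#else
#define PACK_PUSH
#define PACK_POP
#endif

PACK_PUSH
struct wire_hdr {     /* declared while pack(1) is in effect */
    char tag;
    int length;
};
PACK_POP

struct plain_hdr {    /* declared after the pop: natural layout */
    char tag;
    int length;
};

#if defined(__GNUC__)
static_assert(sizeof(struct wire_hdr) == 1 + sizeof(int),
              "wire_hdr is not packed");
#endif
```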
 
George Beer

BartC said:
I know this is considered off-topic but perhaps someone's got some
ideas.
I have a struct like this with elements all packed:

#pragma pack(1)

typedef struct s {
short int a;
char b;
char c;
int d;
} V;

V *p;

Is all this banter necessary? Why don't you just fix it and "graduate" to
another level? Seesh. You life-long programmers are quite annoying. Oh wait,
are y'all Viet Nam vets? OK then, my bad. Not that you have less legs, that
is your government's fault. I was going to say something about "I", but your
goverment has you bitched so much that you think that loss of life and limb
is a solution. It isn't. But I won't call it a crime, cuz then I'd get a
public defender and be so damned by that post-menapausal bitch who doesn't
know your leg, from her period.

That, of course, is not to say that "the judge" is in anyway more better.
Hate justly. They ("they") want to outlay hate: so that they have carte
blanch control over everyone's existence. Hate is not bad. Those that do bad
things want to escape "justice". Someone wrongs you, you hate them forever.
They are bad. There is no absolution in "government can do no bad", not
courtney **** defender or the system of violation commonly known as "the
justice system". If I was in charge, I would send them to prison and throw
away the key. Yep, that phoning insurance agent "determining" what happened
and whether they should pay. The examples abound. How about the
court-ordered "psychologist", that has a job, can you imagine, some kind of
crime of existence that is her?

I would throw away the key. Yes I would.

You, of course, are stupid but not knowing is a crime. But they escape
justice by being stupid in appropriate times. It is just a ruse of course:
you are stupid, or you are evil. I "hypothiesize" that very very few people
are stupid. That "theory" makes a lot of people evil.

Stop hating? Why don't you outlaw me hating , you evil ****? Hmm? You're
trying, but I just threw a wrench into your machine of evil. I guess you'll
have to fall back on your less obvious crimes against humanity: flags,
voting, democracy. Next thing you will suggest is that silver-spoon
politicians and their system of bullshit means anything less than crime
against humanity.

A child could do better than you, YOU are the problem, so don't be even try
to turn the tablets on me and start calling me bad. I'm am homeless!
Assholes.
 
Jens Gustedt

On 16.09.2012 19:29, Ben Kibbey wrote:
But then there's the portability question. I'm not sure what compilers
support __attribute__ for packing. I've only used GCC and Clang (which
support both).

Concerning portability the easiest is to wrap this in the C11 syntax
with something like

#define _Alignas(X) __attribute__((__aligned__(X)))

and similar things for the compilers that you encounter.

With that you'd be future-proof.
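
For reference, a small sketch of what the C11 spelling looks like on a compiler that supports it natively, using the alignas and alignof macros from <stdalign.h>. The buffer name and helper function are made up for illustration.

```c
#include <assert.h>
#include <stdalign.h>   /* alignas, alignof (C11) */
#include <stdint.h>

/* A hypothetical raw byte buffer forced onto a 4-byte boundary,
   for example to hold data that will be read with word accesses. */
static alignas(4) unsigned char storage[16];

/* Without alignas, a char array carries no alignment guarantee. */
static_assert(alignof(unsigned char) == 1,
              "char has no alignment requirement of its own");

/* Returns nonzero when the buffer really starts on a 4-byte boundary. */
static int storage_is_aligned(void)
{
    return (uintptr_t)storage % 4 == 0;
}
```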

Jens
 
