On alignment (final committee draft for C++0x and n1425 for C1X)

Gennaro Prota

NOTE:

This is multi-posted and cross-posted, and follow ups are set
to comp.lang.c++.

The cross-post is to comp.std.c++ and comp.lang.c++ and
follow-ups are set to comp.lang.c++.

Furthermore, a copy of this message was also posted to
comp.std.c ("multi-posting"), for information, asking to use
comp.lang.c++, instead, for any replies.

Here's why:

the message was originally intended for comp.std.c++ only;
then I noticed that the wording it refers to was basically
copied from a C1X draft, so I cross-posted it to the two
".std." groups. But the comp.std.c++ software auto-rejected
it, on the grounds that this is difficult to handle.

Furthermore, since these days comp.std.c++ has an unbelievably
high latency the only way I could think of to make the
discussion happen was to set the follow-ups to a low-latency
group. I apologize, it's probably the Usenet hack of the year,
and I'm not proud of it, but I really couldn't think how else
to manage it (if you have better ideas, feel free to tell).

In any case, beware that the message is geared towards C++,
including the terminology and the references to the standard.
----------------------------------------------------------------


I was reading the Alignment paragraph ([basic.align]) in the FCD
for C++0x and was really, really perplexed.

In particular I couldn't find an answer to this question:

a) is "alignment" a function of the type (over the set of
complete object types [less, perhaps, array types])? Or can two
instances of the same type have different alignments?

(Note that in the question above "complete" refers to types, not
objects: parse it as "complete types that are object types".
Non-complete objects, i.e. sub-objects, do enter the picture.
In particular I was looking for a guarantee that given e.g.

    void f() {
        T t ;
    }
    struct C {
        char c ;
        T t2 ;
    } ;

the object t and the subobject t2 in an instance of C would have
the same alignment.)
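
A minimal sketch of what such a guarantee would look like in code,
using alignof and the offsetof macro. The definition of T and the
name check_subobject are only illustrative, and the checks describe
what one implementation does, not what the quoted wording promises:

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

struct T { double d; int i; };   // illustrative stand-in for "T"

struct C {
    char c;
    T    t2;
};

// If alignment is a function of the type, the subobject t2 has to sit at an
// offset that is a multiple of alignof(T), and C has to be aligned at least
// as strictly as T.
static_assert(offsetof(C, t2) % alignof(T) == 0,
              "t2 starts at a multiple of alignof(T)");
static_assert(alignof(C) >= alignof(T),
              "C is aligned at least as strictly as T");

// Hypothetical helper: measures the byte distance between an instance of C
// and its t2 subobject and compares it with offsetof.
inline void check_subobject() {
    C obj = {};
    const char* base = reinterpret_cast<const char*>(&obj);
    const char* sub  = reinterpret_cast<const char*>(&obj.t2);
    assert(static_cast<std::size_t>(sub - base) == offsetof(C, t2));
}
```

If both static assertions hold and every instance of C starts at a
multiple of alignof(C), then t2 always lands on a multiple of
alignof(T), exactly like a standalone t.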

Here are some sentences that I found particularly perplexing:

--

Furthermore, the types char, signed char, and unsigned char
shall have the weakest alignment requirement.

That is? Just 1, no? I was thinking (before reading the
paragraph) that since sizeof( T ) must be a multiple of the
alignment on every object, and since by (a) (if it holds) the
alignment of the type is that of any object, it was guaranteed
that align( char ) == 1.

--

An aligment [sic] is an implementation-defined integer value
representing the number of bytes between successive addresses
at which a given object can be allocated.

Minimum positive number? (Among other things, if one doesn't
make it (existing and) unique I don't even see how one can use
the definite article "the".)

<note>
Note, too, that this definition (or pseudo such) doesn't imply
that the numerical address is a multiple of the alignment:
think e.g. of alignment = 4 and the invented addresses 7, 11,
15 (as opposed to 8, 12, 16).

One might think that talking of addresses as numbers
("multiples of") is problematic in the context of the standard
specification, but note that the above is basically talking
about the difference of two arbitrary pointers, which isn't
defined in general, either.
</note>


And is it a function of the type or not? alignof is applicable
to a type-id and its description says "An alignof expression
yields the alignment requirement of its operand *type*".

(But why "alignment requirement" rather than just "alignment"?)

Also, consider:

char c [[ align( 4 ) ]] ;
static_assert( alignof( c ) == 1, "" ) ; // intentional?

(I think this is OK: the attribute applies to the declaration,
thus to the particular object c, not the type. I'm asking just
because I seem to recall a gcc patch where the author assumed
that alignof worked like their __alignof__. But then, their
__alignof__ may also yield different values for a standalone
double than for a double in a struct, at least on some targets.
Again we are at the "is a function of the type" issue.)
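
For comparison, a sketch in the syntax of C++11 as finally
published, where the alignment attribute became the alignas
specifier and alignof accepts only a type-id. Using decltype( c ) in
place of alignof( c ) is a substitution made here, and the address
check assumes std::uintptr_t exists and that the conversion yields
the machine address:

```cpp
#include <cassert>
#include <cstdint>   // std::uintptr_t (optional type; assumed available)

alignas(4) char c;   // the object c is over-aligned...

// ...but its type is still plain char, whose alignment is 1.
static_assert(alignof(decltype(c)) == 1, "the type of c is still char");

// Hypothetical helper: checks the numerical address of this particular object.
inline void check_c() {
    assert(reinterpret_cast<std::uintptr_t>(&c) % 4 == 0);
}
```

So the stricter alignment belongs to the object c, while alignof
keeps reporting the alignment of the type.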


--

Alignments are represented as values of the type std::size_t

That is? I thought they *were* numbers. And, at this stage,
alignof hasn't been introduced yet, so what's the point of
bringing in std::size_t? Aren't we talking of integers in the
mathematical sense?

--

A fundamental alignment is represented by an alignment less
than or equal...

An alignment is represented by an alignment?

Guys, please, consider that we need definitions, here, not
novels. If you have to explain what a fundamental alignment *is*
just say "a fundamental alignment is"; or something like "an
alignment is said to be "fundamental" if and only if...". (Note
that there's a "representing the number of bytes" above, too.
Just a little more acceptable than this one.)

In case you are wondering: yes, these things make me angry. They
waste everyone's time and mental energies.

--

Alignments have an order from weaker to stronger or stricter
alignments. Stricter alignments have larger alignment values.
An address that satisfies an alignment requirement also
satisfies any weaker valid alignment requirement.

Again, vagueness. Couldn't you just have said e.g.:

given two alignments a1 and a2 (a1 > 0, a2 > 0):

- a1 is said to be weaker than a2 if and only if a1 is a
proper integer submultiple of a2

- a1 is said to be stronger, or stricter, than a2 if and
only if a2 is weaker than a1

About this matter, I also found the following example in
7.6.2/7:

[Example: An aligned buffer with an alignment requirement of A
and holding N elements of type T other than char, signed char,
or unsigned char can be declared as:

T buffer [[ align(T), align(A) ]] [N];

Specifying align(T) in the attribute-list ensures that the
final requested alignment will not be weaker than alignof(T),
and therefore the program will not be ill-formed. —end example
]

I thought that such a thing would require a minimum alignment
that was the lcm of align( T ) and A.

Hmm, I think I found the key: it's /assumed/ that any valid
alignment is a power of 2 with a non-negative integer exponent;
but where is such a requirement?
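
As a sanity check of that assumption on a given implementation (this
shows what one compiler does, not what the quoted wording requires;
is_power_of_two and check_fundamental_alignments are illustrative
names):

```cpp
#include <cassert>
#include <cstddef>

// True if and only if a == 2**k for some non-negative integer k.
constexpr bool is_power_of_two(std::size_t a) {
    return a != 0 && (a & (a - 1)) == 0;
}

// Checks the alignments of a handful of fundamental types.
inline void check_fundamental_alignments() {
    const std::size_t alignments[] = {
        alignof(char), alignof(short), alignof(int), alignof(long),
        alignof(long long), alignof(float), alignof(double),
        alignof(long double), alignof(void*)
    };
    for (std::size_t a : alignments) {
        assert(is_power_of_two(a));
    }
}
```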

--

Valid alignments include only those values returned by an
alignof expression for the fundamental types plus an
additional implementation-defined set of values which may be
empty.

What's the point of this if there's no requirement for the set
to be finite, or to contain PODs only, or to satisfy any
particular property? As I see it, this is just saying that it's
implementation-defined what alignments are valid, and that
alignof shall only yield valid alignments.


A PROPOSED, PROVISIONAL, NEW WORDING
------------------------------------

Here's some provisional wording which I think solves the
problems above. With this in place the paragraph about the
alignment attribute and the alignof operator would only need
minor tweaks.

NOTE: Just because of ASCII limitations, I use "!=" for "not
equal to" and "**" for "raised to".

For each implementation, there exists a mathematical function

align: S -> V

defined on the set S of all and only the complete types that are
object types but not array types. Its codomain V contains only
powers of two with a non-negative integer exponent.

For every t belonging to S, align(t) is the greatest a=2**k,
with k being a non-negative integer, such that

- all addresses at which instances of t can be placed are
exact multiples of a and

- it's possible for the implementation to place some instances
of t at an address which is *not* a multiple of 2a.
[footnote: Thus, for instance, an implementation which
places all instances of t at addresses that are multiples of
8 cannot "lie" and just consider the alignment of the type to
be 4 on the grounds that any multiple of 8 is also a multiple
of 4. --endfootnote]

[NOTE: although there doesn't necessarily exist a way for the
program to check whether an address is a multiple of a given
integer, this is intended to be unsurprising to those who know
the addressing structure of the underlying machine. And when an
integral type Int large enough exists, it is intended that
reinterpret_cast< Int >( address ) % n == 0 has the expected
truth value.]
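
A sketch of the check the note has in mind, taking std::uintptr_t as
the type Int. That type is optional, and the pointer-to-integer
mapping is implementation-defined, so this is an assumption about
the platform; is_multiple_of and check_some_objects are illustrative
names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>   // std::uintptr_t (optional type; assumed available)

// Hypothetical helper: true if the numerical value of address is a multiple
// of n.
inline bool is_multiple_of(const void* address, std::size_t n) {
    return reinterpret_cast<std::uintptr_t>(address) % n == 0;
}

inline void check_some_objects() {
    double      d = 0.0;
    long double arr[3] = {};
    assert(is_multiple_of(&d, alignof(double)));
    assert(is_multiple_of(&arr[1], alignof(long double)));
}
```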

Note that, due to the power-of-two requirement, the following
property trivially holds: given two values in V, a1 and a2, a1
is a submultiple of a2 if and only if a1 <= a2; or,
equivalently, if and only if log2(a1) <= log2(a2).

Also, the least common multiple of two alignments is just the
greater of the two.
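
Both properties are easy to check mechanically. A small sketch
(align_gcd, align_lcm and align_max are illustrative helpers, not
anything from the draft):

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t align_gcd(std::size_t a, std::size_t b) {
    return b == 0 ? a : align_gcd(b, a % b);
}
constexpr std::size_t align_lcm(std::size_t a, std::size_t b) {
    return a / align_gcd(a, b) * b;
}
constexpr std::size_t align_max(std::size_t a, std::size_t b) {
    return a < b ? b : a;
}

// With powers of two the lcm is simply the greater value...
static_assert(align_lcm(4, 16) == 16, "lcm of powers of two is the maximum");
// ...whereas without that restriction it can exceed both arguments.
static_assert(align_lcm(4, 6) == 12, "lcm of arbitrary values can be larger");

// Exhaustive check for all pairs 2**i, 2**j with 0 <= i, j <= 10.
inline void check_powers_of_two() {
    for (unsigned i = 0; i <= 10; ++i) {
        for (unsigned j = 0; j <= 10; ++j) {
            const std::size_t a1 = std::size_t(1) << i;
            const std::size_t a2 = std::size_t(1) << j;
            assert(align_lcm(a1, a2) == align_max(a1, a2));
            // a1 is a submultiple of a2 if and only if a1 <= a2
            assert((a2 % a1 == 0) == (a1 <= a2));
        }
    }
}
```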

By definition, an alignment a1 is said to be "stricter" (or
"stronger") than a2 if and only if a2 != a1 and a2 is a
submultiple of a1.

Likewise, by definition, a1 is said to be "weaker" than a2 <=>
a2 is stricter than a1.

Let t0 be a type in the domain of align and arr an array
thereof, with at least two elements: since two consecutive
elements of arr each have an address that is a multiple of
align(t0), the positive difference of those addresses (i.e. the
address of the later element minus that of the earlier one),
which is sizeof(t0), is a multiple of align(t0), too. That is:

- for any type t in S, align(t) is a submultiple of sizeof(t).

In particular, align( char ) is 1.
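
The array argument can be spelled out as follows. Sample is just an
illustrative type, and the assertions describe the implementation at
hand:

```cpp
#include <cassert>
#include <cstddef>

struct Sample { double d; char c; };   // illustrative type

static_assert(sizeof(Sample) % alignof(Sample) == 0,
              "align(t) is a submultiple of sizeof(t)");
static_assert(sizeof(double) % alignof(double) == 0,
              "the same holds for double");
static_assert(alignof(char) == 1, "sizeof(char) is 1, hence align(char) is 1");

// Distance in bytes between two consecutive array elements.
inline std::ptrdiff_t stride_in_bytes() {
    Sample arr[2] = {};
    return reinterpret_cast<const char*>(&arr[1]) -
           reinterpret_cast<const char*>(&arr[0]);
}

inline void check_stride() {
    assert(stride_in_bytes() == static_cast<std::ptrdiff_t>(sizeof(Sample)));
}
```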

--
Gennaro Prota | name.surname yahoo.com
Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
Do you need expertise in C++? I'm available.


[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:[email protected]]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
 
Gennaro Prota

Sigh. Ignore this message, please. It comes from an approval by
the comp.std.c++ moderators which should have never happened.
Sorry.
 
Francesco S. Carta

Gennaro Prota said:
NOTE:

This is multi-posted and cross-posted, and follow ups are set
to comp.lang.c++.
[snip]
Let t0 be a type in the domain of align and arr an array
thereof, with at least two elements: since two consecutive
elements of arr have each an address multiple of align(t0) then
the positive difference (i.e. the difference from the address of
the later one), which is sizeof(t0), is a multiple of align(t0),
too. That is:

- for any type in S, align(t) is a submultiple of sizeof(t).

In particular, align( char ) is 1.

So no one who uses a language with more than 256
characters can have all the characters in the character
set for that language?

Look at Chinese, Japanese, and various other languages
with large character sets.

Apart from the fact that the issue here has nothing to do with
languages and characters (and UTF-8 is perfectly happy with 8-bit chars
while being able to represent all characters of all the world's
languages), if you're building your objection on Gennaro's sentence
"[...] align( char ) is 1", then you need to understand better what a
char is.

Nowhere does the C++ Standard mandate that a char be an 8-bit type,
which would limit the number of different values it can hold to 256,
as you seem to be saying.

The char type is used here (and everywhere else the C++ Standard is
read correctly) as the unit for measuring the size of all other objects.

In particular, the standard mandates that "sizeof(char) == 1" must hold
true, regardless of whether a char stores 8, 16, 32 or even more bits.
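
A short sketch of how a program can see this for itself. CHAR_BIT
comes from <climits>; bits_in_unsigned_char and check_char are
illustrative names:

```cpp
#include <cassert>
#include <climits>   // CHAR_BIT

static_assert(sizeof(char) == 1, "char is the unit of sizeof");
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits, possibly more");

// Counts the value bits of unsigned char by shifting an all-ones value.
inline int bits_in_unsigned_char() {
    unsigned char x = static_cast<unsigned char>(-1);   // all bits set
    int n = 0;
    while (x != 0) {
        ++n;
        x = static_cast<unsigned char>(x >> 1);
    }
    return n;
}

inline void check_char() {
    assert(bits_in_unsigned_char() == CHAR_BIT);
}
```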
 
