On alignment (final committee draft for C++0x and n1425 for C1X)

Discussion in 'C++' started by Gennaro Prota, Aug 21, 2010.

  1. NOTE:

    This is multi-posted and cross-posted, and follow ups are set
    to comp.lang.c++.

    The cross-post is to comp.std.c++ and comp.lang.c++ and
    follow-ups are set to comp.lang.c++.

    Furthermore, a copy of this message was also posted to
    comp.std.c ("multi-posting"), for information, asking to use
    comp.lang.c++, instead, for any replies.

    Here's why:

    the message was originally intended for comp.std.c++ only;
    then I noticed that the wording it refers to was basically
    copied from a C1X draft, so I cross-posted it to the two
    ".std." groups. But the comp.std.c++ software auto-rejected
    it, on the grounds that this is difficult to handle.

    Furthermore, since these days comp.std.c++ has an unbelievably
    high latency, the only way I could think of to make the
    discussion happen was to set the follow-ups to a low-latency
    group. I apologize: it's probably the Usenet hack of the year,
    and I'm not proud of it, but I really couldn't think how else
    to manage it (if you have better ideas, feel free to tell me).

    In any case, beware that the message is geared towards C++,
    including the terminology and the references to the standard.
    ----------------------------------------------------------------


    I was reading the Alignment paragraph ([basic.align]) in the FCD
    for C++0x and was really, really perplexed.

    In particular I couldn't find an answer to this question:

    a) is "alignment" a function of the type (over the set of
    complete object types [less, perhaps, array types])? Or can two
    instances of the same type have different alignments?

    (Note that in the question above "complete" refers to types, not
    objects: parse it as "complete types that are object types".
    Non-complete objects, i.e. sub-objects, do enter the picture.
    In particular I was looking for a guarantee that given e.g.

    void f() {
        T t ;
    }
    struct C {
        char c ;
        T t2 ;
    } ;

    the object t and the subobject t2 in an instance of C would have
    the same alignment.)
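
    A minimal sketch of the guarantee being asked about, written in
    the C++11 syntax as eventually published. The struct T below is a
    hypothetical stand-in for the T of the post, and the address check
    assumes that std::uintptr_t exists and maps addresses to integers
    in the usual flat way, which the standard does not promise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the T used in the post.
struct T { double d; int i; };

struct C {
    char c;
    T t2;
};

// True if p holds an address that is a multiple of alignof(U).
// Assumption: std::uintptr_t exists and the pointer-to-integer mapping is flat.
template <typename U>
bool is_aligned_for(const U* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(U) == 0;
}

// alignof takes a type, so one value governs complete objects and subobjects alike.
static_assert(offsetof(C, t2) % alignof(T) == 0, "member offset honours alignof(T)");
static_assert(alignof(C) >= alignof(T), "C is at least as strictly aligned as its member");

bool complete_object_and_subobject_agree() {
    T t{};   // complete object
    C c{};   // c.t2 is a subobject of the same type
    return is_aligned_for(&t) && is_aligned_for(&c.t2);
}
```

    Calling complete_object_and_subobject_agree() is expected to
    return true on mainstream implementations; it illustrates the
    question, it does not settle what the FCD wording guarantees.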

    Here are some sentences that I found particularly perplexing:

    --

    Furthermore, the types char, signed char, and unsigned char
    shall have the weakest alignment requirement.

    That is? Just 1, no? I was thinking (before reading the
    paragraph) that since sizeof( T ) must be a multiple of the
    alignment of every object of type T, and since by (a) (if it
    holds) the alignment of the type is that of any of its objects,
    it was guaranteed that align( char ) == 1.

    --

    An aligment [sic] is an implementation-defined integer value
    representing the number of bytes between successive addresses
    at which a given object can be allocated.

    Minimum positive number? (Among other things, if one doesn't
    make it (existing and) unique I don't even see how one can use
    the definite article "the".)

    <note>
    Note, too, that this definition (or pseudo such) doesn't imply
    that the numerical address is a multiple of the alignment:
    think e.g. of alignment = 4 and the invented addresses 7, 11,
    15 (as opposed to 8, 12, 16).

    One might think that talking of addresses as numbers
    ("multiples of") is problematic in the context of the standard
    specification, but note that the above is basically talking
    about the difference of two arbitrary pointers, which isn't
    defined in general, either.
    </note>


    And is it a function of the type or not? alignof is applicable
    to a type-id and its description says "An alignof expression
    yields the alignment requirement of its operand *type*".

    (But why "alignment requirement" rather than just "alignment"?)

    Also, consider:

    char c [[ align( 4 ) ]] ;
    static_assert( alignof( c ) == 1, "" ) ; // intentional?

    (I think this is OK: the attribute applies to the declaration,
    thus to the particular object c, not the type. I'm asking just
    because I seem to recall a gcc patch where the author assumed
    that alignof worked like their __alignof__. But then, their
    __alignof__ may also yield different values for a standalone
    double than for a double in a struct, at least on some targets.
    Again we are at the "is a function of the type" issue.)
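
    For comparison, here is a sketch of the same situation in the
    syntax that was eventually published in C++11, where the
    attribute became the alignas specifier and alignof accepts only a
    type-id. The address check is an assumption about the platform
    (flat addresses, std::uintptr_t available), not something the
    standard guarantees.

```cpp
#include <cassert>
#include <cstdint>

alignas(4) char c = 'x';   // the request applies to this object only

// The type of c is still plain char, whose alignment is 1.
static_assert(alignof(decltype(c)) == 1, "alignof looks at the type, not the object");
static_assert(alignof(char) == 1, "char has the weakest alignment");

// Assumption: converting the address to std::uintptr_t yields its numeric value.
bool object_is_over_aligned() {
    return reinterpret_cast<std::uintptr_t>(&c) % 4 == 0;
}
```

    So the object can be placed more strictly than its type requires
    while alignof, applied to the type, still yields 1.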


    --

    Alignments are represented as values of the type std::size_t

    That is? I thought they *were* numbers. And, at this stage,
    alignof hasn't been introduced yet, so what's the point of
    bringing in std::size_t? Aren't we talking of integers in the
    mathematical sense?

    --

    A fundamental alignment is represented by an alignment less
    than or equal...

    An alignment is represented by an alignment?

    Guys, please, consider that we need definitions, here, not
    novels. If you have to explain what a fundamental alignment *is*
    just say "a fundamental alignment is"; or something like "an
    alignment is said to be "fundamental" if and only if...". (Note
    that there's a "representing the number of bytes" above, too.
    Just a little more acceptable than this one.)

    In case you are wondering: yes, these things make me angry. They
    waste everyone's time and mental energies.

    --

    Alignments have an order from weaker to stronger or stricter
    alignments. Stricter alignments have larger alignment values.
    An address that satisfies an alignment requirement also
    satisfies any weaker valid alignment requirement.

    Again, vagueness. Couldn't you just have said e.g.:

    given two alignments a1 and a2 (a1 > 0, a2 > 0):

    - a1 is said to be weaker than a2 if and only if a1 is a
    proper integer submultiple of a2

    - a1 is said to be stronger, or stricter, than a2 if and
    only if a2 is weaker than a1

    About this matter, I also found the following example in
    7.6.2/7:

    [Example: An aligned buffer with an alignment requirement of A
    and holding N elements of type T other than char, signed char,
    or unsigned char can be declared as:

    T buffer [[ align(T), align(A) ]] [N];

    Specifying align(T) in the attribute-list ensures that the
    final requested alignment will not be weaker than alignof(T),
    and therefore the program will not be ill-formed. —end example
    ]
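
    A sketch of the same buffer in published C++11 syntax, where each
    align(...) becomes an alignas(...) specifier and the strictest of
    the requested alignments wins. T, A and N below are hypothetical
    values chosen for illustration; whether 16 is a supported
    alignment is implementation-defined, and the numeric address check
    assumes a flat std::uintptr_t mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using T = double;               // hypothetical element type
constexpr std::size_t A = 16;   // hypothetical requested alignment
constexpr std::size_t N = 8;    // hypothetical element count

// With several alignas specifiers, the strictest one takes effect.
alignas(T) alignas(A) T buffer[N];

// Assumption: the integer value of the pointer is the machine address.
bool buffer_meets_both_requirements() {
    const auto addr = reinterpret_cast<std::uintptr_t>(&buffer[0]);
    return addr % alignof(T) == 0 && addr % A == 0;
}
```

    If A were weaker than alignof(T), the alignas(T) specifier would
    keep the declaration well-formed, which is the point of the
    standard's example.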

    I thought that such a thing would require a minimum alignment
    that was the lcm of align( T ) and A.

    Hmm, I think I found the key: it's /assumed/ that any valid
    alignment is a power of 2 with a non-negative integer exponent;
    but where is such a requirement?

    --

    Valid alignments include only those values returned by an
    alignof expression for the fundamental types plus an
    additional implementation-defined set of values which may be
    empty.

    What's the point of this if there's no requirement for the set
    to be finite, or to contain PODs only, or to satisfy any
    particular property? As I see it, this is just saying that it's
    implementation-defined what alignments are valid, and that
    alignof shall only yield valid alignments.


    A PROPOSED, PROVISIONAL, NEW WORDING
    ------------------------------------

    Here's some provisional wording which I think solves the
    problems above. With this in place the paragraph about the
    alignment attribute and the alignof operator would only need
    minor tweaks.

    NOTE: Just because of ASCII limitations, I use "!=" for "not
    equal to" and "**" for "raised to".

    For each implementation, there exists a mathematical function

    align: S -> V

    defined on the set S of all and only the complete types that are
    object types but not array types. Its codomain V contains only
    powers of two with a non-negative integer exponent.

    For every t belonging to S, align(t) is the greatest a=2**k,
    with k being a non-negative integer, such that

    - all addresses at which instances of t can be placed are
    exact multiples of a and

    - it's possible for the implementation to place some instances
    of t at an address which is *not* a multiple of 2a.
    [footnote: Thus, for instance, an implementation which
    places all instances of t at addresses that are multiples of
    8 cannot "lie" and just consider the alignment of the type
    to be four on the grounds that any multiple of 8 is also a
    multiple of 4. --endfootnote]

    [NOTE: although there doesn't necessarily exist a way for the
    program to check whether an address is a multiple of a given
    integer, this is intended to be unsurprising to those who know
    the addressing structure of the underlying machine. And when an
    integral type Int large enough exists, it is intended that
    reinterpret_cast< Int >( address ) % n == 0 has the expected
    truth value.]
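
    The check mentioned in the note can be sketched as follows, taking
    std::uintptr_t as the "large enough" integral type Int. That type
    is optional in the standard, and the meaning of the converted
    value is implementation-defined, so this is only a sketch for
    machines with flat addressing. The names address_is_multiple_of
    and storage are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: std::uintptr_t exists and the conversion yields the machine address.
bool address_is_multiple_of(const void* p, std::size_t n) {
    return reinterpret_cast<std::uintptr_t>(p) % n == 0;
}

// Hypothetical storage whose start is requested to be a multiple of 8.
alignas(8) unsigned char storage[16];
```

    With that in place, &storage[0] is a multiple of 8, while
    &storage[4] is a multiple of 4 but not of 8.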

    Note that, due to the power-of-two requirement, the following
    property trivially holds: given two values in V, a1 and a2, a1
    is a submultiple of a2 if and only if a1 <= a2; or,
    equivalently, if and only if log2(a1) <= log2(a2).

    Also, the least common multiple of two alignments is just the
    greatest of them.

    By definition, an alignment a1 is said to be "stricter" (or
    "stronger") than a2 if and only if a2 != a1 and a2 is a
    submultiple of a1.

    Likewise, by definition, a1 is said to be "weaker" than a2 <=>
    a2 is stricter than a1.
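
    The definitions above translate almost literally into code. The
    sketch below uses made-up helper names (is_power_of_two,
    is_submultiple, is_stricter, is_weaker, gcd_of, lcm_of) and checks
    at compile time that, for powers of two, "submultiple of"
    coincides with "<=" and the least common multiple is just the
    greater value. The last assertion shows that this stops being
    true without the power-of-two requirement.

```cpp
#include <cassert>
#include <cstddef>

constexpr bool is_power_of_two(std::size_t a) {
    return a != 0 && (a & (a - 1)) == 0;
}

// a1 is a submultiple of a2 when a1 divides a2 exactly.
constexpr bool is_submultiple(std::size_t a1, std::size_t a2) {
    return a2 % a1 == 0;
}

// a1 is stricter than a2 iff they differ and a2 is a submultiple of a1.
constexpr bool is_stricter(std::size_t a1, std::size_t a2) {
    return a1 != a2 && is_submultiple(a2, a1);
}

// a1 is weaker than a2 iff a2 is stricter than a1.
constexpr bool is_weaker(std::size_t a1, std::size_t a2) {
    return is_stricter(a2, a1);
}

constexpr std::size_t gcd_of(std::size_t a, std::size_t b) {
    return b == 0 ? a : gcd_of(b, a % b);
}

constexpr std::size_t lcm_of(std::size_t a, std::size_t b) {
    return a / gcd_of(a, b) * b;
}

static_assert(is_power_of_two(4) && is_power_of_two(16), "both are valid alignments");
static_assert(is_submultiple(4, 16) == (4 <= 16), "for powers of two, divides means <=");
static_assert(is_submultiple(16, 4) == (16 <= 4), "and the converse direction agrees");
static_assert(lcm_of(4, 16) == 16, "lcm of two powers of two is the greater one");
static_assert(lcm_of(4, 6) == 12, "without the power-of-two rule the lcm exceeds both");
```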

    Let t0 be a type in the domain of align and arr an array
    thereof, with at least two elements: since two consecutive
    elements of arr each have an address that is a multiple of
    align(t0), the positive difference between those addresses
    (i.e. the address of the later element minus that of the
    earlier one), which is sizeof(t0), is a multiple of align(t0),
    too. That is:

    - for any type t in S, align(t) is a submultiple of sizeof(t).

    In particular, align( char ) is 1.
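
    The array argument can be checked with the published C++11
    alignof. In this sketch the struct Mixed and the helper names are
    hypothetical; the static assertions restate the conclusion and the
    function measures the distance between two consecutive array
    elements in bytes.

```cpp
#include <cassert>
#include <cstddef>

// align(t) is a submultiple of sizeof(t), expressed with alignof.
template <typename U>
constexpr bool align_divides_size() {
    return sizeof(U) % alignof(U) == 0;
}

// Hypothetical type with padding, to make the check less trivial.
struct Mixed { char c; double d; int i; };

static_assert(align_divides_size<char>(), "holds for char");
static_assert(align_divides_size<double>(), "holds for double");
static_assert(align_divides_size<Mixed>(), "holds for a padded struct");
static_assert(sizeof(char) == 1 && alignof(char) == 1, "so align(char) can only be 1");

// Consecutive array elements are exactly sizeof(Mixed) bytes apart.
bool consecutive_elements_are_sizeof_apart() {
    Mixed arr[2] = {};
    const unsigned char* first = reinterpret_cast<const unsigned char*>(&arr[0]);
    const unsigned char* second = reinterpret_cast<const unsigned char*>(&arr[1]);
    return static_cast<std::size_t>(second - first) == sizeof(Mixed);
}
```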

    --
    Gennaro Prota | name.surname yahoo.com
    Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
    Do you need expertise in C++? I'm available.


    Gennaro Prota, Aug 21, 2010
    #1

  2. Sigh. Ignore this message, please. It comes from an approval by
    the comp.std.c++ moderators which should have never happened.
    Sorry.

    --
    Gennaro Prota | I'm available for your projects.
    Breeze (preview): <https://sourceforge.net/projects/breeze/>
    Gennaro Prota, Aug 21, 2010
    #2

  3. Robert Miles, on 25/08/2010 18:21:13, wrote:

    > "Gennaro Prota" wrote in message
    > news:i4khmi$1pk$...
    >> NOTE:
    >>
    >> This is multi-posted and cross-posted, and follow ups are set
    >> to comp.lang.c++.
    >>

    > [snip]
    >> Let t0 be a type in the domain of align and arr an array
    >> thereof, with at least two elements: since two consecutive
    >> elements of arr have each an address multiple of align(t0) then
    >> the positive difference (i.e. the difference from the address of
    >> the later one), which is sizeof(t0), is a multiple of align(t0),
    >> too. That is:
    >>
    >> - for any type in S, align(t) is a submultiple of sizeof(t).
    >>
    >> In particular, align( char ) is 1.
    >>
    >> --
    >> Gennaro Prota | name.surname yahoo.com

    >
    > So no one who uses a language with more than 256
    > characters can have all the characters in the character
    > set for that language?
    >
    > Look at Chinese, Japanese, and various other languages
    > with large character sets.


    Apart from the fact that the issue here has nothing to do with
    languages and characters (and that UTF-8 is perfectly happy with
    8-bit chars and can encode every Unicode character), if you're
    building your objection on Gennaro's sentence "[...] align( char )
    is 1", then you need to understand better what a char is.

    Nowhere does the C++ Standard mandate that a char be an 8-bit
    type, which would limit it to 256 distinct values, as you seem
    to be saying.

    The char is used here (and everywhere else the C++ Standard is
    read correctly) as the unit in which the sizes of all other
    objects are measured.

    In particular, the standard mandates that "sizeof(char) == 1" must hold
    true, regardless of whether a char stores 8, 16, 32 or even more bits.
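
    A small sketch of that point: the width of a char is reported by
    CHAR_BIT from <climits>, which is at least 8 but may be larger,
    while sizeof(char) stays 1 either way. The helper bits_in is a
    made-up name for the illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

static_assert(sizeof(char) == 1, "char is the unit of sizeof by definition");
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits, possibly more");

// Number of bits in the object representation of U on this implementation.
template <typename U>
constexpr std::size_t bits_in() {
    return sizeof(U) * CHAR_BIT;
}

static_assert(bits_in<char>() == CHAR_BIT, "one char is exactly CHAR_BIT bits");
```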

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Aug 26, 2010
    #3