On alignment (final committee draft for C++0x and n1425 for C1X)

Discussion in 'C++' started by Gennaro Prota, Aug 20, 2010.

  1. NOTE:

    This is multi-posted. However this newsgroup (comp.lang.c++)
    is where the discussion is meant to happen.

    The message has been posted to comp.std.c++, comp.std.c and
    comp.lang.c++, with suitable notices (a different notice for
    each group).

    Here's why:

    the message was originally intended for comp.std.c++ only;
    then I noticed that the wording it refers to was basically
    copied from a C1X draft, so I cross-posted it to the two
    ".std." groups. But the comp.std.c++ software auto-rejected
    it, on the grounds that this is difficult to handle.

    Furthermore, since these days comp.std.c++ has an unbelievably
    high latency, the only way I could think of to make the
    discussion possible was to set the follow-ups to a low-latency
    group. I apologize; it's probably the Usenet hack of the year,
    and I'm not proud of it, but I really couldn't think how else
    to manage it (if you have better ideas, feel free to tell me).

    In any case, beware that the message is geared towards C++,
    including the terminology and the references to the standard.
    ----------------------------------------------------------------


    I was reading the Alignment paragraph ([basic.align]) in the FCD
    for C++0x and was really, really perplexed.

    In particular I couldn't find an answer to this question:

    a) is "alignment" a function of the type (over the set of
    complete object types [less, perhaps, array types])? Or can two
    instances of the same type have different alignments?

    (Note that in the question above "complete" refers to types, not
    objects (parse it as "complete types that are object types").
    Non-complete objects, i.e. sub-objects, do enter the picture.
    In particular I was looking for a guarantee that given e.g.

    void f() {
        T t ;
    }
    struct C {
        char c ;
        T t2 ;
    } ;

    the object t and the subobject t2 in an instance of C would have
    the same alignment.)
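
    A minimal sketch of the guarantee being asked for, written with
    the alignof operator as spelled in the published C++11 standard.
    It assumes T is double, that std::uintptr_t exists, and that
    converting a pointer to it yields a plain numeric address (this
    mapping is implementation-defined, though it behaves this way on
    mainstream platforms). The helper names are made up for
    illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef double T;   // assumption: stands in for any complete, non-array object type

struct C {
    char c;
    T t2;
};

// Hypothetical helper: true if the numeric value of p is a multiple of a.
// Relies on the implementation-defined pointer-to-integer mapping.
inline bool address_is_multiple_of(const void* p, std::size_t a) {
    return reinterpret_cast<std::uintptr_t>(p) % a == 0;
}

// Checks that a standalone T and a T subobject both honour alignof(T).
inline bool object_and_subobject_share_alignment() {
    T t = 0;
    C obj = { 'x', 0 };
    return address_is_multiple_of(&t, alignof(T))
        && address_is_multiple_of(&obj.t2, alignof(T))
        && offsetof(C, t2) % alignof(T) == 0;
}
```

    Calling object_and_subobject_share_alignment() from main is
    expected to return true wherever alignment really is a function
    of the type; it is an observation on one implementation, not a
    proof of what the wording guarantees.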

    Here are some sentences that I found particularly perplexing:

    --

    Furthermore, the types char, signed char, and unsigned char
    shall have the weakest alignment requirement.

    That is? Just 1, no? I was thinking (before reading the
    paragraph) that since sizeof( T ) must be a multiple of the
    alignment of every object of type T, and since by (a) (if it
    holds) the alignment of the type is that of any object, it was
    guaranteed that align( char ) == 1.

    --

    An aligment [sic] is an implementation-defined integer value
    representing the number of bytes between successive addresses
    at which a given object can be allocated.

    Minimum positive number? (Among other things, if one doesn't
    make it (existing and) unique I don't even see how one can use
    the definite article "the".)

    <note>
    Note, too, that this definition (or pseudo such) doesn't imply
    that the numerical address is a multiple of the alignment:
    think e.g. of alignment = 4 and the invented addresses 7, 11,
    15 (as opposed to 8, 12, 16).

    One might think that talking of addresses as numbers
    ("multiples of") is problematic in the context of the standard
    specification, but note that the above is basically talking
    about the difference of two arbitrary pointers, which isn't
    defined in general, either.
    </note>


    And is it a function of the type or not? alignof is applicable
    to a type-id and its description says "An alignof expression
    yields the alignment requirement of its operand *type*".

    (But why "alignment requirement" rather than just "alignment"?)

    Also, consider:

    char c [[ align( 4 ) ]] ;
    static_assert( alignof( c ) == 1, "" ) ; // intentional?

    (I think this is OK: the attribute applies to the declaration,
    thus to the particular object c, not the type. I'm asking just
    because I seem to recall a gcc patch where the author assumed
    that alignof worked like their __alignof__. But then, their
    __alignof__ may also yield different values for a standalone
    double than for a double in a struct, at least on some targets.
    Again we are at the "is a function of the type" issue.)
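
    For comparison, here is a sketch of the same situation in the
    syntax of the published C++11 standard, where the attribute
    became the alignas specifier. Standard alignof accepts only a
    type-id, so the sketch goes through decltype; applying alignof
    directly to an expression is a compiler extension where it is
    accepted at all. The helper name is hypothetical, and the
    address check assumes the usual pointer-to-integer mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Object-level request: applies to the object c only, not to the type char.
alignas(4) char c = 0;

// decltype(c) is plain char, so the alignment of the type is unaffected.
static_assert(alignof(decltype(c)) == 1,
              "alignas on c does not change alignof(char)");

// Hypothetical helper: checks the address the implementation actually chose.
inline bool c_is_4_byte_aligned() {
    return reinterpret_cast<std::uintptr_t>(&c) % 4 == 0;
}
```

    So the type-level answer stays 1 while the particular object
    sits on a 4-byte boundary, which is the reading suggested above.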


    --

    Alignments are represented as values of the type std::size_t

    That is? I thought they *were* numbers. And, at this stage,
    alignof hasn't been introduced yet, so what's the point of
    bringing in std::size_t? Aren't we talking of integers in the
    mathematical sense?

    --

    A fundamental alignment is represented by an alignment less
    than or equal...

    An alignment is represented by an alignment?

    Guys, please, consider that we need definitions, here, not
    novels. If you have to explain what a fundamental alignment *is*
    just say "a fundamental alignment is"; or something like "an
    alignment is said to be "fundamental" if and only if...". (Note
    that there's a "representing the number of bytes" above, too.
    Just a little more acceptable than this one.)

    In case you are wondering: yes, these things make me angry. They
    waste everyone's time and mental energies.

    --

    Alignments have an order from weaker to stronger or stricter
    alignments. Stricter alignments have larger alignment values.
    An address that satisfies an alignment requirement also
    satisfies any weaker valid alignment requirement.

    Again, vagueness. Couldn't you just have said e.g.:

    given two alignments a1 and a2 (a1 > 0, a2 > 0):

    - a1 is said to be weaker than a2 if and only if a1 is a
    proper integer submultiple of a2

    - a1 is said to be stronger, or stricter, than a2 if and
    only if a2 is weaker than a1

    About this matter, I also found the following example in
    7.6.2/7:

    [Example: An aligned buffer with an alignment requirement of A
    and holding N elements of type T other than char, signed char,
    or unsigned char can be declared as:

    T buffer [[ align(T), align(A) ]] [N];

    Specifying align(T) in the attribute-list ensures that the
    final requested alignment will not be weaker than alignof(T),
    and therefore the program will not be ill-formed. —end example
    ]

    I thought that such a thing would require a minimum alignment
    that was the lcm of align( T ) and A.

    Hmm, I think I found the key: it's /assumed/ that any valid
    alignment is a power of 2 with a non-negative integer exponent;
    but where is such a requirement?
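
    As far as I can tell, the C++11 standard as eventually published
    does state in [basic.align] that every alignment value is a
    non-negative integral power of two. A small sketch of how one
    might check the assumption on a given implementation (the
    function name is hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// True if a is 2**k for some non-negative integer k.
constexpr bool is_power_of_two(std::size_t a) {
    return a != 0 && (a & (a - 1)) == 0;
}

// Compile-time checks for a handful of fundamental types.
static_assert(is_power_of_two(alignof(char)), "char");
static_assert(is_power_of_two(alignof(int)), "int");
static_assert(is_power_of_two(alignof(double)), "double");
static_assert(is_power_of_two(alignof(long double)), "long double");
static_assert(is_power_of_two(alignof(void*)), "void*");
```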

    --

    Valid alignments include only those values returned by an
    alignof expression for the fundamental types plus an
    additional implementation-defined set of values which may be
    empty.

    What's the point of this if there's no requirement for the set
    to be finite, or to contain PODs only, or to satisfy any
    particular property? As I see it, this is just saying that it's
    implementation-defined what alignments are valid, and that
    alignof shall only yield valid alignments.


    A PROPOSED, PROVISIONAL, NEW WORDING
    ------------------------------------

    Here's some provisional wording which I think solves the
    problems above. With this in place the paragraph about the
    alignment attribute and the alignof operator would only need
    minor tweaks.

    NOTE: Just because of ASCII limitations, I use "!=" for "not
    equal to" and "**" for "raised to".

    For each implementation, there exists a mathematical function

    align: S -> V

    defined on the set S of all and only the complete types that are
    object types but not array types. Its codomain V contains only
    powers of two with a non-negative integer exponent.

    For every t belonging to S, align(t) is the greatest a=2**k,
    with k being a non-negative integer, such that

    - all addresses at which instances of t can be placed are
    exact multiples of a and

    - it's possible for the implementation to place some instances
    of t at an address which is *not* a multiple of 2a.
    [footnote: Thus, for instance, an implementation which
    places all instances of t at addresses that are multiples of
    8 cannot "lie" and just consider the alignment of the type
    to be four on the grounds that any multiple of 8 is also a
    multiple of 4. --endfootnote]

    [NOTE: although there doesn't necessarily exist a way for the
    program to check whether an address is a multiple of a given
    integer, this is intended to be unsurprising to those who know
    the addressing structure of the underlying machine. And when an
    integral type Int large enough exists, it is intended that
    reinterpret_cast< Int >( address ) % n == 0 has the expected
    truth value.]
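
    The intent of the note can be sketched as follows, assuming
    std::uintptr_t plays the role of Int and that the conversion
    yields the machine address. The function names are hypothetical.
    The first helper computes the greatest power of two dividing an
    address, which is the quantity the definition of align(t)
    quantifies over.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Greatest power of two dividing the numeric address of p (p must be non-null).
inline std::uintptr_t largest_power_of_two_dividing(const void* p) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    return addr & (~addr + 1);   // isolates the lowest set bit
}

// True if object sits at an address that is a multiple of alignof(Type).
// Valid because both quantities compared are powers of two.
template <typename Type>
inline bool placed_at_multiple_of_alignment(const Type& object) {
    return largest_power_of_two_dividing(&object) >= alignof(Type);
}
```

    A single object can of course land on a stricter boundary than
    its type requires; only the minimum over all possible placements
    would recover align(t) itself.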

    Note that, due to the power-of-two requirement, the following
    property trivially holds: given two values in V, a1 and a2, a1
    is a submultiple of a2 if and only if a1 <= a2; or,
    equivalently, if and only if log2(a1) <= log2(a2).

    Also, the least common multiple of two alignments is just the
    greatest of them.

    By definition, an alignment a1 is said to be "stricter" (or
    "stronger") than a2 if and only if a2 != a1 and a2 is a
    submultiple of a1.

    Likewise, by definition, a1 is said to be "weaker" than a2 <=>
    a2 is stricter than a1.
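
    The two definitions translate directly into code. A sketch, with
    hypothetical function names and assuming both arguments are
    valid (positive) alignments:

```cpp
#include <cassert>
#include <cstddef>

// a1 is stricter than a2 iff a2 != a1 and a2 is a submultiple (divisor) of a1.
constexpr bool is_stricter(std::size_t a1, std::size_t a2) {
    return a1 != a2 && a2 != 0 && a1 % a2 == 0;
}

// a1 is weaker than a2 iff a2 is stricter than a1.
constexpr bool is_weaker(std::size_t a1, std::size_t a2) {
    return is_stricter(a2, a1);
}

static_assert(is_stricter(8, 4), "8 is stricter than 4");
static_assert(is_weaker(1, 16), "1 is weaker than 16");
static_assert(!is_stricter(4, 4), "an alignment is not stricter than itself");
// For powers of two the divisibility test agrees with plain comparison.
static_assert(is_stricter(16, 2) == (16 > 2), "ordering matches >");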

    Let t0 be a type in the domain of align and arr an array
    thereof with at least two elements. Since any two consecutive
    elements of arr each have an address that is a multiple of
    align(t0), the difference between those addresses (later minus
    earlier), which is sizeof(t0), is a multiple of align(t0), too.
    That is:

    - for any type t in S, align(t) is a submultiple of sizeof(t).

    In particular, align( char ) is 1.
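
    A sketch of the derived property, using alignof from the
    published C++11 standard in place of the align function. The
    names alignment_divides_size, Mixed and
    element_distance_in_bytes are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// True if alignof(Type) is a submultiple of sizeof(Type).
template <typename Type>
constexpr bool alignment_divides_size() {
    return sizeof(Type) % alignof(Type) == 0;
}

struct Mixed { char c; double d; int i; };

static_assert(alignment_divides_size<char>(), "char");
static_assert(alignment_divides_size<double>(), "double");
static_assert(alignment_divides_size<Mixed>(), "Mixed");
// sizeof(char) is 1 by definition, so its only possible submultiple is 1.
static_assert(alignof(char) == 1, "align(char) is 1");

// Consecutive array elements are exactly sizeof(double) bytes apart.
inline std::ptrdiff_t element_distance_in_bytes() {
    double arr[2] = { 0.0, 0.0 };
    return reinterpret_cast<const char*>(&arr[1])
         - reinterpret_cast<const char*>(&arr[0]);
}
```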

    --
    Gennaro Prota | name.surname yahoo.com
    Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
    Do you need expertise in C++? I'm available.
     
    Gennaro Prota, Aug 20, 2010
    #1

  2. On 20/08/2010 9.29, Paavo Helde wrote:
    > Gennaro Prota <> wrote in news:i4kjk8$3p4$1
    > @speranza.aioe.org:
    > [...]
    >> Here's some provisional wording which I think solves the
    >> problems above.

    > [...]
    >> For every t belonging to S, align(t) is the greatest a=2**k,
    >> with k being a non-negative integer, such that
    >>
    >> - all addresses at which instances of t can be placed are
    >> exact multiples of a and

    >
    > But on a very common hardware platform (Intel) one can use misaligned data,
    > and sometimes this comes quite handy, e.g. when processing PKZIP file
    > headers. I am not sure, but I think all the jumble-mumble in the draft
    > might be a disguised attempt to make this legal. Or not?


    Well, I don't think that this wording makes a difference in this
    respect. You can still apply an alignment attribute to an object
    declaration (giving to that object an alignment which is
    different from the "alignment of the type"), or play with
    reinterpret_cast<>.

    Did you have something specific in mind which makes the two
    definition approaches ("number of bytes between" vs. "address
    multiple of") different?

    --
    Gennaro Prota | I'm available for your projects.
    Breeze (preview): <https://sourceforge.net/projects/breeze/>
     
    Gennaro Prota, Aug 21, 2010
    #2

  3. Larry Evans (Guest)

    On 08/19/10 19:52, Gennaro Prota wrote:
    [snip]
    >
    > Note that, due to the power-of-two requirement, the following
    > property trivially holds: given two values in V, a1 and a2, a1
    > is a submultiple of a2 if and only if a1<= a2; or,
    > equivalently, if and only if log2(a1)<= log2(a2).
    >
    > Also, the least common multiple of two alignments is just the
    > greatest of them.

    Would not the "extended alignments" mentioned in paragraph 3
    on page 3 of:
    http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2165.pdf
    require using lcm, as shown here:

    http://svn.boost.org/svn/boost/sand...boost/composite_storage/alignment/compose.hpp
    ?
    [snip]

    Larry
     
    Larry Evans, Aug 21, 2010
    #3
  4. On 21/08/2010 10.30, Paavo Helde wrote:
    > Gennaro Prota <> wrote in
    > news:i4n45v$2qn$:
    >
    >> On 20/08/2010 9.29, Paavo Helde wrote:
    >>> Gennaro Prota <> wrote in news:i4kjk8$3p4$1
    >>> @speranza.aioe.org:
    >>> [...]
    >>>> Here's some provisional wording which I think solves the
    >>>> problems above.
    >>> [...]
    >>>> For every t belonging to S, align(t) is the greatest a=2**k,
    >>>> with k being a non-negative integer, such that
    >>>>
    >>>> - all addresses at which instances of t can be placed are
    >>>> exact multiples of a and
    >>>
    >>> But on a very common hardware platform (Intel) one can use misaligned
    >>> data, and sometimes this comes quite handy, e.g. when processing
    >>> PKZIP file headers. I am not sure, but I think all the jumble-mumble
    >>> in the draft might be a disguised attempt to make this legal. Or
    >>> not?

    >>
    >> Well, I don't think that this wording makes a difference in this
    >> respect. You can still apply an alignment attribute to an object
    >> declaration (giving to that object an alignment which is
    >> different from the "alignment of the type"), or play with
    >> reinterpret_cast<>.
    >>
    >> Did you have something specific in mind which makes the two
    >> definition approaches ("number of bytes between" vs. "address
    >> multiple of") different?

    >
    > No, not really. I just thought you are attempting to define the alignment
    > in terms of hardware ("can be placed ..."), but this loses meaning on the
    > hardware where any data *can* be placed at any address (like Intel x86).


    Ah, I see where you're coming from. It was meant as "can be
    placed by the C++ implementation". The current wording uses this
    same expression, with --AFAICS-- the same "by the C++
    implementation" implication. It's up to the implementation to
    define the align function and it may well make it the constant
    function whose only value is 1.

    Anyway:

    something like "all objects to which no /alignment attribute
    that says otherwise/ applies *will have* an address which is a
    multiple..." probably works better.

    Example:

    // if align( double ) is 4

    void f()
    {
        double d ; // address will be a multiple of 4
    }

    struct [[ align( 2 ) ]] Pod
    {
        double m ;
    } ;

    void g()
    {
        Pod p ; // p.m has the same address as p, thus not
                // necessarily a multiple of 4 (the attribute on the
                // declaration of struct Pod indirectly "applies" to
                // p.m, i.e. has an effect on it)
    }
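
    For readers trying this with the alignas specifier of the
    published C++11 standard: as far as I know, alignas may not
    request an alignment weaker than the one the entity would
    otherwise have, so the sketch below shows the opposite
    direction, where a stricter alignment on the struct carries
    over to its first member. The name Strict, the value 16 and the
    helper are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 16 is a supported alignment and is not weaker than alignof(double).
static_assert(alignof(double) <= 16, "16 must not weaken double's alignment");

struct alignas(16) Strict {
    double m;
};

static_assert(alignof(Strict) == 16, "the specifier changes the struct's alignment");
static_assert(offsetof(Strict, m) == 0, "m shares the address of the whole object");

// Hypothetical helper: p.m has the same address as p, so it lands on a
// 16-byte boundary even though alignof(double) itself is unchanged.
inline bool member_follows_struct_alignment() {
    Strict p = { 0.0 };
    return reinterpret_cast<std::uintptr_t>(&p.m) % 16 == 0;
}
```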

    --
    Gennaro Prota | I'm available for your projects.
    Breeze (preview): <https://sourceforge.net/projects/breeze/>
     
    Gennaro Prota, Aug 21, 2010
    #4
  5. On 21/08/2010 18.22, Larry Evans wrote:
    > On 08/19/10 19:52, Gennaro Prota wrote:
    > [snip]
    >>
    >> Note that, due to the power-of-two requirement, the following
    >> property trivially holds: given two values in V, a1 and a2, a1
    >> is a submultiple of a2 if and only if a1<= a2; or,
    >> equivalently, if and only if log2(a1)<= log2(a2).
    >>
    >> Also, the least common multiple of two alignments is just the
    >> greatest of them.

    > Would not the "extended alignments" mentioned in paragraph 3
    > on page 3 of:
    > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2165.pdf
    > require using lcm


    Sorry for the late reply. I haven't read the paper but the issue
    is purely mathematical: if S is a set of powers of two with a
    non-negative integer exponent then lcm( S ) and max( S ) are the
    same number.
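
    A quick sketch that checks the claim for pairs of powers of two.
    The gcd_of and lcm_of helpers are written out by hand rather
    than taken from a library, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Euclid's algorithm.
constexpr std::size_t gcd_of(std::size_t a, std::size_t b) {
    return b == 0 ? a : gcd_of(b, a % b);
}

// Least common multiple of two positive integers.
constexpr std::size_t lcm_of(std::size_t a, std::size_t b) {
    return a / gcd_of(a, b) * b;
}

// Checks lcm(2**i, 2**j) == max(2**i, 2**j) for all exponents below limit.
// limit must be smaller than the bit width of std::size_t.
inline bool lcm_equals_max_for_powers_of_two(unsigned limit) {
    for (unsigned i = 0; i < limit; ++i) {
        for (unsigned j = 0; j < limit; ++j) {
            const std::size_t a = std::size_t(1) << i;
            const std::size_t b = std::size_t(1) << j;
            const std::size_t larger = a > b ? a : b;
            if (lcm_of(a, b) != larger) return false;
        }
    }
    return true;
}

// Without the power-of-two restriction the two notions differ.
static_assert(lcm_of(4, 6) == 12, "lcm(4, 6) is 12, not max(4, 6)");
```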

    --
    Gennaro Prota | I'm available for your projects.
    Breeze (preview): <https://sourceforge.net/projects/breeze/>
     
    Gennaro Prota, Aug 24, 2010
    #5
  6. Larry Evans (Guest)

    On 08/24/10 14:28, Gennaro Prota wrote:
    > On 21/08/2010 18.22, Larry Evans wrote:

    [snip]
    >> On 08/19/10 19:52, Gennaro Prota wrote:
    >> [snip]
    >>>
    >>> Note that, due to the power-of-two requirement, the following
    >>> property trivially holds: given two values in V, a1 and a2, a1
    >>> is a submultiple of a2 if and only if a1<= a2; or,
    >>> equivalently, if and only if log2(a1)<= log2(a2).
    >>>
    >>> Also, the least common multiple of two alignments is just the
    >>> greatest of them.

    >> Would not the "extended alignments" mentioned in paragraph 3
    >> on page 3 of:
    >> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2165.pdf
    >> require using lcm

    >
    > Sorry for the late reply. I haven't read the paper but the issue
    > is purely mathematical: if S is a set of powers of two with a
    > non-negative integer exponent then lcm( S ) and max( S ) are the
    > same number.
    >

    However, I think I remember reading somewhere that an extended alignment
    could be something other than a power of 2, and that
    was why lcm would be required. Sorry, I've been looking for
    the example for the last few minutes but have been unable to
    find it. I did find:

    http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1447.htm

    which *proposes* to restrict alignments to power of 2 values;
    however, I don't know if that's been accepted.
     
    Larry Evans, Aug 25, 2010
    #6
  7. Larry Evans (Guest)

    On 08/25/10 09:28, Larry Evans wrote:
    [snip]
    > However, I think I remember reading somewhere that an extended alignment
    > could be something other than a power of 2, and that
    > was why lcm would be required. Sorry, I've been looking for
    > the example for the last few minutes but have been unable to
    > find it.


    After several more minutes looking, I still couldn't find
    any document examples showing other than power of 2 alignments.

    Maybe I just imagined it :(

    Sorry for noise.
     
    Larry Evans, Aug 25, 2010
    #7
