Re: Casting an array to integer type

Discussion in 'C Programming' started by Peter Nilsson, Sep 2, 2005.

  1. [Followup set to comp.lang.c.]

    comp.std.c is the wrong newsgroup to post such questions as the
    group discusses the suitability and consistency of the wording of
    the Standards documents that define the C language, not the
    application of the standards to programming.

    wrote:
    > Hi,
    >
    > Here is the problem I am facing. I have a structure as under
    >
    > struct A{
    > int x;
    > char y : 1;
    > char z : 1;


    It's highly unlikely, but it's possible that plain char is signed
    on an implementation that uses sign-magnitude or one's complement.
    On such a machine, y and z are only capable of storing the values
    0 and -0. You are better off using an unsigned type for boolean
    flags.

    > };
    >
    > Now this takes 4 (int) + 1 (for the 2 bits) + 3 bytes (padding) = 8
    > bytes.


    No, it takes sizeof(int) + whatever padding + sizeof(char) +
    whatever further padding the compiler deems fit.
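
The actual figures can be measured rather than assumed. A minimal sketch, assuming a hosted C99 compiler: the bit-fields are declared `unsigned int` instead of `char` to stay portable, and `bytes_after_x` is a made-up helper name, not anything from the original post.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

struct A {
    int x;
    unsigned int y : 1;   /* unsigned int rather than char: portable */
    unsigned int z : 1;
};

/* Bytes of struct A not occupied by x: the storage unit that holds the
   two bit-fields plus whatever padding the compiler chose to add. */
size_t bytes_after_x(void)
{
    /* x is the first member, so it sits at offset 0. */
    assert(offsetof(struct A, x) == 0);
    return sizeof(struct A) - sizeof(int);
}
```

Implementations with a 4-byte int commonly give 4 here, but the Standard promises no particular figure, which is the point of the reply above.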

    > Now I don't want to waste those 3 bytes.
    >
    > So I am trying something like
    >
    > struct A{
    > char x[4];
    > char y : 1;
    > char z : 1;
    > };


    Why? Why is the 'wastage' a problem?

    If it truly is, then you should consult a compiler specific newsgroup
    about whatever structure 'packing' options are available.

    > Now to continue using x as an int, I am trying a cast like this:
    >
    > *(int *)(a.x) = integer_value;


    ITYM: * (int *) &a.x = 42;

    >
    > I am having problems with this and my system crashes.


    There is a VERY GOOD REASON why compilers pad structures. It's called
    alignment. Certain datatypes need to be aligned on certain byte
    boundaries. Character types are the lowest addressable unit and so
    never need alignment.

    > Is this kind of cast incorrect?


    Yes, because the conversion of a character pointer to integer
    pointer is only defined if the resultant pointer is properly
    aligned for the type it's pointing to. Since you're going out
    of your way to undermine your compiler's ability to guarantee
    that alignment, your program is no longer stable (and it's
    considerably less maintainable).
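
Whether a given address is suitably aligned can be probed, with caveats. The sketch below assumes C11 (`_Alignof`) and an implementation that provides `uintptr_t`. The helper name `is_aligned_for_int` is invented for illustration. Testing the converted pointer value is a common idiom on flat-memory machines, not something the Standard guarantees.

```c
#include <assert.h>
#include <stddef.h>   /* NULL */
#include <stdint.h>   /* uintptr_t (optional type, assumed available) */

/* Nonzero if p looks suitably aligned for an int on this platform. */
int is_aligned_for_int(const void *p)
{
    assert(p != NULL);
    return (uintptr_t)p % _Alignof(int) == 0;
}
```

Even when the check succeeds, writing through `(int *)` into storage declared as a char array is still questionable under the aliasing rules, so this only explains the crash and does not make the cast safe.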

    It _is_ possible to minimise the space used by the structure,
    but it comes at a high performance and maintenance cost. It is
    not recommended.

    struct A
    {
    char x[sizeof int];
    unsigned char y : 1;
    unsigned char z : 1;
    } a;

    int x = 42;

    memcpy(a.x, &x, sizeof a.x);
    memcpy(&x, a.x, sizeof x);
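
The same idea can be wrapped in a pair of accessors. This is only a sketch: `a_set_x` and `a_get_x` are hypothetical names, and the bit-fields use `unsigned int` because `char` bit-fields are an extension.

```c
#include <assert.h>
#include <string.h>   /* memcpy, NULL */

struct A {
    char x[sizeof(int)];   /* raw bytes of an int, no alignment requirement */
    unsigned int y : 1;
    unsigned int z : 1;
};

/* Copy the bytes of value into a->x. */
void a_set_x(struct A *a, int value)
{
    assert(a != NULL);
    memcpy(a->x, &value, sizeof a->x);
}

/* Copy the bytes of a->x back out into a properly aligned int. */
int a_get_x(const struct A *a)
{
    int value;

    assert(a != NULL);
    memcpy(&value, a->x, sizeof value);
    return value;
}
```

With `unsigned int` bit-fields many compilers will still align the whole structure like an int, so this layout may save nothing. The accessors simply avoid the misaligned access.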

    > Is there any other way I can avoid the padding?


    Redesign your data structures. For example, instead of having (say)
    an array of struct A, use two (or more) parallel arrays.
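
A rough sketch of the parallel-array layout, assuming the two flags are packed into one byte per element. `N_ITEMS`, the array names and the accessors are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */

#define N_ITEMS 1000   /* hypothetical element count */
#define FLAG_Y  0x01u
#define FLAG_Z  0x02u

/* Instead of struct A items[N_ITEMS], each member gets its own array,
   so no per-element padding is needed. */
static int           item_x[N_ITEMS];
static unsigned char item_flags[N_ITEMS];   /* bit 0 = y, bit 1 = z */

void item_set(size_t i, int x, int y, int z)
{
    assert(i < N_ITEMS);
    item_x[i] = x;
    item_flags[i] = (unsigned char)((y ? FLAG_Y : 0u) | (z ? FLAG_Z : 0u));
}

int item_get_x(size_t i) { assert(i < N_ITEMS); return item_x[i]; }
int item_get_y(size_t i) { assert(i < N_ITEMS); return (item_flags[i] & FLAG_Y) != 0; }
int item_get_z(size_t i) { assert(i < N_ITEMS); return (item_flags[i] & FLAG_Z) != 0; }
```

The total storage is then N_ITEMS * (sizeof(int) + 1) bytes, and every int stays naturally aligned.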

    --
    Peter
     
    Peter Nilsson, Sep 2, 2005
    #1

  2. "Peter Nilsson" <> writes:
    > wrote:
    >> Hi,
    >>
    >> Here is the problem I am facing. I have a structure as under
    >>
    >> struct A{
    >> int x;
    >> char y : 1;
    >> char z : 1;

    >
    > It's highly unlikely, but it's possible that plain char is signed
    > on an implementation that uses sign-magnitude or one's complement.
    > On such a machine, y and z are only capable of storing the values
    > 0 and -0. You are better off using an unsigned type for boolean
    > flags.


    The only allowed types for bit-fields are plain int, unsigned int,
    signed int, and (C99 only) _Bool. Some implementations may support
    char bit-fields as an extension, but they're not portable.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Sep 2, 2005
    #2

  3. Keith Thompson wrote:
    > "Peter Nilsson" <> writes:
    > > wrote:
    > >> Hi,
    > >>
    > >> Here is the problem I am facing. I have a structure as under
    > >>
    > >> struct A{
    > >> int x;
    > >> char y : 1;
    > >> char z : 1;

    > >
    > > It's highly unlikely, but it's possible that plain char is signed
    > > on an implementation that uses sign-magnitude or one's complement.
    > > On such a machine, y and z are only capable of storing the values
    > > 0 and -0. You are better off using an unsigned type for boolean
    > > flags.

    >
    > The only allowed types for bit-fields are plain int, unsigned int,
    > signed int, and (C99 only) _Bool.


    Thanks for that.

    I note though that C99 also includes "or some other implementation-
    defined type." Did any of the final C89/90/94/95 include this too?
    [I notice that it's missing from the C89 draft.]

    --
    Peter
     
    Peter Nilsson, Sep 2, 2005
    #3
  4. Antoine Leca

    In <news:>,
    Peter Nilsson wrote:
    > [Followup set to comp.lang.c.]


    Seen that. I hope Peter is a regular there. I am not.

    >> Here is the problem I am facing. I have a structure as under
    >>
    >> struct A{
    >> int x;
    >> char y : 1;
    >> char z : 1;
    >> };


    > struct A
    > {
    > char x[sizeof int];

    char x[sizeof(int)];

    > unsigned char y : 1;
    > unsigned char z : 1;
    > } a;


    I do not believe this will always be smaller than the original. In
    other words, a lot of compilers will still make it 2*sizeof(int)
    bytes long, in order to keep access to arrays of struct A
    efficient.
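
One way to see what a particular compiler does is to compare the two layouts directly. This is a sketch only: the struct tags and `bytes_saved_per_element` are made-up names, and `unsigned int` bit-fields are assumed.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, ptrdiff_t */

struct with_int   { int  x;              unsigned int y : 1, z : 1; };
struct with_bytes { char x[sizeof(int)]; unsigned int y : 1, z : 1; };

/* Bytes saved per array element by the char-array version, or 0 if the
   compiler made it no smaller than the int version. */
size_t bytes_saved_per_element(void)
{
    struct with_bytes arr[2];

    /* Consecutive array elements are exactly sizeof(element) apart, so
       any padding is paid once per element. */
    assert((char *)&arr[1] - (char *)&arr[0] == (ptrdiff_t)sizeof arr[0]);

    return sizeof(struct with_int) > sizeof(struct with_bytes)
         ? sizeof(struct with_int) - sizeof(struct with_bytes)
         : 0;
}
```

A result of 0 means the compiler padded the second layout out to the same size, which is the outcome predicted above.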

    And, as Peter rightly noted, IMHO they are right.



    Antoine
     
    Antoine Leca, Sep 2, 2005
    #4
  5. Eric Sosman

    Peter Nilsson wrote:
    > Keith Thompson wrote:
    >
    >>"Peter Nilsson" <> writes:
    >>
    >>> wrote:
    >>>
    >>>>Hi,
    >>>>
    >>>>Here is the problem I am facing. I have a structure as under
    >>>>
    >>>>struct A{
    >>>> int x;
    >>>> char y : 1;
    >>>> char z : 1;
    >>>
    >>>It's highly unlikely, but it's possible that plain char is signed
    >>>on an implementation that uses sign-magnitude or one's complement.
    >>>On such a machine, y and z are only capable of storing the values
    >>>0 and -0. You are better off using an unsigned type for boolean
    >>>flags.

    >>
    >>The only allowed types for bit-fields are plain int, unsigned int,
    >>signed int, and (C99 only) _Bool.

    >
    >
    > Thanks for that.
    >
    > I note though that C99 also includes "or some other implementation-
    > defined type." Did any of the final C89/90/94/95 include this too?
    > [I notice that it's missing from the C89 draft.]


    A given implementation is permitted to accept `wchar_t',
    say, as a bit-field base type, and is not required to issue
    a diagnostic if you use it. However, the next implementation
    in line is permitted to reject the code out of hand. Only the
    types Keith listed are sure to be valid on all implementations.

    In any case, it's certainly risky to use a one-bit field
    for a type that is (or might be) signed! In some circumstances
    (not theoretical; I've seen it), you may find that with

    struct { int flag : 1; } s;

    ... the value of `s.flag' will always be zero, no matter what
    you try to assign to it. (Explanation: if `int' is signed
    when used as a bit-field base, the lone bit of `s.flag' will
    be its sign bit and there will be no value bits at all.)

    --
     
    Eric Sosman, Sep 2, 2005
    #5
  6. tedu

    Eric Sosman wrote:
    > struct { int flag : 1; } s;
    >
    > ... the value of `s.flag' will always be zero, no matter what
    > you try to assign to it. (Explanation: if `int' is signed
    > when used as a bit-field base, the lone bit of `s.flag' will
    > be its sign bit and there will be no value bits at all.)


    I may be misunderstanding how bitfields work, but why couldn't you read
    and write the sign bit in such a case?
     
    tedu, Sep 2, 2005
    #6
  7. "tedu" <> writes:
    > Eric Sosman wrote:
    >> struct { int flag : 1; } s;
    >>
    >> ... the value of `s.flag' will always be zero, no matter what
    >> you try to assign to it. (Explanation: if `int' is signed
    >> when used as a bit-field base, the lone bit of `s.flag' will
    >> be its sign bit and there will be no value bits at all.)

    >
    > I may be misunderstanding how bitfields work, but why couldn't you read
    > and write the sign bit in such a case?


    Assuming a two's-complement representation, and assuming that a plain
    int bit-field is treated as signed, you should be able to store either
    0 or -1 in s.flag. But given a one's-complement or signed-magnitude
    representation, s.flag can only hold the values +0 and -0 -- and most
    attempts to store -0 will probably implicitly store +0.

    On the other hand, even on a two's-complement system, I wouldn't be
    astonished to see a compiler get this wrong.

    Signed bit-fields are rarely useful.
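
For one-bit flags the unsigned alternatives behave predictably. The sketch below assumes C99 (for `bool` bit-fields) and uses invented names. It contrasts an `unsigned int` field, which keeps the assigned value modulo 2, with a `bool` field, which keeps 1 for any nonzero value.

```c
#include <assert.h>
#include <stdbool.h>   /* bool (C99) */

struct flags {
    unsigned int u : 1;   /* keeps the assigned value modulo 2 */
    bool         b : 1;   /* keeps 1 for any nonzero value */
};

unsigned int store_in_unsigned_bit(unsigned int v)
{
    struct flags f = { 0, 0 };

    f.u = v;
    assert(f.u <= 1u);   /* a one-bit unsigned field holds only 0 or 1 */
    return f.u;
}

bool store_in_bool_bit(unsigned int v)
{
    struct flags f = { 0, 0 };

    f.b = v;
    return f.b;
}
```

Storing 2 therefore reads back as 0 from the `unsigned int` field but as 1 from the `bool` field. Neither case involves implementation-defined conversions.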

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Sep 2, 2005
    #7
  8. Eric Sosman

    tedu wrote:
    > Eric Sosman wrote:
    >
    >> struct { int flag : 1; } s;
    >>
    >>... the value of `s.flag' will always be zero, no matter what
    >>you try to assign to it. (Explanation: if `int' is signed
    >>when used as a bit-field base, the lone bit of `s.flag' will
    >>be its sign bit and there will be no value bits at all.)

    >
    >
    > I may be misunderstanding how bitfields work, but why couldn't you read
    > and write the sign bit in such a case?


    The Standard allows three representations for signed
    integers: signed magnitude, ones' complement, and two's
    complement. In all three, if the sign bit is clear the
    value is what you'd expect by considering the value bits
    as ordinary binary notation. Since s.flag has no value
    bits, its value when the sign bit is clear must be zero.

    Now for the negatives. If the sign bit is set

    - for signed magnitude, the value is the negative of
    the number obtained from the value bits. The value
    bits contribute zero, which when negated is still
    zero.[*]

    - for ones' complement, the value is the negative of
    the number obtained by inverting all the value bits.
    There are no value bits to invert, so the zero value
    they contribute is still zero after negation.[*]

    - for two's complement, the value is the number obtained
    from the value bits, minus 2**k where k is the count
    of value bits (zero in this case). The value bits
    produce a zero, and subtracting 2**0 == 1 yields -1.
    Aha! I see that I mis-remembered and mis-stated
    the case.

    So: in two of the allowable representations the value
    is uniformly zero.[*] In two's complement the value is
    either zero or minus one, but never plus one. When I said
    `if (s.flag)' could not succeed I was wrong (for the common
    case of two's complement); what I should have said had I
    remembered the case better and/or been thinking more clearly
    is that `if (s.flag > 0)' or `if (s.flag == 1)' could never
    succeed, not in any of the three representations. And indeed,
    this is what the compiler warned about: the range of s.flag
    includes no positive values, so the compiler emitted the same
    warning it would for `if (sizeof(int) < 0)'. I apologize if
    I've misled anyone.

    [*] The Standard permits minus zero to be a "trap value,"
    meaning that you don't even get a value at all: you get a
    signal of some kind instead. In all three representations
    s.flag = 1 is dubious, because 1 is outside the representable
    range: you'll get an implementation-defined value or an
    implementation-defined signal.
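
The three rules can be worked through mechanically. The following is an illustrative model, not a description of any compiler. The enum, the function name and the limit on `k` are all assumptions made for the sketch, and negative zero is simply reported as 0.

```c
#include <assert.h>

enum repr { SIGN_MAGNITUDE, ONES_COMPLEMENT, TWOS_COMPLEMENT };

/* Value of a signed integer with k value bits whose sign bit is SET and
   whose value bits hold `bits` (0 <= bits < 2**k). */
long value_with_sign_bit_set(enum repr r, unsigned int k, long bits)
{
    long two_to_k;

    assert(k < 16);                    /* keep 2**k well inside long */
    two_to_k = 1L << k;                /* 2**k */
    assert(bits >= 0 && bits < two_to_k);

    switch (r) {
    case SIGN_MAGNITUDE:               /* negate the value bits */
        return -bits;
    case ONES_COMPLEMENT:              /* negate the inverted value bits */
        return -((two_to_k - 1) - bits);
    case TWOS_COMPLEMENT:              /* subtract 2**k */
        return bits - two_to_k;
    }
    return 0;                          /* not reached for a valid r */
}
```

With k == 0 and bits == 0, which models `int flag : 1`, the first two cases give 0 and the third gives -1.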

    --
     
    Eric Sosman, Sep 2, 2005
    #8
  9. In article <dfaaov$4k$>,
    Eric Sosman <> wrote:
    >
    > Since s.flag has no value
    > bits, its value when the sign bit is clear must be zero.


    Philosophically (i.e. I'm not describing the current state of C), it
    seems to me that a value with no bits should be void. There just isn't
    any information there, not even a zero. Following from that, -1 * void
    would be a non-sequitur.
     
    Anonymous 7843, Sep 2, 2005
    #9
  10. tedu

    Eric Sosman wrote:
    > tedu wrote:
    > > Eric Sosman wrote:
    > >
    > >> struct { int flag : 1; } s;
    > >>
    > >>... the value of `s.flag' will always be zero, no matter what
    > >>you try to assign to it. (Explanation: if `int' is signed
    > >>when used as a bit-field base, the lone bit of `s.flag' will
    > >>be its sign bit and there will be no value bits at all.)

    > >
    > >
    > > I may be misunderstanding how bitfields work, but why couldn't you read
    > > and write the sign bit in such a case?


    > So: in two of the allowable representations the value
    > is uniformly zero.[*] In two's complement the value is
    > either zero or minus one, but never plus one. When I said
    > `if (s.flag)' could not succeed I was wrong (for the common
    > case of two's complement); what I should have said had I
    > remembered the case better and/or been thinking more clearly
    > is that `if (s.flag > 0)' or `if (s.flag == 1)' could never
    > succeed, not in any of the three representations. And indeed,
    > this is what the compiler warned about: the range of s.flag
    > includes no positive values, so the compiler emitted the same
    > warning it would for `if (sizeof(int) < 0)'. I apologize if
    > I've misled anyone.


    Thanks a lot. I was thinking of bitfields (of the :1 variety) as "1
    bit of an int type" when "an integer type of 1 bit" seems more
    appropriate.
     
    tedu, Sep 2, 2005
    #10
  11. Joe Wright

    Eric Sosman wrote:
    >
    > tedu wrote:
    >
    >>Eric Sosman wrote:
    >>
    >>
    >>> struct { int flag : 1; } s;
    >>>
    >>>... the value of `s.flag' will always be zero, no matter what
    >>>you try to assign to it. (Explanation: if `int' is signed
    >>>when used as a bit-field base, the lone bit of `s.flag' will
    >>>be its sign bit and there will be no value bits at all.)

    >>
    >>
    >>I may be misunderstanding how bitfields work, but why couldn't you read
    >>and write the sign bit in such a case?

    >
    >
    > The Standard allows three representations for signed
    > integers: signed magnitude, ones' complement, and two's
    > complement. In all three, if the sign bit is clear the
    > value is what you'd expect by considering the value bits
    > as ordinary binary notation. Since s.flag has no value
    > bits, its value when the sign bit is clear must be zero.
    >
    > Now for the negatives. If the sign bit is set
    >
    > - for signed magnitude, the value is the negative of
    > the number obtained from the value bits. The value
    > bits contribute zero, which when negated is still
    > zero.[*]
    >
    > - for ones' complement, the value is the negative of
    > the number obtained by inverting all the value bits.
    > There are no value bits to invert, so the zero value
    > they contribute is still zero after negation.[*]
    >
    > - for two's complement, the value is the number obtained
    > from the value bits, minus 2**k where k is the count
    > of value bits (zero in this case). The value bits
    > produce a zero, and subtracting 2**0 == 1 yields -1.
    > Aha! I see that I mis-remembered and mis-stated
    > the case.
    >
    > So: in two of the allowable representations the value
    > is uniformly zero.[*] In two's complement the value is
    > either zero or minus one, but never plus one. When I said
    > `if (s.flag)' could not succeed I was wrong (for the common
    > case of two's complement); what I should have said had I
    > remembered the case better and/or been thinking more clearly
    > is that `if (s.flag > 0)' or `if (s.flag == 1)' could never
    > succeed, not in any of the three representations. And indeed,
    > this is what the compiler warned about: the range of s.flag
    > includes no positive values, so the compiler emitted the same
    > warning it would for `if (sizeof(int) < 0)'. I apologize if
    > I've misled anyone.
    >
    > [*] The Standard permits minus zero to be a "trap value,"
    > meaning that you don't even get a value at all: you get a
    > signal of some kind instead. In all three representations
    > s.flag = 1 is dubious, because 1 is outside the representable
    > range: you'll get an implementation-defined value or an
    > implementation-defined signal.
    >

    Try this..

    #include <stdio.h>
    struct { int flag : 1; } s;
    int main(void) {
    s.flag = 1;
    printf("%d\n", s.flag);
    s.flag = 0;
    printf("%d\n", s.flag);
    s.flag = -1;
    printf("%d\n", s.flag);
    s.flag = -2;
    printf("%d\n", s.flag);
    return 0;
    }
    Here at home gcc 3.1 prints..

    -1
    0
    -1
    0

    ...which seems at odds with your explanation. It looks like the low-order
    bit gets assigned as a value bit and that in the conversion :1 to int
    the value expands into the sign.

    Of course gcc could get it wrong. (?)

    --
    Joe Wright
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
     
    Joe Wright, Sep 2, 2005
    #11
  12. Eric Sosman

    Joe Wright wrote:

    > Eric Sosman wrote:
    >> [...]
    >> - for two's complement, the value is the number obtained
    >> from the value bits, minus 2**k where k is the count
    >> of value bits (zero in this case). The value bits
    >> produce a zero, and subtracting 2**0 == 1 yields -1.

    >
    > Try this..
    >
    > #include <stdio.h>
    > struct { int flag : 1; } s;
    > int main(void) {
    > s.flag = 1;
    > printf("%d\n", s.flag);
    > s.flag = 0;
    > printf("%d\n", s.flag);
    > s.flag = -1;
    > printf("%d\n", s.flag);
    > s.flag = -2;
    > printf("%d\n", s.flag);
    > return 0;
    > }
    > Here at home gcc 3.1 prints..
    >
    > -1
    > 0
    > -1
    > 0
    >
    > ..which seems at odds with your explanation. It looks like the low-order
    > bit gets assigned as a value bit and that in the conversion :1 to int
    > the value expands into the sign.


    It seems to confirm what I wrote (once I got my head
    unscrewed): in two's complement, a one-bit signed integer
    is either 0 or -1. Your first and fourth assignments invoke
    implementation-defined behavior, because neither 1 nor -2
    is one of the possible values. You got an implementation-
    defined value as a result; the Standard also permits the
    raising of an implementation-defined signal.

    (Actually, there's implementation-defined behavior in
    the third assignment, too, because the implementation defines
    the representation of signed integers: if it chooses signed
    magnitude or ones' complement the only representable value
    is zero, and -1 is out of the representable range.)

    --
    Eric Sosman
     
    Eric Sosman, Sep 3, 2005
    #12
  13. Tim Rentsch

    Eric Sosman <> writes:

    > tedu wrote:
    > > Eric Sosman wrote:
    > >
    > >> struct { int flag : 1; } s;
    > >>
    > >>... the value of `s.flag' will always be zero, no matter what
    > >>you try to assign to it. (Explanation: if `int' is signed
    > >>when used as a bit-field base, the lone bit of `s.flag' will
    > >>be its sign bit and there will be no value bits at all.)

    > >
    > >
    > > I may be misunderstanding how bitfields work, but why couldn't you read
    > > and write the sign bit in such a case?

    >
    > The Standard allows three representations for signed
    > integers: signed magnitude, ones' complement, and two's
    > complement. In all three, if the sign bit is clear the
    > value is what you'd expect by considering the value bits
    > as ordinary binary notation. Since s.flag has no value
    > bits, its value when the sign bit is clear must be zero.
    >
    > Now for the negatives. If the sign bit is set
    >
    > - for signed magnitude, the value is the negative of
    > the number obtained from the value bits. The value
    > bits contribute zero, which when negated is still
    > zero.[*]
    >
    > - for ones' complement, the value is the negative of
    > the number obtained by inverting all the value bits.
    > There are no value bits to invert, so the zero value
    > they contribute is still zero after negation.[*]
    >
    > - for two's complement, the value is the number obtained
    > from the value bits, minus 2**k where k is the count
    > of value bits (zero in this case). The value bits
    > produce a zero, and subtracting 2**0 == 1 yields -1.
    > Aha! I see that I mis-remembered and mis-stated
    > the case.
    >
    > So: in two of the allowable representations the value
    > is uniformly zero.[*] In two's complement the value is
    > either zero or minus one, but never plus one. When I said
    > `if (s.flag)' could not succeed I was wrong (for the common
    > case of two's complement); what I should have said had I
    > remembered the case better and/or been thinking more clearly
    > is that `if (s.flag > 0)' or `if (s.flag == 1)' could never
    > succeed, not in any of the three representations. And indeed,
    > this is what the compiler warned about: the range of s.flag
    > includes no positive values, so the compiler emitted the same
    > warning it would for `if (sizeof(int) < 0)'. I apologize if
    > I've misled anyone.
    >
    > [*] The Standard permits minus zero to be a "trap value,"
    > meaning that you don't even get a value at all: you get a
    > signal of some kind instead. In all three representations
    > s.flag = 1 is dubious, because 1 is outside the representable
    > range: you'll get an implementation-defined value or an
    > implementation-defined signal.


    Actually, 'if (s.flag > 0)' or 'if (s.flag == 1)' could
    succeed, in any of the three representation schemes,
    depending on the implementation.

    The rule about conversions that's indirectly referenced in
    the *'ed paragraph is slightly misquoted. The wording in
    6.3.1.3 p3 says the result is implementation-defined, not
    that the conversion yields an implementation-defined value:

    6.3.1.3

    3 Otherwise, the new type is signed and the value
    cannot be represented in it; either the result is
    implementation-defined or an implementation-defined
    signal is raised.

    Because the result is implementation-defined, it can be a
    trap representation. A sign bit of 1, with no value bits,
    can be (implementation-)defined to be a trap representation
    in any of the three representation schemes (6.2.6.2 p2). If
    the result of the conversion is a trap representation, the
    expressions accessing s.flag are undefined behavior; so,
    they might yield 1 as a result.

    The question of what conversions happen when assigning to
    the bitfield s.flag is a little murky, because the type of a
    bitfield is itself a little murky. However, one
    of three cases holds: (1) a conversion to the narrow type
    happens because of the assignment -- this conversion can
    yield a trap representation and so produces undefined
    behavior; (2) a conversion to the narrow type happens when
    the value of the right hand side is stored, which again can
    yield a trap representation and produce undefined behavior;
    or, (3) the right hand side is converted to 'int' type, and
    there is no conversion to the narrow type, but storing the
    value into the too-narrow bitfield results in an exceptional
    condition, again producing undefined behavior. In each case
    the result of the undefined behavior could cause the
    expressions referencing s.flag to have the value 1.

    The s.flag expressions could also yield the value 1 without
    the presence of undefined behavior. The reason is that it's
    implementation-defined whether 'int' on a bitfield is the
    same as 'signed int' or 'unsigned int' (6.7.2 p5).
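
Which choice an implementation made for plain `int` bit-fields can be probed at run time. A hedged sketch with a hypothetical function name: it uses a two-bit field, because -1 is representable in a two-bit signed field under all three representations, so the assignment is in range if the field is signed and wraps to 3 if it is unsigned.

```c
#include <assert.h>

/* Returns 1 if this implementation treats a plain `int` bit-field as
   signed, 0 if it treats it as unsigned. */
int plain_int_bitfield_is_signed(void)
{
    struct { int f : 2; } s;

    s.f = -1;
    /* Signed field: -1 is stored as is. Unsigned two-bit field: -1
       converts to 3. */
    assert(s.f == -1 || s.f == 3);
    return s.f < 0;
}
```

Portable code avoids depending on the answer by writing `signed int` or `unsigned int` explicitly.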
     
    Tim Rentsch, Sep 3, 2005
    #13
  14. Eric Sosman wrote:
    > Peter Nilsson wrote:
    > > Keith Thompson wrote:
    > > > The only allowed types for bit-fields are plain int, unsigned
    > > > int, signed int, and (C99 only) _Bool.

    > >
    > > Thanks for that.
    > >
    > > I note though that C99 also includes "or some other
    > > implementation-defined type." Did any of the final C89/90/94/95
    > > include this too? [I notice that it's missing from the C89
    > > draft.]

    >
    > A given implementation is permitted to accept `wchar_t',
    > say, as a bit-field base type, and is not required to issue
    > a diagnostic if you use it.


    This seems a very roundabout way to answer my question. You seem
    to be implying 'yes, implementation-defined bit-field types also
    made C90'. ;)

    <snip>

    --
    Peter
     
    Peter Nilsson, Sep 5, 2005
    #14
