Is it legal to type cast to DWORD* ???

Discussion in 'C++' started by Denis Remezov, Jul 10, 2004.

  1. __PPS__ wrote:
    >
    > Actually what I mean is that - if I have some memory buffer, lets say
    > char a[64]; and then I do like this:
    >
    > DWORD num = 0x1234;
    > *(DWORD*)a = num; (1)
    > *(DWORD*)(a+1) = num; (2)
    >
    > either (1) or (2) will assign dword value to not dword aligned
    > address. The question is - will this code be fatal on some systems??
    >
    > I used it in a program and somebody told me that it produces fatal
    > bugs for some systems. From language point of view this statement is
    > perfectly legal, so I'm wondering is it a real problem???
    >


    For this type of conversion (reinterpret_cast for data pointers) the
    standard makes only one guarantee: converting the result back to the
    original pointer type yields the original pointer value (and only
    provided that the alignment requirements of the target type, unsigned
    long in this example, are no stricter than those of char). The result
    of anything that you do beyond that is unspecified, i.e.
    implementation-dependent.

    On some systems, misaligned data access will cause a fatal error.
    You may be able to enable system-specific traps to handle misaligned
    addresses, at a cost in performance. On some others, there are no
    traps and no errors but the performance will still suffer, sometimes
    noticeably. It's all system-specific.

    The standard explicitly permits copying a POD object (that includes
    built-in types) into a char array and then copying the contents back
    into the object, character by character (e.g. by using memcpy).
    If what you read from the array is garbage, the behaviour, I suppose,
    is undefined (in principle, not every possible bit combination needs
    to represent a valid object value; unfortunately, I don't remember
    seeing examples for integral types).
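
    A minimal sketch of the memcpy approach described above, assuming
    DWORD is a 32-bit unsigned integer (std::uint32_t here). The helper
    names store_u32 and load_u32 are made up for illustration and do not
    come from the thread. Because the bytes are copied one at a time, the
    offset into the buffer does not need to be aligned:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy a 32-bit value into a byte buffer at any
// address, aligned or not. The value is stored in host byte order.
inline void store_u32(void* dst, std::uint32_t value) {
    std::memcpy(dst, &value, sizeof value);
}

// Hypothetical helper: read back a value written by store_u32.
inline std::uint32_t load_u32(const void* src) {
    std::uint32_t value;
    std::memcpy(&value, src, sizeof value);
    return value;
}
```

    With these, line (2) of the quoted snippet could be written as
    store_u32(a + 1, num) instead of *(DWORD*)(a+1) = num.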

    (By the way, it's better to avoid using non-standard definitions
    such as DWORD for the purpose of discussion here. What I managed
    to grep on my system might be even slightly different from
    your DWORD :) ).

    Denis
     
    Denis Remezov, Jul 10, 2004
    #1

  2. __PPS__ Guest

    Actually, what I mean is this: if I have some memory buffer, let's say
    char a[64];, and then I do this:

    DWORD num = 0x1234;
    *(DWORD*)a = num; (1)
    *(DWORD*)(a+1) = num; (2)

    then either (1) or (2) will assign a DWORD value to an address that is
    not DWORD-aligned. The question is: will this code be fatal on some
    systems?

    I used it in a program and somebody told me that it produces fatal
    bugs on some systems. From the language's point of view this statement
    is perfectly legal, so I'm wondering: is it a real problem?

    Thank you.
     
    __PPS__, Jul 10, 2004
    #2

  3. On 9 Jul 2004 19:08:00 -0700, __PPS__ wrote:

    > Actually what I mean is that - if I have some memory buffer, lets say
    > char a[64]; and then I do like this:
    >
    > DWORD num = 0x1234;
    > *(DWORD*)a = num; (1)
    > *(DWORD*)(a+1) = num; (2)
    >
    > either (1) or (2) will assign dword value to not dword aligned
    > address. The question is - will this code be fatal on some systems??


    Yes, and even on systems where it is not fatal it might be inefficient.

    >
    > I used it in a program and somebody told me that it produces fatal
    > bugs for some systems. From language point of view this statement is
    > perfectly legal, so I'm wondering is it a real problem???


    No, it is not perfectly legal from the language point of view. It might
    compile on your compiler but that doesn't make it legal.

    The problem is exactly as you say. Some systems make assumptions about
    the alignment of data, and therefore the C++ language standard takes
    great pains to allow for such systems by forbidding code like yours.

    It is impossible in general to write a compiler that will detect and warn
    about such code, but that doesn't make it legal. If you write code like
    that you are on your own.

    john
     
    John Harrison, Jul 10, 2004
    #3
  4. Phlip Guest

    __PPS__ wrote:

    > Actually what I mean is that - if I have some memory buffer, lets say
    > char a[64]; and then I do like this:
    >
    > DWORD num = 0x1234;
    > *(DWORD*)a = num; (1)
    > *(DWORD*)(a+1) = num; (2)
    >
    > either (1) or (2) will assign dword value to not dword aligned
    > address. The question is - will this code be fatal on some systems??


    *(DWORD*)(a+1) = num will cause a hardware fault on a Motorola 68x00 chip. A
    pointer to an integer type cannot contain an odd address. (A character array
    has good odds of landing on an even address, but even *(DWORD*)a = num might
    croak.)

    > I used it in a program and somebody told me that it produces fatal
    > bugs for some systems. From language point of view this statement is
    > perfectly legal, so I'm wondering is it a real problem???


    Legal ain't moral. Don't do it.

    You are abusing an array by copying 4 byte quads into it. If you need an
    array of DWORDs, declare one.

    The only reason you might need an array of DWORDs shifted at integral
    indices would be some kind of binary compatibility. If so, there must be
    some better way to get what you need. Such a binary compatibility will come
    with rules regarding the "big endian" or "little endian" holy war (look
    those up), and you can usually shift and mask binary bytes out of DWORDs and
    pack them into characters.
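
    The shift-and-mask approach mentioned above could look like the
    following sketch. It assumes a 32-bit value and big-endian (network)
    byte order on the wire; put_be32 and get_be32 are hypothetical names,
    not part of any code posted in this thread. Since only individual
    bytes are touched, neither alignment nor host byte order matters:

```cpp
#include <cstdint>

// Hypothetical helper: write v as four bytes, most significant first.
inline void put_be32(unsigned char* out, std::uint32_t v) {
    out[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(v & 0xFF);
}

// Hypothetical helper: reassemble the value from four big-endian bytes.
inline std::uint32_t get_be32(const unsigned char* in) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```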

    As a style rule that leads to technical rules, shun C style casts, such as
    (DWORD*). In this case, the only C++ alternative would be
    reinterpret_cast<DWORD*>(a).

    As a style rule that leads to technical rules, shun reinterpret_cast<>
    without an overwhelming reason to use it.

    --
    Phlip
    http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces
     
    Phlip, Jul 10, 2004
    #4
  5. Jack Klein Guest

    On 9 Jul 2004 19:08:00 -0700, (__PPS__) wrote in
    comp.lang.c++:

    I don't know, you haven't provided a definition of the type DWORD,
    which is not a standard C++ type.

    > Actually what I mean is that - if I have some memory buffer, lets say
    > char a[64]; and then I do like this:
    >
    > DWORD num = 0x1234;
    > *(DWORD*)a = num; (1)
    > *(DWORD*)(a+1) = num; (2)
    >
    > either (1) or (2) will assign dword value to not dword aligned
    > address. The question is - will this code be fatal on some systems??


    Yes, quite a few. From very old architectures like the 8096 and the
    68000, to many newer RISC platforms like ARM and TI28xx.

    > I used it in a program and somebody told me that it produces fatal
    > bugs for some systems. From language point of view this statement is
    > perfectly legal, so I'm wondering is it a real problem???


    Who says it is "perfectly legal"? You? Can you cite a reference from
    the ISO C++ language standard confirming that it is "perfectly legal"?
    I can cite one that says it is not "perfectly legal".

    [ISO 14882:1998 3.9 Types paragraph 5]

    Object types have alignment requirements (3.9.1, 3.9.2). The alignment
    of a complete object type is an implementation-defined integer value
    representing a number of bytes; an object is allocated at an address
    that meets the alignment requirements of its object type.

    [end quotation]

    Accessing an object at an improperly aligned address is undefined
    behavior.
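
    For illustration, here is one way to check an address against a
    type's alignment requirement. This is only a sketch: it uses alignof
    and std::uintptr_t, which are C++11 features newer than the 1998
    standard quoted above, and it assumes that converting a pointer to
    std::uintptr_t yields the numeric address, which is
    implementation-defined but holds on mainstream platforms. The name
    is_aligned_for is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: true when p satisfies the alignment requirement
// of T on this implementation.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

    For a buffer that happens to start on a suitably aligned address,
    is_aligned_for<std::uint32_t>(a) is true, while the same check on
    a + 1 is false on any platform where the type's alignment is greater
    than 1.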

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Jul 10, 2004
    #5
  6. Phlip Guest

    Jack Klein wrote:

    > [ISO 14882:1998 3.9 Types paragraph 5]
    >
    > Object types have alignment requirements (3.9.1, 3.9.2). The alignment
    > of a complete object type is an implementation-defined integer value
    > representing a number of bytes; an object is allocated at an address
    > that meets the alignment requirements of its object type.
    >
    > [end quotation]
    >
    > Accessing an object at an improperly aligned address is undefined
    > behavior.


    Ah, language law...

    What Jack means is: Pointing at an object at an address the implementation
    defines as improper is undefined.

    If a DWORD is indeed the legendary "double word", or a quad, and the chip is
    x86, and the C++ implementation implements this, the behavior is defined.

    Don't do it anyway.

    --
    Phlip
    http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces
     
    Phlip, Jul 10, 2004
    #6
  7. __PPS__ Guest

    Thanks guys for your replies - it helped a lot - from now on I will
    avoid this problem.

    Why I used it:
    I wrote a simple class for creating and sending RADIUS packets, a
    RADIUS client (RFC 2865), to make it faster and easier to use for an
    open-source project that runs on almost all Unixes, Windows versions,
    Mac OS, Solaris, etc.,
    and I defined a RADIUS PDU like this:

    NOTE: it should be exactly 4096 bytes in size to avoid misaligned
    fields in some places. Since all the fields are chars, it hopefully
    always will be (unless some compiler decides it should be larger).

    class something{
        union {
            unsigned char as_raw_data[4096];
            struct {
                unsigned char code;
                unsigned char identifier;
                unsigned char length[2];
                unsigned char authenticator[16];
                unsigned char pdu[4076]; // this later contains the list of attributes
            } as_pdu;
        } data;
    public:
        ....
        ....
    };


    When preparing a packet to be sent, the unsigned char
    authenticator[16]; field is sometimes set to 16 random bytes. I do it
    using a Mersenne Twister pseudorandom generator like this (random is a
    static instance of a class):
    *(DWORD*)(&data.as_pdu.authenticator[0]) = random;
    *(DWORD*)(&data.as_pdu.authenticator[4]) = random;
    *(DWORD*)(&data.as_pdu.authenticator[8]) = random;
    *(DWORD*)(&data.as_pdu.authenticator[12]) = random;

    As you can see, authenticator sits at a 4-byte offset within data
    (data is the name of the structure). But will data itself be aligned
    to 4 bytes, or might it be unaligned? I'm going to change my code to
    reflect your comments, but for the sake of better knowledge I have
    other questions:
    what if I defined the PDU like this:
    union {
        unsigned char as_raw_data[4096];
        struct {
            unsigned char code;
            unsigned char identifier;
            unsigned char length[2];
            union {
                unsigned char authenticator_as_chars[16];
                unsigned int authenticator_as_ints[4]; // each 4 bytes...
            };
            unsigned char pdu[4076];
        } as_pdu;
    } data;

    I probably wouldn't have this problem with:
    authenticator_as_ints[0] = random;
    authenticator_as_ints[1] = random;
    authenticator_as_ints[2] = random;
    authenticator_as_ints[3] = random;

    BUT, would my structure still be 4096 in total? (Looks like it should
    be for systems that I do test on - what about others??)
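
    As a sketch of how the authenticator fill described above might be
    done without casting the byte pointer or changing the struct layout:
    each 32-bit random value is copied into the byte array with memcpy.
    The function name fill_authenticator is hypothetical, and Rng stands
    in for the poster's random object, whose class is not shown in the
    thread; any callable that returns a 32-bit value, such as
    std::mt19937, would fit.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>

// Hypothetical helper: fill a 16-byte authenticator from four 32-bit
// random values. memcpy works on bytes, so the array's alignment is
// irrelevant.
template <typename Rng>
void fill_authenticator(unsigned char (&authenticator)[16], Rng& rng) {
    for (std::size_t i = 0; i < 16; i += 4) {
        const std::uint32_t r = static_cast<std::uint32_t>(rng());
        std::memcpy(&authenticator[i], &r, sizeof r);
    }
}
```

    A call would then look like fill_authenticator(data.as_pdu.authenticator,
    gen), where gen is, for example, a std::mt19937 instance.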


    //////////////////////////////

    Jack Klein wrote:
    > I don't know, you haven't provided a definition of the type DWORD,
    > which is not a standard C++ type.


    So then, how could you possibly answer my question??

    > Who says it is "perfectly legal"?


    What I was sure of is that the simple (DWORD*) pointer cast by itself
    makes no difference, as the value of the pointer itself is not changed.
    But accessing the data pointed to as char or as DWORD does make a
    difference (at least on some processors). The difference I was asking
    about is not performance; it's about errors.
    If I didn't express myself clearly, sorry.


    DWORD was intended to indicate 4 bytes. It wasn't about being standard
    or not; most people understood what I meant.
     
    __PPS__, Jul 10, 2004
    #7
  8. Phlip Guest

    "__PPS__" <> wrote in message
    news:...
    > Thanks guys for your replies - it helped a lot - from now on I will
    > avoid this problem.


    "Avoid" it by only doing it inside one function. Keep the rest of your
    program ignorant of low-level data issues.

    However...

    > class something{
    > union {
    > unsigned char as_raw_data[4096];
    > struct {
    > unsigned char code;
    > unsigned char identifier;
    > unsigned char length[2];
    > unsigned char authenticator[16];
    > unsigned char pdu[4076]; // this later contains the list of attributes
    > }as_pdu;
    > } data;
    > public:
    > ...
    > ...
    > };
    >
    >
    > When preparing a packet to be sent sometimes the unsigned char
    > authenticator[16]; field is set to random 16 bytes. I do it using
    > mersenne twister pseudorandom like this (random is a static instance
    > of a class)
    > *(DWORD*)(&data.as_pdu.authenticator[0]) = random;
    > *(DWORD*)(&data.as_pdu.authenticator[4]) = random;
    > *(DWORD*)(&data.as_pdu.authenticator[8]) = random;
    > *(DWORD*)(&data.as_pdu.authenticator[12]) = random;


    You did not present a reason to index authenticator at byte addresses. So
    you might ought to do this:

    > I wouldn't probably have this problem with:
    > authenticator_as_ints[0] = random;
    > authenticator_as_ints[1] = random;
    > authenticator_as_ints[2] = random;
    > authenticator_as_ints[3] = random;
    >
    > BUT, would my structure still be 4096 in total? (Looks like it should
    > be for systems that I do test on - what about others??)


    The padding between the data members of a POD is implementation-defined.
    Plenty of platforms provide #pragma pack() to prevent padding.
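
    One way to find out whether a given compiler pads the structure is to
    let the build fail when the layout is not the expected one. The sketch
    below assumes a C++11 compiler (static_assert is newer than this
    thread) and gives the quoted struct a hypothetical name, RadiusPdu, so
    that sizeof and offsetof can refer to it:

```cpp
#include <cstddef>

// Layout from the quoted post; the name RadiusPdu is made up here.
struct RadiusPdu {
    unsigned char code;
    unsigned char identifier;
    unsigned char length[2];
    unsigned char authenticator[16];
    unsigned char pdu[4076];
};

// Compilation stops on any platform where the layout differs.
static_assert(sizeof(RadiusPdu) == 4096,
              "RadiusPdu must be exactly 4096 bytes");
static_assert(offsetof(RadiusPdu, authenticator) == 4,
              "authenticator must start at byte 4");
static_assert(offsetof(RadiusPdu, pdu) == 20,
              "attributes must start at byte 20");
```

    If one of the assertions fires on some platform, that is the place
    where a packing pragma or byte-by-byte serialization would be needed.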

    --
    Phlip
    http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces
     
    Phlip, Jul 10, 2004
    #8
