Is it legal to type cast to DWORD* ???

Denis Remezov

__PPS__ said:
Actually, what I mean is this: if I have some memory buffer, let's say
char a[64]; and then I do this:

DWORD num = 0x1234;
*(DWORD*)a = num; (1)
*(DWORD*)(a+1) = num; (2)

either (1) or (2) will assign a DWORD value to an address that is not
DWORD-aligned. The question is: will this code be fatal on some systems?

I used it in a program and somebody told me that it produces fatal
bugs on some systems. From the language point of view this statement is
perfectly legal, so I'm wondering: is it a real problem?

For this type of conversion (reinterpret_cast between data pointers) the
standard makes only one guarantee: converting the result back to the
original pointer type yields the original pointer value, and even that
holds only if the alignment requirements of unsigned long (for this
example) are no stricter than those of char. The result of anything
else you do with the converted pointer is unspecified, i.e.
implementation-dependent.

On some systems, misaligned data access will cause a fatal error.
You may be able to enable system-specific traps to handle misaligned
addresses at the cost of performance. On some others, there
are no traps and no errors but the performance will still suffer,
sometimes noticeably. It's all system-specific.

The standard explicitly permits copying a POD object (that includes
built-in types) into a char array and then copying the contents back
into the object, character by character (e.g. by using memcpy).
If what you read from the array is garbage, the behaviour, I suppose,
is undefined (in principle, not every bit pattern needs to represent
a valid object value; unfortunately, I don't remember seeing examples
for integral types).
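
The byte-wise copy described above can be sketched as follows. This is only an illustration: store_u32 and load_u32 are hypothetical helper names, and std::uint32_t (from <cstdint>, so C++11 or later) is assumed as a stand-in for the non-standard DWORD.

```cpp
#include <cstdint>
#include <cstring>

// Store a 32-bit value at any byte address, aligned or not.
// memcpy copies the object representation byte by byte, so the
// destination does not need to meet the alignment of std::uint32_t.
inline void store_u32(void* dst, std::uint32_t value) {
    std::memcpy(dst, &value, sizeof value);
}

// Read a 32-bit value back from the same kind of address.
inline std::uint32_t load_u32(const void* src) {
    std::uint32_t value;
    std::memcpy(&value, src, sizeof value);
    return value;
}
```

With char a[64], store_u32(a + 1, 0x1234) replaces statement (2) from the question, and load_u32(a + 1) reads the value back. The bytes land in the host's byte order, exactly as with the cast.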

(By the way, it's better to avoid using non-standard definitions
such as DWORD for the purpose of discussion here. What I managed
to grep on my system might be even slightly different from
your DWORD :) ).

Denis
 
__PPS__

Actually, what I mean is this: if I have some memory buffer, let's say
char a[64]; and then I do this:

DWORD num = 0x1234;
*(DWORD*)a = num; (1)
*(DWORD*)(a+1) = num; (2)

either (1) or (2) will assign a DWORD value to an address that is not
DWORD-aligned. The question is: will this code be fatal on some systems?

I used it in a program and somebody told me that it produces fatal
bugs on some systems. From the language point of view this statement is
perfectly legal, so I'm wondering: is it a real problem?

Thank you.
 
John Harrison

__PPS__ said:
Actually, what I mean is this: if I have some memory buffer, let's say
char a[64]; and then I do this:

DWORD num = 0x1234;
*(DWORD*)a = num; (1)
*(DWORD*)(a+1) = num; (2)

either (1) or (2) will assign a DWORD value to an address that is not
DWORD-aligned. The question is: will this code be fatal on some systems?

Yes, and even on systems where it is not fatal it might be inefficient.

__PPS__ said:
I used it in a program and somebody told me that it produces fatal
bugs on some systems. From the language point of view this statement is
perfectly legal, so I'm wondering: is it a real problem?

No it is not perfectly legal from the language point of view. It might
compile on your compiler but that doesn't make it legal.

The problem is exactly as you say. Some systems make assumptions about the
alignment of data, and the C++ language standard takes great pains to
allow for such systems by forbidding code like yours.

It is impossible in general to write a compiler that will detect and warn
about such code, but that doesn't make it legal. If you write code like
that you are on your own.

john
 
Phlip

__PPS__ said:
Actually, what I mean is this: if I have some memory buffer, let's say
char a[64]; and then I do this:

DWORD num = 0x1234;
*(DWORD*)a = num; (1)
*(DWORD*)(a+1) = num; (2)

either (1) or (2) will assign a DWORD value to an address that is not
DWORD-aligned. The question is: will this code be fatal on some systems?

*(DWORD*)(a+1) = num will cause a hardware fault on a Motorola 68x00 chip. A
pointer to an integer type cannot contain an odd address. (A character array
has good odds of landing on an even address, but even *(DWORD*)a = num might
croak.)

__PPS__ said:
I used it in a program and somebody told me that it produces fatal
bugs on some systems. From the language point of view this statement is
perfectly legal, so I'm wondering: is it a real problem?

Legal ain't moral. Don't do it.

You are abusing an array by copying 4 byte quads into it. If you need an
array of DWORDs, declare one.

The only reason you might need an array of DWORDs shifted at integral
indices would be some kind of binary compatibility. If so, there must be
some better way to get what you need. Such a binary compatibility will come
with rules regarding the "big endian" or "little endian" holy war (look
those up), and you can usually shift and mask binary bytes out of DWORDs and
pack them into characters.
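
A minimal sketch of that shift-and-mask approach, assuming a big-endian ("network order") wire format and using std::uint32_t in place of DWORD. The names put_u32_be and get_u32_be are made up for illustration.

```cpp
#include <cstdint>

// Write a 32-bit value into out[0..3], most significant byte first.
// Only single bytes are stored, so out may sit at any address, and the
// result is the same on big-endian and little-endian hosts.
inline void put_u32_be(unsigned char* out, std::uint32_t value) {
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>(value & 0xFFu);
}

// Reassemble the value from four bytes, most significant byte first.
inline std::uint32_t get_u32_be(const unsigned char* in) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```

Unlike a pointer cast, this fixes the byte order of the stored data, which is what a binary-compatible format usually needs.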

As a style rule that leads to technical rules, shun C style casts, such as
(DWORD*). In this case, the only C++ alternative would be
reinterpret_cast<DWORD*>(a).

As a style rule that leads to technical rules, shun reinterpret_cast<>
without an overwhelming reason to use it.
 
Jack Klein

On 9 Jul 2004 19:08:00 -0700, (e-mail address removed) (__PPS__) wrote in
comp.lang.c++:

I don't know, you haven't provided a definition of the type DWORD,
which is not a standard C++ type.

__PPS__ said:
Actually, what I mean is this: if I have some memory buffer, let's say
char a[64]; and then I do this:

DWORD num = 0x1234;
*(DWORD*)a = num; (1)
*(DWORD*)(a+1) = num; (2)

either (1) or (2) will assign a DWORD value to an address that is not
DWORD-aligned. The question is: will this code be fatal on some systems?

Yes, quite a few. From very old architectures like the 8096 and the
68000, to many newer RISC platforms like ARM and TI28xx.

__PPS__ said:
I used it in a program and somebody told me that it produces fatal
bugs on some systems. From the language point of view this statement is
perfectly legal, so I'm wondering: is it a real problem?

Who says it is "perfectly legal"? You? Can you cite a reference from
the ISO C++ language standard confirming that it is "perfectly legal"?
I can cite one that says it is not "perfectly legal".

[ISO 14882:1998 3.9 Types paragraph 5]

Object types have alignment requirements (3.9.1, 3.9.2). The alignment
of a complete object type is an implementation-defined integer value
representing a number of bytes; an object is allocated at an address
that meets the alignment requirements of its object type.

[end quotation]

Accessing an object at an improperly aligned address is undefined
behavior.
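
As a rough illustration of what "properly aligned" means in practice, here is a sketch of a run-time check. It relies on alignof (C++11, so newer than the 1998 standard quoted above) and on the assumption that converting a pointer to std::uintptr_t yields a plain numeric address, which holds on common flat-memory platforms but is implementation-defined. The name is_aligned_for is hypothetical.

```cpp
#include <cstdint>

// True if p meets the alignment this implementation requires for T.
// Assumes the integer value of the pointer is its byte address.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

For a buffer char a[64], at most one of a and a + 1 can pass is_aligned_for<std::uint32_t> whenever that type's alignment is greater than 1, which is the situation the original question describes.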
 
Phlip

Jack said:
[ISO 14882:1998 3.9 Types paragraph 5]

Object types have alignment requirements (3.9.1, 3.9.2). The alignment
of a complete object type is an implementation-defined integer value
representing a number of bytes; an object is allocated at an address
that meets the alignment requirements of its object type.

[end quotation]

Accessing an object at an improperly aligned address is undefined
behavior.

Ah, language law...

What Jack means is: Pointing at an object at an address the implementation
defines as improper is undefined.

If a DWORD is indeed the legendary "double word", or a quad, and the chip is
x86, and the C++ implementation implements this, the behavior is defined.

Don't do it anyway.
 
__PPS__

Thanks guys for your replies, they helped a lot. From now on I will
avoid this problem.

Why I used it:
I wrote a simple class for creating and sending RADIUS packets, a RADIUS
client (RFC 2865), to make it faster and easier to use in an open-source
project that runs on almost every Unix, Windows, Mac OS, Solaris,
etc.
I defined a RADIUS PDU like this:

NOTE: it should be exactly 4096 bytes in size to avoid misaligned
fields in some places. If all the fields are chars, hopefully it always
will be (unless some compiler decides it should be larger).

class something {
    union {
        unsigned char as_raw_data[4096];
        struct {
            unsigned char code;
            unsigned char identifier;
            unsigned char length[2];
            unsigned char authenticator[16];
            unsigned char pdu[4076]; // this later contains the list of attributes
        } as_pdu;
    } data;
public:
    ....
    ....
};


When preparing a packet to be sent, sometimes the unsigned char
authenticator[16]; field is set to 16 random bytes. I do it using a
Mersenne Twister pseudorandom generator like this (random is a static
instance of a class):
*(DWORD*)(&data.as_pdu.authenticator[0]) = random;
*(DWORD*)(&data.as_pdu.authenticator[4]) = random;
*(DWORD*)(&data.as_pdu.authenticator[8]) = random;
*(DWORD*)(&data.as_pdu.authenticator[12]) = random;

As you can see, authenticator sits at a 4-byte offset within data. Will
data (the name of the structure) itself be aligned to 4 bytes, or might
it be unaligned? I'm going to change my code to reflect your comments, but
for the sake of better knowledge I have other questions.
If I defined the PDU like this:
union {
    unsigned char as_raw_data[4096];
    struct {
        unsigned char code;
        unsigned char identifier;
        unsigned char length[2];
        union {
            unsigned char authenticator_as_chars[16];
            unsigned int authenticator_as_ints[4]; // each 4 bytes...
        };
        unsigned char pdu[4076];
    } as_pdu;
} data;

I probably wouldn't have this problem with:
authenticator_as_ints[0] = random;
authenticator_as_ints[1] = random;
authenticator_as_ints[2] = random;
authenticator_as_ints[3] = random;

BUT would my structure still be 4096 bytes in total? (It looks like it
is on the systems that I test on, but what about others?)


//////////////////////////////

Jack said:
I don't know, you haven't provided a definition of the type DWORD,
which is not a standard C++ type.

So then, how could you possibly answer my question??

Jack said:
Who says it is "perfectly legal"?

What I was sure of is that a simple (DWORD*)pointer cast makes no
difference by itself, as the value of the pointer is not changed. But
accessing the pointed-to data as char or as DWORD does make a difference
(at least for some processors). The differences I meant are not about
performance, they are about errors.
If I didn't express myself clearly, sorry.


DWORD was intended to indicate 4 bytes. It wasn't about being standard
or not, and most people understood what I meant.
 
Phlip

__PPS__ said:
Thanks guys for your replies, they helped a lot. From now on I will
avoid this problem.

"Avoid" it by only doing it inside one function. Keep the rest of your
program ignorant of low-level data issues.
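
One way to read that advice is sketched below: a single function owns the byte-level work and the rest of the program just calls it. The name fill_authenticator and the NextU32 callable are hypothetical. The poster's Mersenne Twister interface is not shown in the thread, so any callable returning a 32-bit value is assumed here, and std::uint32_t stands in for DWORD.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fill a 16-byte authenticator with four 32-bit random values.
// next_u32 is any callable returning std::uint32_t (an assumed stand-in
// for the poster's random object). Each value is copied in with memcpy,
// so the alignment of auth does not matter.
template <typename NextU32>
void fill_authenticator(unsigned char (&auth)[16], NextU32 next_u32) {
    for (std::size_t i = 0; i < sizeof auth; i += sizeof(std::uint32_t)) {
        const std::uint32_t r = next_u32();
        std::memcpy(auth + i, &r, sizeof r);
    }
}
```

A call such as fill_authenticator(data.as_pdu.authenticator, rng) would then replace the four casted assignments, with rng being whatever callable wraps the generator.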

However...
class something {
    union {
        unsigned char as_raw_data[4096];
        struct {
            unsigned char code;
            unsigned char identifier;
            unsigned char length[2];
            unsigned char authenticator[16];
            unsigned char pdu[4076]; // this later contains the list of attributes
        } as_pdu;
    } data;
public:
    ...
    ...
};


When preparing a packet to be sent, sometimes the unsigned char
authenticator[16]; field is set to 16 random bytes. I do it using a
Mersenne Twister pseudorandom generator like this (random is a static
instance of a class):
*(DWORD*)(&data.as_pdu.authenticator[0]) = random;
*(DWORD*)(&data.as_pdu.authenticator[4]) = random;
*(DWORD*)(&data.as_pdu.authenticator[8]) = random;
*(DWORD*)(&data.as_pdu.authenticator[12]) = random;

You did not present a reason to index authenticator at byte addresses, so
you ought to do this:

__PPS__ said:
I probably wouldn't have this problem with:
authenticator_as_ints[0] = random;
authenticator_as_ints[1] = random;
authenticator_as_ints[2] = random;
authenticator_as_ints[3] = random;

BUT would my structure still be 4096 bytes in total? (It looks like it
is on the systems that I test on, but what about others?)

The padding between the data members of a POD is up to the implementation.
Plenty of compilers provide #pragma pack() to suppress padding.
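
Since padding is left to the implementation, one option is to let the compiler confirm the layout instead of hoping. The sketch below uses static_assert (C++11 or later, so not available when this thread was written) on a struct with the same fields as the poster's as_pdu. The type name RadiusPdu is made up for illustration.

```cpp
#include <cstddef>

// Same fields as the poster's as_pdu struct; all members are
// unsigned char, so padding is not expected but is not ruled out either.
struct RadiusPdu {
    unsigned char code;
    unsigned char identifier;
    unsigned char length[2];
    unsigned char authenticator[16];
    unsigned char pdu[4076];
};

// Fail the build, rather than send malformed packets, if a compiler
// lays the struct out differently from the wire format.
static_assert(sizeof(RadiusPdu) == 4096, "unexpected padding in RadiusPdu");
static_assert(offsetof(RadiusPdu, authenticator) == 4, "authenticator moved");
static_assert(offsetof(RadiusPdu, pdu) == 20, "pdu moved");
```

The same checks can be applied to the variant that overlays unsigned int authenticator_as_ints[4] on the authenticator, where alignment-driven padding becomes more plausible.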
 
