Int to char[4]


Kai-Uwe Bux

Jerry said:
[ ... ]
I think I see your point now: although the standard guarantees that
memory consists of bytes and each byte is individually addressable, and
although it guarantees that an unsigned char has size 1, which means it
is exactly one byte, there is no guarantee that unsigned char has no
alignment requirement, i.e., the standard does not guarantee that each
byte can be addressed by means of a pointer to unsigned char.

Actually, I don't see anything that even says every byte is individually
addressable. It says a char is one byte, and everything else is composed
of bytes, but I don't see anything that guarantees that those bytes are
all individually addressable.

I was thinking of [1.7/1]: ... Every byte has a unique address.


[very interesting non-controversial material snipped]


Best

Kai-Uwe Bux
 

Jerry Coffin

[ ... ]
I was thinking of [1.7/1]: ... Every byte has a unique address.

I'm pretty sure I didn't express what I was trying to say very well.
What I was trying to say is that this requires the implementation to
_assign_ an address to every byte, but doesn't guarantee that you can
_do_ anything with an arbitrary address. For example, a machine could
have an address range in which the smallest item it could read or write
was 32 bits. Assuming its char was 8 bits, each of those 32-bit items
would have to be given four different addresses -- even if you could
only read or write anything via one of them.

As I said, I seriously doubt anybody intended or wanted that, but I
can't think of anything in the standard it would violate.
 

Kai-Uwe Bux

Jerry said:
[ ... ]
I was thinking of [1.7/1]: ... Every byte has a unique address.

I'm pretty sure I didn't express what I was trying to say very well.
What I was trying to say is that this requires the implementation to
_assign_ an address to every byte, but doesn't guarantee that you can
_do_ anything with an arbitrary address. For example, a machine could
have an address range in which the smallest item it could read or write
was 32 bits. Assuming its char was 8 bits, each of those 32-bit items
would have to be given four different addresses -- even if you could
only read or write anything via one of them.

Hm, now you have confused me again. How would pointer arithmetic work on
that machine for

char banner[20];
for ( char * iter = banner; iter != banner + 20; ++iter ) {
    *iter = 0;   // clear one char at a time
}

(or whatever, I am not good at raw arrays and pointers so there may be syntax
issues).

I think the array is contiguous and the running pointer is not supposed to
hit any trap values within the loop.


Best

Kai-Uwe Bux
 

Jerry Coffin

Jerry Coffin wrote:

[ ... ]

Note that I said "an address range" -- this wouldn't necessarily be the
case with all its memory.
Assuming its char was 8 bits, each of those 32-bit items
would have to be given four different addresses -- even if you could
only read or write anything via one of them.

Hm, now you have confused me again. How would pointer arithmetic work on
that machine for

char banner[20];
for ( char * iter = banner; iter != banner + 20; ++iter ) {
    *iter = 0;   // clear one char at a time
}

(or whatever, I am not good at raw arrays and pointers so there may be syntax
issues).

The implementation has two choices. The first: if you've defined
something as a char, it allocates it in a different address range that
allows byte-level access. In that case your code works fine, but
consider something like:

int x;
char *y = reinterpret_cast<char *>(&x);   // a cast is needed; int* does not convert to char* implicitly

y[1] = 0;

In this case, we've defined an int, so it allocates memory that's only
accessible on word boundaries. Since we'll assume it's four bytes, it
has to assign an address to each of those four bytes, so 'y[1]'
generates a meaningful address, but I don't see anything that says the
assignment has to work -- in fact, it seems to me that the alignment
rules allow it to fail. That byte has to have an address, but nothing
says we can use it. If we originally allocated the memory as an array of
char (or used malloc, etc.) then it's required to be aligned so we can
use it in this fashion -- but if we allocate it statically or
automatically for a non-char type, the access may be misaligned.
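
For comparison, one way to get the bytes of an int into a char buffer without writing through a char pointer into the int's own storage is to copy the object representation with std::memcpy. This is only a sketch: the helper names int_to_bytes and bytes_to_int are made up for illustration, the buffer is sized with sizeof(int) rather than a hard-coded 4, and the byte order you get is whatever the platform uses.

```cpp
#include <array>
#include <cstring>

// Hypothetical helper: copy the object representation of an int
// into an array of unsigned char, one element per byte.
std::array<unsigned char, sizeof(int)> int_to_bytes(int value) {
    std::array<unsigned char, sizeof(int)> bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}

// Hypothetical inverse: rebuild the int from the same bytes.
int bytes_to_int(const std::array<unsigned char, sizeof(int)>& bytes) {
    int value;
    std::memcpy(&value, bytes.data(), sizeof value);
    return value;
}
```

Here the destination is itself an array of char-sized objects, so in the scenario above it would live in memory that allows byte-level access; how memcpy reads the int is left to the implementation.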

The second approach is to fudge it: read a whole 32-bit word (or
whatever) and carry out byte-level operations inside of the registers,
using bit masks, anding/oring, etc., to make the right things happen.
For example, your code above could be encoded something like:

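sub r1, r1              // r1 = byte offset into banner, starting at 0
loop:
mov r0, banner[r1]      // read the whole 32-bit word
and r0, 0xffffff00      // clear byte 0
and r0, 0xffff00ff      // clear byte 1
and r0, 0xff00ffff      // clear byte 2
and r0, 0x00ffffff      // clear byte 3
mov banner[r1], r0      // write the word back
add r1, 4               // advance to the next word
cmp r1, 20
jne loop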

Of course any decent compiler would figure out that the four consecutive
ANDs set the whole thing to zero, and that it was never using the
previous value, so this would almost certainly end up just writing full
words. A more interesting case would be something like:

char x[10];

x[1] += 2;

For this, the compiler would have to generate something like:

mov r0, x[0]           // read the word at x[0]; it includes x[1]
mov r1, r0             // make a copy
and r1, 0x0000ff00     // isolate the byte to be modified
shr r1, 8              // shift it so that byte is at the bottom of the reg.
add r1, 2              // do the actual addition
and r1, 0x000000ff     // discard any carry out of the byte
and r0, 0xffff00ff     // mask byte out of original word
shl r1, 8              // shift copy back where it belongs
or r0, r1              // or the result back into the original word
mov x[0], r0           // store the result, including 3 unchanged bytes

If x were declared volatile, however, this probably wouldn't conform
anymore, since it generates reads and writes of values that aren't
changed in the source code.
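
The word-level read-modify-write described above can be mimicked in portable C++ to see what the masks and shifts do. This is a rough sketch, not what any particular compiler emits: the function names read_byte, write_byte and add_to_byte are invented for illustration, a 32-bit word with 8-bit bytes is assumed, and byte 0 is taken to be the least significant byte, matching the masks in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Extract byte `index` (0 = least significant) from a 32-bit word.
std::uint8_t read_byte(std::uint32_t word, unsigned index) {
    assert(index < 4);
    return static_cast<std::uint8_t>((word >> (8 * index)) & 0xffu);
}

// Return `word` with byte `index` replaced by `value`;
// the other three bytes are carried over unchanged.
std::uint32_t write_byte(std::uint32_t word, unsigned index, std::uint8_t value) {
    assert(index < 4);
    const unsigned shift = 8 * index;
    const std::uint32_t mask = std::uint32_t{0xff} << shift;
    return (word & ~mask) | (std::uint32_t{value} << shift);
}

// Emulate `x[index] += addend` when only whole words can be loaded or stored.
std::uint32_t add_to_byte(std::uint32_t word, unsigned index, std::uint8_t addend) {
    // Truncate to 8 bits so a carry cannot leak into the neighbouring byte.
    const std::uint8_t sum =
        static_cast<std::uint8_t>(read_byte(word, index) + addend);
    return write_byte(word, index, sum);
}
```

For instance, add_to_byte(0x11223344, 1, 2) touches only the second byte and yields 0x11223544, yet the whole word is read and rewritten, which is exactly what makes the volatile case problematic.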
 
