Casting int from char pointer

Test

I have:

int xx;
char *buf;

buf is pointed to a string with int values, for example:

buf=(char *)malloc(4);
buf[0]=0;
buf[1]=0;
buf[2]=178;
buf[3]=3;
etc.

I want xx to have the int value of 946 from buf[2] and buf[3].
I can do it like this:

memcpy( &xx, &buffer[2], 2 );

or

((char *)(&xx))[0]=buffer[2];
((char *)(&xx))[1]=buffer[3];


Both look clumsy. How would I cast it more elegantly.

My target is win32 if that matters.
 
Morris Keesan

I have:

int xx;
char *buf;

buf is pointed to a string with int values, for example:

buf=(char *)malloc(4);
buf[0]=0;
buf[1]=0;
buf[2]=178;
buf[3]=3;
etc.

I want xx to have the int value of 946 from buf[2] and buf[3].
I can do it like this:

memcpy( &xx, &buffer[2], 2 );

or

((char *)(&xx))[0]=buffer[2];
((char *)(&xx))[1]=buffer[3];


Both look clumsy. How would I cast it more elegantly.

Avoiding non-portable assumptions about endian-ness and int sizes,
why not
xx = (buf[2] << 8) + buf[3];
My target is win32 if that matters.
Best not to write code for which the target machine would matter.
 
Ben Pfaff

Test said:
I want xx to have the int value of 946 from buf[2] and buf[3].
I can do it like this:

memcpy( &xx, &buffer[2], 2 );

or

((char *)(&xx))[0]=buffer[2];
((char *)(&xx))[1]=buffer[3];


Both look clumsy. How would I cast it more elegantly.

My target is win32 if that matters.

"int" is only 2 bytes on Win32? I am surprised. I would have
guessed 4. If "int" is not exactly 2 bytes long, then neither of
your approaches will work. If it is, on the other hand, then I'd
suggest the first solution.
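
To illustrate the size problem: copying 2 bytes into a 4-byte int leaves the other 2 bytes of xx untouched. A minimal sketch of one way around that, assuming uint16_t from <stdint.h> is available and that int is wider than 16 bits (as on Win32), is to copy into a 16-bit temporary and then widen. The helper name load_u16_native is made up for this example, and the result still depends on the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy exactly two bytes into
   a 16-bit object, then widen to int. The value obtained depends on the
   host's byte order. Assumes int is wider than 16 bits. */
int load_u16_native(const unsigned char *p)
{
    uint16_t tmp;
    memcpy(&tmp, p, sizeof tmp);  /* fills every byte of tmp, nothing left over */
    return (int)tmp;
}
```

With the bytes 178 and 3 this gives 946 on a little-endian host and a different value on a big-endian one.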
 
Martin Ambuhl

I have:

int xx;
char *buf;

buf is pointed to a string with int values, for example:

buf=(char *)malloc(4);
^^^^^^^^
The cast is unnecessary, and most unnecessary things introduced into
code are unwise. The cast tells anyone reading this that there is
something special which requires it. Don't cause unnecessary headaches
for anyone reading your code.
buf[0]=0;
buf[1]=0;
buf[2]=178;
^^^
This is not portable. If you want the buffer to contain unsigned
chars, declare buf as a pointer to unsigned char.
buf[3]=3;
etc.

I want xx to have the int value of 946 from buf[2] and buf[3].
I can do it like this:

memcpy(&xx,&buffer[2], 2 );

No, you can't. This presumes
1) a particular size for char
2) a particular size for int
3) a particular byte order within ints
4) a particular signedness for char
((char *)(&xx))[0]=buffer[2];
((char *)(&xx))[1]=buffer[3];

No, you can't, for the same reasons. Besides being unreasonably messy.
Both look clumsy. How would I cast it more elegantly.

Stop thinking about casts. You simply shift the value in buffer[3] an
appropriate number of bits and add the value in buffer[2]. No casts, no
silly pointer tricks, no memcpy.
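
The shift-and-add described above might be sketched as follows, assuming the buffer is declared as unsigned char and the low-order byte comes first (as in the 178, 3 example). The function name read_le16 is invented for illustration. The single cast is only there to keep the shift in unsigned arithmetic.

```c
/* Hypothetical helper: combine two bytes, low-order byte first.
   p[1] is converted to unsigned before shifting so the shift is
   never performed on a negative or overflowing signed value. */
unsigned read_le16(const unsigned char *p)
{
    return ((unsigned)p[1] << 8) | p[0];
}
```

For the bytes 178 and 3 this yields 3 * 256 + 178 = 946, independent of the host's byte order.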
 
Keith Thompson

John Gordon said:
Isn't sizeof(char) guaranteed to be 1?

Yes.

But the code in question appeared to assume that an int is 2 bytes.
I suppose one could say either that (a) it assumes int is 2 bytes,
or (b) it assumes char is 8 bits and int is 16 bits.

(Incidentally, int is 32 bits, or 4 8-bit bytes, in win32, which is the
OP's environment.)

Determining exactly which questionable assumptions the code made is
probably less useful than actually fixing it.
 
James Kuyper

Isn't sizeof(char) guaranteed to be 1?

Yes, but I think he was referring to the size in bits, rather than the
size in bytes. CHAR_BITS can be greater than 8.
 
Martin Ambuhl

Isn't sizeof(char) guaranteed to be 1?

Indeed, sizeof(char) == 1, but that is precisely in units of
sizeof(char). There are, as I'm sure you know, more meanings for the
size of a region of storage than simply the result of the operator
sizeof. How large, for example, would a char be on a GE 645
running Multics? The result of sizeof(char) would certainly be 1, but
on the GE 645 the norm for character storage was four 9-bit chars in
a 36-bit word. His assumption of octets would surely land him in hot water.
 
Ike Naar

On 04/19/2011 03:59 PM, John Gordon wrote:
Yes, but I think he was referring to the size in bits, rather than the
size in bytes. CHAR_BITS can be greater than 8.

Nit: it's CHAR_BIT .
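
For readers who want to see the distinction in code: sizeof counts in units of char, while CHAR_BIT from <limits.h> gives the number of bits in one char. A small sketch (the function name int_storage_bits is made up here) that reports how many bits of storage an int occupies, which may include padding bits on unusual implementations:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: storage size of int in bits.
   sizeof(int) is measured in chars; CHAR_BIT is bits per char (at least 8). */
size_t int_storage_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}
```

On Win32, with 8-bit chars and 4-byte ints, this would be expected to return 32.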
 
Ben Bacarisse

Morris Keesan said:
I have:

int xx;
char *buf;

buf is pointed to a string with int values, for example:

buf=(char *)malloc(4);
buf[0]=0;
buf[1]=0;
buf[2]=178;
buf[3]=3;
etc.
Avoiding non-portable assumptions about endian-ness and int sizes,
why not
xx = (buf[2] << 8) + buf[3];

It's not as simple as that. A couple of posters have suggested that
this is all that is needed, but to avoid assumptions about int sizes you
need to work harder.

(1) char could be signed and buf[2] negative so the shift could be
undefined.

(2) If we make buf unsigned char, it will still probably promote to int
so the arithmetic value of the shift may be unrepresentable in the
result type. That also makes it undefined.

(3) If we cast to unsigned int (i.e. (unsigned int)buf[2] << 8) the
result may be too large to assign to an int without triggering
implementation defined behaviour or (worse) the raising of an
implementation defined signal.

There are, of course, ways round all this but, frankly, I'd write what
you have done 99 times out of 100. I've only had to do this once in a
library that needed to give strong portability assurances, and I did it
your way there too, but that was in K&R1 days so I could just wave my
hands and talk about what will "probably happen on most systems".

All of the above problems assume C99. In C90 ("ANSI C") the wording is
more vague, so while buf[2] << 8 can't be undefined, the exact value of
a signed shift is not 100% clear to me.
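
One possible way round the three points above, offered only as a sketch: read the bytes as unsigned char, do the arithmetic in unsigned long, and return long, which is guaranteed to be able to hold every value from 0 to 65535. The name read_le16_long and the low-byte-first order are assumptions chosen to match the 178, 3 example. The 0xFF masks keep only 8 bits per element in case CHAR_BIT is greater than 8.

```c
/* Hypothetical helper addressing points (1) to (3):
   (1) unsigned char elements are never negative;
   (2) the shift is done on unsigned long, so it cannot overflow a signed type;
   (3) the result (at most 65535) always fits in long, so the final
       conversion is well defined even where int is only 16 bits. */
long read_le16_long(const unsigned char *p)
{
    unsigned long lo = p[0] & 0xFFu;
    unsigned long hi = p[1] & 0xFFu;
    return (long)((hi << 8) | lo);
}
```

Whether this extra care is worth it is the judgment call described above; on a platform with 32-bit int, returning int would work equally well.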

<snip>
 
Test

Morris Keesan said:
I have:

int xx;
char *buf;

buf is pointed to a string with int values, for example:

buf=(char *)malloc(4);
buf[0]=0;
buf[1]=0;
buf[2]=178;
buf[3]=3;
etc.

I want xx to have the int value of 946 from buf[2] and buf[3].
I can do it like this:

memcpy( &xx, &buffer[2], 2 );

or

((char *)(&xx))[0]=buffer[2];
((char *)(&xx))[1]=buffer[3];


Both look clumsy. How would I cast it more elegantly.

Avoiding non-portable assumptions about endian-ness and int sizes,
why not
xx = (buf[2] << 8) + buf[3];
My target is win32 if that matters.
Best not to write code for which the target machine would matter.

For my purpose
xx = (buf[3] << 8) + buf[2];

was correct. Thank you for your post.
This code was ported from 16-bit to 32-bit and xx never exceeds FF FF. After some
testing it seems that the old clumsy-looking code is slightly faster.
 
James Kuyper

Morris Keesan said:
I have:

int xx;
char *buf;

buf is pointed to a string with int values, for example:

buf=(char *)malloc(4);
buf[0]=0;
buf[1]=0;
buf[2]=178;
buf[3]=3;
etc.

I want xx to have the int value of 946 from buf[2] and buf[3].
I can do it like this:

memcpy( &xx, &buffer[2], 2 );

or

((char *)(&xx))[0]=buffer[2];
((char *)(&xx))[1]=buffer[3];


Both look clumsy. How would I cast it more elegantly.

Avoiding non-portable assumptions about endian-ness and int sizes,
why not
xx = (buf[2] << 8) + buf[3];
My target is win32 if that matters.
Best not to write code for which the target machine would matter.

For my purpose
xx = (buf[3] << 8) + buf[2];

was correct. Thank you for your post.
This code was ported from 16-bit to 32 and xx never exceeds FF FF. After some
testing it seems that the old clumsy looking is slightly faster.

Which do you prefer: getting the wrong answer fast, or the right answer
more slowly?
The clumsy code will give the wrong results on a wide variety of
systems. If you're absolutely certain that this code will never be
ported to any machine where it will produce incorrect results, leave it
as is. Please note that such certainty is usually self-delusional. If
you're actually in touch with reality, use the shift-and-add approach,
which is far more portable, even if it turns out to be slower on some
platforms.
 
Ben Bacarisse

James Kuyper said:
Morris Keesan said:
I have:

int xx;
char *buf;
((char *)(&xx))[0]=buffer[2];
((char *)(&xx))[1]=buffer[3];
For my purpose
xx = (buf[3] << 8) + buf[2];

was correct. Thank you for your post.
This code was ported from 16-bit to 32 and xx never exceeds FF FF. After some
testing it seems that the old clumsy looking is slightly faster.

Which do you prefer: getting the wrong answer fast, or the right answer
more slowly?
The clumsy code will give the wrong results on a wide variety of
systems. If you're absolutely certain that this code will never be
ported to any machine where it will produce incorrect results, leave it
as is. Please note that such certainty is usually self-delusional. If
you're actually in touch with reality, use the shift-and-add approach,
which is far more portable, even if it turns out to be slower on some
platforms.

But this code may also fail on some systems because char might be
signed. To the OP: change buf to be an unsigned char array and you
remove one of the pitfalls from this shifting code.
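
If changing the declaration of buf is not practical, a similar effect can be had by converting each element to unsigned char before shifting. A sketch under the assumption that int is wider than 16 bits (as on Win32); the function name read_le16_from_char is hypothetical:

```c
/* Hypothetical helper for an existing plain-char buffer.
   Where char is signed, a stored byte of 178 typically reads back as -78,
   and (buf[1] << 8) + buf[0] would then give 768 - 78 = 690 instead of 946.
   Converting each element to unsigned char first avoids that.
   Assumes int is wider than 16 bits, so values up to 65535 fit. */
int read_le16_from_char(const char *buf)
{
    unsigned lo = (unsigned char)buf[0];
    unsigned hi = (unsigned char)buf[1];
    return (int)((hi << 8) | lo);
}
```

With the OP's data this would be called as read_le16_from_char(&buf[2]).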
 
