Int to char[4]

TheDrunkenDead

Hello, just wondering how I would assign a char array of four elements
to the four bytes used in an int. As of right now my code is:
cNameSize = (char)((void)NameSize);
cFileSize = (char)((void)FileSize);
Where NameSize and FileSize are the integers, and cNameSize and
cFileSize are 4 element arrays. This doesn't work.
 
Ron Natalie

Hello, just wondering how I would assign a char array of four elements
to the four bytes used in an int. As of right now my code is:
cNameSize = (char)((void)NameSize);
cFileSize = (char)((void)FileSize);
Where NameSize and FileSize are the integers, and cNameSize and
cFileSize are 4 element arrays. This doesn't work.
Neither of the above is valid. Your first issue is that you can't
assign arrays at all. The easiest way is just to use memcpy:

int NameSize;
char cNameSize[4];

memcpy(cNameSize, &NameSize, 4);
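
A minimal sketch of the memcpy approach as a round trip, assuming both directions run on the same platform (so byte order and sizeof(int) match). The helper names int_to_bytes and bytes_to_int are hypothetical, not from the thread, and the buffers are sized with sizeof(int) rather than a literal 4:

```cpp
#include <cstring>   // std::memcpy

// Hypothetical helper: copies the object representation of value into out.
void int_to_bytes(int value, char (&out)[sizeof(int)])
{
    std::memcpy(out, &value, sizeof value);
}

// Hypothetical helper: rebuilds an int from bytes produced by int_to_bytes
// on the same platform (same byte order, same sizeof(int)).
int bytes_to_int(const char (&in)[sizeof(int)])
{
    int value;
    std::memcpy(&value, in, sizeof value);
    return value;
}
```

Calling int_to_bytes(NameSize, cNameSize) fills the array, and bytes_to_int(cNameSize) recovers the original value. The order of the bytes inside the array is whatever the platform uses.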
 
Jerry Coffin

Hello, just wondering how I would assign a char array of four elements
to the four bytes used in an int. As of right now my code is:
cNameSize = (char)((void)NameSize);
cFileSize = (char)((void)FileSize);
Where NameSize and FileSize are the integers, and cNameSize and
cFileSize are 4 element arrays. This doesn't work.

You can't do anything entirely portably. About as good as it gets is:

union type_pun {
int i;
char c[sizeof(int)];
};

Officially, it gives undefined behavior, but with a typical compiler you
can write an int into i, and then access its individual bytes via c.
 
Ian Collins

Jerry said:
Hello, just wondering how I would assign a char array of four elements
to the four bytes used in an int. As of right now my code is:
cNameSize = (char)((void)NameSize);
cFileSize = (char)((void)FileSize);
Where NameSize and FileSize are the integers, and cNameSize and
cFileSize are 4 element arrays. This doesn't work.


You can't do anything entirely portably. About as good as it gets is:

union type_pun {
int i;
char c[sizeof(int)];
};

Officially, it gives undefined behavior, but with a typical compiler you
can write an int into i, and then access its individual bytes via c.
Assuming you aren't concerned with the byte order.
 
Jerry Coffin

Jerry Coffin wrote:

[ ... ]
union type_pun {
int i;
char c[sizeof(int)];
};

Officially, it gives undefined behavior, but with a typical compiler you
can write an int into i, and then access its individual bytes via c.
Assuming you aren't concerned with the byte order.

...or, perhaps more accurately, assuming you're ready to deal with byte
order on your own. If you want to avoid that, you can do something like:

for (int i=0; i<4; i++)
    byte[i] = (integer >> (i * 8)) & 0xff;

If you don't want to assume an 8-bit char, you can base the mask and
shift on std::numeric_limits<unsigned char>::digits (and possibly radix,
if you're concerned with the possibility of non-binary representations).
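
The shift loop above can be sketched without the 8-bit assumption roughly as follows. This is an illustration only: store_le is a hypothetical name, the value is taken as unsigned int so the right shift is well defined, and the sketch assumes unsigned int has no padding bits:

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: writes value into out, least significant byte first,
// independent of how the platform lays out an int in memory.
void store_le(unsigned int value, unsigned char (&out)[sizeof(unsigned int)])
{
    // Number of value bits in one byte (same as CHAR_BIT).
    const int bits = std::numeric_limits<unsigned char>::digits;
    // All-ones mask for one byte (0xff when bits == 8).
    const unsigned int mask = std::numeric_limits<unsigned char>::max();

    for (std::size_t i = 0; i < sizeof value; ++i)
        out[i] = static_cast<unsigned char>((value >> (i * bits)) & mask);
}
```

Because the byte order is fixed by the loop rather than by the hardware, the same bytes come out on little-endian and big-endian machines.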
 
Default User

Jerry said:
Hello, just wondering how I would assign a char array of four
elements to the four bytes used in an int. As of right now my code
is: cNameSize = (char)((void)NameSize);
cFileSize = (char)((void)FileSize);
Where NameSize and FileSize are the integers, and cNameSize and
cFileSize are 4 element arrays. This doesn't work.

You can't do anything entirely portably. About as good as it gets is:

union type_pun {
int i;
char c[sizeof(int)];
};

Officially, it gives undefined behavior, but with a typical compiler
you can write an int into i, and then access its individual bytes via
c.


You don't need to go through all that union stuff to do that. Perfectly
safe and portable is:

int i = 123;
unsigned char* p;

p = (char*) &i;



Brian
 
Jerry Coffin

[ ... ]
Perfectly safe and portable is:

int i = 123;
unsigned char* p;

p = (char*) &i;

Section 5.2.10/7 of the C++ standard seems to disagree.

Don't get me wrong: none of the other methods produces a specified
result either -- but I see little evidence that this one is any safer or
more portable than any of the others. Using your example, I could
perfectly reasonably have *p == 0 or *p == 123, depending on whether the
machine was little endian or big endian. Of course, with the possibility
of padding bits and such, there could be other values as well, but both
of those are _extremely_ common.
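
A small sketch of the cast-based inspection under discussion, using unsigned char* on both sides of the cast. The helper names are hypothetical. It shows why the first byte can be either 0 or the low-order value depending on the machine:

```cpp
#include <cstddef>

// Hypothetical helper: true when the lowest-addressed byte of an int holds
// its least significant bits (the usual "little endian" layout).
bool lowest_byte_is_least_significant()
{
    const int probe = 1;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&probe);
    return p[0] == 1;
}

// Hypothetical helper: copies the object representation of value into out,
// one byte at a time, in whatever order the platform stores it.
void copy_bytes(int value, unsigned char (&out)[sizeof(int)])
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof value; ++i)
        out[i] = p[i];
}
```

With value 123, the byte equal to 123 ends up at index 0 on a little-endian machine and at index sizeof(int) - 1 on a big-endian one, assuming no padding bits.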
 
Rolf Magnus

Jerry said:
[ ... ]
Perfectly safe and portable is:

int i = 123;
unsigned char* p;

p = (char*) &i;

Section 5.2.10/7 of the C++ standard seems to disagree.

Well, of course the result is unspecified, but with the union trick, it's
undefined.
Don't get me wrong: none of the other methods produces a specified
result either -- but I see little evidence that this one is any safer or
more portable than any of the others.

I agree. However, I'd still not go through the union. For one, the cast is
simpler, and it's the tool that was made for the job. Unions are not meant
to be used like that. I mean, you may be able to use a brick to hammer a
nail into a wall, but a hammer just seems to be a more "natural" choice.
 
Jerry Coffin

[ ... ]
I agree. However, I'd still not go through the union. For one, the cast is
simpler, and it's the tool that was made for the job. Unions are not meant
to be used like that. I mean, you may be able to use a brick to hammer a
nail into a wall, but a hammer just seems to be a more "natural" choice.

In this case, the correct metaphor doesn't seem to be comparing a hammer
to a brick, but comparing a chunk of reddish granite to a chunk of grey
granite, both roughly the same size and shape.

If he'd used a reinterpret_cast instead of a C-style cast, you _might_
have a point, but even then (IMO) we're just talking about a
specifically designated chunk of granite, still not really a hammer.
 
peter koch

Jerry Coffin wrote:[ ... ]
union type_pun {
int i;
char c[sizeof(int)];
};
Officially, it gives undefined behavior, but with a typical compiler you
can write an int into i, and then access its individual bytes via c.
Assuming you aren't concerned with the byte order
...or, perhaps more accurately, assuming you're ready to deal with byte
order on your own. If you want to avoid that, you can do something like:

for (int i=0; i<4; i++)
    byte[i] = (integer >> (i * 8)) & 0xff;


This probably is not what is wanted: shifting ints is implementation
defined and should normally be avoided. I would either use the union
trick (have never seen it not work) or use an unsigned int.

/Peter
 
Ron Natalie

Default said:
You don't need to go through all that union stuff to do that. Perfectly
safe and portable is:

int i = 123;
unsigned char* p;

p = (char*) &i;

You want to use unsigned char* in the cast to match the
type of p.
 
Jerry Coffin

[ ... ]
This probably is not what is wanted: shifting ints is implementation
defined and should normally be avoided.

Specifically, right-shifting an int. I wasn't thinking of that when I
posted, but you're absolutely right.
 
NagelBagel

Shifting an unsigned int into an unsigned char is probably the most
portable because in that case, both the bit order and byte order do
not matter. The union version doesn't work because the standard only
allows you to inspect the common initial sequence of two structs.
However, the cast version works.

The problem with the C-style cast is that it's interpreted as a
reinterpret_cast and it's not required that you're able to access the
object after a reinterpret_cast. The standard mentions some places
where a (static) cast to void * then to char * is OK. As long as you
don't try to interpret a character array as an int (due to alignment
issues), and use memcpy instead, there won't be any trouble. However,
both the byte and bit orders are unspecified, and there are no
guarantees on anything except that copying a sequence out of an
integer, then back into (a possibly different) integer, will make that
integer retain the same value. The integer could have padding and trap
bits which may throw off constraints of the program if given arbitrary
values. I'd stick with shifts.

There's also a caveat with using shifts with an unsigned int, copying
to chars. char could be signed, so doing this:

unsigned int x = 0xFF00;
// Equivalent to char c = char(0xFF); if CHAR_BIT is 8.
char c = char(x >> CHAR_BIT);

isn't guaranteed to produce anything specific even if CHAR_BIT is 8.
The result is implementation-defined, since 0xFF would be larger than
c can represent. There's also the fact that, even when the char has 8
bits, it doesn't have to represent -128, so it can do whatever it
wants when you try to do anything but copy that value to another char.
(Unlike signed char, it has to be valid to store that representation
because all POD objects are guaranteed to be able to be readable as a
series of chars or unsigned chars, and all bits of any char type have
to participate in value representation.)
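
One way to sidestep the signed-char caveat is to keep the bytes in unsigned char, which can hold every value from 0 to UCHAR_MAX. The following is a sketch under that assumption; second_byte and load_le are hypothetical names, and load_le assumes the bytes were stored least significant first:

```cpp
#include <climits>   // CHAR_BIT, UCHAR_MAX
#include <cstddef>

// Hypothetical helper: extracts the second-lowest byte of x.
// The result always fits in unsigned char, so no implementation-defined
// narrowing conversion is involved.
unsigned char second_byte(unsigned int x)
{
    return static_cast<unsigned char>((x >> CHAR_BIT) & UCHAR_MAX);
}

// Hypothetical helper: rebuilds an unsigned int from bytes that were
// stored least significant byte first.
unsigned int load_le(const unsigned char (&in)[sizeof(unsigned int)])
{
    unsigned int value = 0;
    for (std::size_t i = 0; i < sizeof value; ++i)
        value |= static_cast<unsigned int>(in[i]) << (i * CHAR_BIT);
    return value;
}
```

If plain char is required by an interface such as a socket API, the unsigned char buffer can be copied into it with memcpy afterwards.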
 
Jim Langston

Hello, just wondering how I would assign a char array of four elements
to the four bytes used in an int. As of right now my code is:
cNameSize = (char)((void)NameSize);
cFileSize = (char)((void)FileSize);
Where NameSize and FileSize are the integers, and cNameSize and
cFileSize are 4 element arrays. This doesn't work.

I've done it a few different ways depending on why I'm doing it and what
makes it most obvious to some future reader of my program what I'm doing and
why.

#include <cstddef>
#include <iostream>

int main ()
{
    // Method 1 - write the int through a reinterpreted pointer
    // (relies on n1 being suitably aligned for int; assumes sizeof(int) == 4)
    int p1 = 1234;
    char n1[4];
    *(reinterpret_cast<int*>( n1 )) = p1;

    // Method 2 - anonymous union, n2 overlays p2
    union
    {
        int p2;
        char n2[4];
    };
    p2 = 1234;

    // Method 3 - copy byte by byte through a char*
    int p3 = 1234;
    char n3[4];
    for ( std::size_t i = 0; i < sizeof p3; ++i )
        n3[i] = reinterpret_cast<char*>( &p3 )[i];

    // Method 4 - Don't bother converting yet.
    int p4 = 1234;

    // Just output to show they do the same thing
    for ( int i = 0; i < 4; ++i )
        std::cout << static_cast<unsigned int>( n1[i] ) << " ";
    std::cout << "\n";

    for ( int i = 0; i < 4; ++i )
        std::cout << static_cast<unsigned int>( n2[i] ) << " ";
    std::cout << "\n";

    for ( int i = 0; i < 4; ++i )
        std::cout << static_cast<unsigned int>( n3[i] ) << " ";
    std::cout << "\n";

    // Method 4 reads the bytes of p4 directly at the point of use
    for ( int i = 0; i < 4; ++i )
        std::cout << static_cast<unsigned int>(
            reinterpret_cast<char*>( &p4 )[i] ) << " ";
    std::cout << "\n";

    std::cin.get();
}

I've also written functions to do this, methods, used memcpy, etc...

It is actually very easy to get to each byte of anything.
 
Kai-Uwe Bux

Jim said:
Hello, just wondering how I would assign a char array of four elements
to the four bytes used in an int. As of right now my code is:
cNameSize = (char)((void)NameSize);
cFileSize = (char)((void)FileSize);
Where NameSize and FileSize are the integers, and cNameSize and
cFileSize are 4 element arrays. This doesn't work.

I've done it a few different ways depending on why I'm doing it and what
makes it most obvious to some future reader of my program what I'm doing
and why.

#include <iostream>
#include <string>

int main ()
{
// Method 1
int p1 = 1234;
char n1[4];
*(reinterpret_cast<int*>( n1 )) = p1;
[snip]

This one is interesting. Are you sure n1 satisfies the alignment
requirements for int? (And I wonder what happens if it does not. Will the
reinterpret_cast modify the pointer so that one gets a valid int* to an
overlapping but different memory region, will the assignment fail, or will
we see the proverbial nasal daemons?)


Best

Kai-Uwe Bux
 
Jim Langston

Kai-Uwe Bux said:
Jim said:
Hello, just wondering how I would assign a char array of four elements
to the four bytes used in an int. As of right now my code is:
cNameSize = (char)((void)NameSize);
cFileSize = (char)((void)FileSize);
Where NameSize and FileSize are the integers, and cNameSize and
cFileSize are 4 element arrays. This doesn't work.

I've done it a few different ways depending on why I'm doing it and what
makes it most obvious to some future reader of my program what I'm doing
and why.

#include <iostream>
#include <string>

int main ()
{
// Method 1
int p1 = 1234;
char n1[4];
*(reinterpret_cast<int*>( n1 )) = p1;
[snip]

This one is interesting. Are you sure n1 satisfies the alignment
requirements for int? (And I wonder what happens if it does not. Will the
reinterpret_cast modify the pointer so that one gets a valid int* to an
overlapping but different memory region, will the assignment fail, or will
we see the proverbial nasal daemons?)

I've actually contemplated that, which is why I am hesitant to use this
method in code that won't stay on one OS. The main reason I throw ints into
char arrays, and vice versa, is moving data from/to socket streams which are
char buffers. I know the next 4 bytes are a binary integer and want to get
it into an integer variable.

It's worked for me when I've used it on Windows XP in MS C++ .net 2003, but I
would think it may misbehave, or not work at all, on computers that require
ints to be aligned.

It's actually the one I tend to like the most, but have the most difficulty
deciding if it's good code or not. On a windows machine I can only envision
it being as fast as, or faster, than moving char by char (it's either going
to do it in one move, two or four). Not that speed really matters.
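
For the socket-buffer case described above, a memcpy-based sketch avoids the alignment question entirely, since memcpy works at any byte offset. The names read_int_at and write_int_at are hypothetical, and the sketch assumes sender and receiver agree on byte order and sizeof(int):

```cpp
#include <cstddef>
#include <cstring>   // std::memcpy

// Hypothetical helper: reads an int stored at byte position offset in a
// receive buffer. The caller must ensure offset + sizeof(int) bytes exist.
int read_int_at(const char* buffer, std::size_t offset)
{
    int value;
    std::memcpy(&value, buffer + offset, sizeof value);
    return value;
}

// Hypothetical helper: stores value at byte position offset in a send buffer.
void write_int_at(char* buffer, std::size_t offset, int value)
{
    std::memcpy(buffer + offset, &value, sizeof value);
}
```

Compilers commonly turn a fixed-size memcpy like this into a single load or store where the hardware allows it, so the speed concern is usually minor.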

The union I really don't like, but it is probably the most concise.
 
Greg Herlihy

Jim said:
#include <iostream>
#include <string>

int main ()
{
// Method 1
int p1 = 1234;
char n1[4];
*(reinterpret_cast<int*>( n1 )) = p1;
[snip]

This one is interesting. Are you sure n1 satisfies the alignment
requirements for int? (And I wonder what happens if it does not. Will the
reinterpret_cast modify the pointer so that one gets a valid int* to an
overlapping but different memory region, will the assignment fail, or will
we see the proverbial nasal daemons?)

No, there is no problem with alignment in the above code. However, there are
plenty of other reasons to find fault with this program: the double storage
is inefficient, while the use of reinterpret_cast is inelegant (to use as
charitable a description as possible).

However unsightly this code may be, it will not blow up when run. Because
the four chars of storage for the int were allocated by a single array
object (n1), the storage will be aligned according to the most stringent
alignment requirements of any four-byte sized type (including a four-byte
int type).

In fact, a C++ program can always be certain that as long as the size of a
character array is equal to (or larger than) the sizeof() of a POD type, the
program will be able to place an object of that type into that character
array safely.

Greg
 
Default User

Jerry said:
[ ... ]
Perfectly safe and portable is:

int i = 123;
unsigned char* p;

p = (char*) &i;

Section 5.2.10/7 of the C++ standard seems to disagree.

I don't have a copy of the standard at home, but I don't believe
there's anything undefined or otherwise unsafe about this. The layout
of the int is of course not specified by the standard, but any object
can be examined as a sequence of bytes safely.
Don't get me wrong: none of the other methods produces a specified
result either -- but I see little evidence that this one is any safer
or more portable than any of the others. Using your example, I could
perfectly reasonably have *p == 0 or *p == 123, depending on whether
the machine was little endian or big endian. Of course, with the
possibility of padding bits and such, there could be other values as
well, but both of those are extremely common.

So? That's no different from what you have, except that the cast to one
of the char* types is specified to be safe. It avoids a lot of
unnecessary business in the union. If all you want to do is examine the
byte layout of an object, casting to unsigned char* is the way to go
(don't make my error that Ron pointed out, of course).



Brian
 
Kai-Uwe Bux

Greg said:
Jim said:
#include <iostream>
#include <string>

int main ()
{
// Method 1
int p1 = 1234;
char n1[4];
*(reinterpret_cast<int*>( n1 )) = p1;
[snip]

This one is interesting. Are you sure n1 satisfies the alignment
requirements for int? (And I wonder what happens if it does not. Will the
reinterpret_cast modify the pointer so that one gets a valid int* to an
overlapping but different memory region, will the assignment fail, or
will we see the proverbial nasal daemons?)

No, there is no problem with alignment in the above code. However, there
are plenty of other reasons to find fault with this program: the double
storage is inefficient, while the use of reinterpret_cast is inelegant (to
use as charitable a description as possible).

However unsightly this code may be, it will not blow up when run. Because
the four chars of storage for the int were allocated by a single array
object (n1), the storage will be aligned according to the most stringent
alignment requirements of any four-byte sized type (including a four-byte
int type).

In fact, a C++ program can always be certain that as long as the size of a
character array is equal to (or larger than) the sizeof() of a POD type, the
program will be able to place an object of that type into that
character array safely.

Before I posted, I hit the standard to find out (because I vaguely
remembered something like this). However, all I was able to confirm is that
allocation functions return pointers suitably aligned for any object whose
size fits in there [5.3.4/10]. I did not find a corresponding guarantee for
character arrays that are not dynamically allocated. Could you point me to
the clause that says so?


Thanks

Kai-Uwe Bux
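
As a side note on the alignment question, here is a sketch that does not depend on any implicit guarantee for character arrays. It assumes a C++11 or later compiler (the thread predates alignas), and the names IntStorage and make_int_in are hypothetical:

```cpp
#include <new>   // placement new

// Hypothetical wrapper: a character array explicitly aligned for int.
struct IntStorage {
    alignas(int) char bytes[sizeof(int)];
};

// Hypothetical helper: constructs an int inside the aligned storage and
// returns a pointer to it. Placement new starts the lifetime of the int,
// so later access through the returned pointer is well defined.
int* make_int_in(IntStorage& storage, int value)
{
    return new (storage.bytes) int(value);
}
```

After int* p = make_int_in(storage, 1234), the expression *p yields 1234 and storage.bytes holds the platform's representation of that value.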
 
