Playing around with byte order


Frederick Gotham

What do you think of the following code for setting and retrieving the
value of bytes in an unsigned integer? The least significant byte has
index 0, the next least significant byte has index 1, and so on. The
code computes the byte order of the unsigned integer at runtime, but alas
it would be better if it could be determined at compile time. The code
potentially invokes undefined behaviour if an unsigned integer contains
padding bits. Hypothetically, it would be possible to work with an
unsigned integer type which contains padding, so long as the number of
value bits is an exact multiple of CHAR_BIT.

I've written such code before in C++, making use of classes and the
like... but I have to say I found it very enjoyable to write it in C.

I also welcome any nit-picks whatsoever...


#include <stddef.h>

typedef struct ByteIndexes {

    size_t indexes[ sizeof(unsigned) ];

} ByteIndexes;


#include <limits.h>

ByteIndexes DetermineIndexes(void)
{
    unsigned guinea_pig = 0;
    const unsigned char *p = (unsigned char*)&guinea_pig;
    const unsigned char * const p_over = (unsigned char*)(&guinea_pig + 1);

    ByteIndexes bi;
    size_t *p_struct = bi.indexes;

    unsigned byte_number = 1;

    /* Store the value n in the byte of significance n (byte 0 stays 0). */
    do guinea_pig |= byte_number << CHAR_BIT * byte_number;
    while ( ++byte_number != sizeof(unsigned) );

    /* Read the bytes back in memory order. */
    do *p_struct++ = *p++;
    while( p != p_over );

    return bi;
}

size_t LSBIndexToByteIndex( size_t const i )
{
    static ByteIndexes bi;

    static unsigned first_time = 1;

    /* Build the table once, on the first call. */
    if (first_time) first_time = 0, bi = DetermineIndexes();

    return bi.indexes[i];
}

unsigned char GetByte(const unsigned * const pu, size_t const LSB_index)
{
    const unsigned char * const p = (unsigned char*)pu;

    return p[ LSBIndexToByteIndex(LSB_index) ];
}

void SetByte(unsigned * const pu, size_t const LSB_index,
             unsigned char const val)
{
    unsigned char * const p = (unsigned char*)pu;

    p[ LSBIndexToByteIndex(LSB_index) ] = val;
}

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    unsigned i;

    SetByte( &i, 0, 59 );
    SetByte( &i, 1, 58 );
    SetByte( &i, 2, 57 );
    SetByte( &i, 3, 56 );

    printf( "LSB: Byte 0: %u\n"
            " Byte 1: %u\n"
            " Byte 2: %u\n"
            " Byte 3: %u\n\n",
            GetByte(&i,0),
            GetByte(&i,1),
            GetByte(&i,2),
            GetByte(&i,3) );

    SetByte( &i, 0, 1 );
    SetByte( &i, 1, 2 );
    SetByte( &i, 2, 3 );
    SetByte( &i, 3, 4 );

    printf( "LSB: Byte 0: %u\n"
            " Byte 1: %u\n"
            " Byte 2: %u\n"
            " Byte 3: %u\n\n",
            GetByte(&i,0),
            GetByte(&i,1),
            GetByte(&i,2),
            GetByte(&i,3) );

    SetByte( &i, 0, 9 );
    SetByte( &i, 1, 7 );
    SetByte( &i, 2, 5 );
    SetByte( &i, 3, 3 );

    printf( "LSB: Byte 0: %u\n"
            " Byte 1: %u\n"
            " Byte 2: %u\n"
            " Byte 3: %u\n\n",
            GetByte(&i,0),
            GetByte(&i,1),
            GetByte(&i,2),
            GetByte(&i,3) );

    system( "PAUSE" );
}
 

pete

Frederick said:
#include <stddef.h>

typedef struct ByteIndexes {

size_t indexes[ sizeof(unsigned) ];

} ByteIndexes;

#include <limits.h>

I write C files with
standard headers included at the top,
followed by non-standard header files,
followed by type declarations,
followed by prototypes,
followed by external object definitions,
followed by function definitions.
While not everybody agrees with all of that,
most people put all their standard header inclusions together,
either at the top or following the non-standard header files.

ByteIndexes DetermineIndexes(void)
{
unsigned guinea_pig = 0;
const unsigned char *p = (unsigned char*)&guinea_pig;
const unsigned char * const p_over = (unsigned char*)(&guinea_pig +
1);

ByteIndexes bi;
size_t *p_struct = bi.indexes;

unsigned byte_number = 1;

do guinea_pig |= byte_number << CHAR_BIT * byte_number;
while ( ++byte_number != sizeof(unsigned) );

do *p_struct++ = *p++;
while( p != p_over );

return bi;
}

size_t LSBIndexToByteIndex( size_t const i )
{
static ByteIndexes bi;

static unsigned first_time = 1;

if (first_time) first_time = 0, bi = DetermineIndexes();

return bi.indexes[i];
}

unsigned char GetByte(const unsigned * const pu, size_t const LSB_index)
{
const unsigned char * const p = (unsigned char*)pu;

return p[ LSBIndexToByteIndex(LSB_index) ];
}

void SetByte(unsigned * const pu,size_t const LSB_index,unsigned char
const val)
{
unsigned char * const p = (unsigned char*)pu;

p[ LSBIndexToByteIndex(LSB_index) ] = val;
}

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
unsigned i;


unsigned doesn't always have as many as four bytes
or as many as 32 bits, so byte indexes up to 3 aren't guaranteed to exist.
SetByte( &i, 0, 59 );
SetByte( &i, 1, 58 );
SetByte( &i, 2, 57 );
SetByte( &i, 3, 56 );

printf( "LSB: Byte 0: %u\n"
" Byte 1: %u\n"
" Byte 2: %u\n"
" Byte 3: %u\n\n",
GetByte(&i,0),
GetByte(&i,1),
GetByte(&i,2),
GetByte(&i,3) );

SetByte( &i, 0, 1 );
SetByte( &i, 1, 2 );
SetByte( &i, 2, 3 );
SetByte( &i, 3, 4 );

printf( "LSB: Byte 0: %u\n"
" Byte 1: %u\n"
" Byte 2: %u\n"
" Byte 3: %u\n\n",
GetByte(&i,0),
GetByte(&i,1),
GetByte(&i,2),
GetByte(&i,3) );

SetByte( &i, 0, 9 );
SetByte( &i, 1, 7 );
SetByte( &i, 2, 5 );
SetByte( &i, 3, 3 );

printf( "LSB: Byte 0: %u\n"
" Byte 1: %u\n"
" Byte 2: %u\n"
" Byte 3: %u\n\n",
GetByte(&i,0),
GetByte(&i,1),
GetByte(&i,2),
GetByte(&i,3) );

system( "PAUSE" );

Just about anything that uses a call to system() isn't portable.

You can read significant bytes portably
regardless of internal representation
and without taking the address of an object,
by value.

/* BEGIN new.c output */
The constant is 0x12345678lu
The value of the bytes from least to most significant, are:
0x78
0x56
0x34
0x12
The value of a long unsigned
with significant bytes in reversed order is 0x78563412
/* END new.c output */

/* BEGIN new.c */

#include <stdio.h>
#include <limits.h>

#define CONSTANT 0x12345678lu
#define str(s) # s
#define xstr(s) str(s)

long unsigned setLSBplus_offset
    (long unsigned number, long unsigned byte, size_t offset);

unsigned readLSBplus_offset(long unsigned number, size_t offset);

int main(void)
{
    long unsigned byte_counter = ULONG_MAX;
    long unsigned number = 0;
    size_t byte = 0;

    puts("/* BEGIN new.c output */");
    puts("The constant is " xstr(CONSTANT));
    puts("The value of the bytes "
         "from least to most significant, are:");
    /* byte_counter loses one byte's worth of bits per pass. */
    while (byte_counter != 0) {
        printf("0x%x\n", readLSBplus_offset(CONSTANT, byte));
        ++byte;
        number = setLSBplus_offset(
            number,
            readLSBplus_offset(CONSTANT, byte - 1),
            sizeof number - byte
        );
        byte_counter /= 1u << 8 << CHAR_BIT - 8;
    }
    printf(
        "The value of a long unsigned\nwith significant "
        "bytes in reversed order is 0x%lx\n", number
    );
    puts("/* END new.c output */");
    return 0;
}

long unsigned setLSBplus_offset
    (long unsigned number, long unsigned byte, size_t offset)
{
    /* Move byte up by offset byte positions. */
    while (offset-- != 0) {
        byte *= 1u << 8 << CHAR_BIT - 8;
    }
    return number | byte;
}

unsigned readLSBplus_offset(long unsigned number, size_t offset)
{
    /* Move the wanted byte down to the lowest position. */
    while (offset-- != 0) {
        number /= 1u << 8 << CHAR_BIT - 8;
    }
    /* Keep only the lowest byte. */
    return number & UCHAR_MAX;
}

/* END new.c */
 

inmatarian

While the technique may be cute, it is machine and platform dependent.
It's just as easy to perform bit shifting and bit masking. For instance:

char getbyte( const int val, const int index )
{
    return ( val >> ( index * 8 ) ) & 0xff;
}

This code, at least, is immune to the byte ordering of different
machines. However, it still hinges on the assumption that int has
a specific size.

Inmatarian
 

Frederick Gotham

inmatarian posted:

While the technique may be cute, it is machine and platform dependent.
It's just as easy to perform bit shifting and bit masking. For
instance:

char getbyte( const int val, const int index )
{
    return ( val >> ( index * 8 ) ) & 0xff;
}

This code, at least, is immune to the byte ordering of different
machines. However, it still hinges on the assumption that int has
a specific size.


Of course... I was having so much fun writing the code that I didn't
realise there was a better, less fun way of doing it!
 
