Changing endianness

James Harris

Just sticking with big- and little-endianness and ignoring PDP
endianness do these routines work in C:

http://codewiki.wikispaces.com/endianness_changing.c

Or are there precautions to take so that the 32-bit and, especially,
the 64-bit values will convert as intended? Would it be better to say
that the 64-bit values will only work where C is executing with 64-bit
integers? I've tried to keep the numeric literals to within unsigned
32-bit values but that may not be enough.

James
 
Eric Sosman

James Harris said:
Just sticking with big- and little-endianness and ignoring PDP
endianness do these routines work in C:

http://codewiki.wikispaces.com/endianness_changing.c

Or are there precautions to take so that the 32-bit and, especially,
the 64-bit values will convert as intended? Would it be better to say
that the 64-bit values will only work where C is executing with 64-bit
integers? I've tried to keep the numeric literals to within unsigned
32-bit values but that may not be enough.

The fragments look all right to me, in view of the text's
note about using them with unsigned types (maybe the note should
be in bold or something). Macro-ized implementations have the
usual advantages and the usual drawbacks, but that's not the
snippets' fault.

You've described each fragment as working on "NN-bit words,"
and left the question of type selection to the user: He's got to
find implementation-specific types of the appropriate widths in
order to use the code. If he can't find a 64-bit type, he can't
use the 64-bit SWAB, and there's an end on't.

One possible surprise concerning types: If the 16-bit type
is `unsigned short' and if USHRT_MAX <= INT_MAX, the result of
the 16-bit SWAB will be a `signed int'. In a simple assignment
context (`value = SWAB(value)') that won't hurt anything, but
in some other contexts it could make trouble.
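
To make the promotion point concrete, here is a minimal sketch. SWAB16 is a hypothetical macro written for this example (it is not the one from the wiki page), and swab16_fn is likewise an invented name. The compile-time check uses C11's static_assert.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical 16-bit swap macro, for illustration only. */
#define SWAB16(v) ((((v) & 0xff) << 8) | (((v) >> 8) & 0xff))

/* The operand is promoted before the arithmetic, so the macro's result
   is int-sized (int or unsigned int), never unsigned short. */
static_assert(sizeof SWAB16((unsigned short)0) == sizeof(int),
              "SWAB16 yields a promoted type");

/* Wrapping the macro in a function pins the result type, because the
   return statement converts back to unsigned short. */
static unsigned short swab16_fn(unsigned short v)
{
    return (unsigned short)SWAB16(v);
}
```

With the function form, a caller gets an unsigned short back in every context, at the cost of losing the macro's type-generic behaviour.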
 
Peter Nilsson

James Harris said:
Just sticking with big- and little-endianness and ignoring PDP
endianness do these routines work in C:

 http://codewiki.wikispaces.com/endianness_changing.c

Or are there precautions to take so that the 32-bit and,
especially, the 64-bit values will convert as intended?

Eric's comments notwithstanding, you can use the 'butterfly'
method to reduce the number of operations...

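#include <stdint.h>

uint64_t swab64(uint64_t x)
{
  /* swap adjacent bytes */
  x = ((x >> 8) & 0x00FF00FF00FF00FF)
    | ((x << 8) & 0xFF00FF00FF00FF00);

  /* swap adjacent 16-bit halves */
  x = ((x >> 16) & 0x0000FFFF0000FFFF)
    | ((x << 16) & 0xFFFF0000FFFF0000);

  /* swap the two 32-bit halves */
  return (x >> 32) | (x << 32);
}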

But the reality is that you're probably better off using htonll()
et al. where your platform provides them (htonll() is not a standard
function), as many implementations are able to implement these
with inline assembler.
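
One way to get a similar effect without relying on htonll() is a compiler intrinsic with a portable fallback. The sketch below assumes GCC or Clang for the __builtin_bswap64 path. The names swab64_portable, swab64_fast and swab64_check are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: the butterfly swap shown above. */
static uint64_t swab64_portable(uint64_t x)
{
    x = ((x >> 8)  & UINT64_C(0x00FF00FF00FF00FF))
      | ((x << 8)  & UINT64_C(0xFF00FF00FF00FF00));
    x = ((x >> 16) & UINT64_C(0x0000FFFF0000FFFF))
      | ((x << 16) & UINT64_C(0xFFFF0000FFFF0000));
    return (x >> 32) | (x << 32);
}

/* Hypothetical wrapper: use the compiler builtin where it exists. */
static uint64_t swab64_fast(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap64(x);   /* GCC and Clang builtin */
#else
    return swab64_portable(x);
#endif
}

/* Usage: both paths must agree. */
static void swab64_check(void)
{
    const uint64_t v = UINT64_C(0x0102030405060708);
    assert(swab64_portable(v) == UINT64_C(0x0807060504030201));
    assert(swab64_fast(v) == swab64_portable(v));
}
```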
 
Francois Grieu

Eric Sosman wrote:
The fragments look all right to me, in view of the text's
note about using them with unsigned types (maybe the note should
be in bold or something). Macro-ized implementations have the
usual advantages and the usual drawbacks, but that's not the
snippets' fault.

You've described each fragment as working on "NN-bit words,"
and left the question of type selection to the user: He's got to
find implementation-specific types of the appropriate widths in
order to use the code. If he can't find a 64-bit type, he can't
use the 64-bit SWAB, and there's an end on't.

One possible surprise concerning types: If the 16-bit type
is `unsigned short' and if USHRT_MAX <= INT_MAX, the result of
the 16-bit SWAB will be a `signed int'. In a simple assignment
context (`value = SWAB(value)') that won't hurt anything, but
in some other contexts it could make trouble.

Yes. One of the most common portability issues is that what is
currently shown won't work as expected in the following code with many
C compilers (or so I fear):

#define SWAP32(value) (SWAP32WRAPME((value)))
#define SWAP32WRAPME(value) \
((value & 0xff) << 24) | ((value & 0xff00) << 8) | \
((value >> 8) & 0xff00) | ((value >> 24) & 0xff)

unsigned long vTest = SWAP32(0x00001234);



This can be improved by

#include <limits.h>
#if UINT_MAX/3>=0x55555555
#define SWAP32(value) (SWAP32WRAPME((unsigned)(value)))
#else
#define SWAP32(value) (SWAP32WRAPME((unsigned long)(value)))
#endif
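
As a sketch of what that preprocessor test selects: UINT_MAX/3 >= 0x55555555 holds exactly when UINT_MAX >= 0xFFFFFFFF, that is, when unsigned int has at least 32 bits. The same test can pick a working type for a function version of the swap, which also evaluates its argument only once. The names u32_work, swap32_work and swap32_work_check are invented for this example.

```c
#include <assert.h>
#include <limits.h>

/* Pick an unsigned type with at least 32 bits, using the same test
   as the macro version. u32_work is a name made up for this sketch. */
#if UINT_MAX / 3 >= 0x55555555
typedef unsigned int u32_work;
#else
typedef unsigned long u32_work;
#endif

/* The parameter converts the argument to unsigned once, so every
   shift below is an unsigned shift. */
static u32_work swap32_work(u32_work v)
{
    v &= 0xFFFFFFFFUL;   /* ignore any bits above the low 32 */
    return ((v & 0xff) << 24) | ((v & 0xff00) << 8)
         | ((v >> 8) & 0xff00) | ((v >> 24) & 0xff);
}

/* Usage example. */
static void swap32_work_check(void)
{
    assert(sizeof(u32_work) * CHAR_BIT >= 32);
    assert(swap32_work(0x00001234) == 0x34120000UL);
}
```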


Francois Grieu
 
Phil Carmody

Francois Grieu said:
#define SWAP32(value) (SWAP32WRAPME((value)))
#define SWAP32WRAPME(value) \
((value & 0xff) << 24) | ((value & 0xff00) << 8) | \
((value >> 8) & 0xff00) | ((value >> 24) & 0xff)


Signed shift danger.
(And side effect danger too.)

unsigned long vTest = SWAP32(0x00001234);



This can be improved by

#include <limits.h>
#if UINT_MAX/3>=0x55555555
#define SWAP32(value) (SWAP32WRAPME((unsigned)(value)))
#else
#define SWAP32(value) (SWAP32WRAPME((unsigned long)(value)))
#endif

Or stdint types?

Phil
 
Francois Grieu

Phil Carmody wrote:
Signed shift danger.

Indeed. I was thinking mostly of less-than-32 bit int, which is a
modern reality in the embedded world.
(And side effect danger too.)


Or stdint types?

None of the four commercial C compilers (three of which are
actively maintained) that I use in production for ST7, 8051, and PIC18
derivatives supports <stdint.h>. One of these compilers (I can't
remember which) at one point had a preprocessor that required some
kludge for testing UINT_MAX.


Francois Grieu
 
James Harris

Phil Carmody said:
Signed shift danger.
(And side effect danger too.)


....

Or stdint types?

So if stdint types are used the signedness and width issues won't
apply? Or is that too simplistic?
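
For reference, a rough sketch of the stdint route, assuming a C99 implementation that provides the exact-width types (they are optional). The type then fixes the width, and a function parameter converts the argument to an unsigned type once. A uint16_t operand may still be promoted to int inside the expression, so the 16-bit result is cast back explicitly. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

static uint16_t swab16_u(uint16_t x)
{
    /* Where int is wider than 16 bits, x is promoted to int here;
       the cast truncates the result back to 16 bits. */
    return (uint16_t)((x >> 8) | (x << 8));
}

static uint32_t swab32_u(uint32_t x)
{
    /* swap adjacent bytes, then swap the two 16-bit halves */
    x = ((x >> 8) & UINT32_C(0x00FF00FF))
      | ((x << 8) & UINT32_C(0xFF00FF00));
    return (x >> 16) | (x << 16);
}

/* Usage example. */
static void swab_u_check(void)
{
    assert(swab16_u(0x1234) == 0x3412);
    assert(swab32_u(UINT32_C(0x12345678)) == UINT32_C(0x78563412));
}
```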

James
 
James Harris

Peter Nilsson said:
Eric's comments notwithstanding, you can use the 'butterfly'
method to reduce the number of operations...

uint64_t swab64(uint64_t x)
{
  x = ((x >> 8) & 0x00FF00FF00FF00FF)
    | ((x << 8) & 0xFF00FF00FF00FF00);

  x = ((x >> 16) & 0x0000FFFF0000FFFF)
    | ((x << 16) & 0xFFFF0000FFFF0000);

  return (x >> 32) | (x << 32);

}

But reality is that you're probably better off using htonll()
et al, as many implementations are able to implement these
with inline assembler.

The butterfly method seems like a good alternative. I've added it to
the page as well as some text to point out the distinction between run-
time and compile-time selection of a change of endianness.

The htonl etc routines are a bit of a pain as they confuse issues.
Better, IMHO, would be macros which convert to and from little-endian
and big-endian representations. The htonl function and its friends
manage big-endian but AIUI there are no *standard* macros or routines
to write or read little-endian numbers.
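
A minimal sketch of what such helpers could look like for 32-bit values. They work on byte buffers, so they do not depend on the host's own endianness. The names load_le32, store_le32, load_be32 and store_be32 are invented here, and the code assumes 8-bit bytes (checked with C11's static_assert).

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* The helpers assume one octet per unsigned char. */
static_assert(CHAR_BIT == 8, "8-bit bytes assumed");

/* Read a 32-bit value stored least significant byte first. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value least significant byte first. */
static void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Read a 32-bit value stored most significant byte first. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write a 32-bit value most significant byte first. */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xff);
    p[1] = (unsigned char)((v >> 16) & 0xff);
    p[2] = (unsigned char)((v >> 8) & 0xff);
    p[3] = (unsigned char)(v & 0xff);
}
```

Storing with one convention and loading with the other gives a byte swap as a side effect, for example store_le32 followed by load_be32.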

In terms of performance the ideal situation, where a change of
endianness is needed, is for the compiler to recognise the
transformation. Then it can convert it appropriately for the target
CPU. For x86 only one instruction is needed as shown at

http://codewiki.wikispaces.com/endianness_changing.nasm

(On soap box) It is normally believed that the lower level the
language the more instructions are needed. I wonder if the reality is
more that a program which converts one language to another generally
increases code size. The hand-written Assembler fragments are much
shorter than the C fragments and this is far from being a one-off.

James
 
Seebs

Given a representation of a number in n 8-bit bytes of known, fixed
endianness, find the actual value.
Given a value, find the representation of that value in 8-bit bytes of
known, fixed endianness.

This is pretty easy.

Imagine that you know that the bytes are in order 0, 1, 2, ..., n-1:
    new_val = (bytes[0] << (8 * (n - 1))) +
              (bytes[1] << (8 * (n - 2))) +
              ...
              (bytes[n - 1] << (8 * (n - n)));

Or:
    new_val = 0;
    for (int i = 0; i < n; ++i) {
        /* order[i] is the index of the i-th most significant byte */
        new_val += bytes[order[i]] << (8 * (n - 1 - i));
    }

(Cast each byte to the type of new_val before shifting, so that the
shift is not carried out in int.)

You can reverse the sense of order if you prefer, it makes no difference.

You can reverse the operation just as easily:
    (new_val >> (8 * something)) & 0xff
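
A fuller sketch of the loop described above, for up to eight 8-bit bytes. The order array lists byte indices from most to least significant. The names decode_bytes and encode_bytes and the choice of uint64_t as the value type are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the value from n bytes; order[i] is the index of the
   i-th most significant byte. */
static uint64_t decode_bytes(const unsigned char *bytes,
                             const size_t *order, size_t n)
{
    uint64_t val = 0;
    size_t i;

    assert(n <= 8);   /* the value must fit in 64 bits */
    for (i = 0; i < n; ++i)
        val |= (uint64_t)bytes[order[i]] << (8 * (n - 1 - i));
    return val;
}

/* The reverse operation: write the value out as n bytes. */
static void encode_bytes(uint64_t val, unsigned char *bytes,
                         const size_t *order, size_t n)
{
    size_t i;

    assert(n <= 8);
    for (i = 0; i < n; ++i)
        bytes[order[i]] = (unsigned char)((val >> (8 * (n - 1 - i))) & 0xff);
}
```

For a little-endian 32-bit field, order would be {3, 2, 1, 0}; for big-endian, {0, 1, 2, 3}.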

-s
 
