Endian swaps with C++; comments please

Aaron Graham

/**
* Sample usage:
* unsigned long longvar = 0x12345678;
* unsigned long be_longvar = endian::host_to_big(longvar);
* unsigned short shortvar = 0x1234;
* unsigned short le_shortvar = endian::host_to_little(shortvar);
*/

// for std::reverse:
#include <algorithm>
#include <limits>

// for endian information:
#include <endian.h>
// Linux uses __BYTE_ORDER
// FreeBSD and Apple/Darwin use _BYTE_ORDER
// Some other BSD variants use BYTE_ORDER
#if (defined __BYTE_ORDER && __BYTE_ORDER==__BIG_ENDIAN) || \
(defined _BYTE_ORDER && _BYTE_ORDER== _BIG_ENDIAN) || \
(defined BYTE_ORDER && BYTE_ORDER== BIG_ENDIAN)
#define IS_BIG_ENDIAN 1
#else
#define IS_BIG_ENDIAN 0
#endif

namespace endian {

// This function will copy the supplied value and return a byte-swapped
// version of it. This function may/should be optimized for specific
// architectures when necessary. It may also be necessary to create
// partial specializations for certain types, since the current state
// of this function only allows some fundamental types to be swapped.
template <typename _type>
_type byteswap(_type val) {
    if (std::numeric_limits<_type>::is_specialized &&
        !std::numeric_limits<_type>::is_signed) {
        // Found a type that is specialized and is unsigned.
        switch (sizeof(_type)) {
            case 1:
                return val;
            case 2:
                return ((val & 0x00ff) << 8) | ((val & 0xff00) >> 8);
            case 4:
                return ((val & 0x000000ff) << 24) | ((val & 0x0000ff00) << 8) |
                       ((val & 0x00ff0000) >> 8) | ((val & 0xff000000) >> 24);
        }
    }
    // Swap this type using a different/fallback/hacky method:
    unsigned char* v = reinterpret_cast<unsigned char*>(&val);
    std::reverse(v, v + sizeof(_type));
    return val;
}

template <typename _type>
_type host_to_big(_type val) {
    return IS_BIG_ENDIAN ? val : byteswap(val);
}

template <typename _type>
_type host_to_little(_type val) {
    return IS_BIG_ENDIAN ? byteswap(val) : val;
}

template <typename _type>
_type big_to_host(_type val) {
    return IS_BIG_ENDIAN ? val : byteswap(val);
}

template <typename _type>
_type little_to_host(_type val) {
    return IS_BIG_ENDIAN ? byteswap(val) : val;
}

} // end namespace endian

// Don't need this definition anymore:
#undef IS_BIG_ENDIAN
 
Victor Bazarov

Aaron said:
> [...]
> switch (sizeof(_type)) {
> case 1:
> return val;
> case 2:
> return ((val & 0x00ff) << 8) | ((val & 0xff00) >> 8);

This assumes that 'sizeof' returns the number of octets. It doesn't.
It returns the number of 'bytes'. Please read up on the difference.

> case 4:
> return ((val & 0x000000ff) << 24) | ((val & 0x0000ff00) << 8) |
> ((val & 0x00ff0000) >> 8) | ((val & 0xff000000) >> 24);
> }
> }
> [..]

V
 
Howard

Aaron Graham said:
> /**
> * Sample usage:
> * unsigned long longvar = 0x12345678;
> * unsigned long be_longvar = endian::host_to_big(longvar);
> * unsigned short shortvar = 0x1234;
> * unsigned short le_shortvar = endian::host_to_little(shortvar);
> */

I've never seen a need to swap actual integer variable values. The only
time I execute any swapping code is when I'm writing out an integer-type
variable to disk (or reading it back), when that data might be read on
another platform. We decided on a standard for all integers in the files,
and all platforms must write (and read) in that format.

So, on each platform, we have read and write functions for the numeric data
types, which stream in/out the data in the order we need.

On the Mac, for example, the read and write functions simply read/write the
bytes from first memory location to last, while on Windows, we read/write
the bytes in reverse order.

This way, there's never a stored numeric variable in memory (aside from
perhaps in a buffer), which we have to worry about the "endianness" of.

-Howard
 
Aaron Graham

andrea said:
> have a look at:
>
> man htonl

I'm already very familiar with it. I was looking for a more general
solution. htonl only works for long, and htons only works for short.
What about 64-bit quantities?

And #include <netinet/in.h> brings in a lot of baggage (#defines
mostly) that is not desirable in portable C++ code. For instance, if
you #include <netinet/in.h> in vxWorks (and likely other BSD systems),
you get #defines of the following symbols: m_len, m_data, m_type,
m_flags, and many others. You can imagine what kind of problems you
would have trying to port/compile code that uses Hungarian notation
(not that I use HN).

Thanks for your suggestion.
Aaron
 
Aaron Graham

> This assumes that 'sizeof' returns the number of octets. It doesn't.
> It returns the number of 'bytes'. Please read up on the difference.

I was not familiar with the distinction. I suppose systems that use
differently-sized-bytes would have to port this function, or let it
fall back to std::reverse. I'm not averse to having to port this
function for specific architectures, as long as the porting is highly
localized. Obviously, some architectures have native endian swapping
capabilities in their instruction sets, and it would be best to take
advantage of those as well (as I said in the comments).

Thanks for your input.
Aaron
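
One way to keep the swap independent of the "byte equals octet" assumption is to build it from CHAR_BIT-wide shifts instead of hard-coded 8-bit masks. The following is only a minimal sketch, not the code posted above: the name byteswap_portable is hypothetical, and it assumes T is an unsigned integer type with no padding bits.

```cpp
#include <climits>   // CHAR_BIT, UCHAR_MAX
#include <cstddef>   // std::size_t

// Hypothetical helper (not part of the posted header).
// Reverses the order of the sizeof(T) bytes of an unsigned integer,
// whatever the width of a byte is on the target.
// Assumption: T is an unsigned integer type without padding bits.
template <typename T>
T byteswap_portable(T val) {
    if (sizeof(T) == 1) {
        return val;  // a single byte has nothing to reorder
    }
    // UCHAR_MAX has exactly CHAR_BIT one-bits, so it masks one byte.
    const T byte_mask = static_cast<T>(UCHAR_MAX);
    T result = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Move the bytes collected so far up, then append the lowest byte of val.
        result = static_cast<T>((result << CHAR_BIT) | (val & byte_mask));
        val = static_cast<T>(val >> CHAR_BIT);
    }
    return result;
}
```

Because the loop runs sizeof(T) times, the same code also covers 8-byte types, at the cost of relying on the optimizer (or an architecture-specific specialization) for speed.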
 
Aaron Graham

> I've never seen a need to swap actual integer variable values. The only
> time I execute any swapping code is when I'm writing out an integer-type
> variable to disk (or reading it back), when that data might be read on
> another platform. We decided on a standard for all integers in the files,
> and all platforms must write (and read) in that format.
>
> So, on each platform, we have read and write functions for the numeric data
> types, which stream in/out the data in the order we need.

This begs the question a little bit. Somewhere, something has to do
the endian swapping. Besides, I don't always have control over file
formats I read and write. For instance, FLAC files use big endian for
metadata blocks, but the Vorbis comment metadata block uses little
endian internally.

Aaron
 
red floyd

andrea said:
> hello,
>
> have a look at:
>
> man htonl
>
> [redacted]

1. Please do not top post.
2. htonl is a good solution, but it is not part of Standard C++. It is
a POSIX-ism that is implemented practically everywhere, but it's not in
the Standard. As such, it doesn't meet the OP's choice for a standard
C++ only solution. (of course, <endian.h> is also system specific....)
 
Aaron Graham

> [...]
> 2. htonl is a good solution, but it is not part of Standard C++. It is
> a POSIX-ism that is implemented practically everywhere, but it's not in
> the Standard. As such, it doesn't meet the OP's choice for a standard
> C++ only solution. (of course, <endian.h> is also system specific....)

htonl really _isn't_ a good solution, because it doesn't do anything on
big-endian machines. What if you're trying to read little-endian data
on a big-endian machine?

I agree that the #include <endian.h> is an ugly wart. Is there a
better way to know endianness at compile time? Is there a better
standard compiler built-in that will give you this information?

Aaron
 
red floyd

Aaron said:
> [...]
>> 2. htonl is a good solution, but it is not part of Standard C++. It is
>> a POSIX-ism that is implemented practically everywhere, but it's not in
>> the Standard. As such, it doesn't meet the OP's choice for a standard
>> C++ only solution. (of course, <endian.h> is also system specific....)
>
> htonl really _isn't_ a good solution, because it doesn't do anything on
> big-endian machines. What if you're trying to read little-endian data
> on a big-endian machine?

Oh, good point. I got fixated on putting stuff into network byte order,
and forgot the general byteswap case.

I think you'll have to go compiler dependent and use the appropriate
manifest defines, or specify a command line option (or a custom endian.h
for each target platform).
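
A rough sketch of what such a per-project configuration could look like follows. The macro name MY_HOST_BIG_ENDIAN and the namespace my_endian are hypothetical. The sketch assumes a compiler that predefines __BYTE_ORDER__, __ORDER_BIG_ENDIAN__ and __ORDER_LITTLE_ENDIAN__ (recent GCC and Clang releases do); on any other toolchain the value has to be supplied on the command line.

```cpp
// Hypothetical project-local endianness configuration.
//
// 1. An explicit command-line setting wins, e.g. -DMY_HOST_BIG_ENDIAN=1.
// 2. Otherwise fall back to the macros predefined by GCC and Clang.
// 3. Otherwise stop the build instead of guessing.
#if !defined(MY_HOST_BIG_ENDIAN)
#  if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
      defined(__ORDER_LITTLE_ENDIAN__)
#    if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#      define MY_HOST_BIG_ENDIAN 1
#    elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#      define MY_HOST_BIG_ENDIAN 0
#    else
#      error "Unsupported byte order; pass -DMY_HOST_BIG_ENDIAN=0 or 1"
#    endif
#  else
#    error "Unknown byte order; pass -DMY_HOST_BIG_ENDIAN=0 or 1"
#  endif
#endif

namespace my_endian {
// True when the target stores the most significant byte first.
inline bool host_is_big() { return MY_HOST_BIG_ENDIAN != 0; }
}  // namespace my_endian
```

Failing the build when nothing is known keeps a wrong guess from silently corrupting data on an unusual target.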
 
Gianni Mariani

Aaron said:
> htonl really _isn't_ a good solution, because it doesn't do anything on
> big-endian machines. What if you're trying to read little-endian data
> on a big-endian machine?
>
> I agree that the #include <endian.h> is an ugly wart. Is there a
> better way to know endianness at compile time? Is there a better
> standard compiler built-in that will give you this information?

Why do you need to know at compile-time ?

The compiler's optimizer can (and does on compilers I've tested)
eliminate dead code when doing a "run time" endianness check.

This is one of those classic premature optimization issues.
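
For reference, a "run time" check of this kind might look like the sketch below. The function names are hypothetical, and whether a particular compiler folds the test into a constant and drops the unused branch is something to confirm in the generated code rather than assume.

```cpp
#include <algorithm>  // std::reverse

// Hypothetical helper: inspects the first byte of a known value.
// Reading an object through unsigned char* is allowed by the aliasing rules.
inline bool host_is_little_endian() {
    const unsigned probe = 1;
    return *reinterpret_cast<const unsigned char*>(&probe) == 1;
}

// Hypothetical helper: returns val laid out most-significant byte first.
// The branch is written as ordinary run-time code; an optimizer that can
// evaluate host_is_little_endian() at compile time may remove it.
template <typename T>
T host_to_big_rt(T val) {
    if (host_is_little_endian()) {
        unsigned char* p = reinterpret_cast<unsigned char*>(&val);
        std::reverse(p, p + sizeof(T));
    }
    return val;
}
```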
 
andrea

> 2. htonl is a good solution, but it is not part of Standard C++. It is

Well, it is not the Standard but from your snippet it was clear that you
are working in a unix-like environment...

> htonl really _isn't_ a good solution, because it doesn't do anything on
> big-endian machines. What if you're trying to read little-endian data
> on a big-endian machine?

I understand the desire to generalize the code as much as possible but,
IMHO, one should mainly aim at simplicity and efficiency. Foreseeing the
possibility to read little-endian data could be good for completeness
but I would write the data in big-endian, instead.

bye,
andrea
 
Maxim Yegorushkin

Aaron said:
> I'm already very familiar with it. I was looking for a more general
> solution. htonl only works for long, and htons only works for short.
> What about 64-bit quantities?

Note that on 64-bit Linux, sizeof(long) == 8. I wonder whether htonl
operates on long rather than int32_t.
 
Aaron Graham

> Well, it is not the Standard but from your snippet it was clear that you
> are working in a unix-like environment...

Well, I'm not working in a Windows environment, anyway...

> I understand the desire to generalize the code as much as possible but,
> IMHO, one should mainly aim at simplicity and efficiency. Foreseeing the
> possibility to read little-endian data could be good for completeness
> but I would write the data in big-endian, instead.

I think my solution is simple and efficient and general. The compiler
will take care of the optimizations to the point where it's just as
efficient for longs as htonl is (more efficient, if you consider that
byteswap can be optimized for specific architectures).

It's not possible to always write files in big-endian, because I don't
dictate the endian-ness of popular file formats. If I write a wma file
using big-endian, for instance, nobody else will be able to read it.

Aaron
 
Aaron Graham

> Why do you need to know at compile-time ?

Endian swapping is used in tight loops all the time, and is commonly
used on resource-lean embedded systems (I am often in situations where
both of these points are applicable).

> The compiler's optimizer can (and does on compilers I've tested)
> eliminate dead code when doing a "run time" endianness check.

I'm not sure I understand what you're saying. If the compiler can't
determine at compile-time which branch you're going to be taking, it
can't assume there's any dead code. If you mean that endian-ness
checks that are commonly regarded as "runtime" are actually "compile
time" checks with some compilers, then I think that may be true. But
the most common one:

unsigned x = 1;
return !(*(char*)(&x));

... is not optimized away by gcc, even at the highest optimization
level (at least, not in any of the disassemblies I've looked at).
There's probably a good reason for it, but I don't know what that is.

> This is one of those classic premature optimization issues.

How do you know this? How do you know I'm not attempting to create a
good general solution to a problem where I've determined that endian
swapping is a significant contributor to slow performance?

Aaron
 
Gianni Mariani

Aaron said:
> unsigned x = 1;
> return !(*(char*)(&x));
>
> ... is not optimized away by gcc, even at the highest optimization
> level (at least, not in any of the disassemblies I've looked at).
> There's probably a good reason for it, but I don't know what that is.

In my investigations, that *was* optimized away including the dead code.

What did you test ?
 
Howard

Aaron Graham said:
> This begs the question a little bit. Somewhere, something has to do
> the endian swapping. Besides, I don't always have control over file
> formats I read and write. For instance, FLAC files use big endian for
> metadata blocks, but the Vorbis comment metadata block uses little
> endian internally.
>
> Aaron

There doesn't ever have to be any swapping, as such. All data-ordering can
be done while reading and writing. If you know the ordering of the data to
be read or written, code that into your reading and writing routines for the
specific data you're handling. And there's no need to know what your
machine's internal byte-ordering is, since you can use mask&shift (or
multiplication/division) operations, which work the same, regardless of the
internal physical byte-ordering.

I'm pretty sure this is covered in the FAQ...?

-Howard
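
The mask-and-shift approach described in this post can be sketched as follows. The helper names (load_be32, load_le32, store_be32, store_le32) are made up for illustration, and the sketch assumes the file format stores each 32-bit value as four 8-bit octets. Nothing in it depends on the host's byte order.

```cpp
// Hypothetical helpers: decode a 32-bit value from four octets in a buffer.
// unsigned long is used because it is guaranteed to hold at least 32 bits.
inline unsigned long load_be32(const unsigned char* p) {
    // Big-endian data: the first octet is the most significant.
    return (static_cast<unsigned long>(p[0]) << 24) |
           (static_cast<unsigned long>(p[1]) << 16) |
           (static_cast<unsigned long>(p[2]) << 8) |
            static_cast<unsigned long>(p[3]);
}

inline unsigned long load_le32(const unsigned char* p) {
    // Little-endian data: the first octet is the least significant.
    return  static_cast<unsigned long>(p[0]) |
           (static_cast<unsigned long>(p[1]) << 8) |
           (static_cast<unsigned long>(p[2]) << 16) |
           (static_cast<unsigned long>(p[3]) << 24);
}

// Hypothetical helpers: encode the low 32 bits of v into four octets.
inline void store_be32(unsigned char* p, unsigned long v) {
    p[0] = static_cast<unsigned char>((v >> 24) & 0xff);
    p[1] = static_cast<unsigned char>((v >> 16) & 0xff);
    p[2] = static_cast<unsigned char>((v >> 8) & 0xff);
    p[3] = static_cast<unsigned char>(v & 0xff);
}

inline void store_le32(unsigned char* p, unsigned long v) {
    p[0] = static_cast<unsigned char>(v & 0xff);
    p[1] = static_cast<unsigned char>((v >> 8) & 0xff);
    p[2] = static_cast<unsigned char>((v >> 16) & 0xff);
    p[3] = static_cast<unsigned char>((v >> 24) & 0xff);
}
```

A reader for a mixed format would then call load_be32 for the big-endian fields and load_le32 for the little-endian ones, with no endianness macro involved.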
 
Aaron Graham

Gianni said:
> In my investigations, that *was* optimized away including the dead code.
>
> What did you test ?

Okay, you're right: I tried a couple more compilers that I have
sitting around on one of my dev machines. It seems that gcc 2.95.2
does the optimization, but I can't get it to happen with the latest gcc
4.0.2 for linux-x86. Maybe a bug?

Try this:
#include <stdio.h>
void tell_endian() {
    unsigned x = 1;
    if (*(char*)&x) printf("little endian\n");
    else printf("big endian\n");
}

Doing an objdump of the results of "gcc-4.0.2 -O3 -c -o foo foo.c"
gives me this:

00000000 <tell_endian>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 18 sub $0x18,%esp
6: c7 45 fc 01 00 00 00 movl $0x1,0xfffffffc(%ebp)
d: 80 7d fc 00 cmpb $0x0,0xfffffffc(%ebp)
11: 74 15 je 28 <tell_endian+0x28>
13: 83 ec 0c sub $0xc,%esp
16: 68 00 00 00 00 push $0x0
17: R_386_32 .rodata.str1.1
1b: e8 fc ff ff ff call 1c <tell_endian+0x1c>
1c: R_386_PC32 printf
20: 83 c4 10 add $0x10,%esp
23: c9 leave
24: c3 ret
25: 8d 76 00 lea 0x0(%esi),%esi
28: 83 ec 0c sub $0xc,%esp
2b: 68 1b 00 00 00 push $0x1b
2c: R_386_32 .rodata.str1.1
30: e8 fc ff ff ff call 31 <tell_endian+0x31>
31: R_386_PC32 printf
35: 83 c4 10 add $0x10,%esp
38: c9 leave
39: c3 ret
 
Gianni Mariani

Aaron said:
> Okay, you're right: I tried a couple more compilers that I have
> sitting around on one of my dev machines. It seems that gcc 2.95.2
> does the optimization, but I can't get it to happen with the latest gcc
> 4.0.2 for linux-x86. Maybe a bug?

I changed it to:

bool tell_endian()
{
    unsigned x = 1;
    return *(char*)&x;
}

g++ 3.4.2 produces:

00000000 <_Z11tell_endianv>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: b8 01 00 00 00 mov $0x1,%eax
8: c9 leave
9: c3 ret


g++ 4.0.0 produces:

0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 10 sub $0x10,%esp
6: c7 45 fc 01 00 00 00 movl $0x1,0xfffffffc(%ebp)
d: 31 c0 xor %eax,%eax
f: 80 7d fc 00 cmpb $0x0,0xfffffffc(%ebp)
13: 0f 95 c0 setne %al
16: c9 leave
17: c3 ret

compile line:
g++ -O3 -c -o endian_test.o endian_test.cpp


Seems like a serious optimizer regression to me.

With g++ 3.4.2 it appears that it creates the right code even on -O1
level optimization.
 
