James Harris
(Related to a separate post about htons etc)
In endian.h gcc includes some useful names under the protection of #ifdef
__USE_BSD such as
# if __BYTE_ORDER == __LITTLE_ENDIAN
# define htobe16(x) __bswap_16 (x)
# define htole16(x) (x)
# define be16toh(x) __bswap_16 (x)
# define le16toh(x) (x)
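For concreteness, on a glibc system where those names are visible, using them looks something like this (the function name, buffer name and field offset here are made up for illustration):

/* Sketch only: assumes <endian.h> exposes the names quoted above,
   i.e. __USE_BSD (or _DEFAULT_SOURCE on newer glibc) is in effect.
   buf is assumed to hold a 16-bit little-endian field at offset 2. */
#include <stdint.h>
#include <string.h>
#include <endian.h>

static uint16_t read_le16_field(const unsigned char *buf)
{
    uint16_t raw;
    memcpy(&raw, buf + 2, sizeof raw);  /* the copy sidesteps alignment issues */
    return le16toh(raw);                /* identity on LE hosts, a swap on BE */
}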
Whether or not gcc can be persuaded to expose them in the environment I am
working in, such names are not available with all the compilers I am using. I
therefore
need to set up some similar operations and can see some 'interesting' issues
over defining them. I am sure that this kind of thing is an oft-asked
question so rather than just asking for suggestions I'll write up what I
have been considering and would much appreciate feedback. I do have some
specific issues in mind.
First and foremost, there seems to be no practical way for the
*preprocessor* to detect the endianness of the target machine. If so, the
options seem to be either to select between endiannesses in the code, as in
if little endian
...
else if big endian
...
else
...
or, alternatively, to specify the endianness when the code is compiled. I am
thinking that because each target machine would be different the object code
would have to be different for each. (Some machines such as Arm can operate
in either mode.) So it would be reasonable to produce different object
files. The compiler output directories would have to include the name of the
target architecture so that a given piece of source code could compile to
each target. Even if the object code included if-endianness tests such as
those above, only one branch of each such test would ever be used on a given
machine (in a given mode).
I think I could specify the endianness of the target by either including a
build-specific header file or by passing a symbol definition when the
compiler is invoked. If so, is either approach generally the better one to
take, or is there another way to get the same effect?
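To make that concrete, here is one way the symbol could be supplied;
TARGET_LITTLE_ENDIAN and the file names are placeholders of my own, not
anything standard:

/* Option A: pass the symbol when the compiler is invoked, e.g.
 *   cc -DTARGET_LITTLE_ENDIAN=1 -c wire.c -o build/x86/wire.o
 *   cc -DTARGET_LITTLE_ENDIAN=0 -c wire.c -o build/m68k/wire.o
 *
 * Option B: a one-line build-specific header included everywhere:
 *   #define TARGET_LITTLE_ENDIAN 1    (0 for a big-endian target)
 *
 * Either way, the portable code can refuse to build if nothing was said: */
#ifndef TARGET_LITTLE_ENDIAN
#  error "TARGET_LITTLE_ENDIAN must be defined as 0 or 1 by the build"
#endif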
Second, the use of macros is good since, as above, operations that have no
effect can clearly cost nothing at execution time. But why are the above
macro names not capitalised? I usually take capitals as a warning that
something that looks like a function call is really a macro and I need to be
careful about the type of parameter that is used. Are those names
uncapitalised because they are always safe to use as macros?
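The usual worry with macros is multiple evaluation, which a naive swap macro
illustrates (the name BSWAP16 is invented):

#include <stdint.h>

/* Naive 16-bit swap: x appears twice, so BSWAP16(*p++) would modify p
   twice and combine bytes from two different positions - exactly the
   trap that shouting capitals are meant to warn about. */
#define BSWAP16(x)  ((uint16_t)(((uint16_t)(x) >> 8) | ((uint16_t)(x) << 8)))

My guess is that glibc can afford the lower-case names because __bswap_16
typically evaluates its argument only once (via a temporary in a statement
expression or an inline function), so it behaves like a function call - but
that is part of what I am asking.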
Third, on architectures where bytes have to be swapped, C code - as with
most HLL code - can be awkward. I tried to illustrate that in the related
post mentioned at the outset. What alternatives are there to writing the
code in C? I have seen headers include inline assembly for byte swapping but
I don't like making C code so unportable. If it's C it should be C! So I am
thinking either to write the long-winded code in C or to have the macro call
a function that is implemented as a separate assembly routine. For what I am
doing there will be a separate assembly layer for each target anyway so it's
not a big departure from what the rest of the code does.
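For reference, the long-winded C version does not have to be very long;
something like this byte-buffer form would do (rd_be16 and wr_le16 are names
I am inventing here):

#include <stdint.h>

/* Read a 16-bit big-endian value from a byte buffer.  Working byte by
   byte makes it correct on any host, at the cost of the shifts and ORs
   complained about in the related post. */
static uint16_t rd_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Write a 16-bit value into a byte buffer in little-endian order. */
static void wr_le16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)(v >> 8);
}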
In summary, I would have
a macro to read a 16-bit little endian value
a macro to read a 16-bit big endian value
ditto for writing the values, ditto for any other defined integer types.
Possibly I should have a macro for reading a PDP-endian 32-bit value too, if
I wanted to do the job properly ;-)
The idea is that these macros would be no-ops on the matching architectures
and calls to separate functions where the architecture doesn't match, and
that the choice of which family of macros to use would be controlled by
something specified at compile time.
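Roughly, and with every name here (TARGET_LITTLE_ENDIAN, swap16, le16_to_host,
be16_to_host) being a placeholder of my own, the arrangement would be:

#include <stdint.h>

uint16_t swap16(uint16_t x);          /* separate C or assembly routine */

#if TARGET_LITTLE_ENDIAN              /* supplied by the build, as above */
#  define le16_to_host(x)  (x)        /* no-op on the matching target */
#  define be16_to_host(x)  swap16(x)  /* function call where it doesn't match */
#else
#  define le16_to_host(x)  swap16(x)
#  define be16_to_host(x)  (x)
#endif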
How does that lot sound?
James