Stephen Sprunk said:
Given the prototype he gave for sendto(), it's virtually certain that he's
referring to the POSIX (and Berkeley sockets) function of that name. Since
POSIX requires 8-bit bytes (and the Internet is defined in terms of octets),
that's not likely to be a problem. POSIX allows many character encodings
(unfortunately), but AFAIK all of the allowed ones are supersets of ASCII.
Citations?
ISO/IEC 9945-1: 1990 (aka IEEE Std 1003.1-1990 aka POSIX.1)
Section 2.2.2.8 defines character as "a sequence of one or more
bytes representing a single graphic symbol", and specifically notes
that the definition is equivalent to C's multibyte character.
Section 2.71 imports byte from ISO C without any restriction to
8-bit bytes.
Section 2.2.2.60 refers to the "portable filename character set"
(upper- and lower-case English letters together with the 10 digits, period,
dash, and underscore), but B.2.2.2 "General Terms" indicates:
portable filename character set: The encoding of this character set
is not specified -- specifically, ASCII is not required. But the
implementation must provide a unique character code for each of
the printable graphics specified by POSIX.1.
8.1.2.2 (Description of setlocale) indicates that,
The value "C" for locale specifies the minimal environment for
C-Language translation.
There's nothing there that would prevent the local "C" locale from using
EBCDIC as the encoding, nor that would prevent bytes from being 9 or
16 bits or whatever.
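
For what it's worth, a program that depends on either assumption can test for it instead of trusting the standard to guarantee it. The following is only a minimal sketch: the function names are mine (hypothetical, not from POSIX or ISO C), and the character test is a spot check of a few codes, not a proof that the whole execution character set is ASCII.

```c
#include <limits.h>
#include <stdio.h>

/* 1 if a byte on this implementation is exactly one octet, else 0. */
int byte_is_octet(void)
{
    return CHAR_BIT == 8;
}

/* 1 if a handful of characters have their ASCII codes, else 0.
   Spot check only: an EBCDIC system fails it ('A' is 0xC1 there),
   but passing does not prove the whole set is ASCII. */
int charset_looks_ascii(void)
{
    return 'A' == 0x41 && 'Z' == 0x5A && 'a' == 0x61 &&
           '0' == 0x30 && ' ' == 0x20 && '.' == 0x2E;
}

/* Print both findings; returns 1 only if both assumptions hold. */
int report_assumptions(FILE *out)
{
    fprintf(out, "CHAR_BIT = %d\n", CHAR_BIT);
    fprintf(out, "ASCII spot check: %s\n",
            charset_looks_ascii() ? "passed" : "failed");
    return byte_is_octet() && charset_looks_ascii();
}
```

Calling report_assumptions(stderr) once at startup, and refusing to run if it returns 0, would at least make the dependency explicit before any text buffer is handed to sendto().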
(and the Internet is defined in terms of octets),
No, it isn't. In particular, Ethernet frames are defined in terms
of bits. Look, for example, at IEEE 802's use of the preamble (some of it
is allowed to be eaten or distorted along the way, as long as enough
is left to establish signal synchronization). See also how the
interframe gap is defined, and notice that the standards allow it to
shrink.
If "the Internet" were defined in terms of octets, then a received packet
would have to start on a synchronized octet boundary, which is not the
case: many internet transports are asynchronous, with the initial
field of the preamble being used to synchronize the data clocks.