The C standard guarantees that INT_MAX is at least 32767, and the size
of any scalar type will always be less than this (at least in the real
world). Rather than casting the result of sizeof to "unsigned long",
it's simply easier to cast it to "int". In the case at hand, we know
that the result of "sizeof(float)" is guaranteed to fit within type
int (again, in the real world).
(Topic change.) This "int vs. size_t" question reminds me of what I
don't really like about the printf() family:
- The return value signals the number of characters transmitted (if no
error occurred). While strlen() returns a size_t, printf() and co.
return an int.
- Same for %n.
- Same for the "*" field width and precision.
(C99 7.19.6.1 The fprintf function, p15:
----v----
Environmental limits
The number of characters that can be produced by any single conversion
shall be at least 4095.
----^----
Does this mean one can't portably pass a string to a single %s if
strlen() returns at least 4096 for that string?)
I have the (very superficial) impression that unsigned integers have
historically been under-used in favor of signed integers. For example,
the (not standard C) BSD socket interfaces historically took many
"size parameters" (obvious candidates for the sizeof operator) as ints:
- accept(): 3rd param
- bind(): 3rd param
- connect(): 3rd param
- getpeername(): 3rd param
- getsockname(): 3rd param
- getsockopt(): 5th param
- recvfrom(): 6th param
- sendto(): 6th param
- setsockopt(): 5th param
If one opens the manual page for accept() on a GNU/Linux distribution,
something like this should come up:
----v----
The third argument of accept was originally declared as an `int *' (and
is that under libc4 and libc5 and on many other systems like BSD 4.*,
SunOS 4, SGI); a POSIX 1003.1g draft standard wanted to change it into a
`size_t *', and that is what it is for SunOS 5. Later POSIX drafts
have `socklen_t *', and so do the Single Unix Specification and glibc2.
Quoting Linus Torvalds: _Any_ sane library _must_ have "socklen_t" be
the same size as int. Anything else breaks any BSD socket layer stuff.
POSIX initially _did_ make it a size_t, and I (and hopefully others,
but obviously not too many) complained to them very loudly indeed.
Making it a size_t is completely broken, exactly because size_t very
seldom is the same size as "int" on 64-bit architectures, for example.
And it _has_ to be the same size as "int" because that's what the BSD
socket interface is. Anyway, the POSIX people eventually got a clue,
and created "socklen_t". They shouldn't have touched it in the first
place, but once they did they felt it had to have a named type for some
unfathomable reason (probably somebody didn't like losing face over
having done the original stupid thing, so they silently just renamed
their blunder).
----^----
I believe historically wrong parameter types should be fixed in
standards, and it was right to change those parameter types to size_t in
the first place, because size_t is the return type of the sizeof
operator. If for whatever reason it was necessary to match int's size,
"unsigned" would still fit better than "int".
Now if one writes code simultaneously for SUSv1 (UNIX 95) and SUSv2
(UNIX 98) or later, all such function calls need preprocessor magic or
the following ugly but useful "technique":
{
    struct msghdr dummy;
    struct sockaddr_in addr;
    int acc_sock;

    dummy.msg_namelen = sizeof addr;
    acc_sock = accept(sock, (struct sockaddr *)&addr, &dummy.msg_namelen);
}
This works because the type of the msg_namelen member changed from
size_t (SUSv1) to socklen_t (SUSv2+) in parallel with the other
parameters listed above. (Not surprisingly, as msg_namelen otherwise
communicates an address size.)
Similarly, the not standard C fcntl() / F_SETFL allows (or requires)
the programmer to manipulate a bitmask. Why is the mask represented as
an int instead of an unsigned? Let's suppose we may have opened a FIFO
in nonblocking mode and now we want to ensure blocking behavior:
{
    int opts;

    opts = fcntl(fd, F_GETFL);
    if (-1 == opts /* query could have been incorporated here */
            || -1 == fcntl(fd, F_SETFL, opts & ~O_NONBLOCK)) {
        (void)fprintf(stderr, "%s: fcntl(): %s\n", progname,
                strerror(errno));
    }
}
O_NONBLOCK is positive (the value returned by fcntl() / F_GETFL is
non-negative if no error occurs). It must be representable by an int
(see the return type again), so the integer promotions leave it at
type int inside the bitwise complement. ~O_NONBLOCK is no trap
representation, but it will most probably be negative. Then we bitwise
AND that negative value with the current flags. Ugly.
-o-
read() and write() take a size_t parameter for the number of bytes to
be read/written, but return ssize_t so that -1 can be returned to
signal an error. Consequently, only min { desired number of bytes,
SSIZE_MAX } can be portably passed in a single call.
I think all such functions should return -1 or 0 for error or success
respectively, and store the actual byte count through a
programmer-supplied pointer:
    int made_up_fprintf(size_t *wr, FILE *strm, const char *fmt, ...);
    int made_up_read(size_t *rd, int fd, void *buf, size_t nbyte);
("restrict" omitted for simplicity.)
fcntl() / F_GETFL takes a variable number of arguments anyway, so it
could return the current file status flags and file access modes through
a pointer using the current prototype. The type carrying the flags
should be unsigned int.
-o-
While I'm on the subject of what I perceive as illogical interfaces:
the hierarchy of the BSD socket address structures / functions is
wrong. Consider:
{
    struct sockaddr_in addr;

    addr.sin_family = AF_INET;
    addr.sin_port = htons(12345);
    addr.sin_addr.s_addr = inet_addr("192.168.1.3");
}
Part of the TCP/IP (v4) protocol stack looks like this:
1) internet layer (eg. IP addresses and transport protocol selection)
2) transport layer (eg. TCP/UDP ports, dependent on the protocol
selected above)
The very existence of the "port" notion depends on the protocol
selected at the internet layer. There are other IP-based protocols
besides TCP and UDP, with a different "port" notion or none at all
(eg. ICMP). The order of "specialization" in an API should reflect
the underlying protocol structure.
1a) internet layer: address(es)
1b) internet layer: transport protocol
--------
2) transport layer: protocol-specific stuff, eg. port(s)
In reality, we have

    socket(AF_INET, SOCK_STREAM, 0);
or
    socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
or
    socket(AF_INET, SOCK_DGRAM, 0);
or
    socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

or similar. Then we provide local or remote IP address(es) and TCP/UDP
port(s) in the second step (bind() / connect()). This corresponds to
1b) internet layer: transport protocol
--------
1a) internet layer: address(es)
2) transport layer: protocol-specific stuff, eg. port(s)
I'm not saying this doesn't work -- it does, and I like network
programming (see http://lacos.hu for my few tiny toys, which are
nevertheless of great use to me). I do claim that most of the BSD
socket interface is a very non-intuitive black book of magic
incantations. See "Hobbit"'s comments in the netcat source (IIRC),
for example.
.... I think this was a bit off-topic, sorry.
Cheers,
lacos