size_t, ssize_t and ptrdiff_t

James Kuyper · Oct 15, 2013

On 15-Oct-13 13:56, glen herrmannsfeldt wrote: ....
When you redirect with < or >, the OS connects stdin or stdout to the
named file rather than the console; it's still a _file_. Using "cat"
meant that stdin and stdout were connected to a _pipe_ instead, which
gives fseek() and ftell() well-defined behavior that apparently didn't
crash the program.

I had thought that pipes were, in the relevant senses, equivalent to
files. I can't say that I've ever knowingly used either '<' or '|' to
send input to a program that would use fseek() on its input file. In my
experience, programs that do that sort of thing don't do it to either
stdin or stdout - they open the relevant file by name. What precisely is
the relevant difference between those two methods of passing to stdin,
in terms of what's supposed to happen when fseek() is called?

AFAIK, there was no need for programs to be "linked with a special
option" to get access to fseek64()/ftell64(); those should have been
included in the normal 32-bit libc as soon as the OS itself supported
large files. Likewise, the 64-bit libc should have supported large
files from the start, via both interfaces.

fseek64() and ftell64() are not reserved names as far as C is concerned.
Strictly conforming code can use such identifiers for functions with
external linkage, without worrying about conflicting with the POSIX
functions of the same name. Whatever options are needed to make that
possible could not be used when building a program which actually needed
to use the POSIX versions.

There are a few possibilities I can see: ....
3. cat didn't use fseek() or ftell() at all.

I can't come up with any reason why it would need to.

glen herrmannsfeldt · Oct 15, 2013

(snip, I wrote)

(snip)

(snip)

I wasn't really looking for the symptoms, but the cause, and more
precisely, how the cause of those symptoms was fixed.

Yes, but I don't understand why that made a difference - I would have
thought that any fseek() or ftell() occurring in "program" above that
would cause problems when executing

The problem occurs even in programs that don't use fseek() or ftell().

Maybe someone was being too careful, but as well as I know it
(some years later) it was protecting against programs that might
use fseek() or ftell() even if they don't actually do it.

program < file1 > file2

would cause the exact same problem when doing

cat file1 | program | cat > file2

How was re-direction of program output in unix handled such that the way
"cat" is written determines whether or not an fseek() in "program" will
fail? I would not have expected the way "cat" was written to matter, so
long as it actually does what "cat" is supposed to do.

Why would "cat" ever need to use fseek64() or ftell64()? As far as I can
see, it never needs to keep more than one character of input in memory
at a time, and never has any need to skip forward or backward through
either the input or output files.

The program I wrote also didn't use fseek() or ftell() (or the 64
bit offset versions) but still failed at 2GB.

Seems it was a Solaris feature.

-- glen

James Kuyper · Oct 15, 2013

On 10/15/2013 04:56 PM, glen herrmannsfeldt wrote:
....

The problem occurs even in programs that don't use fseek() or ftell().

Maybe someone was being too careful, but as well as I know it
(some years later) it was protecting against programs that might
use fseek() or ftell() even if they don't actually do it.

So, what feature did "program" possess such that

program < file1 > file2

would succeed, while

cat file1 | program | cat > file2

would fail? I find it quite mysterious that the presence of "cat" in
that command line would make a difference, unless "cat" were
malfunctioning, and that doesn't seem to be what you're suggesting.

Malcolm McLean · Oct 15, 2013

On 10/15/2013 04:56 PM, glen herrmannsfeldt wrote:

So, what feature did "program" possess such that

program < file1 > file2

would succeed, while

cat file1 | program | cat > file2

would fail? I find it quite mysterious that the presence of "cat" in
that command line would make a difference, unless "cat" were
malfunctioning, and that doesn't seem to be what you're suggesting.

Under Unix, pipes fill until some limit, usually very large, is reached.
But only if you do IO in buffered mode, that is, using the fopen,
fclose, fputc and "as if" interface. If you use open and write, with a
file id rather than a FILE *, you turn the buffering off.
So depending on how cat and the shell are written, the buffering modes
could be different. Whilst everything should still work, if the files
are huge, something somewhere might break on one but not the other.

Keith Thompson · Oct 15, 2013

James Kuyper said:
On 10/15/2013 04:56 PM, glen herrmannsfeldt wrote:
...

So, what feature did "program" possess such that

program < file1 > file2

would succeed, while

cat file1 | program | cat > file2

would fail? I find it quite mysterious that the presence of "cat" in
that command line would make a difference, unless "cat" were
malfunctioning, and that doesn't seem to be what you're suggesting.

Sounds like an OS bug.

Eric Sosman · Oct 15, 2013

Sounds like an OS bug.

Sounds like a hazy memory.

Les Cargill · Oct 15, 2013

Malcolm said:
Under Unix, pipes fill until some limit, usually very large, is reached.
But only if you do IO in buffered mode, that is, using the fopen,
fclose, fputc and "as if" interface. If you use open and write, with a
file id rather than a FILE *, you turn the buffering off.
So depending on how cat and the shell are written, the buffering modes
could be different. Whilst everything should still work, if the files
are huge, something somewhere might break on one but not the other.

You can use setbuf() on FILE * to turn off buffering. It's clunky but it
works.
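
A minimal sketch of what that looks like, assuming a hosted C implementation. setbuf(stream, NULL) has the same effect as the setvbuf() call below but returns nothing, so the sketch uses setvbuf() to be able to report failure. The helper name make_unbuffered is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: switch a stream to unbuffered mode.
 * It has to be called after the stream is opened and before any other
 * operation is performed on it.
 * Returns 0 on success, nonzero if the request cannot be honored. */
int make_unbuffered(FILE *stream)
{
    /* Same effect as setbuf(stream, NULL), but with a return value. */
    return setvbuf(stream, NULL, _IONBF, 0);
}
```

Calling make_unbuffered(stdout) as the first statement of main() would make every putc() go straight to the underlying descriptor, at a noticeable cost in system calls.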

James Kuyper · Oct 15, 2013

Sounds like an OS bug.

Possibly - but it could also be a shell bug, since '<' and '|' are
features of the shell rather than of the OS itself.

But what I'm asking for is details about the bug.

Ken Brody · Oct 16, 2013

On 10/15/2013 4:56 PM, glen herrmannsfeldt wrote:
[...]

The program I wrote also didn't use fseek() or ftell() (or the 64
bit offset versions) but still failed at 2GB.

Seems it was a Solaris feature.

Either the filesystem itself couldn't handle >2GB files, or check out "man
ulimit".

However, that wouldn't explain why you could pipe to "cat >filename" and
have it work, since cat would have the same restrictions. Is it possible
that the pipe version also failed at 2GB, but cat didn't give any error?
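
For anyone who wants to check the limit from inside a program rather than from the shell, here is a rough sketch using the POSIX getrlimit() call, which is what "ulimit -f" reports on (the shell usually shows blocks, this shows bytes). The helper name print_file_size_limit is an invention for this example.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: print the soft file-size limit of this process.
 * Returns 0 on success, -1 if getrlimit() fails. */
int print_file_size_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_FSIZE, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }

    if (lim.rlim_cur == RLIM_INFINITY)
        puts("file size limit: unlimited");
    else
        printf("file size limit: %llu bytes\n",
               (unsigned long long)lim.rlim_cur);
    return 0;
}
```

Under POSIX, a write that would exceed this limit raises SIGXFSZ, or fails with EFBIG if that signal is blocked, caught or ignored, so a limit near 2GB would affect "program" and "cat" alike.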

Stephen Sprunk · Oct 16, 2013

Possibly - but it could also be a shell bug, since '<' and '|' are
features of the shell rather than of the OS itself.

The shell just reconnects stdin/stdout to the indicated place and then
fork()s and exec()s the indicated program; it's not in existence anymore
after that point, so it's unlikely that it could cause said program to
crash (or not crash).
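
Roughly, and only as a sketch, this is what a shell does for "program < file". It assumes a POSIX system; run_with_stdin is a hypothetical name, and a real shell handles far more (output redirection, pipes, job control, signals).

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with stdin redirected from `path`.
 * Returns the child's exit status, or -1 on failure in the parent. */
int run_with_stdin(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: make descriptor 0 refer to the file, then become
         * the program. The program inherits an ordinary file. */
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            _exit(127);
        if (fd != STDIN_FILENO) {
            if (dup2(fd, STDIN_FILENO) < 0)
                _exit(127);
            close(fd);
        }
        execvp(argv[0], argv);
        _exit(127);              /* only reached if exec failed */
    }

    /* Parent (the "shell"): it only waits and never touches the data. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For "cat file | program" the only difference in the child is that descriptor 0 is the read end of a pipe() instead of the result of open().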

S

Stephen Sprunk · Oct 16, 2013

I had thought that pipes were, in the relevant senses, equivalent to
files.

Unix's "everything's a file" abstraction is quite leaky: it holds only
so long as the file operations you're performing on a non-file make
sense for that type of non-file. Most programs just do simple reads or
writes, so you can redirect them to non-files without encountering these
leaks, which is why the abstraction is so powerful.

What precisely is the relevant difference between those two methods
of passing to stdin, in terms of what's supposed to happen when
fseek() is called?

"program < foo" connects stdin to a real file, whereas "cat foo |
program" connects "program"'s stdin to a pipe masquerading as a file.

IIRC, if you try to fseek() on a pipe, socket, device, etc. (i.e.
anything that isn't really a file), it is defined to be a no-op. There
might be an error code, but it won't (directly) crash the program.

I can't say that I've ever knowingly used either '<' or '|'
to send input to a program that would use fseek() on its input file.
In my experience, programs that do that sort of thing don't do it to
either stdin or stdout - they open the relevant file by name.

Many Unix programs will interpret the filename "-" as stdin/stdout or
default to using stdin/stdout if no filename is given. The logic that
deals with the data is usually elsewhere and might assume it was dealing
with a real file (due to the abstraction), including doing things like
fseek().
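
A sketch of that convention, with made-up helper names (open_input, close_input). Whatever code receives the FILE * afterwards cannot tell from the type alone whether it got a real file or stdin.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open `name` for reading, treating a missing
 * name or "-" as "use stdin". Returns NULL if fopen() fails. */
FILE *open_input(const char *name)
{
    if (name == NULL || strcmp(name, "-") == 0)
        return stdin;
    return fopen(name, "r");
}

/* Hypothetical helper: close a stream obtained from open_input(),
 * leaving stdin open for the rest of the program. */
int close_input(FILE *in)
{
    if (in == NULL || in == stdin)
        return 0;
    return fclose(in);
}
```

Data-handling code that then calls fseek() on the result works for "program foo" and for "program < foo", but not for "cat foo | program".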

fseek64() and ftell64() are not reserved names as far as C is
concerned. Strictly conforming code can use such identifiers for
functions with external linkage, without worrying about conflicting
with the POSIX functions of the same name. Whatever options are
needed to make that possible could not be used when building a
program which actually needed to use the POSIX versions.

It's been ages since I've developed on Solaris, but the usual Unix
practice is to put nearly everything into libc as "weak" symbols. If
you have a function of your own called "fseek64()", that will be a
"strong" symbol. As you might have guessed from the names, the linker
will prefer a "strong" symbol over a "weak" one when resolving a call.
That way, everything works as expected.

Headers (even the Standard ones!) often include some non-standard
functions and types by default; you must #define various things to slim
them down (if desired). But most of the cruft gets stuffed into other
headers, e.g. ones defined by POSIX, even if the functions themselves
reside in libc.

I can't come up with any reason why it would need to.

Nor can I, but I am regularly amazed at the "creativity" of other
programmers. I'm too lazy to go read the source for Solaris's cat, if
it's even available, so I'm hedging my bets.

S

James Kuyper · Oct 16, 2013

The shell just reconnects stdin/stdout to the indicated place and then
fork()s and exec()s the indicated program; it's not in existence anymore
after that point, so it's unlikely that it could cause said program to
crash (or not crash).

Until I get a more detailed explanation of how it failed, I can't rule
out the possibility that incorrect handling of the process you describe
might be part of the problem. I know a little bit about Unix internals,
but what I thought I knew is inconsistent with the described symptoms,
so there's presumably something I understand incorrectly - and I still
haven't seen an explanation that makes it clear what it is that I've
misunderstood.

glen herrmannsfeldt · Oct 16, 2013

(snip regarding files larger than (or equal to) 2GB.)

Either the filesystem itself couldn't handle >2GB files,
or check out "man ulimit".

This would have been Solaris 2.6 or 2.7, both SPARC and IA32.
(We had both running, with all files on a common NFS server.)
At that time both ufs and NFS3 supported files larger than 2GB.

However, that wouldn't explain why you could pipe to
"cat >filename" and have it work, since cat would have the
same restrictions. Is it possible that the pipe version also
failed at 2GB, but cat didn't give any error?

It was a feature. To avoid breaking existing programs that only
could fseek()/ftell() with signed 32 bit values, such programs
were only allowed to write (or, I believe, read) files smaller
than 2GB. As the OS doesn't know in advance when a program might
fseek() or ftell(), it seems that they didn't wait until it
was too late. System programs, such as cat, were rewritten (or maybe
just recompiled). I believe that they have to use fopen64()
instead of regular fopen().

If you search for "large file summit" and maybe also solaris, I
believe it is well described, though maybe not this detail.

-- glen

Keith Thompson · Oct 16, 2013

Stephen Sprunk said:
"program < foo" connects stdin to a real file, whereas "cat foo |
program" connects "program"'s stdin to a pipe masquerading as a file.

IIRC, if you try to fseek() on a pipe, socket, device, etc. (i.e.
anything that isn't really a file), it is defined to be a no-op. There
might be an error code, but it won't (directly) crash the program.

Calling fseek() on a non-seekable file-like thing is an error,
and fseek() will report that error. Not doing so would be very
bad behavior, and would make it difficult or impossible for some
programs to operate correctly.

ISO C only requires it to return 0 on success, and some non-zero
value for a request that cannot be satisfied, and does not mention
setting errno, though like any library function it's permitted to
set errno.

POSIX also specifies that fseek() returns 0 on success; if
it fails, it returns -1 and sets errno to indicate the error.
For a non-seekable device, errno will be set to ESPIPE. (As for
any library function, the value of errno after a successful call
is meaningless.)

[...]

Stephen Sprunk · Oct 17, 2013

It was a feature. To avoid breaking existing programs that only
could fseek()/ftell() with signed 32 bit values, such programs
were only allowed to write (or, I believe, read) files smaller
than 2GB. As the OS doesn't know in advance when a program might
fseek() or ftell(), it seems that they didn't wait until it
was too late. System programs, such as cat, were rewritten (or maybe
just recompiled). I believe that they have to use fopen64()
instead of regular fopen().

If you search for "large file summit" and maybe also solaris, I
believe it is well described, though maybe not this detail.

For the gory details, from the Solaris OS team no less:
http://unix.business.utah.edu/doc/os/solaris/misc/largefiles.pdf

In a nutshell, a 32-bit* program using open()/fopen() on a large file
would fail with EOVERFLOW, whereas using open64()/fopen64() would
succeed. 64-bit* programs could use either.

(* In this case, bitness refers to the width of a long, so ILP64 or
I32LP64 systems would count as 64-bit, but IL32LLP64 ones wouldn't.)
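
As a sketch of the other route the large file interface offers: instead of calling the *64() functions explicitly, a program can ask for a 64-bit off_t at compile time and use the POSIX fseeko()/ftello() pair, which take off_t rather than long. This assumes a platform that honors _FILE_OFFSET_BITS (glibc and Solaris document it); file_size is a hypothetical helper.

```c
/* Assumption: the platform honors _FILE_OFFSET_BITS. It has to be
 * defined before the first system header is included. */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: size of a file in bytes, or -1 on error.
 * With a 32-bit off_t, fopen() on a large file would be the call
 * that fails (EOVERFLOW), before any seek is attempted. */
off_t file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    off_t size = -1;
    if (fseeko(f, 0, SEEK_END) == 0)
        size = ftello(f);

    fclose(f);
    return size;
}
```

Plain fseek()/ftell() stay limited by the width of long, which is why the footnote above defines "bitness" that way.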

S

Seebs · Oct 17, 2013

In a nutshell, a 32-bit* program using open()/fopen() on a large file
would fail with EOVERFLOW, whereas using open64()/fopen64() would
succeed. 64-bit* programs could use either.

I once introduced a bug similar to that with pseudo, although that's
getting far into the "well of course that doesn't work" category.

-s

Albert van der Horst · Oct 25, 2013

In a new language, you don't really want untidy features such as these. I
think even in C itself, they were bolted on decades later. The problems they
are trying to solve can be appreciated, but how do other languages deal with
them?

A higher level of abstraction. It is surprising to see how well the very old
definition of ALGOL68 (that was from before you were born) holds up.
It even caters for multiprocessing in calculations with exact specifications
about what part of statements are executed concurrently. We are just entering
that stage about now.

Groetjes Albert
