Standard-compliant seek with long long?

David Mathog

I recently ran into a problem where a data file downloaded
from another site contained more than 4 GB of data, so
the index file to items within that data went from unsigned
4-byte integers to unsigned 8-byte integers. Naturally this
broke my code, which uses fseek() and so can only offset by
a long, which on the target OS is a 4-byte integer.

There are ways around this using OS calls, but
as far as I can tell the C99 standard offers no way to write
code that can jump to an arbitrary offset in this type of
large data file. Is there any movement in the standards
community towards solving this problem in the not too
distant future?

Solaris offers fseeko (fseek with offset of type
off_t) and various other lf64 extensions, but the man pages
didn't indicate that it was anything other than a Sun
specific solution.

Thanks,

David Mathog
 
Gordon Burditt

David Mathog said:
I recently ran into a problem where a data file downloaded
from another site contained more than 4 GB of data, so
the index file to items within that data went from unsigned
4-byte integers to unsigned 8-byte integers. Naturally this
broke my code, which uses fseek() and so can only offset by
a long, which on the target OS is a 4-byte integer.

There are ways around this using OS calls, but
as far as I can tell the C99 standard offers no way to write
code that can jump to an arbitrary offset in this type of
large data file. Is there any movement in the standards
community towards solving this problem in the not too
distant future?

There are a couple of different standards-based solutions
(both of which require possibly-incompatible changes to
the implementation):

(1) make long bigger than 32 bits.
or
(2) fgetpos() and fsetpos(), which allow for the position to be
contained in what might be a struct containing a (maybe 256k-bit)
track, sector, cylinder, disk number, IPv8 IP address, etc.

Gordon L. Burditt
 
tedu

Gordon said:
There are a couple of different standards-based solutions
(both of which require possibly-incompatible changes to
the implementation):

(1) make long bigger than 32 bits.
or
(2) fgetpos() and fsetpos(), which allow for the position to be
contained in what might be a struct containing a (maybe 256k-bit)
track, sector, cylinder, disk number, IPv8 IP address, etc.

Since you can only fsetpos() to a location where you once called
fgetpos(), this doesn't really help with random seeking unless you
first fseek() through the file, recording positions as you go.
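A minimal sketch of that "record positions first" idea, using only standard C calls. The helper name record_bookmarks and the stride parameter are hypothetical, and the pass reads byte by byte purely to keep the example short; a real pass over a multi-gigabyte file would read in larger blocks.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: walk the stream from the start and store an fpos_t
 * bookmark at the beginning of every `stride`-byte block.
 * Returns the number of bookmarks stored (at most max_marks). */
static size_t record_bookmarks(FILE *fp, size_t stride,
                               fpos_t *marks, size_t max_marks)
{
    size_t n = 0;

    assert(fp != NULL && stride > 0);
    rewind(fp);

    while (n < max_marks) {
        fpos_t here;

        if (fgetpos(fp, &here) != 0)
            break;
        /* Keep the bookmark only if at least one byte follows it. */
        if (fgetc(fp) == EOF)
            break;
        marks[n++] = here;

        /* Skip the rest of this block by reading through it. */
        for (size_t i = 1; i < stride; i++) {
            if (fgetc(fp) == EOF)
                return n;
        }
    }
    return n;
}
```

Once the table is built, fsetpos(fp, &marks[k]) returns to the start of block k, and an fseek() with SEEK_CUR can then cover the remainder within the block. The obvious cost is one full sequential pass before any random access is possible.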

fseeko() is part of POSIX, so it should be available on most platforms
(but outside strict C).
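For illustration, a sketch of seeking to an 8-byte index entry with fseeko(). The helper name seek_abs64 is hypothetical. Defining _FILE_OFFSET_BITS as 64 is a large-file convention honoured by glibc and Solaris to widen off_t on 32-bit builds; it is an assumption about the platform, not something POSIX guarantees everywhere.

```c
/* Assumption: the platform honours this large-file macro (glibc, Solaris).
 * It must appear before any system header is included. */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: seek to an absolute offset taken from a 64-bit index.
 * Returns 0 on success, -1 if the offset does not fit in off_t or if
 * fseeko() itself fails. */
static int seek_abs64(FILE *fp, uint64_t offset)
{
    assert(fp != NULL);

    /* Refuse offsets this platform's off_t cannot represent. */
    if (sizeof(off_t) < 8 && offset > (uint64_t)INT32_MAX)
        return -1;
    if (offset > (uint64_t)INT64_MAX)
        return -1;

    return fseeko(fp, (off_t)offset, SEEK_SET);
}
```

The matching ftello() returns the current position as an off_t, so the pair replaces fseek()/ftell() one for one.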
 
David Mathog

tedu said:
fseeko() is part of POSIX, so it should be available on most platforms
(but outside strict C).

Thanks, I guess allowing POSIX extensions isn't too much of a stretch.

Still...

Can any of the standards folks explain why ANSI C does not
have fseeko (or equivalent) so that we can write
standard compliant ANSI C code that can randomly access the largest
files supported on a given OS?

I can understand the historical basis for fseek() using only
longs but not, at this late date, why something like fseeko is
not part of the current C standard.

Regards,

David Mathog
 
Keyser Soze

tedu said:
Since you can only fsetpos() to a location where you once called
fgetpos(), this doesn't really help with random seeking unless you
first fseek() through the file, recording positions as you go.

fseeko() is part of POSIX, so it should be available on most platforms
(but outside strict C).

Remember that fseek() can move the file position relative to the beginning, end, or current position of the file.

You only run into the 4 GB file size limit when the fseek() origin is SEEK_SET or SEEK_END. With SEEK_CUR you can move
the file position forward by (2^31-1) bytes and backward by (2^31) bytes from the current position.

By combining fseek() with fgetpos() and fsetpos() it should be possible to index any file in 4 GB regions.

You will need to check that the C runtime of your target OS supports files larger than 4 GB.

To avoid some of the nasty surprises that fseek() has, you should open the file as a binary stream. Text streams tend to have
issues when using fseek().
 
Walter Roberson

Keyser Soze said:
Remember that fseek() can move the file position relative to the beginning, end, or current position of the file.
You only run into the 4 GB file size limit when the fseek() origin is SEEK_SET or SEEK_END. With SEEK_CUR you can move
the file position forward by (2^31-1) bytes and backward by (2^31) bytes from the current position.
By combining fseek() with fgetpos() and fsetpos() it should be possible to index any file in 4 GB regions.

From the C standard's description of fseek():

"A binary stream need not meaningfully support fseek calls with
a whence value of SEEK_END."

"For a text stream, either offset shall be zero, or offset shall be
a value returned by an earlier call to the ftell function on the
same stream and whence shall be SEEK_SET."


The first of these means that you cannot move to arbitrary positions
relative to the end of a binary stream.

The second of these means that you cannot index text streams in 4 GB
regions -- you only said that text streams had "issues".


I suppose it would be possible to count the number of times that one
must seek forward by 2^31-1 bytes, and then the further count forward
one must go relative to that, but this would seem to be a poor way to
win a war.
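For what it's worth, the counting approach can be sketched as below. The helper name seek_forward_chunked and its step parameter are hypothetical. Whether a given C library actually lets the position advance past LONG_MAX this way is implementation-specific, so treat this as an illustration of the idea rather than a portable fix.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reach an absolute offset using only fseek(), by
 * rewinding and then hopping forward with SEEK_CUR in pieces of at most
 * `step` bytes (step must be in the range 1..LONG_MAX).
 * Returns 0 on success, -1 as soon as any fseek() call fails. */
static int seek_forward_chunked(FILE *fp, uint64_t offset, long step)
{
    assert(fp != NULL && step > 0);

    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;

    while (offset > 0) {
        /* Take a full step, or whatever remains if that is smaller. */
        long hop = (offset > (uint64_t)step) ? step : (long)offset;

        if (fseek(fp, hop, SEEK_CUR) != 0)
            return -1;
        offset -= (uint64_t)hop;
    }
    return 0;
}
```

Calling seek_forward_chunked(fp, target, LONG_MAX) on a binary stream uses the largest hop a long can express. A library without large-file support may still fail partway, which is why every fseek() return value is checked.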
 