fseek past the eof


Xenos

Can anyone tell me what the standard says about using fseek (on a binary
file) to seek past the end-of-file? I can't find anything in my (draft)
copy of the standard, nor in the FAQ.

Thanks
 

Joe Wright

Xenos said:
Can anyone tell me what the standard says about using fseek (on a binary
file) to seek past the end-of-file? I can't find anything in my (draft)
copy of the standard, nor in the FAQ.

Thanks

I'm curious why you would, and what you would expect to find,
seeking past end-of-file. Perhaps I misunderstand.

Still curious, how would you know where end-of-file is and that you
are seeking past it?

The FAQ notwithstanding, the semantics of fseek deal with the
beginning of the file, the end of the file and relative offsets
within the file. The actual address is of type long. Assuming a
properly opened file (FILE *fp) of some size, ..

long bof = 0; /* no calc needed. files begin at 0 */
long eof;     /* set below from ftell */
if (fseek(fp, 0L, SEEK_END) != 0) {
    puts("fseek failed, die");
    exit(EXIT_FAILURE);
}
eof = ftell(fp); /* ftell returns -1L on failure */

Now eof is actually the file length. It 'points' one byte after the
last byte in the file, actually to where the next byte might be
written.

But 'past end-of-file' has no meaning to me. end-of-file + 3?
 

Gordon Burditt

Joe Wright said:
I'm curious why you would, and what you would expect to find,
seeking past end-of-file. Perhaps I misunderstand.

Under UNIX it is possible to seek past end-of-file, write something
(which moves the end-of-file to the end of what you wrote), and
later seek again and read it. Reading unwritten data reads back
binary 0 bytes. On some versions of UNIX disk "blocks" that were
wholly unwritten were also not allocated, leaving the possibility
of the gigabyte-sized file that fits on a floppy-sized filesystem.

Joe Wright said:
Still curious, how would you know where end-of-file is and that you
are seeking past it?

Many uses for this odd kind of indexed file don't CARE where the
end-of-file is. You compute a hash code for the data record key,
multiply it by the data block size, and that's where you put the
data. If you want to read such a record, you hash the key, multiply
it by the data block size, and try to read it. If you get all bytes
zero or end-of-file, there is no record written there. Hash collisions
are dealt with by putting multiple entries in a "data block", or
by using more bits in the hash to locate a different block.
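The scheme described above can be sketched in a few lines of C. This is only an illustration, not how any real dbm library is implemented: the names `slot_for`, `put_record` and `has_record`, the block size, the slot count and the toy hash are all assumptions, collisions are ignored, and keys are assumed to be shorter than one block. It also relies on the implementation accepting a seek beyond end-of-file, which POSIX promises and ISO C does not.

```c
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 64      /* assumed fixed size of one data block */
#define NUM_SLOTS  1024UL  /* assumed number of hash slots */

/* Toy string hash, reduced to a slot number (illustrative only). */
static unsigned long slot_for(const char *key)
{
    unsigned long h = 5381UL;
    while (*key != '\0')
        h = h * 33UL + (unsigned char)*key++;
    return h % NUM_SLOTS;
}

/* Store 'key' in its slot. Returns 0 on success, -1 on failure. */
int put_record(FILE *fp, const char *key)
{
    char block[BLOCK_SIZE] = {0};
    long offset = (long)slot_for(key) * BLOCK_SIZE;

    strncpy(block, key, BLOCK_SIZE - 1);
    if (fseek(fp, offset, SEEK_SET) != 0)  /* may lie beyond end-of-file */
        return -1;
    if (fwrite(block, 1, BLOCK_SIZE, fp) != BLOCK_SIZE)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}

/* Returns 1 if 'key' is stored in its slot, 0 if the slot is empty,
   beyond end-of-file, or holds another key, and -1 if the seek fails. */
int has_record(FILE *fp, const char *key)
{
    char block[BLOCK_SIZE];
    long offset = (long)slot_for(key) * BLOCK_SIZE;

    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    if (fread(block, 1, BLOCK_SIZE, fp) != BLOCK_SIZE)
        return 0;                          /* short read: nothing there */
    block[BLOCK_SIZE - 1] = '\0';
    return strcmp(block, key) == 0;
}
```

Both functions expect a stream opened for binary update (for example "wb+" or "rb+"). The code never asks where end-of-file is, which is the point being made: an all-zero block and a short read both mean "no record here".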

The above is a crude description of how a UNIX "dbm" file works.
It typically occupies about 1/4 of the disk space you'd
expect based on its size.

Joe Wright said:
The FAQ notwithstanding, the semantics of fseek deal with the
beginning of the file, the end of the file and relative offsets
within the file. The actual address is of type long. Assuming a
properly opened file (FILE *fp) of some size, ..

long bof = 0; /* no calc needed. files begin at 0 */
long eof;     /* set below from ftell */
if (fseek(fp, 0L, SEEK_END) != 0) {
    puts("fseek failed, die");
    exit(EXIT_FAILURE);
}
eof = ftell(fp); /* ftell returns -1L on failure */

Now eof is actually the file length. It 'points' one byte after the
last byte in the file, actually to where the next byte might be
written.

But 'past end-of-file' has no meaning to me. end-of-file + 3?

fseek(fp, 32767, SEEK_END);

Gordon L. Burditt
 

Richard Bos

Xenos said:
Can anyone tell me what the standard says about using fseek (on a binary
file) to seek past the end-of-file?

Nothing, AFAICT. Which means that, unless I missed something, it's
undefined behaviour. Which is odd; I'd have expected it to be
unspecified: either return non-zero from fseek(), or allow it and do
something system-specific but non-crashing to the file.

Richard
 

Chris Croughton

Nothing, AFAICT. Which means that, unless I missed something, it's
undefined behaviour. Which is odd; I'd have expected it to be
unspecified: either return non-zero from fseek(), or allow it and do
something system-specific but non-crashing to the file.

I hadn't realised that either. It doesn't even say that the behaviour
is undefined or implementation defined (although some implementations of
the library do define it, for instance on many Unix systems where it
allows writes (but not reads) past the end of file leaving 'holes').
For that matter since the type of the offset is (signed) long int it is
theoretically valid to use a negative number with SEEK_SET.

POSIX.1 does specify the action for writing beyond end of file, if one
is using a POSIX-compliant system:

The fseek() function shall allow the file-position indicator to be
set beyond the end of existing data in the file. If data is later
written at this point, subsequent reads of data in the gap shall
return bytes with the value 0 until data is actually written into
the gap.
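The behaviour in the quoted POSIX text can be checked with a small test along the following lines. It is a sketch that assumes a POSIX system; under ISO C alone any of these calls may fail, which the function reports as -1. The name `gap_reads_as_zero` is made up for this example.

```c
#include <stdio.h>

/* Returns 1 if every byte in the gap reads back as 0, 0 if some byte
   does not, and -1 if any stdio call fails (which ISO C permits). */
int gap_reads_as_zero(long gap)
{
    FILE *fp = tmpfile();   /* temporary binary update stream */
    int result = 1;
    long i;

    if (fp == NULL)
        return -1;

    /* Write one byte, seek 'gap' bytes past the end, write another,
       then go back to the first byte of the gap. */
    if (fputc('A', fp) == EOF ||
        fseek(fp, gap, SEEK_END) != 0 ||
        fputc('B', fp) == EOF ||
        fseek(fp, 1L, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }

    /* Read back the bytes that were never written. */
    for (i = 0; i < gap; i++) {
        int c = fgetc(fp);
        if (c == EOF) { result = -1; break; }
        if (c != 0)   { result = 0;  break; }
    }
    fclose(fp);
    return result;
}
```

On a POSIX-conforming system a call such as `gap_reads_as_zero(4096L)` would be expected to return 1. Whether the gap occupies disk space is a separate, filesystem-specific matter.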

Chris C
 

Dan Pop

Richard Bos said:
Nothing, AFAICT.

That's because you can't read.

2 The fseek function sets the file position indicator for the
stream pointed to by stream.

Which part of this statement doesn't cover the case when the new file
position is beyond the current end of file?

Since *any* fseek call may fail, seeking past the current end of file
*may* fail too.

In other words, the standard doesn't treat seeking past the current end
of file as a special case. It's entirely up to the implementor to decide
whether to support such requests or not.
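In practice this means a seek beyond the end gets the same treatment as any other seek: check the return value. A minimal sketch, where the helper name `seek_beyond_end` is hypothetical:

```c
#include <stdio.h>

/* Try to position 'extra' bytes beyond the current end of the stream.
   Returns the new position as reported by ftell, or -1L if the
   implementation refuses the request. */
long seek_beyond_end(FILE *fp, long extra)
{
    if (fseek(fp, extra, SEEK_END) != 0)
        return -1L;        /* any fseek call may fail */
    return ftell(fp);      /* ftell itself returns -1L on failure */
}
```

A caller that gets -1L back knows only that this implementation did not honour the request. The C standard does not say which implementations will.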

Dan
 

Xenos

Joe Wright said:
I'm curious why you would, and what you would expect to find,
seeking past end-of-file. Perhaps I misunderstand.

Still curious, how would you know where end-of-file is and that you
are seeking past it?

I'm not. We have some vendor code that does, which is having problems.
Before I talk to them about it, I want to understand the standard. I don't
want to run my mouth without knowing what the hell I'm talking about.

Why do it? It creates what is called a "sparse file," a file with holes.
A lot of executables do it (but for what reason, I do not know). It can be
done on Windows and Unix, but I don't know if it's implementation-specific or
part of the standard.

Thanks.
 

Xenos

Dan Pop said:
In other words, the standard doesn't treat seeking past the current end
of file as a special case. It's entirely up to the implementor to decide
whether to support such requests or not.

Thanks, Dan. That's the information I needed.

DrX
 

Chris Croughton

Why do it? It creates what is called a "sparse file," a file with holes.
A lot of executables do it (but for what reason, I do not know). It can be
done on Windows and Unix, but I don't know if it's implementation-specific or
part of the standard.

It's part of the POSIX.1 standard, not the C standard. It doesn't
guarantee that there are 'holes' (not possible on some filesystems, for
example FAT) but it does guarantee that:

The fseek() function shall allow the file-position indicator to be
set beyond the end of existing data in the file. If data is later
written at this point, subsequent reads of data in the gap shall
return bytes with the value 0 until data is actually written into
the gap.

(POSIX.1, IEEE Std 1003.1, 2004 Edition)

Win32 has a POSIX layer, I don't know how standard compliant it is.

Chris C
 

Richard Tobin

Xenos said:
Why do it? It creates what is called a "sparse file," a file with holes.
A lot of executables do it (but for what reason, I do not know). It can be
done on Windows and Unix, but I don't know if it's implementation-specific or
part of the standard.

It's not part of the *C* standard, but it's part of Posix, and perhaps
of some other standards. Very few useful programs use only the C
standard.

-- Richard
 

Xenos

Richard Tobin said:
It's not part of the *C* standard, but it's part of Posix, and perhaps
of some other standards. Very few useful programs use only the C
standard.

-- Richard

Thanks.
 

Xenos

It's part of the POSIX.1 standard, not the C standard. It doesn't
guarantee that there are 'holes' (not possible on some filesystems, for
example FAT) but it does guarantee that:

The fseek() function shall allow the file-position indicator to be
set beyond the end of existing data in the file. If data is later
written at this point, subsequent reads of data in the gap shall
return bytes with the value 0 until data is actually written into
the gap.

(POSIX.1, IEEE Std 1003.1, 2004 Edition)

Win32 has a POSIX layer, I don't know how standard compliant it is.

Chris C

Thank you.

The file system it's being attempted on is FAT. Do you happen to know what
behavior will (should) be exhibited?
 

CBFalconer

Gordon said:
Under UNIX it is possible to seek past end-of-file, write something
(which moves the end-of-file to the end of what you wrote), and
later seek again and read it. Reading unwritten data reads back
binary 0 bytes. On some versions of UNIX disk "blocks" that were
wholly unwritten were also not allocated, leaving the possibility
of the gigabyte-sized file that fits on a floppy-sized filesystem.

This is generally known as a sparse file. Many systems, including
CP/M, have had this capability. It is very handy for databases.
Of course Microsoft eliminated it in their file systems, because as
usual they didn't understand.
 

Flash Gordon

Thank you.

The file system it's being attempted on is FAT. Do you happen to know
what behavior will (should) be exhibited?

Yes I do know what should happen.

Now I suggest you ask somewhere the request is topical since if I'm
wrong I would not expect it to be corrected here.
 

Chris Croughton

The file system it's being attempted on is FAT. Do you happen to know what
behavior will (should) be exhibited?

If it's running on a POSIX compliant system (like WinXP) then you should
just get a big file with lots of zeros (instead of a file which looks
big but is actually mostly not there on most *ix systems).
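The difference between "a big file with lots of zeros" and "a file which looks big but is mostly not there" can be observed on POSIX systems by comparing the logical size with the allocated block count. The sketch below is POSIX-only (`fileno`, `fstat` and `st_blocks` are not ISO C), the helper name `size_and_blocks` is invented for this example, and the unit of `st_blocks` is commonly, though not universally, 512 bytes.

```c
#define _XOPEN_SOURCE 700   /* for fileno, fstat and st_blocks */
#include <stdio.h>
#include <sys/stat.h>

/* Reports the logical size in bytes and the number of allocated blocks
   for an open stream. Returns 0 on success, -1 on failure. */
int size_and_blocks(FILE *fp, long long *size, long long *blocks)
{
    struct stat st;

    /* Flush first so buffered writes are reflected in the file's size. */
    if (fflush(fp) != 0 || fstat(fileno(fp), &st) != 0)
        return -1;
    *size = (long long)st.st_size;
    *blocks = (long long)st.st_blocks;  /* unit is commonly 512 bytes */
    return 0;
}
```

If the block count times the block unit comes out far smaller than the size, the filesystem has left holes. If the two are roughly equal, the gap was filled with real zero bytes.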

(This is not on-topic here, and you don't have a real email address, so
mail me for more chat about it...)

Chris C
 

Xenos

Chris Croughton said:
On Tue, 30 Nov 2004 12:23:29 -0500, Xenos

If it's running on a POSIX compliant system (like WinXP) then you should
just get a big file with lots of zeros (instead of a file which looks
big but is actually mostly not there on most *ix systems).

(This is not on-topic here, and you don't have a real email address, so
mail me for more chat about it...)

Nope, that will suffice. Thanks a bunch.
 

Xenos

Yes I do know what should happen.

Now I suggest you ask somewhere the request is topical since if I'm
wrong I would not expect it to be corrected here.

Good for you. Didn't ask you. The gentleman to whom it was directed was
polite enough to answer, so you don't have to worry about the veracity of
your information.

DrX
 

CBFalconer

Xenos said:
The file system it's being attempted on is FAT. Do you happen to
know what behavior will (should) be exhibited?

FAT file systems are inherently incapable of creating sparse
files. Thus any attempts to seek past EOF should either fail, or,
if the file is open for writing, write a possibly ungodly number of
zero bytes to extend the file.
 

Flash Gordon

Good for you. Didn't ask you.

When you post to a news group you are asking EVERYONE who reads the
group. If you want to ask a private question do it by email.

The gentleman to whom it was directed was polite enough to answer, so
you don't have to worry about the veracity of your information.

I don't have to worry about the veracity of my information since I have
an MSDN subscription.
 

Craig Barkhouse

Chris Croughton said:
Win32 has a POSIX layer, I don't know how standard compliant it is.

No, Windows has a Win32 layer and a POSIX layer. The two are mutually
exclusive. They both reside above the native API.

Applications are only POSIX compliant if they target the POSIX layer. For
all other apps (which is 99.9% of them) POSIX isn't even a factor.

I know this is offtopic. Please don't flame me. :)
 
