Programming in standard C

Kelsey Bjarnason · Jan 2, 2008

[snips]

How ridiculous. This is true for many things. Always "file size" in
"portable C" is useless. This is what Jacob is alluding to.

Yes, but Jacob persists in missing the obvious: this is not a limitation
of C, but of the very notion of determining file sizes. It doesn't matter
what language or OS you use, if there is *any* possibility of the file
being modified other than by the singular instance of the application in
question, you still face the same problems.

For some reason, Jacob seems to want to focus on supposed limitations of
C, without bothering to examine the underlying problem at all.

You seem to miss the point. filesize(fp) is portable if it's in the
standard. How it is implemented per platform is another issue.

And what actual use it is has yet to be determined.

Rubbish. All files have a byte size.

Good. So tell me the size of, oh, /var/log/apache/access.log. Oh,
whoops, it changed - a new log entry was written. Oh, whoops, it changed,
logrotate just archived it and set it to an empty file. The "size" you
got has absolutely no relation to the size now, so tell us what size the
file is, in bytes, in any meaningful sense: is it the 100K your function
reported? The 102K from a second ago? Or the 0K it is right now? Oh,
but you're going to allocate 100K, try to read 100K and get zero bytes
read - is that an error condition? A read failure? Or is that correct
behaviour, just completely inconsistent with the size you recorded?

How silly.

"correct" would be platform specific. The API need not be.

So which is the correct size of a partially compressed sparse file? The
uncompressed size it would be if it was actually "full"? The compressed
size, based on what's actually in it? The size it currently occupies on
the disk? If the latter, keep in mind that it bears absolutely no
relationship to the actual number of data bytes in the file.

So do tell, which is the "correct" size.

Keith Thompson · Jan 2, 2008

I disagree that it's no worse.
If you use fseek() and ftell(), and try to build and run your program
on a system where the assumptions don't hold, it will silently accept
it until you run into a file that breaks one or more of the
assumptions. If you use fstat() and try to do the same thing, you'll
get a warning or error at compile time, and shouldn't have too much
trouble working out what it's trying to do and how to do it on the
system you're porting to, which ends up taking rather less total effort
to get the port working.

So unless you have an automated way to track the assumptions you're
making and verify that they hold on all the systems you try to build
on, you're probably better off with the explicitly nonportable code
than with code that makes nonportable assumptions about standard
interfaces.
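
To make the "explicitly nonportable" option concrete, here is a minimal sketch of an fstat()-based size query. The helper name file_size_fstat is made up for illustration, and the code assumes a POSIX system (fileno() and fstat() are not ISO C), so on a system without them it should fail at build time rather than misbehave at run time.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of ISO C: relies on POSIX fileno() and fstat().
   Returns 0 and stores the size as seen at the moment of the call,
   or returns -1 on failure. The size can be stale immediately afterwards. */
int file_size_fstat(FILE *fp, long long *size_out)
{
    struct stat st;

    if (fp == NULL || size_out == NULL)
        return -1;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))   /* only report st_size for regular files */
        return -1;
    *size_out = (long long)st.st_size;
    return 0;
}
```

Data still sitting in the stdio buffer is not counted, so a caller that has just written through fp would need to fflush() first.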

You're right.

Kelsey Bjarnason · Jan 2, 2008

[snips]

No. Not ignore them. Discard them as the kind of pedantic oneupsmanship
that has no place in the real world where real, practical solutions are
required.

So what's the real, practical size of /var/log/apache/access.log?
Calculate the file size _now_, you get, say, 100K. Wait half a second,
it's 102K. Wait another second while logrotate reaps it, it's now 0K. So
which size is it - the 100K you read, the 102K it was almost immediately
after, or the 0K it is now?

Presumably the Java functions will do the same thing the equivalent C
functions would do - report size at time of inspection. They, however,
still face the same problem, that the size reported has absolutely bugger
all to do with the size _now_.

You call this - real-world problems faced by real-world programs -
"pedantic oneupsmanship"; we look at it as simply one more problem which
has not been dealt with.

One way to deal with it is to do what Java apparently did: accept that you
can only get a value for file size at time of querying, which is a
reasonable position to take, but doing so also ignores the fact that the
size may change between querying the size and making use of the size.
Some applications might need to track the changes; others won't. That
some don't does not mean no application needs to, nor does it mean that
the issue is simply one of pedanticism; it is, rather, a question of
determining what the actual problem to be solved is.

Jacob's exemplar of reading a file into memory is a good example. He uses
whatever function to calculate the size, so far so good. He allocates a
buffer, still all good. So his size recorded is, say, 100K and he
allocated 100K, all's well.

When he actually reads the file, though, and discovers zero bytes to read
- because the file has been rotated and emptied - is this an error, or is
this as expected?

The reality is that it is as expected... but it is also reality that
unless his code is particularly smart - *and* knows the details of the
system upon which it is being executed - it is very likely to see this as
an error condition and fail in some manner.

Is this an error? Not really. Is his code going to be smart enough to
recognise this as being something other than an error? Possibly, if he
knows about this exact file, but probably not in the general case,
particularly if his code is used on multiple disparate systems. You want
to write off such issues as simple pedanticism, yet they're not - they're
real-world issues which come up in real-world systems, which need to be
dealt with by real-world code. Simply hand-waving them away doesn't work.
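
One way to sidestep the "recorded 100K, read zero bytes" question is not to trust a previously recorded size at all, and simply read until end-of-file while growing the buffer. The following is only a sketch under that assumption; read_all is a hypothetical name, and the code uses nothing beyond ISO C.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads from fp until EOF, growing the buffer as needed.
   On success returns a malloc'd buffer (caller frees) and stores the number
   of bytes actually read in *len_out. An empty stream is not an error: it
   yields a valid buffer and a length of 0. Returns NULL on read or
   allocation failure. */
unsigned char *read_all(FILE *fp, size_t *len_out)
{
    size_t cap = 4096, len = 0;
    unsigned char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    for (;;) {
        len += fread(buf + len, 1, cap - len, fp);
        if (len < cap)                      /* short read: EOF or error */
            break;
        if (cap > (size_t)-1 / 2) {         /* doubling would overflow */
            free(buf);
            return NULL;
        }
        unsigned char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len_out = len;
    return buf;
}
```

This does not make the file stop changing; it only means the length the program works with is the number of bytes it actually received.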

So rather than hand-waving them away, how about explaining how, exactly,
you're going to deal with them? How do you plan to determine the "size"
of a file, when that size keeps changing? How do you plan to determine
the "size" of a compressed, or partially compressed, or sparse file? Or
worse, sparse, partially compressed files? What actual "size" matters,
and how do you determine that this is the size that matters?

Determining the size - if you plan to - can be critical. For example,
take a sparse file. One size value says the file is 10GB, another says
it's 200K. One reports "theoretical" size, the other reports actual
usage. Here's the kicker: if you want to copy this file to another file
system, you _may_ be able to do it onto a file system with 200K free, if
it supports sparse files, or you may require 10GB, if it doesn't. So
which size is the proper one - the 200K actually allocated on disk, or the
10GB which is the "data size" required to store it on another file system
without risking losing data?
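
The two competing numbers can be seen side by side on a POSIX system. This is a rough sketch, not a portable facility: make_file_with_gap, get_two_sizes and struct two_sizes are made-up names, seeking past the end of a binary stream is not promised by ISO C, and the 512-byte unit for st_blocks is an assumption that holds on Linux but is not guaranteed everywhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

struct two_sizes {
    long long apparent;   /* st_size: the length a reader would see */
    long long allocated;  /* st_blocks * 512: space actually reserved on disk */
};

/* Hypothetical helper: writes one byte, skips `gap` bytes, writes one more.
   Whether the gap becomes a hole or real zero-filled blocks is up to the
   file system. Returns 0 on success, -1 on failure. */
int make_file_with_gap(const char *path, long gap)
{
    FILE *fp = fopen(path, "wb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fputc('A', fp) != EOF
      && fseek(fp, gap, SEEK_CUR) == 0
      && fputc('Z', fp) != EOF;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: reports both sizes for `path` via POSIX stat(). */
int get_two_sizes(const char *path, struct two_sizes *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->apparent = (long long)st.st_size;
    out->allocated = (long long)st.st_blocks * 512;  /* assumed unit */
    return 0;
}
```

On a file system that supports holes, the allocated figure for such a file is typically far below the apparent one; on one that does not, the two are close. Neither number alone answers the copy question raised above.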

This doesn't even address the issue of translation - is that 100K size
that you read still going to be 100K if you want to read the file as a
text file? Or is it only applicable if you read it in binary mode?
Probably the latter, but if it is, in fact, a text file and you wish to
process it as such, how much space do you need to allocate to handle it?
You can't tell this from the size on disk, now can you?

The whole concept of "file size" is one where the pat answer sounds good
and even is good in many cases, but it is also one where the pat answer
falls on its face in far too many cases to naively rely upon it, something
the more experienced folks keep trying to point out, yet some folks, for
some reason, seem to want to ignore this.

Perhaps the simplest way to approach the issue is to ponder those
real-world cases where the pat answer doesn't work well and consider what,
if any, solution you would propose, a solution which actually covers those
cases.

I suggested one such, but it involves reporting not a single answer, but
several: size on disk, size reserved (eg how big a sparse file "actually
is"), size of contained compressed data, size of contained data after
decompression. That's four; perhaps we need to add another four,
basically the same values but after translation to text mode - though I
suspect this is going to make file management awfully inefficient. Yet
that's 8 different size values to describe *one* file, and still doesn't
deal with the fact the file size - any of those sizes - can change at a
moment's notice.

So which of the eight is "the size of the file"? Size on disk has little
bearing on what you'd need to allocate to store the contents in memory.
Size after decompression would be closer, but doesn't tell you how much
overhead is involved due to translation. Neither of these tells you how
much space will be required if you want to copy the file to a file system
which doesn't support sparse files.

Oh, and then there are "forks". Can't forget those; several file systems
include them. Those make things even more interesting. With forks, the
reported size on disk may well represent the total data in all forks, but
if you allocate a buffer that size and try to read the file, you're liable
to get somewhat less than you'd expect, since you're liable to only be
reading one fork, not all of them. So is that an error? Or is that an
expected situation? Does your code know the difference between reported
size on disk and expected size when reading?

There are so many issues to consider, yet some folks want to treat it as
if the only consideration in the world is getting a singular value
representing "the size of the file", without ever stopping to consider -
or worse, writing off as "pedanticism" - the fact that "file size" is a
meaningless concept in all but a few cases.

So explain to us, you - or the others who think this issue is so trivial -
which of the 8 values I mentioned, the ones which don't even deal with
forked files, is "the size of a file".

Chances are, the best answer you'll come up with is "size on disk", but
this value is useless for virtually all purposes, other than determining -
in *some* cases - whether you have enough room on another disk to store a
copy of the file, and can fail miserably even there.

C doesn't have a standardized filelength that does anything useful? Okay,
great, it doesn't. What would *you* put in its place, though? Which of
the umpteen possible size values would you report for a given file? How
would you have it determine - and report - whether that size meant size on
disk, size required to store another copy, size before or after
decompression, etc, etc, etc?

Java designers apparently decided to pick one particular value and report
that. Fine, great, wonderful and all, but it doesn't deal with all the
other cases. It is one possible answer, and presumably a perfectly
acceptable one - for the cases to which it applies. It is going to be
considerably less useful for other cases.

Kelsey Bjarnason · Jan 2, 2008

[snips]

But they take 2GB to read into memory, if that is the number of
interest.

So which size is reported as "file size"? 2GB, or 200K? Suppose it's
200K - actual space consumed. Reading 200K of consecutive data is going
to get you bogus data, most likely, if the writes were done in chunks less
than 200K and strewn about the sparse space.

But the uncompressed size is the relevant number for space to read it
into memory.

Sure. So which gets reported, size on disk, or size after decompression?

Mandatory file locking is one way to accomplish this, but it has other
problems. It opens the system up to denial-of-service attacks by a
program that locks lots of important system files to keep administrators
out, then proceeds to do something evil.

Yeah, and that's sorta the point to all this, some folks seem to want to
hand-wave away such issues.

Kelsey Bjarnason · Jan 2, 2008

And what do you expect that fread will put into the buffer on such a
filesystem? The compressed or decompressed data?

Depends on the system, now don't it?

Take something like, oh, drivespace. Or compressed directories. Or other
variations on the theme. What gets read is decompressed data. Yet there
are now two distinct "size" values in play: size "apparent" on disk
(compressed size) and size of data in file. Which is reported? If you
get one when you were expecting the other, have fun.

jacob navia · Jan 2, 2008

Kelsey said:
[snips]

Look "CJ" whoever you are:

You know NOTHING of where I have programmed, or what I am doing.
Versions of lcc-win run in DSPs with 80k of memory, and only
20 usable.

And Win98 on a 486?

lcc-win runs perfectly in that environment. The debugger however, has
problems, probably because of bugs in Win98. By the way those
systems are no longer supported by Microsoft. I do not support
them either.

Kelsey Bjarnason · Jan 2, 2008

[snips]

You have a file system, or I do not know where /dev/sda2 comes from...
You have no file system in that particular disk, but there is a file
system to access that device as a single file...

How are you defining "file system"? If I'm opening /dev/sda or /dev/sda2,
I'm not dealing with anything I'd consider a file system, I'm dealing with
a disk or a partition. The fact I'm using file-like semantics to do so
doesn't seem to justify the suggestion there's a file system involved.

santosh · Jan 2, 2008

Kelsey said:
[snips]

You have a file system, or I do not know where /dev/sda2 comes
from... You have no file system in that particular disk, but there is
a file system to access that device as a single file...

How are you defining "file system"? If I'm opening /dev/sda or
/dev/sda2, I'm not dealing with anything I'd consider a file system,
I'm dealing with a disk or a partition. The fact I'm using file-like
semantics to do so doesn't seem to justify the suggestion there's a
file system involved.

There may be no filesystem on the partition denoted by /dev/sda2, but
there is a file system present, otherwise a pathname like /dev/sda2
would be meaningless. Under normal operation UNIX systems need to mount
at least one filesystem under / to function properly.

Kelsey Bjarnason · Jan 2, 2008

[snips]

lcc-win runs perfectly in that environment. The debugger however, has
problems, probably because of bugs in Win98. By the way those
systems are no longer supported by Microsoft. I do not support
them either.

MS has a reason to not support Win98: by not supporting it, they encourage
people to spend money buying newer versions, thus increasing the profits.

By contrast, unless you have some compelling reason to use features
specific to later versions of Windows, there's no reason for you to not
support such a configuration.

Or, put differently, by sticking with the maximally usable set of Windows
functionality, you gain the ability to market to anyone who has need to
use such a system and likely cannot find a competing compiler that will
work for them. Even if it's a small market, you could be the most
significant player in it, unless there is some compelling benefit to using
the new functionality which simply isn't available on such machines.

Kelsey Bjarnason · Jan 2, 2008

[snips]

There may be no filesystem on the partition denoted by /dev/sda2, but
there is a file system present, otherwise a pathname like /dev/sda2
would be meaningless. Under normal operation UNIX systems need to mount
at least one filesystem under / to function properly.

Sure, but if / is on, say, hda and I'm accessing hdb, a different physical
device, the fact there's a file system on the one drive says nothing about
the presence of a file system on the other - yet I can access the drive
using file-like semantics, whether it has a file system or no.

jacob navia · Jan 2, 2008

Kelsey said:
[snips]

lcc-win runs perfectly in that environment. The debugger however, has
problems, probably because of bugs in Win98. By the way those
systems are no longer supported by Microsoft. I do not support
them either.

MS has a reason to not support Win98: by not supporting it, they encourage
people to spend money buying newer versions, thus increasing the profits.

By contrast, unless you have some compelling reason to use features
specific to later versions of Windows, there's no reason for you to not
support such a configuration.

Or, put differently, by sticking with the maximally usable set of Windows
functionality, you gain the ability to market to anyone who has need to
use such a system and likely cannot find a competing compiler that will
work for them. Even if it's a small market, you could be the most
significant player in it, unless there is some compelling benefit to using
the new functionality which simply isn't available on such machines.

You are right, but I have a limited budget.
Supporting Win98 needs a system where I can test it, a machine with that
system, time to set it up, time to debug, etc.

I tried last year to set up a system with a virtual machine but the
installation of Windows 98 needs a DOS diskette, and I do not have a
floppy drive any more... It is quite a lot of work really...

Syren Baran · Jan 2, 2008

Sure, but if / is on, say, hda and I'm accessing hdb, a different physical
device, the fact there's a file system on the one drive says nothing about
the presence of a file system on the other - yet I can access the drive
using file-like semantics, whether it has a file system or no.

You should skip the "-like" here. Either it's a file with an appropriate
handle or I can obtain the handle via an open with a char*. The char*
may represent something that can be interpreted via a filesystem, but
that is no requirement.
I rest my case by quoting an old DOS example:
type "Does this damn fucking thing at least work via command line?(1)">lpt

(1) sentence may vary, usually dependent on previously invested time.

Kenny McCormack · Jan 2, 2008

I tried last year to set up a system with a virtual machine but the
installation of Windows 98 needs a DOS diskette, and I do not have a
floppy drive any more... It is quite a lot of work really...

It doesn't (require a DOS disk), and never has. In fact, using VMWare,
you don't even need a CD drive (you can make an ISO file from the CD -
say on another machine) - and then install from the ISO file.

All very off-topic here, but if you're interested in discussing this,
shoot me an email.

Bart C · Jan 2, 2008

Kelsey Bjarnason said:
[snips]

So what's the real, practical size of /var/log/apache/access.log?
Calculate the file size _now_, you get, say, 100K. Wait half a second,
it's 102K. Wait another second while logrotate reaps it, it's now 0K. So

Taking the size of a rapidly changing file like that is asking for problems.
But they need not be serious. Ask the OS to copy that file to a unique
filename. Then read that new file using any method you like. If there are
discrepancies then they are the OS's fault.

Jacob's exemplar of reading a file into memory is a good example. He uses
whatever function to calculate the size, so far so good. He allocates a
buffer, still all good. So his size recorded is, say, 100K and he
allocated 100K, all's well.

When he actually reads the file, though, and discovers zero bytes to read
- because the file has been rotated and emptied - is this an error, or is
this as expected?

It's a discrepancy: if the file should have been static, then an error can
be raised. If it's known that possible live files could be being read, then
write some different code. Dealing with such files raises some difficulties
but getting rid of simplistic (but normally very useful) file functions
won't help.

How do you plan to determine the "size"
of a file, when that size keeps changing?

How do you do *anything* with the file when it keeps changing? I gave one
idea above.

Determining the size - if you plan to - can be critical. For example,
take a sparse file. One size value says the file is 10GB, another says
it's 200K. One reports "theoretical" size, the other reports actual

Actually I thought files (on Windows for example) were already sparse; if I
create an empty file, write the first byte, then write the ten billionth,
will the OS really fill in all those intermediate blocks?

If the OS is responsible for sparse/compressed files, then I would expect
them to be transparent. It should report the full size. After all it
shouldn't take long to read in non-existent blocks! And it wouldn't do me
much good to have a sparse/compressed file of unknown format in my memory
space.

(Someone said the OS may not know the full size of compressed files. I
would call that a broken OS)

usage. Here's the kicker: if you want to copy this file to another file
system, you _may_ be able to do it onto a file system with 200K free, if
it supports sparse files, or you may require 10GB, if it doesn't. So

OK, so it might need 10GB. This is an OS not a C issue.

This doesn't even address the issue of translation - is that 100K size
that you read still going to be 100K if you want to read the file as a
text file? Or is it only applicable if you read it in binary mode?

Text mode files are a peculiarity of C; if there is a filesize() function
then it may need to be told whether the size is wanted in binary or text
mode, and to do the extra work to find out. Ideally one would forget text
mode.

I suggested one such, but it involves reporting not a single answer, but
several: size on disk, size reserved (eg how big a sparse file "actually
is"), size of contained compressed data, size of contained data after
decompression. That's four; perhaps we need to add another four,

The most useful is the number of data bytes seen by the application.
Compressed files, total bytes allocated, that's all OS stuff, of no interest
unless writing the OS, or doing some clever manipulations, then you wouldn't
be using the standard file functions.

Oh, and then there are "forks". Can't forget those; several file systems
include them. Those make things even more interesting. With forks, the ....
There are so many issues to consider, yet some folks want to treat it as
if the only consideration in the world is getting a singular value
representing "the size of the file", without ever stopping to consider -
or worse, writing off as "pedanticism" - the fact that "file size" is a
meaningless concept in all but a few cases.

The world could do with simplifying. Why not have a concept of 'filesize',
then define what it might mean under all your extreme examples?

I don't know what forks are, but if people had been happy dealing with files
their way before they came along, why can't they continue to do so? The
introduction of forks will not break existing code surely?

Whatever benefits 'forks' bestow surely can be reaped without affecting
naive applications that know nothing about them.

So explain to us, you - or the others who think this issue is so trivial -
which of the 8 values I mentioned, the ones which don't even deal with
forked files, is "the size of a file".

As mentioned, the one reported by a typical OS on file listings. Some OSs
apparently only know about complete blocks; in that case, the total of all
those blocks. To software that is aware of this, not a problem.

Chances are, the best answer you'll come up with is "size on disk", but

Correct.

I did a little test in Windows: slowly writing file A while, in a command
window, asking the OS to copy A to B. The result: I got a partial copy of A
in B, which represented where it had got to in writing A.

Another test where the copying was done by an appl calling C functions. The
same result. Was this an error? If it was then the OS is in error too.

Using 'naive' file functions like this works 99% of the time when used
sensibly. In a few cases, where there is unexpected/malicious write access
to an appl's support files, they could fail; but the appl would stop anyway.

Perhaps they should be protected from the complexities of modern file
systems instead of simply eliminating them, which is not a solution.

Bart

Nick Keighley · Jan 3, 2008

I'll let that go.

This is a code fragment. Assume headers are included for all C library
calls.

I wasn't aware there was much to go wrong. But I will have a look. At worst
it will return the wrong file size; I'll make it return all 0's or 1's or
something.

Ugh, in-band error signalling. How do you tell the difference between
a zero-length (or very large) file and an error?
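
One common alternative, sketched here with a hypothetical stream_size helper, is to signal failure out of band: the return value says whether the call worked, and the size comes back through a pointer, so 0 and very large values stay valid sizes. The sketch assumes a binary stream on a seekable file whose size fits in a long, and ISO C does not promise that SEEK_END is meaningful on binary streams.

```c
#include <stdio.h>

/* Hypothetical helper. Returns 0 on success and stores the size in
   *size_out; returns -1 on failure and leaves *size_out untouched.
   The stream position is restored before returning. */
int stream_size(FILE *fp, long *size_out)
{
    long here, end;

    here = ftell(fp);                       /* remember current position */
    if (here == -1L)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    end = ftell(fp);                        /* offset of end = size in bytes */
    if (fseek(fp, here, SEEK_SET) != 0)     /* restore position */
        return -1;
    if (end == -1L)
        return -1;
    *size_out = end;
    return 0;
}
```

A caller can then distinguish "the file is empty" (return 0, size 0) from "the query failed" (return -1) without reserving any size value as a sentinel.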

I'll change the types to long too, since it can't go past 2GB-1 anyway.
maybe

Close. The code existed in a non-C language, but using the C runtime,
without access to the headers and it was easiest to plug in these constants.
Converted to C (and tested) for the post.

My assumption is the file is in binary mode; and my wrapper of the fopen()
function ensures that.

this might surprise someone who used your function without
your wrapper.

I don't get these. There are known issues when used with files currently
open for writing. And I know the host OS in *my* case. But yes, anyone
attempting to use this code on their OS should be aware of limitations.

yes, but this means it isn't portable

Thanks (?)

Kelsey Bjarnason · Jan 3, 2008

[snips]

Taking the size of a rapidly changing file like that is asking for problems.
But they need not be serious. Ask the OS to copy that file to a unique
filename.

This assumes you can. If the file is larger than available free space,
how do you plan to manage this?

It's a discrepancy: if the file should have been static, then an error
can be raised.

And how does he know it's supposed to be static? Simple example: a text
file viewer/editor. It's the user's call what file to use it on, how does
the code know whether the file is supposed to be static?

Actually I thought files (on Windows for example) were already sparse;
if I create an empty file, write the first byte, then write the ten
billionth, will the OS really fill in all those intermediate blocks?

Actually, most systems simply won't let you write the billionth byte
unless you've already written all the bytes before it - meaning you have a
billion bytes on disk. Sparse files generally require special handling,
which is why the problem comes up: if you use the special sparse-file
functions or modes, you can, in fact, seek to byte 1 billion and write,
without actually storing a billion bytes on disk, but a "naive" file copy
routine, one which reads a block then writes it, will read every
intervening byte - it will copy a billion bytes.

And that's kinda the point: the file has maybe 100K of actual data in it,
the rest is "virtual zero" or some equivalent. Depending which "file
size" value you get, you see either 100K - in which case you're almost
certain to lose data, if it's strewn about the file - or you see 1 billion
bytes, in which case you're copying a billion bytes to get 100,000.
Neither is particularly good, but any concept of "file size" which isn't
smart enough to deal with such cases is going to face at least one of
those two problems - though, that said, any remotely portable file read is
also probably not going to be able to use the sparse file as anything
other than a "normal" file anyhow, and see it as a billion bytes, so it
had better hope it never sees the "data size" being reported.

If the OS is responsible for sparse/compressed files, then I would
expect them to be transparent.

They are and they aren't. A naive file copy can, indeed, copy such a
file, but it will "see" a file a billion bytes long, where one that uses
the sparse file functions may be able to tell there's only certain regions
which contain data, and copy those instead.

Certainly an application written specifically for the file - eg if this
file is some sort of data storage specific to the app - will know or be
able to tell, via indexing or the like, which portions of the file are
used and thus seek to offset 750,493 to read record 19,541 or whatever -
it can use the sparse file as a sparse file, where the naive routines use
the sparse file as a "flat" file, and the OS fills in the gaps, usually
with zeros.

It should report the full size. After all
it shouldn't take long to read in non-existent blocks!

Presumably less time than reading them off disk, but the fact is, the
blocks _do_ exist. They're just not stored on disk. Think of it as a
copy-on-write deal. Until written, the "sectors" don't exist. Once
written, they're mapped into the file and stored. On reading, the ones
which don't exist yet return zeroes - full "sectors", just with all bytes
zero - rather than simply skipping over them.

And it wouldn't
do me much good to have a sparse/compressed file of unknown format in my
memory space.

However, the issue here was one of knowing the file size so you can read
the file in. If you're writing a file duplicator, or a file editor, for
example, you may want to read in some portion of the file, even all of it
if it's small enough, yet here you're faced with a file with at least two
distinctly different "size" values, each legitimate. One describes the
"theoretical" size - say 4GB. The other describes the "actual" size -
size actually occupied on disk, size of actual data stored to the file,
say 200K. Which is the "correct" value? Both are correct.

(Someone said the OS may not know the full size of compressed files. I
would call that a broken OS)

Why? If you're storing a file on a compressed file system, there are
again two perfectly legitimate "size" values - size of original file, and
size as recorded to disk. If you're asking for "the size of the file",
which do you want? Depends; for some purposes, you'd want the size of the
compressed file, for others, the size of the file before compression.
Neither is "the" correct file size; each is correct - yet each is
different.

The most useful is the number of data bytes seen by the application.

And which value is that? IIRC, some folks have pointed out that not all
OSen even record a file size, per se. Thus you could, presumably, get
whatever value is reported for the current size of the file - 6 "blocks" -
write a chunk of data to the end, get the new size - again, 6 "blocks" -
and by naive reasoning conclude that no data had been written. The fact
that the system is reporting size by "block" count and your additional
data didn't spill into the next block means your size is of more than
questionable utility for many purposes.

Compressed files, total bytes allocated, that's all OS stuff

All file size values are OS stuff, which is kinda the point here. If
someone is going to say "I want the size of the file", he's going to have
to explain what he means, while taking all this sort of thing into
consideration. What *is* the size of the file? There's too many possible
answers to that to give a useful response even in a comparatively simple
case, never mind as a general case.

The world could do with simplifying. Why not have a concept of
'filesize', then define what it might mean under all your extreme
examples?

Exactly what I'm saying: if "you" want a function that determines the size
of a file, how about "you" define what "the size of a file" means. Oddly,
the ones most insistent upon having such a function refuse to solve these
issues.

I don't know what forks are, but if people had been happy dealing with
files their way before they came along, why can't they continue to do
so?

Who is "they"?

The introduction of forks will not break existing code surely?

The code won't break. What happens to the data, though?

Whatever benefits 'forks' bestow surely can be reaped without affecting
naive applications that know nothing about them.

You'd think so. I've been bitten by them before, though; a file which you
didn't realise was forked, you read/copy and subsequently delete it, oops,
sorry, you only actually copied the "default" fork, not all the data in
the file.

As mentioned, the one reported by a typical OS on file listings.

The one which is arguably of the least possible value of all the possible
results. Yeah, I'm gonna rush right out and use that one.

I did a little test in Windows: slowly writing file A while, in a
command window, asking the OS to copy A to B. The result: I got a
partial copy of A in B, which represented where it had got to in writing
A.

Another test where the copying was done by an appl calling C functions.
The same result. Was this an error? If it was then the OS is in error
too.

Just for giggles, did you also try it using the OS-specific file APIs?

Kelsey Bjarnason · Jan 3, 2008

[snips]

You should skip the "-like" here. Either it's a file with an appropriate
handle or I can obtain the handle via an open with a char*.

Really? So a hard drive is a file. And a sound card is a file. And a
PS/2 connector is a file.

Every one of those can be accessed in a file-like manner. Does this make
them files?

Walter Roberson · Jan 3, 2008

1) You can't open a file with
fopen("name","a+")
since somebody else could grow the file after the file is positioned at
EOF, so you would overwrite his data.

C89 4.9.5.3 The fopen Function

Opening a file with append mode ('a' as the first character in
the mode argument) causes all subsequent writes to the file
to be forced to the then current end-of-file, regardless of
intervening calls to the fseek function.

Therefore, if the implementation allows an I/O interruption after
the file is automatically repositioned, but before the writing
happens, the implementation is not conformant to the C standards,
as the writing would not be to the "then current end-of-file".
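
The quoted rule is easy to observe within a single program. The sketch below uses a hypothetical append_demo helper that writes "abc", reopens the file in append mode, seeks to the start, and writes one more character. It only shows the single-process behaviour the standard describes; it says nothing about what happens when another process is writing at the same time.

```c
#include <stdio.h>

/* Hypothetical demo helper. Creates `path` containing "abc", appends 'X'
   after an fseek() to offset 0, then reads the file back into `out`
   (NUL-terminated, at most outsz - 1 bytes). Returns 0 on success, -1 on
   failure. */
int append_demo(const char *path, char *out, size_t outsz)
{
    FILE *fp;
    size_t n;

    if (out == NULL || outsz == 0)
        return -1;

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    fputs("abc", fp);
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "ab");
    if (fp == NULL)
        return -1;
    fseek(fp, 0L, SEEK_SET);    /* try to move to the start of the file */
    fputc('X', fp);             /* append mode: the write goes to the end */
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    n = fread(out, 1, outsz - 1, fp);
    out[n] = '\0';
    fclose(fp);
    return 0;
}
```

With a conforming implementation the file ends up as "abcX", not "Xbc", because the seek does not affect where an append-mode write lands.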

Syren Baran · Jan 3, 2008

Kelsey said:
[snips]

You should skip the "-like" here. Either it's a file with an appropriate
handle or I can obtain the handle via an open with a char*.

Really? So a hard drive is a file. And a sound card is a file. And a
PS/2 connector is a file.

Sure, that's one of the nice and simple things about unices. Once you
have a file handle, how could you tell the difference?

Every one of those can be accessed in a file-like manner. Does this make
them files?

You say file-like again. What is the difference between "file
manner" and "file-like manner"?
Problem is, the term "file" is not well defined.
Is a 1:1 copy of the entire contents of a hard drive a file, e.g. "dd
if=/dev/hdd of=hardrive.backup"?
Is an archive (e.g. zip-file, tar-file) a file or a filesystem? Does it
automagically change its status if an implementation of open accepts
something like "/home/me/archive.zip/folder/somefile"?

Bart C · Jan 3, 2008

I wasn't aware there was much to go wrong. But I will have a look. At
worst it will return the wrong file size; I'll make it return all 0's or
1's or something.

Ugh, in-band error signalling. How do you tell the difference between
a zero-length (or very large) file and an error?

Having an error condition equate to a zero-length file is workable when the
error is likely rare and unimportant.

But yes all 1's is better, and in this case won't clash with the largest
size returnable.

Bart
