how can i find the size of a binary file


mark

In said:
Call fread() in a loop and keep track of how many total bytes were read.

thanks for ur answer but this will b very inefficent, my question is -
what is the builtin filesize function in c

thnx
 

jacob navia

On 11/11/11 22:37, mark wrote:
thanks for any help

This function figures out the size of a file, allocates a buffer
and returns the contents of the file.


#include <stdio.h>
#include <stdlib.h>
char *FileToRam(char *fname)
{
    FILE *f = fopen(fname,"rb");
    long siz;
    char *result;
    if (f == NULL)
        return NULL;
    /* Position yourself at the end of the file,
       then get the current position. This gives
       you the current position in all systems
       except in the DS 9000 or if the file is
       longer than what a long can hold */
    fseek(f,0,SEEK_END);
    siz = ftell(f);
    fseek(f,0,SEEK_SET);
    /* Now allocate a buffer, fill it and
       return it. */
    result = calloc(1,siz+1);
    if (result) {
        fread(result,1,siz,f);
    }
    fclose(f);
    return result;
}
 

jacob navia

On 11/11/11 22:53, mark wrote:
thanks for ur answer but this will b very inefficent, my question is -
what is the builtin filesize function in c

thnx

See my reply in this same thread.
 

Keith Thompson

mark said:
thanks for any help

Please include the question in the body of your post.

"how can i find the size of a binary file"

There is no reliable way to do this in portable standard C. You can
read through the file, adding up how many bytes you've read, but
that's both slow and not 100% reliable. An implementation is allowed
to treat a binary file as if it had some implementation-defined
number of null bytes appended to it (C99 7.19.2p3), though I don't
know of any implementations that actually do that.
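A minimal sketch of the read-and-count approach described above. The name count_bytes and the 4096-byte buffer are illustrative assumptions, not anything from the standard library. The result is the number of bytes a C program can actually read from the file, so it includes any padding the implementation adds.

```c
#include <stdio.h>

/* Counts the bytes readable from a file opened in binary mode.
   Returns 0 and stores the count in *out on success; returns -1 if
   the file cannot be opened or a read error occurs.
   count_bytes is a hypothetical helper name. */
int count_bytes(const char *path, unsigned long long *out)
{
    unsigned char buf[4096];          /* arbitrary chunk size */
    unsigned long long total = 0;
    size_t n;
    int failed;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    /* fread returns 0 only at end-of-file or on a read error */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += n;
    failed = ferror(fp);              /* tell the two cases apart */
    fclose(fp);
    if (failed)
        return -1;
    *out = total;
    return 0;
}
```

This reads the whole file, so it costs time proportional to the file size. That is the inefficiency the original poster objected to.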

You can open the file (in binary mode), then fseek() to the end of
it, then use ftell() to get the current position. That's *usually*
going to be the size of the file, but it's still not 100% portable
for the reasons stated above. Furthermore, ftell() returns a long
int; if long int is 32 bits on your system, it's not going to work
for files that are 2 GiB or bigger.

Your operating system probably provides a way to get this information
directly. On Unix-like systems, stat() does this ("man 2 stat"
for details). On other systems, consult your documentation or ask
in a system-specific forum.
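For a Unix-like system, a stat()-based query might look like the sketch below. It assumes a POSIX environment. The name stat_file_size and the decision to accept only regular files are illustrative choices.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* POSIX only. Returns 0 and stores the file size in bytes in *out,
   or returns -1 if stat() fails or the path is not a regular file.
   stat_file_size is a hypothetical helper name. */
int stat_file_size(const char *path, off_t *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;                    /* errno describes the failure */
    if (!S_ISREG(st.st_mode))
        return -1;                    /* this sketch handles regular files only */
    *out = st.st_size;
    return 0;
}
```

The size comes back as an off_t, so it is not limited to the range of long on systems where off_t is wider.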

This happens to be one of those things that's much easier to do in
a system-specific way than by using portable C.

And watch out for race conditions. Whatever method you use will tell you
the size of the file at the moment when you did the query. The file can
grow, shrink, or even vanish between that and the time when you try to do
something with the information.
 

James Kuyper

thanks for ur answer but this will b very inefficent, my question is -
what is the builtin filesize function in c

There isn't one. That was considered to be too OS-dependent to justify
standardizing it. For example, on some operating systems, the only thing
that you can quickly determine is how much space has been allocated to
store a file; how much of that space has actually been used can only be
determined by some procedure equivalent to the fread() method given
above. POSIX provides stat(), lstat(), and fstat(); other OSs provide
other methods.

One approach that works on many systems is fseek(file, 0, SEEK_END)
followed by ftell(file). However, make sure to check for an error return
from fseek() - "A binary stream need not meaningfully support fseek
calls with a whence value of SEEK_END." (7.19.9.2p3).
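A sketch of the fseek()/ftell() approach with the error checks mentioned above. The name stream_size is an illustrative assumption. The stream is assumed to have been opened in binary mode, and the result is limited to what a long can represent.

```c
#include <stdio.h>

/* Returns the end-of-file position of a binary stream, or -1L if any
   call fails. The original position is restored before returning.
   stream_size is a hypothetical helper name. */
long stream_size(FILE *fp)
{
    long start, end;

    start = ftell(fp);                /* remember where we were */
    if (start == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0) /* may be unsupported on some systems */
        return -1L;
    end = ftell(fp);                  /* -1L here also signals failure */
    if (fseek(fp, start, SEEK_SET) != 0)
        return -1L;
    return end;
}
```

A successful return only shows that the calls did not report an error. On an implementation that pads binary files, the value can still exceed the number of bytes originally written.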

An extended discussion started the last time someone asked something
like this. A popular contention was that it's pointless to ask how big a
file is, because at best, the answer you'll get is how big it was at
some time in the past; it might be a different size now. That point of
view has some validity, but it ignores two things:

1. You might be explicitly looking for the current value of a time
dependent quantity, such as keeping track of how fast a file is growing.

2. You might have done something to make sure that the file shouldn't
change in size. This is extremely common, in my experience. There's
often only one unprivileged userid currently authorized to change a
given file. If that userid is mine, it's reasonably safe to assume that
if I'm not currently changing the file, its size won't change.
 

Keith Thompson

mark said:
thanks for ur answer but this will b very inefficent, my question is -
what is the builtin filesize function in c

thnx

I think you mean:
Thanks for your answer, but this will be very inefficient. My question is,
what is the builtin file size function in C?

Thanks.

If you take the time to spell out words, it will make it easier for the
rest of us to read what you have to say (especially those for whom
English is not a first language) and will generally make us more
inclined to help you.

In answer to your question, there is none; see my other followup for
details.
 

Ben Pfaff

mark said:
thanks for any help

I'm surprised that no one else has cited the FAQ, so far:

19.12: How can I find out the size of a file, prior to reading it in?

A: If the "size of a file" is the number of characters you'll be
able to read from it in C, it is difficult or impossible to
determine this number exactly.

Under Unix, the stat() call will give you an exact answer.
Several other systems supply a Unix-like stat() which will give
an approximate answer. You can fseek() to the end and then use
ftell(), or maybe try fstat(), but these tend to have the same
sorts of problems: fstat() is not portable, and generally tells
you the same thing stat() tells you; ftell() is not guaranteed
to return a byte count except for binary files. Some systems
provide functions called filesize() or filelength(), but these
are obviously not portable, either.

Are you sure you have to determine the file's size in advance?
Since the most accurate way of determining the size of a file as
a C program will see it is to open the file and read it, perhaps
you can rearrange the code to learn the size as it reads.

References: ISO Sec. 7.9.9.4; H&S Sec. 15.5.1; PCS Sec. 12 p.
213; POSIX Sec. 5.6.2.
 

Keith Thompson

James Kuyper said:
An extended discussion started the last time someone asked something
like this. A popular contention was that it's pointless to ask how big a
file is, because at best, the answer you'll get is how big it was at
some time in the past; it might be a different size now. That point of
view has some validity, but it ignores two things:

1. You might be explicitly looking for the current value of a time
dependent quantity, such as keeping track of how fast a file is growing.

2. You might have done something to make sure that the file shouldn't
change in size. This is extremely common, in my experience. There's
often only one unprivileged userid currently authorized to change a
given file. If that userid is mine, it's reasonably safe to assume that
if I'm not currently changing the file, its size won't change.

Agreed. But even so, your code should probably be robust enough that it
doesn't blow up in your face if the file size *has* changed.
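One way to get that robustness is to avoid trusting any previously obtained size and grow the buffer while reading. The sketch below is only an illustration. The name read_all, the initial 4096-byte capacity, and the doubling strategy are all assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads everything remaining in a binary stream into a malloc'd buffer
   that grows as needed. On success returns the buffer (caller frees)
   and stores the byte count in *len; returns NULL on a read error or
   allocation failure. read_all is a hypothetical helper name. */
unsigned char *read_all(FILE *fp, size_t *len)
{
    size_t cap = 4096, used = 0, n;
    unsigned char *buf = malloc(cap), *tmp;

    if (buf == NULL)
        return NULL;
    for (;;) {
        n = fread(buf + used, 1, cap - used, fp);
        used += n;
        if (used < cap)
            break;                    /* short read: end-of-file or error */
        if (cap > (size_t)-1 / 2) {   /* doubling would overflow size_t */
            free(buf);
            return NULL;
        }
        tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

Because the length is discovered while reading, it does not matter whether the file grew or shrank after an earlier size query.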
 

Barry Schwarz

On 11/11/11 22:37, mark wrote:

This function figures out the size of a file, allocates a buffer
and returns the contents of the file.


#include <stdio.h>
#include <stdlib.h>
char *FileToRam(char *fname)
{
FILE *f = fopen(fname,"rb");
long siz;
char *result;
if (f == NULL)
return NULL;
/* Position yourself at the end of the file,
then get the current position. This gives
you the current position in all systems
except in the DS 9000 or if the file is

I wonder how long it took you to test on all the non-DS9000 systems.
longer than what a long can hold */
fseek(f,0,SEEK_END);

From 7.19.9.2-3: "A binary stream need not meaningfully support fseek
calls with a whence value of SEEK_END."
siz = ftell(f);
fseek(f,0,SEEK_SET);
/* Now allocate a buffer, fill it and
return it. */
result = calloc(1,siz+1);

Why spend the time initializing a block of memory that will have all
its bytes immediately replaced with new values?

Since it is a binary file, what is the value of appending a '\0' at
the end? It is not likely that the file can be treated as a string.
if (result) {
fread(result,1,siz,f);

There is no guarantee that fread will actually read all the bytes
requested. How can the user determine this?
}
fclose(f);
return result;
}

Just because you could not allocate enough memory to hold the entire
file does not eliminate the OP's need to know the length of the file.
But then you never tell him that anyway.
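One possible revision that addresses the points raised above is sketched here. It is not the original author's code. The name load_file and the len out-parameter are illustrative assumptions. It checks fseek() and ftell(), uses malloc() without an extra terminator byte, and reports how many bytes fread() actually delivered.

```c
#include <stdio.h>
#include <stdlib.h>

/* Loads a binary file into a malloc'd buffer (caller frees).
   *len receives the size reported by ftell() as soon as it is known,
   and the number of bytes actually read once the read succeeds.
   Returns NULL on any failure; *len is 0 if the size was never
   obtained. load_file is a hypothetical helper name. */
unsigned char *load_file(const char *fname, size_t *len)
{
    FILE *f = fopen(fname, "rb");
    unsigned char *buf = NULL;
    long siz;

    *len = 0;
    if (f == NULL)
        return NULL;
    if (fseek(f, 0L, SEEK_END) != 0 || (siz = ftell(f)) < 0 ||
        fseek(f, 0L, SEEK_SET) != 0)
        goto done;
    *len = (size_t)siz;               /* known even if malloc fails below */
    buf = malloc(*len > 0 ? *len : 1);/* malloc(0) may return NULL */
    if (buf == NULL)
        goto done;
    *len = fread(buf, 1, *len, f);    /* may be short if the file shrank */
    if (ferror(f)) {
        free(buf);
        buf = NULL;
    }
done:
    fclose(f);
    return buf;
}
```

The caveats quoted from 7.19.9.2 still apply: on a system where SEEK_END is not meaningfully supported for binary streams, the size obtained this way cannot be trusted.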
 

Barry Schwarz

Please include the question in the body of your post.

"how can i find the size of a binary file"

There is no reliable way to do this in portable standard C. You can
read through the file, adding up how many bytes you've read, but
that's both slow and not 100% reliable. An implementation is allowed
to treat a binary file as if it had some implementation-defined
number of null bytes appended to it (C99 7.19.2p3), though I don't
know of any implementations that actually do that.

You can open the file (in binary mode), then fseek() to the end of
it, then use ftell() to get the current position. That's *usually*

From 7.19.9.2-3: "A binary stream need not meaningfully support fseek
calls with a whence value of SEEK_END."
 

Barry Schwarz

I'm surprised that no one else has cited the FAQ, so far:

19.12: How can I find out the size of a file, prior to reading it in?

A: If the "size of a file" is the number of characters you'll be
able to read from it in C, it is difficult or impossible to
determine this number exactly.

Under Unix, the stat() call will give you an exact answer.
Several other systems supply a Unix-like stat() which will give
an approximate answer. You can fseek() to the end and then use
ftell(), or maybe try fstat(), but these tend to have the same

From 7.19.9.2-3: "A binary stream need not meaningfully support fseek
calls with a whence value of SEEK_END."
 

Phil Carmody

James Kuyper said:
There isn't one. That was considered to be too OS-dependent to justify
standardizing it. For example, on some operating systems, the only thing
that you can quickly determine is how much space has been allocated to
store a file; how much of that space has actually been used can only be
determined by some procedure equivalent to the fread() method given
above. POSIX provides stat(), lstat(), and fstat(); other OSs provide
other methods.

One approach that works on many systems is fseek(file, 0, SEEK_END)
followed by ftell(file). However, make sure to check for an error return
from fseek() - "A binary stream need not meaningfully support fseek
calls with a whence value of SEEK_END." (7.19.9.2p3).

An extended discussion started the last time someone asked something
like this. A popular contention was that it's pointless to ask how big a
file is, because at best, the answer you'll get is how big it was at
some time in the past; it might be a different size now. That point of
view has some validity, but it ignores two things:

1. You might be explicitly looking for the current value of a time
dependent quantity, such as keeping track of how fast a file is growing.

2. You might have done something to make sure that the file shouldn't
change in size. This is extremely common, in my experience. There's
often only one unprivileged userid currently authorized to change a
given file. If that userid is mine, it's reasonably safe to assume that
if I'm not currently changing the file, its size won't change.

And some day in the future someone might even invent read-only media,
such that it's physically impossible for the file, and thus its size,
to be changed.

Phil
 

Nobody

An implementation is allowed to treat a
binary file as if it had some implementation-defined number of null bytes
appended to it (C99 7.19.2p3), though I don't know of any implementations
that actually do that.

CP/M records the size of a file in sectors rather than in bytes. Text
files are terminated by a ^Z ('\x1a') character (and this behaviour
was inherited by DOS and then Windows). Binary files need their own
mechanism for determining where the data ends and the padding begins.
 

jacob navia

On 13/11/11 03:47, Nobody wrote:
CP/M records the size of a file in sectors rather than in bytes. Text
files are terminated by a ^Z ('\x1a') character (and this behaviour
was inherited by DOS and then Windows).

This is not true at least for the last 20 years for MSDOS
and windows...

But well, nothing is bad when fighting the "evil empire",
sure, not even lies...

I am in no way tied to Microsoft but it "should" have gotten
through that this is no longer the case for QUITE a long time.

The behavior is still there when typing from the console,
like the Ctrl-D of unix.

But if I start telling that Unix recognizes end of file when it
finds a Ctrl-D character I will be flamed (and rightly so).
 

Ben Bacarisse

jacob navia said:
On 13/11/11 03:47, Nobody wrote:

This is not true at least for the last 20 years for MSDOS
and windows...

But well, nothing is bad when fighting the "evil empire",
sure, not even lies...

I am in no way tied to Microsoft but it "should" have gotten
through that this is no longer the case for QUITE a long time.

You know far more about C on Windows than I do, so I'd appreciate your
input here. I just tried this program with lcc-win32:

#include <stdio.h>

int main(int argc, char *argv[])
{
    FILE *fp;
    if (argc > 1 && (fp = fopen(argv[1], "r")) != NULL) {
        int n = 0;
        while (fgetc(fp) != EOF)
            n++;
        printf("n=%d\n", n);
    }
    return 0;
}

and, when given the name of this file as argv[1],

$ hd data
00000000 61 62 63 0d 0a 1a 0d 0a 64 65 66 0d 0a |abc.....def..|
0000000d

it prints "n=4". I.e. with your C library, fgetc returns EOF when a ^Z
is seen in this text stream. Is this something to do with my odd setup
(I'm using a Windows emulator) or is it what you would expect to see?
The behavior is still there when typing from the console,
like the Ctrl-D of unix.

It's not really "the Ctrl-D of unix". What character you may type (if
any) to signal EOF to the tty driver is configurable, so the mechanism
is quite different from how Windows used to work. What I remember of
Windows was that the ^Z was simply passed to the running program like any
other character. Are you saying that this is not what happens on
Windows anymore?
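The text-mode count reported earlier in this post can be compared with a binary-mode count of the same file. The helper below is an illustrative sketch. The name count_chars is an assumption, and the mode string is passed straight to fopen().

```c
#include <stdio.h>

/* Counts how many characters fgetc() delivers before EOF when the file
   is opened with the given fopen() mode ("r" or "rb").
   Returns -1 if the file cannot be opened.
   count_chars is a hypothetical helper name. */
long count_chars(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    long n = 0;

    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

Calling count_chars(path, "rb") on the 13-byte file shown in the hex dump should give 13 on typical systems. On Unix-like systems count_chars(path, "r") gives the same number, since text and binary streams behave identically there. An implementation that translates line endings or stops at ^Z in text mode can return a smaller value, as the n=4 result above suggests.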

<snip>
 

BartC

I generally use something like this:

#include <stdio.h>
#include <stdlib.h>

long getfilesize(FILE* handle){
    long p,size;

#if WINDOWS

    p = ftell(handle); /* current position */
    fseek(handle,0,SEEK_END); /* get EOF position */
    size = ftell(handle); /* size in bytes */
    fseek(handle,p,SEEK_SET); /* restore original position */
    return size;

#else

    puts("Sorry this system doesn't support quick file-size reporting");
    exit(0);
    return 0;
#endif
}

Notes:

o I've restricted this to work only for Windows, so WINDOWS must be set to 1
or 0 somewhere. You might try taking out this check to see what happens.

o The file must be open to determine the size

o If the system isn't Windows, or one where these calls will work, you can
try alternate code such as reading byte-by-byte; that will be slow but it
could work.

o The fseek() and such functions return an error code which I haven't
bothered to check (as I don't have error handling in this function)

o The functions I used are restricted to the range of 'long' (which I think
is 2GB in this case); I don't know what happens above 2GB, and don't have
files that big to test on. However being restricted to Windows, that could
have 64-bit versions available, as well as specialist functions which are
part of the OS rather than C.

o Making use of fstat() is also possible, but that is also frowned on here
so makes no difference.

o A file size of course could conceivably change by the time it is acted on.
But a file can also be deleted between calling fopen() and checking the
return value. So you can either give up programming right now, or just bear
these possibilities in mind.

o If I was interested in making this portable I might use a series of
#if/#elif checks for a range of platforms with appropriate code for each,
followed by an #else clause with some default code. But I'm not, and it
would probably turn out to be impossible anyway. So I don't worry about it.
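The #if/#elif arrangement mentioned in the last note could be sketched roughly as follows. This is an illustration under stated assumptions rather than a tested portable solution. The name portable_file_size is hypothetical, the Windows branch assumes the Microsoft C runtime's _filelengthi64() and _fileno(), the Unix branch assumes POSIX fstat() and fileno(), and everything else falls back to fseek()/ftell().

```c
#include <stdio.h>

#if defined(_WIN32)
#include <io.h>                       /* _filelengthi64 (Microsoft CRT) */
#elif defined(__unix__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/stat.h>                 /* fstat (POSIX) */
#endif

/* Returns the size in bytes of the file behind an open stream, or -1
   on failure. portable_file_size is a hypothetical helper name. */
long long portable_file_size(FILE *fp)
{
#if defined(_WIN32)
    return _filelengthi64(_fileno(fp));
#elif defined(__unix__) || defined(__APPLE__)
    struct stat st;

    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long long)st.st_size;
#else
    /* Default: standard C only, limited to the range of long */
    long start = ftell(fp), end;

    if (start == -1L || fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    end = ftell(fp);
    if (fseek(fp, start, SEEK_SET) != 0)
        return -1;
    return end;
#endif
}
```

The first two branches ask the operating system, so data still sitting in a write stream's buffer is not counted until it has been flushed.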
 

BartC

o A file size of course could conceivably change by the time it is acted
on.
But a file can also be deleted between calling fopen() and checking the
return value.

Actually that might not be possible anymore (on Windows). But I believe
almost anything else can be done to the file, including deleting the entire
contents.
 
