how can i find the size of a binary file

Discussion in 'C Programming' started by mark, Nov 11, 2011.

  1. mark

    mark Guest

    thanks for any help
    mark, Nov 11, 2011
    #1

  2. mark

    John Gordon Guest

    In <j9k4i8$tah$> mark <> writes:

    > thanks for any help


    Call fread() in a loop and keep track of how many total bytes were read.
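
A minimal sketch of that fread() loop, assuming the stream is already open in binary mode. The helper name count_bytes and the 4096-byte buffer are illustrative choices, not anything from the standard library:

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes that can be read from an
   already-open binary stream, starting at its current position.
   Returns 0 and stores the count in *total, or -1 on a read error. */
int count_bytes(FILE *fp, unsigned long long *total)
{
    unsigned char buf[4096];   /* buffer size is an arbitrary choice */
    size_t n;

    *total = 0;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        *total += n;

    /* fread() returns 0 both at end of file and on error */
    return ferror(fp) ? -1 : 0;
}
```

This reads every byte, so it is slow for large files, but it uses nothing beyond standard C.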

    --
    John Gordon A is for Amy, who fell down the stairs
    B is for Basil, assaulted by bears
    -- Edward Gorey, "The Gashlycrumb Tinies"
    John Gordon, Nov 11, 2011
    #2

  3. mark

    mark Guest

    > In <j9k4i8$tah$> mark <> writes:
    >
    >> thanks for any help

    >
    > Call fread() in a loop and keep track of how many total bytes were read.


    thanks for ur answer but this will b very inefficent, my question is -
    what is the builtin filesize function in c

    thnx
    mark, Nov 11, 2011
    #3
  4. mark

    jacob navia Guest

    Le 11/11/11 22:37, mark a écrit :
    > thanks for any help


    This function figures out the size of a file, allocates a buffer
    and returns the contents of the file.


    #include <stdio.h>
    #include <stdlib.h>
    char *FileToRam(char *fname)
    {
        FILE *f = fopen(fname,"rb");
        long siz;
        char *result;
        if (f == NULL)
            return NULL;
        /* Position yourself at the end of the file,
           then get the current position. This gives
           you the current position in all systems
           except in the DS 9000 or if the file is
           longer than what a long can hold */
        fseek(f,0,SEEK_END);
        siz = ftell(f);
        fseek(f,0,SEEK_SET);
        /* Now allocate a buffer, fill it and
           return it. */
        result = calloc(1,siz+1);
        if (result) {
            fread(result,1,siz,f);
        }
        fclose(f);
        return result;
    }
    jacob navia, Nov 11, 2011
    #4
  5. mark

    jacob navia Guest

    Le 11/11/11 22:53, mark a écrit :
    >> In<j9k4i8$tah$> mark<> writes:
    >>
    >>> thanks for any help

    >>
    >> Call fread() in a loop and keep track of how many total bytes were read.

    >
    > thanks for ur answer but this will b very inefficent, my question is -
    > what is the builtin filesize function in c
    >
    > thnx


    See my reply in this same thread.
    jacob navia, Nov 11, 2011
    #5
  6. mark <> writes:
    > thanks for any help


    Please include the question in the body of your post.

    "how can i find the size of a binary file"

    There is no reliable way to do this in portable standard C. You can
    read through the file, adding up how many bytes you've read, but
    that's both slow and not 100% reliable. An implementation is allowed
    to treat a binary file as if it had some implementation-defined
    number of null bytes appended to it (C99 7.19.2p3), though I don't
    know of any implementations that actually do that.

    You can open the file (in binary mode), then fseek() to the end of
    it, then use ftell() to get the current position. That's *usually*
    going to be the size of the file, but it's still not 100% portable
    for the reasons stated above. Furthermore, ftell() returns a long
    int; if long int is 32 bits on your system, it's not going to work
    for files that are 2 GiB or bigger.

    Your operating system probably provides a way to get this information
    directly. On Unix-like systems, stat() does this ("man 2 stat"
    for details). On other systems, consult your documentation or ask
    in a system-specific forum.

    This happens to be one of those things that's much easier to do in
    a system-specific way than by using portable C.
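
For the Unix-like case, a minimal sketch using stat() might look like the following. It assumes a POSIX system; the helper name file_size_stat and the choice to reject anything that is not a regular file are illustrative:

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (POSIX only): returns 0 and stores the size in
   *size, or returns -1 if stat() fails or the path does not name a
   regular file. */
int file_size_stat(const char *path, off_t *size)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;              /* errno says why */
    if (!S_ISREG(st.st_mode))
        return -1;              /* st_size is only meaningful for some file types */
    *size = st.st_size;
    return 0;
}
```

The result is still only a snapshot, so the race-condition caveat applies here too.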

    And watch out for race conditions. Whatever method you use will tell you
    the size of the file at the moment when you did the query. The file can
    grow, shrink, or even vanish between that and the time when you try to do
    something with the information.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Nov 11, 2011
    #6
  7. mark

    James Kuyper Guest

    On 11/11/2011 04:53 PM, mark wrote:
    >> In <j9k4i8$tah$> mark <> writes:
    >>
    >>> thanks for any help

    >>
    >> Call fread() in a loop and keep track of how many total bytes were read.

    >
    > thanks for ur answer but this will b very inefficent, my question is -
    > what is the builtin filesize function in c


    There isn't one. That was considered to be too OS-dependent to justify
    standardizing it. For example, on some operating systems, the only thing
    that you can quickly determine is how much space has been allocated to
    store a file; how much of that space has actually been used can only be
    determined by some procedure equivalent to the fread() method given
    above. POSIX provides stat(), lstat(), and fstat(); other OSs provide
    other methods.

    One approach that works on many systems is fseek(file, 0, SEEK_END)
    followed by ftell(file). However, make sure to check for an error return
    from fseek() - "A binary stream need not meaningfully support fseek
    calls with a whence value of SEEK_END." (7.19.9.2p3).
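
A sketch of the fseek()/ftell() approach with the error returns checked. The helper name stream_size is illustrative, and the result is only a byte count on systems where binary streams support SEEK_END meaningfully:

```c
#include <stdio.h>

/* Hypothetical helper: returns the size in bytes of an open binary
   stream, or -1L if any call fails. The original position is restored
   before returning. Limited to what a long can represent. */
long stream_size(FILE *fp)
{
    long start, end;

    start = ftell(fp);                  /* remember where we were */
    if (start == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)   /* may fail or be meaningless */
        return -1L;
    end = ftell(fp);                    /* -1L on failure */
    if (fseek(fp, start, SEEK_SET) != 0)
        return -1L;
    return end;
}
```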

    An extended discussion started the last time someone asked something
    like this. A popular contention was that it's pointless to ask how big a
    file is, because at best, the answer you'll get is how big it was at
    some time in the past; it might be a different size now. That point of
    view has some validity, but it ignores two things:

    1. You might be explicitly looking for the current value of a time
    dependent quantity, such as keeping track of how fast a file is growing.

    2. You might have done something to make sure that the file shouldn't
    change in size. This is extremely common, in my experience. There's
    often only one unprivileged userid currently authorized to change a
    given file. If that userid is mine, it's reasonably safe to assume that
    if I'm not currently changing the file, its size won't change.
    James Kuyper, Nov 11, 2011
    #7
  8. mark <> writes:
    >> In <j9k4i8$tah$> mark <> writes:
    >>
    >>> thanks for any help

    >>
    >> Call fread() in a loop and keep track of how many total bytes were read.

    >
    > thanks for ur answer but this will b very inefficent, my question is -
    > what is the builtin filesize function in c
    >
    > thnx


    I think you mean:

    > Thanks for your answer, but this will be very inefficient. My question is,
    > what is the builtin file size function in C?
    >
    > Thanks.


    If you take the time to spell out words, it will make it easier for the
    rest of us to read what you have to say (especially those for whom
    English is not a first language) and will generally make us more
    inclined to help you.

    In answer to your question, there is none; see my other followup for
    details.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Nov 11, 2011
    #8
  9. mark

    Ben Pfaff Guest

    mark <> writes:

    > thanks for any help


    I'm surprised that no one else has cited the FAQ, so far:

    19.12: How can I find out the size of a file, prior to reading it in?

    A: If the "size of a file" is the number of characters you'll be
    able to read from it in C, it is difficult or impossible to
    determine this number exactly.

    Under Unix, the stat() call will give you an exact answer.
    Several other systems supply a Unix-like stat() which will give
    an approximate answer. You can fseek() to the end and then use
    ftell(), or maybe try fstat(), but these tend to have the same
    sorts of problems: fstat() is not portable, and generally tells
    you the same thing stat() tells you; ftell() is not guaranteed
    to return a byte count except for binary files. Some systems
    provide functions called filesize() or filelength(), but these
    are obviously not portable, either.

    Are you sure you have to determine the file's size in advance?
    Since the most accurate way of determining the size of a file as
    a C program will see it is to open the file and read it, perhaps
    you can rearrange the code to learn the size as it reads.

    References: ISO Sec. 7.9.9.4; H&S Sec. 15.5.1; PCS Sec. 12 p.
    213; POSIX Sec. 5.6.2.
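
The FAQ's suggestion of learning the size while reading can be sketched as below. The helper name read_all, the 4096-byte starting capacity and the doubling strategy are assumptions made for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the rest of a stream into a malloc'd
   buffer that grows as needed. On success returns the buffer and stores
   the byte count in *len; on failure returns NULL. The caller frees
   the result. */
unsigned char *read_all(FILE *fp, size_t *len)
{
    size_t cap = 4096, used = 0, n;
    unsigned char *buf = malloc(cap), *tmp;

    if (buf == NULL)
        return NULL;
    while ((n = fread(buf + used, 1, cap - used, fp)) > 0) {
        used += n;
        if (used == cap) {                      /* buffer full: double it */
            if (cap > (size_t)-1 / 2) { free(buf); return NULL; }
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
        }
    }
    if (ferror(fp)) { free(buf); return NULL; } /* stopped on error, not EOF */
    *len = used;
    return buf;
}
```

No size is needed in advance, and the count in *len is exactly what the program was able to read.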

    --
    char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
    ={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
    =b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
    2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
    Ben Pfaff, Nov 11, 2011
    #9
  10. mark

    Ike Naar Guest

    Ike Naar, Nov 11, 2011
    #10
  11. James Kuyper <> writes:
    [...]
    > An extended discussion started the last time someone asked something
    > like this. A popular contention was that it's pointless to ask how big a
    > file is, because at best, the answer you'll get is how big it was at
    > some time in the past; it might be a different size now. That point of
    > view has some validity, but it ignores two things:
    >
    > 1. You might be explicitly looking for the current value of a time
    > dependent quantity, such as keeping track of how fast a file is growing.
    >
    > 2. You might have done something to make sure that the file shouldn't
    > change in size. This is extremely common, in my experience. There's
    > often only one unprivileged userid currently authorized to change a
    > given file. If that userid is mine, it's reasonably safe to assume that
    > if I'm not currently changing the file, its size won't change.


    Agreed. But even so, your code should probably be robust enough that it
    doesn't blow up in your face if the file size *has* changed.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Nov 11, 2011
    #11
  12. On Fri, 11 Nov 2011 22:59:06 +0100, jacob navia <>
    wrote:

    >Le 11/11/11 22:37, mark a écrit :
    >> thanks for any help

    >
    >This function figures out the size of a file, allocates a buffer
    >and returns the contents of the file.
    >
    >
    >#include <stdio.h>
    >#include <stdlib.h>
    >char *FileToRam(char *fname)
    >{
    > FILE *f = fopen(fname,"rb");
    > long siz;
    > char *result;
    > if (f == NULL)
    > return NULL;
    > /* Position yourself at the end of the file,
    > then get the current position. This gives
    > you the current position in all systems
    > except in the DS 9000 or if the file is


    I wonder how long it took you to test on all the non-DS9000 systems.

    > longer than what a long can hold */
    > fseek(f,0,SEEK_END);


    From 7.19.9.2-3: "A binary stream need not meaningfully support fseek
    calls with a whence value of SEEK_END."

    > siz = ftell(f);
    > fseek(f,0,SEEK_SET);
    > /* Now allocate a buffer, fill it and
    > return it. */
    > result = calloc(1,siz+1);


    Why spend the time initializing a block of memory that will have all
    its bytes immediately replaced with new values?

    Since it is a binary file, what is the value of appending a '\0' at
    the end? It is not likely that the file can be treated as a string.

    > if (result) {
    > fread(result,1,siz,f);


    There is no guarantee that fread will actually read all the bytes
    requested. How can the user determine this?

    > }
    > fclose(f);
    > return result;
    >}


    Just because you could not allocate enough memory to hold the entire
    file does not eliminate the OP's need to know the length of the file.
    But then you never tell him that anyway.
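
On the short-read question, one way a caller can find out how much was actually read is to keep calling fread() and report the total. This is a sketch only; the name read_exact and its error flag are illustrative:

```c
#include <stdio.h>

/* Hypothetical helper: tries to read exactly `want` bytes into buf.
   Returns the number of bytes actually read. *err is set to 1 if the
   stream's error indicator is set afterwards, 0 otherwise (in which
   case a short count means end of file). */
size_t read_exact(FILE *fp, void *buf, size_t want, int *err)
{
    size_t got = 0, n;

    while (got < want &&
           (n = fread((unsigned char *)buf + got, 1, want - got, fp)) > 0)
        got += n;

    *err = ferror(fp) != 0;
    return got;
}
```

Comparing the return value with the size obtained earlier also shows whether the file changed between the size query and the read.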

    --
    Remove del for email
    Barry Schwarz, Nov 12, 2011
    #12
  13. On Fri, 11 Nov 2011 14:03:54 -0800, Keith Thompson <>
    wrote:

    >mark <> writes:
    >> thanks for any help

    >
    >Please include the question in the body of your post.
    >
    >"how can i find the size of a binary file"
    >
    >There is no reliable way to do this in portable standard C. You can
    >read through the file, adding up how many bytes you've read, but
    >that's both slow and not 100% reliable. An implementation is allowed
    >to treat a binary file as if it had some implementation-defined
    >number of null bytes appended to it (C99 7.19.2p3), though I don't
    >know of any implementations that actually do that.
    >
    >You can open the file (in binary mode), then fseek() to the end of
    >it, then use ftell() to get the current position. That's *usually*


    From 7.19.9.2-3: "A binary stream need not meaningfully support fseek
    calls with a whence value of SEEK_END."

    >going to be the size of the file, but it's still not 100% portable
    >for the reasons stated above. Furthermore, ftell() returns a long
    >int; if long int is 32 bits on your system, it's not going to work
    >for files that are 2 GiB or bigger.
    >
    >Your operating system probably provides a way to get this information
    >directly. On Unix-like systems, stat() does this ("man 2 stat"
    >for details). On other systems, consult your documentation or ask
    >in a system-specific forum.
    >
    >This happens to be one of those things that's much easier to do in
    >a system-specific way than by using portable C.
    >
    >And watch out for race conditions. Whatever method you use will tell you
    >the size of the file at the moment when you did the query. The file can
    >grow, shrink, or even vanish between that and the time when you try to do
    >something with the information.


    --
    Remove del for email
    Barry Schwarz, Nov 12, 2011
    #13
  14. On Fri, 11 Nov 2011 14:32:13 -0800, (Ben Pfaff)
    wrote:

    >mark <> writes:
    >
    >> thanks for any help

    >
    >I'm surprised that no one else has cited the FAQ, so far:
    >
    >19.12: How can I find out the size of a file, prior to reading it in?
    >
    >A: If the "size of a file" is the number of characters you'll be
    > able to read from it in C, it is difficult or impossible to
    > determine this number exactly.
    >
    > Under Unix, the stat() call will give you an exact answer.
    > Several other systems supply a Unix-like stat() which will give
    > an approximate answer. You can fseek() to the end and then use
    > ftell(), or maybe try fstat(), but these tend to have the same


    From 7.19.9.2-3: "A binary stream need not meaningfully support fseek
    calls with a whence value of SEEK_END."

    > sorts of problems: fstat() is not portable, and generally tells
    > you the same thing stat() tells you; ftell() is not guaranteed
    > to return a byte count except for binary files. Some systems
    > provide functions called filesize() or filelength(), but these
    > are obviously not portable, either.
    >
    > Are you sure you have to determine the file's size in advance?
    > Since the most accurate way of determining the size of a file as
    > a C program will see it is to open the file and read it, perhaps
    > you can rearrange the code to learn the size as it reads.
    >
    > References: ISO Sec. 7.9.9.4; H&S Sec. 15.5.1; PCS Sec. 12 p.
    > 213; POSIX Sec. 5.6.2.


    --
    Remove del for email
    Barry Schwarz, Nov 12, 2011
    #14
  15. mark

    Phil Carmody Guest

    James Kuyper <> writes:
    > On 11/11/2011 04:53 PM, mark wrote:
    > >> In <j9k4i8$tah$> mark <> writes:
    > >>
    > >>> thanks for any help
    > >>
    > >> Call fread() in a loop and keep track of how many total bytes were read.

    > >
    > > thanks for ur answer but this will b very inefficent, my question is -
    > > what is the builtin filesize function in c

    >
    > There isn't one. That was considered to be too OS-dependent to justify
    > standardizing it. For example, on some operating systems, the only thing
    > that you can quickly determine is how much space has been allocated to
    > store a file; how much of that space has actually been used can only be
    > determined by some procedure equivalent to the fread() method given
    > above. POSIX provides stat(), lstat(), and fstat(); other OSs provide
    > other methods.
    >
    > One approach that works on many systems is fseek(file, 0, SEEK_END)
    > followed by ftell(file). However, make sure to check for an error return
    > from fseek() - "A binary stream need not meaningfully support fseek
    > calls with a whence value of SEEK_END." (7.19.9.2p3).
    >
    > An extended discussion started the last time someone asked something
    > like this. A popular contention was that it's pointless to ask how big a
    > file is, because at best, the answer you'll get is how big it was at
    > some time in the past; it might be a different size now. That point of
    > view has some validity, but it ignores two things:
    >
    > 1. You might be explicitly looking for the current value of a time
    > dependent quantity, such as keeping track of how fast a file is growing.
    >
    > 2. You might have done something to make sure that the file shouldn't
    > change in size. This is extremely common, in my experience. There's
    > often only one unprivileged userid currently authorized to change a
    > given file. If that userid is mine, it's reasonably safe to assume that
    > if I'm not currently changing the file, its size won't change.


    And some day in the future someone might even invent read-only media,
    such that it's physically impossible for the file, and thus its size,
    to be changed.

    Phil
    --
    Unix is simple. It just takes a genius to understand its simplicity
    -- Dennis Ritchie (1941-2011), Unix Co-Creator
    Phil Carmody, Nov 12, 2011
    #15
  16. mark

    Nobody Guest

    On Fri, 11 Nov 2011 14:03:54 -0800, Keith Thompson wrote:

    > An implementation is allowed to treat a
    > binary file as if it had some implementation-defined number of null bytes
    > appended to it (C99 7.19.2p3), though I don't know of any implementations
    > that actually do that.


    CP/M records the size of a file in sectors rather than in bytes. Text
    files are terminated by a ^Z ('\x1a') character (and this behaviour
    was inherited by DOS and then Windows). Binary files need their own
    mechanism for determining where the data ends and the padding begins.
    Nobody, Nov 13, 2011
    #16
  17. mark

    jacob navia Guest

    Le 13/11/11 03:47, Nobody a écrit :
    > On Fri, 11 Nov 2011 14:03:54 -0800, Keith Thompson wrote:
    >
    >> An implementation is allowed to treat a
    >> binary file as if it had some implementation-defined number of null bytes
    >> appended to it (C99 7.19.2p3), though I don't know of any implementations
    >> that actually do that.

    >
    > CP/M records the size of a file in sectors rather than in bytes. Text
    > files are terminated by a ^Z ('\x1a') character (and this behaviour
    > was inherited by DOS and then Windows).


    This is not true at least for the last 20 years for MSDOS
    and windows...

    But well, nothing is bad when fighting the "evil empire",
    sure, not even lies...

    I am in no way tied to Microsoft but it "should" have gotten
    through that this is no longer the case for QUITE a long time.

    The behavior is still there when typing from the console,
    like the Ctrl-D of unix.

    But if I start telling that Unix recognizes end of file when it
    finds a Ctrl-D character I will be flamed (and rightly so).
    jacob navia, Nov 13, 2011
    #17
  18. jacob navia <> writes:

    > Le 13/11/11 03:47, Nobody a écrit :
    >> On Fri, 11 Nov 2011 14:03:54 -0800, Keith Thompson wrote:
    >>
    >>> An implementation is allowed to treat a
    >>> binary file as if it had some implementation-defined number of null bytes
    >>> appended to it (C99 7.19.2p3), though I don't know of any implementations
    >>> that actually do that.

    >>
    >> CP/M records the size of a file in sectors rather than in bytes. Text
    >> files are terminated by a ^Z ('\x1a') character (and this behaviour
    >> was inherited by DOS and then Windows).

    >
    > This is not true at least for the last 20 years for MSDOS
    > and windows...
    >
    > But well, nothing is bad when fighting the "evil empire",
    > sure, not even lies...
    >
    > I am in no way tied to Microsoft but it "should" have gotten
    > through that this is no longer the case for QUITE a long time.


    You know far more about C on Windows than I do, so I'd appreciate your
    input here. I just tried this program with lcc-win32:

    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        FILE *fp;
        if (argc > 1 && (fp = fopen(argv[1], "r")) != NULL) {
            int n = 0;
            while (fgetc(fp) != EOF)
                n++;
            printf("n=%d\n", n);
        }
        return 0;
    }

    and, when given the name of this file as argv[1],

    $ hd data
    00000000 61 62 63 0d 0a 1a 0d 0a 64 65 66 0d 0a |abc.....def..|
    0000000d

    it prints "n=4". I.e. with your C library, fgetc returns EOF when a ^Z
    is seen in this text stream. Is this something to do with my odd setup
    (I'm using a Windows emulator) or is it what you would expect to see?

    > The behavior is still there when typing from the console,
    > like the Ctrl-D of unix.


    It's not really "the Ctrl-D of unix". What character you may type (if
    any) to signal EOF to the tty driver is configurable, so the mechanism
    is quite different from how Windows used to work. What I remember of
    Windows was that the ^Z was simply passed to the running program like any
    other character. Are you saying that this is not what happens on
    Windows anymore?

    <snip>
    --
    Ben.
    Ben Bacarisse, Nov 13, 2011
    #18
  19. mark

    BartC Guest

    I generally use something like this:

    #include <stdio.h>
    #include <stdlib.h>

    long getfilesize(FILE* handle){
        long p,size;

    #if WINDOWS

        p = ftell(handle); /* current position */
        fseek(handle,0,SEEK_END); /* get EOF position */
        size = ftell(handle); /* size in bytes */
        fseek(handle,p,SEEK_SET); /* restore original position */
        return size;

    #else

        puts("Sorry this system doesn't support quick file-size reporting");
        exit(EXIT_FAILURE); /* report failure rather than success */
        return 0;
    #endif
    }

    Notes:

    o I've restricted this to work only for Windows, so WINDOWS must be set to 1
    or 0 somewhere. You might try taking out this check to see what happens.

    o The file must be open to determine the size

    o If the system isn't Windows, or one where these calls will work, you can
    try alternate code such as reading byte-by-byte; that will be slow but it
    could work.

    o The fseek() and such functions return an error code which I haven't
    bothered to check (as I don't have error handling in this function)

    o The functions I used are restricted to the range of 'long' (which I think
    is 2GB in this case); I don't know what happens above 2GB, and don't have
    files that big to test on. However being restricted to Windows, that could
    have 64-bit versions available, as well as specialist functions which are
    part of the OS rather than C.

    o Making use of fstat() is also possible, but that is also frowned on here
    so makes no difference.

    o A file size of course could conceivably change by the time it is acted on.
    But a file can also be deleted between calling fopen() and checking the
    return value. So you can either give up programming right now, or just bear
    these possibilities in mind.

    o If I was interested in making this portable I might use a series of
    #if/#elif checks for a range of platforms with appropriate code for each,
    followed by an #else clause with some default code. But I'm not, and it
    would probably turn out to be impossible anyway. So I don't worry about it.
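
For what the #if/#elif idea in the last note could look like, here is a rough sketch. The helper name portable_file_size and the particular platform tests are assumptions. It uses _filelengthi64() from the Microsoft C runtime on Windows, fstat() on Unix-like systems, and falls back to fseek()/ftell() elsewhere. On the fstat() path an output stream should be flushed first, since unwritten buffered data is not counted:

```c
/* Ask for POSIX declarations (fileno) when compiling in strict ISO mode. */
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdio.h>

#if defined(_WIN32)
#include <io.h>          /* _filelengthi64 */
#elif defined(__unix__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/stat.h>    /* fstat, struct stat */
#endif

/* Hypothetical helper: size of an open file in bytes, or -1 on failure. */
long long portable_file_size(FILE *fp)
{
#if defined(_WIN32)
    return _filelengthi64(_fileno(fp));     /* returns -1 on error */
#elif defined(__unix__) || defined(__APPLE__)
    struct stat st;

    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long long)st.st_size;
#else
    /* Default: fseek/ftell, limited to the range of long. */
    long pos = ftell(fp), end;

    if (pos == -1L || fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    end = ftell(fp);
    if (fseek(fp, pos, SEEK_SET) != 0)
        return -1;
    return end;
#endif
}
```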

    --
    Bartc
    BartC, Nov 13, 2011
    #19
  20. mark

    BartC Guest

    "BartC" <> wrote in message news:j9odfc$umt$...


    > o A file size of course could conceivably change by the time it is acted
    > on.
    > But a file can also be deleted between calling fopen() and checking the
    > return value.


    Actually that might not be possible anymore (on Windows). But I believe
    almost anything else can be done to the file, including deleting the entire
    contents.

    --
    Bartc
    BartC, Nov 13, 2011
    #20