reading data from a file

Discussion in 'C Programming' started by John Smith, Jan 24, 2006.

  1. John Smith

    John Smith Guest

    I want to read data from a file and assign it to a dynamically
    allocated array. I don't know the number of data in advance. My
    approach has been to read the file twice, the first time to
    determine its size, the second for the actual assignment. Is
    there a more efficient way?
     
    John Smith, Jan 24, 2006
    #1

  2. John Smith

    Ico Guest

    John Smith <> wrote:
    > I want to read data from a file and assign it to a dynamically
    > allocated array. I don't know the number of data in advance. My
    > approach has been to read the file twice, the first time to
    > determine its size, the second for the actual assignment. Is
    > there a more efficient way?


    Open the file with fopen(), seek to the end with fseek(), get the
    current position with ftell(). This position is the size of the file.
    Allocate the buffer, go back to the beginning of the file with rewind()
    or fseek(), and use fread() to read the file into the buffer.
    Don't forget to add error checking where needed.
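
A minimal sketch of the steps above, assuming the stream is seekable and that ftell() reports a byte count on this implementation (the C standard guarantees neither). The helper name read_file_seek is hypothetical, and the caller is assumed to have opened the file in binary mode.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read an already-open stream into a malloc'd,
 * NUL-terminated buffer, sizing it with fseek()/ftell().
 * Returns NULL on any failure; the caller frees the result. */
char *read_file_seek(FILE *fp, size_t *out_len)
{
    char *buf = NULL;
    long size;
    size_t got;

    /* Seek to the end and ask for the position. */
    if (fseek(fp, 0L, SEEK_END) != 0)
        return NULL;
    size = ftell(fp);
    if (size < 0)
        return NULL;

    /* Go back to the beginning before reading. */
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return NULL;

    buf = malloc((size_t)size + 1);
    if (buf == NULL)
        return NULL;

    /* fread may return fewer bytes than requested; trust its result. */
    got = fread(buf, 1, (size_t)size, fp);
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }

    buf[got] = '\0';
    *out_len = got;
    return buf;
}
```

A caller would fopen() the file with "rb", pass the stream in, and fclose() it afterwards, checking each call for failure.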

    --
    :wq
    ^X^Cy^K^X^C^C^C^C
     
    Ico, Jan 24, 2006
    #2

  3. John Smith

    Flash Gordon Guest

    John Smith wrote:
    > I want to read data from a file and assign it to a dynamically allocated
    > array. I don't know the number of data in advance. My approach has been
    > to read the file twice, the first time to determine its size, the second
    > for the actual assignment. Is there a more efficient way?


    That depends on what is efficient on your implementation. One possible
    way is to use realloc to increase the allocated space as you read the
    file; however, you have to decide on an appropriate resizing algorithm,
    since growing it one byte at a time is unlikely to be efficient.
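
A minimal sketch of the realloc approach, assuming a doubling strategy and a 4096-byte starting size (both arbitrary choices, not something specified in this thread). The name read_stream_grow is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything from fp into a growing buffer.
 * The capacity doubles whenever the buffer fills up, so the number of
 * realloc calls grows only logarithmically with the input size.
 * Returns NULL on failure; the caller frees the result. */
char *read_stream_grow(FILE *fp, size_t *out_len)
{
    size_t cap = 4096;   /* assumed initial capacity */
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    for (;;) {
        size_t n = fread(buf + len, 1, cap - len, fp);
        len += n;

        /* A short read means end of file or an error. */
        if (len < cap)
            break;

        /* Buffer is full: double it, guarding against overflow. */
        if (cap > (size_t)-1 / 2) {
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);   /* realloc leaves the old block intact */
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }

    *out_len = len;
    return buf;
}
```

Because this never seeks, the same function should work on any readable stream, and the file is read only once.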
    --
    Flash Gordon
    Living in interesting times.
    Although my email address says spam, it is real and I read it.
     
    Flash Gordon, Jan 24, 2006
    #3
  4. Ico wrote:

    > John Smith <> wrote:
    >> I want to read data from a file and assign it to a dynamically
    >> allocated array. I don't know the number of data in advance. My
    >> approach has been to read the file twice, the first time to
    >> determine its size, the second for the actual assignment. Is
    >> there a more efficient way?

    >
    > Open the file with fopen(), seek to the end with fseek(), get the
    > current position with ftell(). This position is the size of the file.
    > Allocate the buffer, go back to the beginning of the file with
    > rewind() or fseek(), and use fread() to read the file into the buffer.


    This seems to refer to the case where the whole file is sucked into
    memory at once. The following may be a better, more flexible idea.

    Presumably, data in the file is organised in some sort of records. You
    can read one record at a time, get its size (if you know their size in
    advance, so much the better), allocate the memory you need, and store
    the record into the newly allocated memory. This approach would also
    allow you to have a file with records of variable size. Your dynamically
    allocated array would in fact be an array of pointers to dynamically
    allocated memory for individual elements. Some would call it a
    collection. ;-)
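
A rough sketch of such a collection, assuming one record per text line and that each line is shorter than MAX_RECORD bytes (longer lines would be split into several records). The names struct records, read_records and free_records are hypothetical, chosen only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RECORD 1024   /* assumed upper bound on one record */

/* An array of pointers to individually allocated records. */
struct records {
    char **items;
    size_t count;
};

void free_records(struct records *r)
{
    for (size_t i = 0; i < r->count; i++)
        free(r->items[i]);
    free(r->items);
    r->items = NULL;
    r->count = 0;
}

/* Read one record per line. Returns 0 on success, -1 on failure. */
int read_records(FILE *fp, struct records *out)
{
    char line[MAX_RECORD];
    size_t cap = 0;

    out->items = NULL;
    out->count = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len = strcspn(line, "\n");   /* length without newline */
        line[len] = '\0';

        /* Grow the pointer array when it is full. */
        if (out->count == cap) {
            size_t new_cap = cap ? cap * 2 : 16;
            char **tmp = realloc(out->items, new_cap * sizeof *tmp);
            if (tmp == NULL)
                goto fail;
            out->items = tmp;
            cap = new_cap;
        }

        /* Each record gets exactly the memory it needs. */
        out->items[out->count] = malloc(len + 1);
        if (out->items[out->count] == NULL)
            goto fail;
        memcpy(out->items[out->count], line, len + 1);
        out->count++;
    }

    if (ferror(fp))
        goto fail;
    return 0;

fail:
    free_records(out);
    return -1;
}
```

Records of a fixed, known size would simplify this: the per-record malloc could use that size directly and fread could replace fgets.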

    > Don't forget to add error checking where needed.


    This, of course, is mandatory! ;-)

    Cheers

    Vladimir

    --
    Real computer scientists don't comment their code. The identifiers are
    so long they can't afford the disk space.
     
    Vladimir S. Oka, Jan 24, 2006
    #4
  5. John Smith

    Flash Gordon Guest

    Ico wrote:
    > John Smith <> wrote:
    >> I want to read data from a file and assign it to a dynamically
    >> allocated array. I don't know the number of data in advance. My
    >> approach has been to read the file twice, the first time to
    >> determine its size, the second for the actual assignment. Is
    >> there a more efficient way?

    >
    > Open the file with fopen(), seek to the end with fseek(),


    To quote from the standard, "A binary stream need not
    meaningfully support fseek calls with a whence value of SEEK_END." So
    that part of your suggestion is not portable if this is a binary file.

    > get the
    > current position with ftell().


    Then, quoting from the section of the standard defining ftell, we have,
    "For a text stream, its file position indicator contains unspecified
    information, usable by the fseek..." so on a text stream it may not be
    the file size, it is just some number fseek can use.

    > This position is the size of the file.


    So your suggestion is not guaranteed to work for either binary or text
    streams.

    > Allocate the buffer, go back to the beginning of the file with rewind()
    > or fseek(), and use fread() to read the file into the buffer.
    > Don't forget to add error checking where needed.


    I seriously think the realloc method is better in terms of portability.
    It will also then cope with any non-seekable stream, such as named pipes
    on systems that support such things.
    --
    Flash Gordon
    Living in interesting times.
    Although my email address says spam, it is real and I read it.
     
    Flash Gordon, Jan 24, 2006
    #5
  6. John Smith

    Jordan Abel Guest

    On 2006-01-24, Flash Gordon <> wrote:
    > Ico wrote:
    >> John Smith <> wrote:
    >>> I want to read data from a file and assign it to a dynamically
    >>> allocated array. I don't know the number of data in advance. My
    >>> approach has been to read the file twice, the first time to
    >>> determine its size, the second for the actual assignment. Is
    >>> there a more efficient way?

    >>
    >> Open the file with fopen(), seek to the end with fseek(),

    >
    > To quote from the standard, "A binary stream need not
    > meaningfully support fseek calls with a whence value of SEEK_END." So
    > that part of your suggestion is not portable if this is a binary file.


    Does a text stream need to? You can't seek at all on stdout/stdin on
    some implementations.

    >
    > > get the
    >> current position with ftell().

    >
    > The, quoting from the section of the standard defining ftell we have,
    > "For a text stream, its file position indicator contains unspecified
    > information, usable by the fseek..." so on a text stream it may not be
    > the file size, it is just some number fseek can use.
    >
    > > This position is the size of the file.

    >
    > So your suggestion is not guaranteed to work for either binary or text
    > streams.
    >
    >> Allocate the buffer, go back to the beginning of the file with rewind()
    >> or fseek(), and use fread() to read the file into the buffer.
    >> Don't forget to add error checking where needed.

    >
    > I seriously think the realloc method is better in terms of portability.
    > It will also then cope with any non-seekable stream, such as named pipes
    > on systems that support such things.
     
    Jordan Abel, Jan 24, 2006
    #6
  7. John Smith

    Flash Gordon Guest

    Jordan Abel wrote:
    > On 2006-01-24, Flash Gordon <> wrote:
    >> Ico wrote:
    >>> John Smith <> wrote:
    >>>> I want to read data from a file and assign it to a dynamically
    >>>> allocated array. I don't know the number of data in advance. My
    >>>> approach has been to read the file twice, the first time to
    >>>> determine its size, the second for the actual assignment. Is
    >>>> there a more efficient way?
    >>> Open the file with fopen(), seek to the end with fseek(),

    >> To quote from the standard, "A binary stream need not
    >> meaningfully support fseek calls with a whence value of SEEK_END." So
    >> that part of your suggestion is not portable if this is a binary file.

    >
    > does a text stream need to? you can't seek at all on stdout/stdin on
    > some implementations


    <snip>

    The call is allowed to fail, as with any library call, and is likely to
    fail if used on stdin/stdout. However, the quote I gave reads to me that
    even if a binary stream is seekable you can't rely on being able to seek
    to its end.

    There have been several discussions on here about finding a file's size
    where all this has been mentioned before. I was just pointing out two
    parts of the standard which explicitly make what was suggested non-portable.
    --
    Flash Gordon
    Living in interesting times.
    Although my email address says spam, it is real and I read it.
     
    Flash Gordon, Jan 25, 2006
    #7
  8. John Smith

    Ico Guest

    Flash Gordon <> wrote:
    > Ico wrote:
    >> John Smith <> wrote:
    >>> I want to read data from a file and assign it to a dynamically
    >>> allocated array. I don't know the number of data in advance. My
    >>> approach has been to read the file twice, the first time to
    >>> determine its size, the second for the actual assignment. Is
    >>> there a more efficient way?

    >>
    >> Open the file with fopen(), seek to the end with fseek(),

    >
    > To quote from the standard, "A binary stream need not
    > meaningfully support fseek calls with a whence value of SEEK_END." So
    > that part of your suggestion is not portable if this is a binary file.


    I wasn't aware of that.

    >> get the current position with ftell().

    >
    > The, quoting from the section of the standard defining ftell we have,
    > "For a text stream, its file position indicator contains unspecified
    > information, usable by the fseek..." so on a text stream it may not be
    > the file size, it is just some number fseek can use.


    And didn't know that either.

    Thanks for pointing that out, yet another educational moment for me on
    c.l.c.


    --
    :wq
    ^X^Cy^K^X^C^C^C^C
     
    Ico, Jan 25, 2006
    #8