fseek on a file opened with _popen

Discussion in 'C Programming' started by thomas.mertes@gmx.at, Feb 28, 2008.

  1. Guest

    Hello

    Recently I discovered some problem. I have some C code
    which determines how many bytes are available in a
    file. Later I use this information to malloc a buffer of
    the correct size before I read the bytes.
    Determining the number of bytes available in a
    file is done in 5 steps:

    1. Use tell(aFile) to get the current position.
    2. Use fseek(aFile, 0, SEEK_END) to move to the end.
    3. Get the current position with tell(aFile) (this is the
    size of the file in bytes).
    4. I move to the position which I got in step 1 with fseek().
    5. Subtract the current position from the file size to
    get the number of bytes available.
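
    The five steps above can be sketched as follows. This is only a
    minimal illustration: the helper name bytes_available is hypothetical,
    ftell() is used where the post says tell(), and the sketch assumes a
    binary stream on a platform where ftell() returns byte offsets.

```c
#include <stdio.h>

/* Hypothetical helper: number of bytes between the current position and
   the end of the stream, or -1 if that cannot be determined. */
long bytes_available(FILE *aFile)
{
    long current_pos;
    long end_pos;

    current_pos = ftell(aFile);                       /* step 1 */
    if (current_pos == -1L) {
        return -1L;                                   /* not seekable */
    }
    if (fseek(aFile, 0L, SEEK_END) != 0) {            /* step 2 */
        return -1L;
    }
    end_pos = ftell(aFile);                           /* step 3 */
    if (fseek(aFile, current_pos, SEEK_SET) != 0) {   /* step 4 */
        return -1L;
    }
    if (end_pos == -1L) {
        return -1L;
    }
    return end_pos - current_pos;                     /* step 5 */
}
```

    A return value of -1 is the signal to fall back to reading in blocks.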

    This code is certainly not the most elegant solution but
    it is portable. The code works for normal files under
    windows and linux. The portability is also the reason
    why I use tell() and fseek() instead of windows specific
    code.

    When I open a file with _popen I get a different result:
    - Under linux the tell() of step 1 returns -1 which means
    the file is not seekable. I can recognize this situation
    and react accordingly (I cannot malloc the buffer beforehand.
    Instead I malloc a smaller buffer which is realloced until
    all bytes are read).
    - Under windows the tell() of step 1 returns 0 which
    means the file is seekable and is currently at position 0.
    The other calls of fseek() and ftell() succeed also and
    indicate that the number of available bytes is 0.
    Therefore my program thinks that there are no bytes
    available in the file opened with _popen.

    The information that it is a file opened with _popen is
    not available at that place in my program.

    Now my question:
    Is it possible to find out that a file (available in a
    variable of type FILE * ) was opened with _popen?

    Something like: Turn the FILE * into a handle and ask a
    function about the file type. It is no problem for me to
    insert windows specific code under an #ifdef

    Thanks in advance Thomas Mertes

    Seed7 Homepage: http://seed7.sourceforge.net
    Seed7 - The extensible programming language: User defined statements
    and operators, abstract data types, templates without special
    syntax, OO with interfaces and multiple dispatch, statically typed,
    interpreted or compiled, portable, runs under linux/unix/windows.
     
    , Feb 28, 2008
    #1

  2. Mark Bluemel Guest

    wrote:
    > Hello
    >
    > Recently I discovered some problem. I have some C code
    > which determines how many bytes are available in a
    > file.


    [snip]

    > The code works for normal files under
    > windows and linux.


    [snip]

    > When I open a file with _popen I get a different result:
    > - Under linux the tell() of step 1 returns -1 which means
    > the file is not seekable. I can recognize this situation
    > and react accordingly (I cannot malloc the buffer beforehand.
    > Instead I malloc a smaller buffer which is realloced until
    > all bytes are read).


    I hope you don't grow it a byte at a time :)

    > - Under windows the tell() of step 1 returns 0 which
    > means the file is seekable and is currently at position 0.
    > The other calls of fseek() and ftell() succeed also and
    > indicate that the number of available bytes is 0.
    > Therefore my program thinks that there are no bytes
    > available in the file opened with _popen.


    > The information that it is a file opened with _popen is
    > not available at that place in my program.


    Could you consider making it available, by providing a wrapper
    mechanism? (That would probably be my favoured approach, rather
    than looking for platform specifics...)

    > Now my question:
    > Is it possible to find out that a file (available in a
    > variable of type FILE * ) was opened with _popen?


    As _popen is not part of the C standard, it's not really
    something we would consider here. You'd probably do better
    asking in a Windows newsgroup.
     
    Mark Bluemel, Feb 28, 2008
    #2

  3. wrote:
    > Hello
    >
    > Recently I discovered some problem. I have some C code
    > which determines how many bytes are available in a
    > file. Later I use this information to malloc a buffer of
    > the correct size before I read the bytes.
    > Determining the number of bytes available in a
    > file is done in 5 steps:
    >
    > 1. Use tell(aFile) to get the current position.

    Don't you mean ftell() rather than tell()?
    If not, you're most probably lost here, as that won't be a standard function.

    > 2. Use fseek(aFile, 0, SEEK_END) to move to the end.
    > 3. Get the current position with tell(aFile) (this is the
    > size of the file in bytes).
    > 4. I move to the position which I got in step 1 with fseek().
    > 5. Subtract the current position from the file size to
    > get the number of bytes available.
    >
    > This code is certainly not the most elegant solution but
    > it is portable. The code works for normal files under
    > windows and linux. The portability is also the reason
    > why I use tell() and fseek() instead of windows specific
    > code.
    >
    > When I open a file with _popen I get a different result:

    There is no function _popen() in standard C (I think). In POSIX there's
    popen() (i.e. without the leading underscore).

    > - Under linux the tell() of step 1 returns -1 which means
    > the file is not seekable. I can recognize this situation
    > and react accordingly (I cannot malloc the buffer beforehand.
    > Instead I malloc a smaller buffer which is realloced until
    > all bytes are read).
    > - Under windows the tell() of step 1 returns 0 which
    > means the file is seekable and is currently at position 0.
    > The other calls of fseek() and ftell() succeed also and
    > indicate that the number of available bytes is 0.
    > Therefore my program thinks that there are no bytes
    > available in the file opened with _popen.
    >
    > The information that it is a file opened with _popen is
    > not available at that place in my program.
    >
    > Now my question:
    > Is it possible to find out that a file (available in a
    > variable of type FILE * ) was opened with _popen?
    >
    > Something like: Turn the FILE * into a handle and ask a
    > function about the file type. It is no problem for me to
    > insert windows specific code under an #ifdef

    OT here (I think) but "int fileno(FILE *stream);" might be what you're
    looking for

    Bye, Jojo
     
    Joachim Schmitz, Feb 28, 2008
    #3
  4. In article <fq6abe$2rm$>,
    Mark Bluemel <> wrote:
    >> (I cannot malloc the buffer beforehand.
    >> Instead I malloc a smaller buffer which is realloced until
    >> all bytes are read).


    >I hope you don't grow it a byte at a time :)


    That's not much of a problem with most malloc() implementations.

    -- Richard
    --
    :wq
     
    Richard Tobin, Feb 28, 2008
    #4
  5. In article <>,
    <> wrote:

    >- Under linux the tell() of step 1 returns -1 which means
    > the file is not seekable. I can recognize this situation
    > and react accordingly (I cannot malloc the buffer beforehand.
    > Instead I malloc a smaller buffer which is realloced until
    > all bytes are read).


    Why not use this strategy always?

    As an optimisation, you could use the ftell() strategy to determine
    the initial size to malloc().

    -- Richard
    --
    :wq
     
    Richard Tobin, Feb 28, 2008
    #5
  6. Guest

    On 28 Feb., 13:44, Mark Bluemel <> wrote:
    > wrote:
    > > Hello

    >
    > > Recently I discovered some problem. I have some C code
    > > which determines how many bytes are available in a
    > > file.

    >
    > [snip]
    >
    > > The code works for normal files under
    > > windows and linux.

    >
    > [snip]
    >
    > > When I open a file with _popen I get a different result:
    > > - Under linux the tell() of step 1 returns -1 which means
    > > the file is not seekable. I can recognize this situation
    > > and react accordingly (I cannot malloc the buffer beforehand.
    > > Instead I malloc a smaller buffer which is realloced until
    > > all bytes are read).

    >
    > I hope you don't grow it a byte at a time :)


    Actually I grow it in steps of 4096.

    > > - Under windows the tell() of step 1 returns 0 which
    > > means the file is seekable and is currently at position 0.
    > > The other calls of fseek() and ftell() succeed also and
    > > indicate that the number of available bytes is 0.
    > > Therefore my program thinks that there are no bytes
    > > available in the file opened with _popen.
    > > The information that it is a file opened with _popen is
    > > not available at that place in my program.

    >
    > Could you consider making it available, by providing a wrapper
    > mechanism? (That would probably be my favoured approach, rather
    > than looking for platform specifics...)


    A simplified version of the function using this functionality
    is (please don't start nitpicking):

    -----------------------------------------------
    #include <stdlib.h>
    #include <stdio.h>

    #define READ_BLOCK_SIZE 4096
    #define SIZ_STRI(len) ((sizeof(struct stristruct) - \
        sizeof(unsigned char)) + (len) * sizeof(unsigned char))

    typedef struct stristruct {
      unsigned long int size;
      unsigned char mem[1];
    } *stritype;

    stritype filGets (FILE *aFile, long length)

    {
      long current_file_position;
      unsigned long int bytes_requested;
      unsigned long int bytes_there;
      unsigned long int read_size_requested;
      unsigned long int block_size_read;
      unsigned long int allocated_size;
      unsigned long int result_size;
      unsigned char *memory;
      stritype resized_result;
      stritype result;

      /* filGets */
      if (length < 0) {
        result = NULL;
      } else {
        bytes_requested = (unsigned long int) length;
        allocated_size = bytes_requested;
        result = (stritype) malloc(SIZ_STRI(allocated_size));
        if (result == NULL) {
          /* Determine how many bytes are available in aFile */
          if ((current_file_position = ftell(aFile)) != -1) {
            fseek(aFile, 0, SEEK_END);
            bytes_there = (ftell(aFile) - current_file_position);
            fseek(aFile, current_file_position, SEEK_SET);
            /* Now we know that bytes_there bytes are available
               in aFile */
            if (bytes_there < bytes_requested) {
              allocated_size = bytes_there;
              result = (stritype) malloc(SIZ_STRI(allocated_size));
              if (result == NULL) {
                return(NULL);
              } /* if */
            } else {
              return(NULL);
            } /* if */
          } /* if */
        } /* if */
        if (result != NULL) {
          /* We have allocated at least as many bytes as
             are available in the file */
          result_size = (unsigned long int) fread(result->mem, 1,
              (size_t) allocated_size, aFile);
        } else {
          /* We do not know how many bytes are available therefore we
             read blocks of READ_BLOCK_SIZE until we reach EOF */
          allocated_size = READ_BLOCK_SIZE;
          result = (stritype) malloc(SIZ_STRI(allocated_size));
          if (result == NULL) {
            return(NULL);
          } else {
            read_size_requested = READ_BLOCK_SIZE;
            if (read_size_requested > bytes_requested) {
              read_size_requested = bytes_requested;
            } /* if */
            block_size_read = fread(result->mem, 1,
                read_size_requested, aFile);
            result_size = block_size_read;
            while (block_size_read == READ_BLOCK_SIZE &&
                result_size < bytes_requested) {
              allocated_size = result_size + READ_BLOCK_SIZE;
              resized_result = (stritype)
                  realloc(result, SIZ_STRI(allocated_size));
              if (resized_result == NULL) {
                free(result);
                return(NULL);
              } else {
                result = resized_result;
                memory = (unsigned char *) result->mem;
                read_size_requested = READ_BLOCK_SIZE;
                if (result_size + read_size_requested >
                    bytes_requested) {
                  read_size_requested = bytes_requested - result_size;
                } /* if */
                block_size_read = fread(&memory[result_size], 1,
                    read_size_requested, aFile);
                result_size += block_size_read;
              } /* if */
            } /* while */
          } /* if */
        } /* if */
        result->size = result_size;
        if (result_size < allocated_size) {
          resized_result = (stritype)
              realloc(result, SIZ_STRI(result_size));
          if (resized_result == NULL) {
            free(result);
            return(NULL);
          } else {
            result = resized_result;
          } /* if */
        } /* if */
      } /* if */
      return(result);
    } /* filGets */
    -------------------------------------------

    The function _popen() is not a standard function, but popen()
    is. Btw.: Under windows I use MinGW and there the function
    is also popen(). The problem stays open:

    If you open a file with popen() (MinGW, probably also Cygwin)
    under windows and you do an ftell() or fseek(), the calls just
    succeed as if it were an empty file. If you do the same in
    linux the ftell() and fseek() functions return -1, which
    indicates that the file is not seekable.

    If someone has an idea: Please help.

    Greetings Thomas Mertes

     
    , Feb 28, 2008
    #6
  7. Guest

    On 28 Feb., 14:19, (Richard Tobin) wrote:
    > In article <>,
    >
    > <> wrote:
    > >- Under linux the tell() of step 1 returns -1 which means
    > > the file is not seekable. I can recognize this situation
    > > and react accordingly (I cannot malloc the buffer beforehand.
    > > Instead I malloc a smaller buffer which is realloced until
    > > all bytes are read).

    >
    > Why not use this strategy always?
    >
    > As an optimisation, you could use the ftell() strategy to determine
    > the initial size to malloc().


    This is just what I want. But for a pipe created with popen this
    strategy is not possible: You cannot know how big a pipe can
    grow. Therefore ftell() and fseek() return -1 for pipes.
    Under windows it does not work for files (pipes) opened with
    _popen() since ftell() and fseek() return 0 instead of -1.
    Therefore I look for a possibility to recognize this situation.

    Greetings Thomas Mertes

     
    , Feb 28, 2008
    #7
  8. Guest

    On 28 Feb., 13:48, "Joachim Schmitz" <>
    wrote:
    > wrote:
    > > Hello

    >
    > > Recently I discovered some problem. I have some C code
    > > which determines how many bytes are available in a
    > > file. Later I use this information to malloc a buffer of
    > > the correct size before I read the bytes.
    > > Determining the number of bytes available in a
    > > file is done in 5 steps:

    >
    > > 1. Use tell(aFile) to get the current position.

    >
    > Don't you mean ftell() rather than tell()?

    Yes you are right: I mean ftell()

    > If not, you're most probably lost here, as that won't be a standard function.
    >
    > > 2. Use fseek(aFile, 0, SEEK_END) to move to the end.
    > > 3. Get the current position with tell(aFile) (this is the
    > > size of the file in bytes).
    > > 4. I move to the position which I got in step 1 with fseek().
    > > 5. Subtract the current position from the file size to
    > > get the number of bytes available.

    >
    > > This code is certainly not the most elegant solution but
    > > it is portable. The code works for normal files under
    > > windows and linux. The portability is also the reason
    > > why I use tell() and fseek() instead of windows specific
    > > code.

    >
    > > When I open a file with _popen I get a different result:

    >
    > no function _popen() in standard C (I think). In POSIX there's popen() (i.e.
    > without the leading underscore)


    I looked at the popen() more closely and I use popen()
    under linux (gcc) and under windows (MinGW). The only place
    using _popen() would be under windows (MSVC). But the actual
    problem occurs under windows (MinGW). So I can claim that
    I actually use the POSIX popen().

    > > - Under linux the tell() of step 1 returns -1 which means
    > > the file is not seekable. I can recognize this situation
    > > and react accordingly (I cannot malloc the buffer beforehand.
    > > Instead I malloc a smaller buffer which is realloced until
    > > all bytes are read).
    > > - Under windows the tell() of step 1 returns 0 which
    > > means the file is seekable and is currently at position 0.
    > > The other calls of fseek() and ftell() succeed also and
    > > indicate that the number of available bytes is 0.
    > > Therefore my program thinks that there are no bytes
    > > available in the file opened with _popen.

    >
    > > The information that it is a file opened with _popen is
    > > not available at that place in my program.

    >
    > > Now my question:
    > > Is it possible to find out that a file (available in a
    > > variable of type FILE * ) was opened with _popen?

    >
    > > Something like: Turn the FILE * into a handle and ask a
    > > function about the file type. It is no problem for me to
    > > insert windows specific code under an #ifdef

    >
    > OT here (I think) but "int fileno(FILE *stream);" might be what you're
    > looking for


    Does the fileno() function return a file handle under
    windows?

    Maybe I can use fstat and check for S_ISFIFO.
    If that works MinGW has a bug.

    Greetings Thomas Mertes

     
    , Feb 28, 2008
    #8
  9. In article <>,
    <> wrote:

    >> >- Under linux the tell() of step 1 returns -1 which means
    >> > the file is not seekable. I can recognize this situation
    >> > and react accordingly (I cannot malloc the buffer beforehand.
    >> > Instead I malloc a smaller buffer which is realloced until
    >> > all bytes are read).


    >> Why not use this strategy always?


    >> As an optimisation, you could use the ftell() strategy to determine
    >> the initial size to malloc().


    >This is just what I want. But for a pipe created with popen this
    >strategy is not possible: You cannot know how big a pipe can
    >grow.


    You misunderstand. *Don't* try to recognise the situation. *Always*
    use the grow-the-buffer-as-you-read approach, so that you don't have
    to know the size in advance.

    But use the result of the ftell() strategy for the initial size.
    It will be wrong if it happens to be a pipe, but it doesn't matter
    that it's wrong - you'll just start with a buffer of zero bytes and
    grow it to the right size as you read.
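
    A minimal sketch of that idea, under stated assumptions: the name
    read_all, the GROW_STEP constant and the size_hint parameter are
    invented for illustration. The hint (for example the result of the
    ftell() calculation) only sets the starting capacity, so a wrong hint
    such as 0 for a pipe costs extra realloc() calls but still gives the
    right data.

```c
#include <stdio.h>
#include <stdlib.h>

#define GROW_STEP 4096   /* assumed growth increment, tune as needed */

/* Hypothetical helper: reads until fread() comes up short (EOF or error)
   into a malloc()ed buffer. Returns NULL if memory runs out, otherwise
   the buffer, with the number of bytes read stored in *out_len. */
unsigned char *read_all(FILE *aFile, size_t size_hint, size_t *out_len)
{
    /* One byte more than the hint, so that a correct hint ends with a
       single short read instead of a full buffer that must be grown. */
    size_t capacity = size_hint + 1;
    size_t used = 0;
    unsigned char *buf = malloc(capacity);
    unsigned char *bigger;

    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        used += fread(buf + used, 1, capacity - used, aFile);
        if (used < capacity) {
            break;   /* short read: nothing more to get */
        }
        /* The buffer is full, so there may be more data: grow it. */
        bigger = realloc(buf, capacity + GROW_STEP);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        capacity += GROW_STEP;
    }
    *out_len = used;
    return buf;
}
```

    The caller can shrink the buffer with a final realloc() if the unused
    tail matters.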

    -- Richard
    --
    :wq
     
    Richard Tobin, Feb 28, 2008
    #9
  10. Mark Bluemel Guest

    wrote:

    >...So I can claim that
    > I actually use the POSIX popen().


    Which is a POSIX standard (see comp.unix.programmer) not a C standard.
     
    Mark Bluemel, Feb 28, 2008
    #10
  11. wrote:
    > On 28 Feb., 13:48, "Joachim Schmitz" <>
    > wrote:
    >> wrote:

    <snip>
    >>> Now my question:
    >>> Is it possible to find out that a file (available in a
    >>> variable of type FILE * ) was opened with _popen?

    >>
    >>> Something like: Turn the FILE * into a handle and ask a
    >>> function about the file type. It is no problem for me to
    >>> insert windows specific code under an #ifdef

    >>
    >> OT here (I think) but "int fileno(FILE *stream);" might be what you're
    >> looking for

    >
    > Does the fileno() function return a file handle under
    > windows?

    No idea, but it does in POSIX
    $ man fileno
    ....
    fileno - Maps a stream pointer to a file descriptor
    ....
    The fileno() function returns the file descriptor of a stream
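
    On Windows, fileno() (spelled _fileno() by MSVC) also returns a C
    runtime file descriptor, not an operating-system HANDLE;
    _get_osfhandle() converts one to the other. The following is a rough
    sketch of asking for the file type either way. The function name
    stream_is_pipe is hypothetical, and note that GetFileType() reports
    FILE_TYPE_PIPE for sockets as well as pipes.

```c
#ifdef _WIN32
#include <stdio.h>
#include <stdint.h>    /* intptr_t */
#include <io.h>        /* _get_osfhandle() */
#include <windows.h>   /* GetFileType(), FILE_TYPE_PIPE */

/* Hypothetical helper: 1 for a pipe (or socket), 0 otherwise, -1 on error. */
int stream_is_pipe(FILE *aFile)
{
    intptr_t os_handle = _get_osfhandle(_fileno(aFile));

    if (os_handle == -1) {
        return -1;
    }
    return GetFileType((HANDLE) os_handle) == FILE_TYPE_PIPE ? 1 : 0;
}
#else
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fstat() */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 for a pipe or FIFO, 0 otherwise, -1 on error. */
int stream_is_pipe(FILE *aFile)
{
    struct stat info;
    int fd = fileno(aFile);

    if (fd == -1 || fstat(fd, &info) != 0) {
        return -1;
    }
    return S_ISFIFO(info.st_mode) ? 1 : 0;
}
#endif
```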

    > Maybe I can use fstat and check for S_ISFIFO.

    Indeed. But you could also use stat(), which works on a filename rather than
    on a file descriptor.

    > If that works MinGW has a bug.


    Bye, Jojo
     
    Joachim Schmitz, Feb 28, 2008
    #11
  12. Mark Bluemel wrote:
    > wrote:
    >
    >> ...So I can claim that
    >> I actually use the POSIX popen().

    >
    > Which is a POSIX standard (see comp.unix.programmer) not a C standard.

    Which he said.

    Bye, Jojo
     
    Joachim Schmitz, Feb 28, 2008
    #12
  13. Guest

    On 28 Feb., 15:53, (Richard Tobin) wrote:
    > In article <>,
    >
    > <> wrote:
    > >> >- Under linux the tell() of step 1 returns -1 which means
    > >> > the file is not seekable. I can recognize this situation
    > >> > and react accordingly (I cannot malloc the buffer beforehand.
    > >> > Instead I malloc a smaller buffer which is realloced until
    > >> > all bytes are read).
    > >> Why not use this strategy always?
    > >> As an optimisation, you could use the ftell() strategy to determine
    > >> the initial size to malloc().

    > >This is just what I want. But for a pipe created with popen this
    > >strategy is not possible: You cannot know how big a pipe can
    > >grow.

    >
    > You misunderstand. *Don't* try to recognise the situation. *Always*
    > use the grow-the-buffer-as-you-read approach, so that you don't have
    > to know the size in advance.


    You are right: I misunderstood you, sorry.

    > But use the result of the ftell() strategy for the initial size.
    > It will be wrong if it happens to be a pipe, but it doesn't matter
    > that it's wrong - you'll just start with a buffer of zero bytes and
    > grow it to the right size as you read.


    That does not sound bad, I will think it over.
    The function does not always read the rest of a file.
    It gets a length limit. The prototype of filGets is:

    stritype filGets (FILE *aFile, long length)

    My general strategy for the function is:

    A) Do a malloc() for the requested length
    B) Attempt to read the requested amount of bytes
    (not all requested bytes may be available).
    C) Realloc() the malloced area to the actual size.

    So it is quite simple in the normal case.
    But this function is also used to read whole files.
    This is done by using very high values for 'length'.
    Now two things can happen.

    - The malloc() succeeds: The general strategy works.
    - The malloc() fails: This is the case I was talking
    about in this discussion.

    If the malloc() fails it would still give higher
    performance to use just one malloc() and one fread().
    Therefore I started to write code to find out the
    available bytes. I believed that the ftell()/fseek()
    strategy would work exactly for all files where it
    is possible to determine the available bytes. Well,
    this was theory and MinGW under windows is something
    different.

    For me the 'read from the file in small chunks'
    strategy is only the last resort. Not because I think
    that the reading would be slower, but because it
    needs lots of reallocs for a probably very big
    buffer. So some bad things can happen:

    a) The reallocs cost time.
    b) It may fail because the heap was thrashed too
    much (a single malloc would have succeeded).

    Btw.: In the meantime I tried to use fstat() and
    S_ISREG() and use the ftell()/fseek() strategy only
    for regular files.

    Greetings Thomas Mertes

     
    , Feb 28, 2008
    #13
  14. Guest

    On 28 Feb., 15:57, "Joachim Schmitz" <>
    wrote:
    > wrote:
    > > On 28 Feb., 13:48, "Joachim Schmitz" <>
    > > wrote:
    > >> wrote:

    > <snip>
    > >>> Now my question:
    > >>> Is it possible to find out that a file (available in a
    > >>> variable of type FILE * ) was opened with _popen?

    >
    > >>> Something like: Turn the FILE * into a handle and ask a
    > >>> function about the file type. It is no problem for me to
    > >>> insert windows specific code under an #ifdef

    >
    > >> OT here (I think) but "int fileno(FILE *stream);" might be what you're
    > >> looking for

    >
    > > Does the fileno() function return a file handle under
    > > windows?

    >
    > No idea, but it does in POSIX
    > $ man fileno
    > ...
    > fileno - Maps a stream pointer to a file descriptor
    > ...
    > The fileno() function returns the file descriptor of a stream
    >
    > > Maybe I can use fstat and check for S_ISFIFO.

    >
    > Indeed. But you could also use stat(), which works on a filename rather than
    > on a file descriptor.


    If I knew the filename at this place, I would
    probably also know the type of the file without
    referring to fstat().

    Btw.: I tested with fstat() and it works under
    linux and windows. Currently I use the ftell()/fseek()
    strategy to determine the size of a file only for
    regular files.
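
    The check described here could look roughly like this. It is a sketch
    rather than the actual Seed7 code: the function name
    is_regular_stream is made up, and it assumes that fileno(), fstat()
    and the S_ISREG() macro are available (POSIX; the thread reports that
    MinGW provides them too).

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fstat() */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if the stream refers to a regular file,
   0 if it refers to something else (pipe, terminal, ...),
   -1 if the file type cannot be determined. */
int is_regular_stream(FILE *aFile)
{
    struct stat info;
    int fd = fileno(aFile);

    if (fd == -1 || fstat(fd, &info) != 0) {
        return -1;
    }
    return S_ISREG(info.st_mode) ? 1 : 0;
}
```

    Only when this returns 1 is the ftell()/fseek() size calculation
    attempted; every other stream goes straight to the block-wise read.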

    > > If that works MinGW has a bug.


    Since my solution works I would say that
    MinGW has a bug when using ftell()/fseek() for pipes:
    Instead of -1 the functions return 0 for pipes (at least for
    the pipes opened with popen() ).

    Greetings Thomas Mertes

     
    , Feb 28, 2008
    #14
  15. >Recently I discovered some problem. I have some C code
    >which determines how many bytes are available in a
    >file. Later I use this information to malloc a buffer of
    >the correct size before I read the bytes.
    >Determining the number of bytes available in a
    >file is done in 5 steps:
    >
    >1. Use tell(aFile) to get the current position.


    Do you mean ftell() here?

    >2. Use fseek(aFile, 0, SEEK_END) to move to the end.


    If the file is a binary file, SEEK_END need not be meaningfully supported.
    Also, shouldn't that second argument of fseek() be 0L, not 0?

    >3. Get the current position with tell(aFile) (this is the
    > size of the file in bytes).


    Do you mean ftell() here? ftell() on a text file need not return
    a number of anything; it might be a bitfield combination of
    track, train, sector, offset within sector, cylinder, etc., and
    subtracting two of them may not give any meaningful result.

    >4. I move to the position which I got in step 1 with fseek().
    >5. Subtract the current position from the file size to
    > get the number of bytes available.



    >This code is certainly not the most elegant solution but
    >it is portable. The code works for normal files under
    >windows and linux. The portability is also the reason
    >why I use tell() and fseek() instead of windows specific
    >code.


    >When I open a file with _popen I get a different result:


    There are a number of things that look like an open file but aren't
    seekable. Tty devices (the console or serial ports), and pipes are
    included. Sockets aren't seekable either. Certain magnetic tape
    devices might have spotty support for seeking beyond rewind.

    >- Under linux the tell() of step 1 returns -1 which means
    > the file is not seekable. I can recognize this situation
    > and react accordingly (I cannot malloc the buffer beforehand.
    > Instead I malloc a smaller buffer which is realloced until
    > all bytes are read).


    I claim you need to deal with the possibility of a growing file
    *anyway*, so the approach of malloc()/realloc() always needs to
    be used.

    >- Under windows the tell() of step 1 returns 0 which
    > means the file is seekable and is currently at position 0.
    > The other calls of fseek() and ftell() succeed also and
    > indicate that the number of available bytes is 0.
    > Therefore my program thinks that there are no bytes
    > available in the file opened with _popen.


    If you know you opened the file with popen(), then you know it's
    a pipe. Deal with it.

    >The information that it is a file opened with _popen is
    >not available at that place in my program.
    >
    >Now my question:
    >Is it possible to find out that a file (available in a
    >variable of type FILE * ) was opened with _popen?


    I suggest that if you skip steps 1 through 5 and replace it with
    estimated_file_size = 4096;
    then use your strategy of malloc()/realloc(), you cover all cases
    without needing that information. The number 4096 for an initial
    buffer size is chosen to be small enough to not be a huge waste of
    memory on small files, and to be a reasonably efficient block size
    for reading files. 3 bytes is too small and 50 megabytes is way
    too big for typical files (except maybe video or large databases).
    Tune for your application as appropriate. You can also tune how much
    more memory to get each time when the initial buffer isn't enough.

    I'll mention here that the assumption that a file will fit in memory,
    especially on a 32-bit system, is somewhat shaky. Consider especially
    that a DVD holds 4.7GB (more for DL), which is bigger than the
    address space of a 32-bit system. Whether this is a problem depends
    on the type of files your application uses.

    >Something like: Turn the FILE * into a handle and ask a
    >function about the file type. It is no problem for me to
    >insert windows specific code under an #ifdef


    For POSIX systems, look up fileno() and fstat(), with particular
    attention to the st_mode structure field.
     
    Gordon Burditt, Feb 29, 2008
    #15
  16. Guest

    On 29 Feb., 01:13, (Gordon Burditt) wrote:
    > >Recently I discovered some problem. I have some C code
    > >which determines how many bytes are available in a
    > >file. Later I use this information to malloc a buffer of
    > >the correct size before I read the bytes.
    > >Determining the number of bytes available in a
    > >file is done in 5 steps:

    >
    > >1. Use tell(aFile) to get the current position.

    >
    > Do you mean ftell() here?


    Yes I mean ftell(aFile).

    > >2. Use fseek(aFile, 0, SEEK_END) to move to the end.

    >
    > If the file is a binary file, SEEK_END need not be meaningfully supported.
    > Also, shouldn't that second argument of fseek() be 0L, not 0?
    >
    > >3. Get the current position with tell(aFile) (this is the
    > > size of the file in bytes).

    >
    > Do you mean ftell() here?


    Yes I mean ftell().

    > ftell() on a text file need not return
    > a number of anything; it might be a bitfield combination of
    > track, train, sector, offset within sector, cylinder, etc., and
    > subtracting two of them may not give any meaningful result.


    On linux/unix/bsd/windows my approach works, at least for
    regular files. On which operating systems does ftell() return
    a bitfield in the way you describe?
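
    For reference, the five steps under discussion can be sketched
    roughly as follows. This is only a minimal sketch: the name
    bytes_available() is made up for illustration and is not the name
    used in the library, and it assumes a stream where ftell() returns
    a plain byte offset.

```c
#include <stdio.h>

/* Hypothetical helper (name invented for this sketch).
   Returns the number of bytes between the current position and the
   end of the file, or -1 if the stream reports that it cannot seek. */
long bytes_available(FILE *aFile)
{
    long current = ftell(aFile);                /* step 1 */
    long size;

    if (current == -1L)
        return -1L;                             /* not seekable */
    if (fseek(aFile, 0L, SEEK_END) != 0)        /* step 2 */
        return -1L;
    size = ftell(aFile);                        /* step 3 */
    if (fseek(aFile, current, SEEK_SET) != 0)   /* step 4 */
        return -1L;
    if (size == -1L)
        return -1L;
    return size - current;                      /* step 5 */
}
```

    With a pipe that wrongly reports position 0 for both ftell()
    calls, this returns 0, which is exactly the problem described.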

    > >4. I move to the position which I got in step 1 with fseek().
    > >5. Subtract the current position from the file size to
    > > get the number of bytes available.
    > >This code is certainly not the most elegant solution but
    > >it is portable. The code works for normal files under
    > >windows and linux. The portability is also the reason
    > >why I use tell() and fseek() instead of windows specific
    > >code.
    > >When I open a file with _popen I get a different result:

    >
    > There are a number of things that look like an open file but aren't
    > seekable. Tty devices (the console or serial ports), and pipes are
    > included. Sockets aren't seekable either. Certain magnetic tape
    > devices might have spotty support for seeking beyond rewind.


    Yes, I know that there are open files which are not seekable.
    I expect such files to return -1 on ftell() and fseek().

    > >- Under linux the tell() of step 1 returns -1 which means
    > > the file is not seekable. I can recognice this situation
    > > and react accordingly (I cannot malloc the buffer beforehand.
    > > Instead I malloc a smaller buffer which is realloced until
    > > all bytes are read).

    >
    > I claim you need to deal with the possibility of a growing file
    > *anyway*, so the approach of malloc()/realloc() always needs to
    > be used.


    I consider the malloc()/realloc() approach more time-consuming,
    and I assume that the heap could also be thrashed.
    Therefore I want to use this approach only as a last resort.

    > >- Under windows the tell() of step 1 returns 0 which
    > > means the file is seekable and is currently at position 0.
    > > The other calls of fseek() and ftell() succeed also and
    > > indicate that the number of available bytes is 0.
    > > Therefore my program thinks that there are no bytes
    > > available in the file opened with _popen.

    >
    > If you know you opened the file with popen(), then you know it's
    > a pipe. Deal with it.


    I am talking about a function which gets a FILE * parameter.
    This function is part of a library. This library is used in
    an interpreter or is linked into compiled programs. If you
    are interested: I am talking about the Seed7 interpreter
    and about compiled Seed7 programs. The Seed7 programs are
    compiled to C, further compiled with a C compiler, and then
    the library is linked to the result. Therefore the programmer of
    the Seed7 program knows that he is opening a pipe with
    popen(), but the library does not have this information.

    For now I took the approach of finding out in the makefile
    (when using 'make depend') whether ftell() works correctly for
    a pipe opened with popen(). If it does not, I replace
    ftell() with my own version, which checks the file type with
    fstat(), calls the original ftell() only for a regular
    file, and returns -1 otherwise.
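
    A minimal sketch of such a replacement, assuming a POSIX-style
    fileno(), fstat() and S_ISREG() are available. The name
    checked_ftell() is hypothetical and only used here for
    illustration; it is not the actual code from the library.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict C modes */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical wrapper (name invented for this sketch).
   Reports a position only for regular files. For pipes, ttys,
   sockets and anything fstat() cannot describe it returns -1. */
long checked_ftell(FILE *aFile)
{
    struct stat info;
    int fd = fileno(aFile);

    if (fd == -1 || fstat(fd, &info) != 0)
        return -1L;
    if (!S_ISREG(info.st_mode))
        return -1L;               /* not a regular file */
    return ftell(aFile);
}
```

    The caller can then treat -1 as "not seekable" on every platform
    and fall back to the malloc()/realloc() strategy.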

    I am just angry that I have to code around a bug in
    windows/mingw.

    > >The information that it is a file opened with _popen is
    > >not available at that place in my program.

    >
    > >Now my question:
    > >Is it possible to find out that a file (available in a
    > >variable of type FILE * ) was opened with _popen?

    >
    > I suggest that if you skip steps 1 through 5 and replace it with
    > estimated_file_size = 4096;
    > then use your strategy of malloc()/realloc(), you cover all cases
    > without needing that information. The number 4096 for an initial
    > buffer size is chosen to be small enough to not be a huge waste of
    > memory on small files, and to be a reasonably efficient block size
    > for reading files. 3 bytes is too small and 50 megabytes is way
    > too big for typical files (except maybe video or large databases).
    > Tune for your application as appropriate. You can also tune how much
    > more memory to get each time when the initial buffer isn't enough.


    My malloc()/realloc() strategy works in steps of 4096.
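
    Roughly, such a fallback looks like the sketch below. The name
    read_all() and the macro READ_STEP are made up for illustration,
    and the real library code differs (for example it raises a Seed7
    exception when memory runs out).

```c
#include <stdio.h>
#include <stdlib.h>

#define READ_STEP 4096  /* growth step, as mentioned above */

/* Hypothetical helper (name invented for this sketch).
   Reads up to EOF from a stream that may not be seekable.
   Returns a malloc'ed buffer that the caller must free and stores
   the byte count in *length. Returns NULL if memory runs out.
   A read error is treated like end of file. */
char *read_all(FILE *aFile, size_t *length)
{
    size_t capacity = READ_STEP;
    size_t used = 0;
    size_t got;
    char *buffer = malloc(capacity);
    char *bigger;

    if (buffer == NULL)
        return NULL;
    while ((got = fread(buffer + used, 1, capacity - used, aFile)) > 0) {
        used += got;
        if (used == capacity) {
            /* Buffer is full: grow it by one more step. */
            bigger = realloc(buffer, capacity + READ_STEP);
            if (bigger == NULL) {
                free(buffer);
                return NULL;
            }
            buffer = bigger;
            capacity += READ_STEP;
        }
    }
    *length = used;
    return buffer;
}
```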

    > I'll mention here that the assumption that a file will fit in memory,
    > especially on a 32-bit system, is somewhat shaky. Consider especially
    > that a DVD holds 4.7GB (more for DL), which is bigger than the
    > address space of a 32-bit system. Whether this is a problem depends
    > on the type of files your application uses.


    Reading a whole file is just a special use case of the
    function. The prototype of the library function is:

    stritype filGets (FILE *aFile, long length);

    This function can be used similarly to the fgets() function
    of C. While fgets() gets the buffer as a parameter, filGets()
    mallocs the buffer itself. Therefore it
    can malloc a buffer of exactly the right size. filGets() also
    takes its parameters in a different order. The name of
    the function in a Seed7 program is gets(file, length).

    Recognizing an out-of-memory situation and raising an
    exception in Seed7 is part of the function's task. The
    Seed7 user program would get an exception instead of the
    characters read.

    > >Something like: Turn the FILE * into a handle and ask a
    > >function about the file type. It is no problem for me to
    > >insert windows specific code under an #ifdef

    >
    > For POSIX systems, look up fileno() and fstat(), with particular
    > attention to the st_mode structure field.


    This is what I do now.

    Greetings Thomas Mertes

    Seed7 Homepage: http://seed7.sourceforge.net
    Seed7 - The extensible programming language: User defined statements
    and operators, abstract data types, templates without special
    syntax, OO with interfaces and multiple dispatch, statically typed,
    interpreted or compiled, portable, runs under linux/unix/windows.
     
    , Feb 29, 2008
    #16