Re: getting perl and C working together in a way that makes sense

Discussion in 'C Programming' started by Malcolm McLean, Feb 1, 2013.

  1. On Friday, February 1, 2013 7:19:06 AM UTC, Cal Dershowitz wrote:
    >
    > I think the intent of this program speaks for itself, and it cuts across
    > some of the greatest standing arguments in the C world: how do you
    > declare and handle the input of a string whose size you don't know?
    >

    You're reading the whole file as a single string, so you need a slurp() function.

    This is very easy to write if you know something about the platform you're
    writing on, very hard to write portably, because files can be bigger than
    the total memory, or even address space. Also because there's no standard
    filesize() function, though there's almost always a platform-specific one.
    Malcolm McLean, Feb 1, 2013
    #1

  2. Cal Dershowitz <> writes:
    > On 02/01/2013 01:50 AM, Malcolm McLean wrote:
    >> On Friday, February 1, 2013 7:19:06 AM UTC, Cal Dershowitz wrote:
    >>> I think the intent of this program speaks for itself, and it cuts across
    >>> some of the greatest standing arguments in the C world: how do you
    >>> declare and handle the input of a string whose size you don't know?
    >>>

    >> You're reading the whole file as a single string, so you need a
    >> slurp() function.
    >>
    >> This is very easy to write if you know something about the platform you're
    >> writing on, very hard to write portably, because files can be bigger than
    >> the total memory, or even address space. Also because there's no standard
    >> filesize() function, though there's almost always a platform-specific one.

    >
    > right Malcolm:
    >
    > use File::Slurp;
    >
    > It solves the problem of filelength generally, as far as C is concerned.
    >
    > You have such an insight.


    I'm sure that, by "a slurp() function", Malcolm meant a function
    *written in C* that reads the contents of a file. He didn't mention
    Perl's File::Slurp package.

    The main problem is determining how big the buffer needs to be. Most
    systems have a non-portable way to determine the size of a file; the
    standard doesn't provide a portable way to do that, other than by
    reading the entire file. And the file size can change between the time
    you determine it and the time you read it. You can implement an
    expanding buffer with realloc(), but that can use memory inefficiently.
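
The expanding-buffer approach mentioned above can be sketched in standard C as follows. This is only an illustration, not code from the thread: the name `slurp_stream`, the 4096-byte starting capacity, and the doubling policy are all assumptions. Doubling keeps the number of realloc() calls logarithmic in the file size, at the cost of up to roughly half the buffer going unused, which is the inefficiency being referred to.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything remaining on fp into a malloc'd,
 * NUL-terminated buffer. The byte count is stored in *len_out when
 * len_out is non-NULL. Returns NULL on read error or allocation failure. */
char *slurp_stream(FILE *fp, size_t *len_out)
{
    size_t cap = 4096;              /* arbitrary starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always leave one byte free for the terminating NUL. */
        len += fread(buf + len, 1, cap - len - 1, fp);
        if (len < cap - 1)
            break;                  /* short read: end of file or error */

        /* Buffer is full: double it, guarding against overflow. */
        if (cap > SIZE_MAX / 2) {
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

The caller owns the returned buffer and releases it with free(). Because the function never asks for the file size, it also works on pipes and on files that grow while being read.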

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Feb 1, 2013
    #2

  3. BartC Guest

    "Keith Thompson" <> wrote in message
    news:...


    > The main problem is determining how big the buffer needs to be. Most
    > systems have a non-portable way to determine the size of a file; the
    > standard doesn't provide a portable way to do that, other than by
    > reading the entire file. And the file size can change between the time
    > you determine it and the time you read it.


    The file can change in size and content even while you're trying to read it
    from start to finish.

    But if you had to worry about all the possibilities, working with files
    would become nearly impossible.

    Sometimes, you just have to assume that that 10-line configuration text file
    you're trying to read into memory is going to be better behaved than a
    multi-gigabyte continuously-updated database file on some remote network.

    --
    Bartc
    BartC, Feb 1, 2013
    #3
  4. "BartC" <> writes:
    > "Keith Thompson" <> wrote in message
    > news:...
    >
    >> The main problem is determining how big the buffer needs to be. Most
    >> systems have a non-portable way to determine the size of a file; the
    >> standard doesn't provide a portable way to do that, other than by
    >> reading the entire file. And the file size can change between the time
    >> you determine it and the time you read it.

    >
    > The file can change in size and content even while you're trying to read it
    > from start to finish.
    >
    > But if you had to worry about all the possibilities, working with files
    > would become nearly impossible.
    >
    > Sometimes, you just have to assume that that 10-line configuration text file
    > you're trying to read into memory is going to be better behaved than a
    > multi-gigabyte continuously-updated database file on some remote network.


    Unless that assumption permits an attacker to break your system's
    security by violating it.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Feb 1, 2013
    #4
  5. Les Cargill Guest

    Keith Thompson wrote:
    > Cal Dershowitz <> writes:
    >> On 02/01/2013 01:50 AM, Malcolm McLean wrote:
    >>> On Friday, February 1, 2013 7:19:06 AM UTC, Cal Dershowitz wrote:
    >>>> I think the intent of this program speaks for itself, and it cuts across
    >>>> some of the greatest standing arguments in the C world: how do you
    >>>> declare and handle the input of a string whose size you don't know?
    >>>>
    >>> You're reading the whole file as a single string, so you need a
    >>> slurp() function.
    >>>
    >>> This is very easy to write if you know something about the platform you're
    >>> writing on, very hard to write portably, because files can be bigger than
    >>> the total memory, or even address space. Also because there's no standard
    >>> filesize() function, though there's almost always a platform-specific one.

    >>
    >> right Malcolm:
    >>
    >> use File::Slurp;
    >>
    >> It solves the problem of filelength generally, as far as C is concerned.
    >>
    >> You have such an insight.

    >
    > I'm sure that, by "a slurp() function", Malcolm meant a function
    > *written in C* that reads the contents of a file. He didn't mention
    > Perl's File::Slurp package.
    >
    > The main problem is determining how big the buffer needs to be. Most
    > systems have a non-portable way to determine the size of a file; the
    > standard doesn't provide a portable way to do that, other than by
    > reading the entire file.


    The 'C' library provides two entry points, ftell() and fseek().
    For any random file system, they may not actually work reasonably
    ( especially for that 9 track tape drive ) but they are quite standard
    and if it's a spinning disk or a FLASH drive, they almost certainly will.

    So there exists a relatively simple pattern for this. Will they
    work on the filesystem before you? I recommend testing.
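
The fseek()/ftell() pattern referred to here usually looks something like the sketch below. The function name `stream_size` is hypothetical, and the code assumes a seekable stream opened in binary mode on a platform where the value returned by ftell() is a byte offset, which the C standard does not promise in general.

```c
#include <stdio.h>

/* Hypothetical helper: size in bytes of a seekable binary stream, found
 * by seeking to the end and asking for the position. The original
 * position is restored. Returns -1 if any step fails. */
long stream_size(FILE *fp)
{
    long start = ftell(fp);
    if (start < 0)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    long end = ftell(fp);           /* -1L if the position is unavailable */
    if (fseek(fp, start, SEEK_SET) != 0)
        return -1;
    return end;
}
```

Since the result is a long, files larger than LONG_MAX bytes cannot be reported this way, and the size may already be stale by the time the file is read.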


    > And the file size can change between the time
    > you determine it and the time you read it.


    Oh well.

    > You can implement an
    > expanding buffer with realloc(), but that can use memory inefficiently.
    >


    --
    Les Cargill
    Les Cargill, Feb 2, 2013
    #5
  6. On Fri, 01 Feb 2013 23:50:11 -0600, Les Cargill wrote:

    > Keith Thompson wrote:
    >> The main problem is determining how big the buffer needs to be. Most
    >> systems have a non-portable way to determine the size of a file; the
    >> standard doesn't provide a portable way to do that, other than by
    >> reading the entire file.

    >
    > The 'C' library provides two entry points, ftell() and fseek(). For any
    > random file system, they may not actually work reasonably ( especially
    > for that 9 track tape drive ) but they are quite standard and if it's a
    > spinning disk or a FLASH drive, they almost certainly will.


    Except that, according to the C standard, the result from ftell() on a
    text file does not have any meaning beyond "this is a position you can
    seek back to". It does not have to be related to the file size in any
    meaningful way.
    And fseek(fp, 0, SEEK_END) is not required to work on a binary file.
    So, the combination of fseek and ftell is not guaranteed to work in determining
    how large a file is.
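
Given those caveats, one hedged compromise is to treat whatever fseek()/ftell() report as a capacity hint only, and to keep reading until fread() itself signals end of file. The sketch below illustrates that idea; the name `slurp_with_hint` and the 4096-byte fallback are assumptions, and the function reads the stream from its beginning.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole stream from the start. The
 * fseek/ftell result is used only to pick an initial capacity; the
 * loop is correct even if that hint is wrong or unavailable. */
char *slurp_with_hint(FILE *fp, size_t *len_out)
{
    size_t cap = 4096;              /* fallback when no hint is available */
    if (fseek(fp, 0L, SEEK_END) == 0) {
        long end = ftell(fp);
        /* +2: one byte for the NUL, one so the first read comes up short */
        if (end > 0 && (unsigned long)end < SIZE_MAX - 2)
            cap = (size_t)end + 2;
    }
    rewind(fp);                     /* back to the start; clears error flag */

    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        len += fread(buf + len, 1, cap - len - 1, fp);
        if (len < cap - 1)
            break;                  /* short read: end of file or error */
        if (cap > SIZE_MAX / 2) {   /* hint was too small: grow */
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

When the hint is accurate, the file is read with a single allocation; when it is meaningless, as it may be for a text stream, the function simply degrades to the growing-buffer behaviour.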

    >
    > So there exists a relatively simple pattern for this. Will they work on
    > the filesystem before you? I recommend testing.


    That might tell you if it works on the platforms you tested. It does not
    tell you anything about the platforms you did not test.

    Bart v Ingen Schenau
    Bart van Ingen Schenau, Feb 2, 2013
    #6