Reading text file contents to a character buffer

Discussion in 'C Programming' started by Navaneeth, Aug 2, 2010.

  1. Navaneeth

    Navaneeth Guest

    Hello,

    I am trying to load a text file into an in-memory character buffer. I
    wrote the following code.

    static char *readcontent(const char *filename)
    {
        char *fcontent = NULL, *tmp = NULL;
        size_t result, elements_read = 0;
        int pagenum = 1;

        FILE *fp;
        fp = fopen(filename, "r");

        if(fp) {
            while(1) {
                tmp = (char*) realloc(fcontent, PAGE_SIZE * pagenum);
                if(tmp) {
                    fcontent = tmp;
                }
                else {
                    free(fcontent);
                    fcontent = NULL;
                    printf("failed to reallocate memory\n");
                    break;
                }

                result = fread(fcontent + elements_read, 1, PAGE_SIZE, fp);

                if(result != PAGE_SIZE) {
                    if(feof(fp)) {
                        break;
                    }
                    else {
                        printf("error reading file");
                        break;
                    }
                }
                elements_read += result;
                ++pagenum;
            }
            fclose(fp);
        }
        return fcontent;
    }

    This is working fine. Since I am new to C, I am wondering whether this
    is the correct approach. Any suggestions to improve the code are
    welcome. This code is expected to compile at least on GCC and MSVC.

    Any thoughts?

    Thanks
     
    Navaneeth, Aug 2, 2010
    #1

  2. Navaneeth <> writes:

    > I am trying to load a text file into in-memory character buffer. I
    > wrote the following code.
    >
    > static char *readcontent(const char *filename)
    > {


    From a purely functional point of view, I'd say that you should have
    some way to return how long the file was to the caller. It may not
    matter in your case but (a) you have the information so it's 'free' and
    (b) it makes the function more useful. A good way to do this is to
    supply a size_t * argument which, if non-NULL, gets set to the length
    of the data written. Permitting a null pointer means that callers who
    don't care about the length don't have to allocate a dummy variable just
    for the purpose.

    If you don't do this (and quite possibly if you do) you should at least
    null-terminate the result. As it stands, the caller can't use the
    returned data without running off the end of the array.
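    Something like this (untested sketch; the length argument works as
    described above, and the error handling is illustrative):

```c
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096

/* Reads the whole of filename into a malloc'd, null-terminated buffer.
   If length is non-NULL, *length is set to the number of bytes read
   (not counting the terminating null).  Returns NULL on any failure. */
char *readcontent(const char *filename, size_t *length)
{
    char *fcontent = NULL, *tmp;
    size_t n = 0, size = 0, result;
    FILE *fp = fopen(filename, "r");

    if (!fp)
        return NULL;

    do {
        size += PAGE_SIZE;
        tmp = realloc(fcontent, size + 1);   /* +1 for the null */
        if (!tmp) {
            free(fcontent);
            fclose(fp);
            return NULL;
        }
        fcontent = tmp;
        result = fread(fcontent + n, 1, PAGE_SIZE, fp);
        n += result;
    } while (result == PAGE_SIZE);

    if (ferror(fp)) {                        /* short read: EOF or error? */
        free(fcontent);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    fcontent[n] = '\0';                      /* caller can use it as a string */
    if (length)
        *length = n;
    return fcontent;
}
```

    A caller that doesn't care about the length just passes NULL for the
    second argument.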

    > char *fcontent = NULL, *tmp = NULL;
    > size_t result, elements_read = 0;
    > int pagenum = 1;
    >
    > FILE *fp;
    > fp = fopen(filename, "r");
    >
    > if(fp) {
    > while(1) {
    > tmp = (char*) realloc(fcontent, PAGE_SIZE * pagenum);


    I would not cast the return from realloc.

    A common trick is to increase the storage used by some factor (> 1)
    rather than adding a fixed amount. This has the advantage of
    reducing the number of realloc calls at the expense of a little more
    data.

    > if(tmp) {
    > fcontent = tmp;
    > }
    > else {
    > free(fcontent);
    > fcontent = NULL;
    > printf("failed to reallocate memory\n");


    I would not print an error message to stdout. In fact, I would not
    print an error message at all! A utility function like this should be
    usable by programs that might want to report errors in all sorts of
    other ways. In general, it is best to put error reporting as "high up"
    in your program as is practical.

    > break;
    > }
    >
    > result = fread(fcontent + elements_read, 1, PAGE_SIZE,
    > fp);
    >
    > if(result != PAGE_SIZE) {
    > if(feof(fp)) {
    > break;
    > }
    > else {
    > printf("error reading file");
    > break;
    > }


    I'd put the break here.

    > }
    > elements_read += result;
    > ++pagenum;
    > }
    > fclose(fp);
    > }
    > return fcontent;
    > }


    <snip>
    --
    Ben.
     
    Ben Bacarisse, Aug 2, 2010
    #2

  3. "Ben Bacarisse" <> wrote in message
    news:...
    > Navaneeth <> writes:
    >
    >> I am trying to load a text file into in-memory character buffer. I
    >> wrote the following code.
    >>
    >> static char *readcontent(const char *filename)
    >> {

    >
    > From a purely functional point of view, I'd say that you should have
    > some way to return how long the file was to the caller. It may not
    > matter in your case but (a) you have the information so it's 'free' and
    > (b) it makes the function more useful. A good way to do this is to
    > supply a size_t * argument which if non-NULL get set to the length of
    > the data written. Permitting a null pointer means that callers who
    > don't care about the length don't have to allocate a dummy variable just
    > for the purpose.
    >
    > If you don't do this (and quite possibly if you so) you should at least
    > null-terminate the result. As it stands, the caller can't use the
    > returned data without running off the end of the array.
    >


    yeah.


    >> char *fcontent = NULL, *tmp = NULL;
    >> size_t result, elements_read = 0;
    >> int pagenum = 1;
    >>
    >> FILE *fp;
    >> fp = fopen(filename, "r");
    >>
    >> if(fp) {
    >> while(1) {
    >> tmp = (char*) realloc(fcontent, PAGE_SIZE * pagenum);

    >
    > I would not cast the return from realloc.
    >


    I disagree. There is no harm done by the cast, and (among several other
    practices commonly disliked here) it also aids in writing code which works
    in both C and C++ compilers.


    > A common trick is to increase the storage used by some factor (> 1)
    > rather than but adding a fixed amount. This has the advantage of
    > reducing the number of realloc calls at the expense of a little more
    > data.
    >


    yeah, the 1.5^x curve works well in my case.
    every time one needs more memory, scale the prior size by 1.5...
    average case overhead is 25%.

    scaling by a fixed size leads to more realloc, which means more time, and
    greater memory fragmentation.
    fragmentation may well be a greater enemy than the raw amount of memory
    used.


    >> if(tmp) {
    >> fcontent = tmp;
    >> }
    >> else {
    >> free(fcontent);
    >> fcontent = NULL;
    >> printf("failed to reallocate memory\n");

    >
    > I would not print an error message to stdout. In fact, I would not
    > print an error message at all! A utility function like should be usable
    > by programs that might want to report errors in all sorts of other
    > ways. In general, it is best to put error reporting as "high up" in the
    > you program as is practical
    >


    yeah, status code and returning NULL being a common way to do this.

    less common ways include callbacks, event queuing, exceptions, longjmp, ...
    but each has drawbacks...

    I list exceptions here, but these are typically OS/compiler/framework
    specific (and not part of standard C), which is their main drawback.


    >> break;
    >> }
    >>
    >> result = fread(fcontent + elements_read, 1, PAGE_SIZE,
    >> fp);
    >>
    >> if(result != PAGE_SIZE) {
    >> if(feof(fp)) {
    >> break;
    >> }
    >> else {
    >> printf("error reading file");
    >> break;
    >> }

    >
    > I'd put the break here.
    >
    >> }
    >> elements_read += result;
    >> ++pagenum;
    >> }
    >> fclose(fp);
    >> }
    >> return fcontent;
    >> }

    >


    this function being a great example of CLC pedantics...


    personally, I would just be like "to hell with it" and just use fseek/ftell,
    since there is almost no way that a text file is going to be larger than 2GB
    and expected to just be read into memory like this (it wouldn't fit on 32
    bit systems anyways...).

    typically, on a 32-bit system, even 500MB malloc's are prone to fail, since
    often some random DLL will end up mapped right into the middle of an
    otherwise clean 1GB span of memory, of the theoretically 2GB or 3GB or so
    available to the app. (for those few apps which try to allocate memory in
    such large chunks...).

    but, memory usage patterns is its own complicated set of issues.


    however, given that the average size of most "sane" text files is measurable
    in kB, with a few larger ones being a few MB, usually there is little risk
    of size overflows just using the fseek/ftell strategy...

    (although, yes, if the file *is* larger than 2GB or so, then the size may
    come back negative or otherwise completely wrong...).
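    a quick sketch of the fseek/ftell version (untested; binary mode so
    the ftell count matches what fread returns, and no attempt to handle
    files that don't fit in a long):

```c
#include <stdio.h>
#include <stdlib.h>

/* Get the size, allocate once, read in one go.  Opening in binary
   mode ("rb") means ftell's byte count matches what fread delivers.
   Returns a null-terminated buffer, or NULL on failure. */
char *slurp(const char *filename, size_t *length)
{
    FILE *fp = fopen(filename, "rb");
    long size;
    char *buf = NULL;

    if (!fp)
        return NULL;

    if (fseek(fp, 0, SEEK_END) == 0 &&
        (size = ftell(fp)) >= 0 &&
        fseek(fp, 0, SEEK_SET) == 0) {
        buf = malloc((size_t)size + 1);      /* +1 for the null */
        if (buf) {
            size_t n = fread(buf, 1, (size_t)size, fp);
            buf[n] = '\0';                   /* use the actual count read */
            if (length)
                *length = n;
        }
    }
    fclose(fp);
    return buf;
}
```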


    but, oh well, whatever...
     
    BGB / cr88192, Aug 2, 2010
    #3
  4. Navaneeth

    ImpalerCore Guest

    On Aug 2, 2:36 pm, "BGB / cr88192" <> wrote:

    <snip>

    > this function being a great example of CLC pedantics...
    >
    > personally, I would just be like "to hell with it" and just use fseek/ftell,
    > since there is almost no way that a text file is going to be larger than 2GB
    > and expected to just be read into memory like this (it wouldn't fit on 32
    > bit systems anyways...).


    Unless you have a data file that's over 2GB and you convert it to an
    ASCII representation for 'grep-ability' and now it's ballooned to over
    9GB. Not saying that it's a good thing, but I have done it before
    when I've been in a pinch.

    > typically, on a 32-bit system, even 500MB malloc's are prone to fail, since
    > often some random DLL will end up mapped right into the middle of an
    > otherwise clean 1GB span of memory, of the theoretically 2GB or 3GB or so
    > available to the app. (for those few apps which try to allocate memory in
    > such large chunks...).


    I experience similar issues with 500MB mallocs failing, and I've not
    been able to determine why. I don't allocate memory that large, but I
    have some test programs that try to blow up memory and noticed the max
    was around 500MB.

    > but, memory usage patterns is its own complicated set of issues.
    >
    > however, given that the average size of most "sane" text files is measurable
    > in kB, with a few larger ones being a few MB, usually there is little risk
    > of size overflows just using the fseek/ftell strategy...
    >
    > (although, yes, if the file *is* larger than 2GB or so, then the size may
    > come back negative or otherwise completely wrong...).
    >
    > but, oh well, whatever...


    As soon as you say getting the size of a file, afaik it's outside the
    scope of the standard. I know in my environment I can't use ftell to
    get a 2GB+ NTFS file size since sizeof( long int ) is 4, and thus I
    have to resort to a system-specific method using uint64_t and
    GetFileAttributesExA. Unfortunately there is no completely portable
    method to get a file size. The best thing in my opinion is to make a
    wrapper function that encapsulates the non-portable behavior so that
    the non-portability is centralized in one spot if at all possible.

    Of course, I wouldn't fault someone for using the ftell/fseek combo if
    that is all that is needed to handle his range of input file sizes as
    long as that person doesn't claim it's portable.
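    Such a wrapper might look like this (an untested sketch; the Windows
    branch follows the GetFileAttributesExA approach mentioned above, and
    the fallback branch assumes a POSIX stat() is available):

```c
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/stat.h>
#endif

/* Centralizes the non-portable part in one spot: returns the file
   size in bytes, or -1 on failure.  uint64_t math avoids the 2GB
   limit of ftell's long return type. */
int64_t file_size(const char *filename)
{
#ifdef _WIN32
    WIN32_FILE_ATTRIBUTE_DATA fad;
    if (!GetFileAttributesExA(filename, GetFileExInfoStandard, &fad))
        return -1;
    return (int64_t)(((uint64_t)fad.nFileSizeHigh << 32) | fad.nFileSizeLow);
#else
    struct stat st;
    if (stat(filename, &st) != 0)
        return -1;
    return (int64_t)st.st_size;
#endif
}
```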

    Best regards,
    John D.
     
    ImpalerCore, Aug 2, 2010
    #4
  5. "ImpalerCore" <> wrote in message
    news:...
    On Aug 2, 2:36 pm, "BGB / cr88192" <> wrote:

    <snip>

    > this function being a great example of CLC pedantics...
    >
    > personally, I would just be like "to hell with it" and just use
    > fseek/ftell,
    > since there is almost no way that a text file is going to be larger than
    > 2GB
    > and expected to just be read into memory like this (it wouldn't fit on 32
    > bit systems anyways...).


    <--
    Unless you have a data file that's over 2GB and you convert it to an
    ASCII representation for 'grep-ability' and now it's ballooned to over
    9GB. Not saying that it's a good thing, but I have done it before
    when I've been in a pinch.
    -->

    note: by "and" I meant T&T=T, not T|T=T.

    but, otherwise, potentially...
    this depends on the app then.

    I suspect this is likely to be a very rare case though, where most text one
    deals with is typically in the kB or low MB range...


    > typically, on a 32-bit system, even 500MB malloc's are prone to fail,
    > since
    > often some random DLL will end up mapped right into the middle of an
    > otherwise clean 1GB span of memory, of the theoretically 2GB or 3GB or so
    > available to the app. (for those few apps which try to allocate memory in
    > such large chunks...).


    <--
    I experience similar issues with 500MB malloc's failure, and I've not
    been able to determine why. I don't allocate memory that large, but I
    have some test programs that try to blow up memory and noticed the max
    was around 500MB.
    -->

    this is due mostly to address-space fragmentation...

    as the size gets larger, the probability of there being an unbroken run of a
    given size becomes increasingly smaller (although, the specifics depend on
    the exact memory layout and allocation behaviors).

    but, 500MB seems to be a good breakoff, as:
    it represents, typically, around 1/4 of the total app-usable address space;
    and the OS will load libraries, ... in ways which are typically not
    convenient for maintaining an unbroken space.


    there is much less of an issue on 64-bit systems, but there are still
    issues, such as trying to allocate a huge chunk of memory (several GB or
    more) will often horridly bog down the system IME (or, at least on Windows).

    on Windows, a better strategy was to use reserved/uncommitted memory (via
    VirtualAlloc), and then to commit it piecewise...

    in this particular case, I had allocated a 4GB chunk of memory, mostly as I
    needed a region of memory with a "window" where any point in the space could
    be reached via a 32-bit offset, whereas with the wider (64-bit) address
    space, I couldn't make any guarantees about distance (2GB would maybe be
    better though, as 2GB allows any point to be reached from any other via a
    signed 32-bit offset).

    I haven't tested on Linux to see what happens with large malloc or mmap
    calls.


    > but, memory usage patterns is its own complicated set of issues.
    >
    > however, given that the average size of most "sane" text files is
    > measurable
    > in kB, with a few larger ones being a few MB, usually there is little risk
    > of size overflows just using the fseek/ftell strategy...
    >
    > (although, yes, if the file *is* larger than 2GB or so, then the size may
    > come back negative or otherwise completely wrong...).
    >
    > but, oh well, whatever...


    <--
    As soon as you say getting the size of a file, afaik it's outside the
    scope of the standard. I know in my environment I can't use ftell to
    get a 2GB+ NTFS file size since sizeof( long int ) is 4, and thus I
    have to resort to using a system specific method using uint64_t and
    GetFileAttributesExA. Unfortunately there is no completely portable
    method to get a file size. The best thing in my opinion is to make a
    wrapper function that encapsulates the non-portable behavior so that
    the non-portability is centralized in one spot if at all possible.
    -->

    agreed...

    IME, one typically ends up with a lot of OS-specific wrappers, usually
    dedicated to their own source files (like, having various OS-specific source
    files, and then swapping them out for the OS in question).
    this is typically much cleaner than the use of fine-grained ifdef's
    (although, sometimes I have used coarse-grained ifdef's in these files,
    typically so that I don't have to mess with the Makefile so much, and so the
    entire file contents disappear if not on the OS in question).


    <--
    Of course, I wouldn't fault someone for using the ftell/fseek combo if
    that is all that is needed to handle his range of input file sizes as
    long as that person doesn't claim it's portable.
    -->

    one can claim it is portable (since it *is* part of the standard), so long
    as they also don't claim it will work with larger files...
     
    BGB / cr88192, Aug 2, 2010
    #5
  6. "BGB / cr88192" <> writes:

    > "Ben Bacarisse" <> wrote in message
    > news:...
    >> Navaneeth <> writes:

    <snip>
    >>> tmp = (char*) realloc(fcontent, PAGE_SIZE * pagenum);

    >>
    >> I would not cast the return from realloc.

    >
    > I disagree. there is no harm done by the cast, and (among several other
    > practices commonly disliked here) it also aides in writing code which works
    > in both C and C++ compilers.


    You disagree that I would not cast the return from realloc? :)

    I tried to put it as mildly as possible with the sole purpose of
    encouraging the OP to think about it. I don't mind either way, but the
    argument about compiling with C++ is (or should be) a red herring. It
    is trivial to link C with C++ and preferable to compile the C with a C
    compiler. In cases like this there is no reason to do anything else.
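    The usual pattern for that (a sketch; the header name is illustrative,
    and the function is the OP's) is a header that wraps the declarations
    in extern "C" for C++ translation units:

```c
/* readcontent.h -- usable from both C and C++ callers.  Compile the
   implementation with a C compiler; a C++ file that includes this
   header links against the resulting object file directly, with no
   need to make the C source itself compile as C++. */
#ifndef READCONTENT_H
#define READCONTENT_H

#ifdef __cplusplus
extern "C" {
#endif

char *readcontent(const char *filename);

#ifdef __cplusplus
}
#endif

#endif /* READCONTENT_H */
```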

    >> A common trick is to increase the storage used by some factor (> 1)
    >> rather than but adding a fixed amount. This has the advantage of
    >> reducing the number of realloc calls at the expense of a little more
    >> data.

    >
    > yeah, the 1.5^x curve works well in my case.
    > every time one needs more memory, scale the prior size by 1.5...
    > average case overhead is 25%.
    >
    > scaling by a fixed size leads to more realloc, which means more time, and
    > greater memory fragmentation.
    > fragmentation may well be a greater enemy than the raw anount of memory
    > used.


    Did you really mean to say "scaling" here? It sounds like you are
    arguing against it having agreed that it is often preferable.

    <snip>
    > this function being a great example of CLC pedantics...
    >
    > personally, I would just be like "to hell with it" and just use fseek/ftell,
    > since there is almost no way that a text file is going to be larger than 2GB
    > and expected to just be read into memory like this (it wouldn't fit on 32
    > bit systems anyways...).


    Putting aside the issue of pedantry, it is worth pointing out that ftell
    does not report the correct size to read in a whole text file on some
    very common platforms. It may usually be a "safe" lie (i.e. you
    get told a size larger than required) but it is, none the less, not the
    size you need to read.

    <snip>
    --
    Ben.
     
    Ben Bacarisse, Aug 3, 2010
    #6
  7. Navaneeth

    Ian Collins Guest

    On 08/ 3/10 11:16 AM, Ben Bacarisse wrote:
    > "BGB / cr88192"<> writes:
    >>
    >> personally, I would just be like "to hell with it" and just use fseek/ftell,
    >> since there is almost no way that a text file is going to be larger than 2GB
    >> and expected to just be read into memory like this (it wouldn't fit on 32
    >> bit systems anyways...).

    >
    > Putting aside the issue of pedantry, it is worth pointing out that ftell
    > does not report the correct size to read in a whole text file on some
    > very common platforms. It may usually be a "safe" lie (i.e. you
    > get told a size larger than required) but it is, none the less, not the
    > size you need to read.


    But on those systems there isn't a direct way of determining the correct
    size to read without scanning the file. Provided the code doesn't
    expect the result of a read to exactly match the size requested, all
    should be well.

    --
    Ian Collins
     
    Ian Collins, Aug 3, 2010
    #7
  8. On Mon, 02 Aug 2010 11:54:38 -0400, Navaneeth <>
    wrote:

    > printf("failed to reallocate memory\n");

    ....
    > if(result != PAGE_SIZE) {
    > if(feof(fp)) {
    > break;
    > }
    > else {
    > printf("error reading file");

    ....
    >
    > This is working fine. Since I am new to C, I am wondering is this the
    > correct approach to do it? Any suggestions to improve the code are
    > welcome.


    If you're going to output an error message, it's more appropriate to
    send it to stderr, e.g.
    fputs("failed to reallocate memory\n", stderr);

    stdout and stderr are very often the same device, but when they're not,
    users expect to see error messages on stderr, not stdout.
    --
    Morris Keesan --
     
    Morris Keesan, Aug 3, 2010
    #8
  9. Navaneeth

    Navaneeth Guest

    > yeah, the 1.5^x curve works well in my case.
    > every time one needs more memory, scale the prior size by 1.5...
    > average case overhead is 25%.
    >
    > scaling by a fixed size leads to more realloc, which means more time, and
    > greater memory fragmentation.
    > fragmentation may well be a greater enemy than the raw anount of memory
    > used.


    Thanks. I am not sure I followed you fully. Can you give me a code
    sample of "realloc" which allocates in the way that you are explaining?
     
    Navaneeth, Aug 3, 2010
    #9
  10. On Aug 3, 7:16 pm, Navaneeth <> wrote:
    >
    >
    > Thanks. I am not sure I followed you fully. Can you give me code
    > sample of "realloc" which allocates in a way that you are explaining?


    >

    He means this

    /*
    untested
    */
    char *loadtext(FILE *fp)
    {
        char *answer;
        void *temp;
        int N = 0;
        int buffsize = 8;
        int ch;

        answer = malloc(buffsize);
        if(!answer)
            return 0;

        while( (ch = fgetc(fp)) != EOF)
        {
            if(N == buffsize - 1)
            {
                buffsize = buffsize + buffsize/2;
                temp = realloc(answer, buffsize);
                if(!temp)
                {
                    free(answer);
                    return 0;
                }
                answer = temp; /* use the reallocated block, not the old pointer */
            }
            answer[N++] = ch;
        }
        answer[N] = 0;
        return answer;
    }
     
    Malcolm McLean, Aug 3, 2010
    #10
  11. Navaneeth

    Navaneeth Guest

    On Aug 3, 9:36 pm, Malcolm McLean <>
    wrote:
    > On Aug 3, 7:16 pm, Navaneeth <> wrote:
    >
    >
    >
    > > Thanks. I am not sure I followed you fully. Can you give me code
    > > sample of "realloc" which allocates in a way that you are explaining?

    >
    > He means this
    >


    Thanks. That helped. Do you need to null-terminate the string when you
    read from a file? What about if it is read using "fread"?
     
    Navaneeth, Aug 3, 2010
    #11
  12. "Ben Bacarisse" <> wrote in message
    news:...
    > "BGB / cr88192" <> writes:
    >
    >> "Ben Bacarisse" <> wrote in message
    >> news:...
    >>> Navaneeth <> writes:

    > <snip>
    >>>> tmp = (char*) realloc(fcontent, PAGE_SIZE * pagenum);
    >>>
    >>> I would not cast the return from realloc.

    >>
    >> I disagree. there is no harm done by the cast, and (among several other
    >> practices commonly disliked here) it also aides in writing code which
    >> works
    >> in both C and C++ compilers.

    >
    > You disagree that I would not cast the return from realloc? :)
    >
    > I tried to put it as mildly as possible with the sole purpose of
    > encouraging the OP to think about it. I don't mind either way, but the
    > argument about compiling with C++ is (or should be) a red herring. It
    > is trivial to link C with C++ and preferable to compile the C with a C
    > compiler. In cases like this there is no reason to do anything else.
    >


    actually, I have done this in several cases, for different reasons:
    a library is written in C, but may be changed to/from C++ later (has
    happened a few times);
    writing C code to be compiled on .NET, since technically .NET doesn't
    support C (only C++/CLI), one has to write their C code in a C++-friendly
    way if they hope for it to work on .NET...

    ....

    and, most of the time, it is for really trivial crap, like, casting the
    result of malloc/realloc calls, or using "()" for no-argument functions
    rather than "(void)", ...

    even if many people here dislike them, this is often the most direct route
    to keeping them compatible, which in many cases does matter.


    >>> A common trick is to increase the storage used by some factor (> 1)
    >>> rather than but adding a fixed amount. This has the advantage of
    >>> reducing the number of realloc calls at the expense of a little more
    >>> data.

    >>
    >> yeah, the 1.5^x curve works well in my case.
    >> every time one needs more memory, scale the prior size by 1.5...
    >> average case overhead is 25%.
    >>
    >> scaling by a fixed size leads to more realloc, which means more time, and
    >> greater memory fragmentation.
    >> fragmentation may well be a greater enemy than the raw anount of memory
    >> used.

    >
    > Did you really mean to say "scaling" here? It sounds like you are
    > arguing against it having agreed that it is often preferable.
    >


    yeah, stupid wording sometimes...

    yeah, scaling by 1.5 works well.
    adding a fixed size each time is problematic...


    > <snip>
    >> this function being a great example of CLC pedantics...
    >>
    >> personally, I would just be like "to hell with it" and just use
    >> fseek/ftell,
    >> since there is almost no way that a text file is going to be larger than
    >> 2GB
    >> and expected to just be read into memory like this (it wouldn't fit on 32
    >> bit systems anyways...).

    >
    > Putting aside the issue of pedantry, it is worth pointing out that ftell
    > does not report the correct size to read in a whole text file on some
    > very common platforms. It may usually be a "safe" lie (i.e. you
    > get told a size larger than required) but it is, none the less, not the
    > size you need to read.
    >


    typically I read text files in binary mode, since this avoids the whole
    issue. it is then a matter of dealing with line-endings, ... oneself. the
    main use of text-mode then is in writing files.

    luckily, on nearly all architectures one is likely to deal with, ASCII (or
    UTF-8) is king and only a few different line-ending styles exist (LF, CR LF,
    and CR).
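    dealing with the line-endings oneself is simple enough; a sketch
    (untested) that folds all three styles to plain LF in place:

```c
#include <stddef.h>

/* Normalizes the three common line-ending styles (LF, CR LF, CR)
   to plain '\n', in place.  Returns the new length; safe to do in
   place because the output is never longer than the input. */
size_t normalize_newlines(char *s, size_t len)
{
    size_t i, j = 0;

    for (i = 0; i < len; i++) {
        if (s[i] == '\r') {
            s[j++] = '\n';
            if (i + 1 < len && s[i + 1] == '\n')
                i++;                /* swallow the LF of a CR LF pair */
        } else {
            s[j++] = s[i];
        }
    }
    return j;
}
```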
     
    BGB / cr88192, Aug 3, 2010
    #12
  13. Navaneeth

    Bartc Guest

    "Gordon Burditt" <> wrote in message
    news:...
    > >As soon as you say getting the size of a file, afaik it's outside the
    >>scope of the standard.

    >
    > One of the more serious practical problems of getting the size of
    > a file is that it can change size between the time that you get the
    > size and when you read the file. Don't underestimate this problem
    > - many programs that analyze things end up being applied to growing
    > log files.


    > Then there's the odd case where
    > on some systems you can create a terabyte-sized "sparse" file (which
    > reads as a terabyte of mostly zero bytes) on a filesystem on a
    > floppy disk (say, 1.44 megabytes) with space to spare.


    > You really need to deal with the problem of the file growing after
    > you get the size, so you don't end up crashing the application
    > due to overrunning the memory you malloc()ed. One approach is to
    > use the "file size" as an initial estimate, and be prepared to grow
    > it if needed. Another approach is to simply not read past the
    > initially-determined file size.
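    The first approach above might be sketched like this (untested; the
    reported size is only an initial estimate, and the buffer grows by
    1.5x if the file turns out to be bigger):

```c
#include <stdio.h>
#include <stdlib.h>

/* Uses the reported file size only as an initial estimate, then keeps
   reading and growing the buffer, so a file that grows between the
   size check and the read doesn't overrun the allocation. */
char *read_growing(FILE *fp, size_t *length)
{
    size_t cap, n = 0, r;
    long est;
    char *buf, *tmp;

    if (fseek(fp, 0, SEEK_END) != 0 || (est = ftell(fp)) < 0)
        est = 0;                       /* unseekable: start small */
    rewind(fp);

    cap = (size_t)est + 1;             /* initial estimate, +1 for null */
    if (cap < 16)
        cap = 16;
    buf = malloc(cap);
    if (!buf)
        return NULL;

    while ((r = fread(buf + n, 1, cap - n - 1, fp)) > 0) {
        n += r;
        if (n + 1 >= cap) {            /* file kept growing: expand by 1.5x */
            size_t ncap = cap + cap / 2;
            tmp = realloc(buf, ncap);
            if (!tmp) { free(buf); return NULL; }
            buf = tmp;
            cap = ncap;
        }
    }
    buf[n] = '\0';
    if (length)
        *length = n;
    return buf;
}
```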


    If you're going to worry about all these possibilities, even for reading a
    6-line config file, then you're never going to get anywhere.

    And you could do everything right, and the file *still* changes just before,
    during, or after you've read it. Or maybe all three.

    --
    Bartc
     
    Bartc, Aug 3, 2010
    #13
  14. "BGB / cr88192" <> writes:

    > "Ben Bacarisse" <> wrote in message
    > news:...
    >> "BGB / cr88192" <> writes:
    >>
    >>> "Ben Bacarisse" <> wrote in message
    >>> news:...
    >>>> Navaneeth <> writes:

    >> <snip>
    >>>>> tmp = (char*) realloc(fcontent, PAGE_SIZE * pagenum);
    >>>>
    >>>> I would not cast the return from realloc.
    >>>
    >>> I disagree. there is no harm done by the cast, and (among several other
    >>> practices commonly disliked here) it also aides in writing code which
    >>> works
    >>> in both C and C++ compilers.

    >>
    >> You disagree that I would not cast the return from realloc? :)
    >>
    >> I tried to put it as mildly as possible with the sole purpose of
    >> encouraging the OP to think about it. I don't mind either way, but the
    >> argument about compiling with C++ is (or should be) a red herring. It
    >> is trivial to link C with C++ and preferable to compile the C with a C
    >> compiler. In cases like this there is no reason to do anything else.
    >>

    >
    > actually, I have done this on several cases, for different reasons:
    > a library is written in C, but may be changed to/from C++ later (has
    > happened a few times);


    That sounds circular to me. It does not say why you would change C code
    to C++ and then possibly back again. Yes, if you switch compilers the
    code will need to be both, but you don't say why it makes sense to work
    this way in the first place.

    > writing C code to be compiler on .NET, since technically .NET doesn't
    > support C (only C++/CLI), one has to write their C code in a C++ friendly
    > way if they hope for it to work on .NET...


    C++/CLI is, of course, yet another language even further from C. Trying
    to get source code compatibility with C++/CLI seems to me to be doomed
    except for the simplest code fragments. And I am not sure what you mean
    by "does not support C". It was my understanding (patchy, I'll admit)
    that C++/CLI can link to native C functions just fine. I.e. the generic
    "compile as C and link" advice stands for C++/CLI as well.

    I think that writing C with an eye to C++ or, worse, to C++/CLI is not
    advisable in most cases.

    <snip>
    --
    Ben.
     
    Ben Bacarisse, Aug 3, 2010
    #14
  15. "Ben Bacarisse" <> wrote in message
    news:...
    > "BGB / cr88192" <> writes:
    >
    >> "Ben Bacarisse" <> wrote in message
    >> news:...
    >>> "BGB / cr88192" <> writes:
    >>>
    >>>> "Ben Bacarisse" <> wrote in message
    >>>> news:...
    >>>>> Navaneeth <> writes:
    >>> <snip>
    >>>>>> tmp = (char*) realloc(fcontent, PAGE_SIZE * pagenum);
    >>>>>
    >>>>> I would not cast the return from realloc.
    >>>>
    >>>> I disagree. there is no harm done by the cast, and (among several other
    >>>> practices commonly disliked here) it also aides in writing code which
    >>>> works
    >>>> in both C and C++ compilers.
    >>>
    >>> You disagree that I would not cast the return from realloc? :)
    >>>
    >>> I tried to put it as mildly as possible with the sole purpose of
    >>> encouraging the OP to think about it. I don't mind either way, but the
    >>> argument about compiling with C++ is (or should be) a red herring. It
    >>> is trivial to link C with C++ and preferable to compile the C with a C
    >>> compiler. In cases like this there is no reason to do anything else.
    >>>

    >>
    >> actually, I have done this on several cases, for different reasons:
    >> a library is written in C, but may be changed to/from C++ later (has
    >> happened a few times);

    >
    > That sounds circular to me. It does not say why you change C code to
    > C++ and then possible back again. Yes, if you switch compilers the code
    > will need to be both, but you don't say why it makes sense to work this
    > way in the first place.
    >


    usually, it is because code is very fluid.
    sometimes one will write it for one purpose, copy it somewhere else, and use
    it for another task in a new context. sometimes, one moves across language
    barriers as well.

    C <-> C++ is a common one, but I have also done copy-paste-edit to move code
    between C and Java a few times as well, ...


    >> writing C code to be compiler on .NET, since technically .NET doesn't
    >> support C (only C++/CLI), one has to write their C code in a C++ friendly
    >> way if they hope for it to work on .NET...

    >
    > C++/CLI is, of course, yet another language even further from C. Trying
    > to get source code compatibility with C++/CLI seems to me to be doomed
    > except for the simplest code fragments. And I am not sure what you mean
    > by "does not support C". It was my understanding (patchy, I'll admit)
    > that C++/CLI can link to native C functions just fine. I.e. the generic
    > "compile as C and link" advice stands for C++/CLI as well.
    >


    well, yes, but only if one is compiling the C or C++/CLI code as native
    code. one has to compile the code as C++/CLI, though, if one wants it to
    be compiled to MSIL.


    > I think that writing C with an eye to C++ or, worse, to C++/CLI is not
    > advisable in most cases.
    >


    possibly...

    but, I also tend to write my code to be processed via fairly dumb
    automated tools, most of which have only a partial understanding of C
    syntax (and typically impose many restrictions, such as limiting valid
    declaration syntax, modifier order, ...).

    writing code so that it can also be fed through C++/CLI is not asking
    much...


    hell, some people went and compiled Quake2 using C++/CLI (Quake2 is itself
    written in C), so it is probably not asking that much...
     
    BGB / cr88192, Aug 4, 2010
    #15
  16. Navaneeth

    Ian Collins Guest

    On 08/ 4/10 06:21 AM, BGB / cr88192 wrote:
    > "Ben Bacarisse"<> wrote:
    >>
    >> I tried to put it as mildly as possible with the sole purpose of
    >> encouraging the OP to think about it. I don't mind either way, but the
    >> argument about compiling with C++ is (or should be) a red herring. It
    >> is trivial to link C with C++ and preferable to compile the C with a C
    >> compiler. In cases like this there is no reason to do anything else.

    >
    > actually, I have done this in several cases, for different reasons:
    > a library is written in C, but may be changed to/from C++ later (this
    > has happened a few times);


    It sounds like you need to organise your code to enable you to link C
    and C++ libraries, rather than copy and paste code.

    > writing C code to be compiled on .NET, since technically .NET doesn't
    > support C (only C++/CLI), one has to write their C code in a C++ friendly
    > way if they hope for it to work on .NET...


    Probably because there isn't a good reason for writing C on that
    platform. Why don't you just link C functions from a library?

    > and, most of the time, it is for really trivial crap, like, casting the
    > result of malloc/realloc calls, or using "()" for no-argument functions
    > rather than "(void)", ...
    >
    > even if many people here dislike them, this is often the most direct route
    > to keeping them compatible, which in many cases does matter.


    I'm one of the few people I know who compiles C as C++ and that's only
    for a specific scenario (testing). Even then, I'll keep the code that
    has to be C++ compatible separate from the rest of the source.

    --
    Ian Collins
     
    Ian Collins, Aug 4, 2010
    #16
  17. "Ian Collins" <> wrote in message
    news:...
    > On 08/ 4/10 06:21 AM, BGB / cr88192 wrote:
    >> "Ben Bacarisse"<> wrote:
    >>>
    >>> I tried to put it as mildly as possible with the sole purpose of
    >>> encouraging the OP to think about it. I don't mind either way, but the
    >>> argument about compiling with C++ is (or should be) a red herring. It
    >>> is trivial to link C with C++ and preferable to compile the C with a C
    >>> compiler. In cases like this there is no reason to do anything else.

    >>
    >> actually, I have done this in several cases, for different reasons:
    >> a library is written in C, but may be changed to/from C++ later (this
    >> has happened a few times);

    >
    > It sounds like you need to organise your code to enable you to link C and
    > C++ libraries, rather than copy and paste code.
    >


    I go both routes...

    usually copy/paste happens if one is doing mix and match of code from
    different places, or codebases, in order to make a combined result...


    >> writing C code to be compiled on .NET, since technically .NET doesn't
    >> support C (only C++/CLI), one has to write their C code in a C++ friendly
    >> way if they hope for it to work on .NET...

    >
    > Probably because there isn't a good reason for writing C on that platform.
    > Why don't you just link C functions from a library?
    >


    the reason is mostly one of CPU arch:
    if one uses 32-bit native DLL's, they are stuck with x86;
    if they use MSIL, then the JIT can do its magic and compile as either
    32-bit or 64-bit.

    otherwise, they would have to build separate 32 and 64 bit versions of their
    DLL's...

    but, mind you, I do make a lot of use of libraries as well; it's just
    that libraries and copy/paste/edit each have their merits and drawbacks,
    and which is better depends a lot on the situation...


    >> and, most of the time, it is for really trivial crap, like, casting the
    >> result of malloc/realloc calls, or using "()" for no-argument functions
    >> rather than "(void)", ...
    >>
    >> even if many people here dislike them, this is often the most direct
    >> route
    >> to keeping them compatible, which in many cases does matter.

    >
    > I'm one of the few people I know who compiles C as C++ and that's only for
    > a specific scenario (testing). Even then, I'll keep the code that has to
    > be C++ compatible separate from the rest of the source.
    >


    possibly, however, as making it convert easily is almost free, I see little
    reason not to do it wherever it makes sense...
     
    BGB / cr88192, Aug 4, 2010
    #17
  18. Navaneeth

    Ian Collins Guest

    On 08/ 4/10 04:06 PM, BGB / cr88192 wrote:
    > "Ian Collins"<> wrote:
    >> On 08/ 4/10 06:21 AM, BGB / cr88192 wrote:

    >
    >>> and, most of the time, it is for really trivial crap, like, casting the
    >>> result of malloc/realloc calls, or using "()" for no-argument functions
    >>> rather than "(void)", ...
    >>>
    >>> even if many people here dislike them, this is often the most direct
    >>> route
    >>> to keeping them compatible, which in many cases does matter.

    >>
    >> I'm one of the few people I know who compiles C as C++ and that's only for
    >> a specific scenario (testing). Even then, I'll keep the code that has to
    >> be C++ compatible separate from the rest of the source.

    >
    > possibly, however, as making it convert easily is almost free, I see little
    > reason not to do it wherever it makes sense...


    The most obvious reason not to do it is idiomatic C != idiomatic C++!

    --
    Ian Collins
     
    Ian Collins, Aug 4, 2010
    #18
  19. Navaneeth

    Navaneeth Guest

    >   untested
    > */
    > char *loadtext(FILE *fp)
    > {
    >   char *answer;
    >   void *temp;
    >   int N = 0;
    >   int buffsize = 8;
    >   int ch;
    >
    >   answer = malloc(buffsize);
    >   if(!answer)
    >     return 0;
    >
    >   while( (ch = fgetc(fp)) != EOF)
    >   {
    >     if(N == buffsize - 1)
    >     {
    >       buffsize = buffsize + buffsize/2;
    >       temp = realloc(answer, buffsize);
    >       if(!temp)
    >       {
    >         free(answer);
    >         return 0;
    >       }
    >       answer = temp;  /* keep the grown buffer */
    >     }
    >     answer[N++] = ch;
    >   }
    >   answer[N] = 0;
    >   return answer;
    > }
    >
    >


    Well, if I use this, I think I will have more allocations. Consider a
    10KB file. I am allocating 4KB (I have PAGE_SIZE set to 4096 bytes)
    first. With my code, I will allocate 8KB on the second iteration and
    12KB on the last iteration. In your code, the second allocation will
    be for 6KB, the third for 9KB, and the last for 13KB. So my initial
    code finishes in just 3 allocations while the new code takes 4.

    I don't understand how this is a better strategy for allocations?
     
    Navaneeth, Aug 4, 2010
    #19
  20. On 3 Aug, 19:21, "BGB / cr88192" <> wrote:
    > "Ben Bacarisse" <> wrote in message
    > news:...
    > > "BGB / cr88192" <> writes:
    > >> "Ben Bacarisse" <> wrote in message
    > >>news:...
    > >>> Navaneeth <> writes:


    <snip>

    > >> [...] code which works in both C and C++ compilers.


    <snip>

    > and, most of the time, it is for really trivial crap, like, casting the
    > result of malloc/realloc calls, or using "()" for no-argument functions
    > rather than "(void)", ...


    just curious, what's wrong with using (void) for an empty argument list
    in C++? I always thought this was just a style thing.
     
    Nick Keighley, Aug 4, 2010
    #20