Strange C developments

Discussion in 'C Programming' started by Jase Schick, Jul 20, 2012.

  1. Jase Schick

    Jase Schick Guest

    Hi Can anyone explain why C has added support for pthreads, while NOT
    adding support for garbage collection? Convenient memory management would
    be a much greater enhancement than replicating a perfectly good existing
    library it seems to me.

    Jase
    Jase Schick, Jul 20, 2012
    #1

  2. Jase Schick

    Les Cargill Guest

    Jase Schick wrote:
    > Hi Can anyone explain why C has added support for pthreads, while NOT
    > adding support for garbage collection?


    GC in 'C' is a non sequitur.

    > Convenient memory management would
    > be a much greater enhancement than replicating a perfectly good existing
    > library it seems to me.
    >
    > Jase
    >


    I think you want Java, then.

    --
    Les Cargill
    Les Cargill, Jul 21, 2012
    #2

  3. Jase Schick

    Stefan Ram Guest

    "christian.bau" <> writes:
    >And sorry to say, but C wasn't designed with garbage collection in
    >mind.


    I believe that a GC is both important and nice, a great productivity
    enhancer, but for /higher-level languages/. C, on the other hand,
    is /intended to be/ a low-level language. A language that adds just
    a thin layer over the machine language. It is intended to be
    a language to possibly /implement a garbage collector in/, when
    it should be needed for a higher level language.

    This is about layering: One does not complain that features of a
    higher layer are not present in a lower layer, because this would
    break the layering. When one wants to use Perl, Java, or LISP
    instead of C, one is always free to do so.

    There also is the Boehm-Demers-Weiser conservative garbage collector.
    Stefan Ram, Jul 21, 2012
    #3
  4. Jase Schick

    Rui Maciel Guest

    Jase Schick wrote:

    > Hi Can anyone explain why C has added support for pthreads, while NOT
    > adding support for garbage collection? Convenient memory management would
    > be a much greater enhancement than replicating a perfectly good existing
    > library it seems to me.


    Why do you believe that garbage collection should be added to the C
    standard?


    Rui Maciel
    Rui Maciel, Jul 21, 2012
    #4
  5. Jase Schick

    BGB Guest

    On 7/20/2012 4:14 PM, Jase Schick wrote:
    > Hi Can anyone explain why C has added support for pthreads, while NOT
    > adding support for garbage collection? Convenient memory management would
    > be a much greater enhancement than replicating a perfectly good existing
    > library it seems to me.
    >


    more people can agree on the behavior of a threading library than they
    can on the behavior of a GC library?...

    a few possible issues with a GC:
    precise or conservative GC?
    how does it represent references?
    how does it interact with stack variables, global variables, or malloc?
    does it simply behave like malloc, or does it also preserve type
    information?
    ....

    some of these issues would need to be addressed, and as-is, people get
    by ok either using or implementing garbage-collection via libraries.
    BGB, Jul 21, 2012
    #5
  6. Jase Schick

    Nobody Guest

    On Fri, 20 Jul 2012 21:14:50 +0000, Jase Schick wrote:

    > Hi Can anyone explain why C has added support for pthreads, while NOT
    > adding support for garbage collection?


    GC requires a "walled garden", which C isn't.

    You can't just "add support" for GC, you have to design the language
    around the ability to enumerate references.
    Nobody, Jul 21, 2012
    #6
  7. Jase Schick

    jacob navia Guest

    On 20/07/12 23:14, Jase Schick wrote:
    > Hi Can anyone explain why C has added support for pthreads, while NOT
    > adding support for garbage collection? Convenient memory management would
    > be a much greater enhancement than replicating a perfectly good existing
    > library it seems to me.
    >
    > Jase
    >

    The lcc-win compiler offers a garbage collector (Boehm's) in its
    standard distribution. It is a very useful feature, used for instance in
    the debugger of lcc-win, in the IDE and several other applications. Of
    course it is used by many of the people that have downloaded lcc-win
    (more than 1 million)
    jacob navia, Jul 21, 2012
    #7
  8. Jase Schick

    Quentin Pope Guest

    On Sat, 21 Jul 2012 11:43:36 +0200, jacob navia wrote:
    > On 20/07/12 23:14, Jase Schick wrote:
    >> Hi Can anyone explain why C has added support for pthreads, while NOT
    >> adding support for garbage collection? Convenient memory management
    >> would be a much greater enhancement than replicating a perfectly good
    >> existing library it seems to me.
    >>
    >> Jase
    >>

    > The lcc-win compiler offers a garbage collector (Boehm's) in its
    > standard distribution. It is a very useful feature, used for instance in
    > the debugger of lcc-win, in the IDE and several other applications. Of
    > course it is used by many of the people that have downloaded lcc-win
    > (more than 1 million)


    Do you never get tired of spamming this group with advertising for your
    compiler?

    Adding garbage collection would break a large amount of existing code.

    Often the bottom couple of bits of pointers to memory with known
    alignment properties will be used to store information (the pointer then
    being and'd with ~0x3ul or similar prior to dereferencing).

    Many code protection methods rely on storing pointers xor'd with an
    obfuscating mask. GCs are not sophisticated enough to track such pointers.
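A minimal sketch of the two idioms described above, for readers who have not seen them. The helper names (`tag_pointer`, `untag_pointer`, `pointer_tag`, `mask_pointer`, `unmask_pointer`) are hypothetical and do not come from any post in this thread. The sketch assumes a mainstream flat-address platform where a pointer survives a round trip through `uintptr_t` after bit manipulation; the C standard leaves the result of that arithmetic implementation-defined.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration only. */

/* Store a 2-bit tag in the low bits of a pointer that is at least
   4-byte aligned (so those bits are known to be zero). */
static uintptr_t tag_pointer(void *p, unsigned tag)
{
    return (uintptr_t)p | (uintptr_t)(tag & 0x3u);
}

/* Clear the tag bits to recover the address before dereferencing. */
static void *untag_pointer(uintptr_t tagged)
{
    return (void *)(tagged & ~(uintptr_t)0x3);
}

/* Extract the 2-bit tag. */
static unsigned pointer_tag(uintptr_t tagged)
{
    return (unsigned)(tagged & 0x3u);
}

/* XOR obfuscation: the stored value no longer looks like an address. */
static uintptr_t mask_pointer(void *p, uintptr_t key)
{
    return (uintptr_t)p ^ key;
}

/* XOR with the same key restores the original bits. */
static void *unmask_pointer(uintptr_t masked, uintptr_t key)
{
    return (void *)(masked ^ key);
}
```

The mask here is written as `~(uintptr_t)0x3` rather than `~0x3ul` because `unsigned long` is 32 bits wide on 64-bit Windows, where the shorter spelling would also clear the upper half of the address. A tagged value still points inside its object, while an XOR-masked value usually points nowhere meaningful, which is why the two cases are treated differently later in the thread.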

    And what is the gain? With careful programming, there is no need
    whatsoever for this stupid overhead. Leave it for the kiddies programming
    JAVA.

    //QP
    Quentin Pope, Jul 21, 2012
    #8
  9. Jase Schick

    jacob navia Guest

    On 21/07/12 12:23, Quentin Pope wrote:
    > On Sat, 21 Jul 2012 11:43:36 +0200, jacob navia wrote:
    >> On 20/07/12 23:14, Jase Schick wrote:
    >>> Hi Can anyone explain why C has added support for pthreads, while NOT
    >>> adding support for garbage collection? Convenient memory management
    >>> would be a much greater enhancement than replicating a perfectly good
    >>> existing library it seems to me.
    >>>
    >>> Jase
    >>>

    >> The lcc-win compiler offers a garbage collector (Boehm's) in its
    >> standard distribution. It is a very useful feature, used for instance in
    >> the debugger of lcc-win, in the IDE and several other applications. Of
    >> course it is used by many of the people that have downloaded lcc-win
    >> (more than 1 million)

    >
    > Do you never get tired of spamming this group with advertising for your
    > compiler?
    >
    > Adding garbage collection would break a large amount of existing code.
    >


    To port existing code to a GC environment you do not need to change a
    single line. Just define malloc as gc_malloc and define free as a noop.
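A rough sketch of the redirection described above. `gc_malloc` here is a hypothetical stub that only counts allocations and never reclaims anything, so it leaks by design; a real collector's allocator would take its place. `sum_squares` stands in for existing code. Note that defining macros named `malloc` and `free` after including `<stdlib.h>` is not sanctioned by the C standard, although it works with common toolchains.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a collector's allocator. It hands out
   zeroed memory and counts blocks; it does not collect anything. */
static size_t gc_blocks_allocated = 0;

static void *gc_malloc(size_t size)
{
    void *p = calloc(1, size);
    if (p != NULL)
        gc_blocks_allocated++;
    return p;
}

/* The redirection: from here on, calls to malloc go to gc_malloc
   and calls to free evaluate their argument and do nothing. */
#define malloc(n) gc_malloc(n)
#define free(p)   ((void)(p))

/* "Existing code", compiled unchanged below the two macros. */
static int sum_squares(int n)
{
    int *v = malloc((size_t)n * sizeof *v);
    int total = 0;
    if (v == NULL)
        return -1;
    for (int i = 0; i < n; i++) {
        v[i] = i * i;
        total += v[i];
    }
    free(v); /* now a no-op; reclaiming v is left to the collector */
    return total;
}
```

The source of `sum_squares` is untouched, but it must be recompiled with the macros in scope, and any library that was built separately still calls the ordinary `malloc`.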


    > Often the bottom couple of bits of pointers to memory with known
    > alignment properties will be used to store information (the pointer then
    > being and'd with ~0x3ul or similar prior to dereferencing).
    >


    ??? That is not the case with the GC used by lcc-win.

    > Many code protection methods rely on storing pointers xor'd with an
    > obfuscating mask. GCs are not sophisticated enough to track such pointers.
    >


    Yes, that kind of code shouldn't be used with a GC.

    > And what is the gain?


    The gain is that instead of losing endless hours tracking that dangling
    pointer in the debugger you can concentrate on your application instead.


    > With careful programming, there is no need
    > whatsoever for this stupid overhead.


    You fail to mention: "With careful programming and not making any
    mistake. NEVER." A single moment of inattention and you are screwed.


    > Leave it for the kiddies programming
    > JAVA.
    >


    JAVA, Lisp, C++, C: all are languages that can be used with a collector.
    jacob navia, Jul 21, 2012
    #9
  10. On Saturday, 21 July 2012 11:23:52 UTC+1, Quentin Pope wrote:
    >
    >
    > Often the bottom couple of bits of pointers to memory with known
    > alignment properties will be used to store information (the pointer then
    > being anded with ~0x3ul or similar prior to dereferencing).
    >
    > Many code protection methods rely on storing pointers xored with an
    > obfuscating mask. GCs are not sophisticated enough to track such pointers.
    >

    Rarely do you need to do this sort of thing, particularly in a hosted environment.

    The gain is that far too much C code is concerned with handling memory
    allocation failures that can't realistically happen, and with the
    clean-up they require. For instance, the user provides a list of
    filenames in a configuration file. I need to read them in and return
    them as a list of strings. If a memory allocation failure occurs halfway
    through building the list, I've got to deallocate a half-built list and
    return an error condition, probably a null pointer. The code to handle
    this will probably be about half the function, even though if I've 4GB
    of memory installed and the total allocation is 1000 bytes, the computer
    is more likely to suffer an electrical failure than it is to run out of
    memory.

    Then you've got to write a little function just to deallocate the list
    of strings, in normal use.
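To make the point concrete, here is a small sketch of the pattern described above, written with plain `malloc` and `free`. The function names (`copy_string_list`, `free_string_list`) are hypothetical, and the input is an in-memory array rather than a configuration file, to keep the example short.

```c
#include <stdlib.h>
#include <string.h>

/* The "little function" needed in normal use: free n strings,
   then the array that holds them. */
static void free_string_list(char **list, size_t n)
{
    if (list == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free(list[i]);
    free(list);
}

/* Copy n names into a freshly allocated list. On any allocation
   failure, release the half-built list and return NULL. */
static char **copy_string_list(const char *const *names, size_t n)
{
    char **list = malloc((n ? n : 1) * sizeof *list);
    if (list == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]) + 1;
        list[i] = malloc(len);
        if (list[i] == NULL) {
            free_string_list(list, i); /* undo the i strings built so far */
            return NULL;
        }
        memcpy(list[i], names[i], len);
    }
    return list;
}
```

With a collector, the cleanup branch and `free_string_list` would both disappear, although the NULL checks on allocation would usually remain.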

    The reason I don't use garbage collection is a) it's non-standard and b) most garbage collectors are unacceptably inefficient for high performance routines. But it's a blessing from the coding angle.
    Malcolm McLean, Jul 21, 2012
    #10
  11. Jase Schick

    BGB Guest

    On 7/21/2012 4:32 AM, Nobody wrote:
    > On Fri, 20 Jul 2012 21:14:50 +0000, Jase Schick wrote:
    >
    >> Hi Can anyone explain why C has added support for pthreads, while NOT
    >> adding support for garbage collection?

    >
    > GC requires a "walled garden", which C isn't.
    >
    > You can't just "add support" for GC, you have to design the language
    > around the ability to enumerate references.
    >


    actually, it can be done, but mostly in the form of conservative GCs,
    such as Boehm.

    I also use a GC, which is functionally similar to the Boehm GC as well,
    but differs mostly in the use of type-tagging (and some ability to
    assign special treatment to type-tags), so it is a little different.

    I ended up primarily using it with a hybrid strategy, where manual
    memory-management is used for the most part, and GC is used mostly to
    clean up for the case of memory leaks.
    BGB, Jul 21, 2012
    #11
  12. Jase Schick

    BGB Guest

    On 7/21/2012 8:57 AM, Gordon Burditt wrote:
    >>> Adding garbage collection would break a large amount of existing code.
    >>>

    >>
    >> To port existing code to a GC environment you do not need to change a
    >> single line. Just define malloc as gc_malloc and define free as a noop.

    >
    > Jacob, what rules would have to be added for application writers
    > to use the GC in lcc-win besides the C standard and "don't invoke
    > undefined behavior"? Hint: the answer is not "none". And that's
    > not intended as a put-down of lcc's GC or GC in general.
    >
    > Or, to put it another way which I'm sure you will still think of
    > as a personal attack, list the things an application programmer
    > could do to break GC that no sane programmer would do (someone will
    > come up with a sane-sounding reason for doing it anyway) but the C
    > standard still allows. If the C standard calls it undefined behavior
    > on the part of the application, GC is off the hook.
    >
    > I'll define "break GC" as any one or more of the following:
    > - Collecting non-garbage
    > - Aborting the program with things like segfaults.
    > - A 1,000,000% slowdown
    > - A GC so conservative it never collects anything.
    >


    in my case, the policy is basically just to give code the choice.
    if it wants to use malloc, it can use malloc;
    if it wants to use the GC, it can use the GC.

    the GC won't generally trace through memory allocated via malloc though.
    in my case, there is a "gcmalloc()" call, which is behaviorally similar
    to malloc (it won't automatically release the memory), except that the GC
    will trace through it.

    putting pointers to malloc-managed memory in GC-managed objects works,
    just the GC will ignore them.

    it is possible to register behavior hooks with the GC to allow
    custom-managed memory regions.


    usually, the GC won't collect anything until after a certain amount of
    memory is used (can be set by the program), so if the app keeps its
    memory use below this limit, the GC won't run (this currently leads to a
    case of setting the limit at something like 1GB, which makes the GC
    running fairly unlikely, except in cases where the app "springs a leak").


    > I'm pretty sure that writing pointers to a (potentially terabyte)
    > temporary file, erasing the pointers from memory, and later reading
    > the pointers back from the file (all done by the same run of the
    > same program) will either confuse or horribly slow down GC. And I
    > think that's perfectly OK with the C standard.


    actually, more likely, it will become timing dependent:
    if the GC runs during the time the pointers no longer exist, the objects
    may be freed;
    if it does not, nothing happens (the pointers will just come back in
    pointing at the same objects).

    in the case where the pointers are read back in for objects which have
    been freed, then essentially they are just dangling pointers (and the GC
    will either ignore them, or treat them as if they point to whatever new
    object was allocated at that address).


    this case only really matters if the GC is being used to replace malloc
    or similar.

    otherwise, a person could declare that the case of temporarily hiding
    and then restoring pointers is itself undefined-behavior.


    >>> Often the bottom couple of bits of pointers to memory with known
    >>> alignment properties will be used to store information (the pointer then
    >>> being and'd with ~0x3ul or similar prior to dereferencing).
    >>>

    >>
    >> ??? That is not the case with the GC used by lcc-win.

    >
    > I take that to mean: often an application program will abuse the bottom
    > couple of bits of pointers to memory with known alignment properties to
    > store information, and this will break the GC. I think doing that to
    > pointers invokes undefined behavior, at least if you do stuff like
    > void *ptr;
    >
    > ptr |= 3;
    > or ptr &= ~3;
    >
    > so the GC has a perfectly good excuse for breaking.
    >


    my GC will handle this case just fine (my GC is more conservative, and
    so will by-default treat a pointer to anywhere in the object as if it
    were a GC reference). likewise, most GC operations will accept these
    pointers as well, and there is also a "gcGetBase()" operation mostly
    specially for this case: it allows taking a pointer into an object and
    getting the starting address of the object.
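A sketch of how an interior pointer can be mapped back to the start of its object, in the spirit of the "gcGetBase()" operation mentioned above. This is not BGB's implementation: `register_block`, `get_base`, and the fixed-size table are assumptions made for illustration, and a real collector would use a faster lookup than a linear scan.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCKS 16

/* One entry per tracked object: start address and size in bytes. */
struct block {
    uintptr_t start;
    size_t size;
};

static struct block blocks[MAX_BLOCKS];
static size_t block_count = 0;

/* Record a block. Returns 0 on success, -1 if the table is full. */
static int register_block(void *start, size_t size)
{
    if (block_count == MAX_BLOCKS)
        return -1;
    blocks[block_count].start = (uintptr_t)start;
    blocks[block_count].size = size;
    block_count++;
    return 0;
}

/* Hypothetical analogue of gcGetBase(): given a pointer anywhere
   inside a tracked block, return the block's start, else NULL. */
static void *get_base(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    for (size_t i = 0; i < block_count; i++) {
        if (addr >= blocks[i].start && addr - blocks[i].start < blocks[i].size)
            return (void *)blocks[i].start;
    }
    return NULL;
}
```

Because a pointer with tag bits set in its low two bits still lands inside the object, a lookup of this kind resolves it to the same base as the untagged pointer.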


    I don't know how Boehm handles it exactly, but IIRC there are script VMs
    which use Boehm and use similar tagging schemes.


    I don't personally use this sort of tagging in my VM, instead having the
    GC keep track of the type, so whenever the VM needs to do something, it
    will fetch the type-name for an object, or maybe fetch an associated
    type-vtable (the VM has tables of function-pointers used to represent
    common operations on objects of various types).

    note that there is also a Class/Instance OO system, but this uses its
    own independent vtables, and these objects exist as a single type from
    the POV of the GC.


    >>> Many code protection methods rely on storing pointers xor'd with an
    >>> obfuscating mask. GCs are not sophisticated enough to track such pointers.

    >
    > Code protection methods are designed to break the application, but
    > the C standard doesn't forbid that.
    >


    simple: label the case of XOR'ed pointers + GC'ed objects as undefined.


    > I'm not sure the C standard allows that specific method of disguising
    > pointers, but I think you are allowed to format a pointer with
    > sprintf() and %p, erase the original pointer, then later feed the
    > buffer sprintf() wrote into sscanf() and %p, and get back the same
    > pointer. While it's in text form, you could do all sorts of things
    > to it (encrypt, store it in a file, etc.) , as long as it eventually
    > gets back to the original text.
    >


    possible, but this case can also be labeled as undefined with GC
    objects, or maybe, it can be restricted to certain cases, such as it is
    only allowed if the objects are pinned/locked beforehand, or if they
    were allocated either with malloc or a malloc-like call (and so will not
    be implicitly freed).
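For reference, a short sketch of the text round trip Gordon describes, using only standard library calls. The wrapper names (`pointer_to_text`, `pointer_from_text`) are hypothetical. The C standard requires that a pointer printed with `%p` and read back with `%p` during the same program execution compares equal to the original.

```c
#include <stdio.h>

/* Write p as text into buf. Returns 0 on success, -1 if the text
   does not fit in size bytes (including the terminator). */
static int pointer_to_text(void *p, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%p", p);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Read a pointer back from text produced by pointer_to_text.
   Returns NULL if the text cannot be parsed. */
static void *pointer_from_text(const char *text)
{
    void *p = NULL;
    if (sscanf(text, "%p", &p) != 1)
        return NULL;
    return p;
}
```

While the address exists only as characters in the buffer, a collector scanning memory for pointer-sized values has no way to recognize it, which is the failure mode under discussion.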


    >> Yes, that kind of code shouldn't be used with a GC.

    >
    > People want to know how to recognize "that kind of code" and not
    > try to use GC with it, rather than trying it and figuring it out
    > the hard way. And I don't really see a "don't disguise pointers"
    > rule as being too onerous if you can clearly define what *ISN'T*
    > disguised. Probably 99.9% of programs would need no change.
    >


    in my case, it is more like:
    pointers stored in global variables are visible (1);
    pointers stored in stack variables are visible (2);
    pointers stored in other GC managed objects are visible.


    1: on both Windows and Linux, the GC will walk the list of loaded
    modules and scan the ".data" and ".bss" sections and similar. this also
    includes any "static" variables within functions.

    2: this can be a little harder. in my case, the GC supplies its own
    thread-creation functions, but it is possible to do so without doing
    this (for example, Boehm walks the OS thread list AFAIK). there is a
    difficulty with knowing the exact stack address on WoW64 targets (an
    issue for Boehm AFAIK), but my GC uses a different strategy: it just
    scans the stack until either the known limit is encountered, or Windows
    throws a SEH exception (when the scan function tries to scan into an
    uninitialized guard page), which the function catches and handles.
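The root scanning described above can be sketched roughly as follows. This is a toy illustration, not BGB's or Boehm's code: the "heap" is a small static array, the region being scanned is an ordinary array standing in for a stack or a ".data" section, and `scan_region` is a hypothetical name. It assumes a flat address space where integer comparison of `uintptr_t` values matches address order.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_BLOCKS 4
#define BLOCK_SIZE  32

/* A toy heap of fixed-size blocks, with one mark flag per block. */
static unsigned char heap[HEAP_BLOCKS][BLOCK_SIZE];
static int marked[HEAP_BLOCKS];

/* Conservative scan: treat every word in the region as a possible
   pointer, and mark the block it lands in if it falls in the heap. */
static void scan_region(const uintptr_t *words, size_t count)
{
    uintptr_t lo = (uintptr_t)&heap[0][0];
    uintptr_t hi = lo + sizeof heap;
    for (size_t i = 0; i < count; i++) {
        if (words[i] >= lo && words[i] < hi)
            marked[(words[i] - lo) / BLOCK_SIZE] = 1;
    }
}
```

An integer that merely happens to look like a heap address keeps its block alive as well. That false retention is the price of not knowing which words are really pointers.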


    >>> And what is the gain?

    >>
    >> The gain is that instead of losing endless hours tracking that dangling
    >> pointer in the debugger you can concentrate on your application instead.

    >
    > Oh, yes, another rule for a working GC is that the GC has to
    > occasionally actually collect some garbage if there's any around
    > to collect. It doesn't have to catch everything immediately, but
    > code like:
    >
    > #define malloc(x) gc_malloc(x)
    > int main(void)
    > {
    > while(malloc(6)) { /* loop */ }
    > }
    > shouldn't terminate the loop because malloc() eventually returns NULL
    > due to insufficient collection of garbage.
    >
    >


    usually what happens in this case is that the caller thread will block
    until the GC thread can run and finish (running out of memory triggers
    an immediate GC).
    BGB, Jul 21, 2012
    #12
  13. BGB <> writes:

    > On 7/21/2012 4:32 AM, Nobody wrote:
    >> On Fri, 20 Jul 2012 21:14:50 +0000, Jase Schick wrote:
    >>
    >>> Hi Can anyone explain why C has added support for pthreads, while NOT
    >>> adding support for garbage collection?

    >>
    >> GC requires a "walled garden", which C isn't.
    >>
    >> You can't just "add support" for GC, you have to design the language
    >> around the ability to enumerate references.
    >>

    >
    > actually, it can be done, but mostly in the form of conservative GCs,
    > such as Boehm.


    People are, I think, talking at cross purposes. Those who say "it can
    be done" mean it can be done in a way that works for some (possibly
    large) set of programs. Those that say it can't be done mean that there
    are correct C programs which will break when linked, unchanged, against a
    collecting implementation of malloc.

    The discussion would involve much less pointless back-and-forth if
    everyone started out by agreeing that it can work for a lot of programs
    but it can't work for all programs. The debate could then be about the
    kinds of program for which GC might or might not be useful.

    <snip>
    --
    Ben.
    Ben Bacarisse, Jul 21, 2012
    #13
  14. Jase Schick

    jacob navia Guest

    On 21/07/12 15:57, Gordon Burditt wrote:
    > Jacob, what rules would have to be added for application writers
    > to use the GC in lcc-win besides the C standard and "don't invoke
    > undefined behavior"? Hint: the answer is not "none". And that's
    > not intended as a put-down of lcc's GC or GC in general.


    1) Do not hide pointers from the collector, i.e. the collector will NOT
    search the window extra bytes (for instance) for pointers, or the hard
    disk. If you need to XOR pointers to hide them do not use the collector
    either.

    2) In modern machines a collector slows down the program for at most
    a millisecond in normal situations. The pause could be bigger, but not
    much bigger, since the collector tries to spread the GC time across
    allocations.

    That's all.

    jacob
    jacob navia, Jul 21, 2012
    #14
  15. Jase Schick

    jacob navia Guest

    On 21/07/12 15:57, Gordon Burditt wrote:
    > - Collecting non-garbage
    > - Aborting the program with things like segfaults.
    > - A 1,000,000% slowdown
    > - A GC so conservative it never collects anything.


    You can collect non garbage when you hide the pointers
    from the collector using XOR. Then the collector will
    assume that every object with no visible pointer to it is free,
    and havoc will ensue.

    Also, if you always keep pointers to everything, never setting them to
    NULL, and all pointers are global (or the roots are in the main()
    function), the collector will never collect anything.

    For instance a global list will never be collected if its root is a
    global pointer. To collect that you have to set that pointer to NULL.
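The behaviour described above can be demonstrated with a very small mark-and-sweep sketch. It is a precise collector with a single explicit root, not a conservative one like Boehm's, and every name in it (`struct node`, `root`, `node_new`, `collect`) is hypothetical. It only shows that a list hanging off a global root survives collection until that root is set to NULL.

```c
#include <stddef.h>
#include <stdlib.h>

struct node {
    struct node *next;     /* list link traced by the collector */
    struct node *all_next; /* chains every allocated node, for the sweep */
    int marked;
};

static struct node *all_nodes = NULL; /* every node not yet freed */
static struct node *root = NULL;      /* the single global root */

/* Allocate a node, link it in front of next, and register it
   with the collector. Returns NULL on allocation failure. */
static struct node *node_new(struct node *next)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->next = next;
    n->marked = 0;
    n->all_next = all_nodes;
    all_nodes = n;
    return n;
}

/* Mark everything reachable from root, then free every unmarked
   node. Returns the number of nodes freed. */
static size_t collect(void)
{
    size_t freed = 0;
    /* Mark phase; stopping at an already marked node handles cycles. */
    for (struct node *n = root; n != NULL && !n->marked; n = n->next)
        n->marked = 1;
    /* Sweep phase: unlink and free unmarked nodes, reset the rest. */
    struct node **link = &all_nodes;
    while (*link != NULL) {
        struct node *n = *link;
        if (n->marked) {
            n->marked = 0;
            link = &n->all_next;
        } else {
            *link = n->all_next;
            free(n);
            freed++;
        }
    }
    return freed;
}
```

Building a three-node list under `root` and calling `collect()` frees nothing. After `root = NULL;` the next call frees all three nodes.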
    jacob navia, Jul 21, 2012
    #15
  16. Jase Schick

    BGB Guest

    On 7/21/2012 11:38 AM, Ben Bacarisse wrote:
    > BGB <> writes:
    >
    >> On 7/21/2012 4:32 AM, Nobody wrote:
    >>> On Fri, 20 Jul 2012 21:14:50 +0000, Jase Schick wrote:
    >>>
    >>>> Hi Can anyone explain why C has added support for pthreads, while NOT
    >>>> adding support for garbage collection?
    >>>
    >>> GC requires a "walled garden", which C isn't.
    >>>
    >>> You can't just "add support" for GC, you have to design the language
    >>> around the ability to enumerate references.
    >>>

    >>
    >> actually, it can be done, but mostly in the form of conservative GCs,
    >> such as Boehm.

    >
    > People are, I think, talking at cross purposes. Those who say "it can
    > be done" mean it can be done in a way that works for some (possibly
    > large) set of programs. Those that say it can't be done mean that there
    > are correct C programs which will break when linked, unchanged, against a
    > collecting implementation of malloc.
    >


    and, I wasn't thinking about replacing "malloc()" in the first place,
    rather my GC works via its own calls ("gcalloc()", "gctalloc()",
    "gcmalloc()", "gcfree()", ...).

    in this case, linking most C programs unchanged against the GC wouldn't
    change or break anything, but they wouldn't be using the GC in this case
    either...


    > The discussion would involve much less pointless back-and-forth if
    > everyone started out by agreeing that it can work for a lot of programs
    > but it can't work for all programs. The debate could then be about the
    > kinds of program for which GC might or might not be useful.
    >


    or, possibly, what exact form the GC would take in the first place...
    BGB, Jul 21, 2012
    #16
  17. On Saturday, 21 July 2012 17:38:12 UTC+1, Ben Bacarisse wrote:
    > BGB <> writes:
    >
    > People are, I think, talking at cross purposes. Those who say 'it can
    > be done' mean it can be done in a way that works for some (possibly
    > large) set of programs. Those that say it can't be done mean that there
    > are correct C programs which will break when linked, unchanged, against a
    > collecting implementation of malloc.
    >

    In C a pointer is valid if saved to a file in binary form then read back
    in again. It's invalid if read back in to a second instance of the
    program. So the standard needs a tweak to make the first situation also
    invalid.
    I don't see that as a problem, except for very specialised memory paging
    software that lives deep down in the bowels of the operating system.
    That's generally written in a non-standard version of C anyway.
    Malcolm McLean, Jul 21, 2012
    #17
  18. Malcolm McLean <> writes:

    > On Saturday, 21 July 2012 17:38:12 UTC+1, Ben Bacarisse wrote:
    >> BGB <> writes:
    >>
    >> People are, I think, talking at cross purposes. Those who say 'it can
    >> be done' mean it can be done in a way that works for some (possibly
    >> large) set of programs. Those that say it can't be done mean that there
    >> are correct C programs which will break when linked, unchanged, against a
    >> collecting implementation of malloc.
    >>

    > In C a pointer is valid if saved to a file in binary form then read
    > back in again. It's invalid if read back in to an second instance of
    > the program. So the standard needs a tweak to make the first situation
    > also invalid.


    Why? What would be the benefit?

    <snip>
    --
    Ben.
    Ben Bacarisse, Jul 21, 2012
    #18
  19. Jase Schick

    Nobody Guest

    On Sat, 21 Jul 2012 20:13:28 +0100, Ben Bacarisse wrote:

    >> In C a pointer is valid if saved to a file in binary form then read back
    >> in again. It's invalid if read back in to an second instance of the
    >> program. So the standard needs a tweak to make the first situation also
    >> invalid.

    >
    > Why? What would be the benefit?


    What would be the benefit in writing a pointer to a file, or to making
    that invalid?

    Writing a pointer to a file may be useful for virtual memory systems such
    as that used by the Win16 API.

    Forbidding writing pointers to files would eliminate one possible
    mechanism whereby "transparent" GC would fail.
    Nobody, Jul 21, 2012
    #19
  20. Jase Schick

    Tim Rentsch Guest

    "christian.bau" <> writes:

    > On Jul 20, 10:14 pm, Jase Schick <> wrote:
    >> Hi Can anyone explain why C has added support for pthreads, while NOT
    >> adding support for garbage collection? Convenient memory management would
    >> be a much greater enhancement than replicating a perfectly good existing
    >> library it seems to me.

    >
    > Interestingly, Apple just killed garbage collection in their Objective-
    > C compilers and never moved support for GC to the iPhone.
    > So someone there thinks that garbage collection isn't _that_ useful.


    I believe that is consequent to a confluence of (a) wanting the
    environments on the iPhone and MacOS to be the same, (b) needing a
    certain level of real-time response on the iPhone, and (c) adopting a
    different approach to resource management that allows reference
    counting to be used rather than general GC.
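For contrast with tracing collection, here is a bare-bones sketch of manual reference counting in C. It is not Apple's mechanism, only the general technique, and the names (`struct buffer`, `buffer_create`, `buffer_retain`, `buffer_release`) are hypothetical. It is not thread-safe and, like any plain reference count, it cannot reclaim cycles.

```c
#include <stddef.h>
#include <stdlib.h>

struct buffer {
    unsigned refs;       /* number of owners currently holding the buffer */
    size_t size;
    unsigned char *data;
};

/* Create a buffer owned by the caller (refs starts at 1). */
static struct buffer *buffer_create(size_t size)
{
    struct buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = malloc(size ? size : 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->refs = 1;
    b->size = size;
    return b;
}

/* Register one more owner. */
static struct buffer *buffer_retain(struct buffer *b)
{
    if (b != NULL)
        b->refs++;
    return b;
}

/* Drop one owner. Frees the buffer and returns 1 when the last
   owner lets go; otherwise returns 0. */
static int buffer_release(struct buffer *b)
{
    if (b == NULL || --b->refs > 0)
        return 0;
    free(b->data);
    free(b);
    return 1;
}
```

The memory is released at a predictable point, the final `buffer_release`, rather than whenever a collector next runs. That predictability is one reason this approach suits tight real-time budgets.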

    It isn't hard to write a high-performance, general-purpose GC. It's
    much harder to write a high-performance GC that observes severe
    real-time constraints.

    > And sorry to say, but C wasn't designed with garbage collection in
    > mind.


    But that doesn't preclude GC being either feasible or practical in
    a C environment.
    Tim Rentsch, Jul 21, 2012
    #20
