Garbage Collection in C

J

jacob navia

William said:
Because although it will "work" without being more efficient, it may
have very long pauses, and may be so slow as to be useless.
You do not have to rewrite code to get it to work, but you
may have to rewrite code to get it to work usefully. (And worse, you
may be unable to rewrite code to get it to work usefully)

Excuse me but you tell us that your program uses 1GB of RAM.
This is not a normal application, and you may need to
optimize your usage.

Besides, you are having NOW a pause each time you do a free() since
free() will try to consolidate memory. If you use a bad malloc package
that doesn't do that, you have a lot of that GB of RAM in very small
chunks, unusable!

If your program is using 1GB of RAM you must pay attention to RAM usage
more so than in a normal application sorry.
And if you have large data structures that do contain pointers
you are SOL. Being told that "most programs" do not have this problem
is cold comfort.

What do you expect?

Most applications do not use 1GB RAM!
And this does not address the problem of memory allocation in
libraries not under your control (no you cannot simply link them
with another malloc/free as you will then have two mallocs working
on the same address space and same physical memory).

GC_malloc is compatible with malloc/free. You should not MIX both,
i.e. giving a GC_malloced chunk to free().
 
W

William Hughes

jacob said:
It is rewriting code. If you have an application that uses 1GB
of RAM, it is normal that you optimize RAM usage and the GC
if you use it.

I spend a lot of time optimizing RAM usage. Why would
I want to complicate my life to solve a non-problem?
Since you have said that the allocations do not happen
under your control, you shouldn't use the GC for that application
in my opinion. Your third party library is working, and since
you do not have the source code you are stuck with it.

Which was my point. This application is not suitable for
GC. Indeed most of the applications I work on are not suitable
for GC. Am I typical, no. Am I representative of a large enough
group to bring the statement "malloc/free should be replaced by GC"
into doubt. Yes. There is real code out there that is not
suitable for GC. There are many people for whom memory leaks
are a very small or nonexistent problem. GC has benefits for
some (and annecdotal evidence suggest these can be quite
large) but for others it has very little benefit.
It also has risks, and it is not suitable for all code. This does
not sound like a general purpose solution.

- William Hughes
 
W

William Hughes

jacob said:
Excuse me but you tell us that your program uses 1GB of RAM.
This is not a normal application, and you may need to
optimize your usage.

Besides, you are having NOW a pause each time you do a free() since
free() will try to consolidate memory. If you use a bad malloc package
that doesn't do that, you have a lot of that GB of RAM in very small
chunks, unusable!

If your program is using 1GB of RAM you must pay attention to RAM usage
more so than in a normal application sorry.


What do you expect?

Most applications do not use 1GB RAM!


GC_malloc is compatible with malloc/free. You should not MIX both,
i.e. giving a GC_malloced chunk to free().

This I don't understand. The major problem I see is not
the mixing of calls to free. How can GC_malloc operate correctly, if
there
is another malloc operating independently of it? How does GC_malloc
know where it is safe to allocate memory? What happens if the other
malloc (which does not know about GC_malloc) allocates memory
that GC_malloc has already allocated?

- William Hughes
 
C

CBFalconer

William said:
.... snip ...

And if you have large data structures that do contain pointers
you are SOL. Being told that "most programs" do not have this
problem is cold comfort.

And this does not address the problem of memory allocation in
libraries not under your control (no you cannot simply link
them with another malloc/free as you will then have two mallocs
working on the same address space and same physical memory).

My tendency is to have pointers in almost everything I malloc,
apart from strings. My code is usually peppered with
create/destroy pairs for headers to records.
 
C

CBFalconer

jacob said:
William Hughes wrote:
.... snip ...

Excuse me but you tell us that your program uses 1GB of RAM.
This is not a normal application, and you may need to
optimize your usage.

Besides, you are having NOW a pause each time you do a free()
since free() will try to consolidate memory. If you use a bad
malloc package that doesn't do that, you have a lot of that GB
of RAM in very small chunks, unusable!

I pointed out earlier (in this thread, I believe) that delays for
recombination are not intrinsic in the design of a malloc/free
system. It is possible to have the free operation be O(1),
regardless of the number of freed and allocated blocks present.
The proof is available to all in my nmalloc, available at:

<http://cbfalconer.home.att.net/download>

This is a fully standards conforming system, without the gotchas of
a GC. It is even available to all without licence fees, publishing
restrictions, etc. The implementation is not purely standard, but
minimizes the non-standard requirements.

Please don't come up with this excuse again.
 
C

CBFalconer

William said:
jacob navia wrote:
.... snip ...

This I don't understand. The major problem I see is not the mixing
of calls to free. How can GC_malloc operate correctly, if there
is another malloc operating independently of it? How does GC_malloc
know where it is safe to allocate memory? What happens if the other
malloc (which does not know about GC_malloc) allocates memory
that GC_malloc has already allocated?

This simply means that GC_malloc has to call malloc to get memory.
When the GC system decides that something is freeable, it has to
call free to do it. That means that all the foibles of the malloc
system are still present, including the free delays (see my post
elsethread though), except that those delays can occur at
uncontrolled times. In exchange you get a license to be sloppy
about releasing memory, provided you adhere to other unspecified
restrictions.
 
R

Richard Tobin

William Hughes said:
Because although it will "work" without being more efficient, it may
have very long pauses, and may be so slow as to be useless.
[more in the same vein deleted]

Perhaps you're mistaking me for someone who wants to persuade
you to use garbage collection?
And if you have large data structures that do contain pointers
you are SOL. Being told that "most programs" do not have this problem
is cold comfort.

Well if you find someone who tells you that, I suggest you enlighten
them.

-- Richard
 
R

Richard Tobin

This I don't understand. The major problem I see is not
the mixing of calls to free. How can GC_malloc operate correctly, if
there is another malloc operating independently of it?

It probably can't reclaim memory allocated by the other malloc.
How does GC_malloc
know where it is safe to allocate memory?

Perhaps it uses malloc(). Or (more likely) it works in a system
dependent way. Almost all operating systems have other ways to
allocate memory, and to be useful they have to interoperate with
malloc().

-- Richard
 
W

William Hughes

Richard said:
William Hughes said:
Because although it will "work" without being more efficient, it may
have very long pauses, and may be so slow as to be useless.
[more in the same vein deleted]

Perhaps you're mistaking me for someone who wants to persuade
you to use garbage collection?

No, but we are discussing this in a thread where the OP
makes the argument that GC should be the default.
The argument is not that GC can't be useful in some cases
(or indeed very useful in some cases), The argument is that GC
is not useful enough in enough cases.

Part of the OP's case is that the time needed for GC is
negligible, both in terms of the pauses and in terms
of overall run time. My point is that there is an important
class of programs for which this is not true.
(I argue that there are cases where the "negligible pauses"
are not negligibe by any reasonable use of the term. Others
have pointed out that the "negligible pauses" may not be
negligible in some cases.)

The general point, that GC can consume significant resources,
is important (note that CPU time is only one resource that GC
uses).

- William Hughes
 
J

jacob navia

CBFalconer said:
... snip ...



This simply means that GC_malloc has to call malloc to get memory.
When the GC system decides that something is freeable, it has to
call free to do it. That means that all the foibles of the malloc
system are still present, including the free delays (see my post
elsethread though), except that those delays can occur at
uncontrolled times. In exchange you get a license to be sloppy
about releasing memory, provided you adhere to other unspecified
restrictions.

The GC calls directly the OS. It knows where are the areas
of virtual memory that it has reserved memory, so it knows if a
word points into its memory areas.
 
W

William Hughes

jacob said:
The GC calls directly the OS. It knows where are the areas
of virtual memory that it has reserved memory, so it knows if a
word points into its memory areas.

This will probably work if the other implementation of malloc
does not assume that it is the only thing modifying the heap
and act accordingly. I don't know what the present
situation is (the C standard cannot address this problem as
the only way C defines to access the heap is malloc), but
I would guess that assuming that a malloc implementation is
safe to use with other independent memory allocators is
similar to the assumption that an unknown function is
multi-thread safe. I am very short on warm fuzzies.

-William Hughes
 
J

jacob navia

William said:
This will probably work if the other implementation of malloc
does not assume that it is the only thing modifying the heap
and act accordingly. I don't know what the present
situation is (the C standard cannot address this problem as
the only way C defines to access the heap is malloc), but
I would guess that assuming that a malloc implementation is
safe to use with other independent memory allocators is
similar to the assumption that an unknown function is
multi-thread safe. I am very short on warm fuzzies.

Look, if you use a brain dead malloc it is not somebody else's
fault. In windows, for instance, you can allocate using malloc, but
also with GlobalAlloc, VirtualAlloc, and a dozen of other functions.
Under unix you have sbrk() and many others.

Any malloc that assumes that all heap is allocated by malloc
is completely brain dead. Besides, WHY would such an assumption
need to be done?

As we advance in the discussion, your arguments become more and more
contrieved.
 
P

Paul Connolly

William Hughes said:
I do not understand. I thought that GC had to scan the entire
amount of memory used by the program. I regularly have programs
that use over a gibabyte of memory. How is this
scanned without a noticable pause?

Most modern GC uses "generational scavenging" (as opposed to the old full
stop-and-copy that some older pure functional programming languages had) - a
generation is a partition of the whole allocated memory - with the more
recently allocated memmory being in the first generation to be compacted -
generational scavenging is based on an assumption that the most recently
allocated memory is the most likely to require collection - this is often a
good bet and mitigates the problem of "pauses", but does not eliminate the
problem - well-written malloc/free code will be superior in terms of speed
and amount memory allocated.
 
P

Paul Connolly

jacob navia said:
Not really. Garbage is automatically collected. You can fine-tune this,
specifying the threshold or call a gc when you think it is a good moment
to do that.

Explicitly calling GC more often isn't free (excuse the pun) - I think it
will slow the execution of my program (I tried explicitly calling GC in an
ASP .NET application - it got slower - the manual warned me it might, but I
wouldn't be told)
No, it doesn't run slower. Allocation is a bit slower since each
allocation does a bit of a GC (incremental GC)

It doesn't run slower but allocation is slower - so does it make up the
ground on freeing? - I find that difficult to believe - the freeing takes
longer under automatic GC - one possibility is that locality of reference is
improved by compaction of the heap and processors with cache might benefit
from that - but is that enough to balance the other work that automatic GC
has to do? - my experience is that languages with automatic GC (ML,
Haskell, Prolog, C#) run slower (admittedly ML, Haskell, Prolog might also
suffer from lack of mutable data, Haskell from laziness, and Prolog from
backtracking)
There are pauses, but in modern workstations this is barely noticeable.


Write in assembly. That's a *real* language :)

Actually I wrote a _little_ bit of assembler this morning, which turned into
this afternoon, as it often does with assembler :)
 
P

Paul Connolly

Paul Connolly said:
Most modern GC uses "generational scavenging" (as opposed to the old full
stop-and-copy that some older pure functional programming languages had) -
a

I thought about this again - generational scavenging works best with pure
functional languages (where every structure refers to earlier created
structures) - assignment means that older structures can be made to refer to
newer ones - making it much more complicated and inefficient to do GC in a
language with destructive assignment - so that makes GC in C even more
inefficient than I first imagined.
 
F

Friedrich Dominicus

Paul Connolly said:
I thought about this again - generational scavenging works best with pure
functional languages (where every structure refers to earlier created
structures) - assignment means that older structures can be made to refer to
newer ones - making it much more complicated and inefficient to do GC in a
language with destructive assignment - so that makes GC in C even more
inefficient than I first imagined.
Strange that Smalltalk, Common Lisp etc have no troubles with GC
than...

Friedrich
 
C

Chris Thomasson

SM Ryan said:
(e-mail address removed)-cnrc.gc.ca (Walter Roberson) wrote:
# In article <[email protected]>,

[...]

Boehm-Demers garbage collection doesn't work for all programs on
all systems. The systems it has been ported to do work (ie they
have been verified with system libraries and compilers) if you
write your programs in the restricted language.

[...]

FWIW, here is an experimental protoyope of an atomically thread-safe
reference counted pointer:

http://appcore.home.comcast.net/vzoom/refcount/

It has strict C API and the ABI adheres to the C calling convention... You
can use it to do garbage collection like activities:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/f28308f5e7188bb4


Here is some more information:

http://softwareforums.intel.com/ISN/Community/en-US/forums/thread/30225804.aspx


Enjoy!


;^)
 
R

Richard Tobin

Strange that Smalltalk, Common Lisp etc have no troubles with GC
than...

No, they just have more complicated garbage collectors than would be required
if all pointers pointed backwards.

Pure functional languages don't necessarily have that property - such
a language could perfectly well have a syntax for circular structures.

-- Richard
 
S

SM Ryan

# FWIW, here is an experimental protoyope of an atomically thread-safe
# reference counted pointer:

How well does it collect cyclic graphs?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,770
Messages
2,569,583
Members
45,073
Latest member
DarinCeden

Latest Threads

Top