advice/thoughts on garbage collection?

mijoryx · Feb 13, 2009

Hello,

I'd appreciate any advice fellow bit surgeons may
have pertinent to a c language implementation
of a postscript interpreter/X11 graphics engine.

I've got the basic object model/scanner/executive
interface working quite well (if I do say so myself

,
and have begun to consider how best to handle
the virtual memory and garbage collection aspects.
I confess I haven't scoured the ghostscript code
for an answer to this for several reasons (it's so
ungodly huge; and my design is geared toward
simplicity rather than speed per se, so it seems
unlikely to reap much benefit for the days and
weeks it would require to digest), but I have
looked at several other postscript sources
with varying results: crispin goswell's interpreter
doesn't appear to do it at all, rbuss uses the GC
library with some painful interface (a brief
browsing reinforced my fear of ghostscript),
and the amiga programs are all in 68000 assembler.
I don't even have a reference for that.

So it appears my options boil down to:
1. Use GC.
pro: It's standard, it's good.
con: It's almost as big as my entire program!
I can manage the internal structures just fine,
using realloc for growing structures and freeing
display lists when they're replaced with new
ones. Most of these persist through the life
of the process anyway. So is it easy to
attach GC just to the object space and just
replace malloc in the object allocators?
Can I tell GC my object structure so it
knows exactly where my pointers may
be and under what specific circumstances
they may be pointers (from the type field)?
Will any of this help me with Virtual Memory
(scoped memory changes, allowing arbitrary
"backing out" of data changes)?

2. Do it all myself.
With all the specificity I'd have to feed to
GC, would it be simpler to write my own
tracing function? I could even add a quick
bit-field reference count check to the 'pop'
operator and free most garbage right away.
Would I front-end malloc and keep all the
pointers in a table?

I've got the book coming, but I wanted to
give my subconscious a jump-start on the
issues involved.

tia.

luser-ex-troll

Bartc · Feb 15, 2009

Hello,

I'd appreciate any advice fellow bit surgeons may
have pertinent to a c language implementation
of a postscript interpreter/X11 graphics engine.

I've got the basic object model/scanner/executive
interface working quite well (if I do say so myself ,
and have begun to consider how best to handle
the virtual memory and garbage collection aspects.
I confess I haven't scoured the ghostscript code
for an answer to this for several reasons (it's so
ungodly huge; and my design is geared toward
simplicity rather than speed per se, so it seems
unlikely to reap much benefit for the days and
weeks it would require to digest), but I have
looked at several other postscript sources
with varying results: crispin goswell's interpreter
doesn't appear to do it at all, rbuss uses the GC
library with some painful interface (a brief
browsing reinforced my fear of ghostscript),
and the amiga programs are all in 68000 assembler.
I don't even have a reference for that.

So it appears my options boil down to:
1. Use GC.

2. Do it all myself.

Perhaps look at whether a language as straightforward as Postscript (if I
recall) needs garbage collection all.

If the language variables don't share data, and it doesn't use pointers,
then you can just allocate/deallocate/reallocate as needed, as you seem to
have discovered already.

mijoryx · Feb 15, 2009

[snip!]

So it appears my options boil down to:
1. Use GC.
2. Do it all myself.

Click to expand...

Perhaps look at whether a language as straightforward as Postscript (if I
recall) needs garbage collection all.

If the language variables don't share data, and it doesn't use pointers,
then you can just allocate/deallocate/reallocate as needed, as you seem to
have discovered already.

Ah. But certain of the objects DO have (internal)
pointers.
The object structure boils down to this:

typedef struct s_object Object;
struct s_object {
enum {booleantype, realtype, arraytype, ...} type;
unsigned flags;
unsigned length;
union { int b; float f; Object *a; ...};
};

And there is a troublesome operator, 'getinterval',
which returns a subarray. From the manual:

array index count getinterval subarray
creates a new array ... whose value consists of some
subsequence of the original array .... The subsequence
consists of count elements starting at the specified
index in the original object. The elements in the
subsequence are shared between the original and new
objects.

so if i do:
[0 1] dup 0 1 getinterval %[0 1] [0]
0 2 put ==

should print [2 1].
the subarray reuses the appropriate portion of the
original array so they are sharing memory through
pointers. The difficulty here is when the subarray
is dropped from the stack (by 'put' in the example),
its pointer is not necessarily going to be present
in any cache of malloc'd memory (since it could be
offset to anywhere in the middle).

Actually I may have just answered my own question.
If I add an integer offset to the structure, then
the Object *a member can always hold the pointer
returned by malloc.

All of this get much more complicated with the
Display Postscript and NeWS extensions for
multithreading.

My work is scratched out for me, but a great deal
of engraving and painting remain.

luser-ex-troll

mijoryx · Feb 15, 2009

Garbage collection has to be built into the compiler. Barring lots of evil
hacks, there's no way of providing it as user code.
Jacob Navia has a C compiler that offers garbage collection as an extension.
However I don't think he has a version for Unix.

I appreciate the response but it simply cannot be true.
The Boehm GC package is an example. But conceptually,
anything in the compiler must have been user code first
(although it need not have been present in the most
recent source: ie. the cc/login hack).

Or perhaps what you say is true, and all of my code
consists of "evil hacks". I can live with that.

luser-ex-trolley

Bartc · Feb 15, 2009

[snip!]

So it appears my options boil down to:
1. Use GC.
2. Do it all myself.

Click to expand...

Perhaps look at whether a language as straightforward as Postscript
(if I recall) needs garbage collection all.

If the language variables don't share data, and it doesn't use
pointers, then you can just allocate/deallocate/reallocate as
needed, as you seem to have discovered already.

Click to expand...

The object structure boils down to this:

typedef struct s_object Object;
struct s_object {
enum {booleantype, realtype, arraytype, ...} type;
unsigned flags;
unsigned length;
union { int b; float f; Object *a; ...};
};

That shouldn't be a problem, but depends on, when you're copying one object
to another, whether the memory pointed to by *a is duplicated.

And there is a troublesome operator, 'getinterval',
which returns a subarray ... The elements in the
subsequence are shared between the original and new
objects.

so if i do:
[0 1] dup 0 1 getinterval %[0 1] [0]
0 2 put ==

should print [2 1].
the subarray reuses the appropriate portion of the
original array so they are sharing memory through
pointers. The difficulty here is when the subarray
is dropped from the stack (by 'put' in the example),
its pointer is not necessarily going to be present
in any cache of malloc'd memory (since it could be
offset to anywhere in the middle).

Routines which decide whether a pointer points to an allocated block I think
can already deal with pointers into the middle of the block.

Actually I may have just answered my own question.
If I add an integer offset to the structure, then
the Object *a member can always hold the pointer
returned by malloc.

All of this get much more complicated with the
Display Postscript and NeWS extensions for
multithreading.

My work is scratched out for me, but a great deal
of engraving and painting remain.

My strategy for implementing a simple stack language was:

* Variables always have their own, unshared data (ie. complex variables such
as arrays)

* When variable values are pushed onto the stack, a copy of the descriptor
(eg. your Object struct) is pushed, and a bit set saying this is a Copy.
This works provided the original variable is not going to disappear during
this expression.

* Values popped from the stack are freed, if the Copy bit is not set. Your
getinterval op would simply have created a slice (an Object set up for an
array) with Copy set; popping this requires no deallocation.

* Values popped into a variable, are duplicated if the Copy bit is set

* And functions which return a value should duplicate that value if
otherwise the Copy bit is set (for reasons which may not apply to
Postscript)

This way I've narrowly managed to avoid incorporating a proper GC for quite
a few years (and this language also has troublesome global variables and
pointers), but at a cost of deep copies for assignments.

Tim Rentsch · Feb 15, 2009

Malcolm McLean said:
Garbage collection has to be built into the compiler. Barring lots of evil
hacks, there's no way of providing it as user code.
Jacob Navia has a C compiler that offers garbage collection as an extension.
However I don't think he has a version for Unix.

I don't know what sorts of code might be considered "evil hacks" by
this poster. But if you want something that's reasonably likely to
provide the capability you want, try the Hans Boehm collector:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/

One sentence summary: just use malloc(), don't bother using free();
compiler modifications not required.

Quoting from the page, under "Platforms":

The collector is not completely portable, but the distribution
includes ports to most standard PC and UNIX/Linux platforms. The
collector should work on Linux, *BSD, recent Windows versions,
MacOS X, HP/UX, Solaris, Tru64, Irix and a few other operating
systems. Some ports are more polished than others. There are
instructions for porting the collector to a new platform.

mijoryx · Feb 15, 2009

(e-mail address removed) wrote: [snip!]

The object structure boils down to this:

Click to expand...

typedef struct s_object Object;
struct s_object {
enum {booleantype, realtype, arraytype, ...} type;
unsigned flags;
unsigned length;
union { int b; float f; Object *a; ...};
};

Click to expand...

That shouldn't be a problem, but depends on, when you're copying one object
to another, whether the memory pointed to by *a is duplicated.

it is for the 'copy' operator (which demands an
appropriately sized container. but the 'dup' operator
simply copies the entire structure (pointer and all).
So it appears that 'dup' will need to do a little
extra work for composite objects (objects whose payload
is a pointer to memory containing other objects).
It seems I can do this with a bit in the flags.
Thanks. This may help enormously.

so if i do:
[0 1] dup 0 1 getinterval %[0 1] [0]
0 2 put ==

Click to expand...

should print [2 1].
the subarray reuses the appropriate portion of the
original array so they are sharing memory through
pointers. The difficulty here is when the subarray
is dropped from the stack (by 'put' in the example),
its pointer is not necessarily going to be present
in any cache of malloc'd memory (since it could be
offset to anywhere in the middle).

Click to expand...

Routines which decide whether a pointer points to an allocated block I think
can already deal with pointers into the middle of the block.

that makes sense. check whether pointer is within
the allocated space. p > space && p < (space + size)

My strategy for implementing a simple stack language was:

* Variables always have their own, unshared data (ie. complex variables such
as arrays)

* When variable values are pushed onto the stack, a copy of the descriptor
(eg. your Object struct) is pushed, and a bit set saying this is a Copy.
This works provided the original variable is not going to disappear during
this expression.

* Values popped from the stack are freed, if the Copy bit is not set.

So one object /owns/ the memory, (flags & COPY) == 0.
And pop adds: ((o.flags&COPY)?:free(o.a))
(it's a macro).

Your

getinterval op would simply have created a slice (an Object set up for an
array) with Copy set; popping this requires no deallocation.

comprendo.

* Values popped into a variable, are duplicated if the Copy bit is set

or in my case, defined in a dictionary.

* And functions which return a value should duplicate that value if
otherwise the Copy bit is set (for reasons which may not apply to
Postscript)

perhaps not, but excellent fuel for thought, regardless.

This way I've narrowly managed to avoid incorporating a proper GC for quite
a few years (and this language also has troublesome global variables and
pointers), but at a cost of deep copies for assignments.

I still need to think through how this affects
save/restore (restores all composite objects to previous
values).

thanks again.

luser-ex-troll

Tim Rentsch · Feb 16, 2009

Malcolm McLean said:
Form the site:

DATASTART
The beginning of the main data segment. The collector will trace all memory
between DATASTART and DATAEND for root pointers. On some platforms,this can
be defined to a constant address, though experience has shown that to be
risky. Ideally the linker will define a symbol (e.g. _data whose address is
the beginning of the data segment. Sometimes the value can be computed using
the GC_SysVGetDataStart function. Not used if either the next macro is
defined, or if dynamic loading is supported, and the dynamic loading support
defines a function GC_register_main_static_data() which returns false.
SEARCH_FOR_DATA_START
If this is defined DATASTART will be defined to a dynamically computed value
which is obtained by starting with the address of _end and walking backwards
until non-addressable memory is found. This often works on Posix-like
platforms. It makes it harder to debug client programs, since startup
involves generating and catching a segmentation fault, which tends to
confuse users.

I am not saying the garbage collector is a bad thing, but these are clearly
evil hacks.

In regular user code, I'd probably agree with you.

For this kind of implementation add-on, it kind of comes with the
territory. Pragmatically, if it provides the right functionality,
and if the quality of code and implementation is high enough, the
benefits of using it are pretty likely to outweigh the costs of
making sure it works correctly on whatever platform(s) the code is
intended to address.

Don't get me wrong, I'd prefer a GC implementation more integrated
with the compiler(s) the project is using. Absent that, certainly
its reasonable -- for some projects, and especially in the context
of this kind of add-on -- to judge the techniques mentioned as
raising only tactical concerns, not strategic objections.

Guest · Feb 16, 2009

Hello,

I'd appreciate any advice fellow bit surgeons may
have pertinent to a c language implementation
of a postscript interpreter/X11 graphics engine.

I've got the basic object model/scanner/executive
interface working quite well (if I do say so myself ,
and have begun to consider how best to handle
the virtual memory and garbage collection aspects.

perhaps it's a bit late in the day to be considering
memory management *after* you've implemented your code.
Garbage collection isn't the only way to manage memory.

<sniP>

mijoryx · Feb 16, 2009

perhaps it's a bit late in the day to be considering
memory management *after* you've implemented your code.
Garbage collection isn't the only way to manage memory.

<sniP>

Agreed. My claim may have both over- and under-promoted
the work I've done so far. I began a few days after
January 1, and have already achieved a monstrous 2557
lines. I've been using the "tracer-bullets" strategy
from /The Pragmatic Programmer/, so it really does fit
to add these sorts of larger features laterally. And I
didn't see any need to worry about the issue until the
concept was proved.
Now that it's drawing filled polygons from a file,
run by a procedure, I feel it's safe to flesh out more
of the features concerning whose implementation I was
less conversant.

Is there something else that would accomplish similar
ends in a different manner (besides option 0: leave it
leaking memory).

luser-ex-troll

Guest · Feb 17, 2009

Agreed. My claim may have both over- and under-promoted
the work I've done so far. I began a few days after
January 1, and have already achieved a monstrous 2557
lines. I've been using the "tracer-bullets" strategy
from /The Pragmatic Programmer/,

I've considered buying this but never have. Is it any good?

so it really does fit
to add these sorts of larger features laterally. And I
didn't see any need to worry about the issue until the
concept was proved.

ok, I perhaps sounded harsher than I meant to.
But my experiences of trying to retrofit sane
memory management to leaky implementaions were
not good.

Now that it's drawing filled polygons from a file,
run by a procedure, I feel it's safe to flesh out more
of the features concerning whose implementation I was
less conversant.

Is there something else that would accomplish similar
ends in a different manner (besides option 0: leave it
leaking memory).

eeek!

I was thinking of explicitly free()ing when the memory
went out of use. *Some* part of your program ought to
know when you've finished with something. Even take a look at
C++s smart pointers and RAII.

mijoryx · Feb 17, 2009

On 16 Feb, 15:41, (e-mail address removed) wrote:
[snip!]

Agreed. My claim may have both over- and under-promoted
the work I've done so far. I began a few days after
January 1, and have already achieved a monstrous 2557
lines. I've been using the "tracer-bullets" strategy
from /The Pragmatic Programmer/,

Click to expand...

I've considered buying this but never have. Is it any good?

I liked it so much I never took it back to the library.
(But I will soon. Sorry, Ben Franklin!)

ok, I perhaps sounded harsher than I meant to.
But my experiences of trying to retrofit sane
memory management to leaky implementaions were
not good.

Yes. I agree. But I consider this an exception since
it's still "early days" for a project of this magnitude
(it took Peter Deutsch 5 years to release the first
version of ghostscript).
I still haven't tackled fonts, aliasing, images, ....
Of course I hope this software will be brilliant and
revolutionary, but I'll be satisfied if I learn enough
to try again (or something different but representing
a similar challange) and if it helps distract me from
porn.

eeek!

I was thinking of explicitly free()ing when the memory
went out of use. *Some* part of your program ought to
know when you've finished with something. Even take a look at
C++s smart pointers and RAII.

There are specific points where objects are discarded
(storing a new value in a dictionary, popping from
the stack, every operator that takes arguments from
the stack). But at these places, the program logic is
unaware of whether the object is unique (and ready to
be free'd). The postscript language appears to demand
garbage collection of some sort.

I'm afraid of C++. It has disappointed my expections
on numerous occasions. But I'll look into RAII.

luser-ex-troll

Flash Gordon · Feb 18, 2009

On 16 Feb, 15:41, (e-mail address removed) wrote:
[snip!]

I've got the basic object model/scanner/executive
interface working quite well (if I do say so myself ,
and have begun to consider how best to handle
the virtual memory and garbage collection aspects.
perhaps it's a bit late in the day to be considering
memory management *after* you've implemented your code.
Garbage collection isn't the only way to manage memory.
<sniP>
Agreed. My claim may have both over- and under-promoted
the work I've done so far. I began a few days after
January 1, and have already achieved a monstrous 2557
lines. I've been using the "tracer-bullets" strategy
from /The Pragmatic Programmer/,

Click to expand...

I've considered buying this but never have. Is it any good?

Click to expand...

I liked it so much I never took it back to the library.
(But I will soon. Sorry, Ben Franklin!)

ok, I perhaps sounded harsher than I meant to.
But my experiences of trying to retrofit sane
memory management to leaky implementaions were
not good.

Click to expand...

Yes. I agree. But I consider this an exception since
it's still "early days" for a project of this magnitude
(it took Peter Deutsch 5 years to release the first
version of ghostscript).
I still haven't tackled fonts, aliasing, images, ....
Of course I hope this software will be brilliant and
revolutionary, but I'll be satisfied if I learn enough
to try again (or something different but representing
a similar challange) and if it helps distract me from
porn.

eeek!

I was thinking of explicitly free()ing when the memory
went out of use. *Some* part of your program ought to
know when you've finished with something. Even take a look at
C++s smart pointers and RAII.

Click to expand...

There are specific points where objects are discarded
(storing a new value in a dictionary, popping from
the stack, every operator that takes arguments from
the stack). But at these places, the program logic is
unaware of whether the object is unique (and ready to
be free'd). The postscript language appears to demand
garbage collection of some sort.

I'e not looked at the details of postscript so this might be completely
the wrong approach, but you could consider reference counting since I'm
assuming that you need to keep objects alive until the last reference to
it goes. You write functions to create objects, duplicate them, add and
destroy references etc. Then the routine for destroying the object (or
deleting a reference) decrements the reference count and only actually
frees it if that was the last reference.

I'm afraid of C++. It has disappointed my expections
on numerous occasions. But I'll look into RAII.

From the sounds of it you will learn a lot about managing memory by
implementing this in C, so if this is a learning project then it is
probably worth continuing. Then, once you have solved it in C, see if it
can be solved more elegantly by using C++ and RAII or some other tool.
However, remember that very little is free, and there are certainly
costs to garbage collection, but sometimes those costs are worth paying.

dj3vande · Feb 18, 2009

Flash Gordon said:
I'e not looked at the details of postscript so this might be completely
the wrong approach, but you could consider reference counting since I'm
assuming that you need to keep objects alive until the last reference to
it goes. You write functions to create objects, duplicate them, add and
destroy references etc. Then the routine for destroying the object (or
deleting a reference) decrements the reference count and only actually
frees it if that was the last reference.

Reference counting has a lot of advantages (it's simple to implement,
simple to understand, and guaranteed to deallocate resources as soon as
it can be known that they're no longer needed), but it has one
nontrivial disadvantage (it needs space for a counter for every object
you create) and one often-fatal flaw:
*** If you can *EVER* create a reference cycle, a refcounting ***
*** GC will *NEVER* collect it. ***
So unless your type system is restricted enough (and strong enough) to
guarantee that cycles are impossible, or your object model lets you
forbid any mutation that might change what existing objects refer to
(if you can only create objects with references to existing objects,
and you can't modify objects once they're created, that forces your
reference graph to be acyclic), refcounting will be inadequate.
(It's possible to work around this problem, but in doing so you end up
losing a lot of the advantages refcounting has over other types of
garbage collection; in particular, the client code typically needs to
be aware that it's using a refcounted GC and know how to play nicely
with it.)
Like Flash, I don't know enough about postscript to comment on whether
refcounting is appropriate for your (OP's) purposes.

From the sounds of it you will learn a lot about managing memory by
implementing this in C, so if this is a learning project then it is
probably worth continuing. Then, once you have solved it in C, see if it
can be solved more elegantly by using C++ and RAII or some other tool.
However, remember that very little is free, and there are certainly
costs to garbage collection, but sometimes those costs are worth paying.

Experience in the world of software development at large suggests that
you meant to say "...often those costs are worth paying.".
Of course, if you narrow your scope to the sorts of things C is a good
tool for, the proportion goes way down. (I think the largest class of
things that C-plus-GC is a good match for is implementations of
higher-level languages that provide GC.)

dave

Flash Gordon · Feb 18, 2009

Reference counting has a lot of advantages (it's simple to implement,
simple to understand, and guaranteed to deallocate resources as soon as
it can be known that they're no longer needed), but it has one
nontrivial disadvantage (it needs space for a counter for every object
you create) and one often-fatal flaw:
*** If you can *EVER* create a reference cycle, a refcounting ***
*** GC will *NEVER* collect it. ***
So unless your type system is restricted enough (and strong enough) to
guarantee that cycles are impossible, or your object model lets you
forbid any mutation that might change what existing objects refer to
(if you can only create objects with references to existing objects,
and you can't modify objects once they're created, that forces your
reference graph to be acyclic), refcounting will be inadequate.
(It's possible to work around this problem, but in doing so you end up
losing a lot of the advantages refcounting has over other types of
garbage collection; in particular, the client code typically needs to
be aware that it's using a refcounted GC and know how to play nicely
with it.)
Like Flash, I don't know enough about postscript to comment on whether
refcounting is appropriate for your (OP's) purposes.

Good points. I *suspect* that in implementing a postscript interpreter
would fit within the limits of reference counting, but you are correct
that it might not.

Experience in the world of software development at large suggests that
you meant to say "...often those costs are worth paying.".
Of course, if you narrow your scope to the sorts of things C is a good
tool for, the proportion goes way down. (I think the largest class of
things that C-plus-GC is a good match for is implementations of
higher-level languages that provide GC.)

Well, most of the programming I've done in over 20 years has not needed
dynamic memory allocation. The instances where it was needed were

1) To get around a limit of 64K of static data, so GC was not needed
2) The software on one processor on a 20 processor system where I had to
write the SW in assembler, and the requirements were such that there
were easier ways than using GC (I could just drop *all* allocations for
the start of each time round the main loop)
5) The SW I've worked on over the past 8 years where GC might be useful.

So my personal experience is that you don't need dynamic memory
allocation most of the time, but that experience is biased by the types
of work I spent 15 years doing.

My opinion is that sometimes the cost of GC is worth it, and I'm not
going to take a guess at percentages either inside or outside C programming.

Guest · Feb 18, 2009

Well, most of the programming I've done in over 20 years has not needed
dynamic memory allocation. The instances where it was needed were

1) To get around a limit of 64K of static data, so GC was not needed
2) The software on one processor on a 20 processor system where I had to
write the SW in assembler, and the requirements were such that there
were easier ways than using GC (I could just drop *all* allocations for
the start of each time round the main loop)
5)
5?

The SW I've worked on over the past 8 years where GC might be useful.

So my personal experience is that you don't need dynamic memory
allocation most of the time, but that experience is biased by the types
of work I spent 15 years doing.

what about the cases where you don't know in advance how many
thingies
you'll need. For instance inter-processor messages where the messages
vary in size and you don't know how many you'll need. Its obvious
when you've finished with a message so you can explicitly return
and don't need Garbage Collection.

oh, wait what was the memory leak I spent 3 weeks hunting down...
That involved instrumenting every malloc and free (ok, it was new and
delete) in the system and seeing what type of thingy just kept
on increasing. Timers and inter-processor messages.

mijoryx · Feb 19, 2009

(e-mail address removed) wrote: [snip!]

There are specific points where objects are discarded
(storing a new value in a dictionary, popping from
the stack, every operator that takes arguments from
the stack). But at these places, the program logic is
unaware of whether the object is unique (and ready to
be free'd). The postscript language appears to demand
garbage collection of some sort.

Click to expand...

I'e not looked at the details of postscript so this might be completely
the wrong approach, but you could consider reference counting since I'm
assuming that you need to keep objects alive until the last reference to
it goes. You write functions to create objects, duplicate them, add and
destroy references etc. Then the routine for destroying the object (or
deleting a reference) decrements the reference count and only actually
frees it if that was the last reference.

Yes this appears to be exactly the approach taken by NeWS.
I found a pretty good description in Davison, et al.
/Distributed Window Systems/.
Apparently there's an extra mechanism called "soft" references
that helps will possibly cyclic structures, but I don't
really understand that part.

From the sounds of it you will learn a lot about managing memory by
implementing this in C, so if this is a learning project then it is
probably worth continuing. Then, once you have solved it in C, see if it
can be solved more elegantly by using C++ and RAII or some other tool.
However, remember that very little is free, and there are certainly
costs to garbage collection, but sometimes those costs are worth paying.

Yes, it is a learning project. Of course, I hope it'll change
the world and earn me millions. But mostly, this is the
software I want to use and nobody else is gonna do it for me.
So my only costs are time and cranial capacity. But dividends
in either tend to cross-benefit.
In short, excellent advice. Thank you.

luXer-ex-troXX

mijoryx · Feb 19, 2009

5?

Don't let him hold the holy handgrenade!

hee hee

Flash Gordon · Feb 19, 2009

5?

Do you have something against 5?

what about the cases where you don't know in advance how many
thingies
you'll need. For instance inter-processor messages where the messages
vary in size and you don't know how many you'll need. Its obvious
when you've finished with a message so you can explicitly return
and don't need Garbage Collection.

oh, wait what was the memory leak I spent 3 weeks hunting down...
That involved instrumenting every malloc and free (ok, it was new and
delete) in the system and seeing what type of thingy just kept
on increasing.

I acknowledged above my experience is biased. I also clearly
acknowledged that there are occasions where GC is worth it. I even
stated that I've worked for 8 years on SW where it could be useful!

Timers and inter-processor messages.

Don't always deal with unknown amounts of stuff. Sometimes they do,
sometimes they don't. I suspect the 20 processor system I worked on
might have had a little communication between processes (4 of the
processors were just dealing with marshalling, synchronisation, data
transfer etc whilst the other 16 did the work), and I've dealt with
timers before as well.

Tim Rentsch · Feb 20, 2009

Flash Gordon said:
Good points. I *suspect* that in implementing a postscript interpreter
would fit within the limits of reference counting, but you are correct
that it might not.

Well, most of the programming I've done in over 20 years has not needed
dynamic memory allocation. The instances where it was needed were

1) To get around a limit of 64K of static data, so GC was not needed
2) The software on one processor on a 20 processor system where I had to
write the SW in assembler, and the requirements were such that there
were easier ways than using GC (I could just drop *all* allocations for
the start of each time round the main loop)
5) The SW I've worked on over the past 8 years where GC might be useful.

So my personal experience is that you don't need dynamic memory
allocation most of the time, but that experience is biased by the types
of work I spent 15 years doing.

My opinion is that sometimes the cost of GC is worth it, and I'm not
going to take a guess at percentages either inside or outside C programming.

True story follows.

A development team working on a reasonably large (> 600 KLOC)
application made a decision to drop in a conservative collector
to do their memory management. Up until that point they had
managed memory in the usual malloc()/free() style, and everything
was working okay, but maintaining that part of their code that
did memory management was just taking too many of their cycles.
So they decided to "bite the bullet" and switch over.

The result was great. Not only did they no longer have to
worry about when to free() memory, but using the conservative
collector the application also ran faster.

No conclusions offered, just a data point.

Advice on learning to code?	3	Mar 10, 2022
Garbage collection problems	89	Nov 18, 2007
[OT] Looking for garbage collection library	10	Oct 16, 2010
Some advice on getting into coding?	5	Jun 14, 2021
Garbage collection	97	Apr 25, 2008
A new idea for garbage collection	18	Oct 15, 2009
Some thoughts on garbage collection	5	Jan 23, 2006
Unraveling Pointers and Arrays in C++: Seeking Expert Advice.	1	Jan 26, 2024

advice/thoughts on garbage collection?

mijoryx

Bartc

mijoryx

mijoryx

Bartc

Tim Rentsch

mijoryx

Tim Rentsch

Guest

mijoryx

Guest

mijoryx

Flash Gordon

dj3vande

Flash Gordon

Guest

mijoryx

mijoryx

Flash Gordon

Tim Rentsch

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads