Garbage collection


cr88192

Ian Collins said:
I caught up a long time ago, at least in the Unix world.

yeah...


my compiler is not caught up yet though...
partly, it is that there are a few hidden snares in x86-64 that make it
more than a little difficult to target with a compiler (or, at least,
with the calling convention used on Linux...).

IMO, Win64 has a better calling convention (it is much simpler to
target...).

more so, at least, I would want arguments for vararg functions (or, at
least, the varargs) to be passed on the stack...

sadly, one can't change the calling convention without making a big ugly
mess wrt linking to externally compiled code...

 

cr88192

Malcolm McLean said:
A non-technical person, or even a technical person, isn't really
interested in how fast his processor is or how much memory it has
installed. He wants to know "what can I do with this computer?"
At the moment the answer is that five year old hardware will watch video
clips on the web perfectly happily, will do wordprocessing with ease, will
play a nice 3d game. So there's no reason to upgrade except maybe for the
game, and not everyone plays 3d video games.

The really interesting question is what new things we will find to do with
computers.

maybe merging 3D gaming and porn...

some Sci-Fi style immersive interface with tactile feedback (for example,
tactile stimulation combined with force feedback), with high quality models
derived from actual data...

or, for the lazy, a partial option consisting of a headset and hands-free
stimulator...

(or, maybe an MMOG where people can get somehow pulled into the game, and if
they die in the game they die in reality...).


ok, yes, all of this is a bunch of generic Sci-Fi cliches, but yeah...

 

Hallvard B Furuseth

Friedrich said:
I think this one is a really simple example. And I would be very
surprised if any GC would fail on this simple thing.

Sure it could. p might have representation 0x6f726c64, which is the
same as the ASCII representation of "orld", from the program's output
text "Hello, world!". Very often that's OK. Sometimes it's not.

Consider a host with N-bit pointers and a program using 2**(N-2) bytes
of memory: Every random N-bit value, considered as a potential pointer,
has close to 1/4 chance of pointing into a live object. And if pointers
have 1-byte alignment on this host then most bytes of memory are part of
sizeof(pointer) "potential pointers". E.g. if you have an array of 2
values 0x1234 and 0x5678, then 0x3456 is also a potential pointer.

It's usually not that bad since words do not contain random values -
they'll normally cluster around small integers, representations of ASCII
letters, etc. But not all programs behave that way. And each piece of
uncollected garbage will keep all the other garbage it recursively
points at, uncollected.

In any case, using close to 4G of memory with garbage collection on a
32-bit host with 1-byte aligned pointers is not the brightest idea.
Switch to a host with 64-bit pointers and you are better off, in
particular if the implementation is aware of potential GC and avoids
heap addresses that match common memory patterns (small positive and
negative integers, ASCII text etc).

Also the program can help the collector by zeroing out unused pointer
values (or all unused values). Sometimes that's no less work than a
free(), but it does relieve the program of remembering if there are
other pointers to the data.
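Hallvard's 0x6f726c64 example can be checked directly. The sketch below (plain standard C, no collector involved; the function name is invented for illustration) shows how ordinary character data carries the bit pattern of a plausible heap address, which is exactly what a conservative collector scanning memory word-by-word must treat as a possible pointer:

```c
#include <stdint.h>

/* Reinterpret four bytes as a big-endian 32-bit value, the way a
   word-by-word conservative scan would see them on a big-endian
   32-bit host. */
uint32_t bytes_as_be32(const unsigned char b[4]) {
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

Feeding it the bytes 'o', 'r', 'l', 'd' from "Hello, world!" yields 0x6f726c64, so any word of memory holding that text fragment can pin a live-looking block at that address. The "zero out unused pointers" advice above is the program-side mitigation: clearing a pointer after its last use guarantees no stale copy of it keeps garbage reachable.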
 

George Peter Staplin

Hallvard said:
Sure it could. p might have representation 0x6f726c64, which is the
same as the ASCII representation of "orld", from the program's output
text "Hello, world!". Very often that's OK. Sometimes it's not.

<snip>


That's how C works with a conservative GC, such as the Boehm GC. With
an accurate GC there are no such limitations. All root variables are
known with an accurate GC, so that once all references from the roots to
an address or address range go away, the memory can be reused/released.

Unfortunately making an accurate GC for C would require compiler
support, and even then it would break in some cases due to the design of
C.

Some people are content with a few false references and thus the Boehm
GC works for them. I hope that their software isn't used for something
involving human lives, and that the allocations retained aren't large.


George
 

jacob navia

George said:
Some people are content with a few false references and thus the Boehm
GC works for them. I hope that their software isn't used for something
involving human lives, and that the allocations retained aren't large.

Incredible. And you will trust your life to the ability of a programmer
to avoid bugs in MANUALLY done memory management?

I would prefer GC at any time!
 

Flash Gordon

jacob navia wrote, On 27/04/08 14:57:
Incredible. And you will trust your life to the ability of a programmer
to avoid bugs in MANUALLY done memory management?

I would prefer GC at any time!

Actually, I would prefer safety critical software not use *any*
allocated memory, no recursion, and the maximum call depth and memory
usage calculated and *proved* to fit in the available memory. I've done
a significant amount of non-safety critical work with these restrictions
so for certain classes of application it is entirely possible.
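A minimal sketch of the kind of restriction Flash describes: all storage lives in a compile-time static pool, so worst-case memory use is visible before the program runs and "out of memory" becomes a bounded, analysable condition instead of a runtime malloc failure. The Msg type, the capacity, and the function names are all made up for the example:

```c
#include <stddef.h>

/* Fixed-capacity pool: no malloc, no free. The whole footprint is
   POOL_CAP * sizeof(Msg), known at compile time. */
enum { POOL_CAP = 8 };

typedef struct Msg { int id; char body[32]; } Msg;

static Msg pool[POOL_CAP];
static unsigned char in_use[POOL_CAP];

Msg *msg_acquire(void) {
    for (size_t i = 0; i < POOL_CAP; i++)
        if (!in_use[i]) { in_use[i] = 1; return &pool[i]; }
    return NULL;   /* exhaustion: a statically bounded, testable case */
}

void msg_release(Msg *m) {
    in_use[m - pool] = 0;
}
```

Because the pool never grows, proving "the program fits in memory" reduces to counting the maximum number of Msg objects live at once, which is the sort of proof obligation Flash mentions.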
 

santosh

jacob said:
Incredible. And you will trust your life to the ability of a
programmer to avoid bugs in MANUALLY done memory management?

I would prefer GC at any time!

In mission critical contexts deterministic behaviour (at least from the
software POV) is necessary. C with GC is not deterministic. I'm sure
you know this. Remember, desktops and servers are only one
(diminishing) domain among the many where C is used.
 

Eligiusz Narutowicz

Flash Gordon said:
jacob navia wrote, On 27/04/08 14:57:

Actually, I would prefer safety critical software not use *any*
allocated memory, no recursion, and the maximum call depth and memory
usage calculated and *proved* to fit in the available memory. I've
done a significant amount of non-safety critical work with these
restrictions so for certain classes of application it is entirely
possible.

Why? If you are able to program and test properly, your seemingly
random restrictions seem more tailored to a low-quality C programmer
than to producing a sensible and reliable program. Mallocing memory is
not dangerous at all when it's correctly handled. The same goes for
recursion. I would suggest more rigour in your design and testing than
silly rules such as the ones you mention.
 

Bartc

Eligiusz Narutowicz said:
Why? If you are able to program and test properly, your seemingly
random restrictions seem more tailored to a low-quality C programmer
than to producing a sensible and reliable program. Mallocing memory is
not dangerous at all when it's correctly handled.

Malloc can run out of memory.

Not using malloc can't! So the program can't fail for that reason.
 

santosh

Eligiusz said:
Why? If you are able to program and test properly, your seemingly
random restrictions seem more tailored to a low-quality C programmer
than to producing a sensible and reliable program.

<snip>

I think the point that Flash is making is that malloc *can* fail. In a
fault-intolerant environment, you want to keep the chances of failure
to a minimum. In an important section of code, what should be done if
malloc returns NULL for some reason or other, such as memory
fragmentation? The options open to desktop or general-purpose programs
(like calling exit, abort etc.) may not be acceptable in some
contexts. Recursion *can* be used safely, but again it simply opens up
the program to more chances of failure and consumes excessive resources
in a very likely resource-constrained environment.

So I would say that for certain limited types of applications Flash's
recommendations are entirely sensible.
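The dilemma santosh describes can be made concrete with a hypothetical wrapper (xmalloc is not a standard function; the names and the exit-on-failure policy are only one choice, shown here precisely because it is often unacceptable in the contexts under discussion):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: every allocation is a potential failure
   site, and *some* policy has to live here. Aborting is fine for a
   desktop tool; in a safety-critical system there may be no
   acceptable action at this point, which is one argument for
   forbidding dynamic allocation there altogether. */
void *xmalloc(size_t n) {
    void *p = malloc(n);
    if (p == NULL) {
        fprintf(stderr, "fatal: out of memory (%zu bytes)\n", n);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

The wrapper centralises the decision, but it cannot make the decision good: exit() in flight software is exactly the "reboot at 30 feet" scenario raised later in the thread.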
 

Eligiusz Narutowicz

Bartc said:
Malloc can run out of memory.

Not using malloc can't! So the program can't fail for that reason.

Malloc can also break down when the power goes off. In a well
designed program on a properly specced piece of HW it does not run out
of memory. So the natural extension of your reply is "do not use malloc
as it may run out of memory". I wonder if C is for you?
 

santosh

Bartc said:
Malloc can run out of memory.

Not using malloc can't! So the program can't fail for that reason.

In a really safety critical program sufficient memory for maximum
expected usage must be present as part of system design. Otherwise,
instead of malloc failing, we will simply get the even more
unmanageable stack memory exhaustion.
 

jacob navia

Eligiusz said:
In a well
designed program on a properly specced piece of HW it does not run out
of memory. So the natural extension of your reply is "do not use malloc
as it may run out of memory". I wonder if C is for you?

Of course it may run out of memory if for some reason (BUG) there is
a memory leak.

Of course this never happens to you, Mr Superman. (Unless you run into
Kryptonite)

What people like you fail to understand is that a moderately complex
piece of software can't be tested exhaustively since the exponential
growth of the combinations of input parameters, interrupts, timing
considerations, possible paths through the software etc makes exhaustive
testing impossible.

What can be done is to test each module as far as the budget goes,
test all the modules in simulated real life conditions, etc.

But no amount of testing will make bugs go away, especially in C,
and ESPECIALLY using manual garbage collection!!!

With manual memory management (malloc/free) each allocation/release
becomes a point of failure. Using a GC, the GC can be the point of
failure, but when the GC is certified, all other points of failure
disappear.

Obviously you haven't read the abundant literature about software
testing, and apparently you are ignorant of all the problems associated
with manual memory management. It is therefore with the usual ARROGANCE
of people who think they know everything that you dare to ask:

<< I wonder if C is for you >>

as if you were some kind of demigod that never makes mistakes.

You know nothing about the subject matter, guy!
 

Bartc

Eligiusz Narutowicz said:
Malloc can also break down when the power goes off. In a well
designed program on a properly specced piece of HW it does not run out
of memory. So the natural extension of your reply is "do not use malloc
as it may run out of memory". I wonder if C is for you?

What difference will changing language make? If dynamic memory is being used
in a complex way then it is conceivable that memory can become full whatever
the language. At least in C there is some choice.

Imagine a graph showing heap memory usage from 0% to 100% over runtime.
There will be peaks and troughs. Some peaks can be near 100%. But can you
prove that it will not hit 100%? Ever? On any combination of programs that
might be sharing the memory?

If you take dynamic memory out of the equation (or get the dynamic
allocations out of the way right at the start, in a proven setup) then
you force other techniques to be used so that you can avoid that problem.
Malloc can also break down when the power goes off too.

Where it's important, then there are ways of taking care of that.
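Bartc's "get the dynamic allocations out of the way right at the start" pattern might look like the sketch below. The buffer names and sizes are invented for illustration; the point is that after a successful init, the steady-state part of the program can no longer hit an out-of-memory condition:

```c
#include <stdlib.h>

/* All dynamic allocation happens once, at startup, sized for the
   worst case. If init_buffers() succeeds, the rest of the program
   never calls malloc again. Sizes are illustrative. */
static double *samples;        /* worst-case sample buffer */
static unsigned char *frame;   /* worst-case frame buffer  */

int init_buffers(void) {
    samples = malloc(4096 * sizeof *samples);
    frame   = malloc(1024 * 768);
    if (samples == NULL || frame == NULL) {
        free(samples);         /* free(NULL) is a safe no-op */
        free(frame);
        return -1;             /* fail early, before anything runs */
    }
    return 0;
}
```

This turns Bartc's "can you prove the peak never hits 100%?" question into a one-time startup check: either the worst-case reservation fits, or the program refuses to start at all.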
 

santosh

jacob navia wrote:

With manual memory management (malloc/free) each allocation/release
becomes a point of failure. Using a GC, the GC can be the point of
failure, but when the GC is certified, all other points of failure
disappear.

<snip>

What exactly do you mean by a "failure" in a GC? Is failing to collect
storage for which all references have been lost a failure? If this is
so, and in due course the program fails to run because of an
out-of-memory situation due to the GC failing to collect blocks that it
should have, would you still say that such software can be reliably
used in safety or performance critical contexts?

Also how exactly do you propose to certify a GC? Just because it works
for some set of programs doesn't mean that it will work for all
possible programs.
 

Flash Gordon

santosh wrote, On 27/04/08 16:37:
I think the point that Flash is making is that malloc *can* fail. In a
fault-intolerant environment, you want to keep chances of failures to a
minimum.

You also need to be able to prove correctness with a certain degree of
rigour (which varies depending on how safety-critical the SW is and
what else there is to prevent injury or loss of life).
In an important section of code, what should be done if malloc
returns NULL for some reason or other, such as memory
fragmentation?

Or simply a bug in the malloc/free implementation. So you would have to
either get the library provider to get their library tested to the
relevant standards (such testing is expensive, even more so if it was
not designed to meet those standards in the first place) or you have to
take responsibility for it and prove the library.
The options open to desktop or general-purpose programs
(like calling exit, abort etc.) may not be acceptable in some
contexts.

Indeed. How do you fancy being in (or under) a military jet flying at
night 30 feet above the ground when the pilot's night vision system
decides to reboot? You could be dead in under a second.
Recursion *can* be used safely, but again it simply opens up
the program to more chances of failure and consumes excessive resources
in a very likely resource constrained environment.

Even if it is not tightly constrained (where I worked there was a
standard requirement of 50% free resources to allow for future
enhancements) it still makes it harder to *prove* that it is safe.
So I would say that for certain limited types of applications Flash's
recommendations are entirely sensible.

There are a lot of applications where it makes sense, and a lot more
where it doesn't make sense. Safety critical (and it was someone else
who mentioned "life depending on it") would be at the top of my list of
things where it makes sense.

I would be interested to hear what the MISRA standard says about
allocated memory and recursion; I suspect they might be forbidden or at
least highly constrained there too.

Oh, and when I worked in the defence industry we had these restrictions.
For the type of applications I worked on it was not a problem and even
without the restrictions there was only one application in 15 years
where I might have considered using malloc, and I can't think of any
where recursion would have been appropriate.

For "normal" applications (like those I work on now) I will happily use
malloc/realloc/free, I might consider garbage collection if it was part
of the standard, and I think nothing of mutually recursive functions (we
have very good reason for a() calls b() calls c() calls a() and even
more complex things).
 

jacob navia

santosh said:
jacob navia wrote:



<snip>

What exactly do you mean by a "failure" in a GC?

1) Failing to collect an unused block
2) Failure to realize that a released block was actually in use
3) Failure to compact memory
Is failing to collect
storage for which all references have been lost a failure?
Yes.

If this is
so, and in due course the program fails to run because of an
out-of-memory situation due to the GC failing to collect blocks that it
should have, would you still say that such software can be reliably
used in safety or performance critical contexts?

Of course not. And I am not proposing that a conservative GC
that is OK for desktop applications be used in airplane software.

But a GC can be built into the compiler, or programmed specifically
for the application, etc.
Also how exactly do you propose to certify a GC?

As all other certification processes for complex software go.

Just because it works
for some set of programs doesn't mean that it will work for all
possible programs.

Exactly. It is very difficult!
 

Hallvard B Furuseth

George said:
That's how C works with a conservative GC, such as the Boehm GC. With
an accurate GC there are no such limitations. (...) Unfortunately
making an accurate GC for C would require compiler support, and even
then it would break in some cases due to the design of C.

True, I should have mentioned that I was talking about GC in C (or a
normal C implementation anyway), which doesn't store enough type
information for the collector to know what is a pointer and what isn't.
 

Eligiusz Narutowicz

jacob navia said:
Of course it may run out of memory if for some reason (BUG) there is
a memory leak.

Yes - this is true of all programs in all languages.
Of course this never happens to you, Mr Superman. (Unless you run into
Kryptonite)

I'm not sure why you are being so aggressive. I am just saying that a
well tested and debugged program does not leak memory - do yours?
Certainly using GC to cover for sloppy programming is no remedy, I think.
What people like you fail to understand is that a moderately complex
piece of software can't be tested exhaustively since the exponential
growth of the combinations of input parameters, interrupts, timing
considerations, possible paths through the software etc makes exhaustive
testing impossible.

What can be tested and checked is well-marshalled malloc/free
access. God knows I have worked on enough large C programs which do
not leak memory. Have I made bugs? Of course.
What can be done is to test each module as far as the budget goes,
test all the modules in simulated real life conditions, etc.

But no amount of testing will make bugs go away, especially in C,
and ESPECIALLY using manual garbage collection!!!

I think - sorry, I know - you are wrong. You are basically saying you
cannot make a bug-free program using malloc. This is clearly not
true.
With manual memory management (malloc/free) each allocation/release
becomes a point of failure. Using a GC, the GC can be the point of
failure, but when the GC is certified, all other points of failure
disappear.

Obviously you haven't read the abundant literature about software
testing, and apparently you are ignorant of all the problems associated
with manual memory management. It is therefore with the usual ARROGANCE
of people who think they know everything that you dare to ask:

<< I wonder if C is for you >>

as if you were some kind of demigod that never makes mistakes.

You know nothing about the subject matter, guy!

Actually I have been a professional C programmer for approximately 20
years. I am acutely aware of the issues and pitfalls involved in C
programming.

I think you have misunderstood me.

I am saying this - one does not "not use malloc" because it's possible
to use it badly. Do you not drive a car because it's possible to crash
it if you make a mistake?
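One common way the "well marshalled" malloc/free discipline mentioned above is enforced in large C code bases is to pair every allocation with a single create/destroy interface, so ownership is auditable per module rather than per call site. The Buffer type and function names here are invented for illustration:

```c
#include <stdlib.h>

/* Illustrative ownership discipline: Buffer objects are created and
   destroyed only through this pair, so every allocation has exactly
   one owner and one release path. */
typedef struct Buffer {
    size_t len;
    char  *data;
} Buffer;

Buffer *buffer_create(size_t len) {
    Buffer *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->data = calloc(len, 1);            /* zero-initialised payload */
    if (b->data == NULL) { free(b); return NULL; }
    b->len = len;
    return b;
}

void buffer_destroy(Buffer *b) {
    if (b == NULL) return;               /* mirror free(NULL): safe no-op */
    free(b->data);
    free(b);
}
```

With this convention a leak hunt reduces to matching buffer_create against buffer_destroy calls, which is exactly the kind of property that can be checked by review or by tooling without exhaustive testing.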
 

Eligiusz Narutowicz

CBFalconer said:
Definitely yes. I pay attention to these things, and have produced
lots of software, embedded and otherwise, that accurately handles
memory with malloc and equivalents. I know what is happening. And
the lives of others depended on that software.

I prefer to trust my own judgment than that of some unknown.

I too am confused why Jacob thinks it is not possible to use malloc/free
in a proven, tested and bug-free manner. It is a surprise to me, and I
wonder how all these C programs over the years are still working so well
- including Gnome and the Linux kernel.

Personally I do not like GC - I think it encourages lazy thinking, and
if you then go back to a non-GC C environment this lazy thinking may
well cause bugs.
 
