Experiment: functional concepts in C

  • Thread starter Ertugrul Söylemez

toby

What basis do you have for this absurd claim...


...or this one? The OS, if it frees your memory, needs just
as much CPU time as your code would,

AFAICT Hyman is correctly pointing out that meticulous freeing of
individual allocations is potentially expensive and pointless
maintenance if the process is terminating. The operating system does
not perform the equivalent free operations; rather, the heap just
goes away, at no per-allocation cost.
 

Andrew Poelstra

Experience. Logic.

Experience, perhaps, if your experience is truly that limited.

Logic, no.

No. A program which insists on freeing its allocated memory
does so by going through each allocated object individually
and freeing it. This can involve large amounts of time when the
program has allocated millions of objects in its data
structures. Upon program exit, the operating system reclaims
all memory allocated by the program in one action, and that
takes essentially no time at all.

Perhaps you should write memory managers, if you can really
reclaim millions of individually allocated blocks in a single
action. (Though given your belief there is no need to do so,
I have an idea of how it might work...)

This is not even to mention that when a program frees allocated
memory through 'free', it is usually doing nothing but placing
that memory back onto an internal data structure so that it may
be allocated again. It is seldom returned to the operating system
because it was not allocated that way; allocators request large
blocks and break them up internally to the program.

Allocators allocate. To ascribe anything more to them is
presumptuous and unjustified.
 

Seebs

...or this one? The OS, if it frees your memory, needs just
as much CPU time as your code would, had you chosen to make
it portable to machines who are not your mother.

Not generally true.

Imagine that I have allocated a linked list of a million nodes, each
pointing to 1024 bytes of data. On a typical malloc implementation,
freeing these will result in no fewer than two million separate operations,
each of which may also result in complicated operations as the library
consolidates free space, etcetera.

The operating system will simply mark my pages unused -- which it has to do
whether or not I've freed them in terms of the library operation.

The operating system can do it faster because it doesn't need to care about
the internals and bookkeeping of the malloc arena.

-s
 

Andrew Poelstra

Not generally true.

Imagine that I have allocated a linked list of a million nodes, each
pointing to 1024 bytes of data. On a typical malloc implementation,
freeing these will result in no fewer than two million separate operations,
each of which may also result in complicated operations as the library
consolidates free space, etcetera.

The operating system will simply mark my pages unused -- which it has to do
whether or not I've freed them in terms of the library operation.

The operating system can do it faster because it doesn't need to care about
the internals and bookkeeping of the malloc arena.

But you would need to have found a gigabyte of contiguous memory
dedicated to your program for this to be simple - otherwise you
would be in memory space also used by other applications, and
bookkeeping does matter.
 

Seebs

But you would need to have found a gigabyte of contiguous memory
dedicated to your program for this to be simple - otherwise you
would be in memory space also used by other applications, and
bookkeeping does matter.

It's rather more complicated on most systems. But, in general, there is
nothing you can do inside the program that will reduce the total amount of
bookkeeping that needs to be done on exit, and simply not freeing anything
(as in free()) does not increase the amount of bookkeeping that needs to be
done.

The closest you could get is that some implementations may be such that,
once you've freed all the storage allocated within a given block of
OS-provided storage, you can return that block to the OS, meaning that it does
the bookkeeping to free that block right away, rather than having to do it
after you exit, but this can't make things any *better*, and may make things
*worse*. (For instance, if you end up releasing a block between two other
blocks -- the OS might have to do extra work there that it wouldn't if all
three were released at once.)

Note that "contiguous" is pretty much irrelevant on most modern systems,
as program address spaces are all virtual anyway.

So the fact remains: In general, on all the major platforms in use, there is
no possible case where calling free() a lot right before termination can make
things any better, and there are many obvious cases where they can make things
worse by orders of magnitude.

-s
 

jacob navia

Andrew Poelstra wrote:
What basis do you have for this absurd claim...


"Absurd claim"

ALL OS do that. If not, they would run a single leaking program and
would die immediately.

I have never seen an OS that did not do that.

...or this one? The OS, if it frees your memory, needs just
as much CPU time as your code would,

No, in most cases it will use a single free() call
since all memory allocated for a process will be in
a single heap.

had you chosen to make
it portable to machines who are not your mother.

Nonsense
 

Andrew Poelstra

Andrew Poelstra wrote:


"Absurd claim"

ALL OS do that. If not, they would run a single leaking program and
would die immediately.

Not immediately. And if this was an embedded system, eventually
the watchdog would kick and the programmer would (hopefully)
realize that his code was buggy and needed fixing.

I have never seen an OS that did not do that.

Again, in embedded systems many OS's choose not to support
bad programming. There are no users downloading garbage
from the Internet and blithely expecting it to run despite
barely being compilable.

No, in most cases it will use a single free() call
since all memory allocated for a process will be in
a single heap.

But in "most cases" there is so little memory allocated
that the equivalent free() calls will take practically
zero time anyway. In the cases that a ton of memory is
used, then the OS has its work set out for it whether or
not the code is explicitly calling free().
 

Seebs

ALL OS do that. If not, they would run a single leaking program and
would die immediately.

I have never seen an OS that did not do that.

Have you never seen Windows 3.1 or DOS? Both had C implementations where
malloc() could reserve memory that lived past program termination. I think
the Amiga did too.

I mean, not saying that it's still a major issue for most people. But there
certainly did exist such machines, and some people may care about them.

-s
 

Andrew Poelstra

So the fact remains: In general, on all the major platforms in use, there is
no possible case where calling free() a lot right before termination can make
things any better, and there are many obvious cases where they can make things
worse by orders of magnitude.

Leaving aside OS's that do not or cannot clean up, because to the
best of my knowledge no "big" OS's like that are in use:


I am going to disagree with you on this point: on my system, if a program
cleans up its own memory, I see a delay between clicking the X (or Ctrl+C,
or whatever) and the program actually dying. During this time it does its
cleanup, plays with the hard drive and whatever. Meanwhile, I am free to
do other things.

However, if the program just dies, I see a quick and painless exit. And
then the kernel takes over to do cleanup, which has a higher priority than
the dead program. And while it's cleaning up, the rest of my system is
sluggish, even if only for a second.

Better the cleanup be done with the program's CPU slices than my UI's,
I say.


Andrew
 

Ertugrul Söylemez

Hyman Rosen said:
There is generally no need to free resources just before program exit,
because the operating system will do that for you - a program which
spends a bunch of time freeing every allocated object and then exits
has just wasted all of that time. It's important to free resources
which can be allocated in an unbounded fashion. That is, a program
should act like a garbage collector - unreachable objects must be
reclaimed, but reachable ones don't need to be.

Okay, first let's talk about C. In C, memory allocation and freeing are
not particularly fast, so you're indeed making a good point. There is
also no garbage collector to do that job for you. However, there is still
a very important reason to free your memory. At some point in time you
may want to restructure your program, perhaps to enhance it, at which
point the missing cleanup will get you into trouble. We're talking about
safety and modularity here.

Next, a program that really allocates millions of memory blocks
generally doesn't need to exit that fast anyway. Also
note that allocating and freeing 5 million (!) blocks of 256 bytes using
malloc/free takes only around 1.6 seconds here (Athlon64 x2 with 2.7 GHz
and 2 GiB of RAM). The program I used can be found at [1].

So still there is no excuse for not freeing memory. And that's just
about memory, but I was talking about all kinds of resources. Not
freeing them will get you into trouble. If you're too convinced (or
lazy) to do it, better stop using C. You're writing unsafe, nonmodular
code, because of conviction (or laziness).

Now let's talk about PHP, which is a garbage-collected language, so you
don't deal with memory anyway, unless you specifically want to (e.g. by
using unset()). But there are a lot of resource types that are not
covered by the garbage collector, such as files and database connections.
You should free them for the same reasons as above. You're not doing
anything the PHP runtime wouldn't do by itself, but you're writing safe,
modular code that way, which is hardly slower.

[1] http://codepad.org/tjlVGJB8


Greets
Ertugrul
 

Seebs

So still there is no excuse for not freeing memory.

Again, there are a couple. However, they're usually system-specific. I can
point you at an obvious case, although it's not portable: If you wish to
execute another program in a UNIX-like environment, you almost certainly
need to allocate the space for its argument list... And then your program
exits to exec the other program. You must not free the memory before
calling exec(), and after exec(), you don't exist.

-s
 

Seebs

I am going to disagree with you on this point: on my system, if a program
cleans up its own memory, I see a delay between clicking the X (or Ctrl+C,
or whatever) and the program actually dying. During this time it does its
cleanup, plays with the hard drive and whatever. Meanwhile, I am free to
do other things.

Yes.

However, if the program just dies, I see a quick and painless exit. And
then the kernel takes over to do cleanup, which has a higher priority than
the dead program. And while it's cleaning up, the rest of my system is
sluggish, even if only for a second.

You are mistaken.

This gets pretty off-topic, but: The OS is doing the same cleanup either way.
What can create a noticeable sluggishness is that, if something aborts,
the kernel may choose to dump core -- which is to say, *write out to disk*
a copy of the process memory space. Which can slow things down quite a bit.

If you don't have coredumps, there's no extra cost.

Seriously, I have worked on this code. The kernel does the same cleanup
either way, nothing the program does prior to exiting saves it any time in
most real-world cases. Again, *the kernel does not know or care how the
process has divided up its memory*. In most cases, the process has a single
huge block of memory (from the kernel's point of view), even if the process
has internally subdivided it into thousands or millions of chunks. The
kernel cleanup is the same either way.

-s
 

Willem

Seebs wrote:
)> I am going to disagree with you on this point: on my system, if a program
)> cleans up its own memory, I see a delay between clicking the X (or Ctrl+C,
)> or whatever) and the program actually dying. During this time it does its
)> cleanup, plays with the hard drive and whatever. Meanwhile, I am free to
)> do other things.
)
) Yes.
)
)> However, if the program just dies, I see a quick and painless exit. And
)> then the kernel takes over to do cleanup, which has a higher priority than
)> the dead program. And while it's cleaning up, the rest of my system is
)> sluggish, even if only for a second.
)
) You are mistaken.
)
) This gets pretty off-topic, but: The OS is doing the same cleanup either way.
) What can create a noticeable sluggishness is that, if something aborts,
) the kernel may choose to dump core -- which is to say, *write out to disk*
) a copy of the process memory space. Which can slow things down quite a bit.

That's pretty Unix-specific.
On Windows, I guess something similar happens: the OS tries to do some
kind of crash recovery, writes debug info, or reports the failure when a
program terminates abnormally. Perhaps there's even a Dr. Watson or
similar running in the background that does stack traces and all kinds
of error reporting whenever a program crashes.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 

jacob navia

Andrew Poelstra wrote:
Not immediately.

????

And if this was an embedded system, eventually
the watchdog would kick and the programmer would (hopefully)
realize that his code was buggy and needed fixing.

the code is not buggy.

Again, in embedded systems many OS's choose not to support
bad programming. There are no users downloading garbage
from the Internet and blithely expecting it to run despite
barely being compilable.

OK. In your pet os (with lowercase) a program that has
a memory leak will bring the whole system down.

GREAT SYSTEM!

But in "most cases" there is so little memory allocated
that the equivalent free() calls will take practically
zero time anyway. In the cases that a ton of memory is
used, then the OS has its work set out for it whether or
not the code is explicitly calling free().

you have no idea how modern OSes work apparently
 

Ertugrul Söylemez

Seebs said:
Again, there are a couple. However, they're usually system-specific.
I can point you at an obvious case, although it's not portable: If
you wish to execute another program in a UNIX-like environment, you
almost certainly need to allocate the space for its argument
list... And then your program exits to exec the other program. You
must not free the memory before calling exec(), and after exec(), you
don't exist.

Of course, but as you say that's system-specific and you're left with no
choice anyway. But if you have a choice, you should choose correctness
over small performance gains.


Greets
Ertugrul
 

Andrew Poelstra

Andrew Poelstra wrote:

the code is not buggy.

It takes memory from the OS and never gives it back.
In what world would this oversight not be a bug?

OK. In your pet os (with lowercase) a program that has
a memory leak will bring the whole system down.

Eventually it will, if you run it enough times that it
uses all of the memory in the system. And then it will
reset and likely be called again. So "problem solved",
aside from the unnecessary reboot and consequential
power usage.

GREAT SYSTEM!

Well, it works, as above.

CPU cycles are not unlimited. Spending them trying to
save inattentive programmers who don't keep track of
their resources is not always a good investment.
 

Seebs

It takes memory from the OS and never gives it back.
In what world would this oversight not be a bug?

One in which we expect the operating system to manage resources.

Which is actually pretty much the point of calling something an
"operating system".

If the OS doesn't know which programs have which resources, it's already
failed.

CPU cycles are not unlimited. Spending them trying to
save inattentive programmers who don't keep track of
their resources is not always a good investment.

It turns out, though, that marking the few megabytes a process used to
have as "no longer in use" is much, much, cheaper than having the process
try to walk through a ton of complicated data structures.

-s
 

Seebs

Of course, but as you say that's system-specific and you're left with no
choice anyway. But if you have a choice, you should choose correctness
over small performance gains.

In general, I agree. In practice, though, I don't see anything intrinsically
wrong with a system guaranteeing that certain classes of resources are
automatically deallocated on exit, and relying on that behavior.

How many people have you seen close stdout, stderr, and stdin at the end
of a program? Why don't they do that? You'd normally close any file that
was previously open, right?

-s
 

Ersek, Laszlo

So the fact remains: In general, on all the major platforms in use, there is
no possible case where calling free() a lot right before termination can make
things any better, and there are many obvious cases where they can make things
worse by orders of magnitude.

It has a development-related side, too. Freeing everything correctly in
the end
- increases understanding of inter-object ownerships,
- helps debugging with valgrind or one's custom allocator,
- makes it easier to embed the program into another program (or
library).

Perhaps such a final cleanup should depend on a compile time (or
runtime) toggle.

It is not easy to spot a leak if everything is leaked in case of a
successful exit. If suddenly the program must do the same thing in a
loop, leaking becomes an issue.

Cheers,
lacos
 

Ertugrul Söylemez

Seebs said:
In general, I agree. In practice, though, I don't see anything
intrinsically wrong with a system guaranteeing that certain classes of
resources are automatically deallocated on exit, and relying on that
behavior.

It is good that the operating system does that, but it's wrong to rely
on that feature.

How many people have you seen close stdout, stderr, and stdin at the
end of a program? Why don't they do that? You'd normally close any
file that was previously open, right?

You're not supposed to close the std-handles, because you have not
opened them. You are responsible for closing/freeing resources you have
opened/allocated yourself. For many resources there is even a stack of
allocations, which you should follow. The pattern looks like this:

<file1>
<file2>
<someCode />
<file3>
<someCode />
</file3>
<someCode />
</file2>
</file1>

As your main function is entered, there are already three of those file
tags open, one for each of stdin, stdout and stderr and possibly some
others. Closing one of the handles is the same as putting a closing tag
too early. Because of system peculiarities you may be forced to do
that, but it's better to avoid for the sake of consistency. Closing a
resource prematurely is just as wrong as closing it too late (or not
closing it at all).

That's also the reason I suggest freeing all of the allocated memory
before exiting. If you don't, there is a closing tag missing in your
program pattern. When you restructure your program, you might
accidentally forget that missing closing tag and introduce memory leaks.
You know how difficult it can be to find memory leaks, if you ever
notice them at all.

The CPS-based resource management function I proposed solves this
problem to some extent, because every resource has a clear scope, and
without abusing global/static variables, it can't even escape that
scope. In other words, the closing tag is enforced in the pattern.


Greets
Ertugrul
 
