Experiment: functional concepts in C

  • Thread starter Ertugrul Söylemez

Seebs

So you'd have caused us the same problem that $PROGRAMMING_TEAM did.
After all, at the time they wrote the code, they had no reason to
suspect a new pass; this was not added until some time after completion
of the original program.

Maybe. My guess is that I wouldn't have, because the unfreed allocations
would all be of the form:

if (!table)
        table = malloc(...);

and adding passes would have no effect. The only cases I can think of where
I have unconditional allocations which aren't freed lead to something
to the effect of:
execv(...);
fprintf(stderr, "exec failed: %s\naborting.\n", strerror(errno));
exit(EXIT_FAILURE);

Or, probably in some cases, there's a single giant object with a foo_free()
defined which you could free at the end of each pass.

Hmm. Okay, I found an exception. I have a typical Unix-style filter
lying around which does not deallocate its buffer and list of offsets into
the buffer. On the other hand, because it's a filter, it's *already*
looping, so it wouldn't make much sense to try to invoke it in a loop. If
you did need to, though, it'd be pretty easy.

-s
 

Kaz Kylheku

Please don't. This is also an economically feasible approach, only with
very different economic factors.

Sure, you can always put together a portability programming tournament,
where everyone pays to enter, and the winner fetches a monetary prize.

The same economic factors as in poker, golf, competitive road running,
etc.

In actual software, maximal portability is the enemy of portability.
Ignoring for a moment that this is a factually incorrect deliberate
offense, isn't it the general understanding that reasoning and code and
development processes that require less mental strain are easier to keep
bug-free?

I didn't say anything about less work. A lower /level/ of reasoning
doesn't translate to less work.

For instance, throughout this thread you've been advocating the addition
of /completely unnecessary/ code (bearing unnecessary risk) to a program
for the sake of portability.

How can it ever be less straining to add more code, compared to
not adding it at all?

Any monkey can read a couple of ISO standards a few times over again and
learn to spot nonportability in code.

``Hey look, that text stream was closed and a few lines before it, it's
clear that the last character written wasn't a newline! I'm so smart!''
 

Zarquon

Richard said:
Only a complete fool makes blanket statements about what only a complete
fool would do, so I wouldn't pay too much attention if I were you.

There are a number of people who I think of as complete fools, who take
every opportunity to paint Richard Heathfield with the brush of
"complete fool", but since I regard them as complete fools, I can
completely ignore their attempts to undermine Richard's credibility.

To find Richard calling himself a complete fool is rather puzzling, as
he otherwise scores fairly high on my credibility meter. If I believe
him, then he has no credibility and if he has no credibility then I
don't believe him. But if I don't believe him then he does have
credibility, so I should believe him.

Help! The paradox is melting my brain!
 

Kaz Kylheku

["Followup-To:" header set to comp.lang.c.]
My aim was to enlighten the poster, not to win an argument by
pedantry. If that's being a complete fool, I can put up with it.

Sorry, that wasn't aimed at you. :)
 

Ertugrul Söylemez

Kaz Kylheku said:
What modern languages would those be? I suspect you could not name one
such that you are not wrong.

All languages with an automatic garbage collector. There are even
languages which implement that "tag stack" I mentioned earlier
directly, like C# with its 'using' directive.

It's not fault tolerance but resource management. You are pitifully
wrong in your other posting.

Relying on this is no more wrong than relying on a TCP/IP stack,
graphics display, or DMA transfers.

TCP/IP stack, graphics display and DMA transfers are functionality.
What you call "resource management" prevents the system from crashing
just because some programmers write wrong code. That's not
functionality, it's fault tolerance.

Programmers who get paid for a living want their programs to be
portable in a way that is /economically/ relevant, balanced with other
requirements.

Maximal portability, the kind where we pretend we write a single body
of source code without conditional compilation while pretending we are
programming for an incapable, broken platform, is only an obsession of
a few dull minds.

What the hell is wrong with freeing resources? It takes twenty seconds
to write the code to do it. The code itself uses at most one or two
seconds to execute and guarantees correctness.

We all know how well most commercial programs work. If that is your
"enconomically relevant, balanced" portability, then commercial
application development is in a bad state.

Serious programmers won't even argue about this topic. We're talking
about something as trivial as freeing allocated resources. A serious
programmer will just do it and that's it. After all, that's what he has
learned from books and school anyway.

Without automatic storage management, there is no real modularity.
Modules are coupled together by the distributed responsibility of
memory management. It creeps into all the interfaces: who allocates
what and who will free it.

It's laughable to advocate explicit storage deallocation and
modularity in the same breath.

Indeed. Having a system which guarantees correctness is important in
such a setting.


Greets
Ertugrul
 

Seebs

What the hell is wrong with freeing resources? It takes twenty seconds
to write the code to do it. The code itself uses at most one or two
seconds to execute and guarantees correctness.

Well, no.

1. It doesn't guarantee correctness. I already provided an example where
it did not guarantee correctness, because a program could run out of memory
if it ran long enough, even though at any given point, if it exited, it
would free everything.
2. It can take a lot more than twenty seconds to write the code to do it.
3. It can take a lot more than one or two seconds to execute, especially
on slower hardware.

There exist cases where there are good reasons not to do it. They may be
specialized cases, they may not be all that common...

Here's the thing. If you want to argue that it's a good idea, and should be
assumed as a default strategy and used until someone has a concrete and
specific reason not to do it, great! I agree totally.

When you start making blanket assertions, or calling programs which rely on
a 100% documented and guaranteed aspect of their operating environment
"incorrect", though, you undermine your position. The net result is that not
only are people not persuaded of what you're saying, they're persuaded against
the general practice of freeing allocated resources, because they conclude
that the people advocating it are fanatics who don't understand engineering
tradeoffs.
Serious programmers won't even argue about this topic.

Does preemptively insulting people for disagreeing with you usually work
in your culture? In mine it's often taken to be a sign of lacking persuasive
arguments.

-s
 

Keith Thompson

Seebs said:
Well, no.

1. It doesn't guarantee correctness. I already provided an example where
it did not guarantee correctness, because a program could run out of memory
if it ran long enough, even though at any given point, if it exited, it
would free everything.
2. It can take a lot more than twenty seconds to write the code to do it.
3. It can take a lot more than one or two seconds to execute, especially
on slower hardware.

4. Even if it takes only one or two seconds, that overhead might
be unacceptable. If my text editor takes one or two seconds to
terminate after I tell it to exit, I'll be annoyed. If a command
that I execute a few hundred times in a loop takes one or two seconds
to exit, I'll probably have to find a different way to accomplish
my task.
 

Kaz Kylheku

So be very careful.


Why is buf static?

Because this is a singleton pattern; each call to this function
is expected to yield the same object.
I think what Kaz had in mind is that any memory that needs to be
deallocated on each iteration of the loop is allocated, not via
malloc and friends, but by some other routines that use the "heap"
object created by heap_create(). Since getGlobalFoozleBuffer
calls calloc() directly, the allocated memory won't be affected
by heap_dispose().

But originally, /all/ functions call the allocator directly, just
like this function. When refactoring the program to use the heap
allocator, we have to distinguish this case and perhaps have it do this:

static char *buf = heap_calloc(global_heap, 147, 13);

But this is a simple case where the data flow is simple: the source of
the dynamic object and its destination static variable are in the same
full expression. This is not always the case in a real program. The
calloc call may be in some generic routine that is called (perhaps
through a number of levels of call nesting), such that its return value
is sometimes stored in a static variable and sometimes not.

So then the whole function chain needs to pass down the allocation
context.

Finding such cases in a large program is difficult, and so this can
add significant risk.
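
For concreteness, here is a rough sketch of the kind of interface I mean
by the ``heap'' allocator. heap_create, heap_calloc and heap_dispose are
hypothetical names from this discussion, not an existing library, and
alignment and overflow checking are deliberately glossed over:

#include <stdlib.h>
#include <string.h>

/* Every block is chained to the heap it came from, so heap_dispose()
   can release everything allocated against that heap in one call. */
typedef struct heap_block {
    struct heap_block *next;
    /* user data follows the header */
} heap_block;

typedef struct heap {
    heap_block *blocks;
} heap;

heap *heap_create(void)
{
    heap *h = malloc(sizeof *h);
    if (h != NULL)
        h->blocks = NULL;
    return h;
}

void *heap_calloc(heap *h, size_t nmemb, size_t size)
{
    /* overflow check and stricter alignment omitted in this sketch */
    heap_block *b = malloc(sizeof *b + nmemb * size);
    if (b == NULL)
        return NULL;
    b->next = h->blocks;
    h->blocks = b;
    memset(b + 1, 0, nmemb * size);
    return b + 1;          /* user memory starts just past the header */
}

void heap_dispose(heap *h)
{
    heap_block *b = h->blocks;
    while (b != NULL) {
        heap_block *next = b->next;
        free(b);
        b = next;
    }
    free(h);
}

With something like this, a pass allocates everything against its own
heap and one heap_dispose() at the end of the pass releases it all.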

This boils down to the problem of various degrees of incompleteness of
the simulation of re-running a program. To run a C program, we must
initialize its static variables to their correct initial values specified
in the source code (among lots of other things). Since we have not done
that, we get the old values of the static variables, which may contain
dangling pointers.

The first solution I might reach for to solve this would be dynamic linking,
which is available on a number of important platforms, and provides
reinitialization of statics (if you unload a library and then re-load,
it gets new statics which are properly initialized). On platforms that
support dynamic linking, it can be considered the preferred tool
for turning stand-alone programs into components of other programs.
There is likely going to be some considerable overhead in doing the
unloading and reloading, since it involves mapping and unmapping memory.
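
A minimal sketch of that approach using the POSIX dl* interface. The
library name libfoo.so and the entry point foo_main are made up for the
example, and whether dlclose() really unloads the library (and thereby
resets its statics on the next dlopen) is left to the implementation:

#include <dlfcn.h>
#include <stdio.h>

/* Run the componentized program once; load and unload it each time so
   its statics start from their initial values again. */
int run_foo_once(int argc, char **argv)
{
    void *handle = dlopen("./libfoo.so", RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    int (*foo_main)(int, char **) =
        (int (*)(int, char **)) dlsym(handle, "foo_main");
    if (foo_main == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    int rc = foo_main(argc, argv);

    /* If this really unloads the library, the next dlopen() maps a
       freshly initialized copy of its static data. */
    dlclose(handle);
    return rc;
}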

A less easy, though more portable, solution which has other advantages
too is to identify the static variables of program foo and factor them
out into a ``struct foo_globals''. Then we can initialize them before
re-running the program. This is a lot of work, but at least it's
largely mechanical: looking for all file scope and block scope static
definitions.
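
For illustration, the factoring might look roughly like this; the
struct foo_globals type and its members are invented for the example:

#include <stddef.h>

/* All former file-scope and block-scope statics of program foo,
   collected in one place (members are purely illustrative). */
struct foo_globals {
    char   *buf;          /* was: static char *buf;           */
    size_t  offset_count; /* was: static size_t offset_count; */
    int     initialized;  /* was: static int initialized;     */
};

static struct foo_globals foo_g;

/* Re-establish the initial values the C startup code would have given
   the statics, so foo_main() can be invoked repeatedly in one process. */
static void foo_globals_init(struct foo_globals *g)
{
    g->buf = NULL;
    g->offset_count = 0;
    g->initialized = 0;
}

int foo_main(int argc, char **argv)
{
    foo_globals_init(&foo_g);
    /* ... former program body, now referring to foo_g.buf and friends ... */
    (void)argc; (void)argv;
    return 0;
}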

A much less portable solution, though seemingly easy, would be to get
help from a scriptable linker like GNU ld. The static objects in
program foo could be put into custom .foo.bss and .foo.data sections. On
program startup, we stash a copy of .foo.data somewhere. Prior to each
call to foo_main, we reinitialize the .foo.data section with the
stashed copy, and clear .foo.bss to all zero bits.
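
In rough, untested form it could be done like this. Note that GNU ld
only generates the __start_/__stop_ boundary symbols automatically for
sections whose names are valid C identifiers, hence the dot-free section
name in this sketch:

#include <stdlib.h>
#include <string.h>

/* Tag foo's statics into a custom section (GCC-specific attribute). */
static int foo_counter __attribute__((section("foo_data"))) = 42;

int foo_get_counter(void) { return foo_counter; }

/* Section boundary symbols supplied by GNU ld. */
extern char __start_foo_data[], __stop_foo_data[];

static char  *foo_data_stash;
static size_t foo_data_size;

void foo_stash_init(void)          /* call once, at program startup */
{
    foo_data_size  = (size_t)(__stop_foo_data - __start_foo_data);
    foo_data_stash = malloc(foo_data_size);
    memcpy(foo_data_stash, __start_foo_data, foo_data_size);
}

void foo_reset(void)               /* call before each foo_main() invocation */
{
    memcpy(__start_foo_data, foo_data_stash, foo_data_size);
    /* a zero-initialized "foo_bss" section would be handled the same
       way, except it is simply memset() to zero instead of restored */
}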
 

Kaz Kylheku

All languages with an automatic garbage collector. There are even
languages which implement that "tag stack" I mentioned earlier
directly, like C# with its 'using' directive.

You're confusing ``storage-managed language'' with ``language
which requires all resources to be released when the program
terminates''.

I can think of several storage-managed languages which neglect this
requirement (/and/ do not provide any way for the program to
implement its own cleanup!).
TCP/IP stack, graphics display and DMA transfers are functionality.
What you call "resource management" prevents the system from crashing
just because some programmers write wrong code. That's not
functionality, it's fault tolerance.

Yet you advocate garbage collection. Garbage collection can also be
regarded as not storage management, but fault tolerance, which
prevents the system from crashing in the face of ``bad'' programs.

Process cleanup is a coarse-grained form of garbage collection.
A process has certain private resources (no other process has access
to them). So when a process goes away, those resources become
garbage.

My troll detector is starting to go off at this point;
repeated contradictions like this can only be planted.
 

ImpalerCore

Well, no.

1.  It doesn't guarantee correctness.  I already provided an example where
it did not guarantee correctness, because a program could run out of memory
if it ran long enough, even though at any given point, if it exited, it
would free everything.
2.  It can take a lot more than twenty seconds to write the code to do it.
3.  It can take a lot more than one or two seconds to execute, especially
on slower hardware.

There exist cases where there are good reasons not to do it.  They may be
specialized cases, they may not be all that common...

Here's the thing.  If you want to argue that it's a good idea, and should be
assumed as a default strategy and used until someone has a concrete and
specific reason not to do it, great!  I agree totally.

When you start making blanket assertions, or calling programs which rely on
a 100% documented and guaranteed aspect of their operating environment
"incorrect", though, you undermine your position.  The net result is that not
only are people not persuaded of what you're saying, they're persuaded against
the general practice of freeing allocated resources, because they conclude
that the people advocating it are fanatics who don't understand engineering
tradeoffs.

To me, this argument advocates the practice of not explicitly freeing
allocated resources since

1. Deallocating memory at the 'free' level is slower, and gets worse
as the complexity increases.
2. It costs more to code it properly and verify.
3. Many target OSes handle this condition out of the box.

Why should using "free" be the default strategy when the OS does it
"better"? Why should I spend hours writing and testing manually
freeing resources at the end of any application, when I can get the
above benefits of letting the OS handle it? (Again, I'm not referring
to avoiding freeing resources at all within the normal run of a
program, just at the end of the application.)

If you started a new project where all the target OS have this
feature, why bother explicitly freeing memory resources at all?
 

Kaz Kylheku

Does preemptively insulting people for disagreeing with you usually work
in your culture? In mine it's often taken to be a sign of lacking persuasive
arguments.

Note that ``about this topic'' means either side. This is a
clear ``I am trolling'' signal for the astute readers.
 

Seebs

To me, this argument advocates the practice of not explicitly freeing
allocated resources since

It is intended to advocate the practice of *thinking* about it.
Why should using "free" be the default strategy when the OS does it
"better"?

1. It doesn't. For anything which happens more than once during a run,
a memory leak can mean that your program fails from running out of memory
before the OS cleans it up.
2. It is *in general* a good way to make sure you've correctly understood
your design and don't have any fundamental memory-management bugs.
Why should I spend hours writing and testing code that
manually frees resources at the end of an application, when I can get
the above benefits by letting the OS handle it? (Again, I'm not referring
to avoiding freeing resources at all within the normal run of a
program, just at the end of the application.)

Consistency. If I free things when I'm done with them regardless of
whether I'm done with them for this loop or for this program, my code
will be more consistent and easier to follow.
If you started a new project where all the target OS have this
feature, why bother explicitly freeing memory resources at all?

Because the vast majority of my allocations are in processes which occur
repeatedly.

-s
 

Seebs

Note that ``about this topic'' means either side. This is a
clear ``I am trolling'' signal for the astute readers.

That could be. With the .de address, I was also thinking it might be that
the choice of preposition wasn't quite idiomatic, since that happens a lot
between German and English.

-s
 

Keith Thompson

Kaz Kylheku said:
Because this is a singleton pattern; each call to this function
is expected to yield the same object.

I had assumed that calloc() would be called each time
getGlobalFoozleBuffer() is called, and that the above was equivalent
to:

char *getGlobalFoozleBuffer()
{
    return calloc(147, 13);
}

But in fact the above isn't even legal in C. The initializer for an
object with static storage duration must be constant (C99 6.7.8p4).

<OT>The rules are different in C++.</OT>
 

Ike Naar

Unless you're very careful, this is an anti-pattern,
and will cause difficult-to-diagnose errors for this
common singleton pattern:

char *getGlobalFoozleBuffer()
{
    static char *buf = calloc(147, 13);
    return buf;
}

This is also another example of a sort of memory allocation
which is not, and need not be, freed by the program before
it exits.

Your code fragment leaks memory if getGlobalFoozleBuffer() is called
more than once; I think this makes more sense:

char *getGlobalFoozleBuffer()
{
    static char *buf; /* initially NULL */
    if (buf == NULL)
        buf = calloc(147, 13);
    return buf;
}
 

ImpalerCore

It is intended to advocate the practice of *thinking* about it.

Well, you've been successful in that aspect.
1.  It doesn't.  For anything which happens more than once during a run,
a memory leak can mean that your program fails from running out of memory
before the OS cleans it up.
2.  It is *in general* a good way to make sure you've correctly understood
your design and don't have any fundamental memory-management bugs.

I agree, and I've spent a lot of time trying to verify memory
management correctness, and I still run into little bugs. I just
started testing my library code with an allocator that fails and
returns NULL on some nth iteration, and it has led to several seg
faults and required another rethink of my implementation.
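
The allocator itself is nothing fancy; a stripped-down sketch of the
idea (the names are illustrative, not my actual library code):

#include <stdlib.h>

static unsigned long alloc_count;  /* allocations seen so far            */
static unsigned long fail_at;      /* fail the nth allocation, 0 = never */

/* Arrange for the nth allocation after this call to return NULL. */
void test_alloc_fail_at(unsigned long n)
{
    alloc_count = 0;
    fail_at = n;
}

/* Drop-in replacement the library under test calls instead of malloc(). */
void *test_malloc(size_t size)
{
    if (fail_at != 0 && ++alloc_count == fail_at)
        return NULL;               /* simulated out-of-memory condition */
    return malloc(size);
}

The test harness then just walks n upward, re-running the same scenario
until it completes without a forced failure, and checks that nothing
crashes or leaks along the way.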
Consistency.  If I free things when I'm done with them regardless of
whether I'm done with them for this loop or for this program, my code
will be more consistent and easier to follow.


Because the vast majority of my allocations are in processes which occur
repeatedly.

Do you agree, as a rule of thumb, that a developer should
implement manual freeing in an application to verify correctness, and
if some metric isn't met (shutdown time, lifetime issues), shortcut the
system and let the OS release the resources?

From the tone of some of the other posters, I (probably incorrectly)
perceived that they think that implementing a "manual free" version is
a waste of time given an application with enough complexity.

Best regards,
John D.
 

Seebs

Do you agree, as a rule of thumb, that a developer should
implement manual freeing in an application to verify correctness, and
if some metric isn't met (shutdown time, lifetime issues), shortcut the
system and let the OS release the resources?

As a rule of thumb, yes.
From the tone of some of the other posters, I (probably incorrectly)
perceived that they think that implementing a "manual free" version is
a waste of time given an application with enough complexity.

It might be. There might be cases where I'd look at it and conclude that
it wasn't appropriate, and there are cases where an API makes freeing
impossible. For instance, the POSIX getenv/setenv interface has a design such that it
MUST leak memory in at least some cases. Similarly, argument lists allocated
before calling exec() can't be freed.

Freeing everything you allocate is a very good rule of thumb, but not a very
good bit of absolute dogmatism. Similarly, it's a great idea to write
portably *when possible*, but if your task is inherently non-portable, it may
not make any sense to try to make it portable. I usually find that writing
clean and portable code for as much of it as possible is still rewarding,
but "portable" can be a bit abstract. A function to manipulate the contents
of a "struct stat" really has no reason to try to be portable outside of
Unix-like environments.

-s
 

Richard Tobin

What modern languages would those be? I suspect you could not name one
such that you are not wrong.
All languages with an automatic garbage collector.

I have used many implementations of such languages that did not
perform a garbage collection before exiting. Objects that happened to
be live when the program exited were therefore not freed within the
program.

-- Richard
 

Nick Keighley

What the hell is wrong with freeing resources?  It takes twenty seconds
to write the code to do it.  The code itself uses at most one or two
seconds to execute and guarantees correctness.

it only "guarantees correctness" because you defined correctness as
"freeing all resources at the end of a program".
We all know how well most commercial programs work.  If that is your
"enconomically relevant, balanced" portability, then commercial
application development is in a bad state.

Serious programmers won't even argue about this topic.  

so you, me, Seebs and Kaz aren't serious programmers?


<snip>
 

Beej Jorgensen

Have you never seen Windows 3.1 or DOS? Both had C implementations where
malloc() could reserve memory that lived past program termination.

I only used Turbo C on MSDOS, but as far as I know, all malloc()d memory
was reclaimed unless you exited via the keep() call, which was basically
"terminate and stay resident". You'd tell keep() how much memory (in
16-byte "paragraphs") needed to stay resident, including the code
size... keep() was unaware of which memory had been malloc()d or was on
the stack or was in the code segment or anything.

So it wasn't so much a function of malloc(), as it was telling the OS
what you wanted it to do after the program terminated. Though I'm not
certain, I believe keep() used the MSDOS memory allocation calls to
reserve the memory from the OS.

FWIW,
-Beej
 
