Experiment: functional concepts in C

  • Thread starter Ertugrul Söylemez

jacob navia

Stefan Ram wrote:
To me, a lack of automatic memory management makes
functional programming difficult in C.

snip...

In the end, one might invest more time in thinking about
memory management than about the actual programming problem.

So, I believe, this is the real problem, when one tries
to do functional programming in C.

It depends on which "C" you are using.

lcc-win provides a garbage collector in its standard distribution.
All the problems above are solved with a GC.
 

jacob navia

Stefan Ram wrote:
Yes, the whole point of my post is that C usually does
/not have/ a GC. So I infer that functional programming,
therefore, is difficult or nearly impossible (beyond a
certain level of complexity) in C.

I have been proposing a garbage collector for C since 2004. The
garbage collector is a part of the standard lcc-win distribution.

The same collector is available for most workstation environments.

jacob
 

jacob navia

Stefan Ram wrote:
I thought about something along the lines of
ISO/IEC 9899:1999 (E).
The gc is a library, like a graphics library or a network library.

Are you implying that you can't do network programming in C because "network" doesn't appear in the
standard?

The standard doesn't speak a word about graphics either. You can't do graphics in C then?

Your viewpoint of limiting C to what is explicitly said in the standard is just another way of the old:

"You can't do anything serious in C"

and then (implicitly)

"Use C++" or whatever
 

ld

  Yes, the whole point of my post is that C usually does
  /not have/ a GC. So I infer that functional programming,
  therefore, is difficult or nearly impossible (beyond a
  certain level of complexity) in C.

I answered this kind of question recently. You don't
need a GC, just a richer set of ownership rules. In the C Object
System (a 100% C library) I use new, delete and autoDelete (delayed delete)
for object creation/destruction, and retain, release and autoRelease
(delayed release) for object ownership. Creation and ownership are kept
separate so that existing code can be converted, after the fact and in
either direction, to something semi-GC in which every new does an
autoDelete and the user needs no delete (obviously this rule does not
apply to the retain/release life cycle). Since I adopted these rules
with _local_ balance (new <-> delete/autoDelete, retain <->
release/autoRelease), I no longer have memory leaks.
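
To make the retain/release/autoRelease idea concrete, here is a minimal sketch in plain C. It is not the COS API: every name below (Obj, Pool, obj_new, obj_retain, obj_release, obj_autorelease, pool_drain, make_temp, live_objects) is hypothetical and chosen only for illustration. The sketch assumes a fixed-size pool and a single thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object with an embedded reference count (not COS). */
typedef struct Obj {
    int refs;     /* starts at 1: the creator owns one reference */
    int payload;
} Obj;

int live_objects = 0;  /* bookkeeping so that leaks are observable */

Obj *obj_new(int payload) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refs = 1;
    o->payload = payload;
    live_objects++;
    return o;
}

/* Take an additional reference. */
Obj *obj_retain(Obj *o) {
    o->refs++;
    return o;
}

/* Drop one reference; the object is freed when the last one goes. */
void obj_release(Obj *o) {
    if (--o->refs == 0) {
        live_objects--;
        free(o);
    }
}

/* A pool of releases that have been postponed until pool_drain(). */
enum { POOL_MAX = 64 };
typedef struct Pool {
    Obj *pending[POOL_MAX];
    int count;
} Pool;

/* Delayed release: the pool now owes one obj_release() for o. */
Obj *obj_autorelease(Pool *p, Obj *o) {
    assert(p->count < POOL_MAX);
    p->pending[p->count++] = o;
    return o;
}

/* Perform every postponed release. */
void pool_drain(Pool *p) {
    while (p->count > 0)
        obj_release(p->pending[--p->count]);
}

/* The new is balanced locally by the autorelease, so callers of
   make_temp() owe nothing unless they retain the result. */
Obj *make_temp(Pool *p, int value) {
    return obj_autorelease(p, obj_new(value));
}
```

A caller that wants to keep a temporary beyond the next pool_drain() calls obj_retain() on it and later balances that with its own obj_release(); everything else disappears when the pool is drained.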

Once you have ownership, you need a few more concepts to be able to
program in C as you would in Lisp (and beyond). Another (big) step is
the implementation of generic closures (higher-order functions and
higher-order messages), and there you need the full expressive power of COS...

cheers,

ld.
 

Stefan Ram

jacob navia said:
Are you implying that you can't do network programming in C
because "network" doesn't appear in the standard?

Given only an implementation of ISO/IEC 9899:1999 (E),
you can not do this. I wrote something about this with C++
in mind, but it applies to C as well:

http://www.purl.org/stefan_ram/pub/c++_standard_extensions_en

"Use C++" or whatever

The problems apply to C++ as well, but not to Java, because
Java has a GC and a network library /in its standard J2EE/.
 

Stefan Ram

ld said:
I have already answered to this kind of question recently.
You don't need a GC, just a richer set of ownership rules.

Yes, please give me the URI of this set!
 

Keith Thompson

jacob navia said:
Your viewpoint of limiting C to what is explicitly said in the
standard is just another way of the old:

"You can't do anything serious in C"

and then (implicitly)

"Use C++" or whatever

You wouldn't be waging a campaign against C99, would you?
 

ld

  Yes, please give me the URI of this set!

I mentioned the list of messages in my post. You can find the semantic
explanation in the paper

http://cos.cvs.sourceforge.net/viewvc/cos/doc/cos-draft-dls09.pdf

or in the slides

http://cos.cvs.sourceforge.net/viewvc/cos/doc/slides-cos.pdf

You can also read the documentation on ownership and memory management
in the Apple dev center, since Cocoa has used autoRelease pools for more
than a decade (and still does, despite the introduction of GC with
Objective-C 2.0).

The code can be browsed in the CVS repository (things in CosBase are
stable, but those in CosStd and CosExt are unstable).

cheers,

ld.
 

Paul Rubin

Yes, the whole point of my post is that C usually does
/not have/ a GC. So I infer that functional programming,
therefore, is difficult or nearly impossible (beyond a
certain level of complexity) in C.

The functional programming would only work on a certain class of objects
within the C program, that are collected by an internal gc. Sometimes
what this leads to is called "Greenspun's Tenth Law" ;-).

There are also so-called conservative gc's for C and C++ (that treat all
data words as possible heap pointers) but at least as a matter of
principle, those can be considered unsatisfying.

Anyway I came late into this thread, but I don't think anyone was
proposing to embed Haskell as cpp macros or anything like that ;-). The
idea is just that it's possible for one's C programs to be influenced by
functional style, and to implement some nontrivial functional constructs
through a few coding disciplines supported by some utility functions.
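
As one possible illustration of "coding disciplines supported by some utility functions", the sketch below shows map and fold over int arrays using ordinary function pointers. The names map_int, fold_int, square and add are hypothetical and not taken from any code posted in this thread; the functions are pure in the sense that they touch no state other than their arguments.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*unary_fn)(int);
typedef int (*binary_fn)(int, int);

/* map: out[i] = f(in[i]) for i in [0, n); in and out may be the same array. */
void map_int(unary_fn f, const int *in, int *out, size_t n) {
    assert(f != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = f(in[i]);
}

/* left fold: f(...f(f(init, xs[0]), xs[1])..., xs[n-1]) */
int fold_int(binary_fn f, int init, const int *xs, size_t n) {
    assert(f != NULL);
    int acc = init;
    for (size_t i = 0; i < n; i++)
        acc = f(acc, xs[i]);
    return acc;
}

/* Small pure functions to pass around. */
int square(int x) { return x * x; }
int add(int a, int b) { return a + b; }
```

Because the caller supplies the output buffer, no allocation or ownership question arises here; that stops being true as soon as the functions passed around need to carry captured data.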
 

Stefan Ram

ld said:

This seems to be a library that either uses garbage collection
or user aided reference counting.

I was hoping for ownership rules for plain C objects or
linked C data structures.

C does not have a garbage collector.

A reference counter is an additional subobject that must be
added to all objects to be managed in this way.

This URI lists the rules for reference counting with COM (using
"AddRef" and "Release"):

http://en.wikipedia.org/wiki/Component_Object_Model#Reference_counting

And this URI lists some of the problems with reference counting:

http://en.wikipedia.org/wiki/Component_Object_Model#Reference_counting_2

Another URI:

"Bugs caused by incorrect reference counting in COM
systems are notoriously hard to resolve"

http://en.wikipedia.org/wiki/Reference_counting#COM

I believe I have read that problems with reference counting
are one reason why COM no longer gets much support from
Microsoft, even though many major Windows applications
still use COM. I cannot find a direct source for this,
but the following points in that direction:

»The messiness of COM (Component Object Model) is
removed. (...) .NET addresses many of the shortfalls of
COM, including (...) reference counting (...)«

»The developer controls the lifetime of a COM object.
AddRef and Release live in infamy. They are methods of
the IUnknown interface and control the persistence of
the COM object. If implemented incorrectly, a COM object
could be released prematurely or, conversely, never be
released and be the source of memory leaks. The .NET
memory manager, appropriately named the Garbage
Collector, is responsible for managing component
lifetimes for the application - no more AddRef and Release.«

http://www.codeguru.com/csharp/sample_chapter/article.php/c8245/
 

BGB / cr88192

Paul Rubin said:
The functional programming would only work on a certain class of objects
within the C program, that are collected by an internal gc. Sometimes
what this leads to is called "Greenspun's Tenth Law" ;-).

There are also so-called conservative gc's for C and C++ (that treat all
data words as possible heap pointers) but at least as a matter of
principle, those can be considered unsatisfying.

yeah.
they can tend to be slow and unreliable, and can be rather ill-behaved for
certain usage patterns...

ideally, it would be preferable to avoid a conservative GC (and instead use
precise GC), but alas, for C, this leads to far too much hassle/overhead
(having to mess with registering and unregistering roots, ...).

the JVM seems to use an alternate strategy for JNI:
references are "local" or "global", where local references have a
constrained lifespan (and may be destroyed after the JNI method returns).


for C, I have ended up developing a sort of pseudo-GC strategy (used mostly
in my compiler internals), where the allocator just sort of naively
allocates from sliding buffers, which may be destroyed/reused when done.

any data to be preserved is then manually copied elsewhere (although, more
often, it is just discarded, with the result having been "transcribed" into
a different form).
if made automatic, I guess this would be a sort of crude copy-collector.
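
A minimal sketch of the "sliding buffer" idea as described above, assuming a single fixed-size buffer and a hand-written, type-specific copy function. This is not BGB's actual allocator; all names (Arena, arena_init, arena_alloc, arena_reset, arena_destroy, Node, arena_push, list_copy_out, list_free) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Rounding offsets to a multiple of this union's size keeps them aligned
   for the usual scalar types (an assumption of this sketch). */
typedef union { long long ll; long double ld; void *p; void (*fp)(void); } MaxAlign;

typedef struct Arena {
    unsigned char *base;
    size_t cap;
    size_t used;
} Arena;

/* Returns nonzero on success. */
int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->base != NULL;
}

/* Bump allocation: no per-object free, NULL when the buffer is exhausted. */
void *arena_alloc(Arena *a, size_t n) {
    size_t unit = sizeof(MaxAlign);
    size_t start = (a->used + unit - 1) / unit * unit;
    if (start > a->cap || n > a->cap - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}

/* Everything allocated so far becomes garbage at once. */
void arena_reset(Arena *a) { a->used = 0; }

void arena_destroy(Arena *a) { free(a->base); a->base = NULL; a->cap = a->used = 0; }

typedef struct Node { int value; struct Node *next; } Node;

/* Prepend a node allocated in the arena; returns NULL if the arena is full. */
Node *arena_push(Arena *a, Node *head, int value) {
    Node *n = arena_alloc(a, sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = head;
    return n;
}

/* Type-specific "copy out": deep-copies a list into malloc'd memory
   so that it survives arena_reset(). */
Node *list_copy_out(const Node *src) {
    Node *head = NULL, **tail = &head;
    for (; src != NULL; src = src->next) {
        Node *n = malloc(sizeof *n);
        assert(n != NULL);
        n->value = src->value;
        n->next = NULL;
        *tail = n;
        tail = &n->next;
    }
    return head;
}

void list_free(Node *n) {
    while (n != NULL) {
        Node *next = n->next;
        free(n);
        n = next;
    }
}
```

The burst of short-lived objects costs one pointer bump each and is discarded by a single arena_reset(); only the data passed through list_copy_out() has to be tracked and freed individually.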

OTOH, this could be made into a "semi-automatic" API, whereby one could
instruct the system to copy the data, but it goes about the actual logistics
of doing so (likely with the help of user-registered "copy" handlers). (in
the "manual" strategy, usually type-specific data-copying functions are
written for any data needing to be copied...).

but, anyways, the main advantage of this strategy is that it does well for
the usage patterns which seem to pop up in my compiler: producing large
volumes of objects in short bursts, which almost all turn into garbage.

this particular use case tends to kill my main GC (which tends to work
best with a gradual accumulation of garbage at a relatively low rate).

Anyway I came late into this thread, but I don't think anyone was
proposing to embed Haskell as cpp macros or anything like that ;-). The
idea is just that it's possible for one's C programs to be influenced by
functional style, and to implement some nontrivial functional constructs
through a few coding disciplines supported by some utility functions.

yeah.
Haskell mixed in with C would be nasty...

well, of note:
I do have support for a "closure" facility in my case (via an API call).

however, I haven't really exactly used it much (except in a few edge cases
where it was needed), and generally it would require manually "capturing"
the bindings in the form of a heap-allocated struct (or similar).

so, in general, it has neither the grace nor the elegance of the FP
analogue.

it tends to be actually more usable just to pass an object with the function
embedded as a function pointer (even though you can't simply call this as
if it were a function pointer). this actually tends to be a lot more
convenient, as well as being faster (via lack of crufty "transfer thunks")
and typically generating less garbage...
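
The "object with the function embedded as a function pointer" approach might look like the sketch below. It is not BGB's API: the names Closure, make_adder, adder_body, closure_call and apply_each are hypothetical, and the captured environment is reduced to a single int to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Closure Closure;

/* The code pointer and the captured bindings travel together. */
struct Closure {
    int (*call)(const Closure *self, int arg);
    int captured;  /* the manually "captured" binding */
};

/* Body of the closure: reads its environment through self. */
int adder_body(const Closure *self, int arg) {
    return self->captured + arg;
}

/* Heap-allocates the closure; the caller owns it and must free() it. */
Closure *make_adder(int n) {
    Closure *c = malloc(sizeof *c);
    assert(c != NULL);
    c->call = adder_body;
    c->captured = n;
    return c;
}

/* Calling needs the object itself, so it cannot be invoked like a bare
   function pointer. */
int closure_call(const Closure *c, int arg) {
    return c->call(c, arg);
}

/* A higher-order utility that accepts such an object. */
void apply_each(const Closure *c, int *xs, size_t n) {
    for (size_t i = 0; i < n; i++)
        xs[i] = closure_call(c, xs[i]);
}
```

No thunk is generated at run time, at the cost that every API accepting a callback has to take the object rather than a plain int (*)(int).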
 

jacob navia

Stefan Ram wrote:
C does not have a garbage collector.

Many things are left for system libraries, like graphics, network,
garbage collectors, and many other things like (for instance)
directory access, thread management, multiprocessing etc.

All of this can be done in C if you do not arbitrarily restrict
the language to the minimal subset as you are trying to do here.

I have promoted and distributed a garbage collector for C for
several years. People like you (and the other "regs") have always
fought this, so your attitude is not surprising.

What is surprising is that now you refuse even to acknowledge that
a widely distributed implementation of gc exists, and it is used.

Your argument that "it is not part of the text of the standard"
is just ridiculous. The "C" of the standard is not even able to
use a directory effectively and there is NO serious software in C
that uses that minimal subset.

With your attitude there is no software written in C.
 

jacob navia

Stefan Ram wrote:
http://en.wikipedia.org/wiki/Reference_counting#COM

I believe I have read that problems with reference counting
are one reason why COM no longer gets much support from
Microsoft, even though many major Windows applications
still use COM. I cannot find a direct source for this,
but the following points in that direction:

»The messiness of COM (Component Object Model) is
removed. (...) .NET addresses many of the shortfalls of
COM, including (...) reference counting (...)«

»The developer controls the lifetime of a COM object.
AddRef and Release live in infamy. They are methods of
the IUnknown interface and control the persistence of
the COM object. If implemented incorrectly, a COM object
could be released prematurely or, conversely, never be
released and be the source of memory leaks. The .NET
memory manager, appropriately named the Garbage
Collector, is responsible for managing component
lifetimes for the application - no more AddRef and Release.«

http://www.codeguru.com/csharp/sample_chapter/article.php/c8245/

That is why lcc-win provides a garbage collector, because it is the only
sensible solution.

Obviously your agenda is just to demonstrate that C is unable to do
anything. Granted. Go ahead. But you aren't confusing anyone.
 

gwowen

That is why lcc-win provides a garbage collector, because it is the only
sensible solution.

And what language do you write your garbage collector in? ;)

Garbage collectors clearly have their place, and an (optional) one
would be a boon to any language that doesn't have one built in. But
there are perfectly good alternatives, besides manual reference
counting. The fact that COM screwed up reference counting, doesn't
mean that reference counting is a non-starter. In many circumstances
a smart pointer is preferable to GC, as it gives a programmer greater
control over deterministic destruction (RAII works a lot better on
managing mutexes than GC, for example -- lock()/unlock() cares about
timing in a way malloc()/free() doesn't). In many circumstances, GC is
preferable to smart pointers, as the programmer doesn't care about
those things.

As ever: There is no silver bullet.
 

jacob navia

gwowen wrote:
And what language do you write your garbage collector in? ;)

Boehm's GC is written in plain C. Yes, you can write a GC in C. There
are a few bits and pieces done in assembly, but for the most part
everything is in C.

Garbage collectors clearly have their place, and an (optional) one
would be a boon to any language that doesn't have one built in.

True.

But
there are perfectly good alternatives, besides manual reference
counting. The fact that COM screwed up reference counting, doesn't
mean that reference counting is a non-starter.

COM did not "screw it". Reference counting has its drawbacks, that's all.

In many circumstances
a smart pointer is preferable to GC, as it gives a programmer greater
control over deterministic destruction (RAII works a lot better on
managing mutexes than GC, for example -- lock()/unlock() cares about
timing in a way malloc()/free() doesn't).

That could be done in C if we had operator overloading, something I have
been trying to promote for years.

In many circumstances, GC is
preferable to smart pointers, as the programmer doesn't care about
those things.

Yes.

As ever: There is no silver bullet.

True. GC has its drawbacks too. For instance if you forget a reference to some
huge object you create an enormous memory leak since the collector can't
free it. Finding out where that reference is can be an incredibly complex
task.

Note that that is a problem for ALL GC's, java, lisp whatever.
 

Nick Keighley

gwowen wrote:


True. GC has its drawbacks too. For instance if you forget a reference to some
huge object you create an enormous memory leak since the collector can't
free it. Finding out where that reference is can be an incredibly complex
task.

Note that that is a problem for ALL GC's, java, lisp whatever.

it has to be, otherwise the garbage collector would have to read your
mind. "ah, I see he has a reference to GiganticDataStructure that he
meant to get rid of. I'll just null it out and re-run gc".
 

gwowen

COM did not "screw it". Reference counting has its drawbacks, that's all.

The IUnknown AddRef() and Release() interface is incredibly brittle
and error prone. Making a programmer manage all the references by
hand is a bad idea. AddRef/Release is to reference counting as
malloc()/free() is to memory. It's not consistent to repeatedly recommend
GC for memory, without spotting that the IUnknown is prone to the same
problem.

As the background mechanism, reference counting is fine -- just like
heap allocation is fine. It is forcing the programmer to manage it by
hand that can frequently lead to errors. It's actually probably worse
for interfaces, because most programmers have an idea who the "owner"
of some memory is, which, in my experience, is not necessarily the
case with Interfaces. Also, leaked memory is easily cleaned up at
process exit time -- that's also not necessarily the case for
non-memory resources.
 

Rob Kendrick

Your argument that "it is not part of the text of the standard"
is just ridiculous. The "C" of the standard is not even able to
use a directory effectively and there is NO serious software in C
that uses that minimal subset.

You've heard of Lua, yeah? http://www.lua.org/ It's a very serious
piece of software; one of the fastest scripting languages around. And
written entirely in ANSI C with only a handful of concessions (such as
dlopen to allow for extensions).

It's quite easy to write great, reusable, reliable software components
in ANSI C without reaching for OS-specific extensions.

B.
 

Michael Foukarakis

gwowen wrote:



Boehm's GC is written in plain C. Yes, you can write a GC in C. There
are a few bits and pieces done in assembly, but for the most part
everything is in C.


COM did not "screw it". Reference counting has its drawbacks, that's all.

It's not in any way convenient for the programmer. The fact that all
parties involved have to consistently count references is as error
prone as simply enforcing the proper use of malloc() & free() and
equivalents, and it has led to notoriously hard to find bugs. Even
Microsoft admitted that COM did, in fact, "screw it" which is why they
favoured tracing GC in .NET, so there's no point in trying to defend
its shortcomings. My personal belief is that *manual* reference
counting is still a bad idea - why do it when you can completely
decouple it from the programmer by providing him with a decent memory
manager? That way, everybody wins. And if you can't do that, just let
the poor guy manage malloc() & free(), why burden him/her with a
crippled interface?

That could be done in C if we had operator overloading, something I have
been trying to promote for years.

Operator overloading and lots of preprocessor tricks, correct?

True. GC has its drawbacks too. For instance if you forget a reference to some
huge object you create an enormous memory leak since the collector can't
free it. Finding out where that reference is can be an incredibly complex
task.

This can be a problem with smart pointers as well. Typical example:
shared_ptr between an automatic pointer and a member of a singleton
(seen it a thousand times..).
 
