Functions and functional programming.

Malcolm McLean · Jun 4, 2014

nobody in their right mind would prefer this interface [pass an error
flag in as a pointer] when the function returns nothing.

If you've got a convention that error shall always be a pointer and
shall always be the last parameter, it's best to stick to it.
The convention has some value when functions naturally return a
result, in some libraries results are returned through pointers,
purely to preserve the convention that the return type shall be
the error status.

August Karlstrom · Jun 4, 2014

[...]
- Is it okay for error to be null if I'm not interested in the error,
or must it always point to a valid location?

- Does foo always store a result in *error, or only if there is an error?
I.e. must I initialize foo to some value (say zero) and then test
for nonzero? Or can I leave it uninitialized and test it afterward?

The parameter `error' is an output parameter so it must point to a valid
location which is always written to.

nobody in their right mind would prefer this interface when the
function returns nothing.

I don't understand what you mean here.

Then there is the minor point that when you're developing foo, perhaps the
compiler has a useful diagnostic like "not all paths return a value"
or "return with no value in a function returning int".

No compiler I know of will tell you "not all paths store a value into *error".

It is common practice to initialize an output parameter like this to
either zero or one depending on whether successful or unsuccessful paths
dominate the implementation.

-- August
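A minimal sketch of the practice described above, in C. Only the signature `void foo(int x, int *error)` comes from the thread; the rule that a negative `x` is an error and the helper `foo_status` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Convention assumed here: `error` must not be null, and *error is
   written on every call (0 = success, 1 = failure). */
static void foo(int x, int *error)
{
    assert(error != NULL);
    *error = 0;            /* success paths dominate, so default to 0 */

    if (x < 0) {           /* hypothetical failure condition */
        *error = 1;
        return;
    }
    /* ... the real work would go here ... */
}

/* Hypothetical helper: the caller may leave `err` uninitialized,
   because foo stores into it on every path. */
static int foo_status(int x)
{
    int err;
    foo(x, &err);
    return err;
}
```

Setting the flag once at the top means no later path can forget to store a value, which is the point of the practice.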

Ike Naar · Jun 4, 2014

And this would add unpredictability to the program. The GC can step at any
inappropriate time, rendering real time behavior a total mess.
In real world applications there are more requirements than just computing
the correct results.

What is the difference between an application and a real world application?
Can you name an application that is not a real world application?

Ike Naar · Jun 4, 2014

No, it may not, because if there are other copies of the object,
it is not reclaimed. The ptr = nil assignment is innocent; it doesn't
take on the full responsibility of computing the lifetime of the object;
it only contributes to that information.

Also, the C compiler may cause spurious retention in cases like this:

{
var ptr = big_object();
}

computation();

here instead of ptr = nil, we ended the scope of ptr, so it doesn't exist.
(Is that also considered manual GC?)

But the C compiler may well leave a memory location on the stack which
still contains a pointer to big_object across computation.

In a garbage collected (by design, from the ground up) language, the garbage
collector has information from the compiler about where the live objects are;
it doesn't have to just scan every location in the stack conservatively. Or
else, if the collector does just scan the stack, the compiler generates code
which tidies up dead references.

If the 'ptr=NIL;' assignment is not meant to be a manual GC request,
then what would be the reason to perform the assignment at all?
Remember we're talking about an assignment that might be optimized
away (see elsethread), which gives the impression that it apparently
serves no other purpose than to trigger garbage collection.

Jorgen Grahn · Jun 4, 2014

On 2014-06-04 16:46, Malcolm McLean wrote:
void foo(int x, int *error);

[...]
- Is it okay for error to be null if I'm not interested in the error,
or must it always point to a valid location?

- Does foo always store a result in *error, or only if there is an error?
I.e. must I initialize foo to some value (say zero) and then test
for nonzero? Or can I leave it uninitialized and test it afterward?

The parameter `error' is an output parameter so it must point to a valid
location which is always written to.

C doesn't have output parameters, so that would have to be a
(documented) convention. I have personally used all three variants he
lists (may be NULL, indicates error/no error, accumulates errors) but
I had to choose and then document carefully.

I don't understand what you mean here.

I think he's talking in a C context. For one thing, such a construct
is almost unheard of in C. You said it yourself upthread --
C programmers generally don't accept that principle you're following.

/Jorgen
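The three variants mentioned above could be sketched as follows. This is one reading of them, not code from the thread: the function names, the "negative input fails" rule, and `run_three` are all hypothetical. The contract lives in the comments, because the C type `int *` is identical in every case.

```c
#include <stddef.h>

/* Variant 1: `error` may be NULL if the caller does not care. */
static void foo_nullable(int x, int *error)
{
    if (error != NULL)
        *error = (x < 0);
}

/* Variant 2: `error` must be valid; it is always set to 0 or 1. */
static void foo_always(int x, int *error)
{
    *error = (x < 0);
}

/* Variant 3: `error` must be valid and initialized by the caller;
   it is set on failure and never cleared, so failures accumulate. */
static void foo_accumulate(int x, int *error)
{
    if (x < 0)
        *error = 1;
}

/* With variant 3 a caller can test once after several calls. */
static int run_three(int a, int b, int c)
{
    int err = 0;
    foo_accumulate(a, &err);
    foo_accumulate(b, &err);
    foo_accumulate(c, &err);
    return err;
}
```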

Melzzzzz · Jun 4, 2014

GC, at least any semi-competent implementation, should not be the
cause of that problem. There are a bunch of incremental or
non-blocking garbage collectors, more than a few in common use.

GC is problematic on large heaps; that is, it causes non-stop swapping.
That is why it is not advisable to use GC on virtual machines.
Another thing is that non-blocking GCs require much more processing
time than stop-the-world collectors. E.g. the IBM concurrent collector
requires complex access to pointers/references, that is, a membar and
some pointer chasing...

jacob navia · Jun 4, 2014

On 04/06/2014 21:35, Malcolm McLean wrote:

nobody in their right mind would prefer this interface [pass an error
flag in as a pointer] when the function returns nothing.

If you've got a convention that error shall always be a pointer and
shall always be the last parameter, it's best to stick to it.
The convention has some value when functions naturally return a
result, in some libraries results are returned through pointers,
purely to preserve the convention that the return type shall be
the error status.

In the containers library I used another convention. Error return (an
integer mostly) in the result, and parameters by pointers.

Functions that return pointers are also boolean: either it is a valid
pointer or NULL. Obviously when ANY error happens, the standard
callbacks are called, before any other action.

The library comes with default callbacks that put something in stderr
and go on.

The library user can change this as he/she wants by changing the
callbacks and putting other functions.

So:

int errorCode = Interface.functionXXX(Object *,...);

Positive: OK
Zero: OK with warnings
Negative: hard error

if (List.Append(memberslist, newSubscriptions) >= 0) {
    /* OK */
} else {
    return NULL;
}
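A rough sketch of the general shape of this convention: a status code in the return value, with a replaceable callback invoked before the error is returned. It is not the containers library's actual API; every name here (`error_handler`, `set_error_handler`, `list_append`, `struct int_list`, the status constants) is invented for illustration.

```c
#include <stddef.h>
#include <stdio.h>

enum { STATUS_OK = 1, STATUS_WARNING = 0, STATUS_FULL = -1, STATUS_BADARG = -2 };

typedef void (*error_handler)(const char *where, int code);

/* Default behaviour: report on stderr and carry on. */
static void default_handler(const char *where, int code)
{
    fprintf(stderr, "%s: error %d\n", where, code);
}

static error_handler current_handler = default_handler;

/* Installs a user handler (NULL restores the default); returns the old one. */
static error_handler set_error_handler(error_handler h)
{
    error_handler old = current_handler;
    current_handler = (h != NULL) ? h : default_handler;
    return old;
}

#define LIST_CAPACITY 4
struct int_list { int items[LIST_CAPACITY]; size_t count; };

/* Returns > 0 on success, 0 for success with a warning (unused in this
   sketch), < 0 on a hard error. The handler runs before the error returns. */
static int list_append(struct int_list *list, int value)
{
    if (list == NULL) {
        current_handler("list_append", STATUS_BADARG);
        return STATUS_BADARG;
    }
    if (list->count == LIST_CAPACITY) {
        current_handler("list_append", STATUS_FULL);
        return STATUS_FULL;
    }
    list->items[list->count++] = value;
    return STATUS_OK;
}

/* Example of a user-supplied replacement: count errors silently. */
static int handled_errors = 0;
static void counting_handler(const char *where, int code)
{
    (void)where;
    (void)code;
    handled_errors++;
}
```

A caller then tests `list_append(&list, v) >= 0`, as in the snippet above, and the handler decides what else happens on failure.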

To do:

Extensible. You should be able to add your own error types to the table.
I think that a utility could inspect the executable between two UUIDs,
modifying parameters in the style:

What do you want to do

when there is no more memory?
When a bad pointer is discovered?
Array bounds exceeded?
Other internal inconsistencies?

Do you want to log all errors?
If yes where (path)?

Binary layout would be:

static uuid start = { };
/* Modifiable parameters */
size_t size;
char strict=0; // Go on by default
.... // Other parameters
static uuid end = { };

The size data field means the number of bytes to skip to find the end uuid.

A simple utility scans the executable for the start uuid, reads the size
parameter, skips that number of bytes and verifies that the end uuid is
also there.

Now, the default callback of the no memory error would look at the value
of strict. If non zero it will exit the program. Other callbacks may or
may not follow that convention.
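The scan described above might look roughly like this. It is a sketch under assumptions: the two 16-byte markers are placeholders rather than real UUIDs, the size field is taken to count the parameter bytes that follow it, and the "executable" is just a byte buffer already read into memory. `write_block` and `find_params` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define MARKER_LEN 16

/* Hypothetical 16-byte markers standing in for the two UUIDs. */
static const unsigned char START_MARK[MARKER_LEN] = "CFG-START-MARK#";
static const unsigned char END_MARK[MARKER_LEN]   = "CFG-END-MARKER#";

/* Writes start marker, size, parameter bytes and end marker to dst.
   Returns the number of bytes written. */
static size_t write_block(unsigned char *dst,
                          const unsigned char *params, size_t size)
{
    unsigned char *p = dst;
    memcpy(p, START_MARK, MARKER_LEN);  p += MARKER_LEN;
    memcpy(p, &size, sizeof size);      p += sizeof size;
    memcpy(p, params, size);            p += size;
    memcpy(p, END_MARK, MARKER_LEN);    p += MARKER_LEN;
    return (size_t)(p - dst);
}

/* Scans image for the start marker, reads the size, skips that many
   bytes and checks that the end marker follows. Returns the offset of
   the first parameter byte, or -1 if no well-formed block is found. */
static long find_params(const unsigned char *image, size_t len,
                        size_t *param_size)
{
    const size_t header = MARKER_LEN + sizeof(size_t);

    for (size_t i = 0; i + header + MARKER_LEN <= len; i++) {
        size_t size, params;

        if (memcmp(image + i, START_MARK, MARKER_LEN) != 0)
            continue;
        memcpy(&size, image + i + MARKER_LEN, sizeof size);
        params = i + header;
        if (size > len - params - MARKER_LEN)
            continue;                   /* would run past the image */
        if (memcmp(image + params + size, END_MARK, MARKER_LEN) == 0) {
            *param_size = size;
            return (long)params;
        }
    }
    return -1;
}
```

A patching utility would then overwrite bytes in the returned range, for example the one holding `strict`, and write the image back.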

This gives great flexibility to classify errors. For instance, some
memory-hog program crashes but the main program is not affected; it is an
optional functionality, available only when more memory is there.

Richard Bos · Jun 4, 2014

August Karlstrom said:
On 2014-06-04 16:46, Malcolm McLean wrote:
void foo(int x, int *error);

[...]
- Is it okay for error to be null if I'm not interested in the error,
or must it always point to a valid location?

- Does foo always store a result in *error, or only if there is an error?
I.e. must I initialize foo to some value (say zero) and then test
for nonzero? Or can I leave it uninitialized and test it afterward?

The parameter `error' is an output parameter so it must point to a valid
location which is always written to.

There is no such thing as an "output parameter" in C. There are only
pointers. Pointers can be null. It is the function's documentation's job
to declare whether this pointer _may_ also be null, or not. This is a
very different case from real output parameters, as used in, e.g.,
Pascal, which _cannot_ be null.
And yes, there are real reasons to have a function where an *error
pointer _may_ be null. And yes, there are real examples of functions
where this is the case. strtol() does something very similar.

Richard
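For reference, a short sketch of the strtol() case. Its second argument, `endptr`, is an optional output location rather than an error flag, and the standard allows it to be a null pointer. The wrapper names `parse_simple` and `parse_checked` are made up for this example.

```c
#include <stdlib.h>

/* The caller does not care where parsing stopped: pass a null endptr. */
static long parse_simple(const char *s)
{
    return strtol(s, NULL, 10);
}

/* The caller does care: pass an address, and strtol stores a pointer
   to the first character it did not consume. */
static long parse_checked(const char *s, int *ok)
{
    char *end;
    long v = strtol(s, &end, 10);

    /* ok only if at least one digit was read and nothing trails it */
    *ok = (end != s && *end == '\0');
    return v;
}
```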

Kaz Kylheku · Jun 5, 2014

On 2014-06-04 16:46, Malcolm McLean wrote:
void foo(int x, int *error);

[...]
- Is it okay for error to be null if I'm not interested in the error,
or must it always point to a valid location?

- Does foo always store a result in *error, or only if there is an error?
I.e. must I initialize foo to some value (say zero) and then test
for nonzero? Or can I leave it uninitialized and test it afterward?

The parameter `error' is an output parameter so it must point to a valid
location which is always written to.

Where is it declared that it's an output parameter?

The int foo(int x) return value is obviously nothing else but an output
from the semantics of returning.

I don't understand what you mean here.

I.e. we are not using the return value for anything and have made it void, and
then are using a pointer to return the status.

We are ignoring the right tool for the job: the return mechanism.

It is common practice to initialize an output parameter like this to
either zero or one depending on whether successful or unsuccessful paths
dominate the implementation.

It's not common *compiler* practice to enforce this, compared to useful
warnings related to return values.
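To make the contrast concrete, here is a small sketch with two hypothetical functions, `foo_ret` and `foo_ptr`. If a path in `foo_ret` lacked a return statement, GCC and Clang could flag it (for example with -Wreturn-type). The omission in `foo_ptr` is deliberate and is generally not diagnosed at all.

```c
/* Status through the return value: every path has to return something,
   and a missing return is something compilers can warn about. */
static int foo_ret(int x)
{
    if (x < 0)
        return -1;
    return 0;
}

/* Status through a pointer: nothing forces every path to store. */
static void foo_ptr(int x, int *error)
{
    if (x < 0)
        *error = -1;
    /* Deliberate omission: no store when x >= 0, so the caller sees
       whatever value *error held before the call. */
}
```

Calling `foo_ptr(1, &e)` with `e` previously set to 42 leaves `e` at 42, so the stale value is indistinguishable from a status the function chose to report.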

Kaz Kylheku · Jun 5, 2014

If the 'ptr=NIL;' assignment is not meant to be a manual GC request,
then what would be the reason to perform the assignment at all?

Note that in a properly designed garbage collected language we wouldn't need
this assignment.

The garbage collector, somehow in cooperation with the compiler, already knows
that ptr is a dead variable: a variable with no next use in the data flow
graph! (For instance, the compiler might publish liveness information whereby
for a given instruction pointer, there is a table of stack offsets where
variables live in the current frame, and perhaps a bitmask of registers which
contain live values. The register or memory location containing ptr is not
listed. Or, the compiler could generate the "ptr = nil" code for dead
variables, and somehow mark it so that later optimizations don't remove it.)
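One way to picture the liveness table mentioned in the parenthesis above is the sketch below. It is purely illustrative: the struct layout, the names `stack_map_entry`, `lookup_entry` and `slot_is_live`, the addresses and the frame offsets are all invented, and real collectors use their own, more compact formats.

```c
#include <stddef.h>
#include <stdint.h>

struct stack_map_entry {
    uintptr_t pc_begin;      /* first instruction address covered */
    uintptr_t pc_end;        /* one past the last address covered */
    const int *live_slots;   /* frame offsets that hold live references */
    size_t n_slots;
    uint32_t live_regs;      /* bit i set: register i holds a live reference */
};

/* Finds the entry covering the given instruction pointer, or NULL. */
static const struct stack_map_entry *
lookup_entry(const struct stack_map_entry *map, size_t n, uintptr_t pc)
{
    for (size_t i = 0; i < n; i++)
        if (pc >= map[i].pc_begin && pc < map[i].pc_end)
            return &map[i];
    return NULL;
}

/* A collector would scan only the slots listed for the current pc. */
static int slot_is_live(const struct stack_map_entry *e, int offset)
{
    for (size_t i = 0; i < e->n_slots; i++)
        if (e->live_slots[i] == offset)
            return 1;
    return 0;
}

/* Hypothetical function: ptr sits at frame offset 16 and has no further
   use after address 0x1040, so the second range simply omits it. */
static const int before_last_use[] = { 8, 16 };
static const int after_last_use[]  = { 8 };

static const struct stack_map_entry example_map[] = {
    { 0x1000, 0x1040, before_last_use, 2, 0x0 },
    { 0x1040, 0x1080, after_last_use,  1, 0x0 },
};
```

With such a table the stale pointer left in slot 16 is never treated as a root, so no `ptr = nil` is needed.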

So, yes, this assignment is a manual "something". It's not exactly a GC
request because it doesn't necessarily end the lifetime of the object,
nor does it trigger GC. It's a manual attempt to propagate the end-of-life
of the variable to GC.

Except that it doesn't work without some additional trick due to the
optimization that is carried out without any GC integration.

Remember we're talking about an assignment that might be optimized
away (see elsethread), which gives the impression that it apparently
serves no other purpose than to trigger garbage collection.

Yes; it is written in hopes that if ptr has the last reference then
the object becomes garbage.

Kaz Kylheku · Jun 5, 2014

GC is problematic on large heaps; that is, it causes non-stop swapping.
That is why it is not advisable to use GC on virtual machines.

Without massive qualifications, this is simply nonsense.

"Virtual machine" is not the same as "virtual memory". (Why wouldn't
you use GC on a virtual machine that is fully backed by adequate RAM.)

Generational GC avoids making passes over old data that wastefully confirm that
old data that was previously reachable is still reachable.

Another thing is that non-blocking GCs require much more processing
time than stop-the-world collectors.

You don't get something for nothing.

Here, we can make the broad observation that there is a tradeoff between
achieving low latency and high throughput.

If a system's throughput is well optimized, you will hardly be able to
improve its latency without giving up some throughput, and vice versa.

Stephen Sprunk · Jun 5, 2014

Note that in a properly designed garbage collected language we
wouldn't need this assignment.

The garbage collector, somehow in cooperation with the compiler,
already knows that ptr is a dead variable: a variable with no next
use in the data flow graph!

Even an ideal garbage collector and compiler cannot always prove that an
object is never referenced again, especially if the reference is in a
static or global variable.

And current garbage collectors and compilers are far from ideal. For
instance, many garbage collectors can't free circular references; you
have to clear one of the references for the circle to be collected.

So, yes, this assignment is a manual "something". It's not
exactly a GC request because it doesn't necessarily end the lifetime
of the object, nor does it trigger GC. It's a manual attempt to
propagate the end-of-life of the variable to GC.

Exactly, plus it may also convey information to other programmers, even
if it's not useful to the GC system (or future, better GC systems).

S

August Karlstrom · Jun 5, 2014

Where is it declared that it's an output parameter?

It is semantically an output parameter in the same sense that the return
value of the previously mentioned `int foo(int x)' is semantically an
error flag.

I.e. we are not using the return value for anything and have made it void, and
then are using a pointer to return the status.

We are ignoring the right tool for the job: the return mechanism.

In my opinion the return mechanism is the right tool for the job only
when we are writing a pure (side effect free) function. To me there is
something inherently ugly about expressions with side effects. They also
make it harder to reason about program correctness.

-- August

Malcolm McLean · Jun 5, 2014

In my opinion the return mechanism is the right tool for the job only
when we are writing a pure (side effect free) function. To me there is
something inherently ugly about expressions with side effects. They also
make it harder to reason about program correctness.

if (foo(x, y, z) == 0)
    bar();
else
    bar();

Looks like it ought to be replaceable by

bar();

or, more subtly

if (foo(x, y, z) == 0 || bar(x, y, z) == 0)

looks like it ought to be replaceable with

if (bar(x, y, z) == 0 || foo(x, y, z) == 0)

Kaz Kylheku · Jun 5, 2014

Are you saying this from a C perspective? Because it's just not true
for most GC'd languages. Even the most basic mark-and-sweep
collectors will handle circular references.

Indeed, it is specifically reference counting which has that problem.

To recognize reference counting as a form of GC is to be charitably
open-minded.
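The reference-counting problem can be shown with a toy example. Everything below is invented for illustration (`struct node`, `node_new`, `node_release`, `node_set_peer`, `demo_cycle`); it is a sketch of plain reference counting, not of any particular runtime.

```c
#include <stdlib.h>

struct node {
    int refs;            /* number of references to this node */
    struct node *peer;   /* counted reference to another node, or NULL */
};

static int live_nodes = 0;

static struct node *node_new(void)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->refs = 1;         /* the creator holds the first reference */
    n->peer = NULL;
    live_nodes++;
    return n;
}

/* Drops one reference; frees the node, and then whatever it referred
   to, once a count reaches zero. */
static void node_release(struct node *n)
{
    while (n != NULL && --n->refs == 0) {
        struct node *peer = n->peer;
        free(n);
        live_nodes--;
        n = peer;
    }
}

static void node_set_peer(struct node *n, struct node *peer)
{
    if (peer != NULL)
        peer->refs++;
    node_release(n->peer);
    n->peer = peer;
}

/* Returns how many nodes survive after the caller's references are gone. */
static int demo_cycle(int clear_one_link)
{
    int before = live_nodes;
    struct node *a = node_new();
    struct node *b = node_new();
    int leaked;

    node_set_peer(a, b);
    node_set_peer(b, a);             /* a and b now form a cycle */
    if (clear_one_link)
        node_set_peer(a, NULL);      /* manual step: break the cycle */

    node_release(a);                 /* drop our own two references */
    node_release(b);

    leaked = live_nodes - before;
    if (leaked > 0) {                /* tidy up so the demo leaks nothing */
        free(a);
        free(b);
        live_nodes -= 2;
    }
    return leaked;
}
```

Without the manual clearing step both counts stay at one forever. A tracing collector would instead notice that neither node is reachable from any root.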

Rosario193 · Jun 5, 2014

if (foo(x, y, z) == 0)
    bar();
else
    bar();

Looks like it ought to be replaceable by

bar();

not if foo() changes some global variable that bar() uses...

or, more subtly

if (foo(x, y, z) == 0 || bar(x, y, z) == 0)

looks like it ought to be replaceable with

if (bar(x, y, z) == 0 || foo(x, y, z) == 0)

not if foo() changes some global variable that bar() uses...
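A small sketch of that objection. The parameters are dropped for brevity, and the bodies of `foo` and `bar`, the global `calls`, and the two wrapper functions are assumptions made up for this example.

```c
static int calls = 0;    /* global state shared by foo and bar */

/* foo has a side effect: it changes the global. */
static int foo(void)
{
    calls++;
    return 1;
}

/* bar's result depends on that global. */
static int bar(void)
{
    return calls;
}

static int foo_then_bar(void)
{
    calls = 0;
    return foo() == 0 || bar() == 0;
}

static int bar_then_foo(void)
{
    calls = 0;
    return bar() == 0 || foo() == 0;
}
```

Here `foo_then_bar()` yields 0 while `bar_then_foo()` yields 1: in the second form `bar()` runs before `foo()` has touched `calls`, and `||` then short-circuits so `foo()` is never called. The two conditions are only interchangeable when both functions are pure.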
