Contents of an exception object


Ben Bacarisse

James Harris said:
Basically, each function in the call chain would have the same
test-and-return line at selected points - usually after each call to a
function which could result in an exception - though there are
variations.

I see. So it's like functions that set errno? That does not match my
criteria for an exception mechanism (I'd call it having rules for
handling errors), but I don't want to argue over terms.

Thanks for taking the time to explain.

<snip>
 
Malcolm McLean

What about raise, longjmp, atexit, thrd_create (and friends), mtx_lock
(etc.)? Are they oddballs or simple to classify?
Parallel programming is a whole different can of worms. None of the above
functions can be written as bit-shuffling functions, so they can't
be called by bit-shuffling functions. But there shouldn't be any need for
a bit-shuffling function to use any special flow control structures.
They can always be written, both in the practical and the theoretical
sense, in terms of loops and branches. It will never call exit() unless
it runs out of memory, for example. It can never fail unless passed a
bit state for which it is undefined, in which case there's a programming
error in the caller.

Maybe "IO procedure" is the wrong term, because you could object that
"exit()" doesn't do IO. So go back to my term "procedure", which I was
scolded for.
I don't know what you are saying here. I just can't follow the words.

Is a simple yes/no answer possible for the two examples I gave? If not,
that's fine, I'd just like to know.
Would you describe a function which calls an IO routine through a callback
as "a function which performs IO"? You can't give a yes/no answer to that.
But allowing a callback isn't fatal. It doesn't make a function a
_non-bit-shuffling_ function. It does make it a "non-function" (in the
mathematical sense) if the callback doesn't obey certain rules.
Your classification is based on the function, but the examples were
supposed to suggest that it's sometimes the call that determines what
the function does (bit-shuffling or IO).
Basically you need to disallow passing callbacks that do IO to bit-shuffling
functions, yes. The system doesn't fall apart as badly as with direct
calls, but we lose a lot of the desirable characteristics of bit-shuffling
functions if we allow it.
I'd say simplistic rather than simple -- I think it misses the
complexity of most programming languages. Maybe it will seem simple
when it's defined well enough for me to see what functions are in which
category.
Hardly simplistic.
It's simple to say "do IO or just shuffle bits" yes. But the implications
aren't obvious and go quite deep. As you've shown with some of these cases.
 
Ben Bacarisse

Malcolm McLean said:
Parallel programming is a whole different can of worms.

The first three are not about parallel programming.
None of the above
functions can be written as bit-shuffling functions, so they can't
be called by bit-shuffling functions.

Some can be. Dekker's algorithm is pure bit shuffling and can be used
to write mtx_lock (on the right hardware).
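For the record, the algorithm needs nothing beyond loads, stores, loops and branches. A minimal two-thread sketch, using C11 seq_cst atomics to stand in for the strongly ordered memory the classic presentation assumes (the function names are mine):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Dekker's algorithm for two threads (self = 0 or 1).  Pure bit
   shuffling: two flags, a turn variable, loops and branches. */
static atomic_bool wants[2];
static atomic_int turn;

void dekker_lock(int self)
{
    int other = 1 - self;
    atomic_store(&wants[self], true);
    while (atomic_load(&wants[other])) {
        if (atomic_load(&turn) != self) {
            atomic_store(&wants[self], false);  /* back off */
            while (atomic_load(&turn) != self)
                ;                               /* wait for our turn */
            atomic_store(&wants[self], true);   /* try again */
        }
    }
}

void dekker_unlock(int self)
{
    atomic_store(&turn, 1 - self);
    atomic_store(&wants[self], false);
}
```

Whether the atomics still count as "pure portable C" is of course part of the argument.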
But there shouldn't be any need for
a bit shuffling function to use any special flow control structures.
They can always be written, both in the practical and the theoretical
sense, in terms of loops and branches. It will never call exit() unless
it runs out of memory, for example. It can never fail unless passed a
bit state for which it is undefined, in which case there's a programming
error in the caller.

Maybe "IO procedure" is the wrong term, because you could object that
"exit()" doesn't do IO. So go back to my term "procedure", which I was
scolded for.

It has an existing well-understood meaning (well, more than one, but
context will usually disambiguate it).

The trouble I'm having with these mini essays you write is that they all
just add to the impression that this supposedly crisp distinction is in
fact anything but.
Would you describe a function which calls an IO routine through a callback
as "a function which performs IO"? You can't give a yes/no answer to that.
But allowing a callback isn't fatal. It doesn't make a function a
_non-bit-shuffling_ function. It does make it a "non-function" (in the
mathematical sense) if the callback doesn't obey certain rules.

I am more confused than ever now. Are there, then, four categories?
Bit-shuffling functions, IO procedures, non-functions and oddballs?

(BTW, I disagree with the idea that IO makes non-functions in the
mathematical sense. Haskell's IO monad is a counter-example, and there
are others.)
Basically you need to disallow passing callbacks that do IO to bit-shuffling
functions, yes. The system doesn't fall apart as badly as with direct
calls, but we lose a lot of the desirable characteristics of bit-shuffling
functions if we allow it.

Right, so it's not the function itself, but all the permitted calls that
determine its category, and some calls are disallowed in order to
preserve the distinction? That's another reason not to use it as far as
I am concerned. Higher-level functional patterns are too useful to be
disallowed by a convention whose only purpose is to maintain this
apparently useless distinction.
Hardly simplistic.
It's simple to say "do IO or just shuffle bits" yes. But the implications
aren't obvious and go quite deep. As you've shown with some of these
cases.

OK, but up until recently that seemed to be what you were saying: that
there was a simple and crisp distinction between what you called
bit-shuffling functions and IO procedures.
 
Malcolm McLean

The trouble I'm having with these mini essays you write is that they all
just add to the impression that this supposedly crisp distinction is in
fact anything but.
No, a "function" is a mapping from input to output, a "procedure" is a
series of actions. So to a good approximation, procedures do IO,
functions do not. But it's better to use words rather than to try to
replace a word by a definition.
I am more confused than ever now. Are there, then, four categories?
Bit-shuffling functions, IO procedures, non-functions and oddballs?
Functions are basically defined in terms of input state versus output
state. That breaks down when another function is used as a parameter,
because the function that is passed as parameter is not defined by bits.
(In an underlying sense, of course it's a bit pattern of executable
statements, but we've no access to these at the level we're analysing).
But that's not serious enough for us to say they are "not functions"
in normal use. However you need special rules for callbacks to
preserve the desired characteristics of "bit shuffling functions".

Oddballs are probably either procedures or functions, but it's maybe
not immediately obvious where they lie. sleep(), for example, maybe
queries a clock internally, so it would be a procedure. But it's not
something we normally regard as "doing IO". You can maybe come up
with a marginal case.
Right, so it's not the function itself, but all the permitted calls that
determine its category, and some calls are disallowed in order to
preserve the distinction? That's another reason not to use it as far as
I am concerned. Higher-level functional patterns are too useful to be
disallowed by a convention whose only purpose is to maintain this
apparently useless distinction.
Callbacks need special rules. Otherwise the function plus the callback
parameter becomes effectively the "function". Those rules are in practice
observed by most real world programs, but they aren't enforced by C,
and they're not easy to automatically enforce for C-like languages.

They are basically that the callback shall not modify any bits used
by the main function, except such bits as are passed to it by that
function. If you break that, you cannot guarantee that the main
function is correct; it cannot be tested independently of the callback.
You've also got to specify that the program shall not rely, for its
correctness, on callbacks being called in any particular order or
any particular number of times. Violate that, and you lose rewritability: we
can no longer replace qsort() with an insertion sort, and guarantee
that nothing which calls qsort() will break.

So yes, in the case of callbacks, the callback function has to be of
a permitted type.
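A sketch of how the qsort() rule bites (the call counter is my own illustration, not anything from the thread):

```c
#include <stdlib.h>

/* A legal comparator with a side effect: it counts its own calls. */
static int ncalls;

static int cmp_int(const void *pa, const void *pb)
{
    int a = *(const int *)pa, b = *(const int *)pb;
    ncalls++;
    return (a > b) - (a < b);
}

/* The sorted order of v is guaranteed by qsort()'s specification;
   the returned call count is not.  A program whose correctness depends
   on that count breaks as soon as qsort() is replaced by, say, an
   insertion sort behind the same interface. */
int sort_and_count(int *v, size_t n)
{
    ncalls = 0;
    qsort(v, n, sizeof v[0], cmp_int);
    return ncalls;
}
```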
OK, but up until recently that seemed to be what you were saying: that
there was a simple and crisp distinction between what you called
bit-shuffling functions and IO procedures.
I called them "bit shuffling functions" and "IO procedures", but it's
better and far less confusing, as it turns out, just to use normal
English words, "function" and "procedure". A function can't turn
a light on. It can be applied to another function, but it cannot
modify the definition of another function, it can't change a
mathematician's intermediate calculations as he evaluates another
function. It can always be replaced by another function which is
identical to it. These characteristics apply to mathematical
functions and to "bit shuffling" functions.

A procedure is a format or protocol for a series of actions, like
an operation in a hospital, or a pattern of radio signals and lights
for landing an aircraft. It's doing something, not calculating
something.

Now you start getting into deep waters when you say things like "why
treat the memory specially, surely it's just another device attached
to the processor" and "surely procedures themselves are bits, can
we not talk of the function of the procedure?". A lot underlies it.

But the basic idea is very simple and easy to implement, and the
benefits are very clear.
 
Ben Bacarisse

Malcolm McLean said:
Functions are basically defined in terms of input state versus output
state. That breaks down when another function is used as a parameter,
because the function that is passed as parameter is not defined by
bits.

Surely there are functions with function parameters where this does not
break down? I.e. there are functions "basically defined in terms of
input state versus output state" but which have function parameters?
(In an underlying sense, of course it's a bit pattern of executable
statements, but we've no access to these at the level we're analysing).
But that's not serious enough for us to say they are "not functions"
in normal use. However you need special rules for callbacks to
preserve the desired characteristics of "bit shuffling functions".

I'm not clear what these desired characteristics are. It might be
clearer to define the distinction using these characteristics: those
that have them are bit-shufflers, those that don't are IO procedures.

Callbacks need special rules.

Not all functions passed are callbacks. Is it only callbacks that need
special rules?
Otherwise the function plus the callback
parameter becomes effectively the "function". Those rules are in practice
observed by most real world programs, but they aren't enforced by C,
and they're not easy to automatically enforce for C-like languages.

They are basically that the callback shall not modify any bits used
by the main function, except such bits as are passed to it by that
function.

By "main function" you mean the calling function, right? That's a phrase
to avoid in a C forum. It would help to know if you are thinking only in C
terms here. If not, "passed to it" is insufficient.
If you break that, you cannot guarantee that the main
function is correct, it cannot be tested independently of the
callback.

I don't get that. You seem to be saying that a function with this
pattern:

int f(int (*cb)(void *), int n)
{
    int x = 0, i = 0;
    while (i < n) x += cb(&i);
    return x;
}

can be tested independently of the callback provided it only modifies
things passed to it like i? Maybe you were giving only a necessary
rather than a sufficient condition?

But even if you meant a necessary condition, I think it's wrong. I
think you can argue perfectly well about the correctness of the map
function I posted, even if the callback does IO.
You've also got to specify that the program shall not rely, for its
correctness, on callbacks being called in any particular order or
any particular number of times. Violate that, and you lose rewritability, we
can no longer replace qsort() with an insertion sort, and guarantee
that nothing which calls qsort() will break.

So yes, in the case of callbacks, the callback function has to be of
a permitted type.

That's all about your bit-shuffling functions, yes? Do IO procedures
have any similar restrictions?
I called them "bit shuffling functions" and "IO procedures", but it's
better and far less confusing, as it turns out, just to use normal
English words, "function" and "procedure". A function can't turn
a light on.

Except in Haskell and in C.
It can be applied to another function, but it cannot
modify the definition of another function, it can't change a
mathematician's intermediate calculations as he evaluates another
function. It can always be replaced by another function which is
identical to it. These characteristics apply to mathematical
functions and to "bit shuffling" functions.

The last condition applies to any object of any type at all. All things
(at least in a programming context) can be replaced by a thing that is
identical to it.
A procedure is a format or protocol for a series of actions, like
an operation in a hospital, or a pattern of radio signals and lights
for landing an aircraft. It's doing something, not calculating
something.

Unless you take the Haskell view that calculating actions is a perfectly
reasonable thing to do.

And does this function do something or calculate something?

void zero(int *x, size_t n) { memset(x, 0, n * sizeof *x); }

What about a function that swaps bytes in a buffer, or one that reverses
a string (in place)? Are they doers or calculators?

(And a "procedure" can also be replaced by a "procedure" that is identical
to it, can't it?).
Now you start getting into deep waters when you say things like "why
treat the memory specially, surely it's just another device attached
to the processor" and "surely procedures themselves are bits, can
we not talk of the function of the procedure?". A lot underlies it.

But the basic idea is very simple and easy to implement, and the
benefits are very clear.

Not to me. What are the benefits?
 
James Harris

Ben Bacarisse said:
I see. So it's like functions that set errno?

I wouldn't say so. Here are some reasons.

1. Conventions surrounding errno are different. The convention is what gives
the exception word approach consistency. Without the consistency the
approach would become ad-hoc and a mental burden on the programmer... "when
should I use one approach and when another?" for example.

2. AIUI errno might be set to a nonzero value even by functions that return
success. By contrast, exception would only be set if an exception occurred.

3. AIUI where a called function might set errno programs are expected to
check the return code of the function first and only use errno if the return
code indicates an error. Many programs only return an error code and do not
set errno at all. However, the exception variable is used exclusively to
indicate an exception so the response to it being nonzero is always
consistent. The test for an exception is always simple.

4. The errno variable might be set without having any accompanying detail to
reveal what really happened. By contrast, if exception is set the
exception_record struct will also have explanatory details. They are simple
at the moment but I may add further info over time.
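For concreteness, the convention might look something like this; the struct fields and helper functions here are my guesses at the shape, not James's actual code:

```c
#include <string.h>

/* The exception word plus its explanatory record (names from the post;
   the field layout and helpers are invented for illustration). */
struct exception_record {
    int code;
    char message[128];
};

int exception;                  /* nonzero only while an exception is live */
struct exception_record exception_record;

void throw_exception(int code, const char *msg)
{
    exception = 1;
    exception_record.code = code;
    strncpy(exception_record.message, msg,
            sizeof exception_record.message - 1);
    exception_record.message[sizeof exception_record.message - 1] = '\0';
}

void clear_exception(void)      /* "catching" cancels the exception */
{
    exception = 0;
}

/* A low-level routine that throws. */
double checked_divide(double a, double b)
{
    if (b == 0.0) {
        throw_exception(1, "division by zero");
        return 0.0;
    }
    return a / b;
}

/* Each caller in the chain propagates with the one-line test. */
double average_per_item(double total, double items)
{
    double r = checked_divide(total, items);
    if (exception) return 0.0;  /* manual unwinding step */
    return r;
}
```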
That does not match my
criteria for an exception mechanism (I'd call it having rules for
handling errors), but I don't want to argue over terms.

Good idea.
Thanks for taking the time to explain.

You're welcome.

James
 
BartC

It is not a "normal" exception mechanism in that it is not automatic.
However, exceptions can be thrown, caught, replaced, cancelled and
rethrown. Once thrown they persist until they are cancelled or replaced.
They are easy to catch and handle. That lot seems to me fairly close to
covering the essentials!

Downsides: Unwinding is manual and I don't at present add any call stack
info to an exception object.
In fact the changes are not that intrusive. After a call to something
which could throw an exception (or which could itself call something which
could generate an exception) I usually have just

if (exception) return;
Frankly, it's awesome to have this in C! There is not much coding to do
and the result is useful and simple.

From what you've described in this and follow-up posts, this manual
'exception'-handling scheme can be implemented in any language, not just C!
(And I would disagree about the amount of coding: see below.)

That's if it is as I understand it: that all functions in a call-chain have
to cooperate, with every call to the next lower function in the chain having
to be checked for an error/exception condition, and to deal with the error
or to immediately return and pass it back up.

Compared with what is normally understood (or as I understand it), where you
can have functions A,B,C,D,E in the chain, and only A and E have to
cooperate: E throws an exception, and A can catch it, while B, C and D don't
need to know anything about it, nor check anything, nor even execute a
return.
The initial query was about potential contents of the exception object, not
about the mechanism.

From your OP it sounds like there is no limit to what information can be
included. In that case it would depend largely on your application.
 
Ben Bacarisse

James Harris said:
I wouldn't say so. Here are some reasons.

<snip the differences from the rather under-specified rules for errno>

Yes, I did not mean to suggest that the protocol for handling the
variable was the same, only that the way in which you "get" to the error
handling is by a normal function return. There's no "jump".

In one not entirely dissimilar system, I kept a small stack of the error
data. This permitted calls higher up to add explanation to lower-level
errors. You would get things like

file transfer failed because
server could not be contacted because
permission denied when opening socket.

(along with function names and line numbers and such like if you wanted
them). Of course, higher-up calls could cancel the lower-level data
when there was no extra value that they could provide, and users of the
code (it was a networking library) could choose what got printed.
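The core of such an error stack can be sketched in a few lines (all names here are invented for illustration, not the original library's code):

```c
#include <stdio.h>
#include <string.h>

enum { ERR_MAX = 8, ERR_LEN = 128 };

char err_stack[ERR_MAX][ERR_LEN];
int err_depth;

/* Lower levels push first; higher levels add explanation on top. */
void err_push(const char *msg)
{
    if (err_depth < ERR_MAX) {
        strncpy(err_stack[err_depth], msg, ERR_LEN - 1);
        err_stack[err_depth][ERR_LEN - 1] = '\0';
        err_depth++;
    }
}

/* A higher-up call that has nothing to add can cancel the detail. */
void err_clear(void)
{
    err_depth = 0;
}

/* Print the newest explanation first, chained with "because". */
void err_report(FILE *fp)
{
    for (int d = err_depth - 1; d >= 0; d--)
        fprintf(fp, "%*s%s%s\n", 2 * (err_depth - 1 - d), "",
                err_stack[d], d > 0 ? " because" : "");
}
```

Pushing "permission denied when opening socket", then "server could not be contacted", then "file transfer failed", and calling err_report() produces a chain like the one quoted above.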
 
David Brown

Not to me. What are the benefits?

This, I think, is a key question. If we can get an idea of the benefits
of Malcolm's attempted classifications here, then perhaps it will be
possible to work backwards to clearer rules about categories of functions.

An example of such benefits might be that if a compiler knows that a
function is "pure" (always giving the same results for the same inputs,
and having no observable side-effects), then the compiler can apply
additional optimisations - compile-time calculations, hoisting out of
loops, etc. Then we can look at what requirements are needed for such
optimisations to be valid, and use that for a strict classification that
will make sense in the context of C programming.
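GCC and Clang already expose a version of this classification: the (non-standard) `pure` and `const` function attributes promise exactly the properties that license those optimisations. A sketch:

```c
/* `const`: the result depends only on the arguments -- no side effects,
   no reads of global memory.  The compiler may fold repeated calls,
   common-subexpression them, or hoist them out of loops.
   (GCC/Clang extension, not standard C.) */
__attribute__((const))
int clamp_byte(int x)
{
    return x < 0 ? 0 : x > 255 ? 255 : x;
}

/* `pure`: may read global memory but has no observable side effects. */
static int table[256];

__attribute__((pure))
int lookup(int i)
{
    return table[i & 255];
}

/* In a loop like
       for (int i = 0; i < n; i++) out[i] = in[i] + clamp_byte(k);
   the compiler is entitled to compute clamp_byte(k) once, because the
   attribute promises there is nothing else to observe. */
```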
 
Malcolm McLean

Surely there are functions with function parameters where this does not
break down? I.e. there are functions "basically defined in terms of
input state versus output state" but which have function parameters?
Exactly. The point is that you have to put some restrictions on the
callback for that to hold. But they are not onerous restrictions;
they're ones which most C programmers who use function-like callbacks
observe in practice. (As opposed to OO-like callbacks.)
I'm not clear what these desired characteristics are. It might be
clearer to define the distinction using these characteristics: those
that have them are bit-shufflers, those that don't are IO procedures.
That's a good idea.
A bit-shuffling function.

Can be written in pure portable C, given only a specification of bit state
on output given bit state on input, without any calls to external libraries,
without any subroutine calls, without any special language constructs or
assembler, with the sole exception of calls to memory allocation functions.

Any bit-shuffling function can be replaced by any other implementation of
the bit shuffling function which also obeys the specification.

A bit shuffling function cannot fail to produce the desired output, unless
it is called with an input bit pattern for which it is undefined (a
programming error by the caller), or if the memory allocation routines fail
to allocate the desired memory.

A bit shuffling program can be tested and "proven correct" (in scare
quotes) on any machine with a conforming C compiler and adequate memory,
regardless of other hardware devices attached to that machine.

That's all highly desirable, practically useful. Not theoretically
interesting in any mathematical sense, it's just a description of a
"function", a known concept.
Not all functions passed are callbacks. Is it only callbacks that need
special rules?
You can pass a "function" in some notation. So you're effectively creating
a meta-language. That has implications for testability and it becomes
non-trivial to describe "a bit state for which the function is defined".
But I don't think it makes a bit shuffling function into something else.
I don't get that. You seem to be saying that a function with this
pattern:

int f(int (*cb)(void *), int n)
{
    int x = 0, i = 0;

    while (i < n) x += cb(&i);

    return x;
}

can be tested independently of the callback provided it only modifies
things passed to it like i? Maybe you were giving only a necessary
rather than a sufficient condition?
Yes, that's pretty much it.

Something like this

int i;

int f(int (*cb)(void), int n)
{
    int x = 0;

    for (i = 0; i < n; i++)
        x += cb();

    return x;
}

can't be tested if we allow cb to modify i, particularly if the understanding
is that the bit state of x on exit is "the sum of n consecutive calls
to cb()".
If we rewrite the function as

    while (n--)
        x += cb();

then by bit-shuffling rules, that should be an invariant. But it's not.
Effectively f and cb need to be fused together; they're half a function each.
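The failure of the invariant can be shown end to end: with a well-behaved callback the two loop shapes agree, but once the callback touches the global i they no longer do (only f mirrors the code above; the rest is my illustration):

```c
int i;  /* the global counter used by f, as in the post */

int f(int (*cb)(void), int n)
{
    int x = 0;
    for (i = 0; i < n; i++)
        x += cb();
    return x;
}

/* The rewrite that "should" be an invariant replacement. */
int f_rewritten(int (*cb)(void), int n)
{
    int x = 0;
    while (n--)
        x += cb();
    return x;
}

int well_behaved(void) { return 2; }   /* obeys the rules */
int meddler(void) { i++; return 2; }   /* modifies the global i */

/* f(well_behaved, 4) == f_rewritten(well_behaved, 4) == 8, but
   f(meddler, 4) == 4 while f_rewritten(meddler, 4) == 8: the two
   versions are no longer interchangeable. */
```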
But even if you meant a necessary condition, I think it's wrong. I
think you can argue perfectly well about the correctness of the map
function I posted, even if the callback does IO.
I need to understand this.

That's all about your bit-shuffling functions, yes? Do IO procedures
have any similar restrictions?
No, because a "procedure" does things. So the order in which callbacks are
called can be important for program correctness.
Basically, if we're summing n variables, it doesn't matter if a function
loops round the list twice, then divides the total by two. It doesn't matter
how it got to the correct output state. But if we're outputting "hello world",
then outputting a series of letters and backspaces which adds up to "hello
world" is correct only for some physical output devices.
Except in Haskell and in C.
Exactly. Far too much is made of the fact that C programmers by tradition
use the term "function" in an extended sense.
The last condition applies to any object of any type at all. All things
(at least in a programming context) can be replaced by a thing that is
identical to it.
Maybe I mean equivalent. (a + b)/2 and a + (b-a) * 0.5 are the same,
assuming no overflow. We can replace one by the other.
Unless you take the Haskell view that calculating actions is a perfectly
reasonable thing to do.



And does this function do something or calculate something?

void zero(int *x, size_t n) { memset(x, 0, n * sizeof *x); }



What about a function that swaps bytes in a buffer, or one that reverses
a string (in place)? Are they doers or calculators?
That's why memory is special.
They're functions. They're defined in terms of output state given input
state. Moving memory about is not an "action". That's not a physical
rule, it's a programming rule.
(And a "procedure" can also be replaced by a "procedure" that is identical
to it, can't it?).
No.
Let's define the procedure as "present the message 'hello world' to the user".
printf("H\bHello world\n")
fulfils the procedure on some output devices but not others.
Not to me. What are the benefits?
Essentially we can shift all the calculation into the bit-shuffling
functions. A procedure can do trivial calculation, for example if the
procedure is "read a string from input", it can nul-terminate it.
But if the procedure is "read a real from input", it should call
a bit-shuffling function to convert from ascii to floating point.
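A sketch of that division of labour (the names are mine):

```c
#include <stdio.h>
#include <stdlib.h>

/* Bit shuffling: text in, double out, no IO, testable anywhere. */
double ascii_to_real(const char *s)
{
    return strtod(s, NULL);
}

/* Procedure: talks to the input device, delegates the calculation. */
int read_real(FILE *fp, double *out)
{
    char line[64];
    if (!fgets(line, sizeof line, fp))
        return 0;               /* IO failure */
    *out = ascii_to_real(line);
    return 1;
}
```

The conversion can be tested on any machine with a conforming compiler; only read_real depends on the device behind fp.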

Whilst writing procedures and bit shuffling functions are both
programming, they are different types of programming tasks. I
said that "all the interesting work is in the bit shuffling
functions", which was maybe a bit unfair to people who write mainly
procedures. The interest and difficulty in a procedure is in
interacting with electronics, which demand things like bits being
written out on rising edges of clocks, handshakes being exchanged
until the device responds with a "ready". All hard to do, but a
totally different skill set to bit shuffling. I'm interested mainly
in bit shuffling, I use mainly accepted and developed interfaces for
IO.
 
ais523

Keith said:
My previous post is incorrect, and I apologize for the error.
I did not read the Wikipedia article closely enough. In fact,
it does support a meaning for "idempotent" that's close (but not
identical) to the way Malcolm uses the term.

[snip quote from Wikipedia]
I was familiar with the mathematical meaning of the term, and the
similar meaning used in functional programming, and I incorrectly
assumed that the same meaning applied to programming in general.
This is the first time I've encountered the usage of "idempotent" to
refer to a function that merely returns the same result for the same
argument, rather than one with the property that f(f(x)) == f(x).

I agree with Malcolm that it's unfortunate that the term is used
inconsistently.

(Malcolm's definition still doesn't quite match the meaning mentioned
by the Wikipedia article; a function that depends on other things
can still be idempotent.)

The programming/computer science meaning I'm aware of is that "the
function can be called twice or more in a row with the same arguments,
and it will do the same thing as calling it once". This is not quite the
same as "always produces the same return value, given the same
arguments".

For instance, this function is idempotent:

static int a;
int foo(void)
{
    a = 4;
    return 2;
}

and this function isn't:

static int a;
int bar(void)
{
    a++;
    return 2;
}

Both functions always produce the same return value, given the same
arguments, but both functions also have side effects. Calling foo()
twice has the same effect as calling it once (thus idempotent), but a
different effect from calling it zero times. Calling bar() twice is
different from calling it once, and different from calling it zero times.

This corresponds to the mathematical definition if you treat the
argument and return values of the mathematical functions represented by
the C functions as including the state of the I/O system and all of
memory, rather than just the arguments as C understands them.
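That last paragraph can be made explicit by modelling the state as an argument and a result; foo() then really is idempotent under composition, and bar() is not (a sketch based on the two functions above):

```c
/* Model the memory that foo/bar touch as a single int of state.  Each
   C function becomes a pure function from state to (state, value). */
struct effect { int state; int value; };

/* foo: a = 4; return 2;  Idempotent: applying it twice from any
   starting state lands in the same (state, value) as applying it once. */
struct effect foo_fn(int state)
{
    (void)state;                /* the old state is overwritten */
    return (struct effect){ .state = 4, .value = 2 };
}

/* bar: a++; return 2;  Not idempotent: each application moves the
   state further. */
struct effect bar_fn(int state)
{
    return (struct effect){ .state = state + 1, .value = 2 };
}
```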
 
Ben Bacarisse

Exactly. The point is that you have to put some restrictions on the
callback for that to hold.

So you might have said "That *sometimes* breaks down when another
function is used as a parameter". I thought you were making a general
rule about when the input to output mapping breaks down.
But they are not onerous restrictions,
they're ones which most C programmers who use function-like callbacks
observe in practice. (As opposed to OO-like callbacks).

They are onerous to me. I have a list-processing library that uses
map-like functions to do all sorts of things, including output.

Now, if you are being C-specific, I agree that such patterns are not
common, but I got the feeling you were talking more generally than that.
In other languages, arbitrarily restricting function arguments is a
disaster. (But I see from what follows that the idea is, essentially,
C-based so programmers in other languages need not take this advice.)
That's a good idea.
A bit-shuffling function.

Can be written in pure portable C, given only a specification of bit state
on output given bit state on input, without any calls to external libraries,
without any subroutine calls, without any special language constructs or
assembler, with the sole exception of calls to memory allocation
functions.

Ah, so this is C-specific. I'll stop talking about programming more
generally then.

What's more, there are almost no bit-shuffling functions. Simply
writing

int max(int a, int b) { return a > b ? a : b; }
int max3(int a, int b, int c) { return max(max(a, b), c); }

stops max3 from being a bit-shuffler. That just seems way too
restrictive. At the same time, the very vague words "specification" and
"bit state" make it hard to see what else is excluded.

AH!! You are not even sticking to your own made-up terms. I took
"subroutine" to be some general term, but you mean one of your
"procedures" or "IO procedures", yes? Banning subroutine calls in the
general sense makes no sense.
Any bit-shuffling function can be replaced by any other implementation of
the bit shuffling function which also obeys the specification.

It's possible (and in my view more useful) to extend this rule to all C
functions. That means taking a suitably abstract view of the
specification. This is the single most important reason why I don't
think this sort of classification is useful. I want to be able to
reason about code (often about whole programs) written in portable C,
regardless of function arguments and "IO".

For example, I'd like to be able to argue about portable threaded C
programs, and not a single function that uses any of C's concurrency
primitives is a bit-shuffler. I'd rather develop ways to argue about a
much larger class of functions than to home in on a tiny set.
A bit shuffling function cannot fail to produce the desired output, unless
it is called with an input bit pattern for which it is undefined (a
programming error by the caller), or if the memory allocation routines fail
to allocate the desired memory.

A bit shuffling program can be tested and "proven correct" (in scare
quotes) on any machine with a conforming C compiler and adequate memory,
regardless of other hardware devices attached to that machine.

The same is true of ones that violate your rules, and that, to me, is
more important than your seemingly arbitrary classification.
That's all highly desirable, practically useful. Not theoretically
interesting in any mathematical sense, it's just a description of a
"function", a known concept.

It's highly desirable to do that whenever possible. Anything that
suggests that one should not or (worse) can not do that more widely is
unhelpful.

Yes, that's pretty much it.

Still baffled. How can you test the above independently of the callback?
Something like this

int i;

int f(int (*cb)(void), int n)
{
    int x = 0;

    for (i = 0; i < n; i++)
        x += cb();

    return x;
}

can't be tested if we allow cb to modify i, particularly if the understanding
is that the bit state of x on exit is "the sum of n consecutive calls
to cb()".
If we rewrite the function as

    while (n--)
        x += cb();

then by bit-shuffling rules, that should be an invariant.

Eh? You've changed the meaning entirely independently of what rules the
callback obeys.
But it's not.
Effectively f and cb need to be fused together; they're half a function
each.

Yes, you've written an entirely different function, regardless of what
cb does or does not do.
I need to understand this.

It's key for me. Feel free to ignore the detailed comments (I'm not
sure I understand the details of what you are saying anyway). The big
objection is that you focus the idea of reasoning about code on too
small a set of functions (in the C sense of the word), and with too
low-level a view of what is being verified by the proof or argument.

Exactly. Far too much is made of the fact that C programmers by tradition
use the term "function" in an extended sense.

And Haskell? Its functions are about as close to mathematical
functions as it is possible to get, but Haskell programs do IO.

But people here are not "making too much" of how the word function is
used. This is a C group, and "function" has a meaning that is too well-known
to be hijacked for your purposes. You will just cause confusion.

No.
Let's define the procedure as "present the message 'hello world' to the user".
printf("H\bHello world\n")
fulfils the procedure on some output devices but not others.

I don't see what you mean. I can replace your code with

puts("H\bHello world")

and it succeeds and fails in the same way. If the first meets your
specification (and I don't know how it could, but that's another matter)
the second does as well.
Essentially we can shift all the calculation into the bit-shuffling
functions. A procedure can do trivial calculation; for example, if the
procedure is "read a string from input", it can nul-terminate it.
But if the procedure is "read a real from input", it should call
a bit-shuffling function to convert from ascii to floating point.

That's not an advantage, or, if it is, I can't see what advantage you
are claiming. It just seems to be another edict about what programmers
should do.
Whilst writing procedures and bit shuffling functions are both
programming, they are different types of programming tasks. I
said that "all the interesting work is in the bit shuffling
functions", which was maybe a bit unfair to people who write mainly
procedures. The interest and difficulty in a procedure is in
interacting with electronics, which demand things like bits being
written out on rising edges of clocks, handshakes being exchanged
until the device responds with a "ready". All hard to do, but a
totally different skill set to bit shuffling. I'm interested mainly
in bit shuffling, I use mainly accepted and developed interfaces for
IO.

That's not an advantage either.
 
Malcolm McLean

They are onerous to me. I have a list-processing library that uses
map-like functions to do all sorts of things, including output.
I'd need to see the language to see how easy it is to recast in the
system. There's also the question of whether the language is
fundamentally misdesigned, or designed on an equally good but
incompatible principle.
Now, if you are being C-specific, I agree that such patterns are not
common, but I got the feeling you were talking more generally than that.
In other languages, arbitrarily restricting function arguments is a
disaster. (But I see from what follows that the idea is, essentially,
C-based so programmers in other languages need not take this advice.)
I'm essentially talking in terms of C-like languages.
Ah, so this is a C-specific. I'll stop talking about programming more
generally then.
"Can be written in pure portable C" doesn't mean "cannot be written in
other languages".
What's more, there are almost no bit-shuffling functions. Simply
writing

int max(int a, int b) { return a > b ? a : b; }

int max3(int a, int b, int c) { return max(max(a, b), c); }

stops max3 from being a bit-shuffler. That just seems way too
restrictive. At the same time, the very vague words "specification" and
"bit state" make it hard to see what else is excluded.
Same mistake. "Can be written without subroutines" doesn't mean "cannot
be written with subroutines".

we can write max3

int max3(int a, int b, int c)
{
    if (a > b && a > c)
        return a;
    if (b > c)
        return b;
    return b;
}
Then we can drop that version into any program that calls max3. So if we
can't find the source to max(), or can't figure out how it works because
no-one has told us about that weird syntax, all is not lost.

Now

void hello(void)
{
    printf("hello world");
}

cannot be rewritten without any external subroutines. It doesn't have that
characteristic.
AH!! You are not even sticking to your own made-up terms. I took
"subroutine" to be some general term, but you mean one of your
"procedures" or "IO procedures", yes? Banning subroutine calls in the
general sense makes no sense.
Yes it does. Leaf functions are more useful than non-leaf functions
because they can be cut and pasted, and because their behaviour can be
determined by examining them. However we're not banning subroutines.
We're retaining the right to ban subroutines should we wish to do so.
It's possible (and in my view more useful) to extend this rule to all C
functions. That means taking a suitably abstract view of the
specification. This is the single most important reason why I don't
think this sort of classification is useful. I want to be able to
reason (often about whole programs) written in portable C, regardless of
function arguments and "IO".
You're right in a very non-trivial sense. IO procedures like fopen()
are widely portable, they can be implemented on lots of physically
very different devices. So ANSI standard C tries to make code portable
by specifying a standard interface for the task of reading and writing
streams for an external datasource.
What I am saying is that this approach doesn't work. Because you can
always get some physical device which won't support your interface.
Calls to streams which pass over internet connections can't sensibly be
blocking, because internet connections hang, and users want some
sort of feedback. So stdio doesn't work. Everything breaks that relies
upon it.
Now you can easily say "OK, pass fgetc() a timeout callback". But that
too will break when the next technology comes along.

even this

void hello(char *str)
{
    strcpy(str, "hello world");
}

is not portable. It will break on a Microsoft compiler in default mode.
So we need to be able to retain the right to ban subroutines.

We can't ban malloc() unfortunately. That's the one subroutine we really
can't do without.
For example, I'd like to be able to argue about portable threaded C
programs and not a single function that uses any of C's concurrency
primitives is bit-shuffler. I'd rather develop ways to argue about a
much larger class of functions, than to home in on a tiny set.
I'd like to extend this analysis to parallel programming. But let's
get everything straight for serial programs first.
The same is true of ones that violate your rules, and that, to me, is
more important than your seemingly arbitrary classification.
No, if a function must "clock out bits on the rising edge", you can't
test that that is working by writing C on a Unix box. At least in any
way that I know.
Still baffled. How can you test the above independently of the callback?
The function sums n results from calls to the callback. So we black-box
test it with various callbacks that return different numbers, and show
that it's returning correct results for all combinations of n and
functions with different characteristics. I put "prove correct" in
scare quotes because of course that only proves it correct in an
engineering sense. But we know it's been written by someone who
wouldn't introduce any deliberate perversities; it sums the results
of n calls to cb().
Now we ship it, and our customer uses it with his own callback, which
is highly intricate and we couldn't develop ourselves. And we know it
will sum n calls.
But callbacks are special. You've lost a degree of safety.
Eh? You've changed the meaning entirely independently of what rules the
callback obeys.
A loop of n calls to a callback can be written

for (i = 0; i < n; i++)
    x += cb();

or

while (n--)
    x += cb();

They're both correct and interchangeable, unless cb is allowed to modify
i or n. That's what you've got to ban.
It's key for me. Feel free to ignore the detailed comments (I'm not
sure I understand the details of what you are saying anyway). The big
objection is that you focus the idea of reasoning about code on too
small a set of functions (in the C sense of the word), and with too
low-level a view of what is being verified by the proof or argument.
No, I give simple examples, but of course the method is entirely
pointless if applied to trivial functions. What matters is how it
scales up to massive programs which are too complex for any one
person to develop independently.
And Haskell? Its functions are about as close to mathematical
functions as it is possible to get, but Haskell programs do IO.
I'm not a mathematician. But I understand that

x + 2

is a function of x.

x + a value obtained from somewhere

is not a function of x; we can't analyse it or say anything about it.
But people here are not "making too much" of how the word function is
used. This is a C group, and "function" has a meaning that is too well-known
to be hijacked for your purposes. You will just cause confusion.
"Malcolm-function" is flattering, but I can hardly claim to deserve to
have the concept of a mathematical function defined by discrete bits
named after me. "Bit-shuffling function" seems to cause as much confusion
as "function". But feel free to suggest a term.
I don't see what you mean. I can replace your code with

puts("H\bHello world")

and it succeeds and fails in the same way. If the first meets your
specification (and I don't know how it could, but that's another matter)
the second does as well.
You don't know that.
Let's say we run both functions on identical machines, with the following code.

if (fork())
    printhelloworld();
else
    closedowntty();

so one process is printing out the message, the other shuts down the tty
whilst it is executing. Can we now guarantee identical output for both
versions of printhelloworld()?
That's not an advantage, or, if it is, I can't see what advantage you
are claiming. It just seems to be another edict about what programmers
should do.
Because we can take the function which converts an ascii string to
a real, and use it in other contexts. But only if it is a bit-shuffling
function, not if it is tied to reading characters from IO.
That's not an advantage either.
There's no particular reason why someone who is good at bit shuffling
should also be good at IO, or vice versa, other than the general
observation that people who are good at one task are usually at least
competent at a related one.
The fields have a different logic, they require a different mindset,
they require different physical resources. The software assets produced
have a different lifespan and applicability. We'll probably find they are
best handled in different languages.
 
Ike Naar

Same mistake. "Can be written without subroutines" doesn't mean "cannot
be written with subroutines".

we can write max3

int max3(int a, int b, int c)
{
    if (a > b && a > c)
        return a;
    if (b > c)
        return b;
    return b;
}

Ben's version of max3(1,2,3) returns 3.
Your version of max3(1,2,3) returns 2.
 
David Brown

Ben's version of max3(1,2,3) returns 3.
Your version of max3(1,2,3) returns 2.

Malcolm's version clearly has a typo - the last line should be "return
c;". But that also illustrates why it is important to be able to use
subroutines that are tried and tested!
 
James Harris

....
From what you've described in this and follow-up posts, this manual
'exception'-handling scheme can be implemented in any language, not just
C!

Good point. It could even be used where the code is written in multiple
languages, e.g. C calling assembly where the exception could be thrown in an
assembly function and caught in a C function or vice versa.
(And I would disagree about the amount of coding: see below.)

That's if it is as I understand it, that all functions in a call-chain
have to cooperate, with every call to the next lower function in the chain
having to be checked for an error/exception condition, and to deal with
the error or to immediately return and pass it back up.

Yes. Although there are cases, notably tail calls, where checking can be
omitted, practically speaking it's easiest to think that all the functions
in a call chain would need to cooperate.
Compared with what is normally understood (or as I understand), where you
can have functions A,B,C,D,E in the chain, and only A and E have to
cooperate: E throws an exception, and A can catch it, while B, C and D
don't need to know anything about it, nor check anything, nor even execute
a return.

Yes, B, C and D would be expected to participate. If you wanted to avoid
that, what about combining the approach with setjmp/longjmp? Then a longjmp
from E could get straight back to a setjmp test in A.

One caveat of jumping back: depending on how resources were accounted for,
if one of the intervening modules allocated resources you might need to stop
off at that module on the way back so that the resources could be released.

James
 

Malcolm McLean

On 01/06/14 18:55, Ike Naar wrote:

Malcolm's version clearly has a typo - the last line should be "return
c;". But that also illustrates why it is important to be able to use
subroutines that are tried and tested!
The idea of bit-shuffling functions isn't an anti-subroutine design
paradigm in any way. The subroutines can be removed if necessary;
it's not saying they should be removed.

You might need to remove subroutines for efficiency, or to make the
function self-contained. Leaf functions have certain advantages.
They can be cut and pasted. The function can be examined manually
for correctness. The subroutines can't be detached. I'm not saying
that those advantages will outweigh the advantages of subroutines
in any particular case, but, if you judge that a function is better
as a leaf, you can always remove the subroutines.
 
Keith Thompson

Malcolm McLean said:
On Sunday, June 1, 2014 3:04:16 PM UTC+1, Ben Bacarisse wrote: [...]
even this

void hello(char *str)
{
    strcpy(str, "hello world");
}

is not portable. It will break on a Microsoft compiler in default mode.
So we need to be able to retain the right to ban subroutines.

If that's the case (and I'll take your word for it), it means
that a Microsoft compiler in default mode is not a conforming
C compiler. Neither is gcc. It's not particularly interesting
that some conforming, or even strictly conforming, C code is not
portable to things other than conforming C compilers.

[...]
"Malcolm-function" is flattering, but I can hardly claim to deserve to
have the concept of a mathematical function defined by discrete bits
named after me. "Bit-shuffling function" seems to cause as much confusion
as "function". But feel free to suggest a term.

I don't think it's meant to be flattering; it's just a desperate
attempt to encourage you to write clearly.

It's already been pointed out that the term "bit-shuffling" implies
things that you didn't mean to imply, namely that such a function
would reorder bits without changing them (by analogy to shuffling
a deck of cards). "Bit-manipulation function" is probably closer
to what you're trying to say. But ultimately this is your concept,
and I suggest that it's up to you to come up with a name for it that
(a) expresses the concept clearly and (b) *isn't already used for
something else*.

This is your concept; I suggest that it's up to you to come up with
a good name for it. If you have difficulty naming it, perhaps that
says something about the concept itself.

If you want to keep misusing the word "function", you aren't going
to be able to communicate your actual ideas -- something you could
achieve much more easily by not posting at all.

[...]
 
Ike Naar

It's already been pointed out that the term "bit-shuffling" implies
things that you didn't mean to imply, namely that such a function
would reorder bits without changing them (by analogy to shuffling
a deck of cards). "Bit-manipulation function" is probably closer
to what you're trying to say. But ultimately this is your concept,
and I suggest that it's up to you to come up with a name for it that
(a) expresses the concept clearly and (b) *isn't already used for
something else*.

The most interesting property of bit shuffling functions so far is
that functions that shuffle bits are not bit shuffling functions.
 

David Brown

Yes it does. Leaf functions are more useful than non-leaf functions
because they can be cut and pasted, and because their behaviour can be
determined by examining them. However we're not banning subroutines.
We're retaining the right to ban subroutines should we wish to do so.

Functions can be "leaf" as far as C is concerned, while being far from
"leaf" in implementation. For example:

double foo(double a, double b)
{
    return a / b;
}

This contains no function calls at the C level. But the implementation
might do all sorts of things, such as call subroutines to handle the
division in software, invoke some sort of error handler on a divide by
zero (including a nasal daemon launcher - which I think would be
classified as a "Malcolm-procedure" rather than a "Malcolm-function").
It might also change states in the system - such as putting the floating
point hardware (or software subsystem) into default rounding modes.

By "cut and pasted", do you mean "inlined by a compiler" or "cut and
pasted by someone editing the source code" ? Either way, the
requirement for that is that the function's definition is known - there
is no need for it to be a leaf function, or to know the definitions of
all functions called.
I'm not a mathematician. But I understand that

x + 2

is a function of x.

x + a value obtained from somewhere

is not a function of x; we can't analyse it or say anything about it.

That depends on how that value is obtained. If it is a global constant,
for example, then we can say a great deal about it.
 
