Exception propagation in C

Joe keane

If you need something that works in a wide variety of languages without glue
or other hassle, it sounds like simple return values are the way to go.

I guess the goal is to get rid of the compare-and-branch for each
operation? Then you have C++-style function calls [assume successful,
then fix things up with funky extra code if there is an error]. I think
if this is too costly you have other problems.

Also the machine already does something a lot like this. Assume the
branch is not taken, smash it, later fix the machine state back if you
find there was a mistake.

Maybe the real problem is that you have too many function calls?
Refactor the code, or use macros [or use inline], find the can't-fail
operations...

#define FOO_GET_X(FOO, X) ((X) = ((FOO) >> 4) & 0x1)
#define FOO_SET_X(FOO, X) ((FOO) = ((FOO) & ~(0x1 << 4)) | (((X) & 0x1) << 4))

James Harris

....
Hmm, sorry about that -- I thought I replied, but I can't find it now.  I
may have had the reply open when my computer crashed.

....

No worries. Thanks for replying.

....
Yes, that's what I mean.  Swapping on thread switch is only viable for
per-CPU storage, and if the whole point is to have a fixed virtual address,
that means you need a separate MMU context -- which is doable, but at that
point I'd question whether it's really worthwhile.

To answer I'll have to ditch the term "thread" which I used as it was
something people would be familiar with and instead use the terms I
really use, "task" and "address space." This starts to get into the
detail of the design and could go way off topic so I'll limit the info
to what hopefully will be enough to make sense of the exception-
indicator word which is the point being discussed.

In these terms, an "address space" is hopefully fairly self
explanatory. A "task" is basically a unit of scheduling. Unlike
threads in a process, tasks, by default, do not share resources.

When an address space is supporting just one task there is no
possibility of interference as there's nothing to interfere with.
Another task in another address space would have its own exception
word.

Where tasks execute in different address spaces they cannot interfere
with each other's exception word as it is in different memory. Address
X in one is not the same as address X in another.

Where a single address space supports multiple tasks things get a bit
more interesting.

If the multiple tasks are kept on one CPU they cannot interfere
because the task switcher changes the exception word on a task switch.
This is my preferred model. The address space is kept on one CPU
partly to avoid the cost of transferring updated data between CPU
caches across the system bus.

(In that scenario, although all the tasks in one address space are
restricted to running on one CPU at a time, the other CPUs would not
need to be idle as they could still be busy executing active tasks in
other address spaces.)

What about the final scenario where it really is a good idea to have
multiple tasks in the same address space running concurrently on
different CPUs? I can think of two potential solutions.

1. Use a different address for the exception word on each CPU. This
does involve pinning tasks to CPUs, more specifically the code of
tasks would need to include the specific address assigned for that
CPU. Fortunately my primary execution model uses a very late link step
- data values are resolved at load time - so the task_excep word and
its ilk could be resolved after the CPU has been chosen by the OS.

2. Alternatively, let the tasks share most address space but keep the
task-local data er, local to a CPU. If it would work this sounds quite
attractive.

As mentioned, my preferred option is to keep an address space on one
CPU at a time but the last situation may be resolvable where
necessary. Someone will probably tell me in no uncertain terms if it
is not.

A summary of what I recall of the rest of the reply:


It won't skip them as quickly as native compiler exception handling

That's what I thought at the start but considering the time that might
be needed to establish stack frames that the native exception handler
can unwind I'm having my doubts. A lot possibly depends on how many
try-type blocks there are as to which option would come out on top.

It is a moot point anyway as the original intention was to devise
something that would work in standard C. Changing a compiler or using
any compiler-specific facilities is out of scope.

-- but
yes, you'd need some glue code around language transitions unless all the
languages involved implement the same exception mechanism (asm could be made
to, others maybe with compiler extensions?).  How often are potentially
cross-language calls made, compared to component-internal calls?

Not very often, it's true. As a general construct, cross-language
calls are irrelevant. They may be very important for some of what I am
doing, though, where some of the code calls assembly routines. And
they simplify matters by making the interfaces consistent. (If only I
could do similarly for the calling conventions, eh!)

If you need something that works in a wide variety of languages without glue
or other hassle, it sounds like simple return values are the way to go.  You
can combine errors and successful return values as long as they have
disjoint ranges (often negative for errors, non-negative for success), so
for many functions you may not need an extra pointer parameter.  Perhaps you
could stick a full error report struct in true TLS (you wouldn't need to
access it in the no-error case, so it's not as speed-sensitive), but be
careful that such structs can't leak.
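
As a rough illustration of the disjoint-range convention described above, here is a minimal sketch. The function name find_byte and the error codes ERR_INVALID and ERR_NOT_FOUND are hypothetical and do not come from the thread; the only point is that negative values mean failure and non-negative values are real results.

```c
#include <stddef.h>

/* Hypothetical error codes: all negative, so they cannot collide with
   a valid (non-negative) result. */
enum { ERR_INVALID = -1, ERR_NOT_FOUND = -2 };

/* Returns the index of the first byte equal to key (>= 0), or a
   negative error code. Assumes len fits in an int. */
int find_byte(const unsigned char *buf, size_t len, unsigned char key)
{
    if (buf == NULL)
        return ERR_INVALID;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == key)
            return (int)i;
    }
    return ERR_NOT_FOUND;
}
```

A caller needs only one test on the result (r < 0 means error) and no extra pointer parameter.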

Yes, an object that explains what went wrong would have to be in true
TLS.

What would go in handler()?  I'd think you'd usually want to make the
decision locally about whether/how to handle or propagate and clean up from
the error.

The handlers wouldn't need to be the same. They could be the same or
different depending on need.

I've only occasionally felt the need to do this -- often I'd want to clean
up differently, or I wouldn't want to call b() if a() failed.

True in many cases. I have a piece of code I wrote just the other day
(using the exception word idea) which needed multiple pieces of
memory. It naturally fell out as

a = mem_alloc...
b = mem_alloc...
if (task_excep) goto excep_handler_mem;

Such a form may not be all that uncommon. Obtain a bunch of things
then check for an exception before using them.
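
The fragment above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: task_excep is taken to be a C11 _Thread_local int where 0 means "no exception", mem_alloc is a thin wrapper over malloc, and EXCEP_NOMEM, MEM_ALLOC_MAX and use_two_buffers are hypothetical names. The size cap exists only so that the failure path is easy to exercise.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption: a per-thread exception word; 0 means no exception. */
static _Thread_local int task_excep;

enum { EXCEP_NOMEM = 1 };                  /* hypothetical exception code */
#define MEM_ALLOC_MAX ((size_t)1 << 20)    /* hypothetical request cap */

/* Hypothetical wrapper: reports failure through task_excep so callers
   do not have to test each returned pointer. */
void *mem_alloc(size_t n)
{
    void *p = (n <= MEM_ALLOC_MAX) ? malloc(n) : NULL;
    if (p == NULL && task_excep == 0)
        task_excep = EXCEP_NOMEM;
    return p;
}

/* Obtains two buffers, then checks the exception word once.
   Returns 0 on success or the exception code after cleanup. */
int use_two_buffers(size_t na, size_t nb)
{
    int rc;
    void *a = mem_alloc(na);
    void *b = mem_alloc(nb);
    if (task_excep) goto excep_handler_mem;

    memset(a, 0, na);   /* stand-in for real work on the buffers */
    memset(b, 0, nb);
    free(a);
    free(b);
    return 0;

excep_handler_mem:
    rc = task_excep;
    task_excep = 0;     /* handled here, so clear the word */
    free(a);            /* free(NULL) is a no-op */
    free(b);
    return rc;
}
```

Because free(NULL) does nothing, the handler does not need to know which of the two allocations failed.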

(I have issues with what to represent as an exception if more than one
mem_alloc fails, though!)

Doing a load from memory followed by a compare and conditional branch will
also slow down the normal processing case.  A few ORs is well worth
eliminating a bunch of loads.

I don't agree with this at all! The exception word will normally load
from cache, L1 cache at that. The branch predictor, because a taken
exception-branch is rare, will predict not taken and the CPU will
carry on with the next instruction (which it already has read-in and
decoded). The sequence

cmp [task_excep], 0
jne handler

should thus virtually disappear in the normal processing case.

Don't fear the goto -- it's one of the nicer ways of doing error cleanup in
C that I've seen.  The Linux kernel uses it extensively.

Someone mentioned that - maybe you. I'm not sure I'd take the Linux
kernel as a good model (I don't like a lot of conditional compilation,
for example) but in general the reaction to my suggestion of using
goto has been far more positive than I expected.

Another thought just occurs to me. Perhaps the above ideas could be
combined so that instead of

<operation potentially causing an exception>
if (task_excep) goto handler_x;

we could have in some cases

<operation potentially causing an exception>
if (task_excep && handler_x()) return;

though that fails to skip following statements. Another option could
be some variation of

<operation potentially causing an exception>
if (task_excep) if (handler_x()) return;

with appropriate else clauses as required. Not sure. Although I
dislike using goto it does express the thought succinctly and if used
consistently would become idiomatic and thus easy to read.
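
One way the "handler decides whether to return" variant might look is sketched below. Everything except task_excep and handler_x is hypothetical (operation, run_steps, the EXCEP_ codes), and the convention that handler_x returns nonzero when the caller should give up is an assumption, not something fixed in the discussion.

```c
/* Assumption: a per-thread exception word; 0 means no exception. */
static _Thread_local int task_excep;

enum { EXCEP_RETRYABLE = 1, EXCEP_FATAL = 2 };   /* hypothetical codes */

/* Returns nonzero when the caller should stop and return. */
static int handler_x(void)
{
    if (task_excep == EXCEP_RETRYABLE) {
        task_excep = 0;   /* dealt with locally; carry on */
        return 0;
    }
    return 1;             /* leave the word set so it propagates */
}

/* Hypothetical operation that may raise an exception. */
static void operation(int code)
{
    if (code != 0)
        task_excep = code;
}

/* Runs two steps and returns how many completed. */
int run_steps(int first_code)
{
    int steps = 0;

    operation(first_code);
    if (task_excep && handler_x()) return steps;
    steps++;

    operation(0);
    if (task_excep && handler_x()) return steps;
    steps++;

    return steps;
}
```

When the handler returns nonzero, the exception word is left set, so the caller of run_steps still sees the exception.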

James

James Harris

I'm baffled as to why one would do that.

It's like using a global variable to pass information between a function
and a function it calls or the reverse.  Then you realize that it
doesn't work for multi-threaded and make it more complicated.

On the contrary. As proposed, this is a per-thread variable.

It still doesn't work when you get an exception, then the cleanup code
itself gets an exception.  [Of course, in some cases you just throw up
your hands and call abort(), but this often can be handled.]  Presumably
the callers want the -first- error code but you overwrote it.

If the cleanup code triggers an exception this is the situation that
was discussed earlier of what response is expected or best of three:
propagate the initial exception, propagate the new exception,
propagate both. The last has its own questions. In the proposed
model the abort-type call should not be necessary or used in any
circumstances.
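
Of the three responses listed above, "propagate the initial exception" is the simplest to sketch. The helpers task_raise and task_take below are hypothetical names, and task_excep is assumed to be a C11 _Thread_local int with 0 meaning "nothing pending". This shows one possible policy, not a settled part of the proposal.

```c
/* Assumption: a per-thread exception word; 0 means nothing pending. */
static _Thread_local int task_excep;

/* Hypothetical helper: record an exception unless one is already
   pending, so a failure during cleanup cannot overwrite the first. */
void task_raise(int code)
{
    if (task_excep == 0)
        task_excep = code;
}

/* Hypothetical helper: fetch and clear the pending exception. */
int task_take(void)
{
    int code = task_excep;
    task_excep = 0;
    return code;
}
```

With this policy a second task_raise made while cleaning up is silently dropped, which is exactly the trade-off the other two options are meant to avoid.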

If the stack works [or registers] why not use that?

The stack is a good option but does require explicit copying and
propagation of an exception indicator and may be slower to manage
where an exception has not occurred. Can you come up with a fast set
of instructions to manage exception indicators using the stack?
Registers may be a bad option as they are sometimes in short supply
and may be better used for general purpose use.

James

Joe keane

The stack is a good option but does require explicit copying and
propagation of an exception indicator and may be slower to manage
where an exception has not occurred. Can you come up with a fast set
of instructions to manage exception indicators using the stack?

Suppose you dedicate one register to be the 'alternate link'.

When you call a function, if there is no error, it should return to the
usual place (after the call instruction). If there is an error, it
returns to the alternate address. The alternate link can be set when
the function starts, or different for each calling site, as PC-relative.

On return from function, you can say use normal, use alternate, or test
r0 and let that decide.

This lets you write normal C code. It's just an optimization. When
people complain about a compare-and-branch for every function call, you
say 'no it isn't' and when they ask you wave your hands and say 'magic'.

Registers may be a bad option as they are sometimes in short supply
and may be better used for general purpose use.

That may be true. I think it's easier to describe it with registers,
then how to do it with stack locations is straightforward. Which is
better, dunno.