Exception propagation in C

James Harris

Below is a suggestion for handling - more accurately, propagating -
exceptions in C. It is intended for overall program performance though
it goes against a preference of many including me. Can anyone come up
with a faster or better way to propagate exceptions? What other ways
are there to handle exceptions in C? What do you do? (I tend to detect
exceptions where they occur by checking return values but I have never
had a good way in C to get them to propagate.)

The issue is the desire to have exception propagation using *standard*
C. I'm not talking principally about detecting exceptions but of a way
for nested functions to signal an exception and have that exception
propagate back along the call chain to a function that is set up to
handle it.

* Option 1. Use setjmp/longjmp. Not too good as, to be compatible,
they need to be used in an if or a switch statement IIRC which can be
inconvenient. Don't they also need a separate jump buffer for each
active setjmp? I have used them in the past but found them awkward.

* Option 2. Traditional exception handling. As used in other languages
this unwinds the stack of function calls looking for a handler. I
cannot think of a way to implement that in C. Furthermore, it might be
slow for two reasons: 1) the need to push and pop sufficient context
for the exception handler to recognise where a given exception should
be caught, 2) the consequent loss of sync between the stack and the
CPU's branch prediction for returns. (A return would go back to
somewhere other than where it was called from potentially impacting
the performance of the rest of the calls on the call stack.) The
latter is OK if exceptions are infrequent but is not generally
applicable.

* Option 3. The function detecting an exception creates an exception
object and sets an exception type in a thread-specific variable. Any
calling functions, after each call (or group of calls) test that
variable. If non-zero they jump to a local exception handler if there
is one or jump to a return from the current function if not. This does
involve a test and branch for each call but that is very fast (fusible
and single cycle) and it avoids the need for exception handling
context to be pushed and popped. It would also preserve the branch
prediction of calls and returns.

Unfortunately option 3 would mean use of goto. I haven't used goto in
C code for many years and don't like the idea of starting now but I
can't think of anything quicker. In C this could be

temp = func();
if (thread_excep) goto exception_handler;
x = temp;

In x86-32 this should become something like

cmp [thread_excep], 0
jne exception_handler
mov x, eax

If set up to do so the exception handler would handle the exception
itself and clear thread_excep. If not or if it decided to propagate
the exception it would return to its caller with the thread_excep word
still holding the exception value. Or it could alter thread_excep and
return.
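
A minimal sketch of the scheme described above, assuming a single-threaded program so that a plain static int can stand in for the per-thread word. The function names (parse_digit, sum_digits, sum_or_default) are hypothetical and only illustrate three levels of a call chain: one that signals, one that cleans up and propagates, and one that handles.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names; single-threaded stand-in for the per-thread word. */
static int thread_excep = 0;              /* zero means no exception pending */

/* Lowest level: detects the problem and records it. */
static int parse_digit(char c)
{
    if (c < '0' || c > '9') {
        thread_excep = 1;                 /* signal the exception */
        return 0;                         /* dummy value, callers ignore it */
    }
    return c - '0';
}

/* Middle level: no handler of its own, so it releases its resource and
   returns with thread_excep still set. */
static int sum_digits(const char *s)
{
    int total = 0;
    int temp;
    char *scratch;

    assert(s != NULL);
    scratch = malloc(64);                 /* stands in for any local resource */
    for (; *s != '\0'; s++) {
        temp = parse_digit(*s);
        if (thread_excep) goto cleanup;   /* test and branch after the call */
        total += temp;
    }
cleanup:
    free(scratch);
    return total;
}

/* Top level: handles the exception and clears the word. */
static int sum_or_default(const char *s, int fallback)
{
    int temp = sum_digits(s);
    if (thread_excep) goto exception_handler;
    return temp;

exception_handler:
    thread_excep = 0;                     /* handled: propagation stops here */
    return fallback;
}
```

Calling sum_or_default("12x", -1) takes the exceptional path through all three functions; the middle function still frees its buffer because the goto lands on its cleanup label rather than returning directly.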

There are subtleties but option 3 is the basic idea. Anyone know of
something better?

James
 
jacob navia

On 29/12/11 18:52, James Harris wrote:
* Option 2. Traditional exception handling. As used in other languages
this unwinds the stack of function calls looking for a handler. I
cannot think of a way to implement that in C.

This can be done in C of course. Look at the exception handling code in
gcc. It implements an abstract machine that gets opcodes from a series
of tables generated by the compiler.

For more information look at the DWARF debugging info specs, where
the stack unwinding machine is described in detail. These are the
specs, not to be confused with the implementation of those specs
by gcc (the only compiler that uses this monster) that are
slightly different.

You should also look at the code of gcc. Specifically look in the
"except.c" file and in the unwind-xx-fde files. There you will
find the implementation.

Furthermore, it might be
slow for two reasons: 1) the need to push and pop sufficient context
for the exception handler to recognise where a given exception should
be caught,

That is the exception handling object. But before you go any further,
think a bit:

SPEED IS NOT IMPORTANT HERE

The program has just crashed, and more often than not this is NOT the
regular path of the application. So, stop worrying about speed now.

2) the consequent loss of sync between the stack and the
CPU's branch prediction for returns.

So what? The pipeline will be flushed, but that is not important at all
since (I repeat) SPEED IS NOT IMPORTANT HERE.

(A return would go back to
somewhere other than where it was called from potentially impacting
the performance of the rest of the calls on the call stack.) The
latter is OK if exceptions are infrequent but is not generally
applicable.

Look. This is a quite researched field, and several full
implementations exist. Before starting spewing ideas that
are nothing more than a few hours of thought

1) Read the literature and documentations of the existing
implementations. Do NOT think you will run into the
GREAT idea that all others didn't see.

2) Study the code of gcc and the open implementations of the
exception handling implementations published by Microsoft and
Apple. (Apple is mostly gcc anyway).

3) Choose one model implementation and implement that. See how
it works within your context.

If you are writing an OS forget about C and remember that your OS
should support several languages smoothly, not only just C.
 
BGB

On 29/12/11 18:52, James Harris wrote:

This can be done in C of course. Look at the exception handling code in
gcc. It implements an abstract machine that gets opcodes from a series
of tables generated by the compiler.

For more information look at the DWARF debugging info specs, where
the stack unwinding machine is described in detail. These are the
specs, not to be confused with the implementation of those specs
by gcc (the only compiler that uses this monster) that are
slightly different.

You should also look at the code of gcc. Specifically look in the
"except.c" file and in the unwind-xx-fde files. There you will
find the implementation.

simpler is, of course, to do it more like Win32 SEH, which basically
just captures the state (stack position, registers, ...) at each handler
frame (state is captured, linked into a linked-list, and unlinked on
return).

in this case, passing control to a frame essentially restores the state
at this point, and invokes the handler logic.
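
A rough sketch of that kind of frame list in portable C, with setjmp standing in for the state capture. This is only an analogy to the SEH approach, not how Win32 implements it; the struct, the filter callbacks and the exception codes are all hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

enum { EXC_IO = 1, EXC_INDEX = 2 };       /* hypothetical exception codes */

/* One record per active handler frame, linked newest first. */
struct handler_frame {
    struct handler_frame *prev;           /* next older frame */
    jmp_buf state;                        /* captured execution state */
    int (*filter)(int code);              /* non-zero: this frame accepts */
};

static struct handler_frame *frame_list = NULL;

static void frame_link(struct handler_frame *f, int (*filter)(int code))
{
    f->filter = filter;
    f->prev = frame_list;
    frame_list = f;
}

static void frame_unlink(struct handler_frame *f)
{
    frame_list = f->prev;
}

/* Pass control to the newest frame whose filter accepts the code. */
static void raise_exception(int code)
{
    struct handler_frame *f;

    assert(code != 0);                    /* 0 is what a direct setjmp returns */
    for (f = frame_list; f != NULL; f = f->prev) {
        if (f->filter(code)) {
            frame_list = f->prev;         /* f and everything newer is dropped */
            longjmp(f->state, code);      /* restore the state saved at f */
        }
    }
    abort();                              /* nobody accepted the exception */
}

static int accept_io(int code)  { return code == EXC_IO; }
static int accept_any(int code) { (void)code; return 1; }

/* Returns 1 when the exception is handled here. */
static int inner(int code)
{
    struct handler_frame f;

    frame_link(&f, accept_io);
    if (setjmp(f.state) == 0)
        raise_exception(code);            /* does not return */
    return 1;                             /* reached only through longjmp */
}

/* Returns inner's result, or 2 when the exception is handled here. */
static int outer(int code)
{
    struct handler_frame f;
    int result;

    frame_link(&f, accept_any);
    if (setjmp(f.state) == 0) {
        result = inner(code);
        frame_unlink(&f);                 /* normal return: unlink the frame */
        return result;
    }
    return 2;
}
```

With these definitions outer(EXC_IO) is absorbed by the inner frame, while outer(EXC_INDEX) skips the inner frame's filter and lands in the outer one. The unwinder never needs to know the layout of the frames it skips, which is the property described above.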


proper frame unwinding (like used with GCC and Win64) does work, and is
a little cheaper (when no exceptions are thrown), but is a bit more
complex and requires compiler support (debug-info, or, in the case of
Win64, specially-formed frame prologues/epilogues and some special info
located in a table within a special section).


I had written some (never really used) ideas for my own code-generators,
namely using a prologue/epilogue system similar to Win64 and having
linked annotation metadata (exploiting the use of multi-byte NOP
instructions, which can also be used to encode pointers to things).

more practically though, I just stuck with a SEH-like system, since then
the unwinder code does not have to know anything about how to unwind the
stack or similar (or cross language/compiler boundaries), it just
saves/restores whatever.

That is the exception handling object. But before you go any further,
think a bit:

SPEED IS NOT IMPORTANT HERE

The program has just crashed, and more often than not this is NOT the
regular path of the application. So, stop worrying about speed now.

yep, agreed.
despite some people using exceptions as a control-flow feature, this
isn't really their intended or ideal use.

2) the consequent loss of sync between the stack and the

So what? The pipeline will be flushed, but that is not important at all
since (I repeat) SPEED IS NOT IMPORTANT HERE.

I think he was expecting something like "all branch predictions for all
subsequent returns will fail", which will not actually happen (given how
the CPU works).

it will flush the pipeline, maybe affecting several instructions and
costing several clock cycles, but really?... a mispredicted "if()", a
call through a function pointer, or invoking a "switch()" which uses a
jump-table, all have the same sorts of costs here.

Look. This is a quite researched field, and several full
implementations exist. Before starting spewing ideas that
are nothing more than a few hours of thought

1) Read the literature and documentations of the existing
implementations. Do NOT think you will run into the
GREAT idea that all others didn't see.

2) Study the code of gcc and the open implementations of the
exception handling implementations published by Microsoft and
Apple. (Apple is mostly gcc anyway).

3) Choose one model implementation and implement that. See how
it works within your context.

yep.


If you are writing an OS forget about C and remember that your OS
should support several languages smoothly, not only just C.

this doesn't entirely follow.

ideally, one can have an interface which works fairly well for whatever
languages work on the system.

however, sadly, IRL it is often not so nice:
typically language-specific exceptions and OS exceptions are disjoint.

say, Win32 SEH vs C++ or Java exceptions, ...
typical result is then that they don't really cleanly cross language
boundaries or show up in a "sane" manner (things don't necessarily
unwind in a sane order, often one system is invisible to another, ...).

at least on Win64, IIRC Win64 SEH / MSVC++ / .NET all use the same basic
mechanism.

typically, this means some amount of ugliness to deal with these cases.


luckily, at least, C is sort of a "least common denominator".

better would probably be a solid ABI-level mechanism (which applies to C
and any other language which is compiled to native code on the OS).

but, alas...
 
Kaz Kylheku

Below is a suggestion for handling - more accurately, propagating -
exceptions in C. It is intended for overall program performance though
it goes against a preference of many including me. Can anyone come up
with a faster or better way to propagate exceptions? What other ways
are there to handle exceptions in C? What do you do? (I tend to detect
exceptions where they occur by checking return values but I have never
had a good way in C to get them to propagate.)

The issue is the desire to have exception propagation using *standard*
C. I'm not talking principally about detecting exceptions but of a way
for nested functions to signal an exception and have that exception
propagate back along the call chain to a function that is set up to
handle it.

* Option 1. Use setjmp/longjmp. Not too good as, to be compatible,
they need to be used in an if or a switch statement IIRC which can be
inconvenient. Don't they also need a separate jump buffer for each
active setjmp? I have used them in the past but found them awkward.

* Option 2. Traditional exception handling. As used in other languages
this unwinds the stack of function calls looking for a handler. I
cannot think of a way to implement that in C.

Option 2 can be done using Option 1, with some macros.
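
To make that concrete, here is one way such macros might look. This is a generic sketch and not the Ethereal/Wireshark or TXR code; the macro names TRY, CATCH and END_TRY, the try_frame struct and the integer exception codes are assumptions made for the example. setjmp is used only in the controlling expression of an if, which is one of the contexts the C standard permits.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

struct try_frame { struct try_frame *prev; jmp_buf env; };

static struct try_frame *try_top = NULL;  /* innermost active TRY */
static int thrown_code = 0;               /* code of the exception in flight */

/* Leaving a TRY body with return, break or goto would skip the pop. */
#define TRY          do { struct try_frame frame_;                     \
                          frame_.prev = try_top; try_top = &frame_;    \
                          if (setjmp(frame_.env) == 0) {
#define CATCH(var)            try_top = frame_.prev;                   \
                          } else { int var = thrown_code;
#define END_TRY           } } while (0)

static void throw_code(int code)
{
    struct try_frame *f = try_top;

    assert(code != 0);                    /* 0 is kept for "nothing thrown" */
    if (f == NULL) abort();               /* uncaught exception */
    try_top = f->prev;                    /* handler runs in the outer context */
    thrown_code = code;
    longjmp(f->env, 1);
}

enum { EXC_DIV_ZERO = 7 };                /* hypothetical exception code */

static int checked_div(int a, int b)
{
    if (b == 0) throw_code(EXC_DIV_ZERO);
    return a / b;
}

static int safe_ratio(int a, int b)
{
    volatile int result = 0;              /* written between setjmp and longjmp */

    TRY {
        result = checked_div(a, b);
    } CATCH(code) {
        result = -code;
    } END_TRY;
    return result;
}
```

Functions between the throw and the TRY need no changes at all, which is the main attraction; the price is the jmp_buf setup on every TRY and the restriction on leaving the TRY body early.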

It can be done, and I have done it, as have others. Years ago I wrote an
exception handling module that came to be used in Ethereal (now called
Wireshark) which is written in C. Maybe that code is still used. If a packet
is captured by Wireshark which is truncated (does not parse properly according
to its protocol) the packet dissectors throw an exception.

More recently, I made setjmp-based exception handling in a language
called TXR, more sophisticated than that old one.

See here:

http://www.kylheku.com/cgit/txr/tree/unwind.c
http://www.kylheku.com/cgit/txr/tree/unwind.h

This can't be easily lifted into other C programs, though, because
of the reliance on the "val" type, which is a dynamically typed value.

the need to push and pop sufficient context
for the exception handler to recognise where a given exception should
be caught, 2) the consequent loss of sync between the stack and the
CPU's branch prediction for returns.

Any exception handling that isn't integrated into your compiler is going to be
slow in some way compared to a "state of the art" implementation that is
properly integrated into a language. That is almost a given.

* Option 3. The function detecting an exception creates an exception
object and sets an exception type in a thread-specific variable. Any
calling functions, after each call (or group of calls) test that
variable.

Testing a global error status variable is not exception handling,
even if you call that variable "current_exception" or whatever
gives you a feeling that the approach is "exceptional".

Exception handling was invented specifically to avoid crufty
code like this.

It becomes an exception handling strategy if you can hide those tests from the
programmer (i.e. have the compiler generate them).

If non-zero they jump to a local exception handler if there
is one or jump to a return from the current function if not. This does
involve a test and branch for each call but that is very fast (fusible
and single cycle) and it avoids the need for exception handling
context to be pushed and popped. It would also preserve the branch
prediction of calls and returns.

But, at least, functions which are not interested in exception handling do not
have to push and pop contexts (unless they have local resources that need
unwinding!)

Under Option 3, all functions have to cooperate, because any normal return is
suspected of occurring in an exceptional situation. If an exceptional return is
happening, then subsequent statements in a function must not be executed. Every
function must check and return.

This is likely going to be a loser in the average cases, not to mention a
nightmare to maintain.

Furthermore, your scheme won't work when the call stack contains frames from
third-party libraries, even if those libraries don't have any local resources
that need cleanup. Longjmp-based exceptions can work through third party call
frames when those frames have no local resources that must be cleaned up.

In any good exception handling implementation, functions that are not involved
with exception handling or unwinding do not "pay" any cost.

The tradeoff is played out between the cost of searching for the exit point
and doing the control transfer, and the cost of setting up a catch or unwind
handler.

In general, if you make it very cheap to set up the catch, then the machine has
to work harder in processing a throw, to identify where to go and to restore
the state properly.

There are subtleties but option 3 is the basic idea. Anyone know of
something better?

I can't think of anything that isn't better.

Oh, here is one: it's not worse than a big hammer on a spring popping out of
your computer's case and whacking you on the head when an error occurs!

The really silly thing is that you're worried about shaving machine cycles on
comparisons or branch prediction, but you're overlooking what it costs to do a
thread-local lookup, which this scheme requires at every function call
boundary, just in case an exception is happening.

(You can pass the address of that variable as an extra argument among your
functions wherever that is feasible, I suppose: yet more cruft.)
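
For illustration, the extra-argument variant might look like the sketch below. The excep_t type and the function names are hypothetical. The exception word is an ordinary local in the top-level function, so no thread-local lookup is involved, at the cost of one more parameter on every function in the chain.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned excep_t;                 /* hypothetical exception word */

static int element_at(const int *a, size_t n, size_t i, excep_t *ex)
{
    assert(ex != NULL);
    if (i >= n) {
        *ex = 1;                          /* signal through the caller's word */
        return 0;
    }
    return a[i];
}

/* Intermediate function: forwards the pointer and tests the word after
   each call, without touching any thread-local storage. */
static int sum_pair(const int *a, size_t n, size_t i, size_t j, excep_t *ex)
{
    int first, second;

    first = element_at(a, n, i, ex);
    if (*ex) return 0;
    second = element_at(a, n, j, ex);
    if (*ex) return 0;
    return first + second;
}

/* Top level: the word is an ordinary local, so each thread that calls
   this function gets its own copy. */
static int sum_pair_or(size_t i, size_t j, int fallback)
{
    static const int sample[3] = { 1, 2, 3 };
    excep_t ex = 0;
    int total = sum_pair(sample, 3, i, j, &ex);

    return ex ? fallback : total;
}
```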
 
Rod Pemberton

James Harris said:
Below is a suggestion for handling - more accurately, propagating -
exceptions in C. It is intended for overall program performance though
it goes against a preference of many including me. Can anyone come up
with a faster or better way to propagate exceptions? What other ways
are there to handle exceptions in C? What do you do? (I tend to detect
exceptions where they occur by checking return values but I have never
had a good way in C to get them to propagate.)

James, my reply to this was posted to alt.os.development.


Rod Pemberton
 
BartC

* Option 1. Use setjmp/longjmp. Not too good as, to be compatible,
* Option 2. Traditional exception handling. As used in other languages
* Option 3. The function detecting an exception creates an exception

What sort of syntax did you have in mind to make use of this?

Presumably something will be needed at the top level, to let the exception
handler know where to return to. And something in the lower level to invoke
the exception (I assume this will be the result of some test or failure in
software).

I don't have any ideas, just interested in how this might be done myself.
Particularly of what clean-up is (or isn't) done of the intermediate
functions which may have allocated or opened resources that need taking care
of.

Unfortunately option 3 would mean use of goto.

I suspect some of the solutions will involve a lot worse than goto.

(FWIW I achieve something similar (error recovery) by using inline assembly
then:

o Saving Stack, Frame registers, and storing a recovery label address (in
this same function) in a global variable

o Starting the process (calling the functions) where the error might occur

o When the error occurs, jumping to the recovery point (more inline
assembly), where the Stack and Frame registers are restored.

No resources are released here either, but the program is usually about to
end anyway so is not critical.)
 
jacob navia

On 29/12/11 22:02, Kaz Kylheku wrote:
I can't think of anything that isn't better.

The really silly thing is that you're worried about shaving machine cycles on
comparisons or branch prediction, but you're overlooking what it costs to do a
thread-local lookup, which this scheme requires at every function call
boundary, just in case an exception is happening.

This is a consequence of ignoring the literature, not studying the
subject matter, and thinking that you are going to find the smart idea
that all others missed.

The crucial point (that you mention) is that it needs to be done in the
compiler, not in each user program. Since he has chosen a "lightweight"
C compiler (gcc) he doesn't need to do anything if he uses the one
in gcc, but probably he is not aware of that fact, or, for unknown
reasons, he doesn't want to use gcc's.
 
James Harris

On 12/29/2011 11:52 AM, James Harris wrote:
....


it depends on what exactly one means by "standard".
I will assume here "can work with a typical C compiler on a typical
target without the need/ability to modify said compiler to make it work".

Yes, that's the intention.

....
TLS variables are not standard C (but are supported by MSVC and GCC).
potentially though one could implement them as status-getter function
calls, which is allowed in standard C.

I can't see TLS as C or not C. Wouldn't TLS be a characteristic of the
environment in which the C code runs?

....
more so, assuming one had TLS variables, one can't typically access them
in a single instruction.

I was thinking of thread_excep being at a fixed location so it could
literally be checked with one instruction:

cmp [thread_excep], 0

more typically it is some magic like:
mov ecx, [fs:magicOffset]
mov edx, [__myVariable_index]
mov eax, [ecx+edx*4]

and eax, eax
jnz ...

on some systems, they amount to an internal function call.

As shown, with static linking

if (thread_excep)

should compile to

cmp [thread_excep], 0

unclear how this would work exactly.

I'll try to explain what I have in mind. There would be a machine word
(which I've called thread_excep above) which would be initialised to
zero. A non-zero value would be set in it to indicate that an
exception had occurred. I'm not sure how best to use the word but one
option is to use different bits to indicate exception classes: data
exception, index exception, io exception, user exception etc.

In the main body of the code exceptions would be detected by

if (thread_excep) goto handler_2;

Within a function there would be as many handlers as are needed. Say
we wanted to deal with only index-type exceptions in this module. We
could code

handler_2:
    if (thread_excep & EXCEP_INDEX) {
        /* Deal with an index exception */
        thread_excep &= ~EXCEP_INDEX;
        /* Delete the exception object */
        /* Next step: return, go on to next iteration, whatever */
    }
    else return 0;

In the else path the value of thread_excep has not been changed so
whatever called this routine will get a chance to handle the
exception(s).
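
Filled out into something compilable, the idea might look like the sketch below. The bit values, the fetch and fetch_clamped functions and the "clamp to the nearest valid index" recovery are assumptions added for the example, and the exception object is left out. thread_excep is a plain static here, standing in for the per-thread word.

```c
#include <assert.h>
#include <stddef.h>

#define EXCEP_DATA  0x1u                  /* hypothetical class bits */
#define EXCEP_INDEX 0x2u
#define EXCEP_IO    0x4u

static unsigned thread_excep = 0;         /* stand-in for the per-thread word */

static int fetch(const int *a, int n, int i)
{
    assert(n >= 0);
    if (a == NULL)       { thread_excep |= EXCEP_DATA;  return 0; }
    if (i < 0 || i >= n) { thread_excep |= EXCEP_INDEX; return 0; }
    return a[i];
}

/* Handles index exceptions only; any other class is left in
   thread_excep for the caller to deal with. */
static int fetch_clamped(const int *a, int n, int i)
{
    int temp = fetch(a, n, i);
    if (thread_excep) goto handler_2;
    return temp;

handler_2:
    if (thread_excep & EXCEP_INDEX) {
        thread_excep &= ~EXCEP_INDEX;             /* handled here */
        return fetch(a, n, i < 0 ? 0 : n - 1);    /* retry with a valid index */
    }
    else return 0;                                /* propagate unchanged */
}

static int demo_fetch(int i)
{
    static const int sample[3] = { 5, 6, 7 };
    return fetch_clamped(sample, 3, i);
}
```

demo_fetch(10) recovers locally and yields the last element, whereas fetch_clamped(NULL, 3, 0) returns with EXCEP_DATA still set because this function has no handler for that class.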

Does that make sense? I'd welcome corrections or improvements.

James
 
James Harris

What sort of syntax did you have in mind to make use of this?

No special syntax. Just standard C.

Presumably something will be needed at the top level, to let the exception
handler know where to return to.

Not as such. Each routine would decide what to do. In the general case
there would need to be some default response at the top level.

And something in the lower level to invoke
the exception (I assume this will be the result of some test or failure in
software).

Yes, the lower level that detected the exception would create an
object to describe it. I've no idea yet what format the exception
object would take. I need to find out what others have done.

I don't have any ideas, just interested in how this might be done myself.
Particularly of what clean-up is (or isn't) done of the intermediate
functions which may have allocated or opened resources that need taking care
of.

The beauty of it is that each function would be responsible for
cleaning up its own resources.

No, the main beauty of it is that it will work with different
languages and even between languages. That's particularly important
for what I have in mind.

Check out the example code I posted in response to BGB.

I suspect some of the solutions will involve a lot worse than goto.

:) I'm beginning to see this is very true.

James
 
James Harris

....

http://www.kylheku.com/cgit/txr/tree/unwind.c
http://www.kylheku.com/cgit/txr/tree/unwind.h

This can't be easily lifted into other C programs, though, because
of the reliance on the "val" type, which is a dynamically typed value.

Nevertheless, ISTM that code is very good.

....
Testing a global error status variable is not exception handling,
even if you call that variable "current_exception" or whatever
gives you a feeling that the approach is "exceptional".

Exception handling was invented specifically to avoid crufty
code like this.

It becomes an exception handling strategy if you can hide those tests from the
programmer (i.e. have the compiler generate them).

We have to work with what we have. I'm not looking to alter a language
or a compiler.

In fact a great thing about the scheme, IMHO, is that it will work
between languages. A routine in language A could call a routine in
language B ... could call a routine in language Z. Each could work
with the exception indication of the others. None would have to know
what other languages were involved.

But, at least, functions which are not interested in exception handling do not
have to push and pop contexts (unless they have local resources that need
unwinding!)

Right, there is no pushing and popping of contexts.

Under Option 3, all functions have to cooperate, because any normal return is
suspected of occurring in an exceptional situation. If an exceptional return is
happening, then subsequent statements in a function must not be executed. Every
function must check and return.

Well, the exception status could be checked whenever needed, probably
after each call but if appropriate it could be checked after a group
of calls as it would be 'sticky'.
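
A small sketch of the "sticky" check after a group of calls. It rests on one assumption that is not spelled out above: each callee returns immediately when an exception is already pending, since otherwise the later calls in the group would run on the results of a failed one. The names step, run_group and steps_run are hypothetical.

```c
#include <assert.h>

static unsigned thread_excep = 0;         /* stand-in for the per-thread word */
static int steps_run = 0;                 /* counts steps that did real work */

static void step(int ok)
{
    if (thread_excep) return;             /* assumption: skip work while pending */
    steps_run++;
    if (!ok) thread_excep = 1;            /* this step fails */
}

/* Three calls, one test. Returns 0 on success, -1 if any step failed. */
static int run_group(int ok1, int ok2, int ok3)
{
    assert(thread_excep == 0);            /* group starts with nothing pending */
    steps_run = 0;
    step(ok1);
    step(ok2);
    step(ok3);
    if (thread_excep) {                   /* single check for the whole group */
        thread_excep = 0;
        return -1;
    }
    return 0;
}
```

The saving in the caller is paid for by a test at the top of each callee, so the total number of tests does not necessarily go down.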

Yes, every function would have to check.

This is likely going to be a loser in the average cases, not to mention a
nightmare to maintain.

It is much faster than pushing and popping contexts.

Furthermore, your scheme won't work when the call stack contains frames from
third-party libraries, even if those libraries don't have any local resources
that need cleanup. Longjmp-based exceptions can work through third party call
frames when those frames have no local resources that must be cleaned up.

True. The exception status would pass through such code, though.

....
In general, if you make it very cheap to set up the catch, then the machine has
to work harder in processing a throw, to identify where to go and to restore
the state properly.

True. IIRC there's a good write-up on the tradeoffs in books such as
Programming Language Pragmatics and Principles of Programming
Languages.

....
The really silly thing is that you're worried about shaving machine cycles on
comparisons or branch prediction, but you're overlooking what it costs to do a
thread-local lookup, which this scheme requires at every function call
boundary, just in case an exception is happening.

I don't think you understand what I have in mind. If I have it right
checking the exception indication would be just one instruction
testing a word that is probably cached in L1. It could hardly be
faster.

James
 
Eric Sosman

Below is a suggestion for handling - more accurately, propagating -
exceptions in C. It is intended for overall program performance though
it goes against a preference of many including me. Can anyone come up
with a faster or better way to propagate exceptions? What other ways
are there to handle exceptions in C? What do you do? (I tend to detect
exceptions where they occur by checking return values but I have never
had a good way in C to get them to propagate.)

Maybe this is clear to you already, but the words you use leave
me in some doubt: In pondering the performance of an exception scheme
one should not worry about how fast exceptions are handled, but about
how fast non-exceptions are not-handled. If you make a million calls
to malloc() and get an exception on one of them, it's the performance
of the other 999,999 that you need to consider.

They're called "exceptions," not "usuals."

The issue is the desire to have exception propagation using *standard*
C. I'm not talking principally about detecting exceptions but of a way
for nested functions to signal an exception and have that exception
propagate back along the call chain to a function that is set up to
handle it.

* Option 1. Use setjmp/longjmp. Not too good as, to be compatible,
they need to be used in an if or a switch statement IIRC which can be
inconvenient. Don't they also need a separate jump buffer for each
active setjmp? I have used them in the past but found them awkward.

This is how all the "pure C" exception frameworks I've seen have
worked. Yes, it requires a dedicated jmp_buf for each target. It
can also require that some variables in the setjmp() caller be made
`volatile', which is easy to forget to do *and* may prevent some of
the optimizer's more imaginative transformations -- that is, it may
slow down the more important, non-exception path.
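
A short illustration of the `volatile' point, with hypothetical names. The C standard leaves the value of a non-volatile automatic variable indeterminate after longjmp() if it was modified between the setjmp() and the longjmp(), so the counter below has to be volatile for the read after the jump to be reliable.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf handler;                   /* the dedicated jmp_buf for this target */

static void fail(void)
{
    longjmp(handler, 1);
}

/* Counts loop iterations completed before a failure at index fail_at. */
static int count_before_failure(int fail_at)
{
    volatile int done = 0;                /* modified after setjmp, read after longjmp */
    int i;

    assert(fail_at >= 0);
    if (setjmp(handler) == 0) {
        for (i = 0; i < 10; i++) {
            if (i == fail_at) fail();
            done = done + 1;
        }
    }
    return done;                          /* reliable only because done is volatile */
}
```

The loop index i is also modified after the setjmp(), but it is never read after the jump, so it does not need the qualifier.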
* Option 2. Traditional exception handling. As used in other languages
this unwinds the stack of function calls looking for a handler. I
cannot think of a way to implement that in C. Furthermore, it might be
slow for two reasons: 1) the need to push and pop sufficient context
for the exception handler to recognise where a given exception should
be caught, 2) the consequent loss of sync between the stack and the
CPU's branch prediction for returns. (A return would go back to
somewhere other than where it was called from potentially impacting
the performance of the rest of the calls on the call stack.) The
latter is OK if exceptions are infrequent but is not generally
applicable.

The only portable ways to unwind the stack are to return or to
use longjmp(). With the latter there's no way to unwind the stack
frames one by one, nor to inspect what's being unwound: You just
longjmp() to the next-higher handler, do whatever local cleanup's
needed, longjmp() again to a handler higher still, and so on until
somebody's local cleanup says "This exception has propagated far enough;
I'm absorbing it."
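
That hop-by-hop pattern could be sketched as follows, with a single pointer to the innermost active jmp_buf. All names are hypothetical, and the exception code travels in a separate variable so that setjmp() is only ever used in the controlling expression of an if.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf *current_handler = NULL;   /* innermost active handler */
static int pending_code = 0;              /* code of the exception in flight */
static int cleanups_run = 0;              /* counts exceptional cleanups */

static void raise_code(int code)
{
    assert(code != 0);
    if (current_handler == NULL) abort(); /* nobody to jump to */
    pending_code = code;
    longjmp(*current_handler, 1);
}

static void leaf(int fail)
{
    if (fail) raise_code(42);
}

/* Owns a resource: catches only to clean up, then jumps higher. */
static void middle(int fail)
{
    jmp_buf here;
    jmp_buf *outer = current_handler;     /* set before setjmp, never changed */
    char *buf = malloc(16);

    current_handler = &here;
    if (setjmp(here) == 0) {
        leaf(fail);
        current_handler = outer;          /* normal path */
        free(buf);
        return;
    }
    current_handler = outer;              /* exceptional path: local cleanup */
    free(buf);
    cleanups_run++;
    raise_code(pending_code);             /* longjmp again, one level up */
}

/* Absorbs the exception. Returns 0 on success or the exception code. */
static int top(int fail)
{
    jmp_buf here;
    jmp_buf *outer = current_handler;

    current_handler = &here;
    if (setjmp(here) == 0) {
        middle(fail);
        current_handler = outer;
        return 0;
    }
    current_handler = outer;              /* propagated far enough */
    return pending_code;
}
```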

Again, I feel you're worrying about performance in a situation
where performance is not very important.

* Option 3. The function detecting an exception creates an exception
object and sets an exception type in a thread-specific variable. Any
calling functions, after each call (or group of calls) test that
variable. If non-zero they jump to a local exception handler if there
is one or jump to a return from the current function if not. This does
involve a test and branch for each call but that is very fast (fusible
and single cycle) and it avoids the need for exception handling
context to be pushed and popped. It would also preserve the branch
prediction of calls and returns.

Isn't this just an `errno' equivalent, with a pointer instead
of an `int'? In "The Standard C Library," PJ Plauger writes

Nobody likes errno or the machinery that it implies. I can't
recall anybody defending this approach to error reporting,
not in two dozen or more meetings of X3J11, the committee
that developed the C Standard. Several alternatives were
proposed over the years. At least one faction favored simply
discarding errno.

I'd hesitate before imitating an interface nobody likes ...

Also, the performance probably suffers. For all its drawbacks,
the function-returns-a-distinguishable-value-on-failure approach has
the advantage that the value is very likely in a CPU register when
you need to test it, not in a global variable somewhere off in RAM.
[...] Anyone know of
something better?

I've seen various exception-like frameworks overlaid on C, some
down-and-dirty, some fairly elaborate. All of them, however useful
for the particular task at hand, suffered from lack of support by
the language itself: There's just no way to tell the compiler that
a function might throw an exception, which means the compiler can't
tell the programmer when he's forgotten to catch one. Somebody opens
a file, obtains values from various accessor functions and writes them
to the file, closes the stream, and returns -- except that one of the
accessors throws an exception that gets caught somewhere way up the
stack, and the file-writing function never closes its stream ...

Errors of this kind can be prevented by programmer discipline,
but I've seen no scheme that will help keep the programmer on the
strait and narrow path. This is sad, because it means an exception
mechanism that was supposed to make the programmer's job easier can
wind up making it more burdensome -- and more perilous.

Exceptions aren't something you can just bolt on to a language.
Well, actually, you *can*: people have bolted them on to C, over and
over again. Yet none of the various schemes has caught on, which I
think should cause you to ponder. Maybe it's because bolting on is
a poor substitute for baking in.
 
BGB

Yes, that's the intention.

...


I can't see TLS as C or not C. Wouldn't TLS be a characteristic of the
environment in which the C code runs?

...

TLS is typically stored separately from typical global variables, and is
generally indicated via the use of special modifiers ("__thread" or
"__declspec(thread)", say, in GCC or MSVC), however, doing so is not
allowed in standard C (these are extensions).

typically, the threading library will also provide function-calls to
get/set TLS variables, which are a little more standard.
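
a sketch of the getter idea, in the style many C libraries use for errno: the exception word is only ever reached through a function, and a macro makes it look like a variable. the names excep_location and excep_slot are made up here, and the storage behind the getter is the non-portable part. this sketch uses C11's _Thread_local for it; a threading library's get/set calls could sit behind the same function instead.

```c
#include <assert.h>
#include <stddef.h>

/* Storage behind the getter. _Thread_local is C11 and is an assumption
   of this sketch; older compilers would need an extension or a
   threading-library call here. */
static _Thread_local unsigned excep_slot = 0;

static unsigned *excep_location(void)
{
    return &excep_slot;
}

/* Callers write thread_excep as if it were an ordinary variable. */
#define thread_excep (*excep_location())

static int checked_index(const int *a, int n, int i)
{
    assert(a != NULL);
    if (i < 0 || i >= n) {
        thread_excep = 1;                 /* signal an index exception */
        return 0;
    }
    return a[i];
}

static int lookup_or(int i, int fallback)
{
    static const int table[4] = { 10, 20, 30, 40 };
    int v = checked_index(table, 4, i);

    if (thread_excep) {                   /* goes through the getter */
        thread_excep = 0;
        return fallback;
    }
    return v;
}
```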

more so, assuming one had TLS variables, one can't typically access them
in a single instruction.

I was thinking of thread_excep being at a fixed location so it could
literally be checked with one instruction:

cmp [thread_excep], 0

if it is a global though, it is not a TLS.

otherwise, I guess it would need to be changed around by the scheduler
or something (register variables with scheduler, and it saves/restores
their values on context switches or something?...).

more typically it is some magic like:
mov ecx, [fs:magicOffset]
mov edx, [__myVariable_index]
mov eax, [ecx+edx*4]

and eax, eax
jnz ...

on some systems, they amount to an internal function call.

As shown, with static linking

if (thread_excep)

should compile to

cmp [thread_excep], 0

except if it were a proper/traditional TLS variable, which are typically
accessed indirectly (on Windows and Linux, there is a special per-thread
structure, generally accessed via a segment register).


I guess, if one is making an OS, other possibilities also exist (special
magic TLS sections?).

say, if one has a ".tls" section, where the scheduler
automatically saves/restores the contents of anything stored within
that section. the advantage could be very fast access to TLS variables
at the cost of potentially more expensive context switches.

I'll try to explain what I have in mind. There would be a machine word
(which I've called thread_excep above) which would be initialised to
zero. A non-zero value would be set in it to indicate that an
exception had occurred. I'm not sure how best to use the word but one
option is to use different bits to indicate exception classes: data
exception, index exception, io exception, user exception etc.

FOURCC?...

'IOX ', 'GPF ', 'PAGE', ...

many systems use objects though, where the object will typically hold
information, such as the location, captured register state, ...

In the main body of the code exceptions would be detected by

if (thread_excep) goto handler_2;

Within a function there would be as many handlers as are needed. Say
we wanted to deal with only index-type exceptions in this module. We
could code

handler_2:
  if (thread_excep & EXCEP_INDEX) {
    /* Deal with an index exception */
    thread_excep &= ~EXCEP_INDEX;
    /* Delete the exception object */
    /* Next step: return, go on to next iteration, whatever */
  }
  else return 0;

In the else path the value of thread_excep has not been changed so
whatever called this routine will get a chance to handle the
exception(s).
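
To make the propagation concrete, here is a minimal sketch of the scheme as described. It assumes a C11 compiler for _Thread_local (drop the keyword for a single-threaded build). The function names, the EXCEP_IO bit and the bit values are invented for illustration and do not come from the thread.

```c
#include <stddef.h>

/* Hypothetical exception-class bits; names and values are illustrative. */
#define EXCEP_INDEX 0x1u
#define EXCEP_IO    0x2u

/* One word per thread, zero while no exception is pending. */
_Thread_local unsigned thread_excep = 0;

/* Innermost function: raises an index exception and returns a dummy value. */
int get_item(const int *items, size_t count, size_t i)
{
    if (i >= count) {
        thread_excep |= EXCEP_INDEX;
        return 0;
    }
    return items[i];
}

/* Middle function: has no handler, so it stops and propagates. */
int sum_range(const int *items, size_t count, size_t first, size_t last)
{
    int sum = 0;
    for (size_t i = first; i <= last; i++) {
        sum += get_item(items, count, i);
        if (thread_excep) return 0;    /* word stays set for the caller */
    }
    return sum;
}

/* Outer function: handles index exceptions only, propagates the rest. */
int sum_or_default(const int *items, size_t count,
                   size_t first, size_t last, int fallback)
{
    int sum = sum_range(items, count, first, last);
    if (thread_excep) goto handler;
    return sum;

handler:
    if (thread_excep & EXCEP_INDEX) {
        thread_excep &= ~EXCEP_INDEX;  /* handled here, so clear the bit */
        return fallback;
    }
    return 0;                          /* not ours: leave the word set */
}
```

Calling sum_or_default on a three-element array with last set to 5 returns the fallback and leaves thread_excep at zero, while an EXCEP_IO raised further down would pass through untouched.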

Does that make sense? I'd welcome corrections or improvements.

I don't think flags are the ideal strategy here.
flags don't as easily allow user-defined exceptions, and if one has only
a few bits for a user-defined exception class or similar, then it
creates a bit of a burden WRT avoiding clashes.

one "could" potentially use interned strings and magic index numbers
though, say, 10 or 12 bits being plenty to allow a range of named
exception classes (potentially being keyed via a hash-table or similar).

this is actually how my dynamic type-system works (types are identified
by name, but encoded via hash indices).
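
A rough sketch of the interned-name idea, assuming a 12-bit index space and a fixed open-addressing table. This is not the dynamic type-system mentioned above; excep_intern and the table layout are hypothetical, and the sketch is not thread-safe.

```c
#include <stddef.h>
#include <string.h>

#define EXCEP_MAX_CLASSES 4096u   /* 12 bits of index space (assumption) */

/* Slot i holds the name registered for index i, or NULL if unused. */
static const char *excep_names[EXCEP_MAX_CLASSES];

static unsigned hash_name(const char *s)
{
    unsigned h = 0;
    while (*s) h = h * 31u + (unsigned char)*s++;
    return h;
}

/* Returns the index (1..4095) for a class name, registering it on first
   use. Index 0 is reserved for "no exception". Returns 0 if the table is
   full. The name pointer is stored as-is, so it must outlive the table
   (a string literal is fine). */
unsigned excep_intern(const char *name)
{
    unsigned start = hash_name(name) % EXCEP_MAX_CLASSES;
    for (unsigned probe = 0; probe < EXCEP_MAX_CLASSES; probe++) {
        unsigned slot = (start + probe) % EXCEP_MAX_CLASSES;
        if (slot == 0) continue;                    /* reserved */
        if (excep_names[slot] == NULL) {
            excep_names[slot] = name;               /* first registration */
            return slot;
        }
        if (strcmp(excep_names[slot], name) == 0)
            return slot;                            /* already interned */
    }
    return 0;
}
```

Two modules that both call excep_intern with the same name get the same index without coordinating on a numbering, which is the clash-avoidance property that flags lack.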

technically, one can combine indices with FOURCC values though, since if
one restricts the valid FOURCC values to being ASCII printable
characters, there is a large range of non-ASCII values which can be used
as index values (512M in the range of 0x00000000..0x1FFFFFFF, and 2G in
the range 0x80000000..0xFFFFFFFF, and possibly another 16M in the range
0x7F000000..0x7FFFFFFF).
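
The FOURCC/index split can be sketched as below. The byte order (first character in the top byte) and the helper names are assumptions of this sketch; "printable" is taken to mean 0x20..0x7E.

```c
#include <stdint.h>

/* Pack four characters into one 32-bit code, first character on top. */
uint32_t excep_fourcc(char a, char b, char c, char d)
{
    return ((uint32_t)(unsigned char)a << 24) |
           ((uint32_t)(unsigned char)b << 16) |
           ((uint32_t)(unsigned char)c << 8)  |
            (uint32_t)(unsigned char)d;
}

/* A code is a valid FOURCC only if all four bytes are printable ASCII. */
int excep_is_fourcc(uint32_t code)
{
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t byte = (code >> shift) & 0xFFu;
        if (byte < 0x20u || byte > 0x7Eu) return 0;
    }
    return 1;
}

/* A code whose top byte is not printable cannot collide with any FOURCC,
   so it is free to be used as a plain index value. */
int excep_is_index(uint32_t code)
{
    uint32_t top = code >> 24;
    return top < 0x20u || top > 0x7Eu;
}
```

With this layout 'IOX ' packs to 0x494F5820, and small integers such as 42 fall in the index range. A code with a printable top byte but a non-printable lower byte is neither and would have to be rejected.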

IIRC, I once devised a TLV format based on this strategy, but I don't
remember ever making use of this (many of my TLV formats use a
variable-length encoding more like that used in the Matroska/MKV format).


or such...
 
S

s_dubrovich

On 12/29/2011 11:52 AM, James Harris wrote:
...
it depends on what exactly one means by "standard".
I will assume here "can work with a typical C compiler on a typical
target without the need/ability to modify said compiler to make it work".

Yes, that's the intention.

...
TLS variables are not standard C (but are supported by MSVC and GCC).
potentially though one could implement them as status-getter function
calls, which is allowed in standard C.

I can't see TLS as C or not C. Wouldn't TLS be a characteristic of the
environment in which the C code runs?

...
more so, assuming one had TLS variables, one can't typically access them
in a single instruction.

I was thinking of thread_excep being at a fixed location so it could
literally be checked with one instruction:

  cmp [thread_excep], 0

more typically it is some magic like:
mov ecx, [fs:magicOffset]
mov edx, [__myVariable_index]
mov eax, [ecx+edx*4]
and eax, eax
jnz ...

on some systems, they amount to an internal function call.

As shown, with static linking

  if (thread_excep)

should compile to

  cmp [thread_excep], 0

unclear how this would work exactly.

I'll try to explain what I have in mind. There would be a machine word
(which I've called thread_excep above) which would be initialised to
zero. A non-zero value would be set in it to indicate that an
exception had occurred. I'm not sure how best to use the word but one
option is to use different bits to indicate exception classes: data
exception, index exception, io exception, user exception etc.

In the main body of the code exceptions would be detected by

  if (thread_excep) goto handler_2;

Within a function there would be as many handlers as are needed. Say
we wanted to deal with only index-type exceptions in this module. We
could code

handler_2:
  if (thread_excep & EXCEP_INDEX) {
    /* Deal with an index exception */
    thread_excep &= ~EXCEP_INDEX;
    /* Delete the exception object */
    /* Next step: return, go on to next iteration, whatever */
  }
  else return 0;

In the else path the value of thread_excep has not been changed so
whatever called this routine will get a chance to handle the
exception(s).

Does that make sense? I'd welcome corrections or improvements.

James

Yeah, these sound like state flags. The example is rather abstracted
from the problem, so it is hard to suggest corrections or improvements.

A couple of decades ago I had a commercial dish service that provided
delayed stock data to a terminal with a serial port, which could be
connected to a PC so that, with commercial software, a screen full of
stocks, bonds, options, mutual funds and news headlines could be
followed. Of course the PC software was an add-on expense with a hefty
monthly charge to follow a handful of additional securities above what
the terminal provided. So I wrote my own in C.

The serial output was the raw data feed at 9600 baud; the data used a
poor man's encryption that wasn't too hard to figure out after snooping
the output. But the output records for a security could vary from
complete (all the fields of the record were transmitted) to sparse
(one or more 'update' fields were in the stream). They were 15-20
minute quote delayed, so the vast majority of records were just
partial updates to things like last quote, new high/low, volume, etc.
And the fields would vary by security type, of course. It was
relatively easy to see that each record transmitted for a 'new'
security started with SOH`byte followed by SecurityNumber`word. The
word was transmitted little endian as I recall. Thereafter, various
control codes, like SOH, told which field or security type, etc.
The SecurityNumber wasn't the CUSIP number, it was just a sequence
number, and the outfit published a directory every so often showing
which number went to which security. The only help was that
securities were grouped by number ranges, so say bonds were numbers >
50,000. But there were small number ranges in between that were
unassigned also.

My program development was stepwise and somewhat haphazard because I
didn't know, I had to deduce, the data stream. I wrote this on and
for a PC XT clone that ran at 10 MHz, and its serial port card used an
i8250, that's an early model with a one byte tx/rx buffer, so the
handler can't dally or you lose data.

I wrote the C code in two main chunks, part one to handle the input
stream to a ring buffer, and when the status port said RX register was
empty, part 2 extracted data off the ring buffer like a secondary
stream treating non-printables like SOH (control codes) as state flags
within a for(;;) block - those were control codes for the meaning of
the following data in the stream. That's a lot like 'exceptions'
without the overhead, in that the input stream is random and not
repeatable. Another difference is that the control codes were like
trees: one led to checks for a limited number of successor leaves
(controls for individual update fields, if present).

Nowadays you could use threads, one per control code like SOH, to
decode for the next data, but you need a 'sink' thread to catch
undefined or errant data. In my case with the state flags in the
for(;;) block, I discarded errant and unknown records until I figured
out what they meant, then added in a state flag and handler for them.

I also did something silly, which was to do random writes to my data
file on a ramdisk. IIRC it was a 256k ramdisk, so the file held only
a small subset of the securities. Every so often the ram disk data
file would be copied to the hard disk.

It actually worked well with surprisingly few bad records due to data
loss.

I used char flags for the data controls like SOH instead of dealing
with bit shifts to extract control data, so each one, e.g. SOH_flg, was
either zero or non-zero.
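
A rough sketch of that flag-driven loop, reduced to just the record header. The layout (SOH, then a 16-bit little-endian number) follows the description above; the function name, the output array and the choice to skip every other byte are assumptions, since the real field codes are not given.

```c
#include <stddef.h>

#define SOH 0x01   /* ASCII start-of-header */

/* Walk a captured byte stream and collect the security number that
   follows each SOH. Returns how many numbers were stored. */
size_t decode_stream(const unsigned char *buf, size_t len,
                     unsigned *numbers, size_t max_numbers)
{
    char soh_flg = 0;       /* non-zero while a header is being read */
    unsigned number = 0;
    int bytes_seen = 0;
    size_t found = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];

        if (soh_flg) {
            /* little endian: the low byte arrives first */
            number |= (unsigned)c << (8 * bytes_seen);
            if (++bytes_seen == 2) {
                if (found < max_numbers) numbers[found++] = number;
                soh_flg = 0;            /* header complete */
            }
        } else if (c == SOH) {
            soh_flg = 1;                /* next two bytes are the number */
            number = 0;
            bytes_seen = 0;
        }
        /* unknown bytes outside a header are discarded in this sketch */
    }
    return found;
}
```

Feeding it the bytes 01 34 12 41 01 50 C3 yields the numbers 0x1234 and 50000. Further control codes would each get their own flag and branch inside the same loop.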

Pretty definitely standard C, fwiw, without setjmp, longjmp, raise,
signal or even malloc for that matter.

Good for a chuckle eh?

Steve
 
R

Rod Pemberton

James Harris said:
There would be a machine word [...]

He said: "machine word"... Ok, that screams "volatile" keyword to me.

Are these x86 exceptions or x86 interrupts or events external to C, e.g.,
for your OS, that are being passed to C via a machine word? I suspect they
are, but you probably wanted to keep things topical for c.l.c. ...

At a minimum, I'd suggest reading Eric Sosman's post and also
my post on a.o.d. ;-)

You said Kaz's setjmp/longjmp code looks good to you, so I'd take that as an
intuitive proxy that you should use it. I'm sure others on c.l.c. could
help you fix whatever issues you have with it.


Rod Pemberton
 
R

Rod Pemberton

A couple of decades ago I had a commercial dish service
that provided delayed stock data ...
[...]
They were 15-20 minute quote delayed

Quote feeds are delayed 15 minutes for non-market makers.

My program development was stepwise and somewhat haphazard
because I didn't know, I had to deduce, the [stock quote] data stream.

They probably weren't using NASDAQ or NYSE protocols for the
satellite feed. You weren't subscribed to an official NASDAQ or NYSE quote
feed. If you had been, you could've gotten the protocols for the quote
stream after you (i.e., your company's legal department) signed a
non-disclosure agreement. I worked at a brokerage a decade or so ago
programming on their stock trading application and also software for a
couple of their feeds: SOES and SuperSOES (obsolete). The system had
numerous other computers and related market feeds: quote feeds, market
feeds, alternate markets, ECNs, Instinet, etc. It was so complicated that no
one knew how it was all connected. As part of a merger, new management
wanted to know. It took numerous departments months to obtain all the info
and produce a diagram. Apparently, they thought they could reduce costs by
replacing the system ... laugh.

I wrote this on and for a PC XT clone that ran at 10 MHz,
and its serial port card used an i8250, that's an early model
with a one byte tx/rx buffer, so the handler can't dally
or you lose data.

That's tough to do on a PC ... Market volume was probably much less back
then. I'm assuming this was late 1980's from the PC and dish service. You
probably weren't getting a full quote feed anyway. We had mainframes
handling the quote feed data during early 2000's: four T1s. These T1s were
completely maxed out while the market was open. After a mistake in our
networking department, we got connected to all the T1's available at the
time for NYSE, all fourteen of them ... I think it was NYSE, but it may
have been for NASDAQ. The company relocated around that time to an
adjacent building, and then we started losing quotes ... "Where is the
opening quote for security XYZ?" The extra distance on the rerouted T1s
was causing the data to time out. That was all about a decade ago when
market volume was much lower than it is today.

IIRC it was a 256k ramdisk, so the file held only
a small subset of the securities.

Yes ...

We also used a software ramdisk on the mainframes which had some of the
fastest memory manufactured at the time. They were considering hardware
ramdisks, but found out the hardware ramdisks used the same memory as in the
mainframes ... So, they stayed with software ramdisks.

Every so often the ram disk data
file would be copied to the hard disk.

Start of day ... end of day. Batch processing, to produce reports required
for NASDAQ and NYSE, ran from about 8PM to 5AM ... Another department
burned it all to laser disks to comply with securities and other Federal
laws. We also had to submit a huge amount of data to another firm that
generated compliance reports.

Pretty definitely standard C, fwiw, without setjmp, longjmp,
raise, signal or even malloc for that matter.

Good for a chuckle eh?

Looping works!


Rod Pemberton
 
J

Jens Gustedt

Am 12/30/2011 10:38 PM, schrieb James Harris:
Officially they are since some weeks now, they come with C11

I can't see TLS as C or not C. Wouldn't TLS be a characteristic of the
environment in which the C code runs?

according to C11 thread local variables and atomics are part of the
language (new keywords _Thread_local and _Atomic) and functions
associated with threads and atomic operations form optional components
of the C library for hosted environments.

And these tools clearly ease the programming of some try/catch mechanism
with macros in C.

Jens
 
B

BGB

Am 12/30/2011 10:38 PM, schrieb James Harris:

Officially they are since some weeks now, they come with C11

ok, I hadn't looked much at C11 recently...

who knows how long until it is really well supported though, as it has
been long enough just waiting for MSVC to support C99 stuff...

according to C11 thread local variables and atomics are part of the
language (new keywords _Thread_local and _Atomic) and functions
associated with threads and atomic operations form optional components
of the C library for hosted environments.

and alas they couldn't just reuse GCC's keywords?...
well, in any case, probably macro magic.

And these tools clearly ease the programming of some try/catch mechanism
with macros in C.

yep.


I had forgotten about a trick I had actually used before:
one can fake TLS using a macro wrapping a function call.

say:
int *foo_errno_fcn()
{ ... }

#define foo_errno (*foo_errno_fcn())

granted, this still doesn't allow a single operation to access it (as in
the OP), but really, who really needs to care?...
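
a filled-in sketch of the wrapper trick, for illustration only. the storage behind the function is an assumption: here it is a C11 _Thread_local static, but the body could just as well call the threading library's get/set functions. foo_parse_digit is a made-up caller.

```c
/* Returns the address of the calling thread's error slot.
   Assumption: C11 _Thread_local is available; callers never see how
   the slot is stored, so this body can be swapped out freely. */
int *foo_errno_fcn(void)
{
    static _Thread_local int slot = 0;
    return &slot;
}

/* Reads and writes look like an ordinary variable at the call site. */
#define foo_errno (*foo_errno_fcn())

/* Hypothetical user: sets foo_errno to 1 on bad input, 0 on success. */
int foo_parse_digit(char c)
{
    if (c < '0' || c > '9') {
        foo_errno = 1;
        return -1;
    }
    foo_errno = 0;
    return c - '0';
}
```

since foo_errno expands to an lvalue, both "foo_errno = 0;" and "if (foo_errno)" work unchanged, at the cost of a function call per access.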


in one case, this was due to me having initially used straightforward
TLS variables, but then running into a case where they "didn't really
work so well" on WinXP (top-level declared TLS variables are fussy /
prone-to-breaking on WinXP and earlier), but a wrapped function call was
more reliable.

not that I think status codes located within TLS are really a sane
exception mechanism (they can be sanely used, but IMO can't be sanely
called exceptions...).


most often, I pass state in passed context structs, but in some cases
have used "context binding" where one may need a TLS variable mostly to
hold a pointer to the context.

nevermind the potential need for TLS to implement things like nested
exceptions.


hmm:
in C one could use macros to be like:
BEGIN_TRY
...
END_TRY
BEGIN_CATCH(FooException, ex)
...
END_CATCH
BEGIN_FINALLY
...
END_FINALLY

nevermind if the syntax is horrid...
(technically, this could probably be implemented via if/else magic).

#define BEGIN_TRY if(fooBeginTry()) {
#define END_TRY fooEndTry();}

#define BEGIN_CATCH(ext, exn) else if(fooCatchException(#ext)) { \
ext exn=fooGetException();
#define END_CATCH }

#define BEGIN_FINALLY else{
#define END_FINALLY fooThrowCurrentException();}

(one may forsake using all caps...).
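
for comparison, a reduced working sketch of macro-based try/catch built on setjmp/longjmp. it does not implement the foo* helpers above: the macro names, the integer exception codes and the frame stack are all assumptions, and it assumes C11 for _Thread_local. leaving a try body with return, break or goto would skip the frame pop, and locals changed inside the try body should be volatile if they are read after a throw.

```c
#include <setjmp.h>
#include <stdlib.h>

/* One active handler; frames form a per-thread stack through 'prev'. */
struct try_frame {
    jmp_buf env;
    struct try_frame *prev;
};

_Thread_local struct try_frame *try_top = 0;
_Thread_local int try_code = 0;          /* code of the exception in flight */

/* Pops the innermost frame and jumps to it; aborts if there is none. */
void try_throw(int code)
{
    struct try_frame *f = try_top;
    if (f == 0) abort();
    try_top = f->prev;
    try_code = code;
    longjmp(f->env, 1);
}

#define TRY_BEGIN \
    { struct try_frame try_f; \
      try_f.prev = try_top; try_top = &try_f; \
      if (setjmp(try_f.env) == 0) {
#define TRY_CATCH(code_var) \
          try_top = try_f.prev;   /* normal exit: pop the frame */ \
      } else { \
          int code_var = try_code;
#define TRY_END \
      } }

enum { EX_DIV_ZERO = 1 };                /* hypothetical exception code */

int checked_div(int a, int b)
{
    if (b == 0) try_throw(EX_DIV_ZERO);  /* does not return */
    return a / b;
}

int div_or_default(int a, int b, int fallback)
{
    volatile int result = 0;
    TRY_BEGIN
        result = checked_div(a, b);
    TRY_CATCH(code)
        if (code == EX_DIV_ZERO) result = fallback;
        else try_throw(code);            /* rethrow to an outer frame */
    TRY_END
    return result;
}
```

the setjmp call sits directly in an if condition compared against a constant, which is one of the contexts the standard allows. a finally-style block would need extra bookkeeping that this sketch leaves out.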


or such...
 
N

Nick Keighley

Interesting war story. Um, but where are the exceptions? Sounds like a
for(;;) loop with a big switch statement. It's kind of "well, how else
would you write it?"
 
J

James Harris

....
There would be a machine word [...]

He said: "machine word"...  Ok, that screams "volatile" keyword to me.

May be a bad term. I mean a 16-bit word on a 16-bit machine, a 32-bit
word on a 32-bit machine, a 64-bit word on a 64-bit machine. I
sometimes avoid the plain term "word" as it has been somewhat hijacked
by the 16-bit brigade. I come from an older era where the word size
depended on the machine.

Are these x86 exceptions or x86 interrupts or events external to C, e.g.,
for your OS, that are being passed to C via a machine word?  I suspect they
are, but you probably wanted to keep things topical for c.l.c. ...

This fully applies to C (or another language for that matter). The
machine word is one that could be set, altered and cleared all in the
language or in a called module whether in the same language or not.
That said, it could just as easily be modified by the OS or something
else. Perhaps you could think of it and the operations as a protocol.
Whatever raises an exception that it doesn't handle itself sets the
word and returns to its caller. Whatever does handle outstanding
exceptions clears the word. And the word is checked whenever
appropriate.

At a minimum, I'd suggest reading Eric Sosman's post and also
my post on a.o.d.  ;-)

LOL! You think I wouldn't read all the replies?

You said Kaz's setjmp/longjmp code looks good to you, so I'd take that as an
intuitive proxy that you should use it.  I'm sure others on c.l.c. could
help you fix whatever issues you have with it.

My comment was a general one. It looked like good code but it's not
what I was looking for. In fact I wasn't looking for a solution as
such but for feedback on my preferred option (which I got). Some good
points were made and I have some replies to get round to making.

James
 
R

Rod Pemberton

James Harris said:
....
There would be a machine word [...]
He said: "machine word"... Ok, that screams "volatile" keyword to me.

May be a bad term. I mean a 16-bit word on a 16-bit machine, a 32-bit
word on a 32-bit machine, a 64-bit word on a 64-bit machine. I
sometimes avoid the plain term "word" as it has been somewhat hijacked
by the 16-bit brigade. I come from an older era where the word size
depended on the machine.

[ ... continuing the thought ...]
This fully applies to C (or another language for that matter). The
machine word is one that could be set, altered and cleared all in the
language or in a called module whether in the same language or not.
That said, it could just as easily be modified by the OS or something
else. Perhaps you could think of it and the operations as a protocol.
Whatever raises an exception that it doesn't handle itself sets the
word and returns to its caller. Whatever does handle outstanding
exceptions clears the word. And the word is checked whenever
appropriate.

Ok, that screams "volatile" keyword to me. Anything modified outside of C
and used by C is unknown to C, so it needs a volatile so it's not optimized away
and to ensure it's reloaded from memory instead of being cached in a register,
since its value could be changed by non-C code ...
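
As one standard-C illustration of that point, the sketch below has the word written by a signal handler and polled by ordinary code. The names are hypothetical and SIGINT is only a stand-in for whatever outside agent sets the word. Note that volatile alone does not make access from several threads safe; C11 atomics would be needed for that.

```c
#include <signal.h>

/* Written asynchronously by the handler, so every read must really go
   to memory. sig_atomic_t is the type the standard guarantees for this. */
volatile sig_atomic_t pending_excep = 0;

void on_signal(int sig)
{
    (void)sig;
    pending_excep = 1;
}

/* Installs the handler; returns non-zero on success. */
int install_excep_handler(void)
{
    return signal(SIGINT, on_signal) != SIG_ERR;
}

/* Returns 1 and clears the word if something was pending, else 0. */
int poll_excep(void)
{
    if (pending_excep) {
        pending_excep = 0;
        return 1;
    }
    return 0;
}
```

After install_excep_handler(), a raise(SIGINT) makes the next poll_excep() return 1 and the one after that return 0.
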

LOL! You think I wouldn't read all the replies?

No. First, I wasn't sure you were reading from a.o.d. Second, it was just
easier than saying I didn't agree with some comments of other post-ers
without starting an argument with them. My experience with c.l.c people
over the past decade is that many of them are insanely hostile, intolerant
of any criticism, and will argue to their death that they are correct when
in fact they are provably wrong ...


Rod Pemberton
 
