Contents of an exception object

Discussion in 'C Programming' started by James Harris, May 28, 2014.

  1. (snip)
    The first time I remember seeing it was in descriptions of NFS.
    As UDP doesn't provide any protection against loss or duplication
    of requests, that was important. In the case of no (or delay in)
    acknowledgement, repeat the request.

    (snip discussing alternate meaning)

    -- glen
    glen herrmannsfeldt, May 30, 2014

  2. I first learned about it with HTTP; GET requests are supposed to be
    idempotent for proper interaction with caches (including the local one
    in the browser), whereas other methods are generally not.

    SIP, which was designed to run on top of UDP, has idempotent
    transactions, which are simple request-response, and non-idempotent
    ones, which require a three-way handshake to complete.

    A _lot_ of braindead protocols, e.g. SMB (aka CIFS), encounter problems
    because they don't treat the two cases differently, which either results
    in poor performance (too much handshaking for idempotent requests) or
    data consistency errors (not enough handshaking for non-idempotent
    requests).

    Stephen Sprunk, May 30, 2014

  3. (snip, someone wrote)

    Interesting, and it shows some about the complications of mapping
    between mathematics and computer science.
    OK, but if you consider a system with a state x, and an operation
    that changes the state as x=f(x) (that is, C assignment, not
    mathematical equality) then doing the operation twice, that is

    x=f(x); x=f(x);

    (now written as C statements with semicolons.)

    is the same as:

    x=f(x);

    As I mentioned previously, I first learned about this through NFS.
    An NFS read or write request includes the file offset, such that
    duplicating the request gives the same result. (Though maybe not
    if requests from different hosts are interleaved.)

    Duplicating a file creation or deletion request works, as long as
    there is no error reported on the second try. (I am actually not
    sure how NFS does this. It is usual, for example, to report an
    error on attempt to create a directory that exists.)
    Since mathematical functions tend not to have state, this isn't
    so easy to describe.
    -- glen
    glen herrmannsfeldt, May 30, 2014
  4. (snip)
    The fun thing about NFS, at least in the ones I know, is that it
    is the time stamp on the server that applies. I have a nice little
    NAS box running NFS that I use to store files on. Sometimes it
    restarts and then fails to set the time, possibly setting it instead to
    many years ago, or even years in the future. Newly modified files
    then get that time. I now have a file created in 1937 (I don't remember
    if this was the reason) which always sorts as newest in any ls -t

    -- glen
    glen herrmannsfeldt, May 30, 2014
  5. Of course; the server can't trust the clients' clocks to be correct or,
    more importantly, to agree with each other. Whether the server's clock
    happens to be right or wrong is almost irrelevant; one should use the
    _same_ clock for all files, and the server's is the obvious one to choose.
    It's common to see clocks come up as 31 Dec 1969 (western hemisphere) or
    1 Jan 1970 (eastern hemisphere), which is (time_t)0 on most *ix systems.
    The cost of a battery-backed clock is not justified for a device that
    should get the correct time from NTP during its boot cycle, though
    Wintel boxes typically have them anyway for historical reasons.

    Either 2038, i.e. (time_t)0x7fffffff, or 1901, i.e. (time_t)0x80000000,
    would also be somewhat understandable; I don't see how you got 1937.

    Stephen Sprunk, May 31, 2014
  6. Obviously, this doesn't work in C except in the degenerate case where
    f(x)==f(f(x)), which isn't a terribly interesting state machine.

    A typical solution is to use transaction numbers, with the last N
    results cached. If you receive a new request with an existing number,
    e.g. your response was lost, you just send back the cached result for
    that number rather than performing the transaction again. Note that due
    to various factors, transactions may not arrive or complete in numerical
    order (or arrive at all!), so a strict N+1 system will have poor
    performance over slow and/or noisy networks, if it works at all.

    Stephen Sprunk, May 31, 2014
  7. Who writes "general bit-shuffling" functions? What are they?

    Nearly all code is written to perform a specific task. Most <32 bit
    embedded code uses static fixed memory and shuns recursion. This
    enables static code analyses to confirm it will never use more than the
    available memory. I've written tools to perform this analysis, it's an
    interesting challenge.
    Ian Collins, May 31, 2014
  8. (snip, I wrote)
    Well, NFS was designed to be stateless. That is, the server (other
    than what is on disk) doesn't store any state. If the server
    reboots (either intentionally or after a crash) the client retries
    the request and goes on as if nothing happened.

    A request might say "write these bytes to a disk file at offset
    12345", which, if repeated, gives the same result.

    (One complication with NFS is file locking, which is a separate
    protocol.)

    Original NFS used UDP, but now TCP is more common. Even so, the
    stream contains a series of requests, and if the stream is broken
    and restored, the server will continue on. (Some requests might
    have been lost in buffers at the time.)

    One that I remember from many years ago (SunOS 3.x days), I made
    a hard link to a directory (I didn't know you weren't supposed to
    do that, and didn't know about symlinks) which immediately crashed
    the NFS server system. After some work to restore the server,
    including undoing the link, the client immediately retried the
    request (since it didn't know why the server crashed).

    -- glen
    glen herrmannsfeldt, May 31, 2014
  9. (snip, I wrote)
    Sorry, yes, it is July 8, 2037 15:53:53.

    -- glen
    glen herrmannsfeldt, May 31, 2014

  10. Does an indirect call count?

    There are several issues. The main one is, is the function now defined
    in terms of the effect of bits, or in terms of the calls made to the
    callback? qsort() for example, isn't guaranteed to call the callback any
    set number of times, in any particular order, or on any particular pair
    of passed-in values. Also, qsort() is pretty clearly a bit (un-) shuffling
    function.
    But you can pass qsort() a callback with side effects, you can mis-use it
    to write code which depends on a particular order of callback evaluation.

    So that's a major proviso. If we allow function pointers, then we've
    lost the compiler-imposed guarantee that the function can be replaced
    by any with equivalent bit effects. But we can in reality replace qsort()
    with any sorting function. The guarantee has to be imposed by the
    human programmer keeping to the specified interfaces.

    The other issue is, can the function be proved correct if passed a
    callback that doesn't do IO? Generally the answer is yes, because you
    can fake up IO by passing bytes from memory buffers. But not, for
    example, if the IO routines are very low level and rely on being called
    with certain timings.

    Another issue is that, if you allow function pointers, a perverse
    programmer can defeat the system with a perverse use of function
    pointers, basically passing in a table of function pointers to make
    liberal IO calls. So you're almost but not quite losing the benefit.

    The final issue I can think of is that the use of serialisation
    streams is a practical problem for showing correctness. Bit shuffling
    functions are defined in terms of their effects on bits, which means
    the bits they have access to, which is a finite and usually either
    small or simple set (by a simple set, I mean you might have a million
    employees, but you do essentially the same operation on each one).
    When you pass a stream as a parameter, you lose that, and it doesn't
    make any practical difference whether it is in or out of memory
    (it makes a theoretical difference, of course). Streams are usually
    implemented via callbacks, though they don't have to be.

    But callbacks are too useful to ban from bit-shuffling functions.
    Malcolm McLean, May 31, 2014
  11. I do.
    Most of what I do is designed to produce an output given an input.
    Of course the bits are humanly meaningful. The input bits represent
    vertices of polygons, all of which might be "similar", they
    might represent what a human would understand as a worm in various
    poses. The output bits represent the worm being taken through its
    repertoire of poses. How does it work? You build a matrix of
    differences, then take eigenvectors. You throw away all but the top
    few eigenvectors, say three, then you move a point around in 3D
    space, representing the three top eigenvectors. You then convert the
    eigenvectors back to xy space for polygons, and the worm wriggles.

    That was a program to help analyse the motion of real worms for a
    laboratory. Of course it was then hooked up to a graphics system
    so you could actually see the worms, but that step was trivial, all
    the interest was in the bit shuffling.
    So that's another bit shuffling function, surely?
    Malcolm McLean, May 31, 2014
  12. Not the original query but I don't mind explaining what I have got set up.
    It is not a "normal" exception mechanism in that it is not automatic.
    However, exceptions can be thrown, caught, replaced, cancelled and rethrown.
    Once thrown they persist until they are cancelled or replaced. They are easy
    to catch and handle. That lot seems to me fairly close to covering the

    Downsides: Unwinding is manual and I don't at present add any call stack
    info to an exception object.

    The call stack info is not normally needed because I catch the exceptions
    and deal with them rather than reporting them.
    See below.
    The latter.

    In fact the changes are not that intrusive. After a call to something which
    could throw an exception (or which could itself call something which could
    generate an exception) I usually have just

    if (exception) return;

    (Or "if (exception) return 0;" if the function returns an integer, etc.)

    Naturally given the above, resources can be freed before returning, if
    needed. For example, here's a piece of code where two things need to be
    dealt with: interrupts have been turned off and a timeout has to be
    cancelled. So the code restores the interrupt state and cancels the timeout
    before returning.

    if (exception) {
        /* restore the interrupt state and cancel the timeout here */
        return 0;
    }

    Handling timeouts has been the main use for this so far. I am interfacing
    with hardware and have a number of places where I have nested timeouts. If
    the lowest level of timeout expires it throws an exception which gets caught
    higher up in the module which set the timeout. That module can then decide
    what to do but it can also be operating under a timeout from a higher level.
    That is unaffected. It just works. There is a stack of timeout times but
    just one exception word.

    Frankly, it's awesome to have this in C! There is not much coding to do and
    the result is useful and simple.

    You can guess the rest but to be clear.... As you can see, exception == 0
    means no exception. Where an exception is to be handled the code might be

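    switch (exception) {
    case 0: break; /* No exception */
    case EXCEPTION_TIMEOUT: /* (label assumed) Deal with a timeout */ break;
    case EXCEPTION_MEMORY: /* (label assumed) Deal with memory depletion */ break;
    default: return 0; /* Pass unhandled exception to caller */
    }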

    Or it could be

    if (exception) {
    switch (exception) {
    ... etc

    The coding for these is similar to the coding for an automatic mechanism so
    there is no loss there.
    The initial query was about potential contents of the exception object, not
    about the mechanism.
    I can write to the screen; that's not a problem. In fact my top-level module
    has a catch all. If an exception reaches it it writes what the exception was
    and, without a call stack, the file and line number where the exception was
    thrown, something like

    user-defined exception in src/kbc_io.c at line 203

    James Harris, May 31, 2014
  13. The two options are separated by the word "or".

    The first option is "a similar effect". The second option is "at least one
    that suits my purposes".

    The OP implied that they're roughly the same thing. Jorgen took advantage
    of the phrasing and responded as if it had been posed as an explicit
    choice between two distinct alternatives.
    Nobody, May 31, 2014

  14. It looks like the answer is yes -- it's a bit-shuffling function
    according to your classification.

    What about the converse? Is the function

    void format_int(FILE *fp, int x) { fprintf(fp, "%08d", x); }

    An IO procedure even when called with a FILE * representing an in-memory
    stream? (It does not apply to standard C, but it's a very common
    extension to provide such a thing.)
    Ben Bacarisse, May 31, 2014
  15. I don't see it yet. How does control get here if the called function
    (or one it calls) throws an exception?

    Ben Bacarisse, May 31, 2014
  16. As I said, stream parameters are a difficult case.
    If we provide a version of fprintf() that cannot attach to an external
    stream, then indeed it becomes a bit-shuffling function.
    It's possible to fake up most IO devices, converting procedures into
    bit-shuffling functions. But whilst that's often good and useful
    for getting most errors out, it cannot be the final test of correctness.

    The other question is whether the function still remains correct
    if we replace fprintf() with another piece of code which has the same
    effect on bits. Probably it doesn't. If we replace fprintf() with
    sprintf(fp->internalbuff, "%08d", x), you'll probably say that the
    function is now broken. So the structure's opaque. Shouldn't really
    be allowed, because the bit-shuffling function is then defined as
    correct /incorrect based on the subroutine calls it makes, whilst
    bit-shuffling functions should be defined as correct / incorrect
    based on bit state on entry / exit.
    However if the subroutines are bit-shuffling functions, then that's
    not a problem, because those subroutines are defined in terms of
    bit state, even if the definition is cast at rather higher level
    than the bit level.

    What do I mean by that?

    sprintf(buf, "%08x", x);

    we can define the behaviour of the function as "to write 8 ascii characters
    in the range 0-f, representing a hexadecimal number". So any implementation
    of sprintf() that returns the result 0000000a when passed "10" is correct,
    at least for that input.


    sprintf(buf, "%p", ptr)

    we can define as "the behaviour of this function is to write a humanly-
    meaningful representation of the pointer ptr".

    So we can implement sprintf to return "8000DA7A", or "8000-da7a:",
    both are correct. We've defined correctness in a non-mathematical way.
    Computers can cope with that, so can human programmers. But a formal
    definition in terms of functions that map input to output can't. So
    we can actually have opaque structures, we don't need to ban them,
    but we need to understand that the bit-shuffling subroutines are
    defined at a high level, in terms of what the bits represent, not
    what their actual on/off state is.

    Parameters which are streams, indirect calls, and opaque structures are
    difficult areas. But they're not fatal. The simple rule that a
    bit-shuffling function cannot call IO or do IO remains. It's just
    that some of the implications go a bit deeper than might first appear.
    Malcolm McLean, May 31, 2014
  17. I could not determine an answer to my question from this. Maybe some
    functions can't be classified?
    Ben Bacarisse, May 31, 2014
  18. Basically, each function in the call chain would have the same
    test-and-return line at selected points - usually after each call to a
    function which could result in an exception - though there are variations.

    It is probably clearest if I try to explain what I mean by variations by
    making up an example showing different scenarios. Bear in mind that the
    overall idea is something I am experimenting with. It is not yet a fully
    worked solution that I would recommend to others. It is just something that
    seems to be working well for me.

    Imagine there are four functions which form a call chain as in

    A -> B -> C -> D

    So function A calls B, B calls C, and C calls D. Each deals with exceptions
    differently but they have to work together. Let's say that D may throw an
    exception. Function C does not handle any exceptions at all. That is
    probably the most common case. B handles some specific exceptions. Function
    A is the final "catcher" for the application and handles all catchable
    exceptions one way or another. It could take remedial action in some cases
    but in this example it will just report them to a person.

    The code in the four functions would be along the following lines.

    Function A catches all catchable exceptions and reports them. So after a
    call to B it only needs to test whether variable "exception" is non-zero and
    if it is to act accordingly.

    void A(void) {
        .... other code ....
        B();
        if (exception) {
            printf("\n%s exception in %s at line %u\n",
                   .... exception name, file, line ....);
        }
        .... other code ....
    }

    Function B handles some specific exception types so it might use a switch
    statement in order to select which path to take. This is similar to the
    traditional use of multiple catch clauses.

    void B(void) {
        .... other code ....
        C();
        switch (exception) {
        case 0:
            break; /* Nothing went wrong */
        case EXCEPTION_RANGE:
            /* Handle range exception */
            break;
        default:
            return; /* Not handled here. Go to caller */
        }
        .... other code ....
    }

    Function C doesn't catch any exceptions at all so if one occurs it
    immediately returns to its caller.

    void C(void) {
        .... other code ....
        D();
        if (exception) return;
        .... other code ....
    }

    Finally, function D doesn't handle exceptions but may throw one or two.

    void D(void) {
        .... other code ....
        if (condition1) {
            exception_throw(EXCEPTION_RANGE, __FILE__, __LINE__,
                            "D", 0);
            return;
        }
        .... other code ....
        if (condition2) {
            exception_throw(EXCEPTION_USER, __FILE__, __LINE__,
                            "D", 0);
            return;
        }
        .... other code ....
    }

    Note that 'throwing' the exception just sets the exception variables. D must
    still return to its caller.

    In the above scenario, if D throws EXCEPTION_RANGE the effect will be as if
    function C was skipped and the exception will be caught by B. If D throws
    EXCEPTION_USER the effect will be as if both C and B were skipped and the
    exception will be caught by function A.

    Note that although B catches the EXCEPTION_RANGE type it does not have to
    deal with all such exceptions. After other tests on what specifically went
    wrong it could decide either to deal with the issue - in which case it will
    call exception_cancel() - or that it doesn't want to deal with the issue -
    in which case it will return to its caller with the exception object

    The above illustrative code is just as typed in and is untested. I may have
    missed something but I hope it gives the idea.

    Comments welcome.

    James Harris, May 31, 2014
  19. sleep() is an oddball.

    But functions either call IO or they don't, defining memory as "not IO",
    even though physically the memory might be implemented as a peripheral
    device. So that's a relatively crisp definition, it can be enforced

    But callbacks are an interesting special case. You need to enforce
    callbacks not modifying any internal or intermediate state used by a
    bit-shuffling function, except for the bits that the callback is passed
    to modify, or it's not really right to call it a "function",
    and you need to enforce that the program doesn't rely on any particular
    order or number of calls to the callback. Most programmers do that
    anyway, but it's not a nice simple rule like the previous one.

    If we're not relying on any particular order or number of callback
    calls, it's difficult to see how programs can do IO in the callback
    and still have correct behaviour for any pattern of calls.

    Most IO can be replaced by non-IO dummy or stub functions, and the
    procedures tested for correctness. That's a common technique. If you
    don't have a barcode reader, write a little function that returns one
    from a list of ten barcodes. But it's difficult to test everything
    like that.

    The idea is simple, the implications aren't so simple.
    Malcolm McLean, May 31, 2014
  20. What about raise, longjmp, atexit, thrd_create (and friends), mtx_lock
    (etc.)? Are they oddballs or simple to classify?
    I don't know what you are saying here. I just can't follow the words.
    Is a simple yes/no answer possible for the two examples I gave? If not,
    that's fine, I'd just like to know.

    Your classification is based on the function, but the examples were
    supposed to suggest that it's sometimes the call that determines what
    the function does (bit-shuffling or IO).
    I'd say simplistic rather than simple -- I think it misses the
    complexity of most programming languages. Maybe it will seem simple
    when it's defined well-enough for me to see what functions are in which
    Ben Bacarisse, May 31, 2014