Is this setjmp/longjmp usage legal?


Player

I'm not quite familiar with setjmp/longjmp (and neither with macros either) but I was trying to make something similar to this "yielding return" construct from C#, namely: when a function X "yields returns" some value everything happens like it has actually returned that value; however the state (local vars, IP) is saved, so that when X is called again it starts execution in the next statement after the last "yield return".

Is this a portable legal solution, as long as I keep local variables volatile:
1 #include <stdio.h>
2 #include <setjmp.h>
3 #define STATE_FUNC static jmp_buf _env; static int first_time = 1; if(first_time) first_time = 0; else longjmp(_env, 1)
4 #define yield(X) do { if (!setjmp(_env)) return X; } while(0)
5
6 int next_integer()
7 {
8 STATE_FUNC;
9 while(1) {
10 yield(1);
11 yield(2);
12 yield(3);
13 }
14 }
15
16
17 int main()
18 {
19 int p = 20;
20 while(p--) printf("%d", next_integer());
21 return 0;
22 }
 
Tim Rentsch

Player said:
I'm not quite familiar with setjmp/longjmp (and neither with macros
either) but I was trying to make something similar to this "yielding
return" construct from C#, namely: when a function X "yields returns"
some value everything happens like it has actually returned that
value; however the state (local vars, IP) is saved, so that when X is
called again it starts execution in the next statement after the last
"yield return".

Is this a portable legal solution, as long as I keep local variables
volatile:

1 #include <stdio.h>
2 #include <setjmp.h>
3 #define STATE_FUNC static jmp_buf _env; static int first_time = 1; \
if (first_time) first_time = 0; else longjmp(_env, 1)
4 #define yield(X) do { if (!setjmp(_env)) return X; } while(0)
5
6 int next_integer()
7 {
8 STATE_FUNC;
9 while(1) {
10 yield(1);
11 yield(2);
12 yield(3);
13 }
14 }
15
16
17 int main()
18 {
19 int p = 20;
20 while(p--) printf("%d", next_integer());
21 return 0;
22 }

No. The function invocation containing the setjmp() call has
exited. Attempting to do a longjmp() on a buffer filled by a
setjmp() whose containing function's activation has exited is
a big no-no, what the ISO standard calls 'undefined behavior'.
This means your program can misbehave in any way at all, and
there is nothing to be done about it. The worst thing is, trying
it in simple cases like this may appear to work, but later the
program might fail in a way worse than you can imagine, and at
the worst possible moment. Using setjmp() like this is not
the way to do what you are hoping to achieve, either portably
or practically.
 
James Kuyper

I'm not quite familiar with setjmp/longjmp (and neither with macros either) but I was trying to make something similar to this "yielding return" construct from C#, namely: when a function X "yields returns" some value everything happens like it has actually returned that value; however the state (local vars, IP) is saved, so that when X is called again it starts execution in the next statement after the last "yield return".

Is this a portable legal solution, as long as I keep local variables volatile:
1 #include <stdio.h>
2 #include <setjmp.h>
3 #define STATE_FUNC static jmp_buf _env; static int first_time = 1; if (first_time) first_time = 0; else longjmp(_env, 1)
4 #define yield(X) do { if (!setjmp(_env)) return X; } while(0)
5
6 int next_integer()
7 {
8 STATE_FUNC;
9 while(1) {
10 yield(1);
11 yield(2);
12 yield(3);
13 }
14 }
15
16
17 int main()
18 {
19 int p = 20;
20 while(p--) printf("%d", next_integer());
21 return 0;

It's implementation-defined whether the last character in a text stream
must be a new-line character. It's safest to assume that it is required.
To avoid problems, just add

putchar('\n');

Note: you should not display source code with line numbers; it's
inconvenient for anyone who wants to copy your code and try to compile it.

I didn't notice any other problems with your code, but that doesn't mean
as much as it should - I often miss things. After fixing that issue, it
compiles without error messages using gcc and my preferred options. When
executed, it does what I thought it was supposed to do.

However, I think this is fundamentally a bad idea. Many people coming to
C from other languages have the bright idea of using various kinds of
macros to make C work like the other language. These schemes usually
produce code that is easily understood only by the person who wrote it
(and often not even by that person, if they come back to the code a few
years later). I'm a C expert, and I had to do a lot of careful thinking
to reach the (possibly incorrect) conclusion that this code is correct.
You'd be much better off getting familiar with the C way of doing
things, rather than trying to make C work like C#.

This example was obviously written solely to test out your idea; there
are clearly much simpler ways of producing the same output. Can you give a
simple example of C# code that doesn't just demonstrate this feature,
but has a good reason for using it? Be sure to explain any other
features of C# that the code relies on, for the benefit of those of us
who don't know C#. We could show you what the corresponding C way would
be to achieve similar results. The best way to do it in C might, in
fact, make use of something like your STATE_FUNC and yield() macros -
but I doubt it. It's more likely to involve a substantial rearrangement
of the code, so it's not necessary to call setjmp() or longjmp() to make
things happen in the desired order.
 
Eric Sosman

I'm not quite familiar with setjmp/longjmp (and neither with macros either) but I was trying to make something similar to this "yielding return" construct from C#, namely: when a function X "yields returns" some value everything happens like it has actually returned that value; however the state (local vars, IP) is saved, so that when X is called again it starts execution in the next statement after the last "yield return".

Is this a portable legal solution, as long as I keep local variables volatile:
1 #include <stdio.h>
2 #include <setjmp.h>
3 #define STATE_FUNC static jmp_buf _env; static int first_time = 1; if (first_time) first_time = 0; else longjmp(_env, 1)
4 #define yield(X) do { if (!setjmp(_env)) return X; } while(0)
5
6 int next_integer()
7 {
8 STATE_FUNC;
9 while(1) {
10 yield(1);
11 yield(2);
12 yield(3);
13 }
14 }
15
16
17 int main()
18 {
19 int p = 20;
20 while(p--) printf("%d", next_integer());
21 return 0;
22 }

No: The `return' actually returns, destroying its function's
entire invocation and all the context that accompanies it. When
the function is re-entered a brand-new invocation begins, having
no connection to the previous one. You cannot expect longjmp()
to restore all of a destroyed context.

Concrete example: On many systems, entering a function reserves
a greater or lesser amount of stack space for the function's use.
Returning releases that space, allowing subsequent functions (or
the system itself) to use it. When your next_integer() function
releases its stack space, printf() and all the things printf()
calls are at liberty to reserve it, use it, and release it in
their turn. Re-entering next_integer() will reserve the space
again (or perhaps an equivalent amount of space at some other
memory address!), but it may have been scribbled on in the meantime
and anything the prior next_integer() invocation may have stored
there -- like `volatile' variables -- may well be garbage.

What problem are you trying to solve?
 
glen herrmannsfeldt

(snip)
Note: you should not display source code with line numbers; it's
inconvenient for anyone who wants to copy your code and try to
compile it.

Sometimes it does make the explanation easier, though.

And it isn't that hard to remove them. One (ex) command in vi,
I presume also in many other editors.

-- glen
 
James Kuyper

(snip)


Sometimes it does make the explanation easier, though.

Comments are more useful than line numbers for that purpose, in my
experience.

And it isn't that hard to remove them. One (ex) command in vi,

Well, yes, but the ex commands supported by vi are a pretty powerful
system; the command needed for this case wasn't very complicated, but
I've known many people who were only moderately familiar with vi who
would not have been able to figure out how to do it.

I presume also in many other editors.

I've used many text editors other than 'vi' which had no method I could
use to remove the line numbers that was any faster than manually going
to each line and deleting them.
 
James Kuyper

....
I didn't notice any other problems with your code, but that doesn't mean
as much as it should - I often miss things. ...

And as it turned out, I did miss a very important issue, which Eric
pointed out. I've almost never used longjmp(), which explains why I
forgot that rule. Let me explain in more detail than he did.

Calling longjmp() when the corresponding setjmp() occurred inside a
function or thread that has already terminated has undefined behavior.
Termination includes executing a return statement, or implicitly
returning when the final '}' of a function definition is reached, or
calling longjmp(), abort(), exit(), _Exit(), quick_exit() or thrd_exit()
from within the function or any of its subroutines. In other words, the
only places from which you can call longjmp() with defined behavior are
inside the same function in which you made the corresponding setjmp()
call, or in a subroutine of that function.

Your function always returns immediately after calling setjmp(), which
guarantees that you will never be able to safely make use of the jmp_buf
filled in by setjmp(). The fact that your program actually appears to
work is one of the options allowed by the phrase "undefined behavior".
That phrase also allows your program to fail catastrophically, or to
malfunction in a subtle way that might (in a more complicated program)
take you years to notice.
 
Jorgen Grahn

Eric Sosman wrote, in reply to Player's post of 1/3/2013 1:11 PM:
What problem are you trying to solve?

I don't know, but I *do* know yield as implemented in Python[1].

It's neat precisely because it's not easily implemented via anything
else. In interesting cases you have to keep track of state manually,
in a problem-specific manner; the interface tends to become "advance
this state machine one step and look for a generated value".

At that point ... yeah, what problem is the OP trying to solve?
He won't be able to create a general mechanism, that's for sure.

/Jorgen

[1] I suspect both Python and C# borrowed it from elsewhere.
 
Player

- About the line numbers:
Yes, I wasn't sure about that either, but I thought it wouldn't hurt since maybe something like "sed s/[0-9]*\ //g" would get rid of them; anyway, if I post code again I'll omit them.

- About the "yield return": I was writing an 'iterative' parser that needs to update some global state at each keypress (I'm using curses) and at some point I thought that this yield return thing would save me a lot of boilerplate code. Anyway, it's useful for many other things: articles about C#'s yield return feature have lots of examples, mainly focused on getting things done without locking the UI thread (btw I'm far from being a C# fan, I just liked that feature).

- So, I see the problem: the specification of setjmp was (maybe) written on the assumption that the implementation would just save the registers, and once the function that called setjmp has returned, the contents that SP pointed to may already have changed.

- Is there a portable way of doing such a thing? Maybe using getcontext/setcontext?
 
Shao Miller

- Is there a portable way of doing such a thing? Maybe using getcontext/setcontext?

That's a common approach for POSIX, but it's not Standard C. If you're
using POSIX, you'll be able to find some examples.

- Shao Miller
 
Eric Sosman

- About the line numbers:
Yes, I wasn't sure about that either, but I thought it wouldn't hurt since maybe something like "sed s/[0-9]*\ //g" would get rid of them; anyway, if I post code again I'll omit them.

- About the "yield return": I was writing an 'iterative' parser that needs to update some global state at each keypress (I'm using curses) and at some point I thought that this yield return thing would save me a lot of boilerplate code. Anyway, it's useful for many other things: articles about C#'s yield return feature have lots of examples, mainly focused on getting things done without locking the UI thread (btw I'm far from being a C# fan, I just liked that feature).

Two C approaches to your problem occur to me (there are almost
surely others, and possibly better):

- Make a concrete representation of the entire state, instead of
encoding part of it as the value of the program counter ("I'm
at instruction X, so the state must be Y"). This may have other
benefits, particularly if you're parsing something that has
recurring sub-structure (so "I'm at X" isn't the whole story).
The "lot of boiler plate" probably amounts to little more than
a `switch' statement.

- Forget the result of a partial parse, and restart from the
beginning each time. "Inefficiency!" I hear somebody cry --
but how rapidly can the user press keys, and how much parsing
can you do in the time between one keystroke and the next?
We're not in the 1970's any more ...

- So, I see the problem: the specification of setjmp was (maybe) written on the assumption that the implementation would just save the registers, and once the function that called setjmp has returned, the contents that SP pointed to may already have changed.

That's probably how it started. The Standard says very little
about what is stored in a `jmp_buf' -- mostly, it lists things that
are *not* stored, like floating-point flags.
- Is there a portable way of doing such a thing? Maybe using getcontext/setcontext?

Depends on what you mean by "such," I guess. You can't leave a
function -- more generally, a block -- and expect its context to be
preserved so you can magically restart _in medias res_. The two
functions you mention can't be part of "a portable way," since they're
not part of the standard C library. To get out and back in again
portably, you need to re-enter in a "normal" way and then use saved
information to guide you back to the desired state of affairs; C
itself doesn't know what might or mightn't need saving, and so won't
try to do it for you.

ISTM that the fashion nowadays has largely turned away from
coroutines and toward multiple threads communicating via messages,
buffers, queues, and so on. Of all the languages I've written in
four-plus decades of programming, only one had explicit coroutine
support -- and that was an assembler!
 
Greg Martin

Jorgen Grahn said:
Eric Sosman wrote, in reply to Player's post of 1/3/2013 1:11 PM:
What problem are you trying to solve?

I don't know, but I *do* know yield as implemented in Python[1].

It's neat precisely because it's not easily implemented via anything
else. In interesting cases you have to keep track of state manually,
in a problem-specific manner; the interface tends to become "advance
this state machine one step and look for a generated value".

At that point ... yeah, what problem is the OP trying to solve?
He won't be able to create a general mechanism, that's for sure.

/Jorgen

[1] I suspect both Python and C# borrowed it from elsewhere.

As I understand the Python version it's a combination iterator and
generator which is a pretty common idiom in functional programming.

It may be possible that the OP's problem can be solved with function
pointers and some form of saved state.

A trivial form of yield and next posted for humour's sake:


#include <stdio.h>

struct Items {
    int current_pos;
    int array[10];
} items = {0, {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} };

int next (int (*fun)(int)) {
    return fun (items.array [items.current_pos++]);
}

int yield (int i) {
    return 2 * i;
}

int main () {

    for (int i = 0; i < 10; ++i) {
        printf ("%d\n", next (&yield));
    }

    return 0;
}
 
Tim Rentsch

Player said:
- About the "yield return": I was writing an 'iterative'
parser that needs to update some global state at each keypress
(I'm using curses) and at some point I thought that this
yield return thing would save me a lot of boilerplate code.
Anyway, it's useful for many other things: articles about
C#'s yield return feature have lots of examples, mainly
focused on getting things done without locking the UI thread (btw
I'm far from being a C# fan, I just liked that feature).

It sounds like a great way to make incomprehensible
programs. You're better off looking for a different
design that doesn't make use of this "feature".

- So, I see the problem: the specification of setjmp was
(maybe) written on the assumption that the implementation would
just save the registers, and once the function that called setjmp
has returned, the contents that SP pointed to may already have
changed.

The specification of setjmp was written some time after lots
of implementations had already been done, not the other way
around.

- Is there a portable way of doing such a thing? [snip]

The mechanism you've outlined is fundamentally incompatible
with C's execution model. There are other mechanisms that
may be close enough to how you conceptualize what you're
looking for so that they could be used. But it's a pretty
good bet that the program will be done sooner and more
reliably if you look for a conceptualization more in line
with how the language already works than looking for ways
to extend it.

Alternatively, work in a different language.
 
Andrew Cooper

I'm not quite familiar with setjmp/longjmp (and neither with macros either) but I was trying to make something similar to this "yielding return" construct from C#, namely: when a function X "yields returns" some value everything happens like it has actually returned that value; however the state (local vars, IP) is saved, so that when X is called again it starts execution in the next statement after the last "yield return".

Is this a portable legal solution, as long as I keep local variables volatile:
1 #include <stdio.h>
2 #include <setjmp.h>
3 #define STATE_FUNC static jmp_buf _env; static int first_time = 1; if (first_time) first_time = 0; else longjmp(_env, 1)
4 #define yield(X) do { if (!setjmp(_env)) return X; } while(0)
5
6 int next_integer()
7 {
8 STATE_FUNC;
9 while(1) {
10 yield(1);
11 yield(2);
12 yield(3);
13 }
14 }
15
16
17 int main()
18 {
19 int p = 20;
20 while(p--) printf("%d", next_integer());
21 return 0;
22 }

No. However, it looks as if you want a coroutine.

http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

It is a fantastic read, and solves the problem you appear to have in a
substantially safer way.

~Andrew
 
Player

On Friday, 4 January 2013 at 00:46:50 UTC, Andrew Cooper wrote:
No. However, it looks as if you want a coroutine.

http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

It is a fantastic read, and solves the problem you appear to have in a
substantially safer way.

~Andrew

Very nice! My previous attempt, before trying setjmp/longjmp, used MACROS VERY SIMILAR to his, but with __COUNTER__ (to generate distinct LABELS) and gotos; I had no idea that switch could be used that way. Anyway, I dropped it, partly because (I guess) __COUNTER__ is gcc-specific and partly because I thought the setjmp solution would work and would NOT REQUIRE variables to be static.
 
Player

- Make a concrete representation of the entire state, instead of
encoding part of it as the value of the program counter ("I'm
at instruction X, so the state must be Y"). This may have other
benefits, particularly if you're parsing something that has
recurring sub-structure (so "I'm at X" isn't the whole story).
The "lot of boiler plate" probably amounts to little more than
a `switch' statement.

Yes, but that's sort of what I was trying to avoid.

- Forget the result of a partial parse, and restart from the
beginning each time. "Inefficiency!" I hear somebody cry --
but how rapidly can the user press keys, and how much parsing
can you do in the time between one keystroke and the next?
We're not in the 1970's any more ...

I actually started coding THAT way. But I kept hearing voices inside me screaming "Inefficiency!", "Quadratic time!", etc. But I will try not to listen to them and do it that way. The code will be more readable and handling BACKSPACE (or using arrow keys to edit the command) won't be a pain.

NOTE: What I'm coding is an ed(-like) editor (for myself) which, as the command gets typed, starts performing the action in vim (since vim can be used as a client/server). That way I get ed's input syntax (which I find brilliant) together with vim's output, with its nice syntax highlighting, etc.
 
Andrew Cooper

No. However, it looks as if you want a coroutine.
Very nice! My previous attempt, before trying setjmp/longjmp, used MACROS VERY SIMILAR to his, but with __COUNTER__ (to generate distinct LABELS) and gotos; I had no idea that switch could be used that way. Anyway, I dropped it, partly because (I guess) __COUNTER__ is gcc-specific and partly because I thought the setjmp solution would work and would NOT REQUIRE variables to be static.

Don't worry - they are not required to be.

Have a look at coroutines.h which provides two examples. The
concurrently re-entrant routines (the ccr set) maintain their state
without static variables.

There is nothing to stop you expanding that example to allow state to be
maintained elsewhere as well.

~Andrew
 
Nobody

- Is there a portable way of doing such a thing? Maybe using
getcontext/setcontext?

Those aren't part of the ISO C standard; but if they're available, they're
probably what you want. They used to be in POSIX, but were removed to
avoid having to specify how they interact with threads.

Windows has "fibers": user-space, co-operative versions of threads.

Threads would also work, but may be too inefficient if the rate of context
switching is high.
 
Phil Carmody

Player said:
I'm not quite familiar with setjmp/longjmp (and neither with macros either) but I was trying to make something similar to this "yielding return" construct from C#, namely: when a function X "yields returns" some value everything happens like it has actually returned that value; however the state (local vars, IP) is saved, so that when X is called again it starts execution in the next statement after the last "yield return".

Is this a portable legal solution, as long as I keep local variables volatile:
1 #include <stdio.h>
2 #include <setjmp.h>
3 #define STATE_FUNC static jmp_buf _env; static int first_time = 1; if (first_time) first_time = 0; else longjmp(_env, 1)

Are you trying to make people's eyes bleed?
4 #define yield(X) do { if (!setjmp(_env)) return X; } while(0)

Oh gawd, a macro which makes use of a variable defined elsewhere.
5
6 int next_integer()
7 {
8 STATE_FUNC;

And you only use that macro once? Why did you make it a macro, that's
just deliberate obfuscation.
9 while(1) {
10 yield(1);
11 yield(2);
12 yield(3);

Did you really think that a simple if-statement needed to be obfuscated
here too?

I'm amazed anyone could be bothered to review this code, as you've
deliberately moved all of the important information away from the
location where it's important. If I saw this in a work context I'd
nack it by the time I reached about line 4.

Phil
 
James Kuyper

Are you trying to make people's eyes bleed?


Oh gawd, a macro which makes use of a variable defined elsewhere.


And you only use that macro once? Why did you make it a macro, that's
just deliberate obfuscation.

This is clearly minimal example code. Presumably he's planning to
write code where the macro might be used in many different locations.

Did you really think that a simple if-statement needed to be obfuscated
here too?

Again: this is clearly minimal example code - it's not intended to
justify use of this technique, merely for testing whether or not the
technique would work.

I'm not defending this idea. Eric already pointed out that it has
undefined behavior, and I explained that fact in more detail. Even if
it did have defined behavior, and the defined behavior matched his
expectations, I would still disapprove of this technique as excessively
confusing. Depending upon precisely what it is he's trying to do, I
suspect that callback functions and/or saving the state of the function
in a structure are more likely than setjmp()/longjmp() to be the best way to
do it.

I'm merely pointing out that these particular criticisms ignore the context.
 
