State Machine

H

holysmoke

Hi All,

I have with me two ways to implement a state machine.

One I use the switch statement:
switch ( media_event )
{
case EVENT_HIDE:
if ( media_state != STATE_HIDDEN )
media_state = STATE_HIDDEN;
function(a,b,c,d,e,f);
break;
.
.
.
case EVENT_SCROLL:
if ( media_state != STATE_HIDDEN )
/* Scroll */

Notice for every event there are two conditional statements.

Another way is use an array of function pointers:
static media_callback media_cb [EVENT_MAX][STATE_MAX] =
{
{dummy, media_hide, media_hide, media_hide },
...
};

media_cb[media_event][media_state](a,b,c,d,e,f);

My question is:
1. What is the cost of calling dummy function above compared with cost
of two conditionals? Is question too platform specific? (FWIW I'm on
ARM9, using ADS)

2. Is introducing a conditional at the point of call the better
solution? (& of course replacing dummy with NULL in media_cb array
above)
if (media_cb[media_event][media_state] != NULL)
media_cb[media_event][media_state](x,y,z);

3. Any other better way to implement this?

Regards

PS: Following are the data types

typedef enum _media_event
{
EVENT_HIDE,
EVENT_PLAY,
EVENT_SHOW,
EVENT_SCROLL,
EVENT_MAX
} media_event;

typedef enum _media_state
{
STATE_HIDDEN,
STATE_PLAYING,
STATE_SHOWING,
STATE_SCROLLING,
STATE_MAX
} media_state;
 
W

Walter Roberson

holysmoke said:
My question is:
1. What is the cost of calling dummy function above compared with cost
of two conditionals? Is question too platform specific? (FWIW I'm on
ARM9, using ADS)

Yes, the cost of any particular hardware operation as too variable
for us to get into here.
2. Is introducing a conditional at the point of call the better
solution? (& of course replacing dummy with NULL in media_cb array
above)
if (media_cb[media_event][media_state] != NULL)
media_cb[media_event][media_state](x,y,z);

You'd have to measure to find out; and with the next compiler release
or with different compliation flags you might get a different result.
 
I

Ian Collins

holysmoke said:
Hi All,

I have with me two ways to implement a state machine.

One I use the switch statement:

Notice for every event there are two conditional statements.

Another way is use an array of function pointers:
static media_callback media_cb [EVENT_MAX][STATE_MAX] =
{
{dummy, media_hide, media_hide, media_hide },
...
};

media_cb[media_event][media_state](a,b,c,d,e,f);

My question is:
1. What is the cost of calling dummy function above compared with cost
of two conditionals? Is question too platform specific? (FWIW I'm on
ARM9, using ADS)
Can't say, you'd have to profile it, or at least look at the generated
assembly code.
2. Is introducing a conditional at the point of call the better
solution? (& of course replacing dummy with NULL in media_cb array
above)
if (media_cb[media_event][media_state] != NULL)
media_cb[media_event][media_state](x,y,z);
I'd probably go for something like

yourFunctionType fn = media_cb[media_event][media_state];

if( fn )
{
fn(x,y,z);
}

Unless profiling showed this to be too expensive and there was a faster
alternative.
 
C

Christopher Layne

Ian said:
yourFunctionType fn = media_cb[media_event][media_state];

if( fn )
{
fn(x,y,z);
}

Unless profiling showed this to be too expensive and there was a faster
alternative.

OT: I've heard that modern compilers *can* also inline callbacks to code which
they have direct access to (i.e. not in another library, object file, etc.).
 
B

Ben Pfaff

Christopher Layne said:
OT: I've heard that modern compilers *can* also inline callbacks to code which
they have direct access to (i.e. not in another library, object file, etc.).

I've been testing this from time to time with GCC as it evolves,
because it has wonderful optimization potential, that would allow
some of the best of C++ templates to be implemented in C.

Imagine that you have a header file that defines an inline
qsort-like function named my_qsort(). Now, in a .c file, define
a comparison function for, say, ints, and write an extern
function that just calls my_qsort() passing that comparison
function. If all goes well, the sort and all the comparisons get
inlined by the compiler, so that you end up with a nice, fast
sort function with the modularity of callbacks but without the
performance penalty.

Now extend it by doing this with ADTs such as balanced binary
search trees and hash tables. With reasonable use of macros,
this is pretty spiffy.

But so far I haven't been able to get GCC to cooperate to the
extent that I want.
 
C

Christopher Layne

Ben said:
Imagine that you have a header file that defines an inline
qsort-like function named my_qsort(). Now, in a .c file, define
a comparison function for, say, ints, and write an extern
function that just calls my_qsort() passing that comparison
function. If all goes well, the sort and all the comparisons get
inlined by the compiler, so that you end up with a nice, fast
sort function with the modularity of callbacks but without the
performance penalty.

Now extend it by doing this with ADTs such as balanced binary
search trees and hash tables. With reasonable use of macros,
this is pretty spiffy.

But so far I haven't been able to get GCC to cooperate to the
extent that I want.

Yes. I seek that same holy grail that you also seek Ben. About the best I can
get out of it is this:

/* no error checking example */
#include <stdlib.h>
#include <stdio.h>

typedef struct cbt__ cbt;
typedef void (*cbt_cb_display)(cbt *);

struct cbt__ {
char *text;
cbt_cb_display display;
};

static void cbt_display(cbt *c)
{
fprintf(stdout, "c->text == %s\n", c->text);

return;
}

extern cbt *cbt_new(void)
{
cbt *c;

c = malloc(sizeof *c);
c->text = NULL;
c->display = cbt_display;

return c;
}

int main(int argc, char **argv)
{
cbt *c;

if (argc <= 1) return EXIT_FAILURE;

c = cbt_new();
c->text = argv[1];
c->display(c);

return 0;
}

Lots of x86 asm here:

No optimization on this one:
$ gcc -W -Wall -pedantic -O0 -S -o /dev/stdout cb.c
.file "cb.c"
.section .rodata
..LC0:
.string "c->text == %s\n"
.text
.type cbt_display, @function
cbt_display:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl 8(%ebp), %eax
movl (%eax), %eax
movl stdout, %edx
movl %eax, 8(%esp)
movl $.LC0, 4(%esp)
movl %edx, (%esp)
call fprintf
leave
ret
.size cbt_display, .-cbt_display
..globl cbt_new
.type cbt_new, @function
cbt_new:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl $8, (%esp)
call malloc
movl %eax, -4(%ebp)
movl -4(%ebp), %eax
movl $0, (%eax)
movl -4(%ebp), %eax
movl $cbt_display, 4(%eax)
movl -4(%ebp), %eax
leave
ret
.size cbt_new, .-cbt_new
..globl main
.type main, @function
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $36, %esp
movl %ecx, -28(%ebp)
movl -28(%ebp), %eax
cmpl $1, (%eax)
jg .L6
movl $1, -24(%ebp)
jmp .L8
..L6:
call cbt_new
movl %eax, -8(%ebp)
movl -28(%ebp), %edx
movl 4(%edx), %eax
addl $4, %eax
movl (%eax), %edx
movl -8(%ebp), %eax
movl %edx, (%eax)
movl -8(%ebp), %eax
movl 4(%eax), %edx
movl -8(%ebp), %eax
movl %eax, (%esp)
call *%edx # the callback from main().
movl $0, -24(%ebp)
..L8:
movl -24(%ebp), %eax
addl $36, %esp
popl %ecx
popl %ebp
leal -4(%ecx), %esp
ret
.size main, .-main
.ident "GCC: (GNU) 4.1.1"
.section .note.GNU-stack,"",@progbits
-----

Now we go hard core optimization:
$ gcc -W -Wall -pedantic -O3 -S -o /dev/stdout cb.c
.file "cb.c"
.text
.p2align 4,,15
..globl cbt_new
.type cbt_new, @function
cbt_new:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
movl $8, (%esp)
call malloc
movl $0, (%eax)
movl $cbt_display, 4(%eax)
leave
ret
.size cbt_new, .-cbt_new
.section .rodata.str1.1,"aMS",@progbits,1
..LC0:
.string "c->text == %s\n"
.text
.p2align 4,,15
.type cbt_display, @function
cbt_display:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl 8(%ebp), %eax
movl (%eax), %eax
movl %eax, 8(%esp)
movl $.LC0, %eax
movl %eax, 4(%esp)
movl stdout, %eax
movl %eax, (%esp)
call fprintf
leave
ret
.size cbt_display, .-cbt_display
.p2align 4,,15
..globl main
.type main, @function
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
movl $1, %eax
pushl %ebp
movl %esp, %ebp
subl $24, %esp
cmpl $1, (%ecx)
movl %ecx, -8(%ebp)
movl %ebx, -4(%ebp)
movl 4(%ecx), %ebx
jle .L8
movl $8, (%esp)
call malloc
movl $0, (%eax)
movl 4(%ebx), %edx
movl $cbt_display, 4(%eax)
movl %edx, (%eax)
movl %eax, (%esp)
call cbt_display # callback is not a deref here.
xorl %eax, %eax
..L8:
movl -8(%ebp), %ecx
movl -4(%ebp), %ebx
movl %ebp, %esp
popl %ebp
leal -4(%ecx), %esp
ret
.size main, .-main
.ident "GCC: (GNU) 4.1.1"
.section .note.GNU-stack,"",@progbits
 
I

Ian Collins

Ben said:
I've been testing this from time to time with GCC as it evolves,
because it has wonderful optimization potential, that would allow
some of the best of C++ templates to be implemented in C.
That would be a good thing to have. Temples give the C++ standard
library algorithms the performance edge over their C counterparts.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,770
Messages
2,569,584
Members
45,075
Latest member
MakersCBDBloodSupport

Latest Threads

Top