Handling 5000 different functionalities - Optimised way in C

K

karthikbg

Hi,

I have a Board of Coldfire communicating with a PC on the other side.

The PC will be sending via TCP/IP some 5000 different command
identification numbers only
( consider 1 to 5000 ) at 1 minute interval to the board.

The board has Ucos on Coldfire and 8MB flash .
The board will parse the id and carry out the different operations as
per the specification and send the output back to the PC via TCP/IP.

I think 'if-else' is not a good option as it eats up stack and 'switch
case ' is also not a good option as it internally uses 'if-else' only.

Is there any other optimised way to implement this in C ?

Tonnes of Thx in advans,
Karthik Balaguru
 
S

santosh

karthikbg said:
Hi,

I have a Board of Coldfire communicating with a PC on the other side.

The PC will be sending via TCP/IP some 5000 different command
identification numbers only
( consider 1 to 5000 ) at 1 minute interval to the board.

The board has Ucos on Coldfire and 8MB flash .
The board will parse the id and carry out the different operations as
per the specification and send the output back to the PC via TCP/IP.

I think 'if-else' is not a good option as it eats up stack and 'switch
case ' is also not a good option as it internally uses 'if-else' only.

Is there any other optimised way to implement this in C ?

C says nothing about details of code generation and optimisation. What
makes you say that switch isn't good enough? Did you actually try it?
What makes you conclude that switch uses if else internally? Did you
actually inspect the assembly code emitted? Did you try a simple loop?
 
K

karthikbg

santosh said:
C says nothing about details of code generation and optimisation. What
makes you say that switch isn't good enough? Did you actually try it?
What makes you conclude that switch uses if else internally? Did you
actually inspect the assembly code emitted? Did you try a simple loop?

I came across something as below in net while looking in for the
disadvantages of methods
that can be tried to implement the handling of 5000 different cases.

The cc compiler under UNIX (Silicon Graphics, R3000 processor) produces

something like this in the assembly language:


C PSEUDO ASSEMBLY
------------------------------------------------------
switch(j) LOAD j INTO register_1
{ JUMP label_default (unconditionally)



case 1000: .... label_1000:
.... (no jump to label_end if no
break)
case 2000: .... label_2000:
....
case 500: .... label_500:
....
default: .... label_default:
BRANCH label_1000 if register_1 ==
1000
BRANCH label_2000 if register_1 ==
2000
BRANCH label_500 if register_1 ==
500
.....


} label_end:


--------------------------------------------------------

As you see the "case" statement finally is a serie of "ifs",

So, i believe that "case" statment finally is a series of 'ifs'
If that is wrong, Kindly tell me how the "case" statement would have
been implemented
in C language.

So, I am looking an optmised method other than 'switch-case' .

Karthik Balaguru
 
S

santosh

I came across something as below in net while looking in for the
disadvantages of methods

Why don't you inspect the output of _your_ compiler rather than
searching around on the net?

The cc compiler under UNIX (Silicon Graphics, R3000 processor) produces
So, i believe that "case" statment finally is a series of 'ifs'
If that is wrong, Kindly tell me how the "case" statement would have
been implemented in C language.

The C standard says nothing about code generation. Typically switch
statements with more than a few case labels are implemented as jump
tables. This may or may not be true for a particular implementation.

At this point, you'll have ask in a group specific to your platform or
compiler. Try also comp.programming, alt.lang.asm, comp.embedded and
comp.compilers.
 
J

jacob navia

karthikbg a écrit :
So, i believe that "case" statment finally is a series of 'ifs'
If that is wrong, Kindly tell me how the "case" statement would have
been implemented
in C language.

So, I am looking an optmised method other than 'switch-case' .

Karthik Balaguru

Another approach would be a sparse function table with 5000 entries.
In that case you would index (maybe after some tests) that table with
the value "j" of your example.

This would mean however that you would lose the context of the switch,
i.e. you would have to pass to all those functions the context as a
parameter array or another method.

This would speed up the if/else tests, but it would imply a function
call and maybe some parameter passing.

To avoid parameter passing you can make all variables in that function
static, and put them at global scope so all function calls would be to a
void function (no parameters). This would be faster in most cases.

To avoid writing 5000 function write a "default" function (that would
correspond to the default case" and fill the table with that function.
Then, fill the case indexes you want to treat specially with a function
that will handle that case.

jacob
 
J

jacob navia

santosh a écrit :
Why don't you inspect the output of _your_ compiler rather than
searching around on the net?






The C standard says nothing about code generation.

Who cares? We are speaking about V here. If you want to speak about the
standard goto comp.std.c that is the newgroup that handles that subject.

Typically switch
statements with more than a few case labels are implemented as jump
tables. This may or may not be true for a particular implementation.

You may or may not speak absolutely nonsense.
At this point, you'll have ask in a group specific to your platform or
compiler. Try also comp.programming, alt.lang.asm, comp.embedded and
comp.compilers.

To answer a question about opyimizing C?

This is a question for comp.lang.c!

GOSH!
 
J

jacob navia

jacob navia a écrit :
Who cares? We are speaking about V here.

That should have been "C" but the v is the letter right of it
in my keyboard. Sorry
 
K

Keith Thompson

karthikbg said:
I came across something as below in net while looking in for the
disadvantages of methods
that can be tried to implement the handling of 5000 different cases.

The cc compiler under UNIX (Silicon Graphics, R3000 processor) produces

something like this in the assembly language:


C PSEUDO ASSEMBLY
------------------------------------------------------
switch(j) LOAD j INTO register_1
{ JUMP label_default (unconditionally)



case 1000: .... label_1000:
.... (no jump to label_end if no
break)
case 2000: .... label_2000:
....
case 500: .... label_500:
....
default: .... label_default:
BRANCH label_1000 if register_1 ==
1000
BRANCH label_2000 if register_1 ==
2000
BRANCH label_500 if register_1 ==
500
.....


} label_end:


--------------------------------------------------------

As you see the "case" statement finally is a serie of "ifs",

So, i believe that "case" statment finally is a series of 'ifs' If
that is wrong, Kindly tell me how the "case" statement would have
been implemented in C language.
[...]

A C compiler will generate code for a switch statement that correctly
implements the semantics of the switch statement. That's all that the
language (more precisely the language standard) guarantees. In most
cases, that's all you need to care about.

In practice, a decent compiler will generate code that's as close as
reasonably possible to optimal, in time and/or space. Sometimes time
and space optimization (making the code fast vs. making the code
small) can be in conflict. But if you write a switch statement,
you've already given the compiler all the information it needs; you
can usually trust it to generate good code, especially if you ask for
optimization.

In the code you show above, the switch statement is very sparse; the
cases cover a range of 1501 distinct values, but only 3 of them are
actually used. I speculate that the compiler decided, probably
correctly, that a 1501-entry jump table (or a 2001-entry jump table if
it has to be zero-based?) would be overkill. If you have a switch
statement with 5000 cases, and they're reasonably dense, then the
compiler is more likely to generate a jump table -- and if it doesn't,
then it's probably because there's a good reason not to do so for your
platform.

What you're worrying about is called micro-optimization, and it's
usually a bad idea unless you *really* need to get that extra few
percent of performance. The compiler is often better at optimization
that you or I would be; it's not as intelligent, but it has a number
of advantages that we don't. Write clear code (in your case, a
straightforward switch statement is probably the clearest choice),
compile it, execute it, measure it. If it's small enough and fast
enough for your purposes, you're done; don't waste hours of your time
to save a few milliseconds of your computer's time. If it's *not*
small enough and fast enough, try using a profiler to find out where
the program is spending most of its time, and perhaps examine the
assembly language output of *your* compiler.

Higher-level optimization, on the other hand, can be invaluable. This
means choosing the best data structures and algorithms for the task at
hand. Micro-optimizing the inner loop of a bubblesort might make your
program run significantly faster, but it's nothing like replacing the
bubblesort with a quicksort.

First Rule of Program Optimization: Don't do it.
Second Rule of Program Optimization (for experts only!): Don't do it yet.
 
S

Stephen Sprunk

karthikbg said:
I have a Board of Coldfire communicating with a PC on the other side.

The PC will be sending via TCP/IP some 5000 different command
identification numbers only
( consider 1 to 5000 ) at 1 minute interval to the board.

The board has Ucos on Coldfire and 8MB flash .
The board will parse the id and carry out the different operations as
per the specification and send the output back to the PC via TCP/IP.

I think 'if-else' is not a good option as it eats up stack and 'switch
case ' is also not a good option as it internally uses 'if-else' only.

Have you actually examined the output of _your_ compiler with the _full_
5000-case switch statement and optimization enabled to verify this is
true? Only if that is so, and actual profiling shows that this is a
serious performance problem, should you consider second-guessing the
compiler.
Is there any other optimised way to implement this in C ?

You could implement this via a jump table, either using a binary search
through the table if it's sparse or direct-indexed if it's dense, but
odds are your compiler will do this for you automatically if it is
indeed faster on your system.

S
 
K

Keith Thompson

Stephen Sprunk said:
You could implement this via a jump table, either using a binary
search through the table if it's sparse or direct-indexed if it's
dense, but odds are your compiler will do this for you automatically
if it is indeed faster on your system.

You can't really implement a jump table directly in C (other than by
writing a switch statement), but you can set up an array of function
pointers. The function call overhead *might* make this slower than
the jump table that's typically used to implement a switch statement.
 
C

Chris Torek

You can't really implement a jump table directly in C (other than by
writing a switch statement), but you can set up an array of function
pointers. The function call overhead *might* make this slower than
the jump table that's typically used to implement a switch statement.

You can also "densify the switch" in order to encourage the C compiler
to emit a jump table:

/* original:
switch (x) {
case 12: code_for_12(); break;
case 19: code_for_19(); break;
case 33: code_for_33(); break;
case 42: code_for_42(); break;
case 45: code_for_45(); break;
default: code_for_default(); break;
} */
static const reduce_table[45-12+1] = {
1, /* 12 */
0, 0, 0, 0, 0, 0, /* 13-18 */
2, /* 19 */
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 20-32 */
3, /* 33 */
0, 0, 0, 0, 0, 0, 0, 0, /* 34-41 */
4, /* 42 */
0, 0, /* 43-44 */
5 /* 45 */
};
if (x < 12 || x > 45) goto default_case;
switch (reduce_table[x - 12]) {
case 1: code_for_12(); break;
case 2: code_for_19(); break;
case 3: code_for_33(); break;
case 4: code_for_42(); break;
case 5: code_for_45(); break;
default: default_case: code_for_default(); break;
}

This is quite ugly (though it can be improved a bit by using symbolic
constants instead of 1, 2, 3, ... in the reduction table) and should
be used only after testing proves that other alternatives are in
fact still a bottleneck.

(I did this many years ago in some TeX-DVI file handling code, where
it did actually gain about 10% in runtime on the target machine. Of
course, back then "1 megahertz" was a reasonable speed for a CPU. :) )
 
C

CBFalconer

Chris said:
Keith Thompson said:
You can't really implement a jump table directly in C (other than by
writing a switch statement), but you can set up an array of function
pointers. The function call overhead *might* make this slower than
the jump table that's typically used to implement a switch statement.

You can also "densify the switch" in order to encourage the C compiler
to emit a jump table:

/* original:
switch (x) {
case 12: code_for_12(); break;
case 19: code_for_19(); break;
case 33: code_for_33(); break;
case 42: code_for_42(); break;
case 45: code_for_45(); break;
default: code_for_default(); break;
} */
static const reduce_table[45-12+1] = {
1, /* 12 */
0, 0, 0, 0, 0, 0, /* 13-18 */
2, /* 19 */
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 20-32 */
3, /* 33 */
0, 0, 0, 0, 0, 0, 0, 0, /* 34-41 */
4, /* 42 */
0, 0, /* 43-44 */
5 /* 45 */
};
if (x < 12 || x > 45) goto default_case;
switch (reduce_table[x - 12]) {
case 1: code_for_12(); break;
case 2: code_for_19(); break;
case 3: code_for_33(); break;
case 4: code_for_42(); break;
case 5: code_for_45(); break;
default: default_case: code_for_default(); break;
}

This is quite ugly (though it can be improved a bit by using symbolic
constants instead of 1, 2, 3, ... in the reduction table) and should
be used only after testing proves that other alternatives are in
fact still a bottleneck.

(I did this many years ago in some TeX-DVI file handling code, where
it did actually gain about 10% in runtime on the target machine. Of
course, back then "1 megahertz" was a reasonable speed for a CPU. :) )

My PascalP compiler did it automatically. However it didn't
convert sparse spread out cases into if else if sequences.
 
S

Stephen Sprunk

Keith Thompson said:
You can't really implement a jump table directly in C (other than by
writing a switch statement), but you can set up an array of function
pointers. The function call overhead *might* make this slower than
the jump table that's typically used to implement a switch statement.

This is a (very rough, off-the-cuff) example of what I learned a jump
table was:

typedef struct { char *name; int (*func)(void *); } jump_entry;
jump_entry jump_table[] = {
{ "aardvark", do_aardvark },
// insert other animals here
{ "zebra", do_zebra },
{ NULL, do_animalnotfound } };

int do_animal(char *name, void *param) {
int i = 0;

do {
if ((!jump_table.name) || (!strcmp(name,
jump_table.name)))
return jump_table.func(param);
++i;
} while (1);
}

Perhaps that's not technically a jump table, but AFAIK it's the closest
thing* one can implement in C, therefore the name is descriptive enough.

I agree that the compiler _could_ generate something faster, by jumping
to code inside the same function instead of calling another function,
but that shouldn't be a noticeable difference. Besides, if he's got a
5000-line switch statement, I'd really hope that each case was a
function call and not inline code simply for maintainability reasons.
Something that big will likely fall apart due to human error long before
you have to worry about the compiler generating suboptimal code.

Jump tables (or whatever you call my example above) also have a few
advantages over what you can achieve with a switch statement: you can
put the table in a different file than the dispatching code, which may
make it more maintainable, and you can modify the table at runtime,
unlike a switch. If you also parameterize the table, you can also have
several different tables being used by the same dispatch code depending
on context.

(* I know I should have used a binary search, but (a) that assumes the
table is sorted, which may not always be true, (b) my first attempt was
obviously wrong and I decided getting the point across was more
important than figuring out how to code a binary search on a
non-power-of-two-sized array, and (c) a binary search may have a larger
I-cache footprint, bomb the branch predictor, and/or confuse the data
prefetcher.)

S
 
K

karthikbg

I am placing the following answer which i got via mail which looks good
:
==========================================================


Another thing to note: your compiler may generate different

output depending on the optimization level.



Without any optimization (e.g. gcc foo.c), the compiler will

create a linear search for a switch table. However, if you use

optimization (e.g. gcc -O3 foo.c), the compiler will generate

either a jump table or divide-and-conquer search, depending on

the sparseness of the case labels.



Hint: You can use "gcc -O3 -S foo.c" to generate an assembly

output (foo.s), or you can use "gcc -O3 -c foo.c" to generate

an object file (foo.o), and then you can use "objdump -d foo.o"

to see its disassembly.



In the event that your compiler does not create either a jump

table or divide-and-conquer search, you can implement your own

with nested if's. You can even write a program to generate code

that looks something like:



int func (int n) {

if (n < 1 || n > 5000) return 0;

if (n < 2500) {

if (n < 1250) {

if (n < 625) {

if (n < 312) {

if (n < 156) {

if (n < 78) {

if (n < 39) {

if (n < 19) {

if (n < 9) {

if (n < 4) {

if (n < 2) {

goto L1;

} else {

...

______________________________________________________________________
 
K

karthikbg

Pls find an another good approach below that i got via mail :
=====================================================

1. First, you're still going to have to define all
of the functions. There's no getting around that.

But once you have all of the functions defined such that
they share a common prototype, you'll be able to do
the following:


/* replace with common prototype; I assume some might in fact
* call the same function internally, so it might makes sense
* to allow calling that function directly */
typedef int (*func_t) (int);

int func0 (int x) { assert (0); /* invalid */ }
static func_t functab[5001] = {
func0, func1, func2, func3,
/* ... all of them ... */ func5000 };

Then define:
int call_func (int n)
{
if (n < 1 || n > sizeof(functab) / sizeof(functab[0]))
return crash_and_burn (n); /* report index out of order */
else
return (*functab[n])(n); /* newer compilers functab[n](); */
}

A slightly less error-prone method is to do the following:

struct funcrecord
{
int idx;
func_t func;
};
/* note: this trick removes the usefulness of passing index
* as a parameter unless you manually code a few of them */
#define F(n) { n, func##n }
struct funcrecord functab[5001] = {
F(0), F(1), F(2),
{ 3, func3to5 }, { 4, func3to5 }, { 5, func3to5 },
/* ... all of them ... */ F(5000) };

Then you can easily do:

int call_func (int n)
{
if (n < 1 || n > sizeof(functab) / sizeof(functab[0]) ||
functab[n].idx != n)
return crash_and_burn (n);
else
return (*functab[n].func)(n);
}

______________________________________________________________________
 
K

karthikbg

Hi,
Looks good.

The switch-case statement is implemented using several different
methods. Among them are:

1) Test and branch: when only a few case arguments are used
2) Jump tables: when the case arguments are in order
3) Library Call: when the case arguments are unstructured

The method used is determined by the efficiency of the code that is
generated. The internal specification that is used to implement
switch-case is very complex and relies on numerous conditions. If you
restructure your switch statements to use sequential numbers, the
compiler will generate a jump table. For example, the following
program:

void func (unsigned int aaa)
{
volatile unsigned char car;

switch (aaa)
{
case 0: car = 'A'; break;
case 1: car = 'S'; break;
case 2: car = 'D'; break;
case 3: car = 'F'; break;
case 4: car = 'G'; break;
case 5: car = 'H'; break;
case 6: car = 'J'; break;
}
}

when compiled with C51 Version 7 generates the following code:

0000 EE MOV A,R6
0001 703D JNZ ?C0009
0003 EF MOV A,R7
0004 B40700 CJNE A,#07H,?C0010
0007 ?C0010:
0007 5037 JNC ?C0009
0009 900000 R MOV DPTR,#?C0011
000C F8 MOV R0,A
000D 28 ADD A,R0
000E 28 ADD A,R0
000F 73 JMP @A+DPTR
0010 ?C0011:
0010 020000 R LJMP ?C0002
0013 020000 R LJMP ?C0003
0016 020000 R LJMP ?C0004
0019 020000 R LJMP ?C0005
001C 020000 R LJMP ?C0006
001F 020000 R LJMP ?C0007
0022 020000 R LJMP ?C0008
; SOURCE LINE # 6
; SOURCE LINE # 7
0025 ?C0002:
0025 750041 R MOV car,#041H
0028 22 RET
; SOURCE LINE # 8
0029 ?C0003:
0029 750053 R MOV car,#053H
002C 22 RET
; SOURCE LINE # 9
002D ?C0004:
002D 750044 R MOV car,#044H
0030 22 RET
; SOURCE LINE # 10
0031 ?C0005:
0031 750046 R MOV car,#046H
0034 22 RET
; SOURCE LINE # 11
0035 ?C0006:
0035 750047 R MOV car,#047H
0038 22 RET
; SOURCE LINE # 12
0039 ?C0007:
0039 750048 R MOV car,#048H
003C 22 RET
; SOURCE LINE # 13
003D ?C0008:
003D 75004A R MOV car,#04AH
; SOURCE LINE # 14
; SOURCE LINE # 15
0040 ?C0009:
0040 22 RET

The jump table is created and used here with no timing impact due to
the position of case items within the switch.

Notes The following Compiler directives influence the generation of
switch/case:

The Code ROM size directive (selected under µVision under Options for
Target - Target) changes the optimization possibilities of the C51
Compiler. When you have selected ROM (COMPACT) AJMP instructions are
used and it is more likely that a jump table is generated.
The OPTIMIZE (SIZE) directive (selected under µVision under Options -
C51 - Code Optimization - Emphasis) disables the generation of jump
tables when at the same time the default ROM (LARGE) is selected.

http://www.keil.com/support/docs/1316.htm - got the above from this
link.

Regards,
Karthik Balaguru
 
C

Chris Dollin

karthikbg said:
Hi,
Looks good.

The switch-case statement is implemented using several different
methods. Among them are:

1) Test and branch: when only a few case arguments are used
2) Jump tables: when the case arguments are in order
3) Library Call: when the case arguments are unstructured

The method used is determined by the efficiency of the code that is
generated. The internal specification that is used to implement
switch-case is very complex and relies on numerous conditions. If you
restructure your switch statements to use sequential numbers, the
compiler will generate a jump table.

s/will/very likely/.
 
R

Random832

2006-12-04 said:
Hi,
Looks good.

The switch-case statement is implemented using several different
methods. Among them are:

1) Test and branch: when only a few case arguments are used
2) Jump tables: when the case arguments are in order

It is unclear to me why the use of a jump table depends on the case
arguments being in order. The more likely problem is the _range_ of the
case arguments, that is, the difference between the greatest and the
least of them.
 
D

dcorbit

If you know the exact range of possible inputs and they are dense, then
use an array of function pointers.

If you do not know them, or they are sparse, then use a hash table to
locate your function pointer.

A switch statement is likely to be good enough anyway. I suggest that
you benchmark the possible solutions and choose the simplest one that
meets the project's performance goals.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,009
Latest member
GidgetGamb

Latest Threads

Top