lookup tables

tmartsum

try

const char *table1 = "something";
const char table2[] = "something";

and then try
char * x = const_cast<char*>(table_x);
x[0] = 'y';

My guess is that it crashes on table1 but not on table2.

(It depends on the compiler, but the compiler is allowed to put the
first one into the code segment (even if not const), which means that it
is totally protected from being modified.)
 
tmartsum

I am not sure... but let me ask: are you sure a lookup table is faster
than just regular math?
(By the way, are 0.1, 0.2 and 0.7 constants? I.e. divide by 10, 5 and
0.7.)

I am not at all sure about the below, but my guess is that
the stack is fastest,
then the code segment,
then the heap.

However, if you want it on the stack it must be within the function...
but then you would have to copy all of the memory on every call, and
that would give terrible performance.

So you must do something like

char* ptr_globaltable = 0;

int main()
{
char global_table[255];
// fill up global table
ptr_globaltable = global_table;
}

(or always pass global_table to functions). By the way, an int table
would probably be better.

Now this might be faster or slower than before. I guess it will be
slower. We might just confuse the optimizer here.
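
A minimal sketch of the stack-table idea above, assuming a 256-entry int table. The names g_table, lookup and run_with_stack_table are made up for illustration, and the squares stored in the table are only placeholder contents. The pointer is valid only while the function that owns the array is still running, which is why it is reset before returning.

```cpp
#include <cassert>

// Hypothetical global pointer to a table that lives on some function's stack.
static const int* g_table = nullptr;

// Any function called while the table is alive can read through the pointer.
int lookup(int i) {
    assert(g_table != nullptr && i >= 0 && i < 256);
    return g_table[i];
}

// Stands in for main(): builds the table on the stack, publishes it, uses it.
int run_with_stack_table() {
    int table[256];
    for (int i = 0; i < 256; ++i) {
        table[i] = i * i;               // placeholder contents (assumption)
    }
    g_table = table;
    const int result = lookup(3) + lookup(4);
    g_table = nullptr;                  // the array dies when this function returns
    return result;
}
```

Whether this beats a plain global array is something only a measurement on the target machine can tell.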

I wrote before how it is "likely" to be put in the code segment.
(This can only be done with char arrays.)

This will also work even if you are within the scope of the function.

Now the last option is the heap ... if you just make it global outside.

But are you sure your "problem" is within this code? Have you
tried a profiler or timed it? (And are you trying to avoid too many
casts?)

Can't you post your code? Somebody might see a way of doing something
smarter.
 
Ben Pope

tmartsum said:
try

const char *table1 = "something";
const char table2[] = "something";

and then try
char * x = const_cast<char*>(table_x);
x[0] = 'y';

My guess is that it crashes on table1 but not on table2.

(It depends on the compiler, but the compiler is allowed to put the
first one into the code segment (even if not const), which means that it
is totally protected from being modified.)

I don't really care what happens when you invoke undefined behaviour, but it does serve to point out the difference between the two.

#include <iostream>
#include <string>

int main(int argc, char* argv[])
{
    const char *table1 = "something";
    const char table2[] = "something";

    char * x2 = const_cast<char*>(table2);
    std::cout << table2 << std::endl;
    x2[0] = 'y';
    std::cout << table2 << std::endl;

    char * x1 = const_cast<char*>(table1);
    std::cout << table1 << std::endl;
    x1[0] = 'y';
    std::cout << table1 << std::endl; // access violation here in VS71
}

Interesting.

Ben
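
If the goal is simply a table that can be changed at run time, no cast is needed: a non-const array initialised from the literal is its own writable copy. A minimal sketch (the function name modify_copy is made up for illustration):

```cpp
#include <cassert>
#include <string>

std::string modify_copy() {
    char table[] = "something";   // the array holds its own copy of the characters
    assert(table[0] == 's');      // sanity check before the write
    table[0] = 'y';               // well defined: the array is not const
    return std::string(table);
}
```

Unlike the const_cast versions above, this does not depend on where the compiler places the string literal.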
 
makc.the.great

tmartsum said:
are you sure a lookup table is faster than just regular math?

No.

tmartsum said:
So you must do something like
char* ptr_globaltable = 0;
int main()
{
char global_table[255];
// fill up global table
ptr_globaltable = global_table;

Interesting.

tmartsum said:
Can't you post your code? Somebody might see a way of doing something
smarter.

If you insist. Sorry if I couldn't be quite clear without doing so :(

---8<------
/* Y lookup tables
full-scale 0..255 Y is 0.299R + 0.587G + 0.114B
rY = Math.round((i*299)/1000)
bY = Math.round((i*114)/1000)
gY = i - rY - bY
*/

byte rY[256] = {
0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4,
4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8,
8, 9, 9, 9,
10, 10, 10, 10, 11, 11, 11, 12, 12, 12, 13, 13, 13, 13,
14, 14, 14, 15, 15, 15, 16, 16, 16, 16, 17, 17, 17, 18,
18, 18, 19, 19,
19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 22, 22, 23, 23,
23, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 27, 27, 27,
28, 28, 28, 28,
29, 29, 29, 30, 30, 30, 30, 31, 31, 31, 32, 32, 32, 33,
33, 33, 33, 34, 34, 34, 35, 35, 35, 36, 36, 36, 36, 37,
37, 37, 38, 38,
38, 39, 39, 39, 39, 40, 40, 40, 41, 41, 41, 42, 42, 42,
42, 43, 43, 43, 44, 44, 44, 45, 45, 45, 45, 46, 46, 46,
47, 47, 47, 48,
48, 48, 48, 49, 49, 49, 50, 50, 50, 51, 51, 51, 51, 52,
52, 52, 53, 53, 53, 54, 54, 54, 54, 55, 55, 55, 56, 56,
56, 57, 57, 57,
57, 58, 58, 58, 59, 59, 59, 60, 60, 60, 60, 61, 61, 61,
62, 62, 62, 62, 63, 63, 63, 64, 64, 64, 65, 65, 65, 65,
66, 66, 66, 67,
67, 67, 68, 68, 68, 68, 69, 69, 69, 70, 70, 70, 71, 71,
71, 71, 72, 72, 72, 73, 73, 73, 74, 74, 74, 74, 75, 75,
75, 76, 76, 76 };
byte gY[256] = {
0, 1, 1, 2, 3, 3, 3, 4, 5, 5, 6, 7, 7, 8,
8, 9, 9, 10, 11, 11, 12, 13, 12, 13, 14, 15, 15, 16,
17, 17, 18, 18,
18, 19, 20, 21, 21, 22, 23, 23, 23, 24, 24, 25, 26, 27,
27, 28, 29, 28, 29, 30, 30, 31, 32, 33, 33, 34, 34, 34,
35, 36, 36, 37,
38, 39, 38, 39, 40, 40, 41, 42, 42, 43, 44, 44, 44, 45,
46, 46, 47, 48, 48, 49, 49, 50, 50, 51, 52, 52, 53, 54,
54, 54, 55, 56,
56, 57, 58, 58, 59, 59, 60, 60, 61, 62, 62, 63, 64, 64,
64, 65, 66, 66, 67, 68, 68, 69, 70, 69, 70, 71, 72, 72,
73, 74, 74, 75,
75, 75, 76, 77, 78, 78, 79, 80, 79, 80, 81, 81, 82, 83,
84, 84, 85, 85, 85, 86, 87, 87, 88, 89, 90, 90, 90, 91,
91, 92, 93, 93,
94, 95, 96, 95, 96, 97, 97, 98, 99, 99, 100, 101, 101, 101,
102, 103, 103, 104, 105, 105, 105, 106, 107, 107, 108, 109, 109, 110,
111, 110, 111, 112,
113, 113, 114, 115, 115, 116, 116, 116, 117, 118, 119, 119, 120, 121,
121, 121, 122, 123, 123, 124, 125, 125, 126, 126, 126, 127, 128, 129,
129, 130, 131, 131,
131, 132, 132, 133, 134, 135, 135, 136, 137, 136, 137, 138, 138, 139,
140, 141, 141, 142, 142, 142, 143, 144, 144, 145, 146, 147, 146, 147,
148, 148, 149, 150 };
byte bY[256] = {
0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1,
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3,
3, 3, 3, 4,
4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5,
5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7,
7, 7, 7, 7,
7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9,
9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10,
10, 11, 11, 11,
11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12,
13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14,
14, 14, 14, 14,
15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16,
16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 17, 17, 18, 18,
18, 18, 18, 18,
18, 18, 18, 19, 19, 19, 19, 19, 19, 19, 19, 19, 20, 20,
20, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 21, 21, 21,
21, 22, 22, 22,
22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 23, 23,
23, 24, 24, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25,
25, 25, 25, 25,
26, 26, 26, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27,
27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28, 29, 29,
29, 29, 29, 29 };



/* methods implementation */

byte SPixel::y () { return rY[r] + gY[g] + bY[b]; }
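
The formulas in the comment at the top of the listing can also be used to build the three tables at start-up instead of typing them in. A minimal sketch, assuming byte is an 8-bit unsigned type (std::uint8_t here); LumaTables and make_luma_tables are hypothetical names, and the integer expression (i*299 + 500)/1000 stands in for Math.round((i*299)/1000):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical container for the three per-channel tables.
struct LumaTables {
    std::array<std::uint8_t, 256> rY, gY, bY;
};

LumaTables make_luma_tables() {
    LumaTables t{};
    for (int i = 0; i < 256; ++i) {
        const int r = (i * 299 + 500) / 1000;   // round(i * 0.299)
        const int b = (i * 114 + 500) / 1000;   // round(i * 0.114)
        const int g = i - r - b;                // remainder, so the three sum to i
        assert(g >= 0 && g <= 255);
        t.rY[i] = static_cast<std::uint8_t>(r);
        t.gY[i] = static_cast<std::uint8_t>(g);
        t.bY[i] = static_cast<std::uint8_t>(b);
    }
    return t;
}
```

Because gY is defined as the remainder, rY[i] + gY[i] + bY[i] equals i for every i, so a grey pixel keeps its value exactly.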
 
tmartsum

For starters
/* Y lookup tables
full-scale 0..255 Y is 0.299R + 0.587G + 0.114B
rY = Math.round((i*299)/1000)
bY = Math.round((i*114)/1000)
gY = i - rY - bY
*/
// If this is code, then 299.0 and 1000.0 are preferred, so that the division is not done in integers.

byte SPixel::y () { return rY[r] + gY[g] + bY[b]; }
Now the following are hints if you are doing normal math.

1) Always use int rather than short and char (unless you have good
reasons not to)
2) Bit shifts, + and - are fast
3) Multiplication is not that fast, but faster than division
4) Often a multiplication can be rewritten as several shifts and
additions that are faster.
Now if you did real math and you were allowed to cheat a bit you could
say
0.299 ~ 0.29883 = 153/512
0.587 ~ 0.58691 = 601/1024
0.114 ~ 0.11401 = 467/4096

(you can figure out better constants and then do the same for the
others)
return ((r*153)>>9) + ((g*601)>>10) + ((b*467)>>12);

Note that I have not tried to get rid of the multiplications (*); that takes some thinking.

I don't know if it is faster; you could give it a try.
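
A minimal sketch of the shift-based version, assuming the fractions 153/512, 601/1024 and 467/4096 (shifts of 9, 10 and 12). In C++ the + operator binds tighter than >>, so each shifted term needs its own parentheses. The function names are made up for illustration, and luma_reference is only there to measure how far the approximation drifts from the floating-point formula.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Shift-based approximation of 0.299R + 0.587G + 0.114B.
inline int luma_shift(int r, int g, int b) {
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    return ((r * 153) >> 9) + ((g * 601) >> 10) + ((b * 467) >> 12);
}

// Floating-point reference, rounded to the nearest integer.
inline int luma_reference(int r, int g, int b) {
    return static_cast<int>(std::lround(0.299 * r + 0.587 * g + 0.114 * b));
}

// Largest absolute difference over a coarse grid of colours (step 5).
int max_luma_error() {
    int worst = 0;
    for (int r = 0; r <= 255; r += 5) {
        for (int g = 0; g <= 255; g += 5) {
            for (int b = 0; b <= 255; b += 5) {
                const int err = std::abs(luma_shift(r, g, b) - luma_reference(r, g, b));
                if (err > worst) worst = err;
            }
        }
    }
    return worst;
}
```

Each shift truncates, so every term can lose almost one unit; white (255, 255, 255) comes out as 254 rather than 255. Whether that is acceptable depends on the application.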

I guess on processors without a cache the math must be faster; beyond
that I am unsure. Otherwise there are the three table options
(normal heap, code-segment tables or a stack table).

But you must time it or use a profiler. And remember that what is
fast(est) for one CPU might not be the best method for another...
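
A rough timing harness along these lines might look like the sketch below. The names time_ms and luma_math are hypothetical, the synthetic pixel derived from the loop counter is an assumption, and a real profiler will give a more trustworthy picture than any hand-rolled loop.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Calls f(0) .. f(repeats - 1) and returns the elapsed time in milliseconds.
// The results are added to `sink` so the calls are less likely to be optimized away.
template <typename F>
double time_ms(F f, int repeats, std::uint64_t& sink) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        sink += static_cast<std::uint64_t>(f(i));
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// One candidate: plain integer math on a synthetic pixel derived from i.
inline int luma_math(int i) {
    const int r = i & 255;
    const int g = (i >> 3) & 255;
    const int b = (i >> 5) & 255;
    return (r * 299 + g * 587 + b * 114) / 1000;
}
```

The table-based version can be timed the same way with an equal repeat count; build with optimizations on and run it on the CPU you actually care about.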

PS: is there any reason not to inline the function?
(It would (normally) give a small improvement for such a small function.)
 
