is there a better way to optimise this code

S

Sid

Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback
 
C

CBFalconer

Sid said:
I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

This statement gets executed for all the pixels in a page, so
if I can find a better way to do this, I could potentially
save a lot of cpu cycles.

Most of the time only the first portion of that test will be
executed, so the efficiency is better than it appears. The order
of the tests might make a difference.
 
K

Keith Thompson

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

It's possible that something like

if ((RGB[0]<<16 | RGB[1]<<8 | RGB[2]) == 0xFFFFFF)

might be faster, since it avoids the conditional branches that result
from a straightforward compilation of the "&&" operator. As always,
the only way to be sure is to measure the actual performance.
Examining the generated code might also be instructive.

Some judicious casting might be called for; if unsigned int is less
than 24 bits wide, it could overflow.
 
J

Jean-Michel Bechet

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

Hi,

maybe

if(RGB[0] & RGB[1] & RGB[2] )

You should compile different solutions and compare assembly generated ...


JMB.
 
K

Keith Thompson

CBFalconer said:
Sid said:
I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

This statement gets executed for all the pixels in a page, so
if I can find a better way to do this, I could potentially
save a lot of cpu cycles.

Most of the time only the first portion of that test will be
executed, so the efficiency is better than it appears. The order
of the tests might make a difference.

That might or might not speed things up. Conditional branches can
sometimes cause problems with instruction pipelines (though if that's
the case, the compiler should be smart enough to evaluate the
subexpressions unconditionally).

Try several methods, measure the results, and be prepared for the
tradeoffs to change with the next version of the hardware and/or
compiler.
 
F

Fredrik Tolf

Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback

Depending on your desire for portability, you may want to look into
optimizing it with platform-specific SIMD code, such as MMX/SSE
instructions on Intel platforms.

Fredrik Tolf
 
P

Paul

Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback

would this help any way?

typedef union
{
unsigned a;
unsigned char val[3];
}RGB;

int main(void)
{
RGB rgb;
rgb.a = 0;

rgb.val[0] = 255;
rgb.val[1] = 255;
rgb.val[2] = 255;

if(0x00ffffff == rgb.a){
}

return 0;
}

-Paul
 
D

Dan Pop

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

maybe

if(RGB[0] & RGB[1] & RGB[2] )

It's actually

if ((RGB[0] & RGB[1] & RGB[2]) == 255) ...
You should compile different solutions and compare assembly generated ...

Which is going to tell you exactly zilch about which version is likely to
be faster, because the actual data has a strong influence on the original
code behaviour: if RGB[0] != 255 for most pixels, the original version
is likely to execute in less CPU cycles, because it completely ignores
RGB[1] and RGB[2].

The right thing is to ignore the assembly generated by the compiler and
to benchmark the two versions on typical data sets.

Dan
 
R

Richard Tobin

Paul said:
typedef union
{
unsigned a;
unsigned char val[3];
}RGB;

int main(void)
{
RGB rgb;
rgb.a = 0;

rgb.val[0] = 255;
rgb.val[1] = 255;
rgb.val[2] = 255;

if(0x00ffffff == rgb.a){

This relies on the representation of integers. On a typical
little-endian machine with integers of 32 bits it will work.
On a big-endian machine, or one with 16-bit integers, or one
with some strange implementation quirk, it won't.

Unless your code is already conditionalised for such things, forget
it.

It may well be worthwhile packing the values into a suitably-sized
integer in a more portable way (eg shifting and adding), especially
if you pass them around as a group in other contexts.

-- Richard
 
T

Thomas Matthews

Keith said:
I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.


It's possible that something like

if ((RGB[0]<<16 | RGB[1]<<8 | RGB[2]) == 0xFFFFFF)

might be faster, since it avoids the conditional branches that result
from a straightforward compilation of the "&&" operator. As always,
the only way to be sure is to measure the actual performance.
Examining the generated code might also be instructive.

Some judicious casting might be called for; if unsigned int is less
than 24 bits wide, it could overflow.

Sometimes, the overhead in converting multiple bytes into a word
is more than using the multiple comparisons. The truth be told
in the assembly listing.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
 
T

Thomas Matthews

Sid said:
Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback

Here are some suggestions:

1. If you are testing many pixels against a know value, you
may want to put the value into a variable, then suggest
to the compiler to use a register for that variable.

unsigned char compare_value = 255;
if ( RGB[0] == compare_value
&& RGB[1] == compare_value
&& RGB[2] == compare_value)

Although many compilers may already do that.

2. Change the order of the expressions. The one that is
most likely to be false should be first.

3. Invert (or reverse) the color scheme so that
white has a value of zero. Many processors have
special instructions for "jumping on zero".

4. Convert the function to assembly language to take
advantage of special processor instructions (or
rewrite the code to make the compiler use those
instructions).

Example: The ARM processor can conditionally
execute instructions (in 32-bit mode). This fragment
may be more efficient because it reduces the possibility
of branching.
register unsigned char result;
register unsigned char compare_value = 255;
register unsigned char rgb0, rgb1, rgb2;

rgb0 = RGB[0]; /* This pattern reflects the ARM */
rgb1 = RGB[1]; /* instruction for loading multiple */
rgb2 = RGB[2]; /* registers from memory. */

result = rgb0 == compare_value;
result = result && rgb1 == compare_value;
result = result && rgb2 == compare_value;
if (result)
{
/* ... */
}

5. Simplify your function by using Boolean Algebra.
Convert the "if" statements into the algebra.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
 
C

CBFalconer

Keith said:
I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

This statement gets executed for all the pixels in a page, so
if I can find a better way to do this, I could potentially save
a lot of cpu cycles.

It's possible that something like

if ((RGB[0]<<16 | RGB[1]<<8 | RGB[2]) == 0xFFFFFF)

might be faster, since it avoids the conditional branches that
result from a straightforward compilation of the "&&" operator.
As always, the only way to be sure is to measure the actual
performance. Examining the generated code might also be
instructive.

Some judicious casting might be called for; if unsigned int is
less than 24 bits wide, it could overflow.

Another candidate for testing might be:

if (0xff == (RGB[0] & RGB[1] & RGB[2]))
 
K

Ken Clement

Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback

You could try typecasting the pointer to an integer type big enough to
see all values. i.e., something like --

if ((*(long *)RGB & 0x00ffffff) == 0x00ffffff) {...}

for a little endian machine, or

if ((*(long *)RGB & 0xffffff00) == 0xffffff00 {...}

for a big endian machine. This assumes there is no problem
referencing one byte off of the end of the table for the last RGB
entry, that alignment issues work out (they may or may not) both in
terms of validity and performance, and that there are no other hidden
gotchas. This approach is not particularly portable -- in fact it is
not portable at all, but that does not sound like your concern.

I would strongly recommend in any case encapsulating this logic in a
macro, e.g. -

#define IsWhite(RGB) ((*(long *)(RGB) & 0x00ffffff == 0x00ffffff)

This should aid in experimentation and make it easier to maintain over
the long haul.

The only way to optimize code in a specialized situation like this is
to try anything that has a reasonable chance of working and then test
it for correctness and benchmark it to see what works best.
 
B

bogonic

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I'm going to be flamed to death for posting non-portable code to comp.lang.c,
but here's something you could try. You may need to adjust it slightly for
better performance if your target architecture doesn't use 4-byte words.

Have fun.


#include <stdint.h> /* any C99-compliant compiler should provide this */

/* define BYTE_ORDER_VAX for little endian, BYTE_ORDER_NETWORK for big endian */
#define BYTE_ORDER_VAX 1

/*
* find_white --
*
* returns the index of the first white pixel in 'data', an array of
* packed 'npixels' pixels, or -1 if we couldn't find one.
*
* a 'pixel' is a group of three bytes; it's white when all three bytes are
* set to 0xff.
*/
int
find_white(const uint8_t *data, int npixels)
{
const uint32_t *p = (uint32_t *)data;
const uint8_t *q;
int n = (3*npixels) / 12; /* # of groups of 4 pixels (3 words) */
int r = npixels - n*4; /* # of remaining pixels */
int i;

for (i = 0; i < n; i++) {
#if BYTE_ORDER_NETWORK /* UNTESTED */
if ((p[0] & 0xffffff00) == 0xffffff00)
return i*4;

if (((p[0] & 0x000000ff) == 0x000000ff) &&
(p[1] & 0xffff0000) == 0xffff0000)
return i*4 + 1;

if (((p[1] & 0x0000ffff) == 0x0000ffff) &&
(p[2] & 0xff000000) == 0xff000000)
return i*4 + 2;

if ((p[3] & 0x0000ffff) == 0x0000ffff)
return i*4 + 3;
#elif BYTE_ORDER_VAX
if ((p[0] & 0x00ffffff) == 0x00ffffff)
return i*4;

if (((p[0] & 0xff000000) == 0xff000000) &&
(p[1] & 0x0000ffff) == 0x0000ffff)
return i*4 + 1;

if (((p[1] & 0xffff0000) == 0xffff0000) &&
(p[2] & 0x000000ff) == 0x000000ff)
return i*4 + 2;

if ((p[3] & 0xffff0000) == 0xffff0000)
return i*4 + 3;
#else
#error "define either BYTE_ORDER_NETWORK or BYTE_ORDER_VAX"
#endif
p += 3;
}

q = (uint8_t *)p;

for (i = 0; i < r; i++) {
if (q[0] == 0xff && q[1] == 0xff && q[2] == 0xff)
return n*4 + i;

q += 3;
}

return -1;
}
 
E

Eric Sosman

Sid said:
Hi,

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

I would greatly appreciate any feedback

One possibility I haven't seen mentioned yet is to
use memchr() to search the RGB array for a 255 byte. If
you find one, you then need to check whether it's in an
R position as opposed to a G or B, which you can do by
computing the distance from &RGB[0] and %'ing by three.
Then you can check the G and B values in the obvious way.
This approach requires a relatively large amount of fooling
around after each successful memchr(), so it's likely to be
advantageous only if 255's are fairly rare and if memchr()
is significantly faster than a simple loop.

As with the other suggested methods, the only way to
tell whether this will be faster or slower than what you've
already got is to measure.
 
A

Arthur J. O'Dwyer

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

I am absolutely amazed that no one has mentioned the obvious. You
just want to compare one block of unsigned chars to another block;
so write what you mean!

#include <string.h>

unsigned char to_find[3] = {255, 255, 255};

if (memcmp(RGB, to_find, 3) == 0) { ... }

This is the sort of code that gcc loves to optimize, and I would
be surprised if it compared unfavorably to /any/ of the other proposed
solutions, given a modern compiler.

-Arthur
 
C

CBFalconer

Arthur J. O'Dwyer said:
I am writing an application where I look for a white pixel by
testing if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming
RGB is a pointer to unsigned char)

I am absolutely amazed that no one has mentioned the obvious.
You just want to compare one block of unsigned chars to another
block; so write what you mean!

#include <string.h>

unsigned char to_find[3] = {255, 255, 255};

if (memcmp(RGB, to_find, 3) == 0) { ... }

This is the sort of code that gcc loves to optimize, and I
would be surprised if it compared unfavorably to /any/ of the
other proposed solutions, given a modern compiler.

It is not quite that easy. That will find:

{{0, 0, 255} {255, 255, 0}}

and other similar things. You have to impose the modulo 3 on the
results (which may well not be 3, depending on how things have
been declared).
 
O

Old Wolf

I am writing an application where I look for a white pixel by testing
if all the R,G,B values are 255 i.e. I use

if(RGB[0] == 255 && RGB[1] == 255 && RGB[2] == 255) (assuming RGB is a
pointer to unsigned char)

This statement gets executed for all the pixels in a page, so if I can
find a better way to do this, I could potentially save a lot of cpu
cycles.

If RGB is an array of unsigned char, then you could try:

if (memcmp(RGB, "\xFF\xFF\xFF", 3))

because many compilers have special optimisations for "memcmp"
and "memset" and "memcpy". You may also find a speed increase
if you bump it up to an int size, eg. make RGB[3] always be 0xFF
and compare RGB to "\xFF\xFF\xFF\xFF".

Of course what you should do is benchmark a few options and see
what's best for your system.

NB. strictly speaking you can't compare unsigned char to
signed char this way, but in practice it always works, and
is certainly better than comparing a long.
 
P

Paul

Paul said:
typedef union
{
unsigned a;
unsigned char val[3];
}RGB;

int main(void)
{
RGB rgb;
rgb.a = 0;

rgb.val[0] = 255;
rgb.val[1] = 255;
rgb.val[2] = 255;

if(0x00ffffff == rgb.a){

This relies on the representation of integers. On a typical
little-endian machine with integers of 32 bits it will work.
On a big-endian machine, or one with 16-bit integers, or one
with some strange implementation quirk, it won't.

Unless your code is already conditionalised for such things, forget
it.

It may well be worthwhile packing the values into a suitably-sized
integer in a more portable way (eg shifting and adding), especially
if you pass them around as a group in other contexts.

-- Richard

Ofcourse this is platform dependent, I never said otherwise though,
this is to just to give him/her idea, upto him/her to use it or trash
it, based on the platform.

-Paul.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,042
Latest member
icassiem

Latest Threads

Top