datatype for bit flags

Discussion in 'C++' started by Gr, Sep 15, 2011.

1. GrGuest

I am not an expert C or C++ programmer but know enough to get by.

Currently studying some graphics algorithms as a part of CAD.

In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
for denoting where an endpoint of the line lies as compared to the
clipping rectangle.

This code consists of 4 flags (1 each for left, right, below & above).

type code = 0;

if(point lies to the left of clipping window)
code = code & 1;
if(point lies to right)
code = code & 2;

if(point lies below)
code = code & 4;
if(point lies above)
code = code & 8;

Now in some books the type of code is declared
as
unsigned char code;
In others, it's
int code;

which is better?

unsigned char code saves space
(is the unsigned really necessary anyway?)

but I think int is probably faster.

Can someone explain what's a better way to do this if you were
writing a graphics library?

Gr, Sep 15, 2011

2. jacob naviaGuest

Le 15/09/11 09:07, Gr a écrit :
> I am not an expert C or C++ programmer but know enough to get by.
>
> Currently studying some graphics algorithms as a part of CAD.
>
> In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
> for denoting where an endpoint of the line lies as compared to the
> clipping rectangle.
>
> This code consists of 4 flags (1 each for left, right, below & above).
>
> type code = 0;
>
> if(point lies to the left of clipping window)
> code = code & 1;
> if(point lies to right)
> code = code & 2;
>
> if(point lies below)
> code = code & 4;
> if(point lies above)
> code = code & 8;
>
>
> Now in some books the type of code is declared
> as
> unsigned char code;
> In others, it's
> int code;
>
> which is better?
>

unsigned code;

would be better since you are doing bit operations with
the data.

> unsigned char code saves space

It depends of the alignment requirements of the rest of the
data. If you want to make an array of 20 000 of those flags
then unsigned char would be better. For just a single flag
int would be better since the machine doesn't have to
clear a register and then load the character into it when
operating on it.

> (is the unsigned really necessary anyway?)
>

It depends on the operations you are doing with the data.
By the way are you sure you want
code = code & 1
This sets to zero all bits but the last... If the last was zero
it remains zero.

Or maybe you intended
code = code | 1;

That *sets* the last bit to 1.

> but I think int is probably faster.
>

No

> Can someone explain what's a better way to do this if you were
> writing a graphics library?

You would have to explain more what you want to do. Now, you are doing
nothing, i.e. if you have an int with all zeros, after your operations
the data remains the same: all zeroes.

jacob navia, Sep 15, 2011

3. GrGuest

On 15-09-2011 12:56 PM, jacob navia wrote:
> Le 15/09/11 09:07, Gr a écrit :
>> I am not an expert C or C++ programmer but know enough to get by.
>>
>> Currently studying some graphics algorithms as a part of CAD.
>>
>> In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
>> for denoting where an endpoint of the line lies as compared to the
>> clipping rectangle.
>>
>> This code consists of 4 flags (1 each for left, right, below & above).
>>
>> type code = 0;
>>
>> if(point lies to the left of clipping window)
>> code = code & 1;
>> if(point lies to right)
>> code = code & 2;
>>
>> if(point lies below)
>> code = code & 4;
>> if(point lies above)
>> code = code & 8;
>>
>>
>> Now in some books the type of code is declared
>> as
>> unsigned char code;
>> In others, it's
>> int code;
>>
>> which is better?
>>

>
> unsigned code;
>
> would be better since you are doing bit operations with
> the data.

Are you saying that whenever I am doing bit operations, it's better to
use unsigned data types instead of signed ones.

I guess if you are left shifting or right shifting, then using unsigned
is better. How if you are just & (to check if a bit is set) & | (to set
a bit), does it make a diff if you use a signed datatype or an unsigned
one? I guess signed type would you give you one less bit to work with as
compared to an unsigned one, but other than that would it make a difference?

>
>> unsigned char code saves space

>
> It depends of the alignment requirements of the rest of the
> data. If you want to make an array of 20 000 of those flags
> then unsigned char would be better. For just a single flag
> int would be better since the machine doesn't have to
> clear a register and then load the character into it when
> operating on it.
>
>
>> (is the unsigned really necessary anyway?)
>>

>
> It depends on the operations you are doing with the data.

Here is a link to what I am doing
http://en.wikipedia.org/wiki/Cohen–Sutherland_algorithm#Example_C.2FC.2B.2B_implementation

> By the way are you sure you want
> code = code & 1
> This sets to zero all bits but the last... If the last was zero
> it remains zero.
>
> Or maybe you intended
> code = code | 1;
>
> That *sets* the last bit to 1.

Yes, true.

All the &'s in my psuedocode should have been |'s.

>
>> but I think int is probably faster.
>>

> No
>
>> Can someone explain what's a better way to do this if you were
>> writing a graphics library?

>
> You would have to explain more what you want to do. Now, you are doing
> nothing, i.e. if you have an int with all zeros, after your operations
> the data remains the same: all zeroes.
>

Gr, Sep 15, 2011
4. Malcolm McLeanGuest

On Sep 15, 10:07 am, Gr <> wrote:
>
> unsigned char code;
> In others, it's
> int code;
>
> which is better?
>
> unsigned char code saves space
> (is the unsigned really necessary anyway?)
>
> but I think int is probably faster.
>
> Can someone explain what's a better way to do this if you were
> writing a graphics library?
>

It really doesn't matter, unless you are storing millions of these
codes, or the operation is right within the inner loop.

If storing millions, unsigned char will save space. If the operation
is in the inner loop, int might just save enough time to be
measurable, though it's doubtful.

Malcolm McLean, Sep 15, 2011
5. jacob naviaGuest

Le 15/09/11 10:03, Gr a écrit :
>
> Are you saying that whenever I am doing bit operations, it's better to
> use unsigned data types instead of signed ones.
>

Yes, since the data is not an integer, just a collection of
flags. That way I am sure I can use CHAR_BIT*8 flags with
no problems if I do shifts, etc.

Obviously for ANDing the last 4 bits either declaration is OK.

> I guess if you are left shifting or right shifting, then using unsigned
> is better. How if you are just & (to check if a bit is set) & | (to set
> a bit), does it make a diff if you use a signed datatype or an unsigned
> one? I guess signed type would you give you one less bit to work with as
> compared to an unsigned one, but other than that would it make a
> difference?
>

Not at all.

[snip]

>
>> By the way are you sure you want
>> code = code & 1
>> This sets to zero all bits but the last... If the last was zero
>> it remains zero.
>>
>> Or maybe you intended
>> code = code | 1;
>>
>> That *sets* the last bit to 1.

>
> Yes, true.
>
> All the &'s in my psuedocode should have been |'s.
>

OK, now it is clearer what you want to do.

jacob navia, Sep 15, 2011
6. BartCGuest

"Gr" <> wrote in message news:j4s877\$2a0\$...

> In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
> for denoting where an endpoint of the line lies as compared to the
> clipping rectangle.
>
> This code consists of 4 flags (1 each for left, right, below & above).
>
> type code = 0;
>
> if(point lies to the left of clipping window)
> code = code & 1;
> if(point lies to right)
> code = code & 2;
>
> if(point lies below)
> code = code & 4;
> if(point lies above)
> code = code & 8;
>

I've just looked at my code for this, and I'm using an 'or' operation ('|'
in C), to build the 'outcode'. In fact looking at your code, your 'code'
variable is only ever going to be zero.

As for what type to use for 'code', it probably doesn't really matter. Get
the whole project working first, then worry about whether that tiny aspect
is a bottleneck or not.

--
bartc

BartC, Sep 15, 2011
7. GrGuest

On 15-09-2011 03:26 PM, BartC wrote:
> "Gr" <> wrote in message news:j4s877\$2a0\$...
>
>> In the Cohen Sutherland Algorithm of line clipping, a code/flag is
>> using for denoting where an endpoint of the line lies as compared to
>> the clipping rectangle.
>>
>> This code consists of 4 flags (1 each for left, right, below & above).
>>
>> type code = 0;
>>
>> if(point lies to the left of clipping window)
>> code = code & 1;
>> if(point lies to right)
>> code = code & 2;
>>
>> if(point lies below)
>> code = code & 4;
>> if(point lies above)
>> code = code & 8;
>>

>
> I've just looked at my code for this, and I'm using an 'or' operation
> ('|' in C), to build the 'outcode'. In fact looking at your code, your
> 'code' variable is only ever going to be zero.

Yes - I typed that out by mistake - I am using |, not &.
I indicated that in a followup to jacob navia.

>
> As for what type to use for 'code', it probably doesn't really matter.
> Get the whole project working first, then worry about whether that tiny
> aspect is a bottleneck or not.

It's not a project. I am just a looking at the implementation in 2
different text books - one using an unsigned char & the other using an
int & hence wondered about the choices & if one is a better choice than
the other.

Gr, Sep 15, 2011
8. Phil CarmodyGuest

Gr <> writes:
> I am not an expert C or C++ programmer but know enough to get by.
>
> Currently studying some graphics algorithms as a part of CAD.
>
> In the Cohen Sutherland Algorithm of line clipping, a code/flag is
> using for denoting where an endpoint of the line lies as compared to
> the clipping rectangle.
>
> This code consists of 4 flags (1 each for left, right, below & above).
>
> type code = 0;
>
> if(point lies to the left of clipping window)
> code = code & 1;
> if(point lies to right)
> code = code & 2;
>
> if(point lies below)
> code = code & 4;
> if(point lies above)
> code = code & 8;
>
>
> Now in some books the type of code is declared
> as
> unsigned char code;
> In others, it's
> int code;
>
> which is better?

It depends.

> unsigned char code saves space
> (is the unsigned really necessary anyway?)
>
> but I think int is probably faster.

Unthink that thought.

There is no "think code is faster", there is only
"fact - code is not sufficiently fast" and
"measure which alternative code is faster".

> Can someone explain what's a better way to do this if you were
> writing a graphics library?

If the code-size is similar and the speed is similar,
why are you fussing about such trivialities? I'm sure
there are actual important architectural issues that
need to be adressed with much higher priority. You're
wasting too much time deciding what colour to paint
the bike shed.

Phil
--
"Religion is what keeps the poor from murdering the rich."
-- Napoleon

Phil Carmody, Sep 15, 2011
9. Malcolm McLeanGuest

On Sep 16, 1:16 am, Phil Carmody <>
wrote:
> Gr <> writes:
>
> > but I think int is probably faster.

>
> Unthink that thought.
>
> There is no "think code is faster", there is only
> "fact - code is not sufficiently fast" and
> "measure which alternative code is faster".
>

int should be "the natural integer type" for the platform. Which means
that usually it will be the fastest type. You should use int for an
integer unless there is a clear reason for using another type. Though
each instance will only save a few cycles, if you systematically use
int the speed increment may be worth having.

--
Fuzzy Logic Trees - now in Apple ebooks

Malcolm McLean, Sep 16, 2011
10. Jorgen GrahnGuest

On Fri, 2011-09-16, Malcolm McLean wrote:
> On Sep 16, 1:16 am, Phil Carmody <>
> wrote:
>> Gr <> writes:
>>
>> > but I think int is probably faster.

>>
>> Unthink that thought.
>>
>> There is no "think code is faster", there is only
>> "fact - code is not sufficiently fast" and
>> "measure which alternative code is faster".

I partly disagree. You also need some vague idea about what's
efficient and what's not, so you don't spend 20% more here and 18%
more there. Distributed fat is hard to profile.

> int should be "the natural integer type" for the platform. Which means
> that usually it will be the fastest type.

If you write a compiler today for an architecture where 16-bit ints
would be a bit faster than 32-bit, you'd still use 32-bit because
otherwise too much code would break.

And even on 32-bit architectures that generalization is too ...
generalized. What if you need an array of a million numbers between 0
and 7 -- will int be as fast as unsigned char? Most likely not.

> You should use int for an
> integer unless there is a clear reason for using another type. Though
> each instance will only save a few cycles, if you systematically use
> int the speed increment may be worth having.

In this case the problem didn't call for signedness, so I'd use
'unsigned'.

If the compiler had enough information to do further optimizations,
I'd stop worrying after that and leave it to the compiler.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Jorgen Grahn, Sep 16, 2011
11. TonyGuest

Malcolm McLean wrote:

> You should use int for an
> integer unless there is a clear reason for using another type. Though
> each instance will only save a few cycles, if you systematically use
> int the speed increment may be worth having.

Did you mean that just in the context of the OP/this thread, or in
general?

Tony, Sep 16, 2011
12. Keith ThompsonGuest

pete <> writes:
> Tony wrote:
>>
>> Malcolm McLean wrote:
>>
>> > You should use int for an
>> > integer unless there is a
>> > clear reason for using another type. Though
>> > each instance will only save a few cycles, if you systematically use
>> > int the speed increment may be worth having.

>>
>> Did you mean that just in the context of the OP/this thread, or in
>> general?

>
> I only use data types which are smaller than int,
> when there is a special reason.
>
> It's possible that small types may be aligned on int boundaries,
> meaning that there may be unusable bytes in between them.

No, it's not. In an array of char, or an array of short, the elements
must be contiguous; unlike structs arrays have no padding. This is necessary
to make array indexing and pointer arithmetic work.

> It's also possible that there may be masking operations
> invloved with simple arithmetic operations on small types.
> Small types tend to get converted to type int
> in various operations.

That's true.

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Sep 17, 2011
13. TonyGuest

"pete" <> wrote in message
news:...
> Tony wrote:
>>
>> Malcolm McLean wrote:
>>
>> > You should use int for an
>> > integer unless there is a
>> > clear reason for using another type. Though
>> > each instance will only save a few cycles, if you systematically use
>> > int the speed increment may be worth having.

>>
>> Did you mean that just in the context of the OP/this thread, or in
>> general?

>
> I only use data types which are smaller than int,
> when there is a special reason.
>
> It's possible that small types may be aligned on int boundaries,
> meaning that there may be unusable bytes in between them.
>
> It's also possible that there may be masking operations
> invloved with simple arithmetic operations on small types.
> Small types tend to get converted to type int
> in various operations.
>

I didn't cut out enough of your post. Here's what I wanted to know:

>> Malcolm McLean wrote:
>>
>> > You should use int for an
>> > integer unless there is a
>> > clear reason for using another type.

>> Did you mean that just in the context of the OP/this thread, or in
>> general?

Tony, Sep 17, 2011
14. Keith ThompsonGuest

pete <> writes:
> Keith Thompson wrote:
>> pete <> writes:
>> > Tony wrote:
>> >>
>> >> Malcolm McLean wrote:
>> >>
>> >> > You should use int for an
>> >> > integer unless there is a
>> >> > clear reason for using another type. Though
>> >> > each instance will only save a few cycles,
>> >> > if you systematically use
>> >> > int the speed increment may be worth having.
>> >>
>> >> Did you mean that just in the context of the OP/this thread, or in
>> >> general?
>> >
>> > I only use data types which are smaller than int,
>> > when there is a special reason.
>> >
>> > It's possible that small types may be aligned on int boundaries,
>> > meaning that there may be unusable bytes in between them.

>>
>> No, it's not. In an array of char, or an array of short, the elements
>> must be contiguous; unlike structs arrays have no padding.
>> This is necessary
>> to make array indexing and pointer arithmetic work.

>
> It's possible that small types may exist outside of an array.

Of course.

In a struct, there can be arbitrary padding between any two members, or
after the last member. But in practice, I'd be very surprised to see
any padding between c0 and c1 in
struct {
char c0;
char c1;
};

As for individually declared objects, you're not likely to have
very many of them, so any padding between them should have only
minimal effect (and if it's there, it probably saves you more in code
space than it costs in data space).

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson, Sep 17, 2011
15. Malcolm McLeanGuest

On Sep 16, 11:46 pm, Jorgen Grahn <> wrote:
>
> And even on 32-bit architectures that generalization is too ...
> generalized. What if you need an array of a million numbers between 0
> and 7 -- will int be as fast as unsigned char? Most likely not.
>

int should read and write on a single "operation". Accessing the
bitshifting, thus several "operations".

However ints will exhaust the cache faster. It can become difficult to
get an intuitive feel for speed. For instance are we stepping through
the array in sequence in the inner loop, or accessing it randomly?

Malcolm McLean, Sep 18, 2011
16. Jorgen GrahnGuest

On Sun, 2011-09-18, Malcolm McLean wrote:
> On Sep 16, 11:46 pm, Jorgen Grahn <> wrote:
>>
>> And even on 32-bit architectures that generalization is too ...
>> generalized. What if you need an array of a million numbers between 0
>> and 7 -- will int be as fast as unsigned char? Most likely not.
>>

> int should read and write on a single "operation". Accessing the
> unsigned chars may create unaligned reads, or may require masking and
> bitshifting, thus several "operations".

In recent years, I have come to see a linear series of instructions as
cheap, compared to memory access. As a general rule. People sometimes
point out that precalculated tables aren't as useful today as they
were maybe 20 years ago -- it's often faster to recalculate than to hit
the table.

> However ints will exhaust the cache faster. It can become difficult to
> get an intuitive feel for speed. For instance are we stepping through
> the array in sequence in the inner loop, or accessing it randomly?

True. I take back my "most likely not" above; the problem as stated by
me still has too little information.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Jorgen Grahn, Sep 18, 2011
17. hanukasGuest

On Sep 15, 10:07 am, Gr <> wrote:
> I am not an expert C or C++ programmer but know enough to get by.
>
> Currently studying some graphics algorithms as a part of CAD.
>
> In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
> for denoting where an endpoint of the line lies as compared to the
> clipping rectangle.
>
> This code consists of 4 flags (1 each for left, right, below & above).
>
> type code = 0;
>
> if(point lies to the left of clipping window)
>     code = code & 1;
> if(point lies to right)
>     code = code & 2;
>
> if(point lies below)
>    code = code & 4;
> if(point lies above)
>    code = code & 8;
>
> Now in some books the type of code is declared
> as
> unsigned char code;
> In others, it's
> int code;
>
> which is better?
>
> unsigned char code saves space
> (is the unsigned really necessary anyway?)
>
> but I think int is probably faster.
>
> Can someone explain what's a better way to do this if you were
> writing a graphics library?

The outcodes are temporary variables, space savings are less critical
here than using natural register width; use unsigned int.

If your library does multiple passes over your data and "caches" the
outcodes, you should instead use design which encourages sharing of
endpoints; this allows efficient single pass and generally requires
less bandwidth.

It boils down to this..

if (codeA & codeB) -> trivial reject
else if (!(codeA | codeB)) -> trivial accept (= no clipping required)
else -> clip

hanukas, Sep 23, 2011