datatype for bit flags

Discussion in 'C++' started by Gr, Sep 15, 2011.

  1. Gr

    Gr Guest

    I am not an expert C or C++ programmer but know enough to get by.

    Currently studying some graphics algorithms as a part of CAD.

    In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
    for denoting where an endpoint of the line lies as compared to the
    clipping rectangle.

    This code consists of 4 flags (1 each for left, right, below & above).

    type code = 0;

    if(point lies to the left of clipping window)
    code = code & 1;
    if(point lies to right)
    code = code & 2;

    if(point lies below)
    code = code & 4;
    if(point lies above)
    code = code & 8;


    Now in some books the type of code is declared
    as
    unsigned char code;
    In others, it's
    int code;

    which is better?

    unsigned char code saves space
    (is the unsigned really necessary anyway?)

    but I think int is probably faster.

    Can someone explain what's a better way to do this if you were
    writing a graphics library?
     
    Gr, Sep 15, 2011
    #1
    1. Advertising

  2. Gr

    jacob navia Guest

    Le 15/09/11 09:07, Gr a écrit :
    > I am not an expert C or C++ programmer but know enough to get by.
    >
    > Currently studying some graphics algorithms as a part of CAD.
    >
    > In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
    > for denoting where an endpoint of the line lies as compared to the
    > clipping rectangle.
    >
    > This code consists of 4 flags (1 each for left, right, below & above).
    >
    > type code = 0;
    >
    > if(point lies to the left of clipping window)
    > code = code & 1;
    > if(point lies to right)
    > code = code & 2;
    >
    > if(point lies below)
    > code = code & 4;
    > if(point lies above)
    > code = code & 8;
    >
    >
    > Now in some books the type of code is declared
    > as
    > unsigned char code;
    > In others, it's
    > int code;
    >
    > which is better?
    >


    unsigned code;

    would be better since you are doing bit operations with
    the data.

    > unsigned char code saves space


    It depends of the alignment requirements of the rest of the
    data. If you want to make an array of 20 000 of those flags
    then unsigned char would be better. For just a single flag
    int would be better since the machine doesn't have to
    clear a register and then load the character into it when
    operating on it.


    > (is the unsigned really necessary anyway?)
    >


    It depends on the operations you are doing with the data.
    By the way are you sure you want
    code = code & 1
    This sets to zero all bits but the last... If the last was zero
    it remains zero.

    Or maybe you intended
    code = code | 1;

    That *sets* the last bit to 1.

    > but I think int is probably faster.
    >

    No

    > Can someone explain what's a better way to do this if you were
    > writing a graphics library?


    You would have to explain more what you want to do. Now, you are doing
    nothing, i.e. if you have an int with all zeros, after your operations
    the data remains the same: all zeroes.
     
    jacob navia, Sep 15, 2011
    #2
    1. Advertising

  3. Gr

    Gr Guest

    On 15-09-2011 12:56 PM, jacob navia wrote:
    > Le 15/09/11 09:07, Gr a écrit :
    >> I am not an expert C or C++ programmer but know enough to get by.
    >>
    >> Currently studying some graphics algorithms as a part of CAD.
    >>
    >> In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
    >> for denoting where an endpoint of the line lies as compared to the
    >> clipping rectangle.
    >>
    >> This code consists of 4 flags (1 each for left, right, below & above).
    >>
    >> type code = 0;
    >>
    >> if(point lies to the left of clipping window)
    >> code = code & 1;
    >> if(point lies to right)
    >> code = code & 2;
    >>
    >> if(point lies below)
    >> code = code & 4;
    >> if(point lies above)
    >> code = code & 8;
    >>
    >>
    >> Now in some books the type of code is declared
    >> as
    >> unsigned char code;
    >> In others, it's
    >> int code;
    >>
    >> which is better?
    >>

    >
    > unsigned code;
    >
    > would be better since you are doing bit operations with
    > the data.


    Are you saying that whenever I am doing bit operations, it's better to
    use unsigned data types instead of signed ones.

    I guess if you are left shifting or right shifting, then using unsigned
    is better. How if you are just & (to check if a bit is set) & | (to set
    a bit), does it make a diff if you use a signed datatype or an unsigned
    one? I guess signed type would you give you one less bit to work with as
    compared to an unsigned one, but other than that would it make a difference?

    >
    >> unsigned char code saves space

    >
    > It depends of the alignment requirements of the rest of the
    > data. If you want to make an array of 20 000 of those flags
    > then unsigned char would be better. For just a single flag
    > int would be better since the machine doesn't have to
    > clear a register and then load the character into it when
    > operating on it.
    >
    >
    >> (is the unsigned really necessary anyway?)
    >>

    >
    > It depends on the operations you are doing with the data.


    Here is a link to what I am doing
    http://en.wikipedia.org/wiki/Cohen–Sutherland_algorithm#Example_C.2FC.2B.2B_implementation


    > By the way are you sure you want
    > code = code & 1
    > This sets to zero all bits but the last... If the last was zero
    > it remains zero.
    >
    > Or maybe you intended
    > code = code | 1;
    >
    > That *sets* the last bit to 1.


    Yes, true.

    All the &'s in my psuedocode should have been |'s.

    >
    >> but I think int is probably faster.
    >>

    > No
    >
    >> Can someone explain what's a better way to do this if you were
    >> writing a graphics library?

    >
    > You would have to explain more what you want to do. Now, you are doing
    > nothing, i.e. if you have an int with all zeros, after your operations
    > the data remains the same: all zeroes.
    >
     
    Gr, Sep 15, 2011
    #3
  4. On Sep 15, 10:07 am, Gr <> wrote:
    >
    > unsigned char code;
    > In others, it's
    > int code;
    >
    > which is better?
    >
    > unsigned char code saves space
    > (is the unsigned really necessary anyway?)
    >
    > but I think int is probably faster.
    >
    > Can someone explain what's a better way to do this if you were
    > writing a graphics library?
    >

    It really doesn't matter, unless you are storing millions of these
    codes, or the operation is right within the inner loop.

    If storing millions, unsigned char will save space. If the operation
    is in the inner loop, int might just save enough time to be
    measurable, though it's doubtful.
     
    Malcolm McLean, Sep 15, 2011
    #4
  5. Gr

    jacob navia Guest

    Le 15/09/11 10:03, Gr a écrit :
    >
    > Are you saying that whenever I am doing bit operations, it's better to
    > use unsigned data types instead of signed ones.
    >


    Yes, since the data is not an integer, just a collection of
    flags. That way I am sure I can use CHAR_BIT*8 flags with
    no problems if I do shifts, etc.


    Obviously for ANDing the last 4 bits either declaration is OK.

    > I guess if you are left shifting or right shifting, then using unsigned
    > is better. How if you are just & (to check if a bit is set) & | (to set
    > a bit), does it make a diff if you use a signed datatype or an unsigned
    > one? I guess signed type would you give you one less bit to work with as
    > compared to an unsigned one, but other than that would it make a
    > difference?
    >


    Not at all.

    [snip]

    >
    >> By the way are you sure you want
    >> code = code & 1
    >> This sets to zero all bits but the last... If the last was zero
    >> it remains zero.
    >>
    >> Or maybe you intended
    >> code = code | 1;
    >>
    >> That *sets* the last bit to 1.

    >
    > Yes, true.
    >
    > All the &'s in my psuedocode should have been |'s.
    >


    OK, now it is clearer what you want to do.
     
    jacob navia, Sep 15, 2011
    #5
  6. Gr

    BartC Guest

    "Gr" <> wrote in message news:j4s877$2a0$...

    > In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
    > for denoting where an endpoint of the line lies as compared to the
    > clipping rectangle.
    >
    > This code consists of 4 flags (1 each for left, right, below & above).
    >
    > type code = 0;
    >
    > if(point lies to the left of clipping window)
    > code = code & 1;
    > if(point lies to right)
    > code = code & 2;
    >
    > if(point lies below)
    > code = code & 4;
    > if(point lies above)
    > code = code & 8;
    >


    I've just looked at my code for this, and I'm using an 'or' operation ('|'
    in C), to build the 'outcode'. In fact looking at your code, your 'code'
    variable is only ever going to be zero.

    As for what type to use for 'code', it probably doesn't really matter. Get
    the whole project working first, then worry about whether that tiny aspect
    is a bottleneck or not.

    --
    bartc
     
    BartC, Sep 15, 2011
    #6
  7. Gr

    Gr Guest

    On 15-09-2011 03:26 PM, BartC wrote:
    > "Gr" <> wrote in message news:j4s877$2a0$...
    >
    >> In the Cohen Sutherland Algorithm of line clipping, a code/flag is
    >> using for denoting where an endpoint of the line lies as compared to
    >> the clipping rectangle.
    >>
    >> This code consists of 4 flags (1 each for left, right, below & above).
    >>
    >> type code = 0;
    >>
    >> if(point lies to the left of clipping window)
    >> code = code & 1;
    >> if(point lies to right)
    >> code = code & 2;
    >>
    >> if(point lies below)
    >> code = code & 4;
    >> if(point lies above)
    >> code = code & 8;
    >>

    >
    > I've just looked at my code for this, and I'm using an 'or' operation
    > ('|' in C), to build the 'outcode'. In fact looking at your code, your
    > 'code' variable is only ever going to be zero.


    Yes - I typed that out by mistake - I am using |, not &.
    I indicated that in a followup to jacob navia.

    >
    > As for what type to use for 'code', it probably doesn't really matter.
    > Get the whole project working first, then worry about whether that tiny
    > aspect is a bottleneck or not.


    It's not a project. I am just a looking at the implementation in 2
    different text books - one using an unsigned char & the other using an
    int & hence wondered about the choices & if one is a better choice than
    the other.
     
    Gr, Sep 15, 2011
    #7
  8. Gr

    Phil Carmody Guest

    Gr <> writes:
    > I am not an expert C or C++ programmer but know enough to get by.
    >
    > Currently studying some graphics algorithms as a part of CAD.
    >
    > In the Cohen Sutherland Algorithm of line clipping, a code/flag is
    > using for denoting where an endpoint of the line lies as compared to
    > the clipping rectangle.
    >
    > This code consists of 4 flags (1 each for left, right, below & above).
    >
    > type code = 0;
    >
    > if(point lies to the left of clipping window)
    > code = code & 1;
    > if(point lies to right)
    > code = code & 2;
    >
    > if(point lies below)
    > code = code & 4;
    > if(point lies above)
    > code = code & 8;
    >
    >
    > Now in some books the type of code is declared
    > as
    > unsigned char code;
    > In others, it's
    > int code;
    >
    > which is better?


    It depends.

    > unsigned char code saves space
    > (is the unsigned really necessary anyway?)
    >
    > but I think int is probably faster.


    Unthink that thought.

    There is no "think code is faster", there is only
    "fact - code is not sufficiently fast" and
    "measure which alternative code is faster".

    > Can someone explain what's a better way to do this if you were
    > writing a graphics library?


    If the code-size is similar and the speed is similar,
    why are you fussing about such trivialities? I'm sure
    there are actual important architectural issues that
    need to be adressed with much higher priority. You're
    wasting too much time deciding what colour to paint
    the bike shed.

    Phil
    --
    "Religion is what keeps the poor from murdering the rich."
    -- Napoleon
     
    Phil Carmody, Sep 15, 2011
    #8
  9. On Sep 16, 1:16 am, Phil Carmody <>
    wrote:
    > Gr <> writes:
    >
    > > but I think int is probably faster.

    >
    > Unthink that thought.
    >
    > There is no "think code is faster", there is only
    > "fact - code is not sufficiently fast" and
    > "measure which alternative code is faster".
    >

    int should be "the natural integer type" for the platform. Which means
    that usually it will be the fastest type. You should use int for an
    integer unless there is a clear reason for using another type. Though
    each instance will only save a few cycles, if you systematically use
    int the speed increment may be worth having.

    --
    Fuzzy Logic Trees - now in Apple ebooks
    Read all about them on http://www.malcolmmclean.site11.com/www
     
    Malcolm McLean, Sep 16, 2011
    #9
  10. Gr

    Jorgen Grahn Guest

    On Fri, 2011-09-16, Malcolm McLean wrote:
    > On Sep 16, 1:16 am, Phil Carmody <>
    > wrote:
    >> Gr <> writes:
    >>
    >> > but I think int is probably faster.

    >>
    >> Unthink that thought.
    >>
    >> There is no "think code is faster", there is only
    >> "fact - code is not sufficiently fast" and
    >> "measure which alternative code is faster".


    I partly disagree. You also need some vague idea about what's
    efficient and what's not, so you don't spend 20% more here and 18%
    more there. Distributed fat is hard to profile.

    > int should be "the natural integer type" for the platform. Which means
    > that usually it will be the fastest type.


    If you write a compiler today for an architecture where 16-bit ints
    would be a bit faster than 32-bit, you'd still use 32-bit because
    otherwise too much code would break.

    And even on 32-bit architectures that generalization is too ...
    generalized. What if you need an array of a million numbers between 0
    and 7 -- will int be as fast as unsigned char? Most likely not.

    > You should use int for an
    > integer unless there is a clear reason for using another type. Though
    > each instance will only save a few cycles, if you systematically use
    > int the speed increment may be worth having.


    In this case the problem didn't call for signedness, so I'd use
    'unsigned'.

    If the compiler had enough information to do further optimizations,
    I'd stop worrying after that and leave it to the compiler.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
     
    Jorgen Grahn, Sep 16, 2011
    #10
  11. Gr

    Tony Guest

    Malcolm McLean wrote:

    > You should use int for an
    > integer unless there is a clear reason for using another type. Though
    > each instance will only save a few cycles, if you systematically use
    > int the speed increment may be worth having.


    Did you mean that just in the context of the OP/this thread, or in
    general?
     
    Tony, Sep 16, 2011
    #11
  12. pete <> writes:
    > Tony wrote:
    >>
    >> Malcolm McLean wrote:
    >>
    >> > You should use int for an
    >> > integer unless there is a
    >> > clear reason for using another type. Though
    >> > each instance will only save a few cycles, if you systematically use
    >> > int the speed increment may be worth having.

    >>
    >> Did you mean that just in the context of the OP/this thread, or in
    >> general?

    >
    > I only use data types which are smaller than int,
    > when there is a special reason.
    >
    > It's possible that small types may be aligned on int boundaries,
    > meaning that there may be unusable bytes in between them.


    No, it's not. In an array of char, or an array of short, the elements
    must be contiguous; unlike structs arrays have no padding. This is necessary
    to make array indexing and pointer arithmetic work.

    > It's also possible that there may be masking operations
    > invloved with simple arithmetic operations on small types.
    > Small types tend to get converted to type int
    > in various operations.


    That's true.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 17, 2011
    #12
  13. Gr

    Tony Guest

    "pete" <> wrote in message
    news:...
    > Tony wrote:
    >>
    >> Malcolm McLean wrote:
    >>
    >> > You should use int for an
    >> > integer unless there is a
    >> > clear reason for using another type. Though
    >> > each instance will only save a few cycles, if you systematically use
    >> > int the speed increment may be worth having.

    >>
    >> Did you mean that just in the context of the OP/this thread, or in
    >> general?

    >
    > I only use data types which are smaller than int,
    > when there is a special reason.
    >
    > It's possible that small types may be aligned on int boundaries,
    > meaning that there may be unusable bytes in between them.
    >
    > It's also possible that there may be masking operations
    > invloved with simple arithmetic operations on small types.
    > Small types tend to get converted to type int
    > in various operations.
    >


    I didn't cut out enough of your post. Here's what I wanted to know:

    >> Malcolm McLean wrote:
    >>
    >> > You should use int for an
    >> > integer unless there is a
    >> > clear reason for using another type.


    >> Did you mean that just in the context of the OP/this thread, or in
    >> general?
     
    Tony, Sep 17, 2011
    #13
  14. pete <> writes:
    > Keith Thompson wrote:
    >> pete <> writes:
    >> > Tony wrote:
    >> >>
    >> >> Malcolm McLean wrote:
    >> >>
    >> >> > You should use int for an
    >> >> > integer unless there is a
    >> >> > clear reason for using another type. Though
    >> >> > each instance will only save a few cycles,
    >> >> > if you systematically use
    >> >> > int the speed increment may be worth having.
    >> >>
    >> >> Did you mean that just in the context of the OP/this thread, or in
    >> >> general?
    >> >
    >> > I only use data types which are smaller than int,
    >> > when there is a special reason.
    >> >
    >> > It's possible that small types may be aligned on int boundaries,
    >> > meaning that there may be unusable bytes in between them.

    >>
    >> No, it's not. In an array of char, or an array of short, the elements
    >> must be contiguous; unlike structs arrays have no padding.
    >> This is necessary
    >> to make array indexing and pointer arithmetic work.

    >
    > It's possible that small types may exist outside of an array.


    Of course.

    In a struct, there can be arbitrary padding between any two members, or
    after the last member. But in practice, I'd be very surprised to see
    any padding between c0 and c1 in
    struct {
    char c0;
    char c1;
    };

    As for individually declared objects, you're not likely to have
    very many of them, so any padding between them should have only
    minimal effect (and if it's there, it probably saves you more in code
    space than it costs in data space).

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 17, 2011
    #14
  15. On Sep 16, 11:46 pm, Jorgen Grahn <> wrote:
    >
    > And even on 32-bit architectures that generalization is too ...
    > generalized. What if you need an array of a million numbers between 0
    > and 7 -- will int be as fast as unsigned char? Most likely not.
    >

    int should read and write on a single "operation". Accessing the
    unsigned chars may create unaligned reads, or may require masking and
    bitshifting, thus several "operations".

    However ints will exhaust the cache faster. It can become difficult to
    get an intuitive feel for speed. For instance are we stepping through
    the array in sequence in the inner loop, or accessing it randomly?
     
    Malcolm McLean, Sep 18, 2011
    #15
  16. Gr

    Jorgen Grahn Guest

    On Sun, 2011-09-18, Malcolm McLean wrote:
    > On Sep 16, 11:46 pm, Jorgen Grahn <> wrote:
    >>
    >> And even on 32-bit architectures that generalization is too ...
    >> generalized. What if you need an array of a million numbers between 0
    >> and 7 -- will int be as fast as unsigned char? Most likely not.
    >>

    > int should read and write on a single "operation". Accessing the
    > unsigned chars may create unaligned reads, or may require masking and
    > bitshifting, thus several "operations".


    In recent years, I have come to see a linear series of instructions as
    cheap, compared to memory access. As a general rule. People sometimes
    point out that precalculated tables aren't as useful today as they
    were maybe 20 years ago -- it's often faster to recalculate than to hit
    the table.

    > However ints will exhaust the cache faster. It can become difficult to
    > get an intuitive feel for speed. For instance are we stepping through
    > the array in sequence in the inner loop, or accessing it randomly?


    True. I take back my "most likely not" above; the problem as stated by
    me still has too little information.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
     
    Jorgen Grahn, Sep 18, 2011
    #16
  17. Gr

    hanukas Guest

    On Sep 15, 10:07 am, Gr <> wrote:
    > I am not an expert C or C++ programmer but know enough to get by.
    >
    > Currently studying some graphics algorithms as a part of CAD.
    >
    > In the Cohen Sutherland Algorithm of line clipping, a code/flag is using
    > for denoting where an endpoint of the line lies as compared to the
    > clipping rectangle.
    >
    > This code consists of 4 flags (1 each for left, right, below & above).
    >
    > type code = 0;
    >
    > if(point lies to the left of clipping window)
    >     code = code & 1;
    > if(point lies to right)
    >     code = code & 2;
    >
    > if(point lies below)
    >    code = code & 4;
    > if(point lies above)
    >    code = code & 8;
    >
    > Now in some books the type of code is declared
    > as
    > unsigned char code;
    > In others, it's
    > int code;
    >
    > which is better?
    >
    > unsigned char code saves space
    > (is the unsigned really necessary anyway?)
    >
    > but I think int is probably faster.
    >
    > Can someone explain what's a better way to do this if you were
    > writing a graphics library?


    The outcodes are temporary variables, space savings are less critical
    here than using natural register width; use unsigned int.

    If your library does multiple passes over your data and "caches" the
    outcodes, you should instead use design which encourages sharing of
    endpoints; this allows efficient single pass and generally requires
    less bandwidth.

    It boils down to this..

    if (codeA & codeB) -> trivial reject
    else if (!(codeA | codeB)) -> trivial accept (= no clipping required)
    else -> clip
     
    hanukas, Sep 23, 2011
    #17
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Jeff
    Replies:
    2
    Views:
    1,170
    Steve C. Orr [MCSD, MVP, CSM, ASP Insider]
    Apr 16, 2007
  2. Marcel Müller

    Enumerations for flags (bit vectors)

    Marcel Müller, Mar 21, 2008, in forum: C++
    Replies:
    4
    Views:
    355
    James Kanze
    Mar 22, 2008
  3. Steve Holden
    Replies:
    0
    Views:
    788
    Steve Holden
    Feb 8, 2009
  4. Replies:
    8
    Views:
    525
  5. Gr

    datatype for bit flags

    Gr, Sep 15, 2011, in forum: C Programming
    Replies:
    16
    Views:
    590
    hanukas
    Sep 23, 2011
Loading...

Share This Page