array of bits

Discussion in 'C Programming' started by \\, May 13, 2010.

  1. \\

    \\ Guest

    Hello,
    I would like to implement array of bits using the appropriate data
    type for the machine
    in which the code is compiled. So, using 32 bits for 32 bits machines
    (and os), and 64 bits
    for 64 bits machines (and os).
    How can I know which is the natural word size of the machine and do
    all the rest?
    Thanks,
    tano
    \\, May 13, 2010
    #1
    1. Advertising

  2. \\

    Tom St Denis Guest

    On May 13, 12:39 pm, "\"\"" <> wrote:
    > Hello,
    > I would like to implement array of bits using the appropriate data
    > type for the machine
    > in which the code is compiled. So, using 32 bits for 32 bits machines
    > (and os), and 64 bits
    > for 64 bits machines (and os).
    > How can I know which is the natural word size of the machine and do
    > all the rest?
    > Thanks,
    >                      tano


    Make an array of unsigned int then use CHAR_BIT * sizeof(int) to
    figure out how many bits are in it.

    Tom
    Tom St Denis, May 13, 2010
    #2
    1. Advertising

  3. \\

    \\ Guest

    Well, I hope there is a better way.
    I think that even a 64 bit machine could use 32 bit registers for
    unsigned int.
    In 64 bit machines, I would like to use 64 bit data types.
    Since I have to to AND and OR operation between bitsets, I think there
    is a
    good point in using 64 bits when available.
    Is it possible that there is no way to get if I am compiling under a
    64 bit machine?
    \\, May 13, 2010
    #3
  4. \\

    Dann Corbit Guest

    In article <dcf74d76-a93c-480d-b657-43b2bd116812
    @o14g2000yqb.googlegroups.com>, says...
    >
    > Hello,
    > I would like to implement array of bits using the appropriate data
    > type for the machine
    > in which the code is compiled. So, using 32 bits for 32 bits machines
    > (and os), and 64 bits
    > for 64 bits machines (and os).
    > How can I know which is the natural word size of the machine and do
    > all the rest?
    > Thanks,
    > tano



    It's a FAQ:

    20.8: How can I implement sets or arrays of bits?

    A: Use arrays of char or int, with a few macros to access the
    desired bit at the proper index. Here are some simple macros to
    use with arrays of char:

    #include <limits.h> /* for CHAR_BIT */

    #define BITMASK(b) (1 << ((b) % CHAR_BIT))
    #define BITSLOT(b) ((b) / CHAR_BIT)
    #define BITSET(a, b) ((a)[BITSLOT(b)] |= BITMASK(b))
    #define BITTEST(a, b) ((a)[BITSLOT(b)] & BITMASK(b))

    (If you don't have <limits.h>, try using 8 for CHAR_BIT.)

    References: H&S Sec. 7.6.7 pp. 211-216.

    BTW, the book has a more detailed answer. If you want to actually
    implement it as an array, you will have to use C++, because C does not
    have operator overloading.

    But if you are using C++ then it is already done for you in the STL
    bitset template.
    Dann Corbit, May 13, 2010
    #4
  5. \\

    \\ Guest

    Yes, I read the FAQ.

    My doubts are not how to do operations, but how can I choose
    the most appropriate type for bitset.
    I.e. choose a 32 bits type on a 32 bits machine,
    and a 64 bits type on a 64 bits machine.
    However thanks,
    tano
    \\, May 13, 2010
    #5
  6. "\"\"" <> writes:
    > Well, I hope there is a better way.
    > I think that even a 64 bit machine could use 32 bit registers for
    > unsigned int.
    > In 64 bit machines, I would like to use 64 bit data types.
    > Since I have to to AND and OR operation between bitsets, I think there
    > is a
    > good point in using 64 bits when available.
    > Is it possible that there is no way to get if I am compiling under a
    > 64 bit machine?


    (Please find a way to format your paragraphs more sanely. The jagged
    lines are difficult to read.)

    What exactly is a "64 bit machine"? If you can rigorously define the
    term (I can't), then you can probably figure out a way to determine
    whether you're on one.

    My best suggestion: Look at a variety of systems, and for each one,
    see which integer type is the one you'd like to use. Ideally
    unsigned int should be the best choice, since it's *supposed*
    to have "the natural size suggested by the architecture of the
    execution environment" (C99 6.2.5p5), but requirements for backward
    compatibility with 32-bit systems often override this suggestion.
    Some supposedly 64-bit systems even have 32-bit longs.

    It may be that unsigned long will be the best choice on a wide
    variety of systems, or perhaps one of the int*fast_t types defined in
    <stdint.h> (for implementations that provide <stdint.h>). Or make it
    a build-time configuration option, chosen manually for each target.

    You should also profile your code with both 32-bit and 64-bit
    integers on each platform. You might be surprised by the results.
    I don't have any particular surpise in mind, I'm just saying that
    you shouldn't make assumptions.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, May 13, 2010
    #6
  7. \\

    Ben Pfaff Guest

    "\"\"" <> writes:

    > My doubts are not how to do operations, but how can I choose
    > the most appropriate type for bitset.
    > I.e. choose a 32 bits type on a 32 bits machine,
    > and a 64 bits type on a 64 bits machine.


    I would probably just use "unsigned long". In the most common
    cases it will have the sizes that you request, and in other cases
    it will still work.
    --
    Ben Pfaff
    http://benpfaff.org
    Ben Pfaff, May 13, 2010
    #7
  8. \\

    Eric Sosman Guest

    On 5/13/2010 3:48 PM, "" wrote:
    > Yes, I read the FAQ.
    >
    > My doubts are not how to do operations, but how can I choose
    > the most appropriate type for bitset.
    > I.e. choose a 32 bits type on a 32 bits machine,
    > and a 64 bits type on a 64 bits machine.


    If you can define precisely what you mean by "32 bits machine"
    and "64 bits machine," perhaps someone will think of a test you can
    make to tell them apart. My bet is that a rigorous definition will
    prove to be elusive -- but go ahead, give it a try, surprise me.

    --
    Eric Sosman
    lid
    Eric Sosman, May 13, 2010
    #8
  9. "Datesfat Chicks" <> writes:
    > """" <> wrote in message
    > news:...
    >> Hello,
    >> I would like to implement array of bits using the appropriate data
    >> type for the machine
    >> in which the code is compiled. So, using 32 bits for 32 bits machines
    >> (and os), and 64 bits
    >> for 64 bits machines (and os).
    >> How can I know which is the natural word size of the machine and do
    >> all the rest?

    >
    > That depends on whether you want to do it at compile time or runtime.
    >
    > At compile time ... other posters have suggested various limit files.
    >
    > Also (and somebody will flame me for this--don't blame them), the standards
    > may (or may not) require a correlation between the way the preprocessor and
    > the compiler behave. So something like:
    >
    > #if ((1 << 32) == 0)
    > ... this is a 32-bit platform
    > #else
    > ... this is a 64-bit platform
    > #endif
    >
    > might (or might not) be required to work.


    Why would anyone flame you for being unsure about this?

    The actual requirement in C99 is that preprocessor arithmetic is
    done in the equivalent of intmax_t and uintmax_t, so the above
    won't do what you're trying to do. And since intmax_t must be at
    least 64 bits, 1<<32 will never be 0 in the preprocessor. Even if
    that weren't the case, shifting a 32-bit value by 32 or more bits
    is undefined.

    > As far as at runtime ... the standard technique is to keep left-shifting "1"
    > as an unsigned integer and count how many times it takes to become 0, i.e.
    >
    > unsigned how_many_bits(void)
    > {
    > unsigned mask=1, count=0;
    >
    > while(mask)
    > {
    > mask <<= 1;
    > count++;
    > }
    >
    > return(count);
    > }


    This is much simpler:

    #include <limits.h>
    sizeof(int) * CHAR_BIT

    Your function is likely to give a more useful answer if unsigned
    int has padding bits, but I don't know of any implementation where
    it does.

    > I'm gonna get so flamed ...


    No, just corrected. (Some people can't tell the difference; I
    trust you can.)

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, May 13, 2010
    #9
  10. On Thu, 13 May 2010, Eric Sosman wrote:

    > On 5/13/2010 3:48 PM, "" wrote:
    >> Yes, I read the FAQ.
    >>
    >> My doubts are not how to do operations, but how can I choose
    >> the most appropriate type for bitset.
    >> I.e. choose a 32 bits type on a 32 bits machine,
    >> and a 64 bits type on a 64 bits machine.

    >
    > If you can define precisely what you mean by "32 bits machine"
    > and "64 bits machine," perhaps someone will think of a test you can
    > make to tell them apart.


    I believe the OP thinks of something like:

    "64 bits machine": where uint64_t is supported, and the C compiler
    generates machine instructions (or assembly code) that access such objects
    as single machine words.

    "32 bits machine": a machine -- which is not also a "64 bits machine" --
    where uint32_t is supported, and where the C compiler generates machine
    instructions (or assembly code) that access such objects as single machine
    words.

    I guess he's looking for the widest unsigned integer type that fits into a
    machine register. Yes, I know. There is probably nothing in the standard
    that would help us determine the answer portably. ("What's a register to
    begin with?") The OP seems to ask for a portable guess that will work fast
    on most sensible implemntations. I'd consider

    #include <stdint.h> // SIZE_MAX
    #include <limits.h> // UINT_MAX
    #include <stddef.h> // size_t -- perhaps move this after the #if

    #if SIZE_MAX > UINT_MAX
    typedef size_t mytype;
    #else
    typedef unsigned mytype;
    #endif

    Cheers,
    lacos
    Ersek, Laszlo, May 13, 2010
    #10
  11. "Ersek, Laszlo" <> writes:
    > On Thu, 13 May 2010, Eric Sosman wrote:
    >> On 5/13/2010 3:48 PM, "" wrote:
    >>> Yes, I read the FAQ.
    >>>
    >>> My doubts are not how to do operations, but how can I choose
    >>> the most appropriate type for bitset.
    >>> I.e. choose a 32 bits type on a 32 bits machine,
    >>> and a 64 bits type on a 64 bits machine.

    >>
    >> If you can define precisely what you mean by "32 bits machine"
    >> and "64 bits machine," perhaps someone will think of a test you can
    >> make to tell them apart.

    >
    > I believe the OP thinks of something like:
    >
    > "64 bits machine": where uint64_t is supported, and the C compiler
    > generates machine instructions (or assembly code) that access such objects
    > as single machine words.
    >
    > "32 bits machine": a machine -- which is not also a "64 bits machine" --
    > where uint32_t is supported, and where the C compiler generates machine
    > instructions (or assembly code) that access such objects as single machine
    > words.


    I think that nearly boils down to "64-bit machines have 64-bit machine
    words, and 32-bit machines have 32-bit machine words". All that remains
    is to define "machine word".

    Which you've more or less already done: a machine word can be accessed
    by a single instruction. But there's no portable way to detect that in
    C.

    > I guess he's looking for the widest unsigned integer type that fits into a
    > machine register. Yes, I know. There is probably nothing in the standard
    > that would help us determine the answer portably. ("What's a register to
    > begin with?") The OP seems to ask for a portable guess that will work fast
    > on most sensible implemntations. I'd consider
    >
    > #include <stdint.h> // SIZE_MAX
    > #include <limits.h> // UINT_MAX
    > #include <stddef.h> // size_t -- perhaps move this after the #if
    >
    > #if SIZE_MAX > UINT_MAX
    > typedef size_t mytype;
    > #else
    > typedef unsigned mytype;
    > #endif


    Why not just this?

    typedef size_t mytype;

    Your version uses unsigned int only when it and size_t are the same
    size (so it probably doesn't make any difference) or when unsigned
    int is bigger than size_t (which is possible but I've never heard
    of it).

    Size_t is probably the same size as a machine address register,
    which is probably one "word" (whatever that means). It's not
    really what size_t is intended for, but it looks like it could be
    a reasonable heuristic.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, May 13, 2010
    #11
  12. \\

    Ben Pfaff Guest

    Keith Thompson <> writes:

    > Which you've more or less already done: a machine word can be accessed
    > by a single instruction. But there's no portable way to detect that in
    > C.


    I'm not sure that even such a definition is foolproof. For
    example, I have heard that the IBM 360 architecture supports
    decimal arithmetic of variable-length quantities in a single
    instruction, but I doubt that considering such instructions would
    give an answer that the OP would be comfortable with.
    --
    Ben Pfaff
    http://benpfaff.org
    Ben Pfaff, May 13, 2010
    #12
  13. On Thu, 13 May 2010, Keith Thompson wrote:

    > "Ersek, Laszlo" <> writes:


    [snip]

    >> I guess he's looking for the widest unsigned integer type that fits
    >> into a machine register. Yes, I know. There is probably nothing in the
    >> standard that would help us determine the answer portably. ("What's a
    >> register to begin with?") The OP seems to ask for a portable guess that
    >> will work fast on most sensible implemntations. I'd consider
    >>
    >> #include <stdint.h> // SIZE_MAX
    >> #include <limits.h> // UINT_MAX
    >> #include <stddef.h> // size_t -- perhaps move this after the #if
    >>
    >> #if SIZE_MAX > UINT_MAX
    >> typedef size_t mytype;
    >> #else
    >> typedef unsigned mytype;
    >> #endif

    >
    > Why not just this?
    >
    > typedef size_t mytype;
    >
    > Your version uses unsigned int only when it and size_t are the same size
    > (so it probably doesn't make any difference) or when unsigned int is
    > bigger than size_t (which is possible but I've never heard of it).


    Yes, I wanted to protect against a fortuitous "degenerate" case, in which
    the size_t elements of the array would be constantly promoted when
    manipulated, and then converted back when stored.

    Thanks!
    lacos
    Ersek, Laszlo, May 14, 2010
    #13
  14. \\

    Lew Pitcher Guest

    On May 13, 2010 18:40, in comp.lang.c, wrote:

    > Keith Thompson <> writes:
    >
    >> Which you've more or less already done: a machine word can be accessed
    >> by a single instruction. But there's no portable way to detect that in
    >> C.

    >
    > I'm not sure that even such a definition is foolproof. For
    > example, I have heard that the IBM 360 architecture supports
    > decimal arithmetic of variable-length quantities in a single
    > instruction,


    The AP ("Add Packed") instruction is one of the IBM360 "Storage/Storage"
    machine language instructions. Consequently, each of the two operands can
    be, independantly, between 1 and 256 bytes in length, representing numbers
    with between 1 and 511 decimal digits. With this instruction, it *is*
    possible to perform a decimal add between two packed-decimal values with
    different lengths.

    > but I doubt that considering such instructions would
    > give an answer that the OP would be comfortable with.


    No, Indeed, the IBM360 would give the OP nightmares, because it is
    simultaneously an 8bit, a 16bit, a 32bit and a 64bit machine, which uses
    24-bit and 31-bit addresses. Later versions would expand the hardware
    address space to 63-bit addresses (note that none of the address types fit
    exactly into the C uint*t space).

    HTH
    --
    Lew Pitcher
    Master Codewright & JOAT-in-training | Registered Linux User #112576
    Me: http://pitcher.digitalfreehold.ca/ | Just Linux: http://justlinux.ca/
    ---------- Slackware - Because I know what I'm doing. ------
    Lew Pitcher, May 14, 2010
    #14
  15. \\

    \\ Guest

    Ok,
    it seems that the question is not simply to define
    so I try to put it clear.

    I have to use array of bits not only with operations
    like set/reset a bit, but also to compare a piece of
    an array with a piece of another array, with AND/OR
    logical operations, so I would like to use the widest
    type available on the machine the code is compiled in.
    The widest type naturally supported, obviosly. If I
    compile on a 32 bit system, and the compiler could
    simulate 64 bit data types through multiple 32 bit
    operations, I am not interested in 64, I should
    choose 32.

    > I guess he's looking for the widest unsigned
    > integer type that fits into a machine register.


    Yes!!

    >(beginner rule: don't do performance optimization;
    > advanced rule: don't do it *yet*).


    Well, normally I am not interested in the trick of
    the month to get faster code, but in this case, I
    think that choosing the right type is even a way to
    write better code.

    I am studying your answers, however, and thank you
    very much.
    tano
    \\, May 14, 2010
    #15
  16. "\"\"" <> writes:
    > it seems that the question is not simply to define
    > so I try to put it clear.
    >
    > I have to use array of bits not only with operations
    > like set/reset a bit, but also to compare a piece of
    > an array with a piece of another array, with AND/OR
    > logical operations, so I would like to use the widest
    > type available on the machine the code is compiled in.
    > The widest type naturally supported, obviosly. If I
    > compile on a 32 bit system, and the compiler could
    > simulate 64 bit data types through multiple 32 bit
    > operations, I am not interested in 64, I should
    > choose 32.
    >
    >> I guess he's looking for the widest unsigned
    >> integer type that fits into a machine register.

    >
    > Yes!!
    >
    >>(beginner rule: don't do performance optimization;
    >> advanced rule: don't do it *yet*).

    >
    > Well, normally I am not interested in the trick of
    > the month to get faster code, but in this case, I
    > think that choosing the right type is even a way to
    > write better code.
    >
    > I am studying your answers, however, and thank you
    > very much.


    My advice:

    Use a single typedef that defines which unsigned type you want to use.
    Other than that single declaration, write all your code so that it
    doesn't care which type it's dealing with. It could even be unsigned
    char.

    Once you've got your code working, then worry about which type is more
    efficient on a given system. You can easily tweak the typedef and
    rebuild to do performance comparisons.

    Don't assume that 32-bit integers are going to be faster than 64-bit
    integer, or vice versa, until you've *measured* it.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, May 14, 2010
    #16
  17. \\

    Guest

    On May 13, 6:16 pm, Lew Pitcher <> wrote:
    > On May 13, 2010 18:40, in comp.lang.c, wrote:
    >
    > > Keith Thompson <> writes:

    >
    > >> Which you've more or less already done: a machine word can be accessed
    > >> by a single instruction.  But there's no portable way to detect that in
    > >> C.

    >
    > > I'm not sure that even such a definition is foolproof.  For
    > > example, I have heard that the IBM 360 architecture supports
    > > decimal arithmetic of variable-length quantities in a single
    > > instruction,

    >
    > The AP ("Add Packed") instruction is one of the IBM360 "Storage/Storage"
    > machine language instructions. Consequently, each of the two operands can
    > be, independantly, between 1 and 256 bytes in length, representing numbers
    > with between 1 and 511 decimal digits. With this instruction, it *is*
    > possible to perform a decimal add between two packed-decimal values with
    > different lengths.



    Actually the decimal instructions allow two four bit lengths, and so
    support operands of 1-16 bytes (and 1-31 decimal digits). Other "SS"
    format instructions use the same field to represent a single eight bit
    length instead.


    > > but I doubt that considering such instructions would
    > > give an answer that the OP would be comfortable with.

    >
    > No, Indeed, the IBM360 would give the OP nightmares, because it is
    > simultaneously an 8bit, a 16bit, a 32bit and a 64bit machine, which uses
    > 24-bit and 31-bit addresses. Later versions would expand the hardware
    > address space to 63-bit addresses (note that none of the address types fit
    > exactly into the C uint*t space).



    zArch extended the address space to 64, not 63 bits. And 24 and 32
    bit addresses are almost always handled as a full (32 bit) register
    (not that there haven't weren't cases of software taking pains to only
    use three bytes to store a 24 bit address). And for the most part, at
    the architectural level, and in user mode, the address size impacts
    only where the address space wraps around (with the notable exception
    of the two original S/360 subroutine call instructions, BAL and BALR,
    which require a noteworthy semantic change when moving from 24 bit to
    31 bit addressing - although subroutine call instructions without the
    oddities of BAL and BALR have been in the ISA since the seventies).
    , May 14, 2010
    #17
  18. \\

    \\ Guest

    Wow, it seems that C99 provides something that could help me
    (quoted from libc manual):

    > If you want an integer with the widest range possible on
    > the platform on which it is being used, use one of the
    > following. If you use these, you should write code that
    > takes into account the variable size and range of the integer.


    > intmax_t
    > uintmax_t


    Tell me if I'm wrong.
    \\, May 14, 2010
    #18
  19. \\

    bart.c Guest

    """" <> wrote in message
    news:...
    > Wow, it seems that C99 provides something that could help me
    > (quoted from libc manual):
    >
    >> If you want an integer with the widest range possible on
    >> the platform on which it is being used, use one of the
    >> following. If you use these, you should write code that
    >> takes into account the variable size and range of the integer.

    >
    >> intmax_t
    >> uintmax_t

    >
    > Tell me if I'm wrong.


    On my machine, these are 64-bits, even though my processor is 32-bits (or
    used in 32-bit mode anyway).

    --
    Bartc
    bart.c, May 14, 2010
    #19
  20. \\

    Seebs Guest

    On 2010-05-14, \"" <> wrote:
    > Wow, it seems that C99 provides something that could help me
    > (quoted from libc manual):
    >
    >> If you want an integer with the widest range possible on
    >> the platform on which it is being used, use one of the
    >> following. If you use these, you should write code that
    >> takes into account the variable size and range of the integer.

    >
    >> intmax_t
    >> uintmax_t

    >
    > Tell me if I'm wrong.


    You're wrong.

    See, intmax_t is, on many systems, very inefficient, because it's a larger
    type than any of the hardware types.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    Seebs, May 14, 2010
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. GGG
    Replies:
    10
    Views:
    12,517
    Donar
    Jul 6, 2006
  2. sarmin kho
    Replies:
    2
    Views:
    819
    A. Lloyd Flanagan
    Jun 15, 2004
  3. Miki Tebeka
    Replies:
    1
    Views:
    434
    Marcin 'Qrczak' Kowalczyk
    Jun 14, 2004
  4. sergey

    "casting" bits to bits?

    sergey, Nov 8, 2006, in forum: VHDL
    Replies:
    1
    Views:
    691
    sergey
    Nov 8, 2006
  5. Tomás

    Value Bits Vs Object Bits

    Tomás, Jun 2, 2006, in forum: C Programming
    Replies:
    13
    Views:
    537
    Hallvard B Furuseth
    Jul 1, 2006
Loading...

Share This Page