vector and bool

Discussion in 'C++' started by daniel, Oct 12, 2004.

  1. daniel

    daniel Guest

    1)
    is C++ smart enough to automatically use "bits" for bool or will
    a bool have the size of a charcter (byte).
    2) The index of a vector is it an integer (4 byte) or a "long long"
    with 8 bytes or something else?

    I ask because I want to use

    vector<bool> with a few billion entries exceeding the int32 range for
    indexing and I want to use as few memory as possible for the whole
    vector<bool>

    e.g.
    vector<bool> B;
    B.resize(2^34);

    thanks, daniel
     
    daniel, Oct 12, 2004
    #1
    1. Advertising

  2. "daniel" <> wrote in message
    news:...
    > 1)
    > is C++ smart enough to automatically use "bits" for bool or will
    > a bool have the size of a charcter (byte).

    The standard C++ library is actually required to specialize the
    std::vector for bool, and to provide 'compact' storage.
    [NB: for many uses, this is actually seen as an inconvenience]

    > 2) The index of a vector is it an integer (4 byte) or a "long long"
    > with 8 bytes or something else?

    This depends on the implementation, but it may well be limited
    to size_t, or a 4-byte unsigned integer on many platforms.
    You can check the return value of vector<bool>::max_size() to
    find out.

    > I ask because I want to use
    >
    > vector<bool> with a few billion entries exceeding the int32 range for
    > indexing and I want to use as few memory as possible for the whole
    > vector<bool>

    May we ask for what type of application or what kind of data?
    In many cases, using some form of compression might be better
    that allocating gigabytes of RAM. Does your system have enough
    memory?


    Regards,
    Ivan
    --
    http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
     
    Ivan Vecerina, Oct 12, 2004
    #2
    1. Advertising

  3. daniel

    JKop Guest

    daniel posted:

    > 1)
    > is C++ smart enough to automatically use "bits" for bool or will
    > a bool have the size of a charcter (byte).


    I believe that sizeof(bool) must be >= 1, which implies that a "bool" is of
    one byte. But that doesn't mean that your particular system might do
    something different in the background.

    It's to do with the actual CPU of the system, what is the smallest unit of
    memory that can be accessed, which is called a "byte", which 9 times out of
    10 is equal to "8 bits". Code which would store 8 bools in 1 byte would be
    slower and (ironically) more memory consumptive as the code which
    manipulates the byte would have to be kept in memory.

    > 2) The index of a vector is it an integer (4 byte) or a "long long"
    > with 8 bytes or something else?


    There is no such intrinsic type "long long".

    There is a type "long int", which is guaranteed to be atleast 32-bit on all
    systems, but that doesn't guarantee that it'll be 4 bytes long. For
    instance:

    If "char" were 9-Bit, then although "long" would be atleast 32-Bit, it
    wouldn't necessarily be 4 bytes long, as 4 bytes = 36 bits on that system.

    > I ask because I want to use
    >
    > vector<bool> with a few billion entries exceeding the int32 range for
    > indexing and I want to use as few memory as possible for the whole
    > vector<bool>
    >
    > e.g.
    > vector<bool> B;
    > B.resize(2^34);
    >
    > thanks, daniel


    I'm sure there's people here that've done this before. . .

    I'm not one of them!


    -JKop
     
    JKop, Oct 12, 2004
    #3
  4. daniel

    Rolf Magnus Guest

    JKop wrote:

    > daniel posted:
    >
    >> 1)
    >> is C++ smart enough to automatically use "bits" for bool or will
    >> a bool have the size of a charcter (byte).

    >
    > I believe that sizeof(bool) must be >= 1, which implies that a "bool" is
    > of one byte. But that doesn't mean that your particular system might do
    > something different in the background.


    AFAIK, some systems use even 4 bytes for a bool, because they can access it
    faster.

    > It's to do with the actual CPU of the system, what is the smallest unit of
    > memory that can be accessed, which is called a "byte", which 9 times out
    > of 10 is equal to "8 bits".


    However, a CPU byte might be different from a C++ byte. I have heared there
    are signal processors that have a smallest memory unit of 24 bits and still
    the C++ implementation gives you a char with 8 bit. And there are 4 bit
    CPUS which would need to combine two bytes to meet the minimum requirement
    of 8 bits for char. I guess there might be C implelemntations, but probably
    no C++ ones on such CPUs, but C has the same requirements in that case.

    > Code which would store 8 bools in 1 byte would be slower and (ironically)
    > more memory consumptive as the code which manipulates the byte would have
    > to be kept in memory.
    >
    >> 2) The index of a vector is it an integer (4 byte) or a "long long"
    >> with 8 bytes or something else?

    >
    > There is no such intrinsic type "long long".
    >
    > There is a type "long int", which is guaranteed to be atleast 32-bit on
    > all systems, but that doesn't guarantee that it'll be 4 bytes long.


    That's right.

    > For instance:
    >
    > If "char" were 9-Bit, then although "long" would be atleast 32-Bit, it
    > wouldn't necessarily be 4 bytes long, as 4 bytes = 36 bits on that system.


    That example isn't really chosen very well. 3 bytes would only be 27 bits,
    so you would still need 4 bytes to meet the requirement of at least 32
    bits.
     
    Rolf Magnus, Oct 12, 2004
    #4
  5. daniel

    assaarpa Guest

    > Code which would store 8 bools in 1 byte would be
    > slower and (ironically) more memory consumptive as the code which
    > manipulates the byte would have to be kept in memory.


    Who gives a damn if the code takes 10 or 30 bytes, when the guy wants to
    store billions of bool's and the choises are 128 MB vs. 1024 MB for storing
    one billion flags. And you cannot be so sure about the performance either,
    very often smaller data translates to better performance in the real world
    applications.
     
    assaarpa, Oct 12, 2004
    #5
  6. daniel wrote:
    > 1)
    > is C++ smart enough to automatically use "bits" for bool or will
    > a bool have the size of a charcter (byte).



    In C++ it is

    1 <= sizeof(bool) <= sizeof(long)


    However especially for vector<bool> the standard says:


    "To optimize space allocation, a specialization of vector for bool
    elements is provided:

    ....


    reference is a class that simulates the behaviour of references of a
    single bit in vector<bool>."




    "The C++ Programming Language 3rd Edition" may help here:


    "16.3.11 Vector<bool>

    The specialization (§13.5) vector<bool> is provided as a compact vector
    of bool. A bool variable is addressable, so it takes up at least one
    byte. However, it is easy to implement vector<bool> so that each element
    takes up only a bit.
    The usual vector operations work for vector<bool> and retain their usual
    meanings. In particular, subscripting and iteration work as expected.

    For example:
    void f(vector<bool>& v)
    {
    // iterate using subscripting
    for (int i = 0; i<v.size() ; ++i) cin >> v ;

    typedef vector<bool>: :const_ iterator VI;

    // iterate using iterators
    for (VI p = v.begin() ; p!=v.end() ; ++p) cout<<*p;
    }

    To achieve this, an implementation must simulate addressing of a single
    bit. Since a pointer cannot address a unit of memory smaller than a
    byte, vector<bool>: :iterator cannot be a pointer. In particular,
    one cannot rely on bool* as an iterator for a vector<bool>:

    void f(vector<bool>& v)
    {
    bool* p = v.begin() ; // error: type mismatch
    // ...
    }

    A technique for addressing a single bit is outlined in §17.5.3.

    The library also provides bitset as a set of Boolean values with Boolean
    set operations (§17.5.3)."










    > 2) The index of a vector is it an integer (4 byte) or a "long long"
    > with 8 bytes or something else?



    It is vector<T>::size_type which is some unsigned integer type.


    Example:


    vector<int> someVec(100);

    for(vector<int>::size_type i=0; i<someVec.size(); ++i)
    // ...









    >
    > I ask because I want to use
    >
    > vector<bool> with a few billion entries exceeding the int32 range for
    > indexing and I want to use as few memory as possible for the whole
    > vector<bool>
    >
    > e.g.
    > vector<bool> B;
    > B.resize(2^34);




    Well, it is not guaranteed that bits will be used, although in most
    cases this is what should happen.

    You may also want to check bitset which uses bits for sure.



    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Oct 12, 2004
    #6
  7. Ioannis Vranos wrote:

    > Well, it is not guaranteed that bits will be used, although in most
    > cases this is what should happen.
    >
    > You may also want to check bitset which uses bits for sure.



    My mistake! It is *guaranteed* that bits are used:


    From the standard:

    "23.2.5 Class vector<bool>

    To optimize space allocation, a specialization of vector for bool
    elements is provided:

    ....

    // bit reference:
    class reference {
    friend class vector;
    reference();
    public:
    ˜reference();
    operator bool() const;
    reference& operator=(const bool x);
    reference& operator=(const reference& x);
    void flip(); // flips the bit
    };

    ....

    void flip(); // flips all bits

    ....

    reference is a class that simulates the behavior of references of a
    single bit in vector<bool>."



    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Oct 12, 2004
    #7
  8. daniel

    Ron Natalie Guest

    JKop wrote:

    > I believe that sizeof(bool) must be >= 1, which implies that a "bool" is of
    > one byte.


    Actually it implies bool is at least one byte. Many systems
    sizeof(bool) == sizeof(int).

    > But that doesn't mean that your particular system might do
    > something different in the background.


    Actually, for vector it is REQUIRED to do something different. I think
    it's stupid but the standard requires vector<bool> to be specialized to
    act differently than vectors of anything else. It should have been a
    different class.

    >
    > It's to do with the actual CPU of the system, what is the smallest unit of
    > memory that can be accessed, which is called a "byte", which 9 times out of
    > 10 is equal to "8 bits".


    Actually it has nothing to do with the actual CPU. C++ could just as
    easily have bit types. There are machines where the smallest
    addressable unit of storage is NOT a byte. A byte is carved up out of
    a larger word.

    However, getting back on topic, a byte (and it's related C++ type char)
    is the smallest unit of memory C++ will address. This unfornuately is
    another C/C++ stupidity. There really shouldn't be a hard relationship
    between "character" and "atomic memory unit." Of course, this is 30
    some years of hindsight speaking.
     
    Ron Natalie, Oct 12, 2004
    #8
  9. In message <1097586912.771988@athnrd02>, Ioannis Vranos
    <> writes
    >Ioannis Vranos wrote:
    >
    >> Well, it is not guaranteed that bits will be used, although in most
    >>cases this is what should happen.
    >> You may also want to check bitset which uses bits for sure.

    >
    >My mistake! It is *guaranteed* that bits are used:


    I don't think so.
    >
    >From the standard:
    >
    >"23.2.5 Class vector<bool>
    >
    >To optimize space allocation, a specialization of vector for bool
    >elements is provided:
    >
    >...
    >
    >// bit reference:
    >class reference {
    >friend class vector;
    >reference();
    >public:
    >Ëœreference();
    >operator bool() const;
    >reference& operator=(const bool x);
    >reference& operator=(const reference& x);
    >void flip(); // flips the bit
    >};
    >
    >...
    >
    >void flip(); // flips all bits
    >
    >...
    >
    >reference is a class that simulates the behavior of references of a
    >single bit in vector<bool>."
    >

    That just redefines the semantics of operator[] etc. to return a proxy
    instead of an actual bool &, thus removing the requirement for elements
    to have distinct addresses. Nothing there says that it _must_ use only
    use one bit per element.

    --
    Richard Herring
     
    Richard Herring, Oct 12, 2004
    #9
  10. daniel

    Jeff Flinn Guest

    "daniel" <> wrote in message
    news:...
    > 1)
    > is C++ smart enough to automatically use "bits" for bool or will
    > a bool have the size of a charcter (byte).
    > 2) The index of a vector is it an integer (4 byte) or a "long long"
    > with 8 bytes or something else?
    >
    > I ask because I want to use
    >
    > vector<bool> with a few billion entries exceeding the int32 range for
    > indexing and I want to use as few memory as possible for the whole
    > vector<bool>


    Use either std::bitset or boost::dynamic_bitset. Both "pack" the bits for
    better memory usage. You'll most likely have to modify these classes to
    account for indices > 32 bit.

    Jeff F
     
    Jeff Flinn, Oct 12, 2004
    #10
  11. Richard Herring wrote:

    >> reference is a class that simulates the behavior of references of a
    >> single bit in vector<bool>."

    >
    > That just redefines the semantics of operator[] etc. to return a proxy
    > instead of an actual bool &, thus removing the requirement for elements
    > to have distinct addresses. Nothing there says that it _must_ use only
    > use one bit per element.




    Well since it says that

    "reference is a class that simulates the behavior of references of a
    single bit in vector<bool>"

    I think this implies that vector<bool> should be implemented using bits.



    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Oct 12, 2004
    #11
  12. In message <1097598729.868638@athnrd02>, Ioannis Vranos
    <> writes
    >Richard Herring wrote:
    >
    >>> reference is a class that simulates the behavior of references of a
    >>>single bit in vector<bool>."

    >>
    >> That just redefines the semantics of operator[] etc. to return a
    >>proxy instead of an actual bool &, thus removing the requirement for
    >>elements to have distinct addresses. Nothing there says that it _must_
    >>use only use one bit per element.

    >
    >Well since it says that
    >
    >"reference is a class that simulates the behavior of references of a
    >single bit in vector<bool>"
    >
    >I think this implies that vector<bool> should be implemented using bits.
    >

    For a weak enough meaning of "should", I agree that that's the
    intention. But I don't think anything in the Standard rules out a naive
    implementation that makes no attempt to optimise, and simply implements
    vector<bool> in terms of an underlying vector<some_int_type>.

    --
    Richard Herring
     
    Richard Herring, Oct 12, 2004
    #12
  13. Richard Herring wrote:

    > For a weak enough meaning of "should", I agree that that's the
    > intention. But I don't think anything in the Standard rules out a naive
    > implementation that makes no attempt to optimise, and simply implements
    > vector<bool> in terms of an underlying vector<some_int_type>.



    There is not an explicit prohibition on this, however explicit
    prohibitions are rare in the standard.

    Also the implementer has to implement this particular reference:


    // bit reference:
    class reference {
    friend class vector;
    reference();
    public:
    ˜reference();
    operator bool() const;
    reference& operator=(const bool x);
    reference& operator=(const reference& x);
    void flip(); // flips the bit
    };


    with the note:

    reference is a class that simulates the behavior of references of a
    single bit in vector<bool>.



    and this particular member function:

    void flip(); // flips all bits




    Also vector<bool> is mentioned as a separate case in the standard, and
    thus I think the legislator intended this to be implemented only with
    bits. :)



    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Oct 12, 2004
    #13
  14. hmm not sure but i think somethink lieke that can be done with structures i
    mena this:


    struct item {

    unsigned int IsTrue:1;
    unsigned int watever:1;
    unsigned int number:14;

    }

    some cpu dosen't support that or the compiler doesen't use it because of
    speed loss

    but if it work is should looks like this in the ram:

    0 | 1 | 2.............15 |
    IsTrue | waterver | number |


    you could use it like an int but IMPORTANT

    if you forgot the unsignet the int will always be 0

    gnu says this:

    ...cpp: In function 'int main()':
    ....cpp:13: warning: comparison is always 0 due to width of bitfield


    here is also the name they#re called bitfields

    hope this can help ;-)







    --
    This message has been sent using the astalavista.net newsreader
    webinterface.
    http://www.astalavista.net/
     
    Nonenix (astalavista.net), Oct 12, 2004
    #14
  15. daniel

    daniel Guest

    Well I found something:
    http://www.sgi.com/tech/stl/bit_vector.html states
    ----------------
    A bit_vector is essentially a vector<bool>: it is a Sequence that has
    the same interface as vector. The main difference is that bit_vector
    is optimized for space efficiency. A vector always requires at least
    one byte per element, but a bit_vector only requires one bit per
    element.

    Warning: The name bit_vector will be removed in a future release of
    the STL. The only reason that bit_vector is a separate class, instead
    of a template specialization of vector<bool>, is that this would
    require partial specialization of templates. On compilers that support
    partial specialization, bit_vector is a specialization of
    vector<bool>. The name bit_vector is a typedef. This typedef is not
    defined in the C++ standard, and is retained only for backward
    compatibility.
    ----------------
    The std::bitset seems not to be useful because I cannot change its
    size during runtime (no ->resize()).

    So that should answer my first question.

    Thanks for the help, daniel



    "Jeff Flinn" <> wrote in message news:<ckgnot$q0h$>...
    > "daniel" <> wrote in message
    > news:...
    > > 1)
    > > is C++ smart enough to automatically use "bits" for bool or will
    > > a bool have the size of a charcter (byte).
    > > 2) The index of a vector is it an integer (4 byte) or a "long long"
    > > with 8 bytes or something else?
    > >
    > > I ask because I want to use
    > >
    > > vector<bool> with a few billion entries exceeding the int32 range for
    > > indexing and I want to use as few memory as possible for the whole
    > > vector<bool>

    >
    > Use either std::bitset or boost::dynamic_bitset. Both "pack" the bits for
    > better memory usage. You'll most likely have to modify these classes to
    > account for indices > 32 bit.
    >
    > Jeff F
     
    daniel, Oct 12, 2004
    #15
  16. Nonenix (astalavista.net) wrote:
    > hmm not sure but i think somethink lieke that can be done with structures i
    > mena this:
    >
    >
    > struct item {
    >
    > unsigned int IsTrue:1;
    > unsigned int watever:1;
    > unsigned int number:14;
    >
    > }
    >
    > some cpu dosen't support that or the compiler doesen't use it because of
    > speed loss
    >
    > but if it work is should looks like this in the ram:
    >
    > 0 | 1 | 2.............15 |
    > IsTrue | waterver | number |
    >
    >
    > you could use it like an int but IMPORTANT
    >
    > if you forgot the unsignet the int will always be 0



    Why?


    > gnu says this:
    >
    > ..cpp: In function 'int main()':
    > ...cpp:13: warning: comparison is always 0 due to width of bitfield



    May you provide that code?


    Actually it could be even like this:


    struct item
    {
    bool a:1;
    bool b:1;
    bool c:1;
    bool d:1;
    };



    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Oct 13, 2004
    #16
  17. In message <1097606565.479989@athnrd02>, Ioannis Vranos
    <> writes
    >Richard Herring wrote:
    >
    >> For a weak enough meaning of "should", I agree that that's the
    >>intention. But I don't think anything in the Standard rules out a
    >>naive implementation that makes no attempt to optimise, and simply
    >>implements vector<bool> in terms of an underlying vector<some_int_type>.

    >
    >
    >There is not an explicit prohibition on this, however explicit
    >prohibitions are rare in the standard.


    Just so. The standard specifies an interface and complexity (speed)
    requirements, not the mechanism used to implement them. But it says
    nothing prescriptive about the memory requirements for containers.
    >
    >Also the implementer has to implement this particular reference:
    >
    >
    >// bit reference:
    >class reference {
    >friend class vector;
    >reference();
    >public:
    >Ëœreference();
    >operator bool() const;
    >reference& operator=(const bool x);
    >reference& operator=(const reference& x);
    >void flip(); // flips the bit
    >};
    >

    Yes, and the fact that this is what operator[] is mandated to return is
    what makes it *possible* for the implementor to provide access to data
    packed into something smaller than an addressable unit.
    >
    >with the note:
    >
    >reference is a class that simulates the behavior of references of a
    >single bit in vector<bool>.


    Yes; note the word "simulates".
    >
    >and this particular member function:
    >
    >void flip(); // flips all bits
    >
    >Also vector<bool> is mentioned as a separate case in the standard, and
    >thus I think the legislator intended this to be implemented only with
    >bits.


    I'd say "intended it to be _possible_ to implement with bits".

    >:)


    --
    Richard Herring
     
    Richard Herring, Oct 13, 2004
    #17
  18. daniel

    Jeff Flinn Guest

    "daniel" <> wrote in message
    news:...
    > Well I found something:
    > http://www.sgi.com/tech/stl/bit_vector.html states
    > ----------------
    > A bit_vector is essentially a vector<bool>: it is a Sequence that has
    > the same interface as vector. The main difference is that bit_vector
    > is optimized for space efficiency. A vector always requires at least
    > one byte per element, but a bit_vector only requires one bit per
    > element.
    >
    > Warning: The name bit_vector will be removed in a future release of
    > the STL. The only reason that bit_vector is a separate class, instead
    > of a template specialization of vector<bool>, is that this would
    > require partial specialization of templates. On compilers that support
    > partial specialization, bit_vector is a specialization of
    > vector<bool>. The name bit_vector is a typedef. This typedef is not
    > defined in the C++ standard, and is retained only for backward
    > compatibility.
    > ----------------
    > The std::bitset seems not to be useful because I cannot change its
    > size during runtime (no ->resize()).



    Which is why I also suggested boost::dynamic_bitset. See
    http://www.boost.org/libs/dynamic_bitset/dynamic_bitset.html

    Jeff F



    > So that should answer my first question.
    >
    > Thanks for the help, daniel
    >
    >
    >
    > "Jeff Flinn" <> wrote in message

    news:<ckgnot$q0h$>...
    > > "daniel" <> wrote in message
    > > news:...
    > > > 1)
    > > > is C++ smart enough to automatically use "bits" for bool or will
    > > > a bool have the size of a charcter (byte).
    > > > 2) The index of a vector is it an integer (4 byte) or a "long long"
    > > > with 8 bytes or something else?
    > > >
    > > > I ask because I want to use
    > > >
    > > > vector<bool> with a few billion entries exceeding the int32 range for
    > > > indexing and I want to use as few memory as possible for the whole
    > > > vector<bool>

    > >
    > > Use either std::bitset or boost::dynamic_bitset. Both "pack" the bits

    for
    > > better memory usage. You'll most likely have to modify these classes to
    > > account for indices > 32 bit.
    > >
    > > Jeff F
     
    Jeff Flinn, Oct 13, 2004
    #18
  19. Richard Herring wrote:

    > I'd say "intended it to be _possible_ to implement with bits".



    Well in the reference type definition



    // bit reference:
    class reference {
    friend class vector;
    reference();
    public:
    Ëœreference();
    operator bool() const;
    reference& operator=(const bool x);
    reference& operator=(const reference& x);
    void flip(); // flips the bit
    };



    the remark for flip() is "flips the bit". Which bit?



    --
    Ioannis Vranos

    http://www23.brinkster.com/noicys
     
    Ioannis Vranos, Oct 13, 2004
    #19
  20. In message <1097678253.422286@athnrd02>, Ioannis Vranos
    <> writes
    >Richard Herring wrote:
    >
    >> I'd say "intended it to be _possible_ to implement with bits".

    >
    >
    >Well in the reference type definition
    >
    >
    >// bit reference:
    >class reference {
    >friend class vector;
    >reference();
    >public:
    >Ëœreference();
    >operator bool() const;
    >reference& operator=(const bool x);
    >reference& operator=(const reference& x);
    >void flip(); // flips the bit
    >};
    >
    >
    >the remark for flip() is "flips the bit". Which bit?


    The one "simulated" by the reference in the note you quoted earlier:

    >reference is a class that simulates the behavior of references of a
    >single bit in vector<bool>.



    --
    Richard Herring
     
    Richard Herring, Oct 13, 2004
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Weng Tianxiang
    Replies:
    2
    Views:
    496
    Weng Tianxiang
    Jun 21, 2005
  2. Alex Vinokur

    vector<int> and vector<bool>

    Alex Vinokur, Jan 15, 2005, in forum: C++
    Replies:
    1
    Views:
    358
    Sharad Kala
    Jan 15, 2005
  3. pmatos
    Replies:
    6
    Views:
    24,135
  4. Bo Peng
    Replies:
    8
    Views:
    430
    Bob Hairgrove
    Feb 1, 2006
  5. Replies:
    8
    Views:
    2,007
    Csaba
    Feb 18, 2006
Loading...

Share This Page