vector and bool

daniel · Oct 12, 2004

1)
is C++ smart enough to automatically use "bits" for bool or will
a bool have the size of a charcter (byte).
2) The index of a vector is it an integer (4 byte) or a "long long"
with 8 bytes or something else?

I ask because I want to use

vector<bool> with a few billion entries exceeding the int32 range for
indexing and I want to use as few memory as possible for the whole
vector<bool>

e.g.
vector<bool> B;
B.resize(2^34);

thanks, daniel

Ivan Vecerina · Oct 12, 2004

daniel said:
1)
is C++ smart enough to automatically use "bits" for bool or will
a bool have the size of a charcter (byte).

The standard C++ library is actually required to specialize the
std::vector for bool, and to provide 'compact' storage.
[NB: for many uses, this is actually seen as an inconvenience]

2) The index of a vector is it an integer (4 byte) or a "long long"
with 8 bytes or something else?

This depends on the implementation, but it may well be limited
to size_t, or a 4-byte unsigned integer on many platforms.

I ask because I want to use

vector<bool> with a few billion entries exceeding the int32 range for
indexing and I want to use as few memory as possible for the whole
vector<bool>

May we ask for what type of application or what kind of data?
In many cases, using some form of compression might be better
that allocating gigabytes of RAM. Does your system have enough
memory?

Regards,
Ivan

JKop · Oct 12, 2004

daniel posted:

1)
is C++ smart enough to automatically use "bits" for bool or will
a bool have the size of a charcter (byte).

I believe that sizeof(bool) must be >= 1, which implies that a "bool" is of
one byte. But that doesn't mean that your particular system might do
something different in the background.

It's to do with the actual CPU of the system, what is the smallest unit of
memory that can be accessed, which is called a "byte", which 9 times out of
10 is equal to "8 bits". Code which would store 8 bools in 1 byte would be
slower and (ironically) more memory consumptive as the code which
manipulates the byte would have to be kept in memory.

2) The index of a vector is it an integer (4 byte) or a "long long"
with 8 bytes or something else?

There is no such intrinsic type "long long".

There is a type "long int", which is guaranteed to be atleast 32-bit on all
systems, but that doesn't guarantee that it'll be 4 bytes long. For
instance:

If "char" were 9-Bit, then although "long" would be atleast 32-Bit, it
wouldn't necessarily be 4 bytes long, as 4 bytes = 36 bits on that system.

I ask because I want to use

vector<bool> with a few billion entries exceeding the int32 range for
indexing and I want to use as few memory as possible for the whole
vector<bool>

e.g.
vector<bool> B;
B.resize(2^34);

thanks, daniel

I'm sure there's people here that've done this before. . .

I'm not one of them!

-JKop

Rolf Magnus · Oct 12, 2004

JKop said:
daniel posted:

I believe that sizeof(bool) must be >= 1, which implies that a "bool" is
of one byte. But that doesn't mean that your particular system might do
something different in the background.

AFAIK, some systems use even 4 bytes for a bool, because they can access it
faster.

It's to do with the actual CPU of the system, what is the smallest unit of
memory that can be accessed, which is called a "byte", which 9 times out
of 10 is equal to "8 bits".

However, a CPU byte might be different from a C++ byte. I have heared there
are signal processors that have a smallest memory unit of 24 bits and still
the C++ implementation gives you a char with 8 bit. And there are 4 bit
CPUS which would need to combine two bytes to meet the minimum requirement
of 8 bits for char. I guess there might be C implelemntations, but probably
no C++ ones on such CPUs, but C has the same requirements in that case.

Code which would store 8 bools in 1 byte would be slower and (ironically)
more memory consumptive as the code which manipulates the byte would have
to be kept in memory.

There is no such intrinsic type "long long".

There is a type "long int", which is guaranteed to be atleast 32-bit on
all systems, but that doesn't guarantee that it'll be 4 bytes long.

That's right.

For instance:

If "char" were 9-Bit, then although "long" would be atleast 32-Bit, it
wouldn't necessarily be 4 bytes long, as 4 bytes = 36 bits on that system.

That example isn't really chosen very well. 3 bytes would only be 27 bits,
so you would still need 4 bytes to meet the requirement of at least 32
bits.

assaarpa · Oct 12, 2004

Code which would store 8 bools in 1 byte would be

slower and (ironically) more memory consumptive as the code which
manipulates the byte would have to be kept in memory.

Who gives a damn if the code takes 10 or 30 bytes, when the guy wants to
store billions of bool's and the choises are 128 MB vs. 1024 MB for storing
one billion flags. And you cannot be so sure about the performance either,
very often smaller data translates to better performance in the real world
applications.

Ioannis Vranos · Oct 12, 2004

daniel said:
1)
is C++ smart enough to automatically use "bits" for bool or will
a bool have the size of a charcter (byte).

In C++ it is

1 <= sizeof(bool) <= sizeof(long)

However especially for vector<bool> the standard says:

"To optimize space allocation, a specialization of vector for bool
elements is provided:

....

reference is a class that simulates the behaviour of references of a
single bit in vector<bool>."

"The C++ Programming Language 3rd Edition" may help here:

"16.3.11 Vector<bool>

The specialization (§13.5) vector<bool> is provided as a compact vector
of bool. A bool variable is addressable, so it takes up at least one
byte. However, it is easy to implement vector<bool> so that each element
takes up only a bit.
The usual vector operations work for vector<bool> and retain their usual
meanings. In particular, subscripting and iteration work as expected.

For example:
void f(vector<bool>& v)
{
// iterate using subscripting
for (int i = 0; i<v.size() ; ++i) cin >> v ;

typedef vector<bool>: :const_ iterator VI;

// iterate using iterators
for (VI p = v.begin() ; p!=v.end() ; ++p) cout<<*p;
}

To achieve this, an implementation must simulate addressing of a single
bit. Since a pointer cannot address a unit of memory smaller than a
byte, vector<bool>: :iterator cannot be a pointer. In particular,
one cannot rely on bool* as an iterator for a vector<bool>:

void f(vector<bool>& v)
{
bool* p = v.begin() ; // error: type mismatch
// ...
}

A technique for addressing a single bit is outlined in §17.5.3.

The library also provides bitset as a set of Boolean values with Boolean
set operations (§17.5.3)."

2) The index of a vector is it an integer (4 byte) or a "long long"
with 8 bytes or something else?

Click to expand...

It is vector<T>::size_type which is some unsigned integer type.

Example:

vector<int> someVec(100);

for(vector<int>::size_type i=0; i<someVec.size(); ++i)
// ...

I ask because I want to use

vector<bool> with a few billion entries exceeding the int32 range for
indexing and I want to use as few memory as possible for the whole
vector<bool>

e.g.
vector<bool> B;
B.resize(2^34);

Click to expand...

Well, it is not guaranteed that bits will be used, although in most
cases this is what should happen.

You may also want to check bitset which uses bits for sure.

Ioannis Vranos · Oct 12, 2004

Ioannis said:
Well, it is not guaranteed that bits will be used, although in most
cases this is what should happen.

You may also want to check bitset which uses bits for sure.

My mistake! It is *guaranteed* that bits are used:

From the standard:

"23.2.5 Class vector<bool>

To optimize space allocation, a specialization of vector for bool
elements is provided:

....

// bit reference:
class reference {
friend class vector;
reference();
public:
˜reference();
operator bool() const;
reference& operator=(const bool x);
reference& operator=(const reference& x);
void flip(); // flips the bit
};

....

void flip(); // flips all bits

....

reference is a class that simulates the behavior of references of a
single bit in vector<bool>."

Ron Natalie · Oct 12, 2004

JKop said:
I believe that sizeof(bool) must be >= 1, which implies that a "bool" is of
one byte.

Actually it implies bool is at least one byte. Many systems
sizeof(bool) == sizeof(int).

But that doesn't mean that your particular system might do
something different in the background.

Actually, for vector it is REQUIRED to do something different. I think
it's stupid but the standard requires vector<bool> to be specialized to
act differently than vectors of anything else. It should have been a
different class.

It's to do with the actual CPU of the system, what is the smallest unit of
memory that can be accessed, which is called a "byte", which 9 times out of
10 is equal to "8 bits".

Actually it has nothing to do with the actual CPU. C++ could just as
easily have bit types. There are machines where the smallest
addressable unit of storage is NOT a byte. A byte is carved up out of
a larger word.

However, getting back on topic, a byte (and it's related C++ type char)
is the smallest unit of memory C++ will address. This unfornuately is
another C/C++ stupidity. There really shouldn't be a hard relationship
between "character" and "atomic memory unit." Of course, this is 30
some years of hindsight speaking.

Richard Herring · Oct 12, 2004

Ioannis Vranos said:
My mistake! It is *guaranteed* that bits are used:

I don't think so.

From the standard:

"23.2.5 Class vector<bool>

To optimize space allocation, a specialization of vector for bool
elements is provided:

...

// bit reference:
class reference {
friend class vector;
reference();
public:
Ëœreference();
operator bool() const;
reference& operator=(const bool x);
reference& operator=(const reference& x);
void flip(); // flips the bit
};

...

void flip(); // flips all bits

...

reference is a class that simulates the behavior of references of a
single bit in vector<bool>."

That just redefines the semantics of operator[] etc. to return a proxy
instead of an actual bool &, thus removing the requirement for elements
to have distinct addresses. Nothing there says that it _must_ use only
use one bit per element.

Jeff Flinn · Oct 12, 2004

daniel said:
1)
is C++ smart enough to automatically use "bits" for bool or will
a bool have the size of a charcter (byte).
2) The index of a vector is it an integer (4 byte) or a "long long"
with 8 bytes or something else?

I ask because I want to use

vector<bool> with a few billion entries exceeding the int32 range for
indexing and I want to use as few memory as possible for the whole
vector<bool>

Use either std::bitset or boost::dynamic_bitset. Both "pack" the bits for
better memory usage. You'll most likely have to modify these classes to
account for indices > 32 bit.

Jeff F

Ioannis Vranos · Oct 12, 2004

Richard said:
reference is a class that simulates the behavior of references of a
single bit in vector<bool>."

Click to expand...

That just redefines the semantics of operator[] etc. to return a proxy
instead of an actual bool &, thus removing the requirement for elements
to have distinct addresses. Nothing there says that it _must_ use only
use one bit per element.

Well since it says that

"reference is a class that simulates the behavior of references of a
single bit in vector<bool>"

I think this implies that vector<bool> should be implemented using bits.

Richard Herring · Oct 12, 2004

Ioannis Vranos said:
Richard said:

reference is a class that simulates the behavior of references of a
single bit in vector<bool>."

Click to expand...

That just redefines the semantics of operator[] etc. to return a
proxy instead of an actual bool &, thus removing the requirement for
elements to have distinct addresses. Nothing there says that it _must_
use only use one bit per element.

Click to expand...

Well since it says that

"reference is a class that simulates the behavior of references of a
single bit in vector<bool>"

I think this implies that vector<bool> should be implemented using bits.

For a weak enough meaning of "should", I agree that that's the
intention. But I don't think anything in the Standard rules out a naive
implementation that makes no attempt to optimise, and simply implements
vector<bool> in terms of an underlying vector<some_int_type>.

Ioannis Vranos · Oct 12, 2004

Richard said:
For a weak enough meaning of "should", I agree that that's the
intention. But I don't think anything in the Standard rules out a naive
implementation that makes no attempt to optimise, and simply implements
vector<bool> in terms of an underlying vector<some_int_type>.

There is not an explicit prohibition on this, however explicit
prohibitions are rare in the standard.

Also the implementer has to implement this particular reference:

// bit reference:
class reference {
friend class vector;
reference();
public:
˜reference();
operator bool() const;
reference& operator=(const bool x);
reference& operator=(const reference& x);
void flip(); // flips the bit
};

with the note:

reference is a class that simulates the behavior of references of a
single bit in vector<bool>.

and this particular member function:

void flip(); // flips all bits

Also vector<bool> is mentioned as a separate case in the standard, and
thus I think the legislator intended this to be implemented only with
bits.

Nonenix (astalavista.net) · Oct 12, 2004

hmm not sure but i think somethink lieke that can be done with structures i
mena this:

struct item {

unsigned int IsTrue:1;
unsigned int watever:1;
unsigned int number:14;

}

some cpu dosen't support that or the compiler doesen't use it because of
speed loss

but if it work is should looks like this in the ram:

0 | 1 | 2.............15 |
IsTrue | waterver | number |

you could use it like an int but IMPORTANT

if you forgot the unsignet the int will always be 0

gnu says this:

...cpp: In function 'int main()':
....cpp:13: warning: comparison is always 0 due to width of bitfield

here is also the name they#re called bitfields

hope this can help ;-)

daniel · Oct 12, 2004

Well I found something:
http://www.sgi.com/tech/stl/bit_vector.html states
----------------
A bit_vector is essentially a vector<bool>: it is a Sequence that has
the same interface as vector. The main difference is that bit_vector
is optimized for space efficiency. A vector always requires at least
one byte per element, but a bit_vector only requires one bit per
element.

Warning: The name bit_vector will be removed in a future release of
the STL. The only reason that bit_vector is a separate class, instead
of a template specialization of vector<bool>, is that this would
require partial specialization of templates. On compilers that support
partial specialization, bit_vector is a specialization of
vector<bool>. The name bit_vector is a typedef. This typedef is not
defined in the C++ standard, and is retained only for backward
compatibility.
----------------
The std::bitset seems not to be useful because I cannot change its
size during runtime (no ->resize()).

So that should answer my first question.

Thanks for the help, daniel

Ioannis Vranos · Oct 13, 2004

Nonenix said:
hmm not sure but i think somethink lieke that can be done with structures i
mena this:

struct item {

unsigned int IsTrue:1;
unsigned int watever:1;
unsigned int number:14;

}

some cpu dosen't support that or the compiler doesen't use it because of
speed loss

but if it work is should looks like this in the ram:

0 | 1 | 2.............15 |
IsTrue | waterver | number |

you could use it like an int but IMPORTANT

if you forgot the unsignet the int will always be 0

Why?

gnu says this:

..cpp: In function 'int main()':
...cpp:13: warning: comparison is always 0 due to width of bitfield

May you provide that code?

Actually it could be even like this:

struct item
{
bool a:1;
bool b:1;
bool c:1;
bool d:1;
};

Richard Herring · Oct 13, 2004

Ioannis Vranos said:
There is not an explicit prohibition on this, however explicit
prohibitions are rare in the standard.

Just so. The standard specifies an interface and complexity (speed)
requirements, not the mechanism used to implement them. But it says
nothing prescriptive about the memory requirements for containers.

Also the implementer has to implement this particular reference:

// bit reference:
class reference {
friend class vector;
reference();
public:
Ëœreference();
operator bool() const;
reference& operator=(const bool x);
reference& operator=(const reference& x);
void flip(); // flips the bit
};

Yes, and the fact that this is what operator[] is mandated to return is
what makes it *possible* for the implementor to provide access to data
packed into something smaller than an addressable unit.

with the note:

reference is a class that simulates the behavior of references of a
single bit in vector<bool>.

Yes; note the word "simulates".

and this particular member function:

void flip(); // flips all bits

Also vector<bool> is mentioned as a separate case in the standard, and
thus I think the legislator intended this to be implemented only with
bits.

I'd say "intended it to be _possible_ to implement with bits".

Jeff Flinn · Oct 13, 2004

daniel said:
Well I found something:
http://www.sgi.com/tech/stl/bit_vector.html states
----------------
A bit_vector is essentially a vector<bool>: it is a Sequence that has
the same interface as vector. The main difference is that bit_vector
is optimized for space efficiency. A vector always requires at least
one byte per element, but a bit_vector only requires one bit per
element.

Warning: The name bit_vector will be removed in a future release of
the STL. The only reason that bit_vector is a separate class, instead
of a template specialization of vector<bool>, is that this would
require partial specialization of templates. On compilers that support
partial specialization, bit_vector is a specialization of
vector<bool>. The name bit_vector is a typedef. This typedef is not
defined in the C++ standard, and is retained only for backward
compatibility.

Which is why I also suggested boost::dynamic_bitset. See
http://www.boost.org/libs/dynamic_bitset/dynamic_bitset.html

Jeff F

Ioannis Vranos · Oct 13, 2004

Richard said:
I'd say "intended it to be _possible_ to implement with bits".

Well in the reference type definition

// bit reference:
class reference {
friend class vector;
reference();
public:
Ëœreference();
operator bool() const;
reference& operator=(const bool x);
reference& operator=(const reference& x);
void flip(); // flips the bit
};

the remark for flip() is "flips the bit". Which bit?

Richard Herring · Oct 13, 2004

Ioannis Vranos said:
Well in the reference type definition

// bit reference:
class reference {
friend class vector;
reference();
public:
Ëœreference();
operator bool() const;
reference& operator=(const bool x);
reference& operator=(const reference& x);
void flip(); // flips the bit
};

the remark for flip() is "flips the bit". Which bit?

The one "simulated" by the reference in the note you quoted earlier:

Unclear vector<bool> constructor element insertion	2	Apr 18, 2014
Question about vector<bool>	5	Jan 15, 2009
vector<bool>	6	Dec 26, 2003
STL Vector Access	1	Feb 6, 2011
Erratic Container Behavior (list & vector)	2	Apr 13, 2014
std::vector<bool> specialisation performance issue	8	Dec 8, 2006
in reply to: DWORD and bool to binary	6	Nov 26, 2009
Questions about std::vector and std::deque	2	Mar 11, 2012

vector and bool

daniel

Ivan Vecerina

JKop

Rolf Magnus

assaarpa

Ioannis Vranos

Ioannis Vranos

Ron Natalie

Richard Herring

Jeff Flinn

Ioannis Vranos

Richard Herring

Ioannis Vranos

Nonenix (astalavista.net)

daniel

Ioannis Vranos

Richard Herring

Jeff Flinn

Ioannis Vranos

Richard Herring

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads