A Better Choice?


Mike Copeland

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).
I'm looking for a better (STL container?) technique that might serve
my purpose but also avoid the fixed size constant declaration which is
causing compiler warnings. Also, I'm not especially concerned with
performance in this functionality. Any thoughts? TIA
 

Ian Collins

Mike said:
I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

What warnings?
The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).
I'm looking for a better (STL container?) technique that might serve
my purpose but also avoid the fixed size constant declaration which is
causing compiler warnings. Also, I'm not especially concerned with
performance in this functionality. Any thoughts? TIA

std::vector<char> xx( 60001, '\0' );
 

Victor Bazarov

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

If that's inside a function (automatic storage duration), it's a lot of
chars to allocate on the stack...
The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).
I'm looking for a better (STL container?) technique that might serve
my purpose but also avoid the fixed size constant declaration which is
causing compiler warnings. Also, I'm not especially concerned with
performance in this functionality. Any thoughts? TIA

You can do anything you want, but the easiest, I think, is to use a
vector of chars of a fixed length given at the initialization:

std::vector<char> DBI(60001, '\0');

The difference from what you have is that it will use the free store to
allocate the elements, and it will free that memory afterwards, so you
don't need to pay much attention to it.

If you need more elements in the same vector later, you can make the
vector grow by means of the 'resize' function or simply by pushing new
values at the end, which will cause it to resize itself.
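
For illustration, a minimal sketch of the vector approach just described (the
identifiers are illustrative, not taken from Mike's program):

#include <vector>

int main()
{
    std::vector<char> DBI(60001, '\0');  // elements live on the free store

    DBI.resize(120001, '\0');            // grow explicitly to a larger range...

    DBI.push_back('a');                  // ...or implicitly, one element at a time
}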

V
 

goran.pusic

Victor Bazarov said:
On 9/28/2013 6:15 PM, Mike Copeland wrote:
I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;
If that's inside a function (automatic storage duration), it's a lot of
chars to allocate on the stack...



59 kilobytes is a laughably small amount of data on the stack.

Keep your hands off my stack! ;-)
 

Gert-Jan de Vos

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;
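
For illustration, a minimal sketch of how such a map might be used (the
function and variable names here are hypothetical):

#include <map>

std::map<int, char> dbi;

void set_flag(int record_id, char flag)
{
    dbi[record_id] = flag;                        // inserts or overwrites
}

char get_flag(int record_id)
{
    auto it = dbi.find(record_id);
    return it != dbi.end() ? it->second : '\0';   // '\0' if the record was never set
}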
 

Victor Bazarov

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.
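
As a rough illustration of that arithmetic (the per-node layout assumed below
is only a guess; the real node type is implementation-specific):

#include <cstdio>
#include <map>
#include <utility>
#include <vector>

int main()
{
    std::size_t vec_bytes = 60001 * sizeof(char) + sizeof(std::vector<char>);

    // Assume a red-black tree node holds the key/value pair, three
    // pointers and a color field; this is only an estimate.
    std::size_t node_guess = sizeof(std::pair<const int, char>)
                           + 3 * sizeof(void*) + sizeof(int);
    std::size_t map_bytes  = 60000 * node_guess + sizeof(std::map<int, char>);

    std::printf("vector: ~%zu bytes\nmap:    ~%zu bytes\n", vec_bytes, map_bytes);
}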

V
 

Jorgen Grahn

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.

Yes, using a std::map would be a waste of memory. On the other hand:
(a) it would not be noticeable on a typical system!
(b) using std::map (or unordered_map) ought to make for more readable
code, if this is indeed a value->value mapping problem

So I agree with G-JdV about first choice.

/Jorgen
 

Victor Bazarov

On Sunday, 29 September 2013 00:15:35 UTC+2, Mike Copeland wrote:
I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.

Yes, using a std::map would be a waste of memory. On the other hand:
(a) it would not be noticeable on a typical system!

I am guessing it depends very much on the definition of "noticeable" and
on the definition of "typical".
(b) using std::map (or unordered_map) ought to make for more readable
code, if this is indeed a value->value mapping problem

Really? More readable, how? Instead of writing

mytable[recordnum] = thischar;

you will write

mytable[recordnum] = thischar;

?
So I agree with G-JdV about first choice.

/Jorgen

V
 

Gert-Jan de Vos

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.

Of course Victor, a map of 60000 entries would take much more memory
than an array of the same size. It was this part of Mike's post that pointed
me to a map:

"I don't want to be limited by the current supported range (1...60000)"

I understand he has a database where each entry has an id in the range
1..60000, but this range may grow. There is no information about the
number of entries in the database. If it is more than a few % of the
range, an array or vector would indeed be better. Then you need
guarantees about the range of the ids you need to handle.

G-J
 

Mike Copeland

Of course Victor, a map of 60000 entries would take much more memory
than an array of the same size. It was this part of Mike's post that pointed
me to a map:

"I don't want to be limited by the current supported range (1...60000)"

I understand he has a database where each entry has an id in the range
1..60000, but this range may grow. There is no information about the
number of entries in the database. If it is more than a few % of the
range, an array or vector would indeed be better. Then you need
guarantees about the range of the ids you need to handle.
The application usually "fills" the range that's actually used, but
the range varies a great deal. That is, if the upper limit is 3000, the
range will be 90% filled; for 15000 that range is also about 90% filled,
etc.
Thus, the application establishes a range of numbers that are
supported, and whatever the range is will be ~90% used. However, the
application seldom uses a range > 4000, but there are a few times when
it can require up to 60000. (And I must program to the extreme limit,
of course...)
So, the used values will always be quite dense, but the total size of
the data object will vary greatly. (FWIW...)
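
Given that description, a hedged sketch of sizing the vector to whatever upper
limit the application establishes (the names below are hypothetical, not from
Mike's code):

#include <cstddef>
#include <vector>

std::vector<char> dbi;

void set_range(std::size_t upper_limit)        // e.g. 3000, 15000 or 60000
{
    dbi.assign(upper_limit + 1, '\0');         // reallocate to the new range
}

void set_flag(std::size_t record_num, char value)
{
    if (record_num >= dbi.size())
        dbi.resize(record_num + 1, '\0');      // grow on demand if a record exceeds the range
    dbi[record_num] = value;
}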
 

Öö Tiib

The application usually "fills" the range that's actually used, but
the range varies a great deal. That is, if the upper limit is 3000, the
range will be 90% filled; for 15000 that range is also about 90% filled,
etc.
Thus, the application establishes a range of numbers that are
supported, and whatever the range is will be ~90% used. However, the
application seldom uses a range > 4000, but there are a few times when
it can require up to 60000. (And I must program to the extreme limit,
of course...)

So, the used values will always be quite dense, but the total size of
the data object will vary greatly. (FWIW...)

Maybe I misunderstand something, but if it is a dense array (and 90% fill
is rather dense) then 'std::array<char,60001>', 'std::vector<char>' or
'std::valarray<char>' are the most obvious choices. If the target platform
has difficulties with contiguous memory blocks of 60001 elements, you can
use 'std::deque<char>', but ... what platform has difficulties with 60001
elements? All the pocket phones I can reach seem to work happily with it.
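
For reference, the containers named above, declared for the 60001-entry case
(purely illustrative):

#include <array>
#include <deque>
#include <valarray>
#include <vector>

std::array<char, 60001> a{};              // fixed size, zero-initialized
std::vector<char>       v(60001, '\0');   // contiguous, resizable
std::valarray<char>     va('\0', 60001);  // note the (value, count) argument order
std::deque<char>        d(60001, '\0');   // chunked storage, no single large block needed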
 

Jorgen Grahn

On 10/2/2013 2:48 PM, Gert-Jan de Vos wrote:
On Sunday, 29 September 2013 00:15:35 UTC+2, Mike Copeland wrote:
I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.

Yes, using a std::map would be a waste of memory. On the other hand:
(a) it would not be noticeable on a typical system!

I am guessing it depends very much on the definition of "noticeable" and
on the definition of "typical".

Well, I assume he's not working on a small embedded system, and not
using special debugging tools to measure memory usage.
(b) using std::map (or unordered_map) ought to make for more readable
code, if this is indeed a value->value mapping problem

Really? More readable, how? Instead of writing

mytable[recordnum] = thischar;

you will write

mytable[recordnum] = thischar;

?

He didn't say, but I suspect he will do more things to his map than
write to it. Perhaps one day he will want more than 60,001 entries,
too.

If you want a mapping, use a mapping container. Makes sense to me,
and I'm surprised you seem to argue against it. Or is it some kind
of Devil's Advocate thing?

/Jorgen
 
