A Better Choice?


Mike Copeland

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).
I'm looking for a better (STL container?) technique that might serve
my purpose but also avoid the fixed size constant declaration which is
causing compiler warnings. Also, I'm not especially concerned with
performance in this functionality. Any thoughts? TIA
 

Ian Collins

Mike said:
I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

What warnings?
The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).
I'm looking for a better (STL container?) technique that might serve
my purpose but also avoid the fixed size constant declaration which is
causing compiler warnings. Also, I'm not especially concerned with
performance in this functionality. Any thoughts? TIA

std::vector<char> xx( 60001, '\0' );
 

Victor Bazarov

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

If that's inside a function (automatic storage duration), it's a lot of
chars to allocate on the stack...
The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).
I'm looking for a better (STL container?) technique that might serve
my purpose but also avoid the fixed size constant declaration which is
causing compiler warnings. Also, I'm not especially concerned with
performance in this functionality. Any thoughts? TIA

You can do anything you want, but the easiest, I think, is to use a
vector of chars of a fixed length given at the initialization:

std::vector<char> DBI(60001, '\0');

The difference from what you have is that it will use the free store to
allocate the elements, and it will free that memory afterwards, so you
don't need to pay much attention to it.

If you need more elements in the same vector later, you can make the
vector grow by means of the 'resize' function or simply by pushing new
values at the end, which will cause it to resize itself.
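
For illustration, a minimal sketch of the vector approach just described (the
identifiers are illustrative, not taken from Mike's program):

#include <vector>

int main()
{
    std::vector<char> DBI(60001, '\0');  // elements live on the free store

    DBI.resize(120001, '\0');            // grow explicitly to a larger range...

    DBI.push_back('a');                  // ...or implicitly, one element at a time
}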

V
 

goran.pusic

Victor Bazarov said:
On 9/28/2013 6:15 PM, Mike Copeland wrote:
I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;
If that's inside a function (automatic storage duration), it's a lot of
chars to allocate on the stack...



59 kilobytes is a laughably small amount of data on the stack.

Keep your hands off my stack! ;-)
 

Gert-Jan de Vos

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;
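
For illustration, a minimal sketch of how such a map might be used (the
function and variable names here are hypothetical):

#include <map>

std::map<int, char> dbi;

void set_flag(int record_id, char flag)
{
    dbi[record_id] = flag;                        // inserts or overwrites
}

char get_flag(int record_id)
{
    auto it = dbi.find(record_id);
    return it != dbi.end() ? it->second : '\0';   // '\0' if the record was never set
}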
 

Victor Bazarov

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.
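
As a rough illustration of that arithmetic (the per-node layout assumed below
is only a guess; the real node type is implementation-specific):

#include <cstdio>
#include <map>
#include <utility>
#include <vector>

int main()
{
    std::size_t vec_bytes = 60001 * sizeof(char) + sizeof(std::vector<char>);

    // Assume a red-black tree node holds the key/value pair, three
    // pointers and a color field; this is only an estimate.
    std::size_t node_guess = sizeof(std::pair<const int, char>)
                           + 3 * sizeof(void*) + sizeof(int);
    std::size_t map_bytes  = 60000 * node_guess + sizeof(std::map<int, char>);

    std::printf("vector: ~%zu bytes\nmap:    ~%zu bytes\n", vec_bytes, map_bytes);
}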

V
 

Jorgen Grahn

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.

Yes, using a std::map would be a waste of memory. On the other hand:
(a) it would not be noticeable on a typical system!
(b) using std::map (or unordered_map) ought to make for more readable
code, if this is indeed a value->value mapping problem

So I agree with G-JdV about first choice.

/Jorgen
 

Victor Bazarov

On Sunday, 29 September 2013 00:15:35 UTC+2, Mike Copeland wrote:
I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.

Yes, using a std::map would be a waste of memory. On the other hand:
(a) it would not be noticeable on a typical system!

I am guessing it depends very much on the definition of "noticeable" and
on the definition of "typical".
(b) using std::map (or unordered_map) ought to make for more readable
code, if this is indeed a value->value mapping problem

Really? More readable, how? Instead of writing

mytable[recordnum] = thischar;

you will write

mytable[recordnum] = thischar;

?
So I agree with G-JdV about first choice.

/Jorgen

V
 

Gert-Jan de Vos

I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.

Of course Victor, a map of 60000 entries would take much more memory
than an array of the same size. It was this part of Mike's post that pointed
me to a map:

"I don't want to be limited by the current supported range (1...60000)"

I understand he has a database where each entry has an id in the range
1..60000, but this range may grow. There is no information about the
number of entries in the database. If it is more than a few % of the
range, an array or vector would indeed be better. Then you need
guarantees about the range of the ids you need to handle.

G-J
 

Mike Copeland

Of course Victor, a map of 60000 entries would take much more memory
than an array of the same size. It was this part of Mike's post that pointed
me to a map:

"I don't want to be limited by the current supported range (1...60000)"

I understand he has a database where each entry has an id in the range
1..60000, but this range may grow. There is no information about the
number of entries in the database. If it is more than a few % of the
range, an array or vector would indeed be better. Then you need
guarantees about the range of the ids you need to handle.
The application usually "fills" the range that's actually used, but
the range varies a great deal. That is, if the upper limit is 3000, the
range will be 90% filled; for 15000 that range is also about 90% filled,
etc.
Thus, the application establishes a range of numbers that are
supported, and whatever the range is will be ~90% used. However, the
application seldom uses a range > 4000, but there are a few times when
it can require up to 60000. (And I must program to the extreme limit,
of course...)
So, the used values will always be quite dense, but the total size of
the data object will vary greatly. (FWIW...)
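
Given that description, a hedged sketch of sizing the vector to whatever upper
limit the application establishes (the names below are hypothetical, not from
Mike's code):

#include <cstddef>
#include <vector>

std::vector<char> dbi;

void set_range(std::size_t upper_limit)        // e.g. 3000, 15000 or 60000
{
    dbi.assign(upper_limit + 1, '\0');         // reallocate to the new range
}

void set_flag(std::size_t record_num, char value)
{
    if (record_num >= dbi.size())
        dbi.resize(record_num + 1, '\0');      // grow on demand if a record exceeds the range
    dbi[record_num] = value;
}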
 

Öö Tiib

The application usually "fills" the range that's actually used, but
the range varies a great deal. That is, if the upper limit is 3000, the
range will be 90% filled; for 15000 that range is also about 90% filled,
etc.
Thus, the application establishes a range of numbers that are
supported, and whatever the range is will be ~90% used. However, the
application seldom uses a range > 4000, but there are a few times when
it can require up to 60000. (And I must program to the extreme limit,
of course...)

So, the used values will always be quite dense, but the total size of
the data object will vary greatly. (FWIW...)

Maybe I misunderstand something, but if it is a dense array (and 90% fill
is rather dense) then 'std::array<char,60001>', 'std::vector<char>' or
'std::valarray<char>' are the most obvious choices. If the target platform
has difficulties with contiguous memory blocks of 60001 elements, you can
use 'std::deque<char>', but ... what platform has difficulties with 60001
elements? All the pocket phones I can reach seem to work happily with it.
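
For reference, the containers named above, declared for the 60001-entry case
(purely illustrative):

#include <array>
#include <deque>
#include <valarray>
#include <vector>

std::array<char, 60001> a{};              // fixed size, zero-initialized
std::vector<char>       v(60001, '\0');   // contiguous, resizable
std::valarray<char>     va('\0', 60001);  // note the (value, count) argument order
std::deque<char>        d(60001, '\0');   // chunked storage, no single large block needed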
 

Jorgen Grahn

On 10/2/2013 2:48 PM, Gert-Jan de Vos wrote:
On Sunday, 29 September 2013 00:15:35 UTC+2, Mike Copeland wrote:
I have the following data declaration that is causing compiler
warnings:

char DBI[60001] = {'\0'} ;

The purpose of this data is to store a character value ('a'..'I') for
each corresponding record that has been read from a database. The
records are identified by a value that ranges from 1-60000. I can't use
a bitset here, as I need to store the character value associated with
each record. I don't want to be limited by the current supported range
(1...60000).

For mapping a number to a character with any number of entries, this would
be my first choice:

std::map<int, char> dbi;

I wonder what the difference with std::vector<char> would be if *all*
sixty thousand and one numbers in the range need a character. I mean,
the total size of a vector is 60001+sizeof(std::vector<char>) plus the
overhead of allocating the block, which is ~16 bytes. For a map it
would be 60000 * (sizeof(__node_type) + overhead), yes? Whatever
__node_type is, that is. And given that it usually needs to keep at
least three pointers, the value, and [as in VC++] a couple of chars,
that's like 27 bytes... So, is that, like, 40 times the memory? I
don't think it's worth considering unless the array is *really* sparse.

Yes, using a std::map would be a waste of memory. On the other hand:
(a) it would not be noticeable on a typical system!

I am guessing it depends very much on the definition of "noticeable" and
on the definition of "typical".

Well, I assume he's not working on a small embedded system, and not
using special debugging tools to measure memory usage.
(b) using std::map (or unordered_map) ought to make for more readable
code, if this is indeed a value->value mapping problem

Really? More readable, how? Instead of writing

mytable[recordnum] = thischar;

you will write

mytable[recordnum] = thischar;

?

He didn't say, but I suspect he will do more things to his map than
write to it. Perhaps one day he will want more than 60,001 entries,
too.

If you want a mapping, use a mapping container. Makes sense to me,
and I'm surprised you seem to argue against it. Or is it some kind
of Devil's Advocate thing?

/Jorgen
 
