Am I running out of memory?


Rich S

Or am I just losing it?

I have a small C++ app which generates millions of combinations of bitset
class objects and adds them to an STL map.

(On a side note, thanks to all who responded to my earlier posting about
increasing the program's speed.)

Anyway, I calculate my bitset class object size to be 8 bytes and when
I run this in debug mode I can generate 19,675,656 objects before an
exception is thrown from AfxMem.cpp line 341.

What I don't understand is why I only get that many and not a whole lot
more. 19,675,656 times 8 is 157,405,248 bytes, roughly 150 MB, which is
significantly less than the 1 GB of memory I have on this machine. I
understand that I don't get the whole gigabyte, and thanks to virtual memory
I have 2 GB of address space available, but I don't understand why I'm not
even getting close to that mark.

Do you think it's truly out of memory, or am I trampling on memory I
shouldn't be? I don't think I am.

One other thing is that I have exception handlers in place but they are
not catching the exception, so I'm not sure what's going on there.

Thanks for any help.
Regards
 

Victor Bazarov

Rich said:
Or am I just losing it?

I have a small C++ app which generates millions of combinations of bitset
class objects and adds them to an STL map.

(On a side note, thanks to all who responded to my earlier posting about
increasing the program's speed.)

Anyway, I calculate my bitset class object size to be 8 bytes and when
I run this in debug mode I can generate 19,675,656 objects before an
exception is thrown from AfxMem.cpp line 341.

I have no idea what 'AfxMem.cpp' is or what is on its line 341. Did
you mean to post it and forgot?
What I don't understand is why I only get that many and not a whole
lot more. 19,675,656 times 8 is 157,405,248 bytes, roughly 150 MB, which is
significantly less than the 1 GB of memory I have on this machine. I
understand that I don't get the whole gigabyte, and thanks to virtual
memory I have 2 GB of address space available, but I don't understand why
I'm not even getting close to that mark.

How much overhead does your system produce on top of every object
created in the free store? Perhaps you should try a custom allocator
or something... 19 million objects of the same size are just begging
to be pre-allocated in an array or something.
Do you think it's truly out of memory, or am I trampling on memory I
shouldn't be? I don't think I am.

Then who am I to contradict you?
One other thing is that I have exception handlers in place but they
are not catching the exception, so I'm not sure what's going on there.

What exception handlers? What exception is thrown?

V
 

Kai-Uwe Bux

Rich said:
Or am I just losing it?

I have a small C++ app which generates millions of combinations of bitset
class objects and adds them to an STL map.

(On a side note, thanks to all who responded to my earlier posting about
increasing the program's speed.)

Anyway, I calculate my bitset class object size to be 8 bytes and when
I run this in debug mode I can generate 19,675,656 objects before an
exception is thrown from AfxMem.cpp line 341.

What I don't understand is why I only get that many and not a whole lot
more. 19,675,656 times 8 is 157,405,248 bytes, roughly 150 MB, which is
significantly less than the 1 GB of memory I have on this machine. I
understand that I don't get the whole gigabyte, and thanks to virtual memory
I have 2 GB of address space available, but I don't understand why I'm not
even getting close to that mark.

Hm, your std::map (like std::set< T >) is very likely implemented as some
kind of balanced tree, i.e., your entries will be stored in nodes that look
somewhat like this:

struct node {
    T         data_field;     // for a map, this is really a key/value pair
    node*     right;
    node*     left;
    node*     parent;
    some_type balancing_info;
};

Also, keep in mind that the new/delete mechanism is likely to add another
word of memory to keep track of how much memory can be reclaimed at delete.
Thus, depending on how balancing information is handled, you might get
slightly above 32 bytes per node. In that case, a call to new might
actually reserve something like 48 or 64 bytes per node, which would be 6
to 8 times more than you calculated.
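
You can get a feel for the raw node size by measuring a node-like struct
directly. This is only a sketch; the field layout is an assumption, not your
library's actual node:

#include <bitset>
#include <cstdio>

struct node {
    std::bitset<64> data_field;   // the 8-byte payload from the original post
    node*           right;
    node*           left;
    node*           parent;
    int             balancing_info;
};

int main()
{
    // Typically prints 24 on a 32-bit build; the heap manager's own
    // bookkeeping and rounding push the real cost of each 'new node' higher.
    std::printf("sizeof(node) = %u\n", static_cast<unsigned>(sizeof(node)));
    return 0;
}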
 

Pete Becker

Kai-Uwe Bux said:
Also, keep in mind that the new/delete mechanism is likely to add another
word of memory to keep track of how much memory can be reclaimed at delete.
Thus, depending on how balancing information is handled, you might get
slightly above 32 bytes per node. In that case, a call to new might
actually reserve something like 48 or 64 bytes per node, which would be 6
to 8 times more than you calculated.

Yup, map and set use far more memory than one might expect, especially
for small data objects. A vector or a deque would make much better use
of available memory when you need to manage such a large number of objects.
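
For instance, a rough sketch of the vector approach (assuming the 8-byte
objects are something like std::bitset<64>; the fill loop is just a
placeholder):

#include <bitset>
#include <cstddef>
#include <vector>

int main()
{
    std::vector< std::bitset<64> > v;
    v.reserve(20000000);                  // ~160 MB in one block, no per-element overhead
    for (std::size_t i = 0; i < 20000000; ++i)
        v.push_back(std::bitset<64>(i));  // placeholder data
    return 0;
}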
 

Rich S

Thanks for replying.

I'm never sure how many objects I'll need; it depends on the data. I
could have up to 300 million, but it's doubtful. I understand that I
could never actually hold that many, but I was hoping for a higher
number than 19 million.

I'm trying to catch std::exception and std::bad_alloc, but I don't have
the file handy so I'm not sure what is being thrown. I'll check it out
after the next crash.
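
For reference, a stripped-down sketch of the pattern I'm after (the key type
and the data here are placeholders, not my real bitset class):

#include <iostream>
#include <map>
#include <new>        // std::bad_alloc

int main()
{
    std::map<unsigned long long, unsigned long> counts;
    try {
        for (unsigned long long i = 0; ; ++i)
            ++counts[i * 2654435761ULL];   // stand-in for the generated combinations
    }
    catch (const std::bad_alloc&) {
        std::cerr << "out of memory after " << counts.size() << " entries\n";
        // ...write the results gathered so far to disk here...
    }
    return 0;
}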

Thanks
 

Rich S

OK, thanks. I was going to try a vector, but I calculated the bytes to be
higher than that of the map, so I figured why bother. I'll give it a
try, though.
 

Rich S

I'm trying the vector class now, but it's running really, really slow; I
don't think it'll get to 19 million any time soon, and this is in
release mode.

Since I was trying to emulate the map, what I did was create the
bitset object and do a find on the vector; if it's found, I update an
internal counter, otherwise I add the object to the vector.

It must be the find call that's slowing it down drastically.

Any other thoughts?

Thanks again
 

Gianni Mariani

Rich said:
I'm trying the vector class now, but it's running really, really slow; I
don't think it'll get to 19 million any time soon, and this is in
release mode.

Since I was trying to emulate the map, what I did was create the
bitset object and do a find on the vector; if it's found, I update an
internal counter, otherwise I add the object to the vector.

It must be the find call that's slowing it down drastically.

Any other thoughts?

Searching a map is O(log N); searching a vector is O(N).

G
 

David White

Rich said:
I'm trying the vector class now, but it's running really, really slow; I
don't think it'll get to 19 million any time soon, and this is in
release mode.

Since I was trying to emulate the map, what I did was create the
bitset object and do a find on the vector; if it's found, I update an
internal counter, otherwise I add the object to the vector.

It must be the find call that's slowing it down drastically.

Any other thoughts?

You could try a map that uses a more memory efficient algorithm. A hash
table would probably be better. I have a similar problem with a program that
likes to store as many small objects as it can in a std::set object.
Depending on the exact object size it can store maybe 15 million before it
runs out of memory. The std::set also uses nodes with three pointers, so I'm
going to try a hash-table instead.

Are you deleting objects from the map as well as adding them? If not you can
use a memory allocator that has zero overhead per allocated block and save a
few more bytes per allocation. Such an allocator can only free all the
memory it's allocated at once and not individually allocated blocks, but
that's all you need if you aren't deleting objects except when the map is
destroyed. I use that kind of allocator with my std::set of small objects.
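
The idea in a nutshell, as a rough sketch (not the allocator I actually use,
and plugging it into std::set still needs a small std::allocator wrapper
around it):

#include <cstddef>
#include <vector>

class Arena {
public:
    explicit Arena(std::size_t chunk_size = 1 << 20)   // 1 MB chunks, arbitrary
        : chunk_size_(chunk_size), base_(0), pos_(0), end_(0) {}

    ~Arena()                                            // all memory freed at once
    {
        for (std::size_t i = 0; i < chunks_.size(); ++i)
            delete[] chunks_[i];
    }

    void* allocate(std::size_t n)
    {
        n = (n + 7) & ~std::size_t(7);                  // keep 8-byte alignment (assumed adequate)
        if (pos_ + n > end_) {                          // start a fresh chunk
            std::size_t sz = n > chunk_size_ ? n : chunk_size_;
            base_ = new char[sz];
            chunks_.push_back(base_);
            pos_ = 0;
            end_ = sz;
        }
        void* p = base_ + pos_;                         // no per-block header at all
        pos_ += n;
        return p;
    }
    // note: no per-block deallocate(), by design

private:
    std::vector<char*> chunks_;
    std::size_t chunk_size_;
    char* base_;
    std::size_t pos_, end_;
};

Because nothing is ever handed back individually, there is no per-block
header and no free-list bookkeeping; the only waste is the unused tail of
the last chunk.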

DW
 

ravinderthakur

If you are running your application on Windows and are not sure what
type of exceptions are being thrown, you can look up structured
exception handling (SEH) on Windows to see if those exceptions are
being thrown.

Windows throws these exceptions in certain cases (e.g. when no more
stack space can be allocated).
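
A bare-bones sketch of where such a handler would sit (this is
MSVC-specific SEH, not standard C++, and the work function is just a
placeholder):

#include <windows.h>
#include <cstdio>

static void do_work()
{
    // ...the allocation-heavy code would go here...
}

int main()
{
    __try {
        do_work();
    }
    __except (EXCEPTION_EXECUTE_HANDLER) {
        // GetExceptionCode() reports which structured exception was raised
        std::printf("structured exception 0x%08lX\n", GetExceptionCode());
    }
    return 0;
}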


thanks
rt
 

msalters

Rich S wrote:
I'm trying the vector class now, but it's running really, really slow; I
don't think it'll get to 19 million any time soon, and this is in
release mode.

Since I was trying to emulate the map, what I did was create the
bitset object and do a find on the vector; if it's found, I update an
internal counter, otherwise I add the object to the vector.

It must be the find call that's slowing it down drastically.

Yep, std::find is slow in comparison. Sorting may be expensive, but it's
not a whole lot slower. I'd wrap access to the vector and keep a bool
isSorted flag. When you add an object, don't check for duplicates; only
check for duplicates if you run out of memory, or if some external
condition requires uniqueness. Furthermore, once you know your vector is
full and won't change, you can also sort it.

This of course doesn't work if the vector changes a lot and you have
find() calls for reasons other than checking for duplicates; those find
calls would require isSorted to be true. Even then, it may be possible
to keep a small second vector of unsorted items (say < 100). Your find
would first check the sorted collection and then the unsorted items.
If the unsorted collection becomes too big (== 100), you merge it into
the sorted set. This cuts the number of re-sorts by about 99%, at the
cost of making each lookup O(log N + M) instead of O(log N)
(e.g. N = 19 million, M = 100).
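
Something along these lines, as a rough sketch (it stores plain 64-bit
values instead of your bitset objects and leaves out the duplicate
counting):

#include <algorithm>
#include <cstddef>
#include <vector>

class CombinationStore {
public:
    bool contains(unsigned long long v) const
    {
        return std::binary_search(sorted_.begin(), sorted_.end(), v)
            || std::find(recent_.begin(), recent_.end(), v) != recent_.end();
    }

    void insert(unsigned long long v)
    {
        if (contains(v))
            return;
        recent_.push_back(v);
        if (recent_.size() >= 100) {                    // merge threshold M
            std::size_t old_size = sorted_.size();
            sorted_.insert(sorted_.end(), recent_.begin(), recent_.end());
            std::sort(sorted_.begin() + old_size, sorted_.end());
            std::inplace_merge(sorted_.begin(),
                               sorted_.begin() + old_size,
                               sorted_.end());
            recent_.clear();
        }
    }

private:
    std::vector<unsigned long long> sorted_;   // large, kept sorted
    std::vector<unsigned long long> recent_;   // small unsorted buffer
};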

HTH,
Michiel Salters
 

Rich S.

David White said:
You could try a map that uses a more memory efficient algorithm. A hash
table would probably be better. I have a similar problem with a program that
likes to store as many small objects as it can in a std::set object.
Depending on the exact object size it can store maybe 15 million before it
runs out of memory. The std::set also uses nodes with three pointers, so I'm
going to try a hash-table instead.

Are you deleting objects from the map as well as adding them? If not you can
use a memory allocator that has zero overhead per allocated block and save a
few more bytes per allocation. Such an allocator can only free all the
memory it's allocated at once and not individually allocated blocks, but
that's all you need if you aren't deleting objects except when the map is
destroyed. I use that kind of allocator with my std::set of small objects.

DW

Thanks. In one of the STL implementations I saw a hash map, so perhaps I'll
give that a try. I've worked with boost, stlport and sxl but I'm not sure
which is best as far as performance and hashing. Which hash table are you
thinking of going with?

I only add objects to the map and never delete them until the program is
done, so I could go with the zero-overhead allocator.

thanks
 

Pete Becker

David said:
You could try a map that uses a more memory efficient algorithm. A hash
table would probably be better. I have a similar problem with a program that
likes to store as many small objects as it can in a std::set object.
Depending on the exact object size it can store maybe 15 million before it
runs out of memory. The std::set also uses nodes with three pointers, so I'm
going to try a hash-table instead.

Depending on the implementation, you'll reduce the typical overhead from
three pointers per node down to one or two, plus the memory manager's
overhead. That could get you a 10-20% reduction in memory usage.
However, most memory managers round the requested size up to the next
multiple of some internal constant, so you might not see any difference
at all.
 

Pete Becker

Rich said:
I'm trying the vector class now, but it's running really, really slow; I
don't think it'll get to 19 million any time soon, and this is in
release mode.

Since I was trying to emulate the map, what I did was create the
bitset object and do a find on the vector; if it's found, I update an
internal counter, otherwise I add the object to the vector.

It must be the find call that's slowing it down drastically.

Maybe. Since you haven't described what you're trying to do, it's not
possible to give meaningful advice.
 

Rich S

I am seeking out numeric combinations within a set of data and am using
a bitset class and a map to store them and keep a total of how many I
have.

I'm trying to store as many combinations as I can but when I reach the
memory limits I would like to catch the exception and write out the
results at that point.

That's it in a nutshell.
 

Pete Becker

Rich said:
I am seeking out numeric combinations within a set of data and am using
a bitset class and a map to store them and keep a total of how many I
have.

I'm trying to store as many combinations as I can but when I reach the
memory limits I would like to catch the exception and write out the
results at that point.

That's it in a nutshell.

Okay, but which is more important: speed or quantity? You can store far
more data in a vector or a deque than you can in a set, but the code
will be slower.
 

David White

Rich said:
Thanks. In one of the STL implementations I saw a hash map, so perhaps
I'll give that a try. I've worked with boost, stlport and sxl but I'm
not sure which is best as far as performance and hashing. Which
hash table are you thinking of going with?

I'll probably try writing my own so I can optimize it for my program. The
data I'm storing is very biased. Each object is a series of bytes that are
more likely to be low values than high. Some byte values will appear in very
many objects and some values in none at all, and the bytes are always sorted
in ascending order. I need a hash algorithm that's fast and will generate an
effectively random table index from the biased data. At the moment I'm
thinking of using each byte as an index into a table, or multiple tables, of
pre-generated random numbers and XORing the random numbers together, but
I'll have to do some experiments.
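
Roughly what I have in mind, sketched out (the 8-byte object size, the
table sizes and the PRNG are placeholder choices, not measured ones):

#include <cstdlib>

enum { OBJECT_BYTES = 8, BYTE_VALUES = 256 };

static unsigned long rand_table[OBJECT_BYTES][BYTE_VALUES];

void init_tables()
{
    // one table of pre-generated random numbers per byte position
    for (int i = 0; i < OBJECT_BYTES; ++i)
        for (int j = 0; j < BYTE_VALUES; ++j)
            rand_table[i][j] = ((unsigned long)std::rand() << 16) ^ std::rand();
}

unsigned long hash_object(const unsigned char* bytes)
{
    unsigned long h = 0;
    for (int i = 0; i < OBJECT_BYTES; ++i)
        h ^= rand_table[i][bytes[i]];       // XOR the per-byte random numbers
    return h;                               // bucket index would be h % table_size
}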

DW
 
