Illogical std::vector size?

Larry I Smith

simon said:
I did not need that; maybe it's a Windows thing.


Your compiler made assumptions about strlen and strcpy.
You should always include the appropriate header for the
functions you use; you'll prevent a lot of subtle errors
that way.
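For example, a minimal sketch of the include your struct's code relies
on (using the std:: qualified names makes the dependency explicit):

    #include <cstring>  // declares std::strlen and std::strcpy

    int main()
    {
        char buf[8];
        std::strcpy(buf, "Goodbye");       // 7 chars + '\0' fit in 8
        std::size_t n = std::strlen(buf);  // n == 7
        (void)n;
        return 0;
    }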

void CleanAll()
{
    if (sSomeString1) { delete [] sSomeString1; sSomeString1 = 0; }
    if (sSomeString2) { delete [] sSomeString2; sSomeString2 = 0; }
}

Ok, but that's just 'good' practice, not really related to my problem.


Good practice is always good practice. Sometimes failure to
follow it can lead to hard-to-trace bugs.

Are you sure the destructor would not handle it?


You may be right. I tend to forget that objects within
a loop get constructed/destructed on each loop iteration.
It's a hold-over from 'the really olden days' of C++; so I
always (sigh - yes still) do my own cleanup.
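A minimal sketch of what I mean (Tracer is a hypothetical type, only
there to make the constructor/destructor pairing visible):

    #include <iostream>

    struct Tracer
    {
        Tracer()  { std::cout << "constructed\n"; }
        ~Tracer() { std::cout << "destructed\n"; }
    };

    int main()
    {
        for (int i = 0; i < 3; ++i)
        {
            Tracer t;  // constructed at the top of each iteration...
        }              // ...and destructed at the bottom, every time
        return 0;
    }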

I get 16MB

16MB is a lot for such a simple program...

I'm running Linux and you're running MS Windows, and
we're using different compilers & libs; so we can't
really compare - it's not apples-to-apples.


[snip]

Thanks,

Simon.

Regards,
Larry
 
Old Wolf

simon said:
struct sFileData
{
    char* sSomeString1;
    char* sSomeString2;
    int   iSomeNum1;
    int   iSomeNum2;

    sFileData()  { NullAll(); }
    ~sFileData() { CleanAll(); }
    sFileData(const sFileData& sfd) { NullAll(); *this = sfd; }

    const sFileData& operator=(const sFileData& sfd)
    {
        if (this != &sfd)
        {
            CleanAll();
            iSomeNum1 = sfd.iSomeNum1;
            iSomeNum2 = sfd.iSomeNum2;

            if (sfd.sSomeString1) {
                sSomeString1 = new char[strlen(sfd.sSomeString1) + 1];
                strcpy(sSomeString1, sfd.sSomeString1);
            }
            if (sfd.sSomeString2) {
                sSomeString2 = new char[strlen(sfd.sSomeString2) + 1];
                strcpy(sSomeString2, sfd.sSomeString2);
            }
        }
        return *this;
    }

    void CleanAll() {
        if (sSomeString1) delete [] sSomeString1;
        if (sSomeString2) delete [] sSomeString2;
    }

BTW, these if() tests are not needed, as delete[] is defined
to have no effect if the pointer is null.
    void NullAll() {
        sSomeString1 = 0;
        sSomeString2 = 0;
        iSomeNum1 = 0;
        iSomeNum2 = 0;
    }
};

std::vector< sFileData, std::allocator<sFileData> > address_;
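For what it's worth, a minimal sketch of CleanAll() without those tests,
nulling the pointers afterwards as suggested earlier in the thread (that
also stops operator= from leaving dangling pointers when the source
strings are null):

    void CleanAll() {
        delete [] sSomeString1;  // delete[] on a null pointer is a no-op
        sSomeString1 = 0;
        delete [] sSomeString2;
        sSomeString2 = 0;
    }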

It appears (from discussion elsewhere in the thread) that your
problem is excessive memory chewing by copying strings around
all over the place.

In an ideal world, your compiler would optimise to avoid this,
but it looks like we are in the real world :) I would suggest
trying some hand-optimisations in this case.
(Before you do this, try compiling in release mode instead of
debug mode; that may help.)
Your current approach allocates the strings, then copies the whole
object into the vector:

sFileData sfd;
sfd.iSomeNum1 = 1;
sfd.iSomeNum2 = 2;
sfd.sSomeString1 = new char[5];   // "Helo" + '\0'
sfd.sSomeString2 = new char[8];   // "Goodbye" + '\0'
strcpy( sfd.sSomeString1, "Helo" );
strcpy( sfd.sSomeString2, "Goodbye" );
address_.push_back(sfd);          // deep copy: both strings allocated again

Try creating the object directly in the vector; then the strings
only have to be allocated once:

address_.resize( address_.size() + 1 );
sFileData &sfd = address_.back();  // reference to the element just added
sfd.iSomeNum1 = 1;
sfd.iSomeNum2 = 2;
sfd.sSomeString1 = new char[5];
sfd.sSomeString2 = new char[8];
strcpy( sfd.sSomeString1, "Helo" );
strcpy( sfd.sSomeString2, "Goodbye" );

You could avoid the vector's reallocation copies (which cost a lot of
time and memory) by reserving all of the memory at the start:

address_.reserve(100000);

In your real program, where you don't know the exact size in
advance, you might estimate this number from the file size, or
something like that. Or, preferably, use a deque or a list, which
don't require huge reallocations when you add new members; see the
sketch below.
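A minimal sketch of the deque variant (assuming the sFileData struct
quoted above; no reserve() needed, since deque grows in fixed-size
chunks instead of reallocating one big buffer):

    #include <deque>

    std::deque<sFileData> address_;

    // push_back appends to a new chunk when needed; the existing
    // elements are never copied, so there is no reallocation spike
    // as the container grows.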

Another thing you might do (if the strings aren't going to be
manipulated too much by the rest of your program) is to allocate
them both at once; since your OS probably has a minimum allocation
size anyway, this can roughly halve the memory usage:

sfd.sSomeString1 = new char[13];          // 5 bytes for "Helo" + 8 for "Goodbye"
sfd.sSomeString2 = sfd.sSomeString1 + 5;  // second string starts after "Helo\0"

(and modify your destructors and operator= accordingly).
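A minimal sketch of the matching cleanup, assuming sSomeString2 always
points into sSomeString1's block under this scheme:

    void CleanAll() {
        delete [] sSomeString1;  // one block holds both strings: one delete[]
        sSomeString1 = 0;
        sSomeString2 = 0;        // points into the freed block; never delete it
    }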
 
msalters

simon said:
1) Windows Task Manager is not suited for this

Yeah, but it was what raised suspicion in the first place.
What might be better?

2) vector only stores sFileData objects, not the strings themselves
3) Even when vector has excess capacity (which is common; you don't
want to reallocate after each push_back) it won't include the strings
4) Many implementations of new[] allocate at least 16 bytes, plus
the overhead needed for delete[]

Are you saying that std::string might actually be better in that case?
What might be a better way?

Some std::string implementations have a sizeof() of 16 but need not use
the heap for short strings. That would definitely save memory, and it's
also faster; a sketch follows below. However, do check whether replacing
vector with deque helps.
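A minimal sketch of the std::string version (whether the strings
actually stay off the heap depends on your library's small-string
optimisation and on the string lengths):

    #include <string>
    #include <vector>

    struct sFileData
    {
        std::string sSomeString1;  // short strings may live inside the object
        std::string sSomeString2;
        int iSomeNum1;
        int iSomeNum2;

        sFileData() : iSomeNum1(0), iSomeNum2(0) {}
        // no destructor, copy constructor, or operator= needed:
        // the compiler-generated ones handle the strings correctly
    };

    std::vector<sFileData> address_;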
40MB or 4GB, there is still something not quite right, and I would prefer
to know what it is rather than brushing it under the rug.

Is it worth it? It's expensive to find out; you'd need a profiler, or
something like that. I once had to find out, because a colleague had
used char*, exceeded an implementation limit (2GB) and therefore
had to break his program up into 6 stages, each taking 4 hours. Finding
out what was wrong, writing a custom replacement string and merging the
stages took me several weeks, but it reduced the runtime to 1 hour total.
Do you have the $$$ for that?

HTH,
Michiel Salters
 
ctrucza

40MB or 4GB, there is still something not quite right, and I would prefer...

(Disclaimer: As others already pointed out, Task Manager is not the best
tool for memory profiling.)

2. One thing you could do to see whether your code is at fault is to
read the file and do everything you do in your real code, but not
actually store the data in the vector. Then use Task Manager to see
how the memory consumption behaves (a sketch follows below).

If you get the same (or almost the same) memory growth, then Windows
is using the memory for reading the file.

If you get significantly different results, then you can continue
hunting the cause in your code.
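A minimal sketch of that experiment (Rec and the loop count are
hypothetical stand-ins for the real record type and file code):

    #include <cstring>
    #include <vector>

    struct Rec { char* s; };

    int main()
    {
        const bool store = false;  // run once with false, once with true

        std::vector<Rec> v;
        for (int i = 0; i < 250000; ++i)  // stands in for the file loop
        {
            Rec r;
            r.s = new char[8];
            std::strcpy(r.s, "Goodbye");
            if (store)
                v.push_back(r);  // shallow copy: v now owns the pointer
            else
                delete [] r.s;   // not storing, so free immediately
        }
        // observe memory usage here (e.g. in Task Manager), then clean up
        for (std::vector<Rec>::size_type i = 0; i < v.size(); ++i)
            delete [] v[i].s;
        return 0;
    }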

Just my 2 cents worth.

Csaba
 
Simon

40MB or 4GB, there is still something not quite right, and I would...

(Disclaimer: As others already pointed out, Task Manager is not the best
tool for memory profiling.)

I only used it to notice there was a problem.
If Windows reports 16MB being used, then that must be close to the truth.

But I agree that it is not an exact science.
2. One thing you could do to see whether your code is at fault is to
read the file and do everything you do in your real code, but not
actually store the data in the vector. Then use Task Manager to see
how the memory consumption behaves.

I tried that already; with no vector calls the memory usage is
negligible, around 700KB.
The run time is around 1 sec.

With the inserts, memory jumps to 16MB and the run time to around 30 sec.

I posted a small piece of code in this thread that does not even use files.
It clearly shows that something that should take no more than 5MB grows
to around 25MB.
If you get significantly different results, then you can continue
hunting the cause in your code.
Just my 2 cents worth.

It looks like Windows does not see the need to free the memory.
Maybe if my machine were a bit closer to its memory limits this would
not happen.
I don't know how to release the memory explicitly, but it looks like an
OS issue to me.
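For reference, the usual idiom for making a vector hand its buffer back
is the "swap trick"; a minimal sketch, assuming address_ is the vector
from earlier in the thread (whether the freed memory goes straight back
to the OS is still up to the runtime's allocator, which is why Task
Manager may not show an immediate drop):

    std::vector<sFileData>().swap(address_);
    // the empty temporary swaps buffers with address_, taking the big
    // one with it; the buffer is freed when the temporary dies at the ';'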

Thanks.

Simon
 
