std::vector erase is not freeing mem?

S

Sims

John said:
That's exactly why its not normally done with std::vector.

john

So what would be a better option to add 2 ints?

that all the vector has millions of pair of int, and I work with them to
place them in another array.

If not vectors what would I use?

Simon
 
J

John Harrison

Sims said:
So what would be a better option to add 2 ints?

that all the vector has millions of pair of int, and I work with them to
place them in another array.

If not vectors what would I use?

Well I don't know what problem you are trying to solve. Certainly the
approach you are using at the moment is very inefficient, but some problems
are just like that. Maybe you could explain exactly what it is that you are
trying to do

john
 
S

Sims

Well I don't know what problem you are trying to solve. Certainly the
approach you are using at the moment is very inefficient, but some problems
are just like that. Maybe you could explain exactly what it is that you are
trying to do

Like I said, I am reading a massive file that contain numbers. Depending
on the numbers I place them in another array and then rebuild separate
files.

to keep it simple assume that I am reading the following

1, 2
3, 5
10, -2
-2, 8

After reading all the numbers I would create 2 new arrays

each containing
A:
1, 2
3, 5 // because all the numbers are +ve.
B:
10, -2
-2, 8 // because a number in the pair was -ve.

In the example above I could move the items in the array directly as I
read them but in my project the numbers are all related, so I need to
know all the numbers first before I can work with them.

that's why I need to read all the numbers first then loop thru all of
them one by one and erasing the pairs that I used.

Simon
 
P

Pete Becker

Sims said:
After reading all the numbers I would create 2 new arrays

But you still haven't said what you're really trying to accomplish. In
any case, it looks like one possible solution is to simply partition the
original array, putting one set of values at the lower end and the other
set at the upper end. Then [begin, mid) is one set and [mid, end) is the
other set.
 
S

Sims

Pete said:
Sims said:
After reading all the numbers I would create 2 new arrays


But you still haven't said what you're really trying to accomplish. In
any case, it looks like one possible solution is to simply partition the
original array, putting one set of values at the lower end and the other
set at the upper end. Then [begin, mid) is one set and [mid, end) is the
other set.

Sorry I must be missing something, I am not certain why I need to
explain in more details what I am trying to achieve.

It would further complicate the problem. I would need to spend hours
explaining the math/logic/reason of why I was given such a task.

I thought that my, very simplified, explanation would have been enough.
And in any case, I have said what I am trying to accomplish, sorry if it
does not sound real enough.

I need to read some data into an array and then remove it, (including
freeing it's memory).

My way is obviously not very efficient, (because of the time it takes to
copy the data to the new array), so I was wondering if there was a
'better' way of doing it.

Regards

Simon
 
P

Pete Becker

Sims said:
I thought that my, very simplified, explanation would have been enough.

Your very simplified explanation undoubtedly has hidden assumptions.
Good design requires examining those assumptions. If you're not willing
to try to explain them in a bit more detail then you're not going to get
good answers here.
My way is obviously not very efficient, (because of the time it takes to
copy the data to the new array), so I was wondering if there was a
'better' way of doing it.

It depends on what "it" is. But since you're not willing to elaborate on
that, you're just wasting other people's time.
 
S

Sims

Pete said:
It depends on what "it" is. But since you're not willing to elaborate on
that, you're just wasting other people's time.

I posted in an earlier post what "it" was.
I'll post it again in case you did not get it.

I am sorry you think I do not wish to elaborate but sadly there is not
much more to elaborate on. As I said I want to read some data and I wish
to erase the data properly, (free the memory).

Sorry to waste your time, but sadly that's all there is to it.

//
//
//
struct int_array{
int_array(){
a = b = 0;
}
int a, b;
};

//
//
//
typedef std::vector< int_array, std::allocator< int_array >> v_int_array;

v_int_array *g_my_int_array = 0;

//
// erase 1 or 2 items from the array.
//
void erase( int nItemA, int nItemB = -1 )
{
if( !v_int_array ){
return; // nothing to erase.
}

int nSize = v_int_array.size();
nSize-= 1;
nSize-= (b==-1?0:1);//if we are deleting more than one item...
v_int_array *my_int_array = new v_int_array;
my_int_array.reserve( nSize );

for( int loop = 0; loop < v_int_array.size(); loop++)
{
if( loop == nItemA || loop == nItemB )
{
continue;
}
my_int_array.push_back( *(g_my_int_array->begin()+loop ));
}
// remove the old one
delete g_my_int_array;
// replace with the new one
g_my_int_array = *my_int_array;
}

Simon
 
P

Pete Becker

Sims said:
I am sorry you think I do not wish to elaborate but sadly there is not
much more to elaborate on. As I said I want to read some data and I wish
to erase the data properly, (free the memory).

But you also said you want to split the data into two chunks. That
doesn't necessarily involve copying stuff into separate arrays and
freeing any memory, as the suggestion I made indicates. When you're
having any sort of coding problem, don't fixate on the particular
approach that you're having problems with. There may well be a different
approach that greatly simplifies the task.
 
S

Sims

Pete said:
But you also said you want to split the data into two chunks. That
doesn't necessarily involve copying stuff into separate arrays and
freeing any memory, as the suggestion I made indicates. When you're
having any sort of coding problem, don't fixate on the particular
approach that you're having problems with. There may well be a different
approach that greatly simplifies the task.

Yes but the problem is that the numbers I read are needed to know into
what array I will put them.

All the data need is in the form

A, B
C, D
E, F
....
A, B

As you can see the chunk of data starts and ends with the same pair. But
the problem is that it is not always the case in the file I am reading
but rather in the format

0, 0 <- new 'sub' chunk
A, B
C, D
E, F
0, 0 <- new 'sub' chunk
E, F
Z, X
0, 0 <- new 'sub' chunk
Z, X
A, B

So I need to know all the data first so that I can re-order it in
another file in one complete chunk.

A, B
C, D
E, F
Z, X
A, B
(removing the repeating pairs).

But there are 000's of chunks of data and they are in turn sub divided
into 000's of sub-chunks themselves.

Simon
 
P

Pete Becker

Sims said:
0, 0 <- new 'sub' chunk
A, B
C, D
E, F
0, 0 <- new 'sub' chunk
E, F
Z, X
0, 0 <- new 'sub' chunk
Z, X
A, B

So I need to know all the data first so that I can re-order it in
another file in one complete chunk.

A, B
C, D
E, F
Z, X
A, B
(removing the repeating pairs).

Okay, that's clearer, but there must be more to it, because the solution
for the sample data you've presented is trivial: remove the '0, 0' pairs
and then remove adjacent duplicate pairs. That can be done on the fly,
without having to store anything more than the last pair that was read.
What's missing from this sample?

The most important thing to do when writing code is to have a clear idea
of what you want the code to accomplish, and one of the best ways to
know that you have a clear idea is being able to explain it to someone
else. I've often found the solution to a problem I've been struggling
with for days by explaining it to someone else. It doesn't even matter
whether they understand the explanation -- the effort to explain it
clearly makes the concept itself clearer. Sometimes I even explain it to
myself. But only when there's nobody listening. <g>
 
S

Sims

Pete said:
Okay, that's clearer, but there must be more to it, because the solution
for the sample data you've presented is trivial: remove the '0, 0' pairs
and then remove adjacent duplicate pairs. That can be done on the fly,
without having to store anything more than the last pair that was read.
What's missing from this sample?

Well all that is missing is the fact that they are not in the right order.

A bit more like..

0, 0 <- new 'sub' chunk
E, F
Z, X
0, 0 <- new 'sub' chunk
A, B
C, D
E, F
0, 0 <- new 'sub' chunk
Z, X
A, B

And further more, as I said there is 000's of chunks of data all in
different order mixed up all within each others.

A bit like a puzzle really, the data is like the pieces, I must put all
the pieced together to re-create the puzzle.
The only difference is that I have a few thousand puzzles each with a
few thousand pieces, (all in one box) :).
The most important thing to do when writing code is to have a clear idea
of what you want the code to accomplish, and one of the best ways to
know that you have a clear idea is being able to explain it to someone
else. I've often found the solution to a problem I've been struggling
with for days by explaining it to someone else. It doesn't even matter
whether they understand the explanation -- the effort to explain it
clearly makes the concept itself clearer. Sometimes I even explain it to
myself. But only when there's nobody listening. <g>

I do have a clear idea of what I need to achieve, I never said I wasn't
sure. I am not sure _how_ to achieve it, but that's not the same.

Simon
 
G

Gianni Mariani

Sims said:
Pete Becker wrote:
.....


Well all that is missing is the fact that they are not in the right order.

A bit more like..

0, 0 <- new 'sub' chunk
E, F
Z, X
0, 0 <- new 'sub' chunk
A, B
C, D
E, F
0, 0 <- new 'sub' chunk
Z, X
A, B

And further more, as I said there is 000's of chunks of data all in
different order mixed up all within each others.

I would create a base class that describes a segment.

struct Node
{
int values[2];

bool operator <( const Node & rhs ) const;
};

struct Segment
{
virtual unsigned Size() const = 0;
const Node * First() const = 0;
const Node * Last() const = 0;

virtual IsBeginnning() const = 0;
virtual const Node & Key() const = 0;
virtual void SetNext( const Segment * ) = 0;

bool Less( const Segment * rhs ) const
{
const Node & KeyLhs = this->Key();
const Node & KeyRhs = rhs->Key();

if ( KeyLhs < KeyRhs )
{
return true;
}

if ( KeyRhs < KeyLhs )
{
return false;
}

return this->IsBeginnning() > rhs->IsBeginnning();
}

};

struct Segment_End : Segment
{

const Segment * m_next;

Segment_End()
: m_next( 0 )
{
}

void SetNext( const Segment * i_next )
{
m_next = i_next;
}

virtual IsBeginnning() const
{
return false;
}

virtual const Node & Key() const
{
return * ( Last() -1 );
}

};

struct Segment_Begin : Segment
{

Segment ** m_pos;

Segment_Begin()
: m_pos( 0 )
{
}

void SetNext( const Segment * i_next )
{
m_next = i_next;
}

virtual IsBeginnning() const
{
return true;
}

virtual const Node & Key() const
{
return * ( First() );
}

};

template <unsigned N>
struct Segment_Basic : Segment_Begin, Segment_End
{

Node m_nodes[ N ];

Segment_Basic( const Node * first, const Node * last )
{
std::copy( first, last, m_nodes );
}

Segment_Basic()
{
// prolly should clear m_nodes
}

unsigned Size() const
{
return N;
}

const Node * First() const
{
return m_nodes;
}

const Node * Last() const
{
return m_nodes + N;
}
};

// -- or if you already have a monster array somewhere
// -- just refer to it

struct Segment_Array : Segment_Begin, Segment_End
{

const Node * m_first;
const Node * m_last;

Segment_Basic( const Node * first, const Node * last )
: m_first( first ), m_last( last )
{
}

unsigned Size() const
{
return m_last - m_first;
}

const Node * First() const
{
return m_first
}

const Node * Last() const
{
return m_last;
}
};

Then you place create a Segment_Array or Segment_Basic object for each
segment followed by inserting a Segment_Begin and Segment_End pointer
into an array (of Segment *) and then sort them by the "Less" function,
then traverse the sorted array and call the "Next" method for each
segment. From there you can trivially traverse the segments and in the
process create a second array that has all the segments spliced together.

I'm sure you could probably do this faster than this and I probably
missed a few things but it's performance probably won't suck too badly.
 
G

George Privalov

Sims said:
A, B
C, D
E, F
Z, X
A, B
(removing the repeating pairs).

You may consider the data structures based on std::map, assuming that
total number of unique pairs will not be outrageously high.
 
J

John Harrison

Sims said:
Well all that is missing is the fact that they are not in the right order.

A bit more like..

0, 0 <- new 'sub' chunk
E, F
Z, X
0, 0 <- new 'sub' chunk
A, B
C, D
E, F
0, 0 <- new 'sub' chunk
Z, X
A, B

And further more, as I said there is 000's of chunks of data all in
different order mixed up all within each others.

OK can you elaborate on how one chunk of data can be mixed up with all the
others? In the data above each of the sub chunks seems to be seperate from
each other

A bit like a puzzle really, the data is like the pieces, I must put all
the pieced together to re-create the puzzle.
The only difference is that I have a few thousand puzzles each with a few
thousand pieces, (all in one box) :).
[snip]

I do have a clear idea of what I need to achieve, I never said I wasn't
sure. I am not sure _how_ to achieve it, but that's not the same.

Jesus! You have a clear idea of what you need to achieve, we do not. You do
not have a clear idea of how to achieve it, but we proabably would if only
you could explain clearly what you are trying to achieve. Not the how, the
*what*. You have a file in some format, and you want to end up with another
file (or files) in some other format. Concentrate on that.

I think we might be getting there slowly. I think maybe the key will be that
not all the data need to be stored before you start processing, instead you
read and process at the same time, and maybe a different data structure can
be used.

john
 
R

Richard Herring

Nicolas Pavlidis said:
I missunderstood the question. I thought it was about how to make a
vector smaller while the program runs.
OK, what makes you think it will make the vector any smaller?

The normal trick is to swap the vector with a temporary *copy* of
itself:

std::vector<T>(my_vect).swap(my_vect);
 
S

Sims

John said:
OK can you elaborate on how one chunk of data can be mixed up with all the
others? In the data above each of the sub chunks seems to be seperate from
each other

I don't quite understand what you are asking here,
in the example given the sub chunks are separate from one another but
should be linked, (that's the problem).

so if I have 2 sub chunks

A B
C D
0 0 <-
C D
E F

then I would create one chunk

A B
C D
E F

the reason why I need to read the whole file is because the data can be
the other way around

C D
E F
0 0
A B
C D

In that case I would need to read the second sub-chunk to see that I
have enough data to create one chunk.

Lets leave him out of that one for now. :)
I think we might be getting there slowly. I think maybe the key will be that
not all the data need to be stored before you start processing, instead you
read and process at the same time, and maybe a different data structure can
be used.

As I tried to explain above the data could be processed before it is all
read but I am not sure it would make such a great difference, (if
anything it might slow the process down).

Going back to my puzzle example, I need all the pieces before I can
start building the model. I could test if each new piece goes somewhere
onto the puzzle but as I said I am not certain it will speed the process up.

And to complicate matters more I have more than one puzzle, (but each
piece only belongs to one puzzle).
 
S

Sims

George said:
You may consider the data structures based on std::map, assuming that
total number of unique pairs will not be outrageously high.

I have around 512mb of pairs.
 
S

Sims

Gianni said:
I would create a base class that describes a segment.

struct Node
{
int values[2];

bool operator <( const Node & rhs ) const;
};

struct Segment
{
virtual unsigned Size() const = 0;
const Node * First() const = 0;
const Node * Last() const = 0;

virtual IsBeginnning() const = 0;
virtual const Node & Key() const = 0;
virtual void SetNext( const Segment * ) = 0;

bool Less( const Segment * rhs ) const
{
const Node & KeyLhs = this->Key();
const Node & KeyRhs = rhs->Key();

if ( KeyLhs < KeyRhs )
{
return true;
}

if ( KeyRhs < KeyLhs )
{
return false;
}

return this->IsBeginnning() > rhs->IsBeginnning();
}

};

struct Segment_End : Segment
{

const Segment * m_next;

Segment_End()
: m_next( 0 )
{
}

void SetNext( const Segment * i_next )
{
m_next = i_next;
}

virtual IsBeginnning() const
{
return false;
}

virtual const Node & Key() const
{
return * ( Last() -1 );
}

};

struct Segment_Begin : Segment
{

Segment ** m_pos;

Segment_Begin()
: m_pos( 0 )
{
}

void SetNext( const Segment * i_next )
{
m_next = i_next;
}

virtual IsBeginnning() const
{
return true;
}

virtual const Node & Key() const
{
return * ( First() );
}

};

template <unsigned N>
struct Segment_Basic : Segment_Begin, Segment_End
{

Node m_nodes[ N ];

Segment_Basic( const Node * first, const Node * last )
{
std::copy( first, last, m_nodes );
}

Segment_Basic()
{
// prolly should clear m_nodes
}

unsigned Size() const
{
return N;
}

const Node * First() const
{
return m_nodes;
}

const Node * Last() const
{
return m_nodes + N;
}
};

// -- or if you already have a monster array somewhere
// -- just refer to it

struct Segment_Array : Segment_Begin, Segment_End
{

const Node * m_first;
const Node * m_last;

Segment_Basic( const Node * first, const Node * last )
: m_first( first ), m_last( last )
{
}

unsigned Size() const
{
return m_last - m_first;
}

const Node * First() const
{
return m_first
}

const Node * Last() const
{
return m_last;
}
};

Then you place create a Segment_Array or Segment_Basic object for each
segment followed by inserting a Segment_Begin and Segment_End pointer
into an array (of Segment *) and then sort them by the "Less" function,
then traverse the sorted array and call the "Next" method for each
segment. From there you can trivially traverse the segments and in the
process create a second array that has all the segments spliced together.

I'm sure you could probably do this faster than this and I probably
missed a few things but it's performance probably won't suck too badly.

Thanks,
I will look at it when I get back to my computers.

Simon
 
J

John Harrison

Sims said:
I don't quite understand what you are asking here,
in the example given the sub chunks are separate from one another but
should be linked, (that's the problem).

so if I have 2 sub chunks

A B
C D
0 0 <-
C D
E F

then I would create one chunk

A B
C D
E F

That I understand.
the reason why I need to read the whole file is because the data can be
the other way around

C D
E F
0 0
A B
C D

In that case I would need to read the second sub-chunk to see that I have
enough data to create one chunk.

So you are saying that if any sub-chunk has a pair of numbers in common with
another sub-chunk, then the duplicate must be removed and the sub-chunks
combined? And you have 512Mb of data? Tricky.

What if you have two chunks, you read in a new sub-chunk and you find that
one pair in the sub-chunk occurs in one of your chunks and another pair in
the sub-chunk occurs in the other chunk. Do you combine the two chunks into
one bigger chunk, or it that something that should never happen?

And are you trying to find all the chunks? Write them out to seperate files
or something?

A std::set would be the obvious thing to use, one std::set per chunk,
certainly much more efficient than a std::vector. The thing about std::sets
is that they remove duplicates automatically and efficiently and finding an
element within a std::set is much more efficient than finding an element
within a std::vector. But the size of the data worries me.

Read each sub-chunk into a std::set, then test each sub-chunk pair against
all the other chunks read so far (each of which is a std::set as well).
Combine the std::sets as necessary. It would actually be a very short
program, twenty lines or so.

john
 
J

John Harrison

Richard Herring said:
OK, what makes you think it will make the vector any smaller?

The normal trick is to swap the vector with a temporary *copy* of itself:

std::vector<T>(my_vect).swap(my_vect);

That's a neat trick, I'll remember that.

john
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,787
Messages
2,569,629
Members
45,331
Latest member
ElaneLyttl

Latest Threads

Top