Efficient copy of a large chunk of memory? Copy only the diffs?

Developer

I have a large chunk of memory, data_vector, and I want to make an
identical copy of it, duplicated_data: basically a snapshot of
data_vector.

Is there a cheap way to do this? data_vector changes gradually, and I
want to synchronize duplicated_data with data_vector at certain points
in time. The difference from the previous 'snapshot' is just a
fraction of the whole data chunk. Is there a way to copy only the
'diff'? Let's say data_vector is around 1GB, and the 'diff' is only
10MB between snapshots.


--------------
class MyData {
private:
    int data;
    .....
public:
    int get_data() const { return data; }
    ....
}

vector<MyData> data_vector;
vector<MyData> duplicated_data;
-----------------
 
Alf P. Steinbach

* Developer:
> If I have a large chunk of memory, data_vector; and I want to make an
> identical copy of it, duplicated_data--- basically a snapshot of
> data_vector.

Vector const vCopy = v;

> Is there a cheap way to do this?

Define "cheap".

> data_vector changes gradually, and I
> want to make duplicated_data synchronized to data_vector at some time.
> Basically, the difference between the previous 'snapshot' is just a
> fraction of the whole data chunk. Is there a way to only copy the
> 'diff'?

Yes. It depends on your 'diff'. If it can be defined then it can be copied, and
if it can't be defined then it doesn't exist and there's nothing to copy. :)

> Let's say data_vector is around 1GB, and the 'diff' is only
> 10MB between snapshots.

Good, then you can save a lot of storage.

> --------------
> class MyData {
> private:
> int data;
> .....
> public:
> int get_data() const { return data; }
> ....
> }

Missing semicolon.

> vector<MyData> data_vector;
> vector<MyData> duplicated_data;

Hm -- and so?

What is the purpose of MyData?

Why don't you just use 'int'?

Cheers & hth.,

- Alf
 
Developer

> * Developer:
>
> Vector const vCopy = v;
>
> Define "cheap".

The most efficient way in terms of time.

> Yes. It depends on your 'diff'. If it can be defined then it can be copied, and
> if it can't be defined then it doesn't exist and there's nothing to copy. :)
>
> Good, then you can save a lot of storage.
>
> Missing semicolon.
>
> Hm -- and so?
>
> What is the purpose of MyData?

MyData contains lots of fields; int data is just one example of its
members. Hopefully this does not make my question unclear.

Basically, I am asking whether there is a smart way to copy diffs.
 
Alan Woodland

Developer said:
>> * Developer:
>>
>> Vector const vCopy = v;
>>
>> Define "cheap".
> Most efficient way in regard of time.
>
>>> data_vector changes gradually, and I
>>> want to make duplicated_data synchronized to data_vector at some time.
>>> Basically, the difference between the previous 'snapshot' is just a
>>> fraction of the whole data chunk. Is there a way to only copy the
>>> 'diff'?
>> Yes. It depends on your 'diff'. If it can be defined then it can be copied, and
>> if it can't be defined then it doesn't exist and there's nothing to copy. :)
>>> Let's say data_vector is around 1GB, and the 'diff' is only
>>> 10MB between snapshots.
>> Good, then you can save a lot of storage.
[snip]
>>> vector<MyData> data_vector;
>>> vector<MyData> duplicated_data;
>> Hm -- and so?
>>
>> What is the purpose of MyData?
> MyData contains lots of fields, int data is just an example of the
> members. But hopefully this does not make my question unclear.
>
> Basically, I am asking if there is a smart way to copy diffs.

You could play games with marking pages read-only, then catching each
write and making a copy of the page before allowing the write to
succeed. Of course, this is totally non-portable and way off topic for
C++.
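
A portable approximation of the same idea is to track dirty regions in software instead of relying on page protection. The sketch below is not the page-protection trick itself: it assumes every write can be routed through a wrapper, and the names SnapshotVector, set, sync and the block_size parameter are all hypothetical. Each write marks its fixed-size block as dirty, and sync() copies only the dirty blocks into the snapshot.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper (not from the thread): a live vector, its snapshot,
// and one dirty flag per fixed-size block of elements.
// Assumption: block_size is greater than zero.
template <typename T>
class SnapshotVector {
public:
    SnapshotVector(std::size_t size, std::size_t block_size)
        : live_(size), snapshot_(size), block_size_(block_size),
          dirty_((size + block_size - 1) / block_size, false) {}

    // Reads need no bookkeeping.
    const T& get(std::size_t i) const { return live_[i]; }

    // Every write goes through here so the containing block is marked dirty.
    void set(std::size_t i, const T& value) {
        assert(i < live_.size());
        live_[i] = value;
        dirty_[i / block_size_] = true;
    }

    // Copy only the dirty blocks into the snapshot and clear their flags.
    // Returns the number of elements copied.
    std::size_t sync() {
        std::size_t copied = 0;
        for (std::size_t b = 0; b < dirty_.size(); ++b) {
            if (!dirty_[b]) continue;
            const std::size_t first = b * block_size_;
            const std::size_t last =
                std::min(first + block_size_, live_.size());
            std::copy(live_.begin() + first, live_.begin() + last,
                      snapshot_.begin() + first);
            copied += last - first;
            dirty_[b] = false;
        }
        return copied;
    }

    const std::vector<T>& snapshot() const { return snapshot_; }

private:
    std::vector<T> live_;
    std::vector<T> snapshot_;
    std::size_t block_size_;
    std::vector<bool> dirty_;
};
```

The block size is a trade-off: small blocks copy less data per sync but need more flags, while large blocks copy unchanged neighbours along with each modified element. Whether this beats a plain full copy depends on how the writes are distributed, so it is worth measuring.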

Depending on exactly what you need, you could wrap a std::map. Every
time you take a "snapshot", you copy the current map into a snapshot
object and start a new blank map as the current one. On lookup, if a
key is not in the current map, it must be in a previous snapshot, so
you delegate to that. The access cost is then essentially a function
of the age of the data, of course.
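
One way this layered lookup might look is sketched below. The class and member names (LayeredMap, set, find, find_in_snapshot, snapshot) are hypothetical, and the sketch moves the current map into the snapshot rather than copying it, which is an assumption about what "copy" can be relaxed to when the current map is discarded anyway.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch: writes go to the current layer; each snapshot
// freezes that layer and starts a new blank one.
template <typename K, typename V>
class LayeredMap {
public:
    typedef std::map<K, V> Layer;

    void set(const K& key, const V& value) { current_[key] = value; }

    // Look in the current layer first, then fall back to the snapshots.
    // Returns nullptr if the key was never written.
    const V* find(const K& key) const {
        typename Layer::const_iterator it = current_.find(key);
        if (it != current_.end()) return &it->second;
        return find_in_snapshot(frozen_.size(), key);
    }

    // Value of key as of snapshot number id (as returned by snapshot()),
    // searching frozen layers from newest to oldest.
    const V* find_in_snapshot(std::size_t id, const K& key) const {
        assert(id <= frozen_.size());
        for (std::size_t i = id; i-- > 0;) {
            typename Layer::const_iterator it = frozen_[i]->find(key);
            if (it != frozen_[i]->end()) return &it->second;
        }
        return nullptr;
    }

    // Freeze the current layer and start a blank one.
    // Returns the number identifying the new snapshot.
    std::size_t snapshot() {
        frozen_.push_back(std::make_shared<Layer>(std::move(current_)));
        current_.clear();
        return frozen_.size();
    }

private:
    Layer current_;
    std::vector<std::shared_ptr<const Layer> > frozen_;
};
```

As the post says, a lookup for old data walks through every newer layer first, so the cost grows with the number of snapshots taken since the key was last written. Merging old layers from time to time would bound that cost, at the price of losing the ability to read those older snapshots.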

Alan
 
Daniel Pitts

Developer said:
> If I have a large chunk of memory, data_vector; and I want to make an
> identical copy of it, duplicated_data--- basically a snapshot of
> data_vector.
>
> Is there a cheap way to do this? data_vector changes gradually, and I
> want to make duplicated_data synchronized to data_vector at some time.
> Basically, the difference between the previous 'snapshot' is just a
> fraction of the whole data chunk. Is there a way to only copy the
> 'diff'? Let's say data_vector is around 1GB, and the 'diff' is only
> 10MB between snapshots.
>
>
> --------------
> class MyData {
> private:
> int data;
> .....
> public:
> int get_data() const { return data; }
> ....
> }
>
> vector<MyData> data_vector;
> vector<MyData> duplicated_data;
> -----------------

If the operations that modify data_vector are "cheap", you might look
into the "Command Pattern". Basically, any operation that only reads
data_vector can be left alone, but any modification needs to create
a Command instance. The Command interface might look something like this:

#include <memory>
#include <vector>

class MyDataCommand {
public:
    // Apply this modification to the given vector.
    virtual void update(std::vector<MyData> &target) const = 0;
    virtual ~MyDataCommand() {}
};

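Then, you could have (pseudo-code):
class DataHolder {
private:
    std::vector<MyData> &data_vector;
    std::vector<MyData> &duplicate_data;
    // Commands applied to data_vector since the last sync().
    std::vector<std::shared_ptr<MyDataCommand> > commands;
public:
    DataHolder(std::vector<MyData> &data, std::vector<MyData> &duplicate)
        : data_vector(data), duplicate_data(duplicate) {}
    // Apply the command to the live data and log it for later replay.
    void updateData(std::shared_ptr<MyDataCommand> command) {
        command->update(data_vector);
        commands.push_back(command);
    }
    // Replay the logged commands on the duplicate, in order.
    void sync() {
        for (std::size_t i = 0; i < commands.size(); ++i)
            commands[i]->update(duplicate_data);
        commands.clear();
    }
};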
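
To make the idea concrete, here is a self-contained sketch with one concrete command. Several things in it are assumptions rather than part of the thread: the stand-in MyData has a single field and a set_data() setter (the original only shows get_data()), SetDataCommand and pending() are hypothetical names, and std::shared_ptr is used in place of tr1::shared_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for the poster's MyData; the real class has more fields.
// set_data() is an assumption made for this example.
class MyData {
public:
    MyData() : data(0) {}
    int get_data() const { return data; }
    void set_data(int value) { data = value; }
private:
    int data;
};

class MyDataCommand {
public:
    virtual void update(std::vector<MyData>& target) const = 0;
    virtual ~MyDataCommand() {}
};

// Hypothetical concrete command: overwrite one element's data field.
class SetDataCommand : public MyDataCommand {
public:
    SetDataCommand(std::size_t index, int value)
        : index_(index), value_(value) {}
    void update(std::vector<MyData>& target) const override {
        assert(index_ < target.size());
        target[index_].set_data(value_);
    }
private:
    std::size_t index_;
    int value_;
};

class DataHolder {
public:
    DataHolder(std::vector<MyData>& data, std::vector<MyData>& duplicate)
        : data_vector(data), duplicate_data(duplicate) {}
    // Apply the command to the live data and log it for later replay.
    void updateData(std::shared_ptr<MyDataCommand> command) {
        command->update(data_vector);
        commands.push_back(command);
    }
    // Replay the logged commands on the duplicate, in order.
    void sync() {
        for (const auto& command : commands) command->update(duplicate_data);
        commands.clear();
    }
    // Number of commands waiting to be replayed.
    std::size_t pending() const { return commands.size(); }
private:
    std::vector<MyData>& data_vector;
    std::vector<MyData>& duplicate_data;
    std::vector<std::shared_ptr<MyDataCommand> > commands;
};
```

A caller would write holder.updateData(std::make_shared<SetDataCommand>(2, 99)) and later call holder.sync(). Note that the log grows with the number of writes, not with the size of the changed region, so repeated writes to the same element are all replayed; this fits the "cheap operations" condition stated above.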
 
