How to define a map whose 'key' is a structure?

L

Lambda

I'd like to define a std::map: map<struct term, vector<size_t> >.

The definition of the structure is:
struct term {
string word;
size_t freq;
};

the freq is the size() of vector<size_t>.

When I insert the 'term' into the map,
the map should sort the key by the string 'word'.

And later, I need to sort the 'term' by the 'freq'.

I think I can define two compare functions,
one for insertion to the map: map<struct term, vector<size_t>, comp1>
index;
another for a sort function: sort(terms.begin(), terms.end(), comp2).

And now:
1. How can I use map as associative array, that is, how to do
something like:
struct term test;
test.string = "Hello";
test.freq = 0;
index[test].push_back(1);
And after I push_back all the size_t, I need to change the value of
test.freq.

2. I didn't find a function that return all the 'keys' as a vector or
sth like that.
If I define a function take a parameter vector<string>,
how can I match the string with struct term in the map?

I wish I presented my questions clearly.

Thanks,
Stephen Hsu
 
P

Pascal J. Bourguignon

Lambda said:
I'd like to define a std::map: map<struct term, vector<size_t> >.

The definition of the structure is:
struct term {
string word;
size_t freq;
};

the freq is the size() of vector<size_t>.

Then just say it. In C++ !!

struct term {
string word;
};

size_t freq(string word,const map<term,vector<size_t> >& wordMap){
// freq
// is the
return wordMap[word].size();
// size
// of the vector of the
// word in the
// wordMap.
}

(and suddenly, you don't need structures as key anymore, but let's
assume you really needed a structure, like for example:

struct PersonId {
string name;
string firstName;
Date birthDate;
};

When I insert the 'term' into the map,
the map should sort the key by the string 'word'.

And later, I need to sort the 'term' by the 'freq'.

I think I can define two compare functions,
one for insertion to the map: map<struct term, vector<size_t>, comp1>
index;
another for a sort function: sort(terms.begin(), terms.end(), comp2).

You cannot sort maps. The iterators returned by a map are not
random_iterators, that the sort algorithm expects. If you want the
keys you have put in the map in some other order, then you will have
to copy them in a vector, and sort that vector.

And now:
1. How can I use map as associative array, that is, how to do
something like:
struct term test;
test.string = "Hello";
test.freq = 0;
index[test].push_back(1);
And after I push_back all the size_t, I need to change the value of
test.freq.

You don't. It's like asking to change the value of 42.

If you need to change the value of test.freq without changing the
identity of the key, then you cannot use copies of the structures as
keys. You must use a pointer (or smartpointer) to the structure as
key. Then, the key will actually be the address of the structure, and
you will be able to change the slots of the structure, but of course,
the order will have to depend only on the actually key.

Otherwise, you won't change the value of the copies of the structures
used as keys in the map, unless you're ready to remap it (copy all the
entries from one map to another, changing on the fly the slots you
want in the keys). You can also remove the old entry from the map,
change the value of the key, and put back a new entry with the new
value of the key.


That is, to be clear, you cannot change the slots of the composite
key that are used by the order function of the map.


2. I didn't find a function that return all the 'keys' as a vector or
sth like that.

Yep. Or do you think the standard library should include all the
programs you will ever have to write, so you're left only in choosing
which program you want and #include it?

If I define a function take a parameter vector<string>,
how can I match the string with struct term in the map?

You tell us!

How do you want to match it?

I wish I presented my questions clearly.

I wish you have clearer thoughts.
 
L

Lambda

Lambda said:
I'd like to define a std::map: map<struct term, vector<size_t> >.
The definition of the structure is:
struct term {
string word;
size_t freq;
};
the freq is the size() of vector<size_t>.

Then just say it.  In C++ !!

struct term {
   string word;

};

size_t freq(string word,const map<term,vector<size_t> >& wordMap){
//     freq
//                                  is the
    return wordMap[word].size();
//                       size
//                                  of the vector of the
//                 word             in the
//         wordMap.

}

(and suddenly, you don't need structures as key anymore, but let's
assume you really needed a structure, like for example:

struct PersonId {
  string  name;
  string  firstName;
  Date    birthDate;

};

Thank you for your reply!
Of course, I can get the number of the size_t by the vector.size()
function.
I have tried it first. But it's not good enough.

I want to sort several strings order by the matching vector size.
If I want to get the size of vector, I need to read the whole vector
first, right?
But because the amount of data of all the vectors is too large to
reside in memory,
I'll put all the vectors in disk.
And let the map key contains a pointer to each matching vectors.
If I have vector size in the key, I can sort the several strings first
without disk read. So I need this redundancy vector size member in the
key.
You cannot sort maps.  The iterators returned by a map are not
random_iterators, that the sort algorithm expects.  If you want the
keys you have put in the map in some other order, then you will have
to copy them in a vector, and sort that vector.

I don't want to sort the map, because all the members in the map is
always sorted.
As the map is implemented with Red-Black Tree,
the map need to know how to compare two keys to put them in right
places.
And now:
1. How can I use map as associative array, that is, how to do
something like:
struct term test;
test.string = "Hello";
test.freq = 0;
index[test].push_back(1);
And after I push_back all the size_t, I need to change the value of
test.freq.

You don't.  It's like asking to change the value of 42.  

If you need to change the value of test.freq without changing the
identity of the key, then you cannot use copies of the structures as
keys.  You must use a pointer (or smartpointer) to the structure as
key.  Then, the key will actually be the address of the structure, and
you will be able to change the slots of the structure, but of course,
the order will have to depend only on the actually key.

Otherwise, you won't change the value of the copies of the structures
used as keys in the map, unless you're ready to remap it (copy all the
entries from one map to another, changing on the fly the slots you
want in the keys). You can also remove the old entry from the map,
change the value of the key, and put back a new entry with the new
value of the key.

That is, to be clear,  you cannot change the slots of the composite
key that are used by the order function of the map.  

You mean, if I assign a key to map, the key must be constant, right?
Why can I use pointers? The address of the structure will not change?
What if I move the structure to another place? The map will be broken?
Yep.  Or do you think the standard library should include all the
programs you will ever have to write, so you're left only in choosing
which program you want and #include it?

There is such a method in Java.
I just want to make sure there is no such function in C++.
You tell us!

How do you want to match it?

I don't know. I'm trying to find a better way.
 
M

Michael Oswald

Lambda said:
the freq is the size() of vector<size_t>.

When I insert the 'term' into the map,
the map should sort the key by the string 'word'.

And later, I need to sort the 'term' by the 'freq'.

I think I can define two compare functions,
one for insertion to the map: map<struct term, vector<size_t>, comp1>
index;
another for a sort function: sort(terms.begin(), terms.end(), comp2).

[snip]

This sounds to me, like you want someting like boost multiindex
(http://www.boost.org/doc/libs/1_35_0/libs/multi_index/doc/index.html)


lg,
Michael
 
P

Pascal J. Bourguignon

Lambda said:
I want to sort several strings order by the matching vector size.
If I want to get the size of vector, I need to read the whole vector
first, right?

No. Check some stl reference, size() is O(1) for vectors.
http://www.cplusplus.com/reference/stl/vector/size.html
(See section "Complexity").
But because the amount of data of all the vectors is too large to
reside in memory,
I'll put all the vectors in disk.

You don't really have a choice, stl vectors are stored in virtual
memory. If your vectors become bigger than the available RAM, then
your system will swap some data out to the swap partition of file, but
this is done automatically. You don't have to do anything to get that
behavior.

On the other hand, if you are implementing your own vector, I cannot
say anything about them.

And let the map key contains a pointer to each matching vectors.
If I have vector size in the key, I can sort the several strings first
without disk read. So I need this redundancy vector size member in the
key.

All right. Just be sure you don't use this field in the comparison
function you pass to stl::map.

You mean, if I assign a key to map, the key must be constant, right?
Why can I use pointers?

Because C++ objects don't move in memory.

(This wouldn't be true with a copying garbage collector, but C++ is
far from having one. Language implementations that use copying
garbage collector may have to rehash hash-tables when they move keys).

The address of the structure will not change?
Right.


What if I move the structure to another place?
The map will be broken?

What circumstance are you considering?

struct Key {
....
};
bool Key_equal(const Key& a,const Key& b){
....
}

std::map<Key,Value,Key_equal> map1;
std::map<Key*,Value> map2;


In the case of map1, the key structures are copied inside the map, so
you can always destroy (or "move") your Key structures.

In the case of map2, the pointer to the key structures are copied
inside the map, so if you destroy (or have destroyed) a key structure
pointed to by map2, you're burst! You must ensure that the pointer to
the keys inside map2 stay valid while map2 is valid, like for any
other pointer. (Same would be true with references).

I just want to make sure there is no such function in C++.

Browse a stl reference, such as: http://www.cplusplus.com/reference/stl/

I don't know. I'm trying to find a better way.


I think it would be clearer to keep the attribute with the value than
with the key. That's really what you want to do. In your case, it
seems the key is only the string, so you don't need a structure for
the key.

struct Attributes {
std::string word; // yes, put the key inside the value, it's often useful.
std::vector<size_t> sizes;
size_t numberOfSizes; // = sizes.size()
Attributes(std::string aWord):word(aWord){
computeSizes(word,&sizes);
numberOfSizes=sizes().size();
}
void update(void){
updateSizes(word,&sizes);
numberOfSizes=sizes().size();
}
};

std::map<std::string,Attributes*> wordMap;

then you can enter your words:

wordMap[word]=new Attributes(word);

and you can update your counters:

wordMap[word]->update();

and you can get the cached numbers:

wordMap[word]->numberOfSizes;
 
L

Lambda

No. Check some stl reference, size() is O(1) for vectors.http://www.cplusplus.com/reference/stl/vector/size.html
(See section "Complexity").


You don't really have a choice, stl vectors are stored in virtual
memory.  If your vectors become bigger than the available RAM, then
your system will swap some data out to the swap partition of file, but
this is done automatically. You don't have to do anything to get that
behavior.

On the other hand, if you are implementing your own vector, I cannot
say anything about them.

Won't the vector throw bad_alloc exception?
If so, I can't know there is no free memory available during
execution?
And let the map key contains a pointer to each matching vectors.
If I have vector size in the key, I can sort the several strings first
without disk read. So I need this redundancy vector size member in the
key.

All right.  Just be sure  you don't use this field in the comparison
function you pass to stl::map.
You mean, if I assign a key to map, the key must be constant, right?
Why can I use pointers?

Because C++ objects don't move in memory.

(This wouldn't be true with a copying garbage collector, but C++ is
far from having one.  Language implementations that use copying
garbage collector may have to rehash hash-tables when they move keys).
The address of the structure will not change?
Right.

What if I move the structure to another place?
The map will be broken?

What circumstance are you considering?

struct Key {
...};

bool Key_equal(const Key& a,const Key& b){
...

}

std::map<Key,Value,Key_equal> map1;
std::map<Key*,Value>          map2;

In the case of map1, the key structures are copied inside the map, so
you can always destroy (or "move") your Key structures.

In the case of map2, the pointer to the key structures are copied
inside the map, so if you destroy (or have destroyed) a key structure
pointed to by map2, you're burst!  You must ensure that the pointer to
the keys inside map2 stay valid while map2 is valid, like for any
other pointer. (Same would be true with references).
I just want to make sure there is no such function in C++.

Browse a stl reference, such as:http://www.cplusplus.com/reference/stl/
I don't know. I'm trying to find a better way.

I think it would be clearer to keep the attribute with the value than
with the key.  That's really what you want to do.  In your case, it
seems the key is only the string, so you don't need a structure for
the key.

struct Attributes {
    std::string         word; // yes, put the key inside the value, it's often useful.
    std::vector<size_t> sizes;
    size_t              numberOfSizes; // = sizes.size()
    Attributes(std::string aWord):word(aWord){
       computeSizes(word,&sizes);
       numberOfSizes=sizes().size();
    }
    void update(void){
         updateSizes(word,&sizes);
         numberOfSizes=sizes().size();
    }

};

std::map<std::string,Attributes*> wordMap;

then you can enter your words:

wordMap[word]=new Attributes(word);

and  you can update your counters:

wordMap[word]->update();

and  you can get the cached numbers:

wordMap[word]->numberOfSizes;

Good idea, i'll try it. Thanks.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,065
Latest member
OrderGreenAcreCBD

Latest Threads

Top