__gnu_cxx::hash_map question, please help!!!

A

aaragon

Working for a while with the gnu hash map I found that I cannot remove
elements from the map whereas if I use a std::map there are no
problems. Is this a bug of the implementation or am I doing something
wrong?

The function is very simple, and I added some code so I can use the
hash map with std::pair as keys (which are not given in the usual
keys).

The code I wrote for this is as follows:

namespace __gnu_cxx {

//! Structure used for hash tables of std::pair objects
template<> struct hash<std::pair<size_t,size_t> > {
size_t operator()(std::pair<size_t,size_t> __p) const {
std::size_t seed = 0;
boost::hash_combine(seed, __p.first);
boost::hash_combine(seed, __p.second);
return seed;
}
};
}

and then my class:

template<typename Value = double, typename Key =
std::pair<size_t,size_t>,
class ContainerPolicy = __gnu_cxx::hash_map<Key,Value> > class
SparseMatrix {
....
// some type defs, constructors, etc, etc...
....
....

void clean(ValueType);
};

template<typename V, typename K, class C> int SparseMatrix<V, K,
C>::clean(V tol) {

size_t count(0);
for(IteratorType it = values_.begin(); it!=values_.end(); ++it){
if(std::abs((*this)(it->first.first,it->first.second)) < tol ) {
// remove element from matrix
cout<<"erasing ("<<it->first.first<<","<<it->first.second<<") =
"<<(*this)(it->first.first,it->first.second)<<endl;
int what = values_.erase(it->first);
cout<<"what -> "<<what<<endl;
++count;
}
}
return count;
}

The function clean just removes those elements from the matrix that
are smaller than a tolerance. Now, the output of using this function
with a 15x15 matrix is

erasing (0,7) = 0
what -> 1
erasing (6,1) = 2.27e-13
what -> 1
terminate called after throwing an instance of 'fea::RuntimeError'
what(): *** ERROR *** Matrix index (24,0) out of bounds.
The sparse matrix has 15 rows and 15 columns.

I throw a runtime error when the access to the matrix is out of
bounds. But the out of bounds happens only when removing an element!
So the only thing that comes to my mind, is that there is a bug in the
implementation of the hash table. Is it correct???? removing and
element from the hash table shouldn't invalidate the iterators, right?

 
J

James Kanze

aaragon said:
Working for a while with the gnu hash map I found that I cannot remove
elements from the map whereas if I use a std::map there are no
problems. Is this a bug of the implementation or am I doing something
wrong?

The GNU hash map isn't really part of C++, but since it does
obey the same rules as std::map...
The function is very simple, and I added some code so I can use the
hash map with std::pair as keys (which are not given in the usual
keys).

It doesn't look simple to me:).
The code I wrote for this is as follows:
namespace __gnu_cxx {
//! Structure used for hash tables of std::pair objects
template<> struct hash<std::pair<size_t,size_t> > {
size_t operator()(std::pair<size_t,size_t> __p) const {
std::size_t seed = 0;
boost::hash_combine(seed, __p.first);
boost::hash_combine(seed, __p.second);
return seed;
}
};
}
and then my class:
template<typename Value = double, typename Key =
std::pair<size_t,size_t>,
class ContainerPolicy = __gnu_cxx::hash_map<Key,Value> > class
SparseMatrix {
...
// some type defs, constructors, etc, etc...
...
...
void clean(ValueType);
};
template<typename V, typename K, class C> int SparseMatrix<V, K,
C>::clean(V tol) {
size_t count(0);
for(IteratorType it = values_.begin(); it!=values_.end(); ++it){
if(std::abs((*this)(it->first.first,it->first.second)) < tol ) {
// remove element from matrix
cout<<"erasing ("<<it->first.first<<","<<it->first.second<<") =
"<<(*this)(it->first.first,it->first.second)<<endl;
int what = values_.erase(it->first);

It's not too clear here, but assuming that values_ is a member
with the type ContainerPolicy (and that IteratorType is
ContainerPolicy::iterator, of course), you've just invalidated
"it", because you've erased the element it designates. The
result is undefined behavior, although if you're using g++ 4.x,
and have compiled with the usual options (in particular:
-D_GLIBCXX_CONCEPT_CHECKS -D_GLIBCXX_DEBUG
-D_GLIBCXX_DEBUG_PEDANTIC), you should get a runtime error when
you next try to increment the iterator. At least, that's what I
get with:

#include <map>
#include <string>

typedef std::map< std::string, int >
Map ;

void
cleanUp( Map& m, int i )
{
for ( Map::iterator it = m.begin() ;
it != m.end() ;
++ it ) {
if ( it->second == i ) {
m.erase( it->first ) ;
}
}
}

int
main()
{
Map m ;
m.insert( Map::value_type( "one", 1 ) ) ;
m.insert( Map::value_type( "two", 2 ) ) ;
m.insert( Map::value_type( "three", 3 ) ) ;
cleanUp( m, 2 ) ;
return 0 ;
}

Which, if I understand your code correctly, is an abstraction of
what you are trying to do. (VC++ also gives a runtime error.
With Sun CC, however, the code "works", in that there is no
error message, and the process returns a successful status; this
is also the results if I omit the debugging options with g++.)
cout<<"what -> "<<what<<endl;
++count;
}
}
return count;
}
The function clean just removes those elements from the matrix that
are smaller than a tolerance. Now, the output of using this function
with a 15x15 matrix is
erasing (0,7) = 0
what -> 1
erasing (6,1) = 2.27e-13
what -> 1
terminate called after throwing an instance of 'fea::RuntimeError'
what(): *** ERROR *** Matrix index (24,0) out of bounds.
The sparse matrix has 15 rows and 15 columns.

I have no idea what is happening here; I have no idea what
fea::RuntimeError is, for example. You really should simplify
before posting---if the problem isn't associated with template
instantiation, for example, get rid of the templates (which only
confuse). If you can trigger the problem using standard types
(rather than user defined types), do so. Etc.

But since your code has
undefined behavior, it's as legal as anything else. If you're
using __gnu_cxx::hash_map, then presumably, you're using g++, so
if you're invoking the compiler normally (with the usual 20 or
so options just about every compiler needs to be useful), you
should be getting something like:
error: attempt to increment a singular iterator.
followed by a lot of other information.
I throw a runtime error when the access to the matrix is out
of bounds. But the out of bounds happens only when removing an
element! So the only thing that comes to my mind, is that
there is a bug in the implementation of the hash table. Is it
correct???? removing and element from the hash table shouldn't
invalidate the iterators, right?

I don't know of a single container type where you can remove the
element designated by the iterator without invalidating the
iterator. Your code, at least if I've understood it correctly,
does exactly this, and doesn't work with std::map. I rather
doubt that the g++ hash_map offers stronger guarantees in this
respect that std::map.
 
A

aaragon

The GNU hash map isn't really part of C++, but since it does
obey the same rules as std::map...


It doesn't look simple to me:).




It's not too clear here, but assuming that values_ is a member
with the type ContainerPolicy (and that IteratorType is
ContainerPolicy::iterator, of course), you've just invalidated
"it", because you've erased the element it designates. The
result is undefined behavior, although if you're using g++ 4.x,
and have compiled with the usual options (in particular:
-D_GLIBCXX_CONCEPT_CHECKS -D_GLIBCXX_DEBUG
-D_GLIBCXX_DEBUG_PEDANTIC), you should get a runtime error when
you next try to increment the iterator. At least, that's what I
get with:

#include <map>
#include <string>

typedef std::map< std::string, int >
Map ;

void
cleanUp( Map& m, int i )
{
for ( Map::iterator it = m.begin() ;
it != m.end() ;
++ it ) {
if ( it->second == i ) {
m.erase( it->first ) ;
}
}
}

int
main()
{
Map m ;
m.insert( Map::value_type( "one", 1 ) ) ;
m.insert( Map::value_type( "two", 2 ) ) ;
m.insert( Map::value_type( "three", 3 ) ) ;
cleanUp( m, 2 ) ;
return 0 ;
}

Which, if I understand your code correctly, is an abstraction of
what you are trying to do. (VC++ also gives a runtime error.
With Sun CC, however, the code "works", in that there is no
error message, and the process returns a successful status; this
is also the results if I omit the debugging options with g++.)




I have no idea what is happening here; I have no idea what
fea::RuntimeError is, for example. You really should simplify
before posting---if the problem isn't associated with template
instantiation, for example, get rid of the templates (which only
confuse). If you can trigger the problem using standard types
(rather than user defined types), do so. Etc.

But since your code has
undefined behavior, it's as legal as anything else. If you're
using __gnu_cxx::hash_map, then presumably, you're using g++, so
if you're invoking the compiler normally (with the usual 20 or
so options just about every compiler needs to be useful), you
should be getting something like:
error: attempt to increment a singular iterator.
followed by a lot of other information.


I don't know of a single container type where you can remove the
element designated by the iterator without invalidating the
iterator. Your code, at least if I've understood it correctly,
does exactly this, and doesn't work with std::map. I rather
doubt that the g++ hash_map offers stronger guarantees in this
respect that std::map.

--
James Kanze (GABI Software) email:[email protected]
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Thank you James for answering to my question. This is not the first
time you answered one of my questions and you always give me useful
information, I really appreciate it.
I actually managed to make my code work by not using that loop. I
actually store temporarily in a vector those values that need to be
removed and then I do a loop over the vector to remove them (not using
the hash_map iterators). This works perfectly. What I don't understand
is why it did work with the std::map. I will add those options for
compilation (yes I'm using g++ on Ubuntu linux). Are there any other
useful compilation options to add?
Once again, thanks a lot.

 
J

James Kanze

On Oct 25, 4:58 am, James Kanze <[email protected]> wrote:

[...]
I actually managed to make my code work by not using that loop. I
actually store temporarily in a vector those values that need to be
removed and then I do a loop over the vector to remove them (not using
the hash_map iterators). This works perfectly.

That's one way to do it. IMHO, the hard way. The classical
approach, I think, is somethink like:

Map::iterator current = myMap.begin() ;
while ( current != myMap.end() ) {
if ( someCondition ) {
myMap.erase( current ++ ) ;
// NOTE: current incremented before the erase!!
else {
++ current ;
}
}

Note that this is valid for the associateve containers, where
erase only invalidates iterators to the object erased, and
doesn't return the next iterator. For the sequences, where
erase may invalidate other iterators (including one to the
following element), but returns an iterator to the first element
following the one erased, you would write:

while ( current != myContaner.end() ) {
if ( someCondition ) {
current = myContaner.erase( current ) ;
else {
++ current ;
}
}

Note too that in both cases, you specify the element to be
removed by the iterator; you don't need to extract the key and
do a second look-up.
What I don't understand is why it did work with the std::map.

It's undefined behavior, so it might work, sometimes. Without
looking at the implementation, I would guess that the iterator
contains a pointer to the node, and the node contains whatever
information is necessary for iterating (perhaps indirectly, in
the form of a pointer to the parent node). The erase function
has freed the node, but the actual memory hasn't been
overwritten. Depending on the implementation, you may iterate
to the next element, or possibly, in special cases, skip an
element or visit one twice (if the iteration is done by going up
to the parent iterator, and the tree has been reorganized as a
result of the erase).

Why it didn't work with the hash map is probably due to the fact
that the nodes are organized differently. You'd have to look at
the details of the implementation to know why one crashed, and
the other seemed to work.

And of course, a slightly different implementation, and you
could end up with std::map crashing, and the hash_map working.
That's the fun of undefined behavior.
I will add those options for compilation (yes I'm using g++ on
Ubuntu linux). Are there any other useful compilation options
to add?

Doubtlessly:). The first time I use a compiler, I study every
possible option, to see which ones I'm interested in. (For g++,
see http://www.gnu.org/software/gcc/onlinedocs/.) Note that the
compiler documentation isn't always complete,
however---especially if you consider the standard library as
part of the compiler; for g++, for example, you'll also want to
check the library documentation (libstdc++-v3: there's also a
link to it on the page above---the chapter "Preprocessor macros
controlling the library" is the one which interests you).
Finding this information isn't always trivial, but is IMHO
essential. Exactly which options you want will depend on what
you are doing. (I have to deal with a lot of legacy code, for
example, so I often need options which support older constructs.)

Some of the options you might be interested in:

-std=c++98 -no-nonansi-builtins -fuse-cxa-atexit
For maximum standards conformance. (If you need to support
legacy code, you may need to add options here which "undo"
some conformance.)

-no-threadsafe-statics
Doesn't work correctly for Sparc anyway (haven't verified
Intel), and isn't portable, so you're probably better off
without it. It does mean that you'll have to worry about
the issue yourself, however.

There are also tons of options controling warnings.
Classically, the recommendation is to use -Wall -Wextra
-pedantic; I find it better to go through the entire list, and
specify each one independantly, according to my personal opinion
as to what is right, and what isn't. (I don't use -Wall, for
example, because it contains -Wmissing-braces, which generates a
warning whenever I use PTHREAD_MUTEX_INITIALIZER under Solaris.)
 
A

aaragon

[...]

I actually managed to make my code work by not using that loop. I
actually store temporarily in a vector those values that need to be
removed and then I do a loop over the vector to remove them (not using
the hash_map iterators). This works perfectly.

That's one way to do it. IMHO, the hard way. The classical
approach, I think, is somethink like:

Map::iterator current = myMap.begin() ;
while ( current != myMap.end() ) {
if ( someCondition ) {
myMap.erase( current ++ ) ;
// NOTE: current incremented before the erase!!
else {
++ current ;
}
}

Note that this is valid for the associateve containers, where
erase only invalidates iterators to the object erased, and
doesn't return the next iterator. For the sequences, where
erase may invalidate other iterators (including one to the
following element), but returns an iterator to the first element
following the one erased, you would write:

while ( current != myContaner.end() ) {
if ( someCondition ) {
current = myContaner.erase( current ) ;
else {
++ current ;
}
}

Note too that in both cases, you specify the element to be
removed by the iterator; you don't need to extract the key and
do a second look-up.
What I don't understand is why it did work with the std::map.

It's undefined behavior, so it might work, sometimes. Without
looking at the implementation, I would guess that the iterator
contains a pointer to the node, and the node contains whatever
information is necessary for iterating (perhaps indirectly, in
the form of a pointer to the parent node). The erase function
has freed the node, but the actual memory hasn't been
overwritten. Depending on the implementation, you may iterate
to the next element, or possibly, in special cases, skip an
element or visit one twice (if the iteration is done by going up
to the parent iterator, and the tree has been reorganized as a
result of the erase).

Why it didn't work with the hash map is probably due to the fact
that the nodes are organized differently. You'd have to look at
the details of the implementation to know why one crashed, and
the other seemed to work.

And of course, a slightly different implementation, and you
could end up with std::map crashing, and the hash_map working.
That's the fun of undefined behavior.
I will add those options for compilation (yes I'm using g++ on
Ubuntu linux). Are there any other useful compilation options
to add?

Doubtlessly:). The first time I use a compiler, I study every
possible option, to see which ones I'm interested in. (For g++,
seehttp://www.gnu.org/software/gcc/onlinedocs/.) Note that the
compiler documentation isn't always complete,
however---especially if you consider the standard library as
part of the compiler; for g++, for example, you'll also want to
check the library documentation (libstdc++-v3: there's also a
link to it on the page above---the chapter "Preprocessor macros
controlling the library" is the one which interests you).
Finding this information isn't always trivial, but is IMHO
essential. Exactly which options you want will depend on what
you are doing. (I have to deal with a lot of legacy code, for
example, so I often need options which support older constructs.)

Some of the options you might be interested in:

-std=c++98 -no-nonansi-builtins -fuse-cxa-atexit
For maximum standards conformance. (If you need to support
legacy code, you may need to add options here which "undo"
some conformance.)

-no-threadsafe-statics
Doesn't work correctly for Sparc anyway (haven't verified
Intel), and isn't portable, so you're probably better off
without it. It does mean that you'll have to worry about
the issue yourself, however.

There are also tons of options controling warnings.
Classically, the recommendation is to use -Wall -Wextra
-pedantic; I find it better to go through the entire list, and
specify each one independantly, according to my personal opinion
as to what is right, and what isn't. (I don't use -Wall, for
example, because it contains -Wmissing-braces, which generates a
warning whenever I use PTHREAD_MUTEX_INITIALIZER under Solaris.)

--
James Kanze (GABI Software) email:[email protected]
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Thank you James once more for your insight on this problem. =)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,020
Latest member
GenesisGai

Latest Threads

Top