hash_set core dump on memory free

R

Rakesh

Hi -

What is wrong this implementation? I get a core dump at the free()
statement? Thanks

Rakesh

#include <ext/hash_map>
#include <iostream.h>
#include <ext/hash_set>


using namespace std;
using namespace __gnu_cxx;

struct eqstr
{
bool operator()(char* s1, char* s2) const
{
return strcmp(s1, s2) == 0;
}
};

int main()
{

char *s, *s1, *temp;
hash_set<char *, hash<char *>, eqstr> AddedPhrases;
hash_set<char*, hash<char*>, eqstr> ::iterator Iter1;


s = (char *)malloc(sizeof(char)*100);
strcpy(s, "apple");
AddedPhrases.insert(s);

s1 = (char *)malloc(sizeof(char)*100);
strcpy(s1, "absent");
AddedPhrases.insert(s1);
for (Iter1 = AddedPhrases.begin(); Iter1 != AddedPhrases.end();
Iter1++)
{
temp = *Iter1;
//printf("\nDeleting:%s:%d", temp, strlen(temp));
free(temp);
}
}

$ g++ test3.cpp
$ ./a.out

Deleting:apple:5
Deleting:absent:6
*** glibc detected *** ./a.out: double free or corruption (!prev):
0x09fa2310 ***
======= Backtrace: =========
/lib/libc.so.6[0x6b9f18]
/lib/libc.so.6(__libc_free+0x79)[0x6bd41d]
../a.out(__gxx_personality_v0+0x274)[0x804892c]
/lib/libc.so.6(__libc_start_main+0xdc)[0x66b7e4]
../a.out(__gxx_personality_v0+0x69)[0x8048721]
======= Memory map: ========
0053b000-00546000 r-xp 00000000 fd:00 38110218
/lib/libgcc_s-4.1.0-20060304.so.1
00546000-00547000 rwxp 0000a000 fd:00 38110218
/lib/libgcc_s-4.1.0-20060304.so.1
00549000-0062b000 r-xp 00000000 fd:00 19344300
/usr/lib/libstdc++.so.6.0.8
0062b000-0062f000 r-xp 000e2000 fd:00 19344300
/usr/lib/libstdc++.so.6.0.8
0062f000-00630000 rwxp 000e6000 fd:00 19344300
/usr/lib/libstdc++.so.6.0.8
00630000-00636000 rwxp 00630000 00:00 0
00639000-00652000 r-xp 00000000 fd:00 38110213 /lib/ld-2.4.so
00652000-00653000 r-xp 00018000 fd:00 38110213 /lib/ld-2.4.so
00653000-00654000 rwxp 00019000 fd:00 38110213 /lib/ld-2.4.so
00656000-00782000 r-xp 00000000 fd:00 38110214 /lib/libc-2.4.so
00782000-00785000 r-xp 0012b000 fd:00 38110214 /lib/libc-2.4.so
00785000-00786000 rwxp 0012e000 fd:00 38110214 /lib/libc-2.4.so
00786000-00789000 rwxp 00786000 00:00 0
0078b000-007ae000 r-xp 00000000 fd:00 38110217 /lib/libm-2.4.so
007ae000-007af000 r-xp 00022000 fd:00 38110217 /lib/libm-2.4.so
007af000-007b0000 rwxp 00023000 fd:00 38110217 /lib/libm-2.4.so
00974000-00975000 r-xp 00974000 00:00 0 [vdso]
08048000-0804c000 r-xp 00000000 fd:00 24674319 /home/oracle/a.out
0804c000-0804d000 rw-p 00003000 fd:00 24674319 /home/oracle/a.out
09fa2000-09fc3000 rw-p 09fa2000 00:00 0 [heap]
b7e00000-b7e21000 rw-p b7e00000 00:00 0
b7e21000-b7f00000 ---p b7e21000 00:00 0
b7fa8000-b7fac000 rw-p b7fa8000 00:00 0
bf897000-bf8ac000 rw-p bf897000 00:00 0 [stack]
Deleting:Xax:3Aborted
 
?

=?iso-8859-1?Q?Lidstr=F6m?=

Hi -

What is wrong this implementation? I get a core dump at the free()
statement? Thanks

I would use std::string's instead of char*. That way you don't have to
worry about allocating/deleting memory.

#include <string>
#include <iostream>
#include <ext/hash_set>

using namespace std;
using namespace __gnu_cxx;

// need to define hash function for std::string
struct string_hasher
{
int operator()(const string& s) const
{
int value = 0;
for( string::size_type i=0; i<s.size(); i++ )
value = 5*value + s;
return value;
}
};

int main()
{
typedef hash_set<string, string_hasher()> AddedPhrases;

AddedPhrases addedPhrases;
addedPhrases.insert("apple");
addedPhrases.insert("pear");

for( AddedPhrases::const_iterator Iter1=addedPhrases.begin();
Iter1!=addedPhrases.end();
++Iter1 )
{
cout << *Iter1 << endl;
}

return 0;
}

I didn't try to compile, so there might be some mistakes in there.
Hopefully you get the idea anyway.
 
?

=?ISO-8859-1?Q?S=E9bastien_Gu=E9rif?=

Rakesh said:
Hi -

What is wrong this implementation? I get a core dump at the free()
statement? Thanks

Rakesh

Hi Rakesh!
for (Iter1 = AddedPhrases.begin(); Iter1 != AddedPhrases.end();
Iter1++)

Just replace Iter1++ by ++Iter1 above.

Hope this helps!
Sebastien
 
R

Rupert Kittinger

{ Note: this article is cross-posted to comp.lang.c++,
microsoft.public.vc.stl, gnu.g++.help and comp.lang.c++.moderated. -mod }
Hi -

What is wrong this implementation? I get a core dump at the free()
statement? Thanks

Rakesh

#include <ext/hash_map>
#include <iostream.h>
#include <ext/hash_set>


using namespace std;
using namespace __gnu_cxx;

struct eqstr
{
bool operator()(char* s1, char* s2) const
{Variou
return strcmp(s1, s2) == 0;
}
};

int main()
{

char *s, *s1, *temp;
hash_set<char *, hash<char *>, eqstr> AddedPhrases;
hash_set<char*, hash<char*>, eqstr> ::iterator Iter1;


s = (char *)malloc(sizeof(char)*100);
strcpy(s, "apple");
AddedPhrases.insert(s);

s1 = (char *)malloc(sizeof(char)*100);
strcpy(s1, "absent");
AddedPhrases.insert(s1);
for (Iter1 = AddedPhrases.begin(); Iter1 != AddedPhrases.end();
Iter1++)
{
temp = *Iter1;
//printf("\nDeleting:%s:%d", temp, strlen(temp));
free(temp);
}
}

$ g++ test3.cpp
$ ./a.out

Deleting:apple:5
Deleting:absent:6
*** glibc detected *** ./a.out: double free or corruption (!prev):
0x09fa2310 ***
======= Backtrace: =========
/lib/libc.so.6[0x6b9f18]
/lib/libc.so.6(__libc_free+0x79)[0x6bd41d]
./a.out(__gxx_personality_v0+0x274)[0x804892c]
/lib/libc.so.6(__libc_start_main+0xdc)[0x66b7e4]
./a.out(__gxx_personality_v0+0x69)[0x8048721]
======= Memory map: ======== {
0053b000-00546000 r-xp 00000000 fd:00 38110218
/lib/libgcc_s-4.1.0-20060304.so.1
00546000-00547000 rwxp 0000a000 fd:00 38110218
/lib/libgcc_s-4.1.0-20060304.so.1
00549000-0062b000 r-xp 00000000 fd:00 19344300
/usr/lib/libstdc++.so.6.0.8
0062b000-0062f000 r-xp 000e2000 fd:00 19344300
/usr/lib/libstdc++.so.6.0.8
0062f000-00630000 rwxp 000e6000 fd:00 19344300
/usr/lib/libstdc++.so.6.0.8
00630000-00636000 rwxp 00630000 00:00 0
00639000-00652000 r-xp 00000000 fd:00 38110213 /lib/ld-2.4.so
00652000-00653000 r-xp 00018000 fd:00 38110213 /lib/ld-2.4.so
00653000-00654000 rwxp 00019000 fd:00 38110213 /lib/ld-2.4.so
00656000-00782000 r-xp 00000000 fd:00 38110214 /lib/libc-2.4.so
00782000-00785000 r-xp 0012b000 fd:00 38110214 /lib/libc-2.4.so
00785000-00786000 rwxp 0012e000 fd:00 38110214 /lib/libc-2.4.so
00786000-00789000 rwxp 00786000 00:00 0
0078b000-007ae000 r-xp 00000000 fd:00 38110217 /lib/libm-2.4.so
007ae000-007af000 r-xp 00022000 fd:00 38110217 /lib/libm-2.4.so
007af000-007b0000 rwxp 00023000 fd:00 38110217 /lib/libm-2.4.so
00974000-00975000 r-xp 00974000 00:00 0 [vdso]
08048000-0804c000 r-xp 00000000 fd:00 24674319 /home/oracle/a.out
0804c000-0804d000 rw-p 00003000 fd:00 24674319 /home/oracle/a.out
09fa2000-09fc3000 rw-p 09fa2000 00:00 0 [heap]
b7e00000-b7e21000 rw-p b7e00000 00:00 0
b7e21000-b7f00000 ---p b7e21000 00:00 0
b7fa8000-b7fac000 rw-p b7fa8000 00:00 0
bf897000-bf8ac000 rw-p bf897000 00:00 0 [stack]
Deleting:Xax:3Aborted

Hi Rakesh,

the problem is that you are freeing an object that is still inside the
container, but the iterator still tries to access the object during to
call to operator++(). The reason is that the iterator does not store the
bucket number, so when the end of the bucket is reached, the hash
function is called to compute the bucket number. At this point,
hash<char*>() is called with a pointer that no longer points to valid
memory, so you encounter undefined behaviour.

So you have to erase() the string from the container before calling
free(), but after calling

AddedPhrases.erase(Iter1);

Iter1 is no longer a valid iterator. So the whole loop must be rewritten:

for (Iter1 = AddedPhrases.begin();
Iter1 != AddedPhrases.end();
/* do nothing */) { // increment is now performed inside the loop

temp = *Iter1++; // increment iterator, then dereference
// original value value
free(temp);
}


There are other issues with this code, e.g. missing checks for the size
of the string.

And if there is not a very good reason, it is much simpler to use
std::string which will take care of all the memory management for you.

cheers,


Rupert
 
G

guerif

{ Note: this article is cross-posted to comp.lang.c++,
microsoft.public.vc.stl, gnu.g++.help and comp.lang.c++.moderated. -mod }
for (Iter1 = AddedPhrases.begin(); Iter1 != AddedPhrases.end(); Iter1++)

Hi Rakesh,

Just replace "Iter1++" by "++Iter1"...

Hope this helps!
Sebastien
 
J

James Kanze

{ Note: this article is cross-posted to comp.lang.c++,
microsoft.public.vc.stl, gnu.g++.help and comp.lang.c++.moderated. -mod }
What is wrong this implementation? I get a core dump at the free()
statement? Thanks
#include <ext/hash_map>
#include <iostream.h>
#include <ext/hash_set>
using namespace std;
using namespace __gnu_cxx;
struct eqstr
{
bool operator()(char* s1, char* s2) const
{
return strcmp(s1, s2) == 0;
}
};
int main()
{

char *s, *s1, *temp;
hash_set<char *, hash<char *>, eqstr> AddedPhrases;
hash_set<char*, hash<char*>, eqstr> ::iterator Iter1;
s = (char *)malloc(sizeof(char)*100);
strcpy(s, "apple");
AddedPhrases.insert(s);
s1 = (char *)malloc(sizeof(char)*100);
strcpy(s1, "absent");
AddedPhrases.insert(s1);
for (Iter1 = AddedPhrases.begin(); Iter1 != AddedPhrases.end();
Iter1++)
{
temp = *Iter1;
//printf("\nDeleting:%s:%d", temp, strlen(temp));
free(temp);

I'd be very surprised that this doesn't result in undefined
behavior. You're leaving an invalide pointer in the table.
You're in for some very unplaisant surprises the next time some
other value hashes to this value, and the implementation tries
invoking your compare function on the entry. (More generally,
changing anything which affects the value of anything used for
actually keying causes undefined behavior.)

In fact, a quick analysis of the g++ code in the library shows
that it does re-calculate the hash code in the ++ operator.
(The actual implementation seems to be more space optimized than
performance optimized---although space optimizing the iterator
does mean that it copies a lot faster.) The result is that
anything can happen. You might actually iterator through the
loop only once, or you might iterator more times than there are
elements in the loop, or you might loop endlessly over the same
element, or core dump in the iterator, or ...

The correct way to clean up a container like this would be
something like:

__gnu_cxx::hash_set::iterator i = AddedPhrases.begin() ;
while ( i != AddedPhrases.end() ) {
char* tmp = *i ;
i = AddedPhrases.erase( i ) ;
free( tmp ) ;
}

More generally, of course, you'd be better off:

-- Using the standard IO (<iostream>, <ostream>); g++ definitly
supports it, and the only justification today to use
<iostream.h> is to support legacy compilers (g++ pre-3.0,
Sun CC 4.2, etc.---all so old you shouldn't be using them
anyway).

-- Using std::string, instead of the char* junk. Do that, and
the destructor of the hash_set will clean up everything
automatically.

-- Not declaring variables until you can correctly initialize
them.
 
R

Rupert Kittinger

Rupert said:
{ Note: this article is cross-posted to comp.lang.c++,
microsoft.public.vc.stl, gnu.g++.help and comp.lang.c++.moderated. -mod }
....

Hi Rakesh,

the problem is that you are freeing an object that is still inside the
container, but the iterator still tries to access the object during to
call to operator++(). The reason is that the iterator does not store the
bucket number, so when the end of the bucket is reached, the hash
function is called to compute the bucket number. At this point,
hash<char*>() is called with a pointer that no longer points to valid
memory, so you encounter undefined behaviour.

So you have to erase() the string from the container before calling
free(), but after calling

AddedPhrases.erase(Iter1);

Iter1 is no longer a valid iterator. So the whole loop must be rewritten:

for (Iter1 = AddedPhrases.begin();
Iter1 != AddedPhrases.end();
/* do nothing */) { // increment is now performed inside the loop

temp = *Iter1++; // increment iterator, then dereference
// original value value
free(temp);
}


There are other issues with this code, e.g. missing checks for the size
of the string.

And if there is not a very good reason, it is much simpler to use
std::string which will take care of all the memory management for you.

cheers,


Rupert

oops, the loop should really be

for (Iter1 = AddedPhrases.begin();
Iter1 != AddedPhrases.end();
/* do nothing */) { // increment is now performed inside the
loop temp = *Iter1;

AddedPhrases.erase(i++);
free(temp);
}

in his post below, James Kanze recommends

i = AddedPhrases.erase( i );

however, this will only work for std::set. Unfortunately, the g++
hash_set::erase(iterator) method returns void :-(.

Rupert
 
B

BobR

{ Note: this article is cross-posted to comp.lang.c++,
microsoft.public.vc.stl, gnu.g++.help and comp.lang.c++.moderated. -mod }

Sébastien Guérif wrote in message
Hi Rakesh!


Just replace Iter1++ by ++Iter1 above.

{
for( size_t i(0); i < 5; i++ ){
std::cout<<" i = "<< i << std::endl;
}
std::cout<<std::endl;
for( size_t i(0); i < 5; ++i ){
std::cout<<" i = "<< i << std::endl;
}
}

// i = 0
// i = 1
// i = 2
// i = 3
// i = 4

// i = 0
// i = 1
// i = 2
// i = 3
// i = 4
 
J

James Kanze

Rupert said:
{ Note: this article is cross-posted to comp.lang.c++,
microsoft.public.vc.stl, gnu.g++.help and comp.lang.c++.moderated. -mod }

Which is curious in itself: why microsoft.public.vc.stl, when
the compiler being used is manifestly g++?
Rakesh schrieb:

[...]
the problem is that you are freeing an object that is still inside the
container, but the iterator still tries to access the object during to
call to operator++().

The problem is that the specifications of hash_set have not been
respected. There is a requirement that the key value of any
element in the set not be modified. Although it's not essential
that any particular operation access the key, it's also not
forbidden.
The reason is that the iterator does not store the
bucket number, so when the end of the bucket is reached, the hash
function is called to compute the bucket number.

Which is a legal implementation, given the specifications, even
if it is somewhat surprising. (Typically, there should only be
one or two elements in each bucket, and recalculating the hash
value each time you change buckets can make incrementation an
expensive operation.)

Of course, even if ++ didn't access the key value, other
functions might (including the destructor).

The important point is that he has a contract with the
container, and he has violated it.
At this point, hash<char*>() is called with a pointer that no
longer points to valid memory, so you encounter undefined
behaviour.
So you have to erase() the string from the container before calling
free(), but after calling

Iter1 is no longer a valid iterator. So the whole loop must be rewritten:
for (Iter1 = AddedPhrases.begin();
Iter1 != AddedPhrases.end();
/* do nothing */) { // increment is now performed inside the loop
temp = *Iter1++; // increment iterator, then dereference
// original value value
free(temp);
}

The "standard" solution when removing objects in a loop is to
update the iterator with the return value of the erase()
function. This works for multiset and unordered_multiset, as
well as for set and unordered_set. (The g++ hash_set is a
preliminary version of unordered_set.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,059
Latest member
cryptoseoagencies

Latest Threads

Top