Tricky pointer problem?

D

da.colonel

Hello,

I'm having some trouble with implementing a hashtable (hashmap) for a
class at school.
The map's constructor is called with a hash function and the number of
possible keys the function returns. The get and set methods do work
well, only the []-operator does not work. When trying to do something
like map['D'] = "December", it seems like the = has no effect. At
least, if the value for the same key is read after assigning it, it
will be null.
I guess I got some pointer wrong or similar but I just cannot find the
mistake. I'd appreciate any help you can give.

So here's the code for the hashmap:

// ---
template<class KeyType, class DataType> class myhash {
public:
typedef unsigned int (*hashFunctionType)(const KeyType &);
typedef pair<KeyType, DataType> pairType;
typedef typename list<pairType>::iterator listIterator;
private:
list<pairType> **data;
hashFunctionType hashFunction;

inline pairType *findEntry(const unsigned int k, const KeyType &key)
const
{
for(listIterator z = data[k]->begin(); z != data[k]->end(); z++)
if(z->first == key)
return &(*z);
return NULL;
}
public:
void set(const KeyType &key, const DataType &value)
{
unsigned int k = hashFunction(key);
if(data[k] == NULL)
{
data[k] = new list<pairType>();
data[k]->push_back(*new pairType(key, value));
}
else
{
pairType *p = findEntry(k, key);
if(p == NULL) data[k]->push_back(*new pairType(key, value));
else p->second = value;
}
}
void get(const KeyType &key, DataType &value)
{
unsigned int k = hashFunction(key);
if(data[k] == NULL) throw new notfound_error();
pairType *p = findEntry(k, key);
if(p == NULL) throw new notfound_error();
value = p->second;
}
DataType &operator[] (const KeyType &key)
{
unsigned int k = hashFunction(key);
pairType *p;
if(data[k] == NULL) data[k] = new list<pairType>(); // create new
list
else
{
p = findEntry(k, key);
if(p != NULL) return p->second;
nCollisions++;
}
DataType *d = new DataType(); // create new data type in case write
access is coming next to that element
p = new pairType(key, *d); // create key-value pair
data[k]->push_back(*p);
return *d; // return reference to data element
}
};
// ---

I left out the definition of the constructor and destructor as they
don't do interesting things.

Now in this test program the myhash template gets specialised with a
char as KeyType and a wrapper class around a const char * as DataType:

// ---
unsigned int hashfunc(const char &c) {
unsigned int i = c - 'A';
return (i*i)%100;
}

class constcharpointer {
const char *_ptr;
public:
constcharpointer(const char *ptr) { _ptr = ptr; }
constcharpointer() { _ptr = 0; }
operator const char* () { return _ptr; }
};

int main() {
using namespace std;
myhash<char,constcharpointer> h(hashfunc,100);
h['D'] = "Dezember";
cerr << "h['D']: " << h['D'] << endl; // this one triggers a
segmentation fault (null pointer returned)
cerr << "strcmp: " << strcmp(h['D'], "Dezember") << endl;
return 0;
}
// ---

The constcharpointer class was written by the author of the programming
task and thus I guess should be working fine.

Regards,
Denis
 
J

John Harrison

Hello,

I'm having some trouble with implementing a hashtable (hashmap) for a
class at school.
The map's constructor is called with a hash function and the number of
possible keys the function returns. The get and set methods do work
well, only the []-operator does not work. When trying to do something
like map['D'] = "December", it seems like the = has no effect. At
least, if the value for the same key is read after assigning it, it
will be null.
I guess I got some pointer wrong or similar but I just cannot find the
mistake. I'd appreciate any help you can give.

There's lots of problems with references and copying. You're confused
about when to use a pointer and when to use a value. Basically you are
allocating memory unecessarily (and never freeing it). I'll try and fix
them below
So here's the code for the hashmap:

// ---
template<class KeyType, class DataType> class myhash {
public:
typedef unsigned int (*hashFunctionType)(const KeyType &);
typedef pair<KeyType, DataType> pairType;
typedef typename list<pairType>::iterator listIterator;
private:
list<pairType> **data;
hashFunctionType hashFunction;

inline pairType *findEntry(const unsigned int k, const KeyType &key)
const
{
for(listIterator z = data[k]->begin(); z != data[k]->end(); z++)
if(z->first == key)
return &(*z);
return NULL;
}
public:
void set(const KeyType &key, const DataType &value)
{
unsigned int k = hashFunction(key);
if(data[k] == NULL)
{
data[k] = new list<pairType>();
data[k]->push_back(*new pairType(key, value));

Should be

data[k]->push_back(pairType(key, value));

You don't need to allocate memory, just create the pairType directly.
}
else
{
pairType *p = findEntry(k, key);
if(p == NULL) data[k]->push_back(*new pairType(key, value));

Again should be

if(p == NULL) data[k]->push_back(pairType(key, value));
else p->second = value;
}
}
void get(const KeyType &key, DataType &value)
{
unsigned int k = hashFunction(key);
if(data[k] == NULL) throw new notfound_error();
pairType *p = findEntry(k, key);
if(p == NULL) throw new notfound_error();
value = p->second;
}
DataType &operator[] (const KeyType &key)
{
unsigned int k = hashFunction(key);
pairType *p;
if(data[k] == NULL) data[k] = new list<pairType>(); // create new
list
else
{
p = findEntry(k, key);
if(p != NULL) return p->second;
nCollisions++;
}
DataType *d = new DataType(); // create new data type in case write
access is coming next to that element
p = new pairType(key, *d); // create key-value pair
data[k]->push_back(*p);
return *d; // return reference to data element

Same error twice, neither memory allocation is necessary (both leak
memory) and then to make it worse you return a reference to the
allocated memory, which is not the object that is in the hash_map. This
is why it seems to have no effect.

Here's how you could do it

data[k]->push_back(pairType(key, DataType()));
return data[k].last().second;

Note there is no need for memory allocation, no need for the d or p
variables, just create values with constructors. Also note how the
return statement returns a reference to the item you just added to the list.

You are using too many pointers, in all the wrong places. Remember
pointers are the work of the devil.

john
 
J

John Harrison

data[k]->push_back(pairType(key, DataType()));
return data[k].last().second;

Typo, should be

return data[k]->last().second;

But there probably not good reason to use pointers to lists, it just
means you have to allocate them and then remember to free them, which is
tedious and error prone. Why not declare data like this

list<pairType> *data;

or even better like this

vector< list<pairType> > data;

then your lists will be created *automatically* you won't have to
allocate them, or free them.

You're flirting with the dark side, cease using pointers!

john
 
D

da.colonel

Great, I'll try that and give feedback whether that solved my problems!

I'm sorry for all those pointers... I'm more familiar with Java
programming and in C++ I always feel like I have to use those evil
pointers :)
 
D

da.colonel

Thank you very much, that solved my problems. Now the program is
working correctly!

But I have some questions about the changes, because I don't see why
this solution is working:

To create a pair, I use now:

data[k]->push_back(pairType(key, DataType()));

just as you suggested. But why does that work? Doesn't that create a
new pairType on the current method's local stack and so should be
invalid as soon as the method returns? That's why I made had a call to
'new' there.
 
J

John Harrison

Thank you very much, that solved my problems. Now the program is
working correctly!

But I have some questions about the changes, because I don't see why
this solution is working:

To create a pair, I use now:

data[k]->push_back(pairType(key, DataType()));

just as you suggested. But why does that work? Doesn't that create a
new pairType on the current method's local stack and so should be
invalid as soon as the method returns? That's why I made had a call to
'new' there.

No, because what goes into the the hash map is a copy of the object. The
original is destroyed but the copy lives on in your hash map.

Exactly the same thing happens when you called new, except that way the
copy goes into the hash map and the original lives on as a memory leak.

This is a fundanental difference between java and C++, in java all class
objects are references, when you copy then you are in effect copying a
pointer an object. But in C++ when you copy an object you now have two
objects, not two pointers to the same object. If you want to use the
buzz words you say that Java has 'reference sementics' but C++ has
'value semantics'.

For instance this would also be a perfectly correct way to code your program

DataType x;
pairType y(key, x);
data[k]->push_back(y);

The variable x and y are local, they do get destroyed, but it doesn't
matter, by the time they are destroyed copies of there values are safely
in your hash map.

john
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top