calloc and hash tables

ababeel · Mar 1, 2007

Hi
I am using a calloc in a hash table program on an openvms system. The
calloc is used to declare the hash table
char **pHashes;
pHashes = calloc(hash_size,sizeof(char *)); //hash_size = 101

and then later on when it is half full i do the following

char **newHashes;
if(newHashes)
free(newHashes);

newHashes = calloc(new_hash_size,sizeof(char *)); //new_hash_size =
hash_size * 2
// copy over the items to newHashes
free(pHashes)
pHashes = newHashes;

Now all works well except when hash_size is 101....I tried
hash_size=1009 and it works fine, basically if it doesnt go into the
newHashes calloc, it seems to work fine...and another weird thing is
that it doesnt crash in the hashtable program, rather later on it
crashes in an fopen. Yes I checked the return status of fopen and it
doesnt even get to that. Inside fopen it crashes...
I guess that second calloc is corrupting memory which later on causes
fopen to fail. Any help regarding how I should proceed with debugging
this would be greatly appreciated.
Thanks.

Ben Pfaff · Mar 1, 2007

ababeel said:
char **newHashes;
if(newHashes)
free(newHashes);

That's a bug. You can't use the value of a variable you haven't
initialized. You certainly can't free a pointer that has an
indeterminate value.

Perhaps your code has other bugs, but that one's obvious.

ababeel · Mar 1, 2007

Yes...
I changed that, removed the if(newHashes) bit...still the same
though...

user923005 · Mar 1, 2007

Yes...
I changed that, removed the if(newHashes) bit...still the same
though...

- Show quoted text -

If the routine is not too long (a couple hundred lines at most) then
post it.
We can't guess what you are doing wrong without seeing the code in
full.

Peter Nilsson · Mar 1, 2007

Please don't top post in comp.lang.c.

If the routine is not too long (a couple hundred lines at most)
then post it.
We can't guess what you are doing wrong without seeing the code
in full.

We don't need the code in full, just the smallest compilable
snippet that exhibits the problem. Getting that snippet means
work for the OP, but what most OP's don't realise is that the
exercise is not just a usenet courtesy, it's an extremely
useful debugging technique in it's own right.

ababeel · Mar 1, 2007

Please don't top post in comp.lang.c.

We don't need the code in full, just the smallest compilable
snippet that exhibits the problem. Getting that snippet means
work for the OP, but what most OP's don't realise is that the
exercise is not just a usenet courtesy, it's an extremely
useful debugging technique in it's own right.

char *hash_lookup(t_hash_ctrl *p_hash_ctrl, char * hash_key, size_t
createSize, boolean* p_bCreated)
{
size_t i;
long malloc_size = 0; // size to malloc for the data buffer
(when used_limit == used_items)
t_dataStruct * tempPtr = 0;
*p_bCreated = FALSE;
char **newHashes = 0;
unsigned long h = hash(hash_key, p_hash_ctrl->key_size);

for(i = h %(p_hash_ctrl->hash_size); p_hash_ctrl->p_hashes;
i==0 ? i = p_hash_ctrl->hash_size - 1: --i)
{

if(!memcmp(hash_key, p_hash_ctrl->p_hashes + p_hash_ctrl-

key_offset - 1, p_hash_ctrl->key_size))

Click to expand...

return p_hash_ctrl->p_hashes;
else
p_hash_ctrl->search_steps++;
}
// If we are only looking for an existing
// item then return that we haven't found it.
if(!createSize)
{
//printf("NO CREATE SIZE\n");
return NULL;
}
// If we have hit the hash table size limit
// then extend the hash table by a factor of 2.

ASSERT(p_hash_ctrl->used_limit < p_hash_ctrl->hash_size);

if(p_hash_ctrl->used_items == p_hash_ctrl->used_limit)
{
size_t new_hash_size = p_hash_ctrl->hash_size << 1; //
new_hash_size = hash_size * 2;
size_t new_used_limit = new_hash_size >> 1; //
new_used_limit = new_hash_size / 2;

newHashes = calloc(new_hash_size, sizeof(char *));
ASSERT(new_hash_size > p_hash_ctrl->hash_size);

if(!newHashes)
syntax("Unable to allocate memory for newHashes");

for(i=0; i < p_hash_ctrl->hash_size; i++)
if(p_hash_ctrl->p_hashes)
{
size_t j;
for(j = hash(p_hash_ctrl->p_hashes + p_hash_ctrl-

key_offset - 1, p_hash_ctrl->key_size) % new_hash_size; newHashes[j];

Click to expand...

j == 0 ? j = new_hash_size - 1 : --j)
{}
ASSERT(j>=0 && j<new_hash_size);
newHashes[j] = p_hash_ctrl->p_hashes;
}

// create a new data buffer
// and copy the address of the previous
// in the first prevAdd variable in hdr

malloc_size = ((new_used_limit - p_hash_ctrl->used_limit) *
p_hash_ctrl->data_size);
tempPtr = p_hash_ctrl->dataStruct;
p_hash_ctrl->dataStruct = malloc(sizeof(t_header) + malloc_size);
if(!p_hash_ctrl->dataStruct)
syntax("Could not allocate memory for the data buffer");
p_hash_ctrl->dataStruct->hdr.prevAdd = tempPtr;
p_hash_ctrl->dataStruct->hdr.data_items = 0;
p_hash_ctrl->dataStruct->p_data = (char *)p_hash_ctrl->dataStruct
+ sizeof(t_dataStruct);

free(p_hash_ctrl->p_hashes);
p_hash_ctrl->p_hashes = newHashes;
p_hash_ctrl->hash_size = new_hash_size;
p_hash_ctrl->used_limit = new_used_limit;

free(newHashes);

for(i = h % (p_hash_ctrl->hash_size); p_hash_ctrl->p_hashes;
i == 0 ? i = p_hash_ctrl->hash_size - 1 : --i)
{}

}
// Add new item to hash table.

*p_bCreated = TRUE;
p_hash_ctrl->p_hashes = &p_hash_ctrl->dataStruct-

p_data[p_hash_ctrl->dataStruct->hdr.data_items++ * p_hash_ctrl-
data_size];

Click to expand...

p_hash_ctrl->used_items++;
return p_hash_ctrl->p_hashes;
}

Apologies for the earlier "top post" I am still new to using groups
and not too familiar with etiquettes as yet, will get there ....The
code above is the routine due which the crash occurs, reproducing the
problem with a snippet of code is a bit difficult coz the problem
occurs in an update run which is a part of a big system....basically
this is where I narrowed the acc vio to. And as I said in my earlier
post hash table executes fine...causes some memory corruption which
later on causes the fopen error. It only happens if we get into the
newHashes calloc....

Flash Gordon · Mar 1, 2007

ababeel wrote, On 01/03/07 02:29:
Please do not quote sigs, the bit after the "-- " unless you are
commenting on them. Also, please snip any material not relevant to your
reply (leaving enough in for the post to make sense on its own). For
example, nothing before Peter's comment is relevant to what I'm saying,

char *hash_lookup(t_hash_ctrl *p_hash_ctrl, char * hash_key, size_t
createSize, boolean* p_bCreated)

Peter asked for a *compilable* snippet, without a definition of
t_hash_ctrl this is not compilable. The reason we want compilable is
that the first thing many of us do is run the code through a compiler
set to a high (or ridiculously high) warning level and possibly through
some other static analysis tools, this often throws up a number of
issues and so makes it easier for us. It also means we can do test
compiles of anything we suggest. Since your code is not compilable it
will certainly limit how much and how carefully I will look.

{
size_t i;
long malloc_size = 0; // size to malloc for the data buffer
(when used_limit == used_items)

Please do not use // style comments when posting. Not only are they not
in the most commonly implemented C standard (MS VC++ does not attempt to
implement the later standard that officially added them and gcc is not
yet fully conforming to the new standard) but even worse, they get line
wrapped as you can see making it far harder to read or compile your code.

t_dataStruct * tempPtr = 0;
*p_bCreated = FALSE;
char **newHashes = 0;
unsigned long h = hash(hash_key, p_hash_ctrl->key_size);

Make sure you have a prototype in scope for hash. If hash returns an
unsigned long (which I am guessing it does from context) then anything
can happen if no prototype is in scope.

for(i = h %(p_hash_ctrl->hash_size); p_hash_ctrl->p_hashes;
i==0 ? i = p_hash_ctrl->hash_size - 1: --i)
{

if(!memcmp(hash_key, p_hash_ctrl->p_hashes + p_hash_ctrl-

key_offset - 1, p_hash_ctrl->key_size))

Click to expand...

Please try to keep the line length down to something like 70 characters,
it makes it easier to read in a post since it avoids line wrapping at
inconvenient places.

return p_hash_ctrl->p_hashes;
else
p_hash_ctrl->search_steps++;
}
// If we are only looking for an existing
// item then return that we haven't found it.
if(!createSize)
{
//printf("NO CREATE SIZE\n");
return NULL;
}
// If we have hit the hash table size limit
// then extend the hash table by a factor of 2.

ASSERT(p_hash_ctrl->used_limit < p_hash_ctrl->hash_size);

Click to expand...

Would not returning an error condition make more sense than an assert?

if(p_hash_ctrl->used_items == p_hash_ctrl->used_limit)
{
size_t new_hash_size = p_hash_ctrl->hash_size << 1; //
new_hash_size = hash_size * 2;

Click to expand...

If you want to multiply by 2 then multiply by 2. Over 10 years ago
compilers would do this optimisation for you and it makes the code
easier to read.

1st rule of optimisation, don't do it.

Once you are obeying the 1st rule and have learnt enough we will tell
you the second rule.

Also, you should check for overflow (which is easy as you are doing
unsigned arithmetic).

size_t new_used_limit = new_hash_size >> 1; //
new_used_limit = new_hash_size / 2;

Click to expand...

So having multiplied by 2 using what you think (incorrectly) you now
divide by 2? Isn't that a bit daft?

newHashes = calloc(new_hash_size, sizeof(char *));
ASSERT(new_hash_size > p_hash_ctrl->hash_size);

if(!newHashes)
syntax("Unable to allocate memory for newHashes");

for(i=0; i < p_hash_ctrl->hash_size; i++)
if(p_hash_ctrl->p_hashes)
{
size_t j;
for(j = hash(p_hash_ctrl->p_hashes + p_hash_ctrl-

key_offset - 1, p_hash_ctrl->key_size) % new_hash_size; newHashes[j];

Click to expand...

j == 0 ? j = new_hash_size - 1 : --j)
{}
ASSERT(j>=0 && j<new_hash_size);
newHashes[j] = p_hash_ctrl->p_hashes;
}

// create a new data buffer
// and copy the address of the previous
// in the first prevAdd variable in hdr

malloc_size = ((new_used_limit - p_hash_ctrl->used_limit) *
p_hash_ctrl->data_size);
tempPtr = p_hash_ctrl->dataStruct;
p_hash_ctrl->dataStruct = malloc(sizeof(t_header) + malloc_size);

Click to expand...

Without the type definitions (which would make this code compilable) we
have no way of knowing if this is correct.

if(!p_hash_ctrl->dataStruct)
syntax("Could not allocate memory for the data buffer");
p_hash_ctrl->dataStruct->hdr.prevAdd = tempPtr;
p_hash_ctrl->dataStruct->hdr.data_items = 0;
p_hash_ctrl->dataStruct->p_data = (char *)p_hash_ctrl->dataStruct
+ sizeof(t_dataStruct);

Click to expand...

I am suspicious of the above line when compared to your malloc line. I
also suspect very strongly there is a better way to do things than using
a cast, but you do not provide sufficient information. However, I do
suggest you look up the struct hack, and I'm pretty sure that if you
read the comp.lang.c FAQ you will find it there somewhere (if not
reading the comp.lang.c FAQ will help you anyway).

free(p_hash_ctrl->p_hashes);
p_hash_ctrl->p_hashes = newHashes;
p_hash_ctrl->hash_size = new_hash_size;
p_hash_ctrl->used_limit = new_used_limit;

free(newHashes);

Click to expand...

OK, having just saved away a pointer to your allocated space you are
freeing it? That makes absolutely no sense.

for(i = h % (p_hash_ctrl->hash_size); p_hash_ctrl->p_hashes;

Click to expand...

You are now accessing the memory you have freed. BANG!

i == 0 ? i = p_hash_ctrl->hash_size - 1 : --i)
{}

}
// Add new item to hash table.

*p_bCreated = TRUE;
p_hash_ctrl->p_hashes = &p_hash_ctrl->dataStruct-

p_data[p_hash_ctrl->dataStruct->hdr.data_items++ * p_hash_ctrl-
data_size];

Click to expand...

p_hash_ctrl->used_items++;
return p_hash_ctrl->p_hashes;
}

Apologies for the earlier "top post" I am still new to using groups
and not too familiar with etiquettes as yet, will get there ....The
code above is the routine due which the crash occurs, reproducing the
problem with a snippet of code is a bit difficult coz the problem
occurs in an update run which is a part of a big system....basically
this is where I narrowed the acc vio to. And as I said in my earlier
post hash table executes fine...causes some memory corruption which
later on causes the fopen error.

Click to expand...

If it "causes some memory corruption" then it does not "executes fine".
Anything that causes *any* memory corruption is extremely broken can
could cause *anything* to happen.

> It only happens if we get into the
newHashes calloc....

Click to expand...

I suspect you have more problems that I have pointed out, but since your
code was *not* compilable due to lack of definitions you will not get
more from me.

Markus Becker · Mar 1, 2007

Flash Gordon said:
Please do not quote sigs, the bit after the "-- " unless you are

There was no "-- ", just a "--"

Markus

Flash Gordon · Mar 1, 2007

Markus Becker wrote, On 01/03/07 05:15:

There was no "-- ", just a "--"

Well, it should still not be quoted since it was not being commented on,
and the person who missed off the trailing space should try in future to
put it in since the convention is to use "-- " as the sig. sep.

user923005 · Mar 1, 2007

Your code should actually compile and demonstrate the problem.
The code you provided could in no way compile.
Also, you should provide the struct definitions that are missing.
You were missing header files (at least):
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h> /* Assuming your screaming ASSERT() somehow maps
to assert() */

ababeel · Mar 1, 2007

ababeel wrote, On 01/03/07 02:29:

^^^^^^^^^^

Click to expand...

Please do not quote sigs, the bit after the "-- " unless you are
commenting on them. Also, please snip any material not relevant to your
reply (leaving enough in for the post to make sense on its own). For
example, nothing before Peter's comment is relevant to what I'm saying,

char *hash_lookup(t_hash_ctrl *p_hash_ctrl, char * hash_key, size_t
createSize, boolean* p_bCreated)

Click to expand...

Peter asked for a *compilable* snippet, without a definition of
t_hash_ctrl this is not compilable. The reason we want compilable is
that the first thing many of us do is run the code through a compiler
set to a high (or ridiculously high) warning level and possibly through
some other static analysis tools, this often throws up a number of
issues and so makes it easier for us. It also means we can do test
compiles of anything we suggest. Since your code is not compilable it
will certainly limit how much and how carefully I will look.

{
size_t i;
long malloc_size = 0; // size to malloc for the data buffer
(when used_limit == used_items)

Click to expand...

Please do not use // style comments when posting. Not only are they not
in the most commonly implemented C standard (MS VC++ does not attempt to
implement the later standard that officially added them and gcc is not
yet fully conforming to the new standard) but even worse, they get line
wrapped as you can see making it far harder to read or compile your code.

t_dataStruct * tempPtr = 0;
*p_bCreated = FALSE;
char **newHashes = 0;
unsigned long h = hash(hash_key, p_hash_ctrl->key_size);

Click to expand...

Make sure you have a prototype in scope for hash. If hash returns an
unsigned long (which I am guessing it does from context) then anything
can happen if no prototype is in scope.

for(i = h %(p_hash_ctrl->hash_size); p_hash_ctrl->p_hashes;
i==0 ? i = p_hash_ctrl->hash_size - 1: --i)
{

Click to expand...

if(!memcmp(hash_key, p_hash_ctrl->p_hashes + p_hash_ctrl-

key_offset - 1, p_hash_ctrl->key_size))

Click to expand...

Click to expand...

Please try to keep the line length down to something like 70 characters,
it makes it easier to read in a post since it avoids line wrapping at
inconvenient places.

return p_hash_ctrl->p_hashes;
else
p_hash_ctrl->search_steps++;
}
// If we are only looking for an existing
// item then return that we haven't found it.
if(!createSize)
{
//printf("NO CREATE SIZE\n");
return NULL;
}
// If we have hit the hash table size limit
// then extend the hash table by a factor of 2.

Click to expand...

ASSERT(p_hash_ctrl->used_limit < p_hash_ctrl->hash_size);

Click to expand...

Would not returning an error condition make more sense than an assert?

if(p_hash_ctrl->used_items == p_hash_ctrl->used_limit)
{
size_t new_hash_size = p_hash_ctrl->hash_size << 1; //
new_hash_size = hash_size * 2;

Click to expand...

If you want to multiply by 2 then multiply by 2. Over 10 years ago
compilers would do this optimisation for you and it makes the code
easier to read.

1st rule of optimisation, don't do it.

Once you are obeying the 1st rule and have learnt enough we will tell
you the second rule.

Also, you should check for overflow (which is easy as you are doing
unsigned arithmetic).

size_t new_used_limit = new_hash_size >> 1; //
new_used_limit = new_hash_size / 2;

Click to expand...

So having multiplied by 2 using what you think (incorrectly) you now
divide by 2? Isn't that a bit daft?

newHashes = calloc(new_hash_size, sizeof(char *));
ASSERT(new_hash_size > p_hash_ctrl->hash_size);

Click to expand...

if(!newHashes)
syntax("Unable to allocate memory for newHashes");

Click to expand...

for(i=0; i < p_hash_ctrl->hash_size; i++)
if(p_hash_ctrl->p_hashes)
{
size_t j;
for(j = hash(p_hash_ctrl->p_hashes + p_hash_ctrl-

key_offset - 1, p_hash_ctrl->key_size) % new_hash_size; newHashes[j];

Click to expand...

j == 0 ? j = new_hash_size - 1 : --j)
{}
ASSERT(j>=0 && j<new_hash_size);
newHashes[j] = p_hash_ctrl->p_hashes;
}

Click to expand...

// create a new data buffer
// and copy the address of the previous
// in the first prevAdd variable in hdr

Click to expand...

malloc_size = ((new_used_limit - p_hash_ctrl->used_limit) *
p_hash_ctrl->data_size);
tempPtr = p_hash_ctrl->dataStruct;
p_hash_ctrl->dataStruct = malloc(sizeof(t_header) + malloc_size);

Click to expand...

Without the type definitions (which would make this code compilable) we
have no way of knowing if this is correct.

if(!p_hash_ctrl->dataStruct)
syntax("Could not allocate memory for the data buffer");
p_hash_ctrl->dataStruct->hdr.prevAdd = tempPtr;
p_hash_ctrl->dataStruct->hdr.data_items = 0;
p_hash_ctrl->dataStruct->p_data = (char *)p_hash_ctrl->dataStruct
+ sizeof(t_dataStruct);

Click to expand...

I am suspicious of the above line when compared to your malloc line. I
also suspect very strongly there is a better way to do things than using
a cast, but you do not provide sufficient information. However, I do
suggest you look up the struct hack, and I'm pretty sure that if you
read the comp.lang.c FAQ you will find it there somewhere (if not
reading the comp.lang.c FAQ will help you anyway).

free(p_hash_ctrl->p_hashes);
p_hash_ctrl->p_hashes = newHashes;
p_hash_ctrl->hash_size = new_hash_size;
p_hash_ctrl->used_limit = new_used_limit;

Click to expand...

free(newHashes);

Click to expand...

OK, having just saved away a pointer to your allocated space you are
freeing it? That makes absolutely no sense.

for(i = h % (p_hash_ctrl->hash_size); p_hash_ctrl->p_hashes;

Click to expand...

You are now accessing the memory you have freed. BANG!

i == 0 ? i = p_hash_ctrl->hash_size - 1 : --i)
{}

Click to expand...

}
// Add new item to hash table.

Click to expand...

*p_bCreated = TRUE;
p_hash_ctrl->p_hashes = &p_hash_ctrl->dataStruct-

p_data[p_hash_ctrl->dataStruct->hdr.data_items++ * p_hash_ctrl-
data_size];

Click to expand...

p_hash_ctrl->used_items++;
return p_hash_ctrl->p_hashes;
}

Click to expand...

Apologies for the earlier "top post" I am still new to using groups
and not too familiar with etiquettes as yet, will get there ....The
code above is the routine due which the crash occurs, reproducing the
problem with a snippet of code is a bit difficult coz the problem
occurs in an update run which is a part of a big system....basically
this is where I narrowed the acc vio to. And as I said in my earlier
post hash table executes fine...causes some memory corruption which
later on causes the fopen error.

Click to expand...

If it "causes some memory corruption" then it does not "executes fine".
Anything that causes *any* memory corruption is extremely broken can
could cause *anything* to happen.

It only happens if we get into the

Click to expand...

newHashes calloc....

Click to expand...

I suspect you have more problems that I have pointed out, but since your
code was *not* compilable due to lack of definitions you will not get
more from me.
--
Flash Gordon- Hide quoted text -

- Show quoted text -- Hide quoted text -

- Show quoted text -

The multiply and divide by 2 which u thought was a bit daft, one of
them actually is a comment. But I understand your point about
optimization and comments wrapped around making code hard to read.
Also ASSERT is actually a macro which provides an error statement.
I will try and strip unnecessary stuff and get a small compiling
version which crashes and post it.
Thanks

Peter Nilsson · Mar 2, 2007

Markus Becker said:
There was no "-- ", just a "--"

I'm afraid that's another Google 'feature'. [It was even missing
from the 'source' view.] Bare in mind though that the extra space
is just a marker for automated detection and snippage of signatures.
There's nothing preventing a human from parsing the obvious and
acting appropriately.

CBFalconer · Mar 2, 2007

.... snip about 220 lines of useless text ...

The multiply and divide by 2 which u thought was a bit daft, one
of them actually is a comment. But I understand your point about
optimization and comments wrapped around making code hard to read.
Also ASSERT is actually a macro which provides an error statement.
I will try and strip unnecessary stuff and get a small compiling
version which crashes and post it.

Please snip any quoted material not relevant to your reply.

Flash Gordon · Mar 2, 2007

CBFalconer wrote, On 02/03/07 02:16:

ababeel wrote:
... snip about 220 lines of useless text ...

<snip>

Hey, I resemble that remark! I had one line of useful text to it was
only 219 lines of useless text! ;-)

Please snip any quoted material not relevant to your reply.

This I agree with.

Calloc arguments order	21	Aug 19, 2012
calloc interposing issue	6	Feb 22, 2008
malloc vs calloc	47	Feb 2, 2007
calloc	5	Sep 11, 2003
Windows LLDP Driver Responds With No Data	0	Mar 17, 2023
Different sizes of data and function pointers on a machine -- void*return type of malloc, calloc, an	23	Jun 25, 2012
calloc / free	11	Jun 27, 2003
calloc crashing -- dump core	18	Dec 4, 2007

calloc and hash tables

ababeel

Ben Pfaff

ababeel

user923005

Peter Nilsson

ababeel

Flash Gordon

Markus Becker

Flash Gordon

user923005

ababeel

Peter Nilsson

CBFalconer

Flash Gordon

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads