C
chrisbazley
Hello,
I wasted a lot of time yesterday writing some code which manages a
collection of strings within a single heap block allocated by a
function similar to 'malloc'. A separate array of structs includes
members which point to the start of each string. My intention is to
replace existing (working) code where each string is held in a
separate heap block, to reduce a high turnover of blocks.
When replacing existing strings with longer strings, or adding strings
to the collection, the heap block containing the strings must be
extended using a function like 'realloc'. However, 'realloc' may move
the base address of the block, which invalidates the pointers to the
start of each string. I don't really want to replace all my 'char *'
members with 'ptrdiff_t' (offset from start of heap block containing
strings) because of the loss of type-safety and additional cost of
accessing the strings.
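For reference, the offset-based alternative I would rather avoid looks roughly like the sketch below. The names 'struct obj' and 'obj_string' are invented for illustration and are not taken from my real code; it assumes every offset lies within the heap block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct: stores an offset instead of a 'char *' */
struct obj
{
  ptrdiff_t offset; /* distance of the string from the start of the block */
};

/* Hypothetical accessor: turn a stored offset back into a pointer.
   The result is only usable until the block is next resized. */
static char *obj_string(char *strings, const struct obj *obj)
{
  assert(strings != NULL);
  assert(obj != NULL);
  return strings + obj->offset;
}
```

Every access to a string then has to go through something like 'obj_string', and nothing stops an offset being paired with the wrong heap block, which is the extra cost and loss of type-safety I mean.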
I thought I could solve my problem by adding a relocation offset to
each pointer immediately after calling 'realloc', derived by
subtracting the old from the new address of the heap block. However,
turning to the appendix of my copy of Kernighan & Ritchie, I
discovered - to my dismay - that the result of subtracting one pointer
from another is undefined unless both point to objects within the same
array.
Presumably that means the following code would have undefined effects:
  unsigned int i;
  char *strings, *new_strings;
  ptrdiff_t relocate;
  struct
  {
    char *string;
  }
  objs[10];

  /* Resize heap block containing strings */
  new_strings = realloc(strings, new_size);
  if (new_strings == NULL)
  {
    /* ...handle error... */
  }

  /* Relocate pointers to the start of each string */
  relocate = new_strings - strings;
  for (i = 0; i < sizeof(objs) / sizeof(objs[0]); i++)
  {
    objs[i].string += relocate;
  }
  strings = new_strings;
Can anyone think of a machine architecture where the above code would
not work? It seems likely to be a common idiom, even if not strictly
legal.
It occurs to me that I could circumvent the K&R restriction by
utilising the old address of the heap block within my relocation
loop:
  for (i = 0; i < sizeof(objs) / sizeof(objs[0]); i++)
  {
    objs[i].string = new_strings + (objs[i].string - strings);
  }
However, given that 'strings' is no longer a valid pointer, is this
version any less reprehensible than my original code?
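For comparison, here is a sketch of a third variant that never touches the old address after 'realloc': it records each string's offset while the old block is still valid, then rebuilds the pointers from the new base. The helper name 'resize_strings', the 'NUM_OBJS' macro and the temporary 'offsets' array are my own inventions for illustration. It assumes every 'string' member points into the block and that the block is not shrunk below any of those offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_OBJS 10 /* hypothetical upper bound on the number of objects */

struct obj
{
  char *string;
};

/* Hypothetical helper: resize the block at *strings to new_size bytes and
   re-point each objs[i].string at the same offset within the resized block.
   Returns 0 on success, or -1 if 'realloc' fails (nothing is changed). */
static int resize_strings(char **strings, size_t new_size,
                          struct obj objs[], size_t num_objs)
{
  ptrdiff_t offsets[NUM_OBJS];
  char *new_strings;
  size_t i;

  assert(num_objs <= NUM_OBJS);

  /* Both operands still point into the same live block at this stage */
  for (i = 0; i < num_objs; i++)
  {
    offsets[i] = objs[i].string - *strings;
  }

  new_strings = realloc(*strings, new_size);
  if (new_strings == NULL)
  {
    return -1; /* the old block and all pointers into it remain valid */
  }

  /* Only the new base address and plain integers are used from here on */
  for (i = 0; i < num_objs; i++)
  {
    objs[i].string = new_strings + offsets[i];
  }
  *strings = new_strings;
  return 0;
}
```

The price is a temporary array and a second pass over the objects, which is part of what I was hoping to avoid.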
I look forward to your comments.
TIA,