General method for dynamically allocating memory for a string

smnoff

I have searched the internet for malloc and dynamic malloc; however, I still
don't know or readily see what the general way is to allocate memory for a
char * variable to which I want to assign a substring that I found inside of
a string.

Any ideas?
 
Ben Pfaff

smnoff said:
I have searched the internet for malloc and dynamic malloc; however, I still
don't know or readily see what the general way is to allocate memory for a
char * variable to which I want to assign a substring that I found inside of
a string.

If you want to allocate memory for a string, pass the number of
non-null characters in it, plus one, to malloc, then copy the
string into it, being sure to null-terminate the result.

If you can be more specific about what you're trying to do,
perhaps we can help with some details.
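
A minimal sketch of the recipe above, assuming the source is an ordinary null-terminated string. The name dup_string is made up for illustration and does not come from this thread; the caller is assumed to free() the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* dup_string is a hypothetical helper name (an assumption, not thread code).
 * Returns a freshly malloc'ed copy of s, or NULL if malloc fails.
 * The caller owns the result and must free() it. */
char *dup_string(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);          /* precondition: s points to a C string */

    len = strlen(s);            /* number of non-null characters */
    copy = malloc(len + 1);     /* plus one for the terminator */
    if (copy != NULL) {
        memcpy(copy, s, len);   /* copy the characters */
        copy[len] = '\0';       /* null-terminate the result */
    }
    return copy;
}
```

For a substring the same pattern applies, with the substring's length in place of strlen(s) and the copy starting at the substring's first character.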
 
Frederick Gotham

smnoff posted:
I have searched the internet for malloc and dynamic malloc; however, I still
don't know or readily see what the general way is to allocate memory for a
char * variable to which I want to assign a substring that I found inside of
a string.

Any ideas?


Unchecked code, may contain an error or two:

#include <stdlib.h>
#include <stddef.h>
#include <assert.h>

char *to_release;

void ReleaseLastString(void)
{
    free(to_release);
}

char const *CreateSubstring(char const *const p,
                            size_t const istart,size_t const iend)
{
    int assert_dummy = (assert(!!p),assert(!!istart),assert(!!iend),0);

    if(!p[iend+1]) return to_release = 0, p + istart;

    to_release = malloc(iend - istart + 1);

    memcpy(to_release,p,iend - istart);

    to_release[iend - istart] = 0;

    return to_release;
}

int main()
{
    puts(CreateSubstring("abcdefghijklmnop",2,7));
    ReleaseLastString();

    puts(CreateSubstring("abcdefghijklmnop",4,15));
    ReleaseLastString();
}
 
smnoff

Frederick Gotham said:
smnoff posted:



Unchecked code, may contain an error or two:

#include <stdlib.h>
#include <stddef.h>
#include <assert.h>

char *to_release;

void ReleaseLastString(void)
{
free(to_release);
}

char const *CreateSubstring(char const *const p,
size_t const istart,size_t const iend)
{
int assert_dummy = (assert(!!p),assert(!!istart),assert(!!iend),0);



Why are there double exclamation marks in the line shown above?
 
Randall

Frederick's code is hard to read and harder to learn from. Here is
some fully checked code with lots of comments.


///////////////////////////////////////////////////////////////////////////////
// file: substr.c
// Note: C++ style comments are allowed for C99 compliant compilers.
///////////////////////////////////////////////////////////////////////////////
#include <stdlib.h> // for malloc() and free()
#include <string.h> // for strncpy()
#include <sys/types.h> // for size_t
#include <stdio.h> // for printf()

char * substr( char * string, size_t start, size_t end);

int main( void ) {

char *str1 = substr("abcdefghijklmnop",2,7);
char *str2 = substr("abcdefghijklmnop",4,15);

if( str1 != NULL ) {
printf( "str1: %s\n", str1);
} else {
// Setting a null pointer to zero ensures you
// can delete it more than once (free) without
// undefined behavior. This is a good
// programming habit.
str1 = 0;
}

if( str2 != NULL ) {
printf( "str2: %s\n", str2 );
} else {
// Setting a null pointer to zero ensures you
// can delete it more than once (free) without
// undefined behavior. This is a good
// programming habit.
str2 = 0;
}

free( str1 );
free( str2 );

return 0;
}

/**
* Note: this function will return a newly allocated string. It
* is your responsibility to delete this memory to prevent a leak.
*
* param "string" - the string you want to extract a substring from.
* param "start" - the array index to begin your substring.
* param "end" - the array index to terminate your substring.
*
* On Error: this function returns null;
*/
char * substr( char * string, size_t start, size_t end) {
// pointer to the substring on the heap
char *subString;

// calculate the total amount of memory needed
// to hold the substring.
// Algo: end - start + null terminator
size_t subStringSize = end - start + 1;

// request enough bytes to store the entire
// substring and null terminator.
subString = malloc( subStringSize );

// test to make sure we got the memory
// from malloc
if( subString != NULL ) {
// Note this copies one extra byte (the
// null terminator's spot) which is garbage.
// We have to terminate this string.
strncpy( subString, string + start, subStringSize );

subString[subStringSize] = '\0';
}

// This will either be NULL if we didn't get the
// memory from malloc or it will have our substring.
return subString;

} // end function substr


I hope this helps you.

-Randall
 
Keith Thompson

smnoff said:
Why are there double exclamation marks in the line shown above?

The assert() macro doesn't necessarily accept an argument of a type
other than int. (It does in C99, but not all compilers support C99.)

The ! (logical not) operator, applied to any scalar operand, yields
the int value 1 if the operand compares equal to 0, 0 if it doesn't.
A pointer value compares equal to 0 only if it's a null pointer. So
!p means "p is a null pointer". Applying it a second time reverses
the result, so !!p means "p is not a null pointer". !! normalizes a
scalar value, mapping 0 to 0 and anything else to 1.

So the declaration asserts that p is non-null and that istart and iend
are both non-zero.
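
To make the normalization concrete, here is a small sketch. The helper names is_set and is_non_null are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* is_set is a hypothetical helper: whatever int it is given,
 * the result is exactly 0 or 1. */
int is_set(int value)
{
    return !!value;     /* !value is 1 for zero, so !!value is 0 for zero */
}

/* The same idea applied to a pointer: 1 if p is non-null, 0 if it is null.
 * The result has type int, which even a C90 assert() accepts. */
int is_non_null(const void *p)
{
    return !!p;
}
```

So is_set(42) and is_set(-7) both give 1, while is_set(0) gives 0.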
 
jaysome

Randall said:
Frederick's code is hard to read and harder to learn from. Here is
some fully checked code with lots of comments.


///////////////////////////////////////////////////////////////////////////////
// file: substr.c
// Note: C++ style comments are allowed for C99 compliant compilers.
///////////////////////////////////////////////////////////////////////////////
#include <stdlib.h> // for malloc() and free()
#include <string.h> // for strncpy()
#include <sys/types.h> // for size_t
#include <stdio.h> // for printf()

char * substr( char * string, size_t start, size_t end);

int main( void ) {

char *str1 = substr("abcdefghijklmnop",2,7);
char *str2 = substr("abcdefghijklmnop",4,15);

if( str1 != NULL ) {
printf( "str1: %s\n", str1);
} else {
// Setting a null pointer to zero ensures you
// can delete it more than once (free) without
// undefined behavior. This is a good
// programming habit.
str1 = 0;
}

if( str2 != NULL ) {
printf( "str2: %s\n", str2 );
} else {
// Setting a null pointer to zero ensures you
// can delete it more than once (free) without
// undefined behavior. This is a good
// programming habit.
str2 = 0;
}

free( str1 );
free( str2 );

return 0;
}

/**
* Note: this function will return a newly allocated string. It
* is your responsibility to delete this memory to prevent a leak.
*
* param "string" - the string you want to extract a substring from.
* param "start" - the array index to begin your substring.
* param "end" - the array index to terminate your substring.
*
* On Error: this function returns null;
*/
char * substr( char * string, size_t start, size_t end) {
// pointer to the substring on the heap
char *subString;

// calculate the total amount of memory needed
// to hold the substring.
// Algo: end - start + null terminator
size_t subStringSize = end - start + 1;

// request enough bytes to store the entire
// substring and null terminator.
subString = malloc( subStringSize );

Make that:

subString = malloc( subStringSize + 1 );
// test to make sure we got the memory
// from malloc
if( subString != NULL ) {
// Note this copies one extra byte (the
// null terminator's spot) which is garbage.
// We have to terminate this string.
strncpy( subString, string + start, subStringSize );

subString[subStringSize] = '\0';

Otherwise this indexes one past the end of malloc()ed memory, which is
undefined behavior.

Best regards
 
Richard Heathfield

Randall said:
Frederick's code is hard to read and harder to learn from. Here is
some fully checked code with lots of comments.

It didn't compile for me, so I just took a very quick look, and one thing
caught my eye:
if( str1 != NULL ) {
printf( "str1: %s\n", str1);
} else {
// Setting a null pointer to zero ensures you
// can delete it more than once (free) without
// undefined behavior. This is a good
// programming habit.
str1 = 0;
}

This is exactly equivalent to:

if(str1 != NULL) {
printf("str1: %s\n", str1);
}

The else is utterly unnecessary because, if str1 is not a null pointer, it
is never entered, and if it /is/ a null pointer, setting it to a null
pointer value is a redundant operation.
if( str2 != NULL ) {
printf( "str2: %s\n", str2 );
} else {

Likewise.
 
Frederick Gotham

Randall posted:
#include <stdlib.h> // for malloc() and free()
#include <string.h> // for strncpy()
#include <sys/types.h> // for size_t



Non-standard header. "stddef.h" contains "size_t".


#include <stdio.h> // for printf()

char * substr(char * string, size_t start, size_t end);



First parameter should be "pointer to const".


int main( void ) {

char *str1 = substr("abcdefghijklmnop",2,7);
char *str2 = substr("abcdefghijklmnop",4,15);



Both variables should be const. (You don't want the address to change
before you pass it to "free".)


if( str1 != NULL ) {
printf( "str1: %s\n", str1);
} else {
// Setting a null pointer to zero ensures you
// can delete it more than once (free) without
// undefined behavior. This is a good
// programming habit.
str1 = 0;
}

if( str2 != NULL ) {
printf( "str2: %s\n", str2 );
} else {
// Setting a null pointer to zero ensures you
// can delete it more than once (free) without
// undefined behavior. This is a good
// programming habit.
str2 = 0;
}



As Richard Heathfield pointed out, both "else" clauses are redundant.


free( str1 );
free( str2 );

return 0;
}

/**
* Note: this function will return a newly allocated string. It
* is your responsibility to delete this memory to prevent a leak.
*
* param "string" - the string you want to extract a substring from.
* param "start" - the array index to begin your substring.
* param "end" - the array index to terminate your substring.
*
* On Error: this function returns null;
*/
char * substr( char * string, size_t start, size_t end) {



First parameter should be "pointer to const".


// pointer to the substring on the heap
char *subString;

// calculate the total amount of memory needed
// to hold the substring.
// Algo: end - start + null terminator
size_t subStringSize = end - start + 1;



Perhaps you should have written:

size_t len = end - start;

And then added the +1 only when malloc'ing.


// request enough bytes to store the entire
// substring and null terminator.
subString = malloc( subStringSize );



subString = malloc(len+1);


// test to make sure we got the memory
// from malloc
if( subString != NULL ) {
// Note this copies one extra byte (the
// null terminator's spot) which is garbage.
// We have to terminate this string.
strncpy( subString, string + start, subStringSize );



"memcpy" would be more efficient if you were strict about the length of
the string, e.g.:

assert( strlen(string) - start >= len );


subString[subStringSize] = '\0';



Here you invoke undefined behaviour by writing past the end of the buffer.

subString[len] = 0;

You should _always_ check array indexes.
 
Richard Bos

Frederick Gotham said:
Randall posted:


Non-standard header. "stddef.h" contains "size_t".

.... <stdio.h>.

It is not often necessary to #include <stddef.h>. It is very rarely
necessary to do so just for the declaration of size_t. All headers whose
functions need it also declare it themselves, and it's a rare program
that does not use one of those headers.
First parameter should be "pointer to const".

No, it shouldn't, because that's not how the function is defined. It may
be a wise idea to _define_ it as a pointer to const char, or even a
const pointer to const char, and then also declare it as such, but
"should" is too strong.
As Richard Heathfield pointed out, both "else" clauses are redundant.

The reasoning given in the comment is also bogus. It is a very _bad_
programming habit to start expecting that you can free pointers twice.
Perhaps you should have written:

size_t len = end - start;

And then added the +1 only when malloc'ing.

Why?

Richard
 
Ben Pfaff

Richard Bos said:
The reasoning given in the comment is also bogus. It is a very _bad_
programming habit to start expecting that you can free pointers twice.

You can free a null pointer any number of times you like. I
think that is what the "it" in "delete it more than once" means.
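
One place where that guarantee is handy is a cleanup path that frees every pointer unconditionally. A rough sketch, with a hypothetical function name make_pair that is not from the thread:

```c
#include <assert.h>
#include <stdlib.h>

/* make_pair is a hypothetical helper (an assumption for illustration).
 * It allocates two buffers of n bytes each. On success it stores them in
 * *a and *b and returns 1. If either allocation fails it frees both and
 * returns 0, leaving *a and *b untouched. */
int make_pair(size_t n, char **a, char **b)
{
    char *x;
    char *y;

    assert(a != NULL && b != NULL && n > 0);

    x = malloc(n);
    y = malloc(n);
    if (x == NULL || y == NULL) {
        free(x);    /* either pointer may be null here; */
        free(y);    /* free() on a null pointer does nothing */
        return 0;
    }

    *a = x;
    *b = y;
    return 1;
}
```

Note that this only covers null pointers. Freeing the same non-null pointer value twice is still undefined.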
 
websnarf

smnoff said:
I have searched the internet for malloc and dynamic malloc; however, I still
don't know or readily see what the general way is to allocate memory for a
char * variable to which I want to assign a substring that I found inside of
a string.

The C language makes this a bigger pain in the ass than it needs to be,
as the discussion that has followed shows. You can see how this is
done more easily in the Better String Library which you can get here:
http://bstring.sf.net/

If you just want to do it directly then:

#include <stdlib.h>
#include <string.h>

char * substralloc (const char * src, size_t pos, size_t len) {
    size_t i;
    char * substr;
    if (NULL == src || len < pos) return NULL; /* Bad parameters */
    for (len += pos, i = 0; i < len; i++) {
        if ('\0' == src[i]) {
            if (i < pos) i = pos;
            break;
        }
    }
    i -= pos;
    if (NULL != (substr = (char *) malloc ((1 + i) * sizeof (char)))) {
        if (i) memcpy (substr, src+pos, i);
        substr[i] = '\0';
    }
    return substr;
}

So if there is an error in the meaning of the parameters or if there is
a memory allocation failure then NULL is returned. If you ask for a
substring which is beyond the end of the source string, or whose length
exceeds the end of the source string, then the result is truncated.
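
The truncation rule in that last paragraph can also be written with strlen() and two clamps. This is a separate sketch of the same behaviour, not the function above; the name substr_clamped is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* substr_clamped is a hypothetical name (an assumption for illustration).
 * Returns a malloc'ed copy of at most len characters of src, starting at
 * index pos. Requests that run past the end of src are truncated.
 * Returns NULL if src is NULL or if malloc fails. */
char *substr_clamped(const char *src, size_t pos, size_t len)
{
    size_t srclen;
    size_t avail;
    char *out;

    if (src == NULL)
        return NULL;                /* bad parameter */

    srclen = strlen(src);
    if (pos > srclen)
        pos = srclen;               /* start beyond the end: empty result */
    avail = srclen - pos;           /* characters available from pos */
    if (len > avail)
        len = avail;                /* request runs past the end: truncate */

    assert(pos + len <= srclen);    /* the copy stays inside src */

    out = malloc(len + 1);          /* +1 for the terminator */
    if (out != NULL) {
        memcpy(out, src + pos, len);
        out[len] = '\0';
    }
    return out;
}
```

With this version, asking for 10 characters at position 4 of "abcdef" yields "ef", and any position past the end yields an empty string.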
 
websnarf

Randall said:
Frederick's code is hard to read and harder to learn from. Here is
some fully checked code with lots of comments.

///////////////////////////////////////////////////////////////////////////////
// file: substr.c
// Note: C++ style comments are allowed for C99 compliant compilers.
///////////////////////////////////////////////////////////////////////////////

Note: C99 compliant compilers are in very short supply. (Though the
number of compilers that *claim* to be C99 compilers is a little higher,
and the number that accept // comments is higher still.)
#include <stdlib.h> // for malloc() and free()
#include <string.h> // for strncpy()
#include <sys/types.h> // for size_t
#include <stdio.h> // for printf()

[...]

/**
* Note: this function will return a newly allocated string. It
* is your responsibility to delete this memory to prevent a leak.
*
* param "string" - the string you want to extract a substring from.
* param "start" - the array index to begin your substring.
* param "end" - the array index to terminate your substring.
*
* On Error: this function returns null;
*/
char * substr( char * string, size_t start, size_t end) {
// pointer to the substring on the heap
char *subString;

// calculate the total amount of memory needed
// to hold the substring.
// Algo: end - start + null terminator
size_t subStringSize = end - start + 1;

// request enough bytes to store the entire
// substring and null terminator.
subString = malloc( subStringSize );

// test to make sure we got the memory
// from malloc
if( subString != NULL ) {
// Note this copies one extra byte (the
// null terminator's spot) which is garbage.
// We have to terminate this string.
strncpy( subString, string + start, subStringSize );

This leads to UB, since you have not established that string + start is
a valid thing to be pointing at.
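
One way to close that hole is to check the range against strlen() before doing any pointer arithmetic. The sketch below assumes that end is exclusive (so the length is end - start) and that an out-of-range request should fail rather than be truncated. The name substr_checked is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* substr_checked is a hypothetical name (an assumption for illustration).
 * Returns a malloc'ed copy of string[start] .. string[end - 1].
 * Returns NULL if string is NULL, if the range does not lie inside the
 * string, or if malloc fails. */
char *substr_checked(const char *string, size_t start, size_t end)
{
    size_t len;
    size_t sublen;
    char *out;

    if (string == NULL)
        return NULL;

    len = strlen(string);
    if (start > end || end > len)
        return NULL;                /* string + start would not be valid */

    sublen = end - start;
    assert(sublen <= len);          /* follows from the range check above */

    out = malloc(sublen + 1);       /* +1 for the terminator */
    if (out != NULL) {
        memcpy(out, string + start, sublen);
        out[sublen] = '\0';
    }
    return out;
}
```

Because the length is known to be in range, memcpy() can replace strncpy() here.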
 
websnarf

Frederick said:
smnoff posted:

Unchecked code, may contain an error or two:

(At least ...)
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>

char *to_release;

Are you seriously making this global?
void ReleaseLastString(void)
{
free(to_release);
}

char const *CreateSubstring(char const *const p,
size_t const istart,size_t const iend)
{
int assert_dummy = (assert(!!p),assert(!!istart),assert(!!iend),0);

if(!p[iend+1]) return to_release = 0, p + istart;

Homiesaywhat? What the hell is that if() condition trying to do? And
what is p+istart trying to do? Are you trying to say p += istart?
to_release = malloc(iend - istart + 1);

Ok, so what if istart > iend? And what if to_release contained some
unfreed contents before this call -- wouldn't that lead to a leak?
memcpy(to_release,p,iend - istart);

What if to_release was set to NULL (because of a memory allocation
failure)?
to_release[iend - istart] = 0;

return to_release;
}
 
websnarf

#include <stdlib.h>
#include <string.h>

char * substralloc (const char * src, size_t pos, size_t len) {
size_t i;
char * substr;
if (NULL == src || len < pos) return NULL; /* Bad parameters */

Whoops! That should just be if (NULL == src) return NULL;
 
Ancient_Hacker

smnoff said:
I have searched the internet for malloc and dynamic malloc; however, I still
don't know or readily see what the general way is to allocate memory for a
char * variable to which I want to assign a substring that I found inside of
a string.

Any ideas?

It would help if you could tell us a bit more about the strings:

(1) Are we allowed to mangle the source string?

(2) Can we assume the source string is going to exist for as long as
you need the substring?

(3) How many substrings do you need at any one instant? How long are
they on average?

-----
You see most malloc()'s have considerable overhead. If we can mangle
the source string, we can return the substring as a pointer into the
source string, with a zero byte at the end of the substring. Which is
economical, unless you need the original string or maybe another
substring might match at substring end + 1?

If the lifetime of the substrings can be determined, it might be
simpler, faster, and easier to just have a small array of substrings
rather than allocate each darn one on the heap.
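
A rough sketch of the "mangle the source" option, assuming the buffer is writable (not a string literal), that end is exclusive, and that it outlives the substring. The name cut_in_place is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* cut_in_place is a hypothetical helper (an assumption for illustration).
 * It writes a zero byte into buf at index end and returns a pointer to
 * index start, so nothing is allocated. Whatever followed index end is no
 * longer reachable as part of the string. Returns NULL if the range does
 * not lie inside the string. */
char *cut_in_place(char *buf, size_t start, size_t end)
{
    size_t len;

    assert(buf != NULL);        /* precondition: buf is a writable C string */

    len = strlen(buf);
    if (start > end || end > len)
        return NULL;

    buf[end] = '\0';            /* terminate the substring inside the source */
    return buf + start;         /* the substring begins here */
}
```

The returned pointer must not be passed to free(), since it points into the caller's buffer.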
 
Randall

I corrected the code to reflect all the feedback. Special thanks to:
Jay, Frederick - pointing out my off-by-one array index error.
Richard - questioning the gratuitous NULLing of already
NULLed pointers.
Frederick - explicitly making a string immutable.
Richard - bringing up the issue of freeing or deleting memory twice.

The points requiring further clarification:
Issue 1: #include <sys/types.h> vs. <stddef.h>
As a Unix programmer I am accustomed to writing the following code:
#include <sys/types.h>
#include <unistd.h>
because unistd.h requires sys/types.h to be used correctly. Only _unistd.h
(the implementation file) includes sys/types.h directly.

For straight C coding (platform independent) I would prefer <stddef.h>
because it is smaller. Both are correct; this issue comes down to
taste and personal preference.

Issue 2: freeing or deleting a NULL pointer more than once.
I should have written this code:
free( ptr_to_memory);
ptr_to_memory = NULL;

That way ptr_to_memory no longer points to freed memory. If your
program is complicated and mission critical, it's nice to know that if
you (accidentally) free or delete it twice you won't introduce undefined
behavior into your system.
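
The free-then-NULL pair can be wrapped in a small helper so the two steps cannot drift apart. This is only a sketch; the name free_and_clear is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* free_and_clear is a hypothetical helper (an assumption for illustration).
 * It frees the block that *pp points to and then sets *pp to NULL, so a
 * second call through the same variable reaches free(NULL), which does
 * nothing. */
void free_and_clear(char **pp)
{
    assert(pp != NULL);     /* we need the address of the caller's pointer */

    free(*pp);
    *pp = NULL;
}
```

This protects only the one variable whose address is passed in. Any other copy of the pointer still holds the old address and must not be freed or dereferenced.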

///////////////////////////////////////////////////////////////////////////////
// file: substr.c
// Note: C++ style comments are allowed for (partially)
// C99 compliant compilers -- the part that allows
// C++ style comments. ; )
///////////////////////////////////////////////////////////////////////////////
#include <stdlib.h> // for malloc() and free()
#include <string.h> // for strncpy()
#include <stdio.h> // for printf()
#include <sys/types.h> // for size_t - goes great with
// standard Unix headers like
// <unistd.h>; in fact it is required.

char * substr( const char * string, size_t start, size_t end);

int main( void ) {

    char *str1 = substr("abcdefghijklmnop",2,7);
    char *str2 = substr("abcdefghijklmnop",4,15);

    if( str1 != NULL )
        printf( "str1: %s\n", str1);

    if( str2 != NULL )
        printf( "str2: %s\n", str2 );

    free( str1 );
    str1 = NULL;

    free( str2 );
    str2 = NULL;

    return 0;
}

/**
* Note: this function will return a newly allocated string. It
* is your responsibility to delete this memory to prevent a leak.
*
* param "string" - the string you want to extract a substring from.
* param "start" - the array index to begin your substring.
* param "end" - the array index to terminate your substring.
*
* On Error: this function returns null.
*/
char * substr( const char * string, size_t start, size_t end) {
    // pointer to the substring on the heap
    char *subString;

    // calculate the length of the substring (without the terminator)
    size_t subStringLen = end - start;

    // request enough bytes to store the entire substring and null
    // terminator.
    subString = malloc( subStringLen + 1);

    // test to make sure we got the memory from malloc
    if( subString != NULL ) {
        // Use pointer arithmetic to skip to the specified
        // starting point of the substring. From that
        // position, copy the desired number of characters.
        strncpy( subString, string + start, subStringLen );

        // We have to terminate this string with a null character.
        subString[ subStringLen ] = '\0';
    }

    // This will either be NULL if we didn't get the
    // memory from malloc or it will have our substring.
    return subString;

} // end function substr
 
Richard Heathfield

Randall said:
I corrected the code to reflect all the feedback. Special thanks to:
Jay, Frederick - pointing out my off-by-one array index error.
Richard - questioning the gratuitous NULLing of already
NULLed pointers.

....and for pointing out that it didn't compile on his (GNU Linux/gcc)
system. Which it still doesn't.
 
Richard Bos

Ben Pfaff said:
You can free a null pointer any number of times you like. I
think that is what the "it" in "delete it more than once" means.

Yes, you can. My point was that relying on this capability by setting
pointers to null after you're done with them is a bad programming habit,
not a good one, because you _will_ get in the habit of calling free() on
any which pointer, with the reasoning that "it'll either be a valid
pointer or null". That habit will then bite you when you encounter an
exception; e.g., when you make a copy of a pointer and only set one copy
to null, or when you have to work with other people's code which doesn't
nullify any pointers.
The truly good programming habit, in this case, is to do your bleedin'
bookkeeping, and keep in mind which pointers you have free()d, and which
you haven't. Don't just pass any which pointer to any which function,
relying on cheap hacks to keep you "safe".

Richard
 
jacob navia

Ben said:
You can free a null pointer any number of times you like. I
think that is what the "it" in "delete it more than once" means.

Besides, if you want to use modern programming techniques, i.e. a
GARBAGE COLLECTOR and forget about free() accounting and all this mess,
this is

a VERY good habit

since setting a pointer to NULL tells the GC that
the associated memory is free to reuse.


It is interesting that none of the posters considers GC and how
it would enormously SIMPLIFY the writing of this code.

jacob
 
