Buffer or Realloc?

bwaichu · Aug 29, 2006

One problem I see with the above is that I am not setting any ceiling
on how big the buffer can get. Should I restrict size or let realloc()
handle that?

Here's my second attempt at building a function to handle buffers:

/* adj the size of the buffer based on the length of the input string */
char *
adj_buffer(char **buffer, size_t len, size_t *curr_buff_size) {

    char *f_buffer;
    char *new_buffer;
    size_t num;

    f_buffer = *buffer;
    new_buffer = NULL;

    if ( *curr_buff_size < len) {
        for(num = BUFSIZ; num < len; num += BUFSIZ)
            ; /* nothing */
    }
    else
        return f_buffer;

    if ( (new_buffer = realloc(f_buffer, num)) == NULL) {
        free(f_buffer);
        f_buffer = NULL;
        return (NULL);
    }
    f_buffer = new_buffer;
    memset(f_buffer, 0, num);
    *curr_buff_size = num;
    return f_buffer;
}

CBFalconer · Aug 29, 2006

bwaichu said:
One problem I see with the above is that I am not setting any
ceiling on how big the buffer can get. Should I restrict size or
let realloc() handle that?

There is no 'above' here, so your post is meaningless. Google is
not usenet, it is only a poor imitation of an interface to usenet.
You should always ensure your articles can stand by themselves,
which means adequate quoting. See my sig below (which is slightly
out of date) and especially the referenced URLs.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>

Ian Collins · Aug 29, 2006

One problem I see with the above is that I am not setting any ceiling
on how big the buffer can get. Should I restrict size or let realloc()
handle that?

Above what? You haven't provided any context. Please do in future.

Here's my second attempt at building a function to handle buffers:

/* adj the size of the buffer based on the length of the input string */
char *
adj_buffer(char **buffer, size_t len, size_t *curr_buff_size) {

    char *f_buffer;
    char *new_buffer;
    size_t num;

    f_buffer = *buffer;
    new_buffer = NULL;

    if ( *curr_buff_size < len) {
        for(num = BUFSIZ; num < len; num += BUFSIZ)
            ; /* nothing */

num is now inappropriately named as it is a size.

Why not stick with a simple division?

    }
    else
        return f_buffer;

    if ( (new_buffer = realloc(f_buffer, num)) == NULL) {
        free(f_buffer);
        f_buffer = NULL;
        return (NULL);
    }
    f_buffer = new_buffer;
    memset(f_buffer, 0, num);

Again, are you sure you want to do this? If so, please explain why.

bwaichu · Aug 29, 2006

Ian said:
num is now inappropriately named as it is a size.

You lost me. Is this just a question of style? I can call it something
else. It's just the size of what I will pass to realloc below. I
probably should just call it size to mirror the function.

I have since decided to increase the buffer as:

num += num

which should reduce my calls, but I end up with a large buffer.

Why not stick with a simple division?

Division of what? Can you provide an example?

Again, are you sure you want to do this? If so, please explain why.

Are you talking about filling up the allocated space with 0's? If so,
the only reason why I am doing it is so that I am not passing allocated
memory filled up with garbage. I am just writing this as if I had
called calloc again. What is the con of doing this?

Ian Collins · Aug 29, 2006

Ian Collins wrote:

You lost me. Is this just a question of style? I can call it something
else. It's just the size of what I will pass to realloc below. I
probably should just call it size to mirror the function.

That's a more appropriate name.

I have since decided to increase the buffer as:

num += num

which should reduce my calls, but I end up with a large buffer.

Division of what? Can you provide an example?

Like you had before,

num = len / BUFSIZ;

if( len % BUFSIZ != 0 ) ++num;
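As a minimal sketch of that division approach, the block count can be turned back into a byte size with an explicit overflow check before multiplying. The helper name round_up_size and its block parameter are assumptions for illustration (they are not from the posted code); pass BUFSIZ as block to match the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round len up to the next multiple of block.
 * Returns 0 if the rounded size would not fit in a size_t.
 * Note that a len of 0 also yields 0. */
size_t round_up_size(size_t len, size_t block)
{
    size_t blocks;

    assert(block != 0);
    blocks = len / block;
    if (len % block != 0)
        ++blocks;                   /* partial block needs a whole one */
    if (blocks > SIZE_MAX / block)
        return 0;                   /* blocks * block would overflow */
    return blocks * block;
}
```

The result is computed in constant time, and the overflow test happens before any call to realloc.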

Are you talking about filling up the allocated space with 0's? If so,
the only reason why I am doing it is so that I am not passing allocated
memory filled up with garbage. I am just writing this as if I had
called calloc again. What is the con of doing this?

You destroy the data that was in the original buffer, whereas realloc
preserves it.
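If zeroed memory is still wanted, one possible approach is to clear only the bytes that realloc added, leaving the old contents alone. This is a sketch under the assumption that the caller tracks the old size; grow_zeroed is a hypothetical name, not a function from this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grow *buf from old_size to new_size bytes and
 * zero only the newly added tail, so existing data survives.
 * Returns 0 on success, -1 on failure (*buf is left valid and unchanged). */
int grow_zeroed(char **buf, size_t old_size, size_t new_size)
{
    char *tmp;

    assert(buf != NULL);
    if (new_size <= old_size)
        return 0;                       /* nothing to do */
    tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;                      /* original block is still allocated */
    memset(tmp + old_size, 0, new_size - old_size);
    *buf = tmp;
    return 0;
}
```

On failure this version does not free the original buffer, which leaves the decision to the caller.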

bwaichu · Aug 30, 2006

Ian said:
Like you had before,

num = len / BUFSIZ;

if( len % BUFSIZ != 0 ) ++num;

What is the advantage of one over the other? If I use the above,
I have to check to make sure I don't overflow. But I immediately
know how big num has to be. I can also avoid a call to realloc
if I know I am going to overflow.

With the incrementing for loop, I have to waste an unknown number of
iterations to arrive at the correct size, but I can avoid the num * size
idiom.

I changed it because I wanted to get away from the num * size idiom.
It seems less error prone to just pass a size. And the code is easier
to read. I also don't have to worry about testing to see if I am bigger
than SIZE_MAX, as I can never become bigger than size_t.

Now, if I can figure out a way to avoid starting at BUFSIZ when I know
the size of the original buffer is greater, then I might be able to save
some loop iterations. This is where I am tempted to go back to the old
approach.

You destroy the data that was in the original buffer, whereas realloc
preserves it.

So your suggestion is that if I am really just wrapping a function
around realloc, then I shouldn't change the behavior of realloc? I can
agree with that approach. It's something I overlooked.

bwaichu · Aug 30, 2006

CBFalconer said:
There is no 'above' here, so your post is meaningless. Google is
not usenet, it is only a poor imitation of an interface to usenet.
You should always ensure your articles can stand by themselves,
which means adequate quoting. See my sig below (which is slightly
out of date) and especially the referenced URLs.

I have been clicking show options. I meant below. Typo on my part.

I used to use Pine, but I figured this would be easier.

websnarf · Aug 30, 2006

What is the best way to manage the current size of the buffer? The
first pass through here, I know what the initial buffer size was, but
the second time through here, I do not, so I agree with your assessment
above. What is the best way to calculate the size of a memory allocated
buffer? I assume I should do it as part of this function, so it's
self-contained.

Well, you should *pass in* the old size as an additional parameter.
Unfortunately, C does not let you extract what the allocation size is
for a given pointer which has been allocated, so you have to track it
by yourself.

Since the size gets modified by your reallocating it, you want to pass
in the pointer to the variable holding the length so that you can
change it on the way out. I would recommend something like the
following:

int strSizeAdjust (char ** str, int * memLength, int minSize) {
    int i;
    char * newStr;

    if (*memLength > minSize) return 0; /* No size adjustment needed */
    for (i=*memLength; i <= minSize; i++) {}; /* First size above minSize */
    newStr = (char *) realloc (*str, i * sizeof (char)); /* New size is i */
    if (NULL == newStr) return -__LINE__; /* Out of memory error */
    *str = newStr;
    *memLength = i;
    return 1; /* A size adjustment was made */
}

I meant (same difference):

f_buffer = NULL;

Maybe I am using lint incorrectly. What flags would have caught that?

Dunno, it's been a while since I last used the real lint. I'm just
surprised it didn't catch a "do nothing" operation.

Would memset (which is ANSI) be a better alternative or should I skip
this step?

memset(f_buffer, 0, sizeof(f_buffer));

You can skip the step, and if performance matters to you, you probably
should. sizeof() is the wrong operator to be using here. The length
of the buffer being pointed to is num*BUFSIZ, not sizeof(f_buffer).
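To illustrate the sizeof point: applied to a pointer, sizeof yields the size of the pointer itself (commonly 4 or 8 bytes), so the allocation length has to be passed along separately. A minimal sketch, where clear_buffer is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero a buffer whose length the caller tracks.
 * Inside this function sizeof(buf) would be sizeof(char *), which says
 * nothing about how many bytes buf points to. */
void clear_buffer(char *buf, size_t size)
{
    assert(buf != NULL);
    memset(buf, 0, size);       /* size, not sizeof(buf) */
}
```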

SM Ryan · Aug 30, 2006

# One problem I see with the above is that I am not setting any ceiling
# on how big the buffer can get. Should I restrict size or let realloc()
# handle that?

# if ( *curr_buff_size < len) {
# for(num = BUFSIZ; num < len; num += BUFSIZ)
# ; /* nothing */
# }

Increasing buffer size by a constant factor rather than a constant
increment keeps the algorithm linear in time and space. If you
increase the buffer by a factor of f, f>1, you risk overshooting
the needed space by up to (f-1) times the file size. The code is
linear, but the constant factor can still be large.
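A rough way to see the difference is to count how many times the buffer must grow under each policy. The two functions below are hypothetical illustrations and assume sizes small enough that the arithmetic does not overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Number of growth steps to reach at least target bytes when each
 * step adds a fixed number of bytes (constant increment). */
size_t grow_steps_additive(size_t start, size_t step, size_t target)
{
    size_t size = start, steps = 0;

    assert(step > 0);
    while (size < target) {
        size += step;
        ++steps;
    }
    return steps;
}

/* Number of growth steps when each step doubles the size
 * (constant factor f = 2). */
size_t grow_steps_doubling(size_t start, size_t target)
{
    size_t size = start, steps = 0;

    assert(start > 0);
    while (size < target) {
        size *= 2;
        ++steps;
    }
    return steps;
}
```

Starting from 1024 bytes and growing to 1 MiB (1048576 bytes), the additive policy with a 1024-byte step needs 1023 steps, while doubling needs 10. Since each realloc may copy the whole buffer, fewer steps means less copying overall.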

On virtual memory systems with most files, this will work within
the address space. As 64-bit pointer vm machines become popular,
this will fit even more easily. So you can have something that is
likely to work on typical files. If that's good enough, you're done.

However it will not work on all systems with all files. If you
want that, you have to stream through the file, possibly setting
up buffer frames and other techniques.
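Streaming can be sketched with one fixed-size buffer that is reused for every chunk, so memory use does not depend on the file size. Counting newlines is only a stand-in task chosen for illustration; count_lines is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example: count newline characters in a stream while
 * holding at most BUFSIZ bytes of it in memory at any time. */
unsigned long count_lines(FILE *fp)
{
    char buf[BUFSIZ];
    size_t n, i;
    unsigned long lines = 0;

    assert(fp != NULL);
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (i = 0; i < n; i++)
            if (buf[i] == '\n')
                ++lines;
    }
    return lines;
}
```

Here sizeof buf is correct because buf is an array, not a pointer.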

If you're willing to restrict your code to a subset of systems
such as unix, then you can also use memory mapping of files. This
essentially uses kernel paging to allocate a sufficiently large
buffer and fill it with the file in one operation.
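On POSIX systems the mapping could look roughly like the sketch below, which uses fstat() to get the size and mmap() to map the file read-only. The function name map_file is hypothetical, and the code assumes a regular, non-empty file; it is not portable standard C.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Hypothetical helper: map the whole file behind fp read-only.
 * On success returns the mapping and stores its length in *len.
 * Returns NULL on failure or if the file is empty.
 * The caller releases the mapping with munmap(). */
const char *map_file(FILE *fp, size_t *len)
{
    struct stat st;
    void *p;

    assert(fp != NULL && len != NULL);
    if (fstat(fileno(fp), &st) != 0 || st.st_size <= 0)
        return NULL;
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE,
             fileno(fp), 0);
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}
```

No realloc loop is needed with this approach, since the kernel pages the file contents in on demand.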
