Buffer or Realloc?

bwaichu

One problem I see with the above is that I am not setting any ceiling
on how big the buffer can get. Should I restrict size or let realloc()
handle that?

Here's my second attempt at building a function to handle buffers:

/* adj the size of the buffer based on the length of the input string */
char *
adj_buffer(char **buffer, size_t len, size_t *curr_buff_size) {

    char *f_buffer;
    char *new_buffer;
    size_t num;

    f_buffer = *buffer;
    new_buffer = NULL;

    if ( *curr_buff_size < len) {
        for(num = BUFSIZ; num < len; num += BUFSIZ)
            ; /* nothing */
    }
    else
        return f_buffer;

    if ( (new_buffer = realloc(f_buffer, num)) == NULL) {
        free(f_buffer);
        f_buffer = NULL;
        return (NULL);
    }
    f_buffer = new_buffer;
    memset(f_buffer, 0, num);
    *curr_buff_size = num;
    return f_buffer;
}
 
CBFalconer

bwaichu wrote:
One problem I see with the above is that I am not setting any
ceiling on how big the buffer can get. Should I restrict size or
let realloc() handle that?

There is no 'above' here, so your post is meaningless. Google is
not usenet, it is only a poor imitation of an interface to usenet.
You should always ensure your articles can stand by themselves,
which means adequate quoting. See my sig below (which is slightly
out of date) and especially the referenced URLs.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
 
Ian Collins

bwaichu wrote:
One problem I see with the above is that I am not setting any ceiling
on how big the buffer can get. Should I restrict size or let realloc()
handle that?

Above what? You haven't provided any context. Please do in future.

bwaichu wrote:
Here's my second attempt at building a function to handle buffers:

/* adj the size of the buffer based on the length of the input string */
char *
adj_buffer(char **buffer, size_t len, size_t *curr_buff_size) {

    char *f_buffer;
    char *new_buffer;
    size_t num;

    f_buffer = *buffer;
    new_buffer = NULL;

    if ( *curr_buff_size < len) {
        for(num = BUFSIZ; num < len; num += BUFSIZ)
            ; /* nothing */

num is now inappropriately named as it is a size.

Why not stick with a simple division?

bwaichu wrote:
    }
    else
        return f_buffer;

    if ( (new_buffer = realloc(f_buffer, num)) == NULL) {
        free(f_buffer);
        f_buffer = NULL;
        return (NULL);
    }
    f_buffer = new_buffer;
    memset(f_buffer, 0, num);

Again, are you sure you want to do this? If so, please explain why.
 
bwaichu

Ian said:
num is now inappropriately named as it is a size.

You lost me. Is this just a question of style? I can call it something
else. It's just the size of what I will pass to realloc below. I
probably should just call it size to mirror the function.

I have since decided to increase the buffer as:

num += num

which should reduce my calls, but I end up with a large buffer.

Ian said:
Why not stick with a simple division?

Division of what? Can you provide an example?

Ian said:
Again, are you sure you want to do this? If so, please explain why.

Are you talking about filling up the allocated space with 0's? If so,
the only reason why I am doing it is so that I am not passing allocated
memory filled up with garbage. I am just writing this as if I had
called calloc again. What is the con of doing this?
 
Ian Collins

bwaichu wrote:
You lost me. Is this just a question of style? I can call it something
else. It's just the size of what I will pass to realloc below. I
probably should just call it size to mirror the function.

That's a more appropriate name.

bwaichu wrote:
I have since decided to increase the buffer as:

num += num

which should reduce my calls, but I end up with a large buffer.

bwaichu wrote:
Division of what? Can you provide an example?

Like you had before,

num = len / BUFSIZ;

if( len % BUFSIZ != 0 ) ++num;

bwaichu wrote:
Are you talking about filling up the allocated space with 0's? If so,
the only reason why I am doing it is so that I am not passing allocated
memory filled up with garbage. I am just writing this as if I had
called calloc again. What is the con of doing this?

You destroy the data that was in the original buffer, whereas realloc
preserves it.
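
To illustrate the point about preserving data, here is a minimal sketch of one way to get calloc-like zeroing without wiping what realloc kept: clear only the bytes past the old size. The helper name grow_zero_tail and its return convention are assumptions made for this example, not something from the posted code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: resize *buf from old_size to new_size bytes.
 * Bytes below old_size are kept (that is what realloc guarantees);
 * only the newly added tail is zeroed.
 * Returns 0 on success, -1 on failure. On failure *buf is unchanged
 * and still owned by the caller.
 */
static int
grow_zero_tail(char **buf, size_t old_size, size_t new_size)
{
    char *p;

    if (new_size <= old_size)
        return 0;               /* nothing to grow */

    p = realloc(*buf, new_size);
    if (p == NULL)
        return -1;              /* the original block is untouched */

    memset(p + old_size, 0, new_size - old_size);
    *buf = p;
    return 0;
}
```

Leaving the old block alone on failure is a design choice here: the caller decides whether to free it or keep working with the smaller buffer.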
 
bwaichu

Ian said:
Like you had before,

num = len / BUFSIZ;

if( len % BUFSIZ != 0 ) ++num;

What is the advantage of one over the other? If I use the above,
I have to check to make sure I don't overflow. But I immediately
know how big num has to be. I can also avoid a call to realloc
if I know I am going to overflow.

With incrementing the for loop, I have to waste an unknown number of
loops to arrive at the correct size, but I can avoid the num * size
idiom.

I changed it because I wanted to get away from the num * size idiom.
It seems less error prone to just pass a size. And the code is easier
to read. I also don't have to worry about testing to see if I am
bigger than SIZE_MAX as I can never become bigger than size_t.

Now, if I can figure out a way to avoid starting at BUFSIZ when I know
the size of the original buffer is greater, then I might be able to
save some for loops. This is where I am tempted to go back to the old
approach.
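
As a rough sketch of how the division approach and the "just pass a size" approach can be combined: compute the rounded-up byte count in one step and check for overflow before multiplying. The name round_up_size and the convention of returning 0 on error are assumptions for this example. Note that an unsigned size_t sum wraps around silently rather than failing, so an explicit check is needed in either approach.

```c
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX */

/*
 * Hypothetical helper: round len up to the next multiple of block,
 * reserving at least one block. Returns 0 if block is 0 or if the
 * result would not fit in a size_t.
 */
static size_t
round_up_size(size_t len, size_t block)
{
    size_t blocks;

    if (block == 0)
        return 0;

    blocks = len / block;
    if (len % block != 0)
        blocks++;               /* partial block needs a whole one */
    if (blocks == 0)
        blocks = 1;             /* len == 0 still gets one block */

    if (blocks > SIZE_MAX / block)
        return 0;               /* blocks * block would wrap around */

    return blocks * block;
}
```

A caller would pass BUFSIZ as block and hand the result straight to realloc, skipping the call when the result is 0.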

Ian said:
You destroy the data that was in the original buffer, whereas realloc
preserves it.

So your suggestion is that if I am really just wrapping a function
around realloc, then I shouldn't change the behavior of realloc? I can
agree with that approach. It's something I overlooked.
 
bwaichu

CBFalconer said:
There is no 'above' here, so your post is meaningless. Google is
not usenet, it is only a poor imitation of an interface to usenet.
You should always ensure your articles can stand by themselves,
which means adequate quoting. See my sig below (which is slightly
out of date) and especially the referenced URLs.

I have been clicking show options. I meant below. Typo on my part.

I used to use Pine, but I figured this would be easier.
 
websnarf

bwaichu wrote:
What is the best way to manage the current size of the buffer? The
first pass through here, I know what the initial buffer size was, but
the second time through here, I do not, so I agree with your assessment
above. What is the best way to calculate the size of a memory allocated
buffer? I assume I should do it as part of this function, so it's
self-contained.

Well, you should *pass in* the old size as an additional parameter.
Unfortunately, C does not let you extract what the allocation size is
for a given pointer which has been allocated, so you have to track it
by yourself.

Since the size gets modified by your reallocating it, you want to pass
in the pointer to the variable holding the length so that you can
change it on the way out. I would recommend something like the
following:

#include <stdlib.h>

int strSizeAdjust (char ** str, int * memLength, int minSize) {
    int i;
    char * newStr;

    if (*memLength > minSize) return 0; /* No size adjustment needed */
    for (i=*memLength; i <= minSize; i++) {} /* Find the first size above minSize */
    newStr = (char *) realloc (*str, (size_t) i * sizeof (char)); /* Request i bytes */
    if (NULL == newStr) return -__LINE__; /* Out of memory error */
    *str = newStr;
    *memLength = i;
    return 1; /* A size adjustment was made */
}
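
Since the allocation size has to be tracked by hand, one common way to keep the pointer and its size from drifting apart is to bundle them in a struct. The sketch below is only an illustration of that idea; the names struct strbuf and strbuf_reserve are hypothetical and not part of the function above.

```c
#include <stdlib.h>

/* Hypothetical container: the buffer and its allocated size travel together. */
struct strbuf {
    char  *data;    /* NULL when nothing is allocated yet */
    size_t cap;     /* bytes currently allocated for data */
};

/*
 * Make sure sb can hold at least min_cap bytes.
 * Returns 1 if the buffer was resized, 0 if it was already large
 * enough, -1 if realloc failed (sb is left unchanged in that case).
 */
static int
strbuf_reserve(struct strbuf *sb, size_t min_cap)
{
    char *p;

    if (sb->cap >= min_cap)
        return 0;

    p = realloc(sb->data, min_cap);
    if (p == NULL)
        return -1;

    sb->data = p;
    sb->cap = min_cap;
    return 1;
}
```

Starting from a zero-initialized struct works because realloc on a NULL pointer behaves like malloc.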
bwaichu wrote:
I meant (same difference):

f_buffer = NULL;

Maybe I am using lint incorrectly. What flags would have caught that?

Dunno, it's been a while since I last used the real lint. I'm just
surprised it didn't catch a "do nothing" operation.

bwaichu wrote:
Would memset (which is ANSI) be a better alternative or should I skip
this step?

memset(f_buffer, 0, sizeof(f_buffer));

You can skip the step, and if performance matters to you, you probably
should. sizeof() is the wrong operator to be using here. The length
of the buffer being pointed to is num*BUFSIZ, not sizeof(f_buffer).
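
A short sketch of the sizeof point: applied to a pointer, sizeof yields the size of the pointer itself, so the byte count has to be passed in separately. The function name clear_buffer and the parameter buf_len are assumptions for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: zero buf_len bytes starting at buf.
 * Inside this function sizeof(buf) would be sizeof(char *), commonly
 * 4 or 8, no matter how much memory buf points to, so the caller has
 * to supply the length it tracked at allocation time.
 */
static void
clear_buffer(char *buf, size_t buf_len)
{
    memset(buf, 0, buf_len);
}
```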
 
SM Ryan

# One problem I see with the above is that I am not setting any ceiling
# on how big the buffer can get. Should I restrict size or let realloc()
# handle that?

# if ( *curr_buff_size < len) {
# for(num = BUFSIZ; num < len; num += BUFSIZ)
# ; /* nothing */
# }

Increasing buffer size by a constant factor rather than a constant
increment keeps the algorithm linear in time and space. If you
increase the buffer by a factor of f, f>1, you risk overshooting
the space by (f-1) times the file size. The code is linear, but
it can still have a large constant factor.
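
A minimal sketch of growth by a constant factor, here f = 2, with an explicit ceiling of the kind the original question asks about. The name grow_geometric, the 64-byte starting size, and the 1 GiB value of MAX_BUF_SIZE are arbitrary assumptions for illustration.

```c
#include <stdlib.h>

#define MAX_BUF_SIZE ((size_t)1 << 30)  /* hypothetical 1 GiB ceiling */

/*
 * Hypothetical helper: make sure *buf holds at least need bytes by
 * doubling *cap until it is large enough.
 * Returns 0 on success, -1 if need exceeds the ceiling or realloc
 * fails. On failure *buf and *cap are unchanged.
 */
static int
grow_geometric(char **buf, size_t *cap, size_t need)
{
    size_t new_cap;
    char *p;

    if (need <= *cap)
        return 0;
    if (need > MAX_BUF_SIZE)
        return -1;

    new_cap = (*cap != 0) ? *cap : 64;  /* arbitrary starting size */
    while (new_cap < need) {
        if (new_cap > MAX_BUF_SIZE / 2) {
            new_cap = MAX_BUF_SIZE;     /* clamp to the ceiling */
            break;
        }
        new_cap *= 2;
    }

    p = realloc(*buf, new_cap);
    if (p == NULL)
        return -1;

    *buf = p;
    *cap = new_cap;
    return 0;
}
```

With doubling, the number of realloc calls grows with the logarithm of the final size, at the cost of up to roughly one extra copy of the data in unused space.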

On virtual memory systems with most files, this will work within
the address space. As 64-bit pointer vm machines become popular,
this will fit even easier. So you can have something that is likely
to work on typical files. If that's good enough, you're done.

However it will not work on all systems with all files. If you
want that, you have to stream through the file, possibly setting
up buffer frames and other techniques.

If you're willing to restrict your code to a subset of systems
such as unix, then you can also use memory mapping of files. This
essentially uses kernel paging to allocate a sufficiently large
buffer and fill it with the file in one operation.
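
For the memory-mapping route, a minimal sketch using the POSIX open, fstat and mmap calls might look like the following. It assumes a POSIX system; the helper name map_file_readonly is hypothetical, and error handling is reduced to returning NULL.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a whole file read-only.
 * On success returns the mapping and stores its length in *len_out;
 * release it later with munmap(ptr, len).
 * Returns NULL on any error, and also for an empty file, since a
 * zero-length mapping is not valid.
 */
static void *
map_file_readonly(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    if (fstat(fd, &st) == -1 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return p;
}
```

No buffer sizing is needed on the caller's side; the kernel pages the file contents in on demand.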
 
