more buffering

bwaichu

I am writing a function that:

1) checks the size of a non-C string being copied against the
   available buffer space; len should not be large enough to include
   a nul terminator (should I check for this?)
2) if the space is too small, increases the buffer size by a block,
   repeating as necessary; there should be more efficient ways to do
   this. I started to experiment with this in an earlier post here.
3) then memcpy's the string to the end of the written part of the
   buffer
4) returns the amount currently used (I might want to return something
   different; still undecided)

The buffer structure was built based on the advice I received here to
manage buffer information. What I want to know is whether there is a
better way to approach this problem. Ideally, I would also want a
function that would tag a nul onto the end of the string to make it a
C string if I need that functionality. My goal is general-use buffer
functions for network applications. If you see any C mistakes, please
tell me.

size_t
buffer_append(BUF *buffer, const char *string, size_t len) {

    char *buf_temp = NULL;

allocate:

    if (buffer_remaining(buffer) < len) {
        buf_temp = realloc(buffer->buffer, buffer->size + 512);
        if (buf_temp == NULL) {
            free(buffer->buffer);
            buffer->buffer = NULL;
            errx(1, "realloc failed");
        }
        buffer->buffer = buf_temp;
        buffer->size += 512;
        if (buffer_remaining(buffer) < len)
            goto allocate;
    }

    memcpy(buffer->buffer + buffer->end, string, len);
    buffer->end += len;

    return buffer->end;
}

size_t
buffer_remaining(BUF *buffer) {

    return buffer->size - buffer->end;
}

And here's the BUF structure:

typedef struct {
    char *buffer;   /* buffer */
    size_t size;    /* allocated size of buffer */
    size_t offset;  /* offset of buffer -- used to walk the buffer */
    size_t end;     /* end of buffer */
} BUF;
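
For readers who want to compile the four steps above as a single file, here is a minimal sketch. It keeps the BUF fields and the 512-byte growth step from the post, but it uses a loop instead of the goto and reports failure through a return value rather than errx(). Those two changes, and the helpers buffer_init and buffer_free, are assumptions made for illustration and are not part of the original code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 512  /* growth step taken from the post */

typedef struct {
    char   *buffer;  /* start of the allocation */
    size_t  size;    /* allocated size of buffer */
    size_t  offset;  /* read position, used to walk the buffer */
    size_t  end;     /* number of bytes written so far */
} BUF;

/* Hypothetical helper: start with an empty, unallocated buffer. */
void buffer_init(BUF *b) {
    assert(b != NULL);
    b->buffer = NULL;
    b->size = b->offset = b->end = 0;
}

/* Hypothetical helper: release the storage and reset the fields. */
void buffer_free(BUF *b) {
    assert(b != NULL);
    free(b->buffer);
    buffer_init(b);
}

size_t buffer_remaining(const BUF *b) {
    return b->size - b->end;
}

/* Append len raw bytes; no nul terminator is added or expected.
 * Returns 0 on success and -1 if the buffer could not grow. On
 * failure the bytes already written are left intact. */
int buffer_append(BUF *b, const char *data, size_t len) {
    assert(b != NULL);
    /* Step 2: add one block at a time until the data fits. */
    while (buffer_remaining(b) < len) {
        char *tmp;
        if (b->size > SIZE_MAX - BLOCK_SIZE)
            return -1;                  /* size_t would overflow */
        tmp = realloc(b->buffer, b->size + BLOCK_SIZE);
        if (tmp == NULL)
            return -1;                  /* old block is still valid */
        b->buffer = tmp;
        b->size += BLOCK_SIZE;
    }
    /* Step 3: copy to the end of the written part. */
    if (len > 0) {
        memcpy(b->buffer + b->end, data, len);
        b->end += len;
    }
    return 0;
}
```

A caller would run buffer_init once, call buffer_append for each piece of the packet, and finish with buffer_free. Returning a status instead of exiting leaves the decision about what to do on allocation failure to the caller.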
 
Samuel Stearley

1) What platform is this? Why is the block size 512 bytes? I'd rather
the realloc increments be equal to the MMU page size, which is probably
4 KB.


2) Use while loops:

while (buffer_remaining(buffer) < len) {
    buf_temp = realloc(buffer->buffer, buffer->size + 512);
    if (buf_temp == NULL) {
        free(buffer->buffer);
        buffer->buffer = NULL;
        errx(1, "realloc failed");
    }
    buffer->buffer = buf_temp;
    buffer->size += 512;
}


3)
#define BLOCK_SIZE 512

size_t
buffer_append(BUF *buffer, const char *string, size_t len) {
    unsigned long amount_needed;
    char *buf_temp = NULL;

    if (buffer_remaining(buffer) < (len + 1))
    {
        amount_needed = (len - buffer_remaining(buffer)) + 1; /* 1 is for zero terminator */
        amount_needed += BLOCK_SIZE;
        amount_needed &= (BLOCK_SIZE - 1); /* takes advantage of block size being a power of 2 */

        buf_temp = realloc(buffer->buffer, buffer->size + amount_needed);
        if (buf_temp == NULL) {
            free(buffer->buffer);
            buffer->buffer = NULL;
            errx(1, "realloc failed");
        }
        buffer->buffer = buf_temp;
        buffer->size += amount_needed;
    }

    memcpy(buffer->buffer + buffer->end, string, len);
    buffer->buffer[buffer->end + len] = 0; /* zero terminate */
    buffer->end += len + 1;
    return buffer->end;
}
 
Samuel Stearley

1) What platform is this? Why is the block size 512 bytes? I'd rather
the realloc increments be equal to the MMU page size, which is probably
4 KB.


2) Use while loops:

while (buffer_remaining(buffer) < len) {
    buf_temp = realloc(buffer->buffer, buffer->size + 512);
    if (buf_temp == NULL) {
        free(buffer->buffer);
        buffer->buffer = NULL;
        errx(1, "realloc failed");
    }
    buffer->buffer = buf_temp;
    buffer->size += 512;
}


3)
#define BLOCK_SIZE 512

size_t
buffer_append(BUF *buffer, const char *string, size_t len) {
    unsigned long amount_needed;
    char *buf_temp = NULL;

    if (buffer_remaining(buffer) < (len + 1))
    {
        amount_needed = (len - buffer_remaining(buffer)) + 1; /* 1 is for zero terminator */
        amount_needed += BLOCK_SIZE;
        amount_needed &= (BLOCK_SIZE - 1); /* takes advantage of block size being a power of 2 */
    }

    buf_temp = realloc(buffer->buffer, buffer->size + amount_needed);
    if (buf_temp == NULL) {
        free(buffer->buffer);
        buffer->buffer = NULL;
        errx(1, "realloc failed");
    }
    buffer->buffer = buf_temp;
    buffer->size += amount_needed;

    memcpy(buffer->buffer + buffer->end, string, len);
    buffer->buffer[buffer->end + len] = 0; /* zero terminate */
    buffer->end += len + 1;
    return buffer->end;
}
 
Samuel Stearley

quick fix:

amount_needed &= (BLOCK_SIZE - 1)

should be

amount_needed &= ~(BLOCK_SIZE - 1)

 
bwaichu

Samuel said:
> P.S.
>
> Sorry for the double post.
> The one at 9:02 pm is not correct

Is there an advantage to using the page size? If so, I can use
implementation-specific functions to determine the size, but that would
be off-topic for this newsgroup. I know that on my amd64 box the OS
pages at 4096 bytes and that the block size is 512. But I should
probably use magic numbers for those values, so that I can at least be
cross-platform on the same OS.
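
For reference, POSIX systems can report the page size at run time, so the value does not have to be hard-coded. The sketch below assumes sysconf() and _SC_PAGESIZE are available (they are POSIX, not ISO C). The wrapper name query_page_size and its fallback parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Return the page size reported by the system, or `fallback` if the
 * query fails. sysconf() returns -1 on error. */
size_t query_page_size(size_t fallback) {
    long n;
    assert(fallback > 0);
    n = sysconf(_SC_PAGESIZE);
    return n > 0 ? (size_t)n : fallback;
}
```

Calling query_page_size(4096) once at start-up and storing the result would let the growth step follow the platform instead of a compile-time constant.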
 
bwaichu

Samuel said:
> 1) What platform is this? Why is block size = 512 bytes? I'd rather
> that realloc increments be = to the MMU page size which is probably 4
> KB

<OT>
The platform is OpenBSD -current. I could use the page size, but I need
to know more to make that decision. Is there an advantage to using the
page size over the block size? I am not an expert in the page daemon
or UVM, but I get the gist of how it works.

> 2) use while loops
> while (buffer_remaining(buffer) < len) {
>     buf_temp = realloc(buffer->buffer, buffer->size + 512);
>     if (buf_temp == NULL) {
>         free(buffer->buffer);
>         buffer->buffer = NULL;
>         errx(1, "realloc failed");
>     }
>     buffer->buffer = buf_temp;
>     buffer->size += 512;
> }

I could also use for loops. Is there any advantage to one approach
over the other? But I do agree that the goto was pretty lame.

> 3)
> #define BLOCK_SIZE 512
>
> buffer_append(BUF *buffer, const char *string, size_t len) {
>     unsigned long amount_needed
>     char *buf_temp = NULL;
>
>     if (buffer_remaining(buffer) < (len + 1))
>     {
>         amount_needed = (len - buffer_remaining(buffer)) + 1; /* 1 is for zero terminator */

As I originally said, these are not C strings. If they were C
strings, I would be using strlcpy rather than memcpy.

I also plan to write a separate function to handle C strings. I am
using this buffer function to create a packet to be sent over the wire.
I'll post a C string buffer function later. Of course, I do want to
avoid filling the buffer like this:

'A', 'A', '\0', 'A', 'A', '\0'

If I were just handling individual strings, I would use a linked list or
an array of pointers to C strings.
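
One way to get a C string on demand without ending up with the 'A', 'A', '\0', 'A', 'A', '\0' layout is to write the terminator just past the used region and leave end unchanged, so the next append overwrites it. The sketch below assumes the BUF layout from the first post. The helper name buffer_cstr is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char   *buffer;  /* start of the allocation */
    size_t  size;    /* allocated size of buffer */
    size_t  offset;  /* read position */
    size_t  end;     /* number of bytes written so far */
} BUF;

/* Hypothetical helper: store a nul just past the written bytes and
 * return the buffer as a C string. `end` is not advanced, so the
 * terminator is not counted as data and a later append overwrites it.
 * Returns NULL if one spare byte could not be allocated. */
char *buffer_cstr(BUF *b) {
    assert(b != NULL);
    if (b->end == b->size) {            /* no spare byte available */
        char *tmp = realloc(b->buffer, b->size + 1);
        if (tmp == NULL)
            return NULL;
        b->buffer = tmp;
        b->size += 1;
    }
    b->buffer[b->end] = '\0';
    return b->buffer;
}
```

If the packet data itself contains zero bytes, the returned pointer only shows the part up to the first one when it is treated as a C string.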

> amount_needed += BLOCK_SIZE;
> amount_needed &= (BLOCK_SIZE - 1); /* takes advantage of block size
> being a power of 2 */

Please explain the above. I have never used bitwise AND in this
fashion. Could you walk me through the math? I do hope to learn
something new. I have been struggling to find a good way to increment
the allocation. I have used the shift operator to do endian conversions
and simple multiplication.

Cheers,

Brian
 
Samuel Stearley

1) Whenever I make my own allocation interface I prefer that it do its
allocations in the units the OS manages memory in.


2)
> As I originally said, these are not c strings. If these were c
> strings, I would be using strlcpy, rather than memcpy.

I know very jolly well that they are not C strings, but you did say
that you wanted to add a nul at the end to turn it into a C string:

> Ideally, I would also want a function that would tag a nul to the end
> of the string to make it a c string if I need that functionality.


3) The math:

amount_needed += BLOCK_SIZE;
amount_needed &= ~(BLOCK_SIZE - 1); /* takes advantage of block size being a power of 2 */

Suppose you need 560 bytes, which means allocating 2 blocks of
BLOCK_SIZE. The code adds BLOCK_SIZE to amount_needed, so now you need
1072 bytes.

But 1072 bytes is not a multiple of 512 bytes. The &= ~(BLOCK_SIZE -
1) turns the 1072 bytes into 1024 bytes.

About the math: 512 is a power of 2, i.e. only 1 bit is set.
When you subtract 1 from a number that has only one bit set, all bits
below that bit get set, and the original bit and all bits above it
are clear. This creates a bit mask (511). That bit mask is then
inverted.

The final mask in binary, for a 32-bit type, is:

11111111_11111111_11111110_00000000

ANDing this mask with 1072 clears the low 9 bits and turns 1072 into
1024.
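
The arithmetic above can be checked with a small sketch. The function names round_up_thread and round_up_exact are hypothetical. The second function is a common variant that is not from this thread, shown only for comparison because the thread's expression always adds at least one extra byte before masking.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 512u  /* must be a power of two for the mask to work */

/* The expression from the thread: add a whole block, then clear the
 * low 9 bits. 560 + 512 = 1072 = binary 100_0011_0000, and clearing
 * the low 9 bits leaves 100_0000_0000 = 1024. The result is the next
 * multiple of BLOCK_SIZE strictly greater than n, so an exact multiple
 * still gains one extra block. */
size_t round_up_thread(size_t n) {
    return (n + BLOCK_SIZE) & ~(size_t)(BLOCK_SIZE - 1);
}

/* Common variant (not from the thread): add BLOCK_SIZE - 1 instead,
 * so exact multiples are left unchanged. */
size_t round_up_exact(size_t n) {
    return (n + (BLOCK_SIZE - 1)) & ~(size_t)(BLOCK_SIZE - 1);
}
```

Both versions wrap around if n is within one block of the largest size_t value, so real code would need a range check before the addition.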
 
kondal

> <OT>
> The platform is openBSD -current. I could use the page size, but I need
> to know more to make that decision. Is there an advantage to using the
> page size over the block size? I am not an expert in the page daemon
> or uvm. But I get the gist of how it works.

Page size is relevant only on MMU-based operating systems. It is good
to know the page sizes of the operating systems you target while
programming, but do not make it a requirement.

> I could also use for loops. Is there any advantage of one approach
> over the other? But I do agree that the goto was pretty lame.

It does not make much difference whether you use a while loop or a for
loop; you can use either.


Regarding goto, it is just not the cleanest way to write a program.
That said, I've seen programs that use goto extensively with good
results and improved readability. If you look at the code once it has
been compiled to assembly, you will see that much of it consists of
jmp instructions, which are analogous to goto in C.

> As I originally said, these are not c strings. If these were c
> strings, I would be using strlcpy, rather than memcpy.
>
> I also plan to write a separate function to handle c strings. I am
> using this buffer function to create a packet to be sent over the wire.
> I'll post a c string buffer function later. Of course, I do want to
> avoid filling the buffer like:
>
> 'A', 'A', '\0', 'A', 'A', '\0'
>
> If I was just handling individual strings, I would use a linked list or
> an array of pointers to c strings.

In most networking protocols, strings are preceded or followed by a
length field. This is required for variable-length string options. If
no length is specified, delimiters mark the end of the string. I am
very interested to know which network protocol you are programming
(unless it is proprietary).
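
As an illustration of the length-prefix idea, the sketch below writes a field as a 2-byte big-endian length followed by the raw bytes. That wire layout and the function name put_lp_field are assumptions made for this example and do not come from any particular protocol discussed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 16-bit big-endian length followed by len bytes of data into
 * out, which has room for cap bytes. Returns the number of bytes
 * written, or 0 if the field does not fit or len exceeds 65535. */
size_t put_lp_field(unsigned char *out, size_t cap,
                    const void *data, size_t len) {
    assert(out != NULL);
    if (len > UINT16_MAX || cap < 2 || cap - 2 < len)
        return 0;
    out[0] = (unsigned char)(len >> 8);     /* high byte of length */
    out[1] = (unsigned char)(len & 0xFF);   /* low byte of length */
    if (len > 0)
        memcpy(out + 2, data, len);
    return len + 2;
}
```

With a prefix like this the receiver knows exactly how many bytes belong to the field, so no nul terminator has to travel over the wire.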

> Please explain the above. I have never used bitwise AND in this
> fashion. Could you walk me through the math? I do hope learn
> something new. I have been struggling with a good way to increment the
> allocation. I have used the shift operator to do endian conversions
> and simple multiplication.

This reminds me of my early days of programming. Most institutions
teach bitwise operations at the end because they take time :) I am
weak at them myself and take a long time to craft one :( The Internet
is the biggest library in the world; use it.

-kondal
 
Simon Biber

Samuel said:
> P.S.
>
> Sorry for the double post.
> The one at 9:02 pm is not correct

You mean the one at 1:02 AM UTC, I guess. Or 9:02 PM -0400. Most
newsreaders display times converted to the reader's time zone. That's
11:02 AM +1000 for me.
 