message buffering for logs, sprintf, etc...

Matt Garman

I'm trying to develop (for my own personal use) a general
"messaging facility" in C. The idea is that the facility will be
used for both debugging/logging and user interfacing. I'd like it
to be fairly portable, but as robust and reliable as possible.

My intent is to develop an API for such general messaging. I'll
implement the parts that write to stdout, stderr and files. Later,
I'll add support for writing to (for example) syslog, graphical
windows, network sockets, etc.

So what I'd like to do is create a printf()-like function (i.e. one
that takes variable arguments and is formatted according to the
printf() conversion mechanism). This function would then take the
user's message, plus some additional accounting information (time
message was created, message "level", source file location, etc).

As for making this extendable, I figured I'd typedef something like
this:

int message_function(int flags, const char* message);

So I can quickly create new functions that write to arbitrary
locations.

What this all boils down to is I'll have to use sprintf() and write
to a buffer. It looks like there is no perfect way to do this, so
I'm trying to get an idea of the pros and cons of the different
strategies.

First question would be static memory versus dynamic memory. Keep
in mind that this is a logger/debugger, so I could be writing
messages like "out of memory"---which suggests dynamic memory
probably isn't the way to go. In other words, I'd like this
messaging log to continue working when all else has failed.

But with static memory, obviously, I'm limited to a fixed buffer
size. What is too big and what is too little? The accounting
information alone could easily occupy 80+ bytes. Plus, I don't like
the idea of limiting the length of the user's message.

On the other hand, say I use dynamic memory. Should I just assume
that if malloc() isn't available, the system is pretty unusable
anyway?

As for dynamically allocating buffers for sprintf(), a bit of
research showed that some people fopen() /dev/null (or NUL on
Windows), fprintf() to that file pointer, and malloc() with the
return value of fprintf().

What about using tmpfile() as a dynamic memory store?

FWIW, I took a very cursory glance at the glib (www.gtk.org) source,
and it looks like it uses malloc() for buffering messages (though I
didn't check to see what it does when malloc() fails).

Anyway, any discussion is welcome!

Thank you,
Matt
 
Michael Mair

Matt said:
I'm trying to develop (for my own personal use) a general
"messaging facility" in C. The idea is that the facility will be
used for both debugging/logging and user interfacing. I'd like it
to be fairly portable, but as robust and reliable as possible.

My intent is to develop an API for such general messaging. I'll
implement the parts that write to stdout, stderr and files. Later,
I'll add support for writing to (for example) syslog, graphical
windows, network sockets, etc.

So what I'd like to do is create a printf()-like function (i.e. one
that takes variable arguments and is formatted according to the
printf() conversion mechanism). This function would then take the
user's message, plus some additional accounting information (time
message was created, message "level", source file location, etc).

As for making this extendable, I figured I'd typedef something like
this:

int message_function(int flags, const char* message);

So I can quickly create new functions that write to arbitrary
locations.

Umh, why do you not want to use
int message_function(int flags, const char* formatstring, ...);
as wrappers for v([s|sn|f]?)printf()?

What this all boils down to is I'll have to use sprintf() and write
to a buffer. It looks like there is no perfect way to do this, so
I'm trying to get an idea of the pros and cons of the different
strategies.

If you have a wrapper for vsnprintf(), you can break down your
format string if necessary and use a fixed size buffer.

First question would be static memory versus dynamic memory. Keep
in mind that this is a logger/debugger, so I could be writing
messages like "out of memory"---which suggests dynamic memory
probably isn't the way to go. In other words, I'd like this
messaging log to continue working when all else has failed.

But with static memory, obviously, I'm limited to a fixed buffer
size. What is too big and what is too little? The accounting
information alone could easily occupy 80+ bytes. Plus, I don't like
the idea of limiting the length of the user's message.

On the other hand, say I use dynamic memory. Should I just assume
that if malloc() isn't available, the system is pretty unusable
anyway?

As for dynamically allocating buffers for sprintf(), a bit of
research showed that some people fopen() /dev/null (or NUL on
Windows), fprintf() to that file pointer, and malloc() with the
return value of fprintf().

If you have access to (v?)snprintf() (from the C99 standard library),
you can use this instead:
size = snprintf(NULL, 0, format, ....);
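A minimal sketch of that sizing call in a complete helper, assuming a C99 library (conforming vsnprintf() and va_copy). The name format_alloc is hypothetical and not part of any API mentioned in this thread.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc()ed formatted string, or NULL on
   a formatting error or allocation failure. The caller frees the result. */
static char *format_alloc(const char *format, ...)
{
    va_list args, args2;
    char *buf;
    int size;

    va_start(args, format);
    va_copy(args2, args);                 /* the sizing pass consumes args */
    size = vsnprintf(NULL, 0, format, args);
    va_end(args);

    if (size < 0) {                       /* formatting error */
        va_end(args2);
        return NULL;
    }
    buf = malloc((size_t)size + 1);       /* +1 for the terminating '\0' */
    if (buf != NULL)
        vsnprintf(buf, (size_t)size + 1, format, args2);
    va_end(args2);
    return buf;
}
```

The arguments are formatted twice, once to measure and once to write, so this trades some speed for an exact allocation.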

What about using tmpfile() as a dynamic memory store?

Same argument as with malloc() calls failing.

FWIW, I took a very cursory glance at the glib (www.gtk.org) source,
and it looks like it uses malloc() for buffering messages (though I
didn't check to see what it does when malloc() fails).

Anyway, any discussion is welcome!

HTH
Michael
 
SM Ryan

# As for making this extendable, I figured I'd typedef something like
# this:
#
# int message_function(int flags, const char* message);

That's a prototype, not a typedef.
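For comparison, a sketch of what an actual typedef for such output functions could look like. The names message_fn, write_stdout, write_stderr and dispatch are hypothetical, chosen only for illustration.

```c
#include <stdio.h>

/* Hypothetical type: pointer to a function that writes one message. */
typedef int (*message_fn)(int flags, const char *message);

static int write_stdout(int flags, const char *message)
{
    (void)flags;                          /* unused in this sketch */
    return fputs(message, stdout) < 0 ? -1 : 0;
}

static int write_stderr(int flags, const char *message)
{
    (void)flags;
    return fputs(message, stderr) < 0 ? -1 : 0;
}

/* Send one message to every sink; returns the number of sinks that failed. */
static int dispatch(const message_fn *sinks, size_t nsinks,
                    int flags, const char *message)
{
    size_t i;
    int failures = 0;

    for (i = 0; i < nsinks; i++)
        if (sinks[i](flags, message) != 0)
            failures++;
    return failures;
}
```

New destinations (syslog, sockets, windows) would then be added by writing another function with the same signature and putting it in the table.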

If your system has vasprintf, you can do something like

int message_function(int flags, const char* message,...);
int vmessage_function(int flags, const char* message,va_list);
int smessage_function(int flags, char *buffer);

int message_function(int flags, const char* message,...) {
    va_list list; int r;
    va_start(list,message); r = vmessage_function(flags,message,list); va_end(list);
    return r;
}
int vmessage_function(int flags, const char* message,va_list list) {
    char *buffer; int r;
    /* vasprintf returns -1 on failure and leaves buffer undefined */
    if (vasprintf(&buffer,message,list) < 0) return -1;
    r = smessage_function(flags,buffer);
    free(buffer);
    return r;
}


If not, you have to do a more cumbersome method to restart the va_list
after sizing the buffer.

int message_function(int flags, const char* message,...) {
    va_list list; int r,n; char staticbuffer[1000],*dynamicbuffer = 0;
    va_start(list,message);
    n = vsnprintf(staticbuffer,sizeof staticbuffer,message,list);
    va_end(list);
    if (n<0) return -1;   /* formatting error (or a pre-C99 vsnprintf) */
    if ((size_t)n<sizeof staticbuffer) {
        r = smessage_function(flags,staticbuffer);
    }else if ((dynamicbuffer = malloc((size_t)n+1)) == 0) {
        /* out of memory: fall back to the truncated text */
        r = smessage_function(flags,staticbuffer);
    }else {
        va_start(list,message);
        vsnprintf(dynamicbuffer,(size_t)n+1,message,list);
        va_end(list);
        r = smessage_function(flags,dynamicbuffer);
        free(dynamicbuffer);
    }
    return r;
}


Some implementations make it easier to restart the va_list (C99
standardizes this as va_copy)

int vmessage_function(int flags, const char* message,va_list list) {
    va_list list0; char staticbuffer[1000],*dynamicbuffer = 0;
    int r,n;
    va_copy(list0,list);   /* plain assignment of a va_list is not portable */
    n = vsnprintf(staticbuffer,sizeof staticbuffer,message,list);
    if (n>=0 && (size_t)n<sizeof staticbuffer) {
        r = smessage_function(flags,staticbuffer);
    }else if (n>=0 && (dynamicbuffer = malloc((size_t)n+1)) != 0) {
        vsnprintf(dynamicbuffer,(size_t)n+1,message,list0);
        r = smessage_function(flags,dynamicbuffer);
        free(dynamicbuffer);
    }else {
        r = -1;   /* formatting error or out of memory */
    }
    va_end(list0);
    return r;
}
 
Keith Thompson

Matt Garman said:
What this all boils down to is I'll have to use sprintf() and write
to a buffer. It looks like there is no perfect way to do this, so
I'm trying to get an idea of the pros and cons of the different
strategies.

First question would be static memory versus dynamic memory. Keep
in mind that this is a logger/debugger, so I could be writing
messages like "out of memory"---which suggests dynamic memory
probably isn't the way to go. In other words, I'd like this
messaging log to continue working when all else has failed.

But with static memory, obviously, I'm limited to a fixed buffer
size. What is too big and what is too little? The accounting
information alone could easily occupy 80+ bytes. Plus, I don't like
the idea of limiting the length of the user's message.

On the other hand, say I use dynamic memory. Should I just assume
that if malloc() isn't available, the system is pretty unusable
anyway?
[...]

Here's an idea.

Declare a static buffer of some reasonable size, likely to be big
enough for most messages.

For each message, figure out how big a buffer you need to hold it
(this could be the tricky part).

If it fits in the static buffer
    Use the static buffer
else
    Try to malloc a larger buffer.
    If malloc succeeds
        Use the malloced buffer
    else
        Truncate and use the static buffer
        Log an out-of-memory message. Be brief.
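The outline above could be sketched as follows, assuming a C99 vsnprintf() that returns the full formatted length. The names log_message and emit, and the 256-byte buffer size, are hypothetical choices for illustration.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define STATIC_BUF_SIZE 256   /* assumed size; tune for typical messages */

/* Hypothetical sink; a real one might write to a file or to syslog. */
static int emit(const char *text)
{
    return fputs(text, stderr) < 0 ? -1 : 0;
}

/* Returns 0 on success, 1 if the message was truncated, -1 on error. */
static int log_message(const char *format, ...)
{
    static char static_buf[STATIC_BUF_SIZE];
    va_list args;
    char *big;
    int n, r;

    va_start(args, format);
    n = vsnprintf(static_buf, sizeof static_buf, format, args);
    va_end(args);
    if (n < 0)
        return -1;                        /* formatting error */
    if ((size_t)n < sizeof static_buf)
        return emit(static_buf);          /* it fit in the static buffer */

    big = malloc((size_t)n + 1);
    if (big == NULL) {
        /* static_buf already holds the truncated text */
        if (emit(static_buf) != 0)
            return -1;
        emit("\n[log: out of memory, message truncated]\n");
        return 1;
    }
    va_start(args, format);               /* restart the argument list */
    vsnprintf(big, (size_t)n + 1, format, args);
    va_end(args);
    r = emit(big);
    free(big);
    return r;
}
```

Because the buffer is static, this sketch is not reentrant; a multithreaded program would need a lock or a per-thread buffer.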
 
CBFalconer

Keith said:
[...] .... snip ...
But with static memory, obviously, I'm limited to a fixed buffer
size. What is too big and what is too little? The accounting
information alone could easily occupy 80+ bytes. Plus, I don't like
the idea of limiting the length of the user's message.

On the other hand, say I use dynamic memory. Should I just assume
that if malloc() isn't available, the system is pretty unusable
anyway?
[...]

Here's an idea.

Declare a static buffer of some reasonable size, likely to be big
enough for most messages.

For each message, figure out how big a buffer you need to hold it
(this could be the tricky part).

Here is some code to output numbers to a stream, and you can choose
between a buffer or recursion. A modification would be to replace
the FILE* parameter with a pointer to a putchar function. Since it
can return the character count without actually producing any
output, it can be incorporated in other formatting routines. The
idea is to avoid the heavy overhead of the printf family, together
with the insecurities of variadic functions.

#include <stdio.h>
#ifdef NORECURSE
#include <limits.h>   /* CHAR_BIT; must be included at file scope */
#endif

/* Convert unsigned value to stream of digits in base
   A NULL value for f returns a char count with no output
   Returns count of chars. output
   return is negative for any output error */
int unum2txt(unsigned long n, unsigned int base, FILE *f)
{
    /* MUST be a power of 2 in length */
    static char hexchars[] = "0123456789abcdef";

/* allow for terminal '\0' */
#define MAXBASE (sizeof(hexchars)-1)
#ifdef NORECURSE
    int count, ix;

    /* sufficient for a binary expansion */
    char buf[CHAR_BIT * sizeof(long)];
#else
    int count, err;
#endif

    if ((base < 2) || (base > MAXBASE)) base = 10;
#ifdef NORECURSE
    count = ix = 0;
    do {
        buf[ix++] = hexchars[(n % base) & (MAXBASE-1)];
    } while ((n = n / base));
    while (ix--) {
        count++;
        if (f && (putc(buf[ix], f) < 0)) return -count;
    }
#else
    count = 1;
    if (n / base) {
        if ((err = unum2txt(n / base, base, f)) < 0) return err;
        else count += err;
    }
    if (f && (putc(hexchars[(n % base) & (MAXBASE-1)], f) < 0))
        return -count;
#endif
    return count;
} /* unum2txt */
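The modification mentioned above, replacing the FILE* parameter with a pointer to a putchar-like function, might look roughly like this. It is a sketch of the iterative variant only; put_fn, unum2fn, struct strbuf and strbuf_put are hypothetical names, and the extra ctx parameter is an assumption added so the callback can carry its own state.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical callback type: returns a negative value on error. */
typedef int (*put_fn)(int ch, void *ctx);

/* Emit n in the given base through put; a NULL put only counts.
   Returns the digit count, or a negative value on an output error. */
static int unum2fn(unsigned long n, unsigned int base, put_fn put, void *ctx)
{
    static const char digits[] = "0123456789abcdef";
    char buf[CHAR_BIT * sizeof(unsigned long)];  /* enough for base 2 */
    size_t ix = 0;
    int count = 0;

    if (base < 2 || base > 16)
        base = 10;
    do {
        buf[ix++] = digits[n % base];     /* least significant digit first */
    } while ((n /= base) != 0);
    while (ix > 0) {
        ix--;
        count++;
        if (put != NULL && put(buf[ix], ctx) < 0)
            return -count;
    }
    return count;
}

/* Example sink: append to a bounded, always-terminated string buffer. */
struct strbuf { char *data; size_t len, cap; };

static int strbuf_put(int ch, void *ctx)
{
    struct strbuf *sb = ctx;

    if (sb->len + 1 >= sb->cap)
        return -1;                        /* keep room for the '\0' */
    sb->data[sb->len++] = (char)ch;
    sb->data[sb->len] = '\0';
    return 0;
}
```

Calling unum2fn() with a NULL callback first gives the length, so a caller can size a buffer before the real output pass.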
 
Malcolm

Matt Garman said:
So what I'd like to do is create a printf()-like function (i.e. one
that takes variable arguments and is formatted according to the
printf() conversion mechanism). This function would then take the
user's message, plus some additional accounting information (time
message was created, message "level", source file location, etc).

On the other hand, say I use dynamic memory. Should I just assume
that if malloc() isn't available, the system is pretty unusable
anyway?

Why not

/*
get the size of buffer needed for printf
*/
int vprintfbufflen(const char *fmt, va_list args)

The size of buffer you need is the size of the format string, plus the size
of any strings, plus a generous allowance for any numerical fields. Unless
you need the program to be robust against people deliberately trying to
sabotage you by printing integers with a width of a hundred characters and
the like, you don't need to deal with all the complexities.

As for dynamic versus static, have a static buffer of reasonable size, and
keep malloc() in reserve for big messages. If malloc() fails then you could
print out an "out of memory" message using the static buffer.
 
Michael Wojcik

If you have access to (v?)snprintf() (from the C99 standard library),
you can use this instead:
size = snprintf(NULL, 0, format, ....);

Note, however, that there are prominent snprintf implementations
which are not compliant with C99 (or SUSv2, which also fixed
snprintf's return semantics - based on a proposal from Chris Torek,
IIRC). These implementations return -1 if the buffer is too small
for the formatted string, not the length of the formatted string.

Of course, if you do use snprintf to determine the length of the
formatted string, you should check the sign of the return value
anyway, since snprintf may return -1 for formatting errors.

The major implementations I know of which have this flaw are at least
some versions of Microsoft Visual C and HP C for HP-UX and Tru64. I
haven't checked the most recent releases.
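One way to cope with both return conventions is to grow the buffer and retry whenever the result does not clearly fit. The following is only a sketch: format_retry is a hypothetical name, and the starting size and the upper limit are arbitrary assumptions.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc()ed formatted string or NULL.
   Works with C99 vsnprintf (returns the needed length) and with older
   versions that return -1 when the buffer is too small. */
static char *format_retry(const char *format, ...)
{
    size_t cap = 128;                     /* assumed starting size */
    const size_t limit = 1u << 20;        /* assumed cap; give up beyond it */
    char *buf = NULL;

    for (;;) {
        va_list args;
        char *tmp;
        int n;

        tmp = realloc(buf, cap);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;

        va_start(args, format);
        n = vsnprintf(buf, cap, format, args);
        va_end(args);

        if (n >= 0 && (size_t)n < cap)
            return buf;                   /* the whole string fit */
        if (n >= 0)
            cap = (size_t)n + 1;          /* C99: exact size is known */
        else
            cap *= 2;                     /* old semantics or error: grow */
        if (cap > limit) {                /* also ends genuine format errors */
            free(buf);
            return NULL;
        }
    }
}
```

The limit matters because a negative return cannot be told apart from a real formatting error, so without it the loop would never stop on one.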
 
Michael Wojcik

Why not

/*
get the size of buffer needed for printf
*/
int vprintfbufflen(const char *fmt, va_list args)

The size of buffer you need is the size of the format string, plus the size
of any strings, plus a generous allowance for any numerical fields. Unless
you need the program to be robust against people deliberately trying to
sabotage you by printing integers with a width of a hundred characters and
the like, you don't need to deal with all the complexities.

And lo, a security hole is born.

The software security business would be a lot smaller and duller if
it weren't for all the programmers who thought they didn't need their
code "to be robust against people deliberately trying to sabotage" it.

--
Michael Wojcik (e-mail address removed)

Most people believe that anything that is true is true for a reason.
These theorems show that some things are true for no reason at all,
i.e., accidentally, or at random. -- G J Chaitin
 
Malcolm

Michael Wojcik said:
And lo, a security hole is born.

The software security business would be a lot smaller and duller if
it weren't for all the programmers who thought they didn't need their
code "to be robust against people deliberately trying to sabotage" it.

A programmer who is calling a C api can generally cause a buffer overflow if
he has malicious intent.
So if the format string is supplied by the programmer it is no more or less
dangerous than a call to strcpy(). If it is generated by a non C programmer,
then of course there is a potential problem.

If you use C at all, you normally have to assume that the person writing the
calling code is not malicious.
 
those who know me have no need of my name

in comp.lang.c i read:
Note, however, that there are prominent snprintf implementations
which are not compliant with C99

the easiest non-snprintf solution is to make use of tmpfile() and
fprintf(), though it might draw fire for having other issues (sometimes
rightly so, sometimes not).
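a rough sketch of that approach, using only C89 library calls. format_via_tmpfile is a hypothetical name; here the text is read back from the temporary file rather than formatted a second time, which is one possible reading of the suggestion and not necessarily what was meant.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats into a temporary file, then reads the
   text back into a malloc()ed string. Returns NULL on any failure. */
static char *format_via_tmpfile(const char *format, ...)
{
    va_list args;
    FILE *tmp;
    char *buf = NULL;
    int n;

    tmp = tmpfile();                      /* opened in "wb+" mode */
    if (tmp == NULL)
        return NULL;

    va_start(args, format);
    n = vfprintf(tmp, format, args);      /* n = number of chars written */
    va_end(args);

    if (n >= 0 && (buf = malloc((size_t)n + 1)) != NULL) {
        rewind(tmp);                      /* required between write and read */
        if (fread(buf, 1, (size_t)n, tmp) != (size_t)n) {
            free(buf);
            buf = NULL;
        } else {
            buf[n] = '\0';
        }
    }
    fclose(tmp);                          /* also removes the temporary file */
    return buf;
}
```

this costs file I/O per message and still needs malloc() at the end, so it does not help in the out-of-memory case raised earlier in the thread.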
 
