Problem with printf-like logging framework and inline hexdumps.

Jef Driesen · Sep 13, 2012

Hi,

For one of my projects, I created a simple logging framework. I have an
(opaque) context object which contains a buffer:

typedef struct dc_context_t {
char msg[4096];
/* More stuff here. */
} dc_context_t;

and a printf-like function to log a message. The logging function uses
the vsnprintf() function to write a string representation of the
arguments to the context buffer. The resulting string is then passed to
a user-defined callback function.

int
dc_context_log (dc_context_t *context, const char *format, ...)
{
va_list ap;

va_start (ap, format);
vsnprintf (context->msg, sizeof (context->msg), format, ap);
va_end (ap);

callback (context->msg);
}

This works perfect for all printf-style logging. For example:

dc_context_log (context, "%s (%d)", "An error message", 3);

However, I would like to log some binary data too. Let's say I want to
log a message with an inline hexdump, something like this:

"DEBUG: data=<DATA>, size=<SIZE>"

Since there is no printf format string available for hexdumps, I could
write a little dc_context_hexdump() helper function to convert the
binary data into a hex string, and then produce the log message in 3
steps:

dc_context_log (context, "%s: data=", "DEBUG");
dc_context_hexdump (context, hexdata, size);
dc_context_log (context, "", size);

This would work fine, except that I need the output in only one piece.
There are several reasons for that. I simplified the code a bit for this
post, but in reality the dc_context_log() function also receives extra
info, like the __FILE__ and __LINE__ macros. When printing the output, I
only want to see the file and line info appear only once, and not three
times. Also in a multithreaded environment, another thread may call the
logging function between each of the three calls, and thus interleaving
the output.

How would you workaround this?

A possible approach might be to convert the binary data to string
first, and then pass that to the dc_context_log call. Something like
this:

void
somefunction (dc_context_t *context, unsigned char *data, unsigned int
size)
{
/* Do something useful here. */

/* Log the binary data. */
hexstr = hexdump (data, size);
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr, size);
}

But this hexdump function isn't thread-safe (because it would have to
return a pointer to a static buffer). And even if I would make a
re-entrant hexdump_r() version, it's not ideal because then I would have
to allocate a buffer inside the somefunction(). I don't want to do that
because the dc_context_log is actually called through a macro, which can
be configured at compile time to omit the actual call (e.g. in
production builds). But then the memory allocation would still be there,
and I don't want to pollute application code with #ifdef's. An temporary
buffer on the stack won't work either, because the size isn't fixed.

Suggestions are welcome!

Jef

Heinrich Wolf · Sep 13, 2012

....

But this hexdump function isn't thread-safe (because it would have to
return a pointer to a static buffer). And even if I would make a
re-entrant hexdump_r() version, it's not ideal because then I would have
to allocate a buffer inside the somefunction(). I don't want to do that
because the dc_context_log is actually called through a macro, which can
be configured at compile time to omit the actual call (e.g. in
production builds). But then the memory allocation would still be there,
and I don't want to pollute application code with #ifdef's. An temporary
buffer on the stack won't work either, because the size isn't fixed.

Suggestions are welcome!

Jef

Maybe you supply the buffer for the hex dump locally from a the caller of
the logging.

Heiner

Barry Schwarz · Sep 13, 2012

Hi,

For one of my projects, I created a simple logging framework. I have an
(opaque) context object which contains a buffer:

typedef struct dc_context_t {
char msg[4096];
/* More stuff here. */
} dc_context_t;

and a printf-like function to log a message. The logging function uses
the vsnprintf() function to write a string representation of the
arguments to the context buffer. The resulting string is then passed to
a user-defined callback function.

int
dc_context_log (dc_context_t *context, const char *format, ...)
{
va_list ap;

va_start (ap, format);
vsnprintf (context->msg, sizeof (context->msg), format, ap);
va_end (ap);

callback (context->msg);
}

This works perfect for all printf-style logging. For example:

dc_context_log (context, "%s (%d)", "An error message", 3);

However, I would like to log some binary data too. Let's say I want to
log a message with an inline hexdump, something like this:

"DEBUG: data=<DATA>, size=<SIZE>"

Since there is no printf format string available for hexdumps, I could
write a little dc_context_hexdump() helper function to convert the
binary data into a hex string, and then produce the log message in 3
steps:

dc_context_log (context, "%s: data=", "DEBUG");
dc_context_hexdump (context, hexdata, size);
dc_context_log (context, "", size);

This would work fine, except that I need the output in only one piece.
There are several reasons for that. I simplified the code a bit for this
post, but in reality the dc_context_log() function also receives extra
info, like the __FILE__ and __LINE__ macros. When printing the output, I
only want to see the file and line info appear only once, and not three
times. Also in a multithreaded environment, another thread may call the
logging function between each of the three calls, and thus interleaving
the output.

How would you workaround this?

A possible approach might be to convert the binary data to string
first, and then pass that to the dc_context_log call. Something like
this:

void
somefunction (dc_context_t *context, unsigned char *data, unsigned int
size)
{
/* Do something useful here. */

/* Log the binary data. */
hexstr = hexdump (data, size);
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr, size);
}

But this hexdump function isn't thread-safe (because it would have to
return a pointer to a static buffer). And even if I would make a
re-entrant hexdump_r() version, it's not ideal because then I would have
to allocate a buffer inside the somefunction(). I don't want to do that
because the dc_context_log is actually called through a macro, which can
be configured at compile time to omit the actual call (e.g. in
production builds). But then the memory allocation would still be there,
and I don't want to pollute application code with #ifdef's. An temporary
buffer on the stack won't work either, because the size isn't fixed.

You need a buffer that will survive after hexdump returns. I can
think of only three ways to achieve this: static, dynamic, and
provided by somefunction.

If you need hexdump to be thread safe, then static is probably not the
answer.

If you choose to allocate the buffer dynamically, it need not be done
in somefunction. It could be done in hexdump. Either way though, it
would need to be freed in somefunction after dc_context_log returns.
You could insure that hexdump is called only when dc_context_log is
about to be called by moving the assignment of hexstr into the
argument list of the call to dc_context_log. That is, replace
hexstr = hexdump (data, size);
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr, size);
with
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr = hexdump (data, size), size);
In this way, when your macro suppresses the call to dc_context_log it
also suppresses the call to hexdump and no allocation is performed. If
you initialize hexstr to NULL, then you can call free() regardless of
whether an allocation has been performed or not and your code is not
polluted.

If you choose to define the buffer as an automatic array in
somefunction, it need not be any bigger than the array msg in your
structure. You may not know how much of it you will use but you do
know it cannot be any bigger or your call to vsnprintf will overrun
members in the structure.

Ben Bacarisse · Sep 13, 2012

Jef Driesen said:
However, I would like to log some binary data too. Let's say I want to
log a message with an inline hexdump, something like this:

"DEBUG: data=<DATA>, size=<SIZE>"

Since there is no printf format string available for hexdumps, I could
write a little dc_context_hexdump() helper function to convert the
binary data into a hex string, and then produce the log message in 3
steps:

dc_context_log (context, "%s: data=", "DEBUG");
dc_context_hexdump (context, hexdata, size);
dc_context_log (context, "", size);

This would work fine, except that I need the output in only one piece.
There are several reasons for that. I simplified the code a bit for this
post, but in reality the dc_context_log() function also receives extra
info, like the __FILE__ and __LINE__ macros. When printing the output, I
only want to see the file and line info appear only once, and not three
times. Also in a multithreaded environment, another thread may call the
logging function between each of the three calls, and thus interleaving
the output.

How would you workaround this?

A possible approach might be to convert the binary data to string
first, and then pass that to the dc_context_log call. Something like
this:

void
somefunction (dc_context_t *context, unsigned char *data, unsigned int
size)
{
/* Do something useful here. */

/* Log the binary data. */
hexstr = hexdump (data, size);
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr, size);
}

But this hexdump function isn't thread-safe (because it would have to
return a pointer to a static buffer). And even if I would make a
re-entrant hexdump_r() version, it's not ideal because then I would have
to allocate a buffer inside the somefunction(). I don't want to do that
because the dc_context_log is actually called through a macro, which can
be configured at compile time to omit the actual call (e.g. in
production builds). But then the memory allocation would still be there,
and I don't want to pollute application code with #ifdef's. An temporary
buffer on the stack won't work either, because the size isn't fixed.

Suggestions are welcome!

GCC's printf (well, glibs's printf in fact) allows you to define new
format specifiers. Maybe that would suit?

Rolling your own format processing is not that hard. I'd process the
format in chunks that bracket your own format codes. Here's an
example:

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;
vprintf(tmp, al);

printf("<%d>", va_arg(al, int));

fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

Here, a format of %H is treated as if it meant "<%d>" but you can do
whatever you like at that point (in fact, my use-case called a function
keyed from the the format letter).

Obviously you'll need a cleverer function to match a custom format. %H
should not match after a %, for example, and your formats are likely to
have counts in them.

I don't think it even breaks any of the rules about handling va_lists.

Jef Driesen · Sep 13, 2012

GCC's printf (well, glibs's printf in fact) allows you to define new
format specifiers. Maybe that would suit?

No, because I also need to support non glibc systems (Windows).

Rolling your own format processing is not that hard. I'd process the
format in chunks that bracket your own format codes. Here's an
example:

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;
vprintf(tmp, al);

printf("<%d>", va_arg(al, int));

fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

Here, a format of %H is treated as if it meant "<%d>" but you can do
whatever you like at that point (in fact, my use-case called a function
keyed from the the format letter).

Obviously you'll need a cleverer function to match a custom format. %H
should not match after a %, for example, and your formats are likely to
have counts in them.

I don't think it even breaks any of the rules about handling va_lists.

I also thought of something like this, but the manpages for the vprintf
functions says this on my linux system:

These functions do not call the va_end macro. Because they invoke the va_arg
macro, the value of ap is undefined after the call. See stdarg(3).

Jef

Jef Driesen · Sep 13, 2012

Hi,

For one of my projects, I created a simple logging framework. I have an
(opaque) context object which contains a buffer:

typedef struct dc_context_t {
char msg[4096];
/* More stuff here. */
} dc_context_t;

and a printf-like function to log a message. The logging function uses
the vsnprintf() function to write a string representation of the
arguments to the context buffer. The resulting string is then passed to
a user-defined callback function.

int
dc_context_log (dc_context_t *context, const char *format, ...)
{
va_list ap;

va_start (ap, format);
vsnprintf (context->msg, sizeof (context->msg), format, ap);
va_end (ap);

callback (context->msg);
}

This works perfect for all printf-style logging. For example:

dc_context_log (context, "%s (%d)", "An error message", 3);

However, I would like to log some binary data too. Let's say I want to
log a message with an inline hexdump, something like this:

"DEBUG: data=<DATA>, size=<SIZE>"

Since there is no printf format string available for hexdumps, I could
write a little dc_context_hexdump() helper function to convert the
binary data into a hex string, and then produce the log message in 3
steps:

dc_context_log (context, "%s: data=", "DEBUG");
dc_context_hexdump (context, hexdata, size);
dc_context_log (context, "", size);

This would work fine, except that I need the output in only one piece.
There are several reasons for that. I simplified the code a bit for this
post, but in reality the dc_context_log() function also receives extra
info, like the __FILE__ and __LINE__ macros. When printing the output, I
only want to see the file and line info appear only once, and not three
times. Also in a multithreaded environment, another thread may call the
logging function between each of the three calls, and thus interleaving
the output.

How would you workaround this?

A possible approach might be to convert the binary data to string
first, and then pass that to the dc_context_log call. Something like
this:

void
somefunction (dc_context_t *context, unsigned char *data, unsigned int
size)
{
/* Do something useful here. */

/* Log the binary data. */
hexstr = hexdump (data, size);
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr, size);
}

But this hexdump function isn't thread-safe (because it would have to
return a pointer to a static buffer). And even if I would make a
re-entrant hexdump_r() version, it's not ideal because then I would have
to allocate a buffer inside the somefunction(). I don't want to do that
because the dc_context_log is actually called through a macro, which can
be configured at compile time to omit the actual call (e.g. in
production builds). But then the memory allocation would still be there,
and I don't want to pollute application code with #ifdef's. An temporary
buffer on the stack won't work either, because the size isn't fixed.

Click to expand...

You need a buffer that will survive after hexdump returns. I can
think of only three ways to achieve this: static, dynamic, and
provided by somefunction.

If you need hexdump to be thread safe, then static is probably not the
answer.

If you choose to allocate the buffer dynamically, it need not be done
in somefunction. It could be done in hexdump. Either way though, it
would need to be freed in somefunction after dc_context_log returns.
You could insure that hexdump is called only when dc_context_log is
about to be called by moving the assignment of hexstr into the
argument list of the call to dc_context_log. That is, replace
hexstr = hexdump (data, size);
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr, size);
with
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr = hexdump (data, size), size);
In this way, when your macro suppresses the call to dc_context_log it
also suppresses the call to hexdump and no allocation is performed. If
you initialize hexstr to NULL, then you can call free() regardless of
whether an allocation has been performed or not and your code is not
polluted.

One of the reasons to have the msg array embedded into the context structure is
to avoid any memory allocations in the logging functions. The necessary memory
is already pre-allocated with the context structure. With this approach the
memory allocation would be re-introduced. But it's still an interesting idea of
course.

If you choose to define the buffer as an automatic array in
somefunction, it need not be any bigger than the array msg in your
structure. You may not know how much of it you will use but you do
know it cannot be any bigger or your call to vsnprintf will overrun
members in the structure.

True, but the size of the msg array in the context structure isn't public. It's
an opaque data structure and thus the size can be changed at will. And then the
code would be out of sync again.

Jef

Ben Bacarisse · Sep 13, 2012

Jef Driesen said:
On 13-09-12 22:24, Ben Bacarisse wrote:

Rolling your own format processing is not that hard. I'd process the
format in chunks that bracket your own format codes. Here's an
example:

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;
vprintf(tmp, al);

printf("<%d>", va_arg(al, int));

fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

Here, a format of %H is treated as if it meant "<%d>" but you can do
whatever you like at that point (in fact, my use-case called a function
keyed from the the format letter).

Obviously you'll need a cleverer function to match a custom format. %H
should not match after a %, for example, and your formats are likely to
have counts in them.

I don't think it even breaks any of the rules about handling va_lists.

Click to expand...

I also thought of something like this, but the manpages for the vprintf
functions says this on my linux system:

These functions do not call the va_end macro. Because they invoke the va_arg
macro, the value of ap is undefined after the call. See stdarg(3).

Ah, yes. The above is from some old code when things were a little
wilder...

I don't think all is lost. I think there is a fiddly dance you can do
with va_end and va_copy to get the job done:

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;

va_list al_copy;
va_copy(al_copy, al);

vprintf(tmp, al_copy);

va_end(al_copy);
va_end(al);
va_copy(al, al_copy);

printf("<%d>", va_arg(al, int));
fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

This passes a cursory reading of the va_* macro restrictions in the
standard but I'd welcome some more input...

Nick Keighley · Sep 14, 2012

GCC's printf (well, glibs's printf in fact) allows you to define new
format specifiers. Maybe that would suit?

Click to expand...

No, because I also need to support non glibc systems (Windows).

Rolling your own format processing is not that hard. I'd process the
format in chunks that bracket your own format codes. Here's an
example:

Click to expand...

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;
vprintf(tmp, al);

Click to expand...

printf("<%d>", va_arg(al, int));

Click to expand...

fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

Click to expand...

Here, a format of %H is treated as if it meant "<%d>" but you can do
whatever you like at that point (in fact, my use-case called a function
keyed from the the format letter).

Click to expand...

Obviously you'll need a cleverer function to match a custom format. %H
should not match after a %, for example, and your formats are likely to
have counts in them.

Click to expand...

I don't think it even breaks any of the rules about handling va_lists.

Click to expand...

I also thought of something like this, but the manpages for the vprintf
functions says this on my linux system:

These functions do not call the va_end macro. Because they invoke the va_arg
macro, the value of ap is undefined after the call. See stdarg(3).

could you pase the format string yourself? If its %H do your special
stuff otherwise pass it on to *printf()?

Ben Bacarisse · Sep 14, 2012

Nick Keighley said:
On 13-09-12 22:24, Ben Bacarisse wrote:

Rolling your own format processing is not that hard. Â I'd process the
format in chunks that bracket your own format codes. Â Here's an
example:

Click to expand...

Â Â void logit(const char *fmt, ...)
Â Â {
Â Â Â Â va_list al;
Â Â Â Â va_start(al, fmt);
Â Â Â Â const char *fp;
Â Â Â Â while ((fp = strstr(fmt, "%H")) != NULL) {
Â Â Â Â Â Â Â char tmp[fp - fmt + 1];
Â Â Â Â Â Â Â memcpy(tmp, fmt, sizeof tmp - 1);
Â Â Â Â Â Â Â tmp[sizeof tmp - 1] = 0;
Â Â Â Â Â Â Â vprintf(tmp, al);

Click to expand...

Â Â Â Â Â Â Â printf("<%d>", va_arg(al, int));

Click to expand...

Â Â Â Â Â Â Â fmt = fp + 2;
Â Â Â Â }
Â Â Â Â vprintf(fmt, al);
Â Â Â Â va_end(al);
Â Â }

Click to expand...

Here, a format of %H is treated as if it meant "<%d>" but you can do
whatever you like at that point (in fact, my use-case called a function
keyed from the the format letter).

Click to expand...

Obviously you'll need a cleverer function to match a custom format. Â %H
should not match after a %, for example, and your formats are likely to
have counts in them.

Click to expand...

I don't think it even breaks any of the rules about handling va_lists.

Click to expand...

I also thought of something like this, but the manpages for the vprintf
functions says this on my linux system:

These functions do not call the va_end macro. Â Because they invoke the va_arg
macro, the value of ap is undefined after the call. See stdarg(3).

Click to expand...

could you pase the format string yourself? If its %H do your special
stuff otherwise pass it on to *printf()?

That is essentially what my example does.

But you are proposing, I think, to pass to v*printf a format containing
only one conversion string at a time, rather than in chunks as my code
does. If so, that does not really solve the problem of the va_list
becoming indeterminate. If I've misunderstood, please post a code
sketch.

Either way, you have to juggle the lists using va_copy and whilst I
think passing one argument at a time to v*printf may make it slightly
simpler, the small extras needed to fix the above (code that I posted
more recently) pays off compared to having to parse every format.

Jef Driesen · Sep 14, 2012

Jef Driesen said:
Jef Driesen said:

On 13-09-12 22:24, Ben Bacarisse wrote:

Rolling your own format processing is not that hard. I'd process the
format in chunks that bracket your own format codes. Here's an
example:

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;
vprintf(tmp, al);

printf("<%d>", va_arg(al, int));

fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

Here, a format of %H is treated as if it meant "<%d>" but you can do
whatever you like at that point (in fact, my use-case called a function
keyed from the the format letter).

Obviously you'll need a cleverer function to match a custom format. %H
should not match after a %, for example, and your formats are likely to
have counts in them.

I don't think it even breaks any of the rules about handling va_lists.

Click to expand...

I also thought of something like this, but the manpages for the vprintf
functions says this on my linux system:

These functions do not call the va_end macro. Because they invoke the va_arg
macro, the value of ap is undefined after the call. See stdarg(3).

Click to expand...

Ah, yes. The above is from some old code when things were a little
wilder...

I don't think all is lost. I think there is a fiddly dance you can do
with va_end and va_copy to get the job done:

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;

va_list al_copy;
va_copy(al_copy, al);

vprintf(tmp, al_copy);

va_end(al_copy);
va_end(al);
va_copy(al, al_copy);

printf("<%d>", va_arg(al, int));
fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

This passes a cursory reading of the va_* macro restrictions in the
standard but I'd welcome some more input...

I think this version has the same problem as the original. After
calling the va_end macro, the va_list variable becomes undefined
according to the manpages on my linux system:

"After the call va_end(ap) the variable ap is undefined."

So I guess that if this version works, then the original would have
worked too.

Jef

Jef Driesen · Sep 14, 2012

Hi,

For one of my projects, I created a simple logging framework. I have an
(opaque) context object which contains a buffer:

typedef struct dc_context_t {
char msg[4096];
/* More stuff here. */
} dc_context_t;

and a printf-like function to log a message. The logging function uses
the vsnprintf() function to write a string representation of the
arguments to the context buffer. The resulting string is then passed to
a user-defined callback function.

int
dc_context_log (dc_context_t *context, const char *format, ...)
{
va_list ap;

va_start (ap, format);
vsnprintf (context->msg, sizeof (context->msg), format, ap);
va_end (ap);

callback (context->msg);
}

This works perfect for all printf-style logging. For example:

dc_context_log (context, "%s (%d)", "An error message", 3);

However, I would like to log some binary data too. Let's say I want to
log a message with an inline hexdump, something like this:

"DEBUG: data=<DATA>, size=<SIZE>"

Since there is no printf format string available for hexdumps, I could
write a little dc_context_hexdump() helper function to convert the
binary data into a hex string, and then produce the log message in 3
steps:

dc_context_log (context, "%s: data=", "DEBUG");
dc_context_hexdump (context, hexdata, size);
dc_context_log (context, "", size);

This would work fine, except that I need the output in only one piece.
There are several reasons for that. I simplified the code a bit for this
post, but in reality the dc_context_log() function also receives extra
info, like the __FILE__ and __LINE__ macros. When printing the output, I
only want to see the file and line info appear only once, and not three
times. Also in a multithreaded environment, another thread may call the
logging function between each of the three calls, and thus interleaving
the output.

How would you workaround this?

A possible approach might be to convert the binary data to string
first, and then pass that to the dc_context_log call. Something like
this:

void
somefunction (dc_context_t *context, unsigned char *data, unsigned int
size)
{
/* Do something useful here. */

/* Log the binary data. */
hexstr = hexdump (data, size);
dc_context_log (context, "%s: data=%s, size=%u",
"DEBUG", hexstr, size);
}

But this hexdump function isn't thread-safe (because it would have to
return a pointer to a static buffer). And even if I would make a
re-entrant hexdump_r() version, it's not ideal because then I would have
to allocate a buffer inside the somefunction(). I don't want to do that
because the dc_context_log is actually called through a macro, which can
be configured at compile time to omit the actual call (e.g. in
production builds). But then the memory allocation would still be there,
and I don't want to pollute application code with #ifdef's. An temporary
buffer on the stack won't work either, because the size isn't fixed.

Suggestions are welcome!

An alternative solution that I have been thinking about, is to
introduce a number of helper functions:

dc_context_printf
dc_context_hexdump
dc_context_flush

The first two functions would just write to the msg buffer as before,
but now with an extra variable to track the current writing position,
such that successive calls will append their data. The third function is
there to generating the output and reset the writing position.

Thread-safety could be achieved by providing two more locking functions
(dc_context_lock and dc_context_unlock) called before and after the
printing functions.

The only downside, is that I would need several function calls to
generate a single log message. For my example in the original post,
there would be at least 6 calls:

dc_context_lock (context);
dc_context_printf (context, "%s: data=", "DEBUG");
dc_context_hexdump (context, data, size);
dc_context_printf (context, "size=%d", size);
dc_context_flush (context);
dc_context_unlock (context);

Although it works, it's also getting pretty verbose, and all these function
calls will have some runtime overhead too. (The original dc_context_log function
does only call the vsnprintf function if the loglevel is high enough.)

Jef

Ben Bacarisse · Sep 14, 2012

Jef Driesen said:
Jef Driesen said:

On 13-09-12 22:24, Ben Bacarisse wrote:

Click to expand...

Rolling your own format processing is not that hard. I'd process the
format in chunks that bracket your own format codes. Here's an
example:

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;
vprintf(tmp, al);

printf("<%d>", va_arg(al, int));

fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

Here, a format of %H is treated as if it meant "<%d>" but you can do
whatever you like at that point (in fact, my use-case called a function
keyed from the the format letter).

Obviously you'll need a cleverer function to match a custom format. %H
should not match after a %, for example, and your formats are likely to
have counts in them.

I don't think it even breaks any of the rules about handling va_lists.

I also thought of something like this, but the manpages for the vprintf
functions says this on my linux system:

These functions do not call the va_end macro. Because they invoke the va_arg
macro, the value of ap is undefined after the call. See stdarg(3).

Click to expand...

Ah, yes. The above is from some old code when things were a little
wilder...

I don't think all is lost. I think there is a fiddly dance you can do
with va_end and va_copy to get the job done:

void logit(const char *fmt, ...)
{
va_list al;
va_start(al, fmt);
const char *fp;
while ((fp = strstr(fmt, "%H")) != NULL) {
char tmp[fp - fmt + 1];
memcpy(tmp, fmt, sizeof tmp - 1);
tmp[sizeof tmp - 1] = 0;

va_list al_copy;
va_copy(al_copy, al);

vprintf(tmp, al_copy);

va_end(al_copy);
va_end(al);
va_copy(al, al_copy);

printf("<%d>", va_arg(al, int));
fmt = fp + 2;
}
vprintf(fmt, al);
va_end(al);
}

This passes a cursory reading of the va_* macro restrictions in the
standard but I'd welcome some more input...

Click to expand...

I think this version has the same problem as the original. After
calling the va_end macro, the va_list variable becomes undefined
according to the manpages on my linux system:

"After the call va_end(ap) the variable ap is undefined."

The linux man pages are probably not a good source for details like
this. Both quotes you've given conflict with the C standard and I
suspect that is just a result of summarising the situation rather
bluntly -- not a reflection of the fact the glibc's implementation is
deliberately not conforming.

The standard says:

"The va_end macro may modify ap so that it is no longer usable
(without being reinitialized by the va_start or va_copy macro). If
there is no corresponding invocation of the va_start or va_copy macro,
or if the va_end macro is not invoked before the return, the behavior
is undefined."

That's not exactly the same thing. I had attempted to use va_copy get
the va_list correctly reinitialised, but I don't think I've got it
right. So, yes, there is still a problem I think.

There may be no correct va_start/va_copy/va_end pattern that works. If
so, that's a great shame, but it was presumably unavoidable. I can't
believe this old trick was ruled un-portable without it having to be so
ruled. I'll give it some more thought...

Ben Bacarisse · Sep 14, 2012

Ben Bacarisse said:
There may be no correct va_start/va_copy/va_end pattern that works. If
so, that's a great shame, but it was presumably unavoidable. I can't
believe this old trick was ruled un-portable without it having to be so
ruled. I'll give it some more thought...

I have thunk and I think there is no suitable pattern of
va_start/va_copy/va_end calls that will work.

This means that the only way to add your own formats (portably) is to
keep a synchronised copy of the va_list: you duplicate it at the start,
and you duplicate the current state when you call vprintf to process one
or more of its formats. When vprintf returns, you must scan the format
and execute the same va_args calls the vprintf will have used.

That makes it much more fiddly because of all the types that might be
involved. And to do this requires actually parsing the format
specifiers which is not trivial.

Seems a pity...

Jef Driesen · Sep 17, 2012

I have thunk and I think there is no suitable pattern of
va_start/va_copy/va_end calls that will work.

This means that the only way to add your own formats (portably) is to
keep a synchronised copy of the va_list: you duplicate it at the start,
and you duplicate the current state when you call vprintf to process one
or more of its formats. When vprintf returns, you must scan the format
and execute the same va_args calls the vprintf will have used.

That makes it much more fiddly because of all the types that might be
involved. And to do this requires actually parsing the format
specifiers which is not trivial.

Seems a pity...

A pity indeed. Implementing my own printf function isn't trivial and a bit
overkill for my needs.

Jef

Kaz Kylheku · Sep 17, 2012

I have thunk and I think there is no suitable pattern of
va_start/va_copy/va_end calls that will work.

This means that the only way to add your own formats (portably) is to
keep a synchronised copy of the va_list: you duplicate it at the start,
and you duplicate the current state when you call vprintf to process one
or more of its formats. When vprintf returns, you must scan the format
and execute the same va_args calls the vprintf will have used.

That makes it much more fiddly because of all the types that might be
involved. And to do this requires actually parsing the format
specifiers which is not trivial.

It's a piece of cake. And anyway, you don't have to support everything
in printf. If you require yourself to do a complete, perfect job,
that's a good way to get out of getting anything done.

I've used this in projects past, just for quick and dirty debugging,
not part of any permanent framework.

It's not a complete solution but it could be massaged into one.

For instance, note how it assumes that only one argument is consumed
by printf, and it just eats a void *. This worked on the target platforms
I was working on, where I was using mostly %d and %s arguments.

It also doesn't handle variable widths and precisions (* character).

These things would not be very hard to add. You need to add some
sophistication to the parser, which has to produce some info in a bunch of
local variables: number of stars which occured (to consume that many ints)
and the type of the argument (requiring a switch statement to various
va_arg calls).

int hexprintf(const char *fmt, ...)
{
int ch, count = 0;
va_list vl;

va_start (vl, fmt);

for (ch = *fmt++; ch != 0; ch = *fmt++) {
switch (ch) {
case '%': /* printf conversion */
{
/* regular printf conversion */
char conv[64] = "%";
const char *end = fmt;
va_list vl_copy;

va_copy (vl_copy, vl);

end += strspn(end, "#0-+ "); /* skip flags */
end += strspn(end, "0123456789*"); /* skip precision */
if (*end == '.') /* skip precision */
end += 1 + strspn(end + 1, "0123456789*");

/* skip length */
if (strncmp(end, "hh", 2) == 0)
end += 2;
else if (strncmp(end, "ll", 2) == 0)
end += 2;
else if (strspn(end, "hlLqjzt") > 0)
end += 1;

if (*end == '@') {
/* our hex conversion */
const unsigned char *data = va_arg (vl, unsigned char *);
int i, size = va_arg (vl, int);

for (i = 0; i < size; i++) {
count += printf("%02x", (unsigned int) data);
if (i < size - 1) {
putchar(' ');
count++;
}
}
} else {
/* copy and call printf */
memcpy(conv + 1, fmt, end - fmt + 1);
count += vprintf(conv, vl_copy);
va_end (vl_copy);
va_arg (vl, void *);
}
fmt = end + 1;
}
break;
default:
putchar(ch);
count++;
break;
}
}

va_end (vl);
return count;
}

Ben Bacarisse · Sep 17, 2012

Kaz Kylheku said:
It's a piece of cake.

Something is a piece of cake, yes, but is it what the OP was talking
about? After all, I've had a bit code that does this for decades, but
that was (quite reasonably) rejected because the OP wanted stronger
guarantees that it will work.

Your code makes different assumptions which make well suit the OP
better, but it's still a bit of a punt.

And anyway, you don't have to support everything
in printf. If you require yourself to do a complete, perfect job,
that's a good way to get out of getting anything done.

I agree, though I think printf-like logging things need a bit more care
that some other APIs because users of the logging can easily forget what
restrictions you've placed on allowable formats.

I've used this in projects past, just for quick and dirty debugging,
not part of any permanent framework.

It's not a complete solution but it could be massaged into one.

Ditto with my "not guaranteed" solution.

For instance, note how it assumes that only one argument is consumed
by printf, and it just eats a void *. This worked on the target platforms
I was working on, where I was using mostly %d and %s arguments.

It also doesn't handle variable widths and precisions (* character).

These things would not be very hard to add. You need to add some
sophistication to the parser, which has to produce some info in a bunch of
local variables: number of stars which occured (to consume that many ints)
and the type of the argument (requiring a switch statement to various
va_arg calls).

Yes, not hard. You could also add code to make %n work (it's about two
lines) and I'd want fix it to treat %% correctly.

But I don't want a pointless argument over my "not trivial" and your
"not hard". My point was that it seems a shame that there is no way to
avoid scanning the format. A standard solution could have been
provided. The v*printf functions could take, for example, extra pointer
argument which would collect a value to be passed to a new va_ macro
(va_skip?) so that the caller can synchronise its own argument list.
That might be a bit OTT, but maybe the restriction on using an
indeterminate argument list could be relaxed so that it can copied? On
what systems, I wonder, would that would cause a problem?

<snip code>

Jorgen Grahn · Sep 20, 2012

...

Maybe you supply the buffer for the hex dump locally from a the caller of
the logging.

Yes, like inet_ntop() does. And such functions should also
return a pointer to the buffer, so you can:

char buf[SOMETHING];
printf("Hello, I'm %s\n",
inet_pton(AF_INET, &some_ip_address, buf, sizeof buf));

Not that this solves all his problems, though. Printf works to well
because it appends to an infinite stream. You can hack the text up
into pieces (call printf() and friends many times) and it doesn't
matter. But with logging, each write is an entry, a distinct event;
printing each bit separately and printing one big chunk don't give
the same result.

/Jorgen

real time printf-like logging	4	Dec 1, 2007
extending printf	26	Mar 8, 2014
Telegram bot for searching and downloading songs	0	Dec 9, 2024
Dynamic Printf	11	Apr 22, 2009
Logging	9	Nov 14, 2010
Question on printf with %x	8	Apr 17, 2013
why printf("%d", arg) works with arg of type int, short, char	21	Mar 1, 2014
Weird Behavior with Rays in C and OpenGL	4	Feb 12, 2024

Problem with printf-like logging framework and inline hexdumps.

Jef Driesen

Heinrich Wolf

Barry Schwarz

Ben Bacarisse

Jef Driesen

Jef Driesen

Ben Bacarisse

Nick Keighley

Ben Bacarisse

Jef Driesen

Jef Driesen

Ben Bacarisse

Ben Bacarisse

Jef Driesen

Kaz Kylheku

Ben Bacarisse

Jorgen Grahn

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads