Does fprintf support printing a character array in hex?


mathog

Consider this code snippet:

char buffer[128];
int i,len;
/* code that puts data in buffer and sets len */
for(i=0;i<len;i++){fprintf(stdout,"%x",buffer[i]);}

With the newer formatting options in fprintf (like "*" and "%m") can the
for loop now be replaced with a single fprintf like:

fprintf(stdout,"%????x",buffer,len);

where ???? is some combination of the newer format options.

Thanks,

David Mathog
 

Ben Pfaff

mathog said:
Consider this code snippet:

char buffer[128];
int i,len;
/* code that puts data in buffer and sets len */
for(i=0;i<len;i++){fprintf(stdout,"%x",buffer[i]);}


You don't want %02x?

With the newer formatting options in fprintf (like "*" and "%m") can
the for loop now be replaced with a single fprintf like:

fprintf(stdout,"%????x",buffer,len);

No.

"*" only affects the field width or precision. That doesn't
help.

"%m" isn't standardized. In glibc, it prints the string
corresponding to an error code in `errno'. That doesn't help
either.

I don't know of any extensions for this.
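
Since no single conversion does this, one workaround is to build the hex text first and then print it with one call. What follows is only a minimal sketch: hex_encode is a hypothetical helper name, not a standard library function, and the choice of lowercase two-digit output is an assumption.

```c
#include <stdio.h>
#include <stddef.h>

/* hex_encode is a hypothetical helper, not a standard library function.
   It writes len bytes from src into dst as two lowercase hex digits each,
   followed by a terminating null. Returns the number of characters written
   (excluding the null), or 0 if dst is too small or len is 0. */
size_t hex_encode(char *dst, size_t dstsize, const void *src, size_t len)
{
    const unsigned char *p = src;   /* unsigned char avoids sign extension */
    size_t i;

    if (dstsize == 0 || len > (dstsize - 1) / 2)
        return 0;
    for (i = 0; i < len; i++)
        snprintf(dst + 2 * i, 3, "%02x", (unsigned)p[i]);
    dst[2 * len] = '\0';            /* also covers len == 0 */
    return 2 * len;
}
```

With the buffer from the question, the loop could then become:

    char hex[2 * sizeof buffer + 1];
    if (hex_encode(hex, sizeof hex, buffer, (size_t)len))
        fprintf(stdout, "%s", hex);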
 

Ben Bacarisse

mathog said:
Consider this code snippet:

char buffer[128];
int i,len;
/* code that puts data in buffer and sets len */
for(i=0;i<len;i++){fprintf(stdout,"%x",buffer[i]);}


Minor point: fprintf(stdout, "%x", (unsigned)buffer[i]); char promotes
(most likely) to int but %x expects an unsigned int. It's not going to
go wrong, but it stops anyone thinking "hang on...".

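
One related detail may be worth a sketch: where plain char is signed and the buffer holds bytes of 0x80 or above, casting straight to unsigned gives a sign-extended value, so going through unsigned char first is a common habit. The function name below is hypothetical, and signed char is used explicitly so the behaviour does not depend on how the platform treats plain char.

```c
#include <stdio.h>

/* format_byte_both_ways is a hypothetical name used only for illustration.
   It formats the same byte with and without an intermediate conversion
   to unsigned char. Each output buffer must hold at least n characters. */
void format_byte_both_ways(signed char c, char *direct, char *via_uchar,
                           size_t n)
{
    /* A negative value converted straight to unsigned wraps around, so
       -1 becomes UINT_MAX and prints as all f's. */
    snprintf(direct, n, "%x", (unsigned)c);

    /* Converting to unsigned char first keeps only the byte's value. */
    snprintf(via_uchar, n, "%02x", (unsigned)(unsigned char)c);
}
```

For a byte such as 0x41 both forms give the same text; they differ only for negative char values.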
With the newer formatting options in fprintf (like "*" and "%m")

* is very old and %m so new I've never heard of it before. Did you mean
%n (which is also quite old)?

can the for loop now be replaced with a single fprintf like:

fprintf(stdout,"%????x",buffer,len);

where ???? is some combination of the newer format options.

No. The only "multiple" formats are for strings formatted as
characters.

I don't want this to sound snotty, but doesn't your reference manual
(online, local to your system, dead trees, whatever) answer this
question for you? Programming is hard enough without having to remember
what %#hho does (I just had look that up). The widely cited draft of
the C standard is not a bad reference for the library. The descriptions
are often simpler and clearer than anywhere else.
 

Ben Pfaff

mathog said:
Well, that was sort of my point. By (slightly odd) analogy, it is
as if the language implements

od -c file

(via the %s format) but not

od -x file

The only difference is how each char is formatted. Ah, wait, I see
why they didn't do it, the zero character is a string delimiter, so
that sort of format string would not be very useful for printing
general
hex data. But printing strings which are not null terminated is also
a problem.

No it's not. Use "%.*s".
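
A minimal sketch of the "%.*s" approach, for illustration only. The wrapper name format_prefix is hypothetical; the precision argument must be an int, and %s still stops early if it meets an embedded null character, so this covers unterminated text but not arbitrary binary data.

```c
#include <stdio.h>

/* format_prefix is a hypothetical wrapper used for illustration.
   It formats at most len characters of buf, which need not be null
   terminated, into dst. Returns what snprintf returns: the number of
   characters that the full output requires. */
int format_prefix(char *dst, size_t dstsize, const char *buf, int len)
{
    /* With a precision, %s reads no more than len characters from buf. */
    return snprintf(dst, dstsize, "%.*s", len, buf);
}
```

The same conversion works directly on a stream: fprintf(stdout, "%.*s", len, buffer);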
 

Ben Bacarisse

mathog said:
Well, that was sort of my point. By (slightly odd) analogy, it is
as if the language implements

od -c file

(via the %s format) but not

od -x file

The only difference is how each char is formatted.

Exactly. That's why I said "for strings formatted as characters".
There is no way to alter how the string's elements are formatted --
it's as characters and not hex or decimal numbers or...

Anyway, we're on the same page: there's no built-in way to do this.

Ah, wait, I see
why they didn't do it, the zero character is a string delimiter, so
that sort of format string would not be very useful for printing
general hex data. But printing strings which are not null terminated is also
a problem. Both of these could be handled by a width based %s variant
like:

fprintf(stdout,"%*sc",len,buffer);

As "the other Ben" has pointed out, this is already available: "%.*s".

Which would be "print to a field width of len using character formatting
from character data in array 'buffer'". This would be equivalent to:

fprintf(stdout,"%s",buffer);

when buffer is null terminated and len=strlen(buffer). But the first
form would work even if buffer contained nulls or was not terminated.

Similarly:

fprintf(stdout,"%*sx",2*len,buffer);

would print the hex format (.2 form) of the characters. And by
extension, for wide chars

fprintf(stdout,"%*lsx",4*len,buffer);

What's the 4 for? I don't like the idea of multiplying the length -- it
seems irrelevant. len bytes are to be printed and the format should say
how, surely?

I see three problems: first you need to find a way that does not add
characters to the format. I need to be able to write "%sc" and have a
'c' come after the string. Secondly, you'd want to borrow as much power
as you can from the existing formats. I should be able to use field
widths, and precision, and alternate forms, and padding characters for
my multiple hex bytes (or, indeed, my multiple ints or whatever). The
third is that one would often want to include literal characters as well
(printing multiple bytes in hex is problematic to read unless there is a
separator).
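
Short of a new conversion, these requirements can be approximated with an ordinary helper that reuses an existing per-element format and takes the separator as a separate argument. This is a sketch under assumptions: format_bytes is a hypothetical name, and the caller must pass an elem_fmt containing exactly one conversion that consumes an unsigned int (such as "%02x", "%3u" or "%#o"), since the compiler cannot check a format that is not a literal.

```c
#include <stdio.h>

/* format_bytes is a hypothetical helper, not a standard function.
   It formats each of the len bytes in data with elem_fmt, writing sep
   between elements, into dst. Returns the length of the result, or -1
   if dst is too small or snprintf reports an error. */
int format_bytes(char *dst, size_t dstsize, const char *elem_fmt,
                 const char *sep, const unsigned char *data, size_t len)
{
    size_t used = 0;
    size_t i;

    if (dstsize == 0)
        return -1;
    dst[0] = '\0';
    for (i = 0; i < len; i++) {
        /* Separator goes before every element except the first. */
        int n = snprintf(dst + used, dstsize - used, "%s", i ? sep : "");
        if (n < 0 || (size_t)n >= dstsize - used)
            return -1;
        used += (size_t)n;

        /* The element itself, using whatever flags and width the caller chose. */
        n = snprintf(dst + used, dstsize - used, elem_fmt, (unsigned)data[i]);
        if (n < 0 || (size_t)n >= dstsize - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

For example, format_bytes(out, sizeof out, "%02x", ":", data, 3) on the bytes 0x01, 0xab, 0x0f produces "01:ab:0f".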

I started sketching a design, but it turned into "no one expects the
Spanish Inquisition". I added a third problem above and then came up
with two more mid way through. It's an interesting exercise and I'd use
it as a teaching problem if I were still in that game.

<snip>
 

mathog

Ben said:
I see three problems: first you need to find a way that does not add
characters to the format. I need to be able to write "%sc" and have a
'c' come after the string. Secondly, you'd want to borrow as much power
as you can from the existing formats. I should be able to use field
widths, and precision, and alternate forms, and padding characters for
my multiple hex bytes (or, indeed, my multiple ints or whatever). The
third is that one would often want to include literal characters as well
(printing multiple bytes in hex is problematic to read unless there is a
separator).

printf isn't really right for this. The problem isn't hard, but the
syntax gets messy in a hurry, and printf is getting messier and messier
as more things are added to it.

Almost certainly this has been worked out more elegantly in some other
language, but, in general terms a print function needs:

1. input identifiers (defaults to order of arguments, if
that has an unambiguous meaning)
2. number of elements to print (defaults to 1)
3. format for each element (width, precision, hex, integer, etc.
There is no default for this part.)
4. data type (c=char, w=wide char, s=short, i=int, etc., defaults
to whatever the format would use in printf).

Printf went off course back in the K&R days by mixing 3 and 4, and
I don't think it is possible at this point to separate them again.
Anyway, a more generic format specifier might look like

%repeat#(%<modifiers>format(datatype(input)))

If the datatype matches the default for format, then leave off
(datatype). If only one instance is needed leave off %number#().
If the current index for an input is needed precede it with i.
The repeat construct could be nested.

For simple things the function still looks like printf:

general_print(stdout,"%s\n",string);

but soon it doesn't. Print a line of 400 hex characters, 4 hex
characters for each of 100 wide chars:

general_print(stdout,"%100#(%4.4x(w))\n",wbuffer);

or if length is a variable, one of these

general_print(stdout,"%*#(%2.2x(w))\n",len,wbuffer);
general_print(stdout,"%#(1)(%2.2x(w))\n",len,wbuffer);

Allow nested repeats and access to the (implicit) input indices and it
becomes more powerful, but the syntax gets complicated. Here's hoping I
counted parens correctly:

general_print(stdout,
"%#(1)(%3d(i2) %5#(%3x(i(2)):%2d(l(3)), )%3x((2)):%2d(l(3))\n)",
len,buffer1,buffer2);

which if len was 4 would print:
0 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
5 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
10 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
15 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy

where the xxx values are hex taken as ints from buffer 1 and yy are
decimal values as longs from buffer2.

If the input indices are not hidden it will look a lot like fortran
formatting.

By this point it looks like named format blocks would help, since the
same pattern was used twice in the example above. Something like
this if standard defines are to be understood within the print format
line:

#define ABLOCK %3d(i(2)):%2d(l(3))
general_print(stdout,
"%#(1)(%3d(i2) %5#({ABLOCK}, ){ABLOCK}\n)",
len,buffer1,buffer2);

And that's enough speculation for one day.


Regards,

David Mathog
 

Jorgen Grahn

* is very old and %m so new I've never heard of it before. Did you mean
%n (which is also quite old)?

I think he meant that he hasn't learned about * until recently. There
are (for some reason) lots of programmers who haven't, and work around
the painful perceived lack of it in various complicated ways.

%m is a GNU thing, like someone else wrote. And since gcc itself
complains about it not being standard, I think few people use it even
when they "know" their code is never going to be compiled against
any other libc.
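
For code that has to build against other C libraries, the usual portable substitute for "%m" is to pass strerror(errno) to an ordinary %s. A minimal sketch follows; format_errno is a hypothetical helper name, and it takes the error number as a parameter on the assumption that the caller saved errno right after the failing call.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* format_errno is a hypothetical helper used for illustration.
   It formats "what: message" where message is the strerror() text for
   saved_errno. Returns what snprintf returns. */
int format_errno(char *dst, size_t dstsize, const char *what, int saved_errno)
{
    return snprintf(dst, dstsize, "%s: %s", what, strerror(saved_errno));
}
```

The equivalent direct call on a stream would be fprintf(stderr, "%s: %s\n", what, strerror(errno));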

/Jorgen
 

Nick Keighley

Ben Bacarisse wrote:

[trying to get printf to print an array in hex]
printf isn't really right for this.

that was my thought on your /first/ post!

The problem isn't hard,

I think it /is/ hard, that's why it hasn't been solved in a
satisfactory way.

but the
syntax gets messy in a hurry, and printf is getting messier and messier
as more things are added to it.

printf basically implements a "little language". You are trying to
bolt more features onto its interpreter. You can't just say "it's easy
but the syntax is messy" because syntax is the essence of the problem.

Almost certainly this has been worked out more elegantly in some other
language, but, in general terms a print function needs:

  1.  input identifiers (defaults to order of arguments, if
      that has an unambiguous meaning)
  2.  number of elements to print (defaults to 1)

those other languages manage without passing it explicitly

  3.  format for each element (width, precision, hex, integer, etc.
      There is no default for this part.)

actually I'd have thought there would be defaults

  4.  data type  (c=char, w=wide char, s=short, i=int, etc., defaults
      to whatever the format would use in printf).

again those other languages (Pascal, Scheme, C++ spring to mind) can
manage without passing this information explicitly.

Printf went off course back in the K&R days by mixing 3 and 4, and
I don't think it is possible at this point to separate them again.

I quite like printf formats. It achieves reasonable expressiveness with
a compact but not unreadable format. C++ (for instance) looks very
pretty right up until you want to fiddle with format. Then it explodes
in size.

Anyway, a more generic format specifier might look like

   %repeat#(%<modifiers>format(datatype(input)))

If the datatype matches the default for format, then leave off
(datatype).  If only one instance is needed leave off %number#().
If the current index for an input is needed precede it with i.
The repeat construct could be nested.

For simple things the function still looks like printf:

   general_print(stdout,"%s\n",string);

but soon it doesn't.  Print a line of 400 hex characters, 4 hex
characters for each of 100 wide chars:

   general_print(stdout,"%100#(%4.4x(w))\n",wbuffer);

hmm. this might actually work. Have you considered implementing it?

or if length is a variable, one of these

   general_print(stdout,"%*#(%2.2x(w))\n",len,wbuffer);
   general_print(stdout,"%#(1)(%2.2x(w))\n",len,wbuffer);

Allow nested repeats and access to the (implicit) input indices and it
becomes more powerful, but the syntax gets complicated.  Here's hoping I
counted parens correctly:

   general_print(stdout,
     "%#(1)(%3d(i2)  %5#(%3x(i(2)):%2d(l(3)), )%3x((2)):%2d(l(3))\n)",
      len,buffer1,buffer2);

ah. Lisp.
We haven't moved very far from 1958 have we?

which if len was 4 would print:
   0  xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
   5  xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
  10  xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
  15  xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy

where the xxx values are hex taken as ints from buffer 1 and yy are
decimal values as longs from buffer2.

If the input indices are not hidden it will look a lot like fortran
formatting.

I remember Fortran being much clunkier. But that was long ago.
 

BartC

Anyway, a more generic format specifier might look like

%repeat#(%<modifiers>format(datatype(input)))
Allow nested repeats and access to the (implicit) input indices and it
becomes more powerful, but the syntax gets complicated. Here's hoping I
counted parens correctly:

general_print(stdout,
"%#(1)(%3d(i2) %5#(%3x(i(2)):%2d(l(3)), )%3x((2)):%2d(l(3))\n)",
len,buffer1,buffer2);

It's more powerful, but also impossible to understand.

which if len was 4 would print:
0 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
5 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
10 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
15 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy

where the xxx values are hex taken as ints from buffer 1 and yy are
decimal values as longs from buffer2.

A more conventional approach might look like:

for (i=0; i<len; ++i) {
    if ((i%5)==0) printf("%4d ",i);
    printf("%03d:%02d",buffer1[i],buffer2[i]);
    printf("%s",(i+1)%5!=0?", ":"\n");
}

which at least is possible to debug if it's not quite right. And typos are
more likely to be picked up at compile time.

And that's enough speculation for one day.

I think I once proposed an array format-specifier, but can't remember the
details. But based on your idea, it might look like:

printf ("(%5#, %d)", array); or printf("(%*#, %d)",5,array);

which might print out (10,20,30,40,50). Only repeat count and separator
string are specified. And the array elements must be of the type implied by the
format.
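
Until something like that exists, the same output is easy to get from a small function. This is only a sketch of the idea: format_int_array is a hypothetical name, the element type is fixed to int, and the parentheses are hard-coded rather than taken from a format string.

```c
#include <stdio.h>

/* format_int_array is a hypothetical helper, not a standard function.
   It writes "(a0<sep>a1<sep>...)" for the n ints in a into dst.
   Returns the length of the result, or -1 if dst is too small or
   snprintf reports an error. */
int format_int_array(char *dst, size_t dstsize, const char *sep,
                     const int *a, size_t n)
{
    size_t used;
    size_t i;
    int k = snprintf(dst, dstsize, "(");

    if (k < 0 || (size_t)k >= dstsize)
        return -1;
    used = (size_t)k;
    for (i = 0; i < n; i++) {
        /* Separator before every element except the first. */
        k = snprintf(dst + used, dstsize - used, "%s%d", i ? sep : "", a[i]);
        if (k < 0 || (size_t)k >= dstsize - used)
            return -1;
        used += (size_t)k;
    }
    k = snprintf(dst + used, dstsize - used, ")");
    if (k < 0 || (size_t)k >= dstsize - used)
        return -1;
    return (int)(used + (size_t)k);
}
```

Called with the values 10, 20, 30, 40, 50 and "," as the separator, it produces (10,20,30,40,50).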

Anything more ambitious I think would quickly get too complex.
 

David Thompson

Not quite. od -c converts a nongraphic char to a visible sequence
(and also spaces between each char to allow for that). %s transmits
supported controls for their control effect, and any unused codes
(e.g. 80-FF if ASCII in CHAR_BIT==8) with undefined effect.
od -c is more akin to (but not the same as) cat -v or sed's l
(lower-case L) command. Also, by default od prefixes each line with the
(starting) location, but that can be suppressed with -An.

Exactly. That's why I said "for strings formatted as characters".
There is no way to alter how the string's elements are formatted --
it's as characters and not hex or decimal numbers or...

Anyway, we're on the same page: there's no built-in way to do this.

I see three problems: first you need to find a way that does not add
characters to the format. I need to be able to write "%sc" and have a
'c' come after the string. Secondly, you'd want to borrow as much power
as you can from the existing formats. I should be able to use field
widths, and precision, and alternate forms, and padding characters for
my multiple hex bytes (or, indeed, my multiple ints or whatever). The
third is that one would often want to include literal characters as well
(printing multiple bytes in hex is problematic to read unless there is a
separator).

Fortran formats, on which C's *printf/scanf were loosely based, allow
a given format specifier to be repeated, or format specifiers and/or
literal data grouped by parens to be repeated. This might give us
something like (with unneeded string concats added for emphasis):

printf ("%16:" "%(" "%02hhx" "," "%)" "\n", array_of_16_bytes);

I think you could even do a half-decent job on complex numbers, now
that C99 has them, but haven't actually worked out the details.

Aside: OP has %repeat# downthread, but # is already used as a flag
(for octal or hex force 0 or 0x/0X, for floating force decimal point
even if not needed and for g/G force trailing zeros). There aren't
many good punctuation characters left, and not all that many letters
that aren't already either conversions or modifiers. If we could
require 8859-1 (instead of just near-ASCII) for C source, we'd have
dozens of new characters to make format strings even more cryptic and
illegible. Wait -- what was the benefit of this plan again?

I started sketching a design, but it turned into "no one expects the
Spanish Inquisition". I added a third problem above and then came up
with two more mid way through. It's an interesting exercise and I'd use
it as a teaching problem if I were still in that game.

Then can I get the comfortable pillow?
 

Gene

Consider this code snippet:

char buffer[128];
int  i,len;
   /* code that puts data in buffer and sets len */
for(i=0;i<len;i++){fprintf(stdout,"%x",buffer[i]);}

With the newer formatting options in fprintf (like "*" and "%m") can the
for loop now be replaced with a single fprintf like:

    fprintf(stdout,"%????x",buffer,len);

where ???? is some combination of the newer format options.


No and - trust me - you don't want to reinvent anything like Common
Lisp's format specifiers. At least not if you ever must read anyone
else's code that uses them ... or your own after more than a month.
 
