fprintf internally supports print character array in hex?

Discussion in 'C Programming' started by mathog, Sep 23, 2011.

  1. mathog

    mathog Guest

    Consider this code snippet:

    char buffer[128];
    int i,len;
    /* code that puts data in buffer and sets len */
    for(i=0;i<len;i++){fprintf(stdout,"%x",buffer[i]);}

    With the newer formatting options in fprintf (like "*" and "%m") can the
    for loop now be replaced with a single fprintf like:

    fprintf(stdout,"%????x",buffer,len);

    where ???? is some combination of the newer format options.

    Thanks,

    David Mathog
    mathog, Sep 23, 2011
    #1

  2. mathog

    Ben Pfaff Guest

    mathog <> writes:

    > Consider this code snippet:
    >
    > char buffer[128];
    > int i,len;
    > /* code that puts data in buffer and sets len */
    > for(i=0;i<len;i++){fprintf(stdout,"%x",buffer[i]);}


    You don't want %02x?

    > With the newer formatting options in fprintf (like "*" and "%m") can
    > the for loop now be replaced with a single fprintf like:
    >
    > fprintf(stdout,"%????x",buffer,len);


    No.

    "*" only affects the field width or precision. That doesn't
    help.

    "%m" isn't standardized. In glibc, it prints the string
    corresponding to an error code in `errno'. That doesn't help
    either.

    I don't know of any extensions for this.
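
    Since no conversion specifier does this, one common workaround is to build the hex text first and then hand it to a single output call. The following is a minimal sketch under that assumption; hex_encode is a hypothetical helper name, not a standard function, and it assumes 8-bit bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a standard function): writes len bytes of src
   into dst as two-digit lowercase hex, NUL-terminated.
   Returns the number of hex characters written, or -1 if dst is too small. */
int hex_encode(char *dst, size_t dstsz, const void *src, size_t len)
{
    const unsigned char *p = src;

    assert(dst != NULL && (src != NULL || len == 0));
    if (dstsz < 2 * len + 1)
        return -1;
    for (size_t i = 0; i < len; i++)
        snprintf(dst + 2 * i, 3, "%02x", (unsigned)p[i]);  /* 2 digits + NUL */
    dst[2 * len] = '\0';
    return (int)(2 * len);
}
```

    With a destination of 2*128+1 chars for the 128-byte buffer above, the call site becomes hex_encode(hex, sizeof hex, buffer, len) followed by one fprintf(stdout, "%s", hex). The loop is still there; it has only moved out of sight.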
    --
    char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
    ={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
    =b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
    2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
    Ben Pfaff, Sep 23, 2011
    #2

  3. mathog

    mathog Guest

    Ben Pfaff wrote:
    > I don't know of any extensions for this.


    OK.

    Thanks,

    David Mathog
    mathog, Sep 23, 2011
    #3
  4. mathog <> writes:

    > Consider this code snippet:
    >
    > char buffer[128];
    > int i,len;
    > /* code that puts data in buffer and sets len */
    > for(i=0;i<len;i++){fprintf(stdout,"%x",buffer[i]);}


    Minor point: fprintf(stdout, "%x", (unsigned)buffer[i]); char promotes
    (most likely) to int but %x expects an unsigned int. It's not going to
    go wrong, but it stops anyone thinking "hang on...".
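
    A related detail: where plain char is signed, a byte with its high bit set is negative, and converting it straight to unsigned gives a large value (printed as ffffff80 for 0x80 when int is 32 bits). Going through unsigned char first avoids that. A minimal sketch, with char_to_hex as a hypothetical helper name:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats one char as two hex digits into out,
   which must have room for 3 chars. Converting through unsigned char
   keeps the value in 0..UCHAR_MAX, so a negative char is not
   sign-extended. Assumes 8-bit bytes. */
void char_to_hex(char c, char out[3])
{
    assert(out != NULL);
    snprintf(out, 3, "%02x", (unsigned)(unsigned char)c);
}
```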

    > With the newer formatting options in fprintf (like "*" and "%m")


    * is very old and %m so new I've never heard of it before. Did you mean
    %n (which is also quite old)?

    > can the for loop now be replaced with a single fprintf like:
    >
    > fprintf(stdout,"%????x",buffer,len);
    >
    > where ???? is some combination of the newer format options.


    No. The only "multiple" formats are for strings formatted as
    characters.

    I don't want this to sound snotty, but doesn't your reference manual
    (online, local to your system, dead trees, whatever) answer this
    question for you? Programming is hard enough without having to remember
    what %#hho does (I just had to look that up). The widely cited draft of
    the C standard is not a bad reference for the library. The descriptions
    are often simpler and clearer than anywhere else.

    --
    Ben.
    Ben Bacarisse, Sep 23, 2011
    #4
  5. mathog

    Ben Pfaff Guest

    mathog <> writes:

    > Ben Bacarisse wrote:
    >
    >> No. The only "multiple" formats are for strings formatted as
    >> characters.

    >
    > Well, that was sort of my point. By (slightly odd) analogy, it is
    > as if the language implements
    >
    > od -c file
    >
    > (via the %s format) but not
    >
    > od -x file
    >
    > The only difference is how each char is formatted. Ah, wait, I see
    > why they didn't do it, the zero character is a string delimiter, so
    > that sort of format string would not be very useful for printing
    > general
    > hex data. But printing strings which are not null terminated is also
    > a problem.


    No it's not. Use "%.*s".
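
    For illustration, a minimal sketch of the "%.*s" form: the precision, passed as an int argument, caps how many chars are read, so the array does not need a terminator. Output still stops early at an embedded null. format_counted is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies at most len chars from src (which need not
   be NUL-terminated) into dst using the "%.*s" precision form.
   Returns what snprintf returns: the number of chars the result needs. */
int format_counted(char *dst, size_t dstsz, const char *src, int len)
{
    assert(dst != NULL && src != NULL && len >= 0);
    return snprintf(dst, dstsz, "%.*s", len, src);
}
```

    The same format works directly in fprintf(stdout, "%.*s", len, buffer); note that len must be an int, so a size_t length needs a cast.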
    Ben Pfaff, Sep 23, 2011
    #5
  6. mathog <> writes:

    > Ben Bacarisse wrote:
    >
    >> No. The only "multiple" formats are for strings formatted as
    >> characters.

    >
    > Well, that was sort of my point. By (slightly odd) analogy, it is
    > as if the language implements
    >
    > od -c file
    >
    > (via the %s format) but not
    >
    > od -x file
    >
    > The only difference is how each char is formatted.


    Exactly. That's why I said "for strings formatted as characters".
    There is no way to alter how the string's elements are formatted --
    it's as characters and not hex or decimal numbers or...

    Anyway, we're on the same page: there's no built-in way to do this.

    > Ah, wait, I see
    > why they didn't do it, the zero character is a string delimiter, so
    > that sort of format string would not be very useful for printing
    > general hex data. But printing strings which are not null terminated is also
    > a problem. Both of these could be handled by a width based %s variant
    > like:
    >
    > fprintf(stdout,"%*sc",len,buffer);


    As "the other Ben" has pointed out this is already available: "%.*s".

    > Which would be "print to a field width of len using character formatting
    > from character data in array 'buffer'". This would be equivalent to:
    >
    > fprintf(stdout,"%s",buffer);
    >
    > when buffer is null terminated and len=strlen(buffer). But the first
    > form would work even if buffer contained nulls or was not terminated.
    >
    > Similarly:
    >
    > fprintf(stdout,"%*sx",2*len,buffer);
    >
    > would print the hex format (.2 form) of the characters. And by
    > extension, for wide chars
    >
    > fprintf(stdout,"%*lsx",4*len,buffer);


    What's the 4 for? I don't like the idea of multiplying the length -- it
    seems irrelevant. len bytes are to be printed and the format should say
    how, surely?

    I see three problems: first you need to find a way that does not add
    characters to the format. I need to be able to write "%sc" and have a
    'c' come after the string. Secondly, you'd want to borrow as much power
    as you can from the existing formats. I should be able to use field
    widths, and precision, and alternate forms, and padding characters for
    my multiple hex bytes (or, indeed, my multiple ints or whatever). The
    third is that one would often want to include literal characters as well
    (printing multiple bytes in hex is problematic to read unless there is a
    separator).
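
    The separator point is easy to handle in an ordinary function, even though it is awkward inside a format string. A minimal sketch, assuming 8-bit bytes; hex_join and its parameters are hypothetical names, not part of any library:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats len bytes as two-digit hex joined by sep
   (for example ":" or " "). Returns the length of the result, or -1 if
   dst is too small. */
int hex_join(char *dst, size_t dstsz, const void *src, size_t len,
             const char *sep)
{
    const unsigned char *p = src;
    size_t used = 0;

    assert(dst != NULL && dstsz > 0 && sep != NULL);
    dst[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        /* The separator goes before every element except the first. */
        int n = snprintf(dst + used, dstsz - used, "%s%02x",
                         i ? sep : "", (unsigned)p[i]);
        if (n < 0 || (size_t)n >= dstsz - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

    The per-element format is fixed at "%02x" here; making it a parameter as well runs straight into the other problems listed above.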

    I started sketching a design, but it turned into "no one expects the
    Spanish Inquisition". I added a third problem above and then came up
    with two more mid way through. It's an interesting exercise and I'd use
    it as a teaching problem if I were still in that game.

    <snip>
    --
    Ben.
    Ben Bacarisse, Sep 23, 2011
    #6
  7. mathog

    mathog Guest

    Ben Bacarisse wrote:

    > I see three problems: first you need to find a way that does not add
    > characters to the format. I need to be able to write "%sc" and have a
    > 'c' come after the string. Secondly, you'd want to borrow as much power
    > as you can from the existing formats. I should be able to use field
    > widths, and precision, and alternate forms, and padding characters for
    > my multiple hex bytes (or, indeed, my multiple ints or whatever). The
    > third is that one would often want to include literal characters as well
    > (printing multiple bytes in hex is problematic to read unless there is a
    > separator).


    printf isn't really right for this. The problem isn't hard, but the
    syntax gets messy in a hurry, and printf is getting messier and messier
    as more things are added to it.

    Almost certainly this has been worked out more elegantly in some other
    language, but, in general terms a print function needs:

    1. input identifiers (defaults to order of arguments, if
    that has an unambiguous meaning)
    2. number of elements to print (defaults to 1)
    3. format for each element (width, precision, hex, integer, etc.
    There is no default for this part.)
    4. data type (c=char, w=wide char, s=short, i=int, etc., defaults
    to whatever the format would use in printf).

    Printf went off course back in the K&R days by mixing 3 and 4, and
    I don't think it is possible at this point to separate them again.
    Anyway, a more generic format specifier might look like

    %repeat#(%<modifiers>format(datatype(input)))

    If the datatype matches the default for format, then leave off
    (datatype). If only one instance is needed leave off %number#().
    If the current index for an input is needed precede it with i.
    The repeat construct could be nested.

    For simple things the function still looks like printf:

    general_print(stdout,"%s\n",string);

    but soon it doesn't. Print a line of 400 hex characters, 4 hex
    characters for each of 100 wide chars:

    general_print(stdout,"%100#(%4.4x(w))\n",wbuffer);

    or if length is a variable, one of these

    general_print(stdout,"%*#(%2.2x(w))\n",len,wbuffer);
    general_print(stdout,"%#(1)(%2.2x(w))\n",len,wbuffer);

    Allow nested repeats and access to the (implicit) input indices and it
    becomes more powerful, but the syntax gets complicated. Here's hoping I
    counted parens correctly:

    general_print(stdout,
    "%#(1)(%3d(i2) %5#(%3x(i(2)):%2d(l(3)), )%3x((2)):%2d(l(3))\n)",
    len,buffer1,buffer2);

    which if len was 4 would print:
    0 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    5 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    10 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    15 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy

    where the xxx values are hex taken as ints from buffer 1 and yy are
    decimal values as longs from buffer2.

    If the input indices are not hidden it will look a lot like fortran
    formatting.

    By this point it looks like named format blocks would help, since the
    same pattern was used twice in the example above. Something like
    this if standard defines are to be understood within the print format
    line:

    #define ABLOCK %3d(i(2)):%2d(l(3))
    general_print(stdout,
    "%#(1)(%3d(i2) %5#({ABLOCK}, ){ABLOCK}\n)",
    len,buffer1,buffer2);

    And that's enough speculation for one day.


    Regards,

    David Mathog
    mathog, Sep 24, 2011
    #7
  8. mathog

    Jorgen Grahn Guest

    On Fri, 2011-09-23, Ben Bacarisse wrote:
    > mathog <> writes:

    ....
    >> With the newer formatting options in fprintf (like "*" and "%m")

    >
    > * is very old and %m so new I've never heard of it before. Did you mean
    > %n (which is also quite old)?


    I think he meant that he hasn't learned about * until recently. There
    are (for some reason) lots of programmers who haven't, and work around
    the painful perceived lack of it in various complicated ways.

    %m is a Gnu thing, like someone else wrote. And since gcc itself
    complains about it not being standard, I think few people use it even
    when they "know" their code is never going to be compiled against
    any other libc.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Sep 29, 2011
    #8
  9. Re: fprintf internally supports print character array in hex?

    On Sep 24, 12:12 am, mathog <> wrote:
    > Ben Bacarisse wrote:


    [trying to get printf to print an array in hex]

    > > I see three problems: first you need to find a way that does not add
    > > characters to the format.  I need to be able to write "%sc" and have a
    > > 'c' come after the string.  Secondly, you'd want to borrow as much power
    > > as you can from the existing formats.  I should be able to use field
    > > widths, and precision, and alternate forms, and padding characters for
    > > my multiple hex bytes (or, indeed, my multiple ints or whatever).  The
    > > third is that one would often want to include literal characters as well
    > > (printing multiple bytes in hex is problematic to read unless there is a
    > > separator).

    >
    > printf isn't really right for this.


    that was my thought on your /first/ post!


    > The problem isn't hard,


    I think it /is/ hard, that's why it hasn't been solved in a
    satisfactory way.

    > but the
    > syntax gets messy in a hurry, and printf is getting messier and messier
    > as more things are added to it.


    printf basically implements a "little language". You are trying to
    bolt more features onto its interpreter. You can't just say "it's easy
    but the syntax is messy" because syntax is the essence of the problem.

    > Almost certainly this has been worked out more elegantly in some other
    > language, but, in general terms a print function needs:
    >
    >   1.  input identifiers (defaults to order of arguments, if
    >       that has an unambiguous meaning)
    >   2.  number of elements to print (defaults to 1)


    those other languages manage without passing it explicitly

    >   3.  format for each element (width, precision, hex, integer. etc.
    >       There is no default for this part.)


    actually I'd have thought there would be defaults

    >   4.  data type  (c=char, w=wide char, s=short, i=int, etc., defaults
    >       to whatever the format would use in printf).


    again those other languages (Pascal, scheme, C++ spring to mind) can
    manage without passing this information explicitly.

    > Printf went off course back in the K&R days by mixing 3 and 4, and
    > I don't think it is possible at this point to separate them again.


    I quite like printf formats. It achieves reasonable expressiveness with
    a compact but not unreadable format. C++ (for instance) looks very
    pretty right up until you want to fiddle with format. Then it explodes
    in size.

    > Anyway, a more generic format specifier might look like
    >
    >    %repeat#(%<modifiers>format(datatype(input)))
    >
    > If the datatype matches the default for format, then leave off
    > (datatype).  If only one instance is needed leave off %number#().
    > If the current index for an input is needed precede it with i.
    > The repeat construct could be nested.
    >
    > For simple things the function still looks like printf:
    >
    >    general_print(stdout,"%s\n",string);
    >
    > but soon it doesn't.  Print a line of 400 hex characters, 4 hex
    > characters for each of 100 wide chars:
    >
    >    general_print(stdout,"%100#(%4.4x(w))\n",wbuffer);


    hmm. this might actually work. Have you considered implementing it?

    > or if length is a variable, one of these
    >
    >    general_print(stdout,"%*#(%2.2x(w))\n",len,wbuffer);
    >    general_print(stdout,"%#(1)(%2.2x(w))\n",len,wbuffer);
    >
    > Allow nested repeats and access to the (implicit) input indices and it
    > becomes more powerful, but the syntax gets complicated.  Here's hoping I
    > counted parens correctly:
    >
    >    general_print(stdout,
    >      "%#(1)(%3d(i2)  %5#(%3x(i(2)):%2d(l(3)), )%3x((2)):%2d(l(3))\n)",
    >       len,buffer1,buffer2);


    ah. Lisp.
    We haven't moved very far from 1958 have we?

    > which if len was 4 would print:
    >    0  xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    >    5  xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    >   10  xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    >   15  xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    >
    > where the xxx values are hex taken as ints from buffer 1 and yy are
    > decimal values as longs from buffer2.
    >
    > If the input indices are not hidden it will look a lot like fortran
    > formatting.


    I remember Fortran being much clunkier. But that was long ago.

    > By this point it looks like named format blocks would help, since the
    > same pattern was used twice in the example above.  Something like
    > this if standard defines are to be understood within the print format
    > line:
    >
    > #define ABLOCK %3d(i(2)):%2d(l(3))
    >    general_print(stdout,
    >      "%#(1)(%3d(i2)  %5#({ABLOCK}, ){ABLOCK}\n)",
    >       len,buffer1,buffer2);
    >
    > And that's enough speculation for one day.
    Nick Keighley, Sep 30, 2011
    #9
  10. mathog

    BartC Guest

    "mathog" <> wrote in message
    news:j5j3oo$49s$...

    > Anyway, a more generic format specifier might look like
    >
    > %repeat#(%<modifiers>format(datatype(input)))


    > Allow nested repeats and access to the (implicit) input indices and it
    > becomes more powerful, but the syntax gets complicated. Here's hoping I
    > counted parens correctly:
    >
    > general_print(stdout,
    > "%#(1)(%3d(i2) %5#(%3x(i(2)):%2d(l(3)), )%3x((2)):%2d(l(3))\n)",
    > len,buffer1,buffer2);


    It's more powerful, but also impossible to understand.

    > which if len was 4 would print:
    > 0 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    > 5 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    > 10 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    > 15 xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy, xxx:yy
    >
    > where the xxx values are hex taken as ints from buffer 1 and yy are
    > decimal values as longs from buffer2.


    A more conventional approach might look like:

    for (i=0; i<len; ++i) {
        if ((i%5)==0) printf("%4d ",i);
        printf("%03d:%02d",buffer1[i],buffer2[i]);
        printf("%s",(i+1)%5!=0?", ":"\n");
    }

    which at least is possible to debug if it's not quite right. And typos are
    more likely to be picked up at compile time.

    > And that's enough speculation for one day.


    I think I once proposed an array format-specifier, but can't remember the
    details. But based on your idea, it might look like:

    printf ("(%5#, %d)", array); or printf("(%*#, %d)",5,array);

    which might print out (10,20,30,40,50). Only repeat count and separator
    string are specified. And the array elements must be of the type implied by the
    format.
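
    Until something like that exists, the same effect is available from a plain function. A minimal sketch, not a printf extension; format_int_array and its parameters are hypothetical names:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats n ints from arr as "(a<sep>b<sep>c)" into
   dst. Returns the result length, or -1 if dst is too small. */
int format_int_array(char *dst, size_t dstsz, const int *arr, size_t n,
                     const char *sep)
{
    size_t used;
    int k;

    assert(dst != NULL && sep != NULL && (arr != NULL || n == 0));
    k = snprintf(dst, dstsz, "(");
    if (k < 0 || (size_t)k >= dstsz)
        return -1;
    used = (size_t)k;
    for (size_t i = 0; i < n; i++) {
        /* Separator before every element except the first. */
        k = snprintf(dst + used, dstsz - used, "%s%d", i ? sep : "", arr[i]);
        if (k < 0 || (size_t)k >= dstsz - used)
            return -1;
        used += (size_t)k;
    }
    k = snprintf(dst + used, dstsz - used, ")");
    if (k < 0 || (size_t)k >= dstsz - used)
        return -1;
    return (int)(used + (size_t)k);
}
```

    Called with the five values 10 to 50 and "," as the separator, this fills dst with (10,20,30,40,50), which can then go through an ordinary "%s".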

    Anything more ambitious I think would quickly get too complex.

    --
    Bartc
    BartC, Sep 30, 2011
    #10
  11. On Fri, 23 Sep 2011 21:34:39 +0100, Ben Bacarisse
    <> wrote:

    > mathog <> writes:
    >
    > > Ben Bacarisse wrote:
    > >
    > >> No. The only "multiple" formats are for strings formatted as
    > >> characters.

    > >
    > > Well, that was sort of my point. By (slightly odd) analogy, it is
    > > as if the language implements
    > >
    > > od -c file
    > >
    > > (via the %s format) but not
    > >

    Not quite. od -c converts a nongraphic char to a visible sequence
    (and also spaces between each char to allow for that). %s transmits
    supported controls for their control effect, and any unused codes
    (e.g. 80-FF if ASCII in CHAR_BIT==8) with undefined effect.
    od -c is more akin to (but not the same as) cat -v or sed l (lower-case L).
    Also, od by default prefixes each line with the (starting) location, but
    that can be suppressed with -An.

    > > od -x file
    > >
    > > The only difference is how each char is formatted.

    >
    > Exactly. That's why I said "for strings formatted as characters".
    > There is no way to alter how the string's elements are formatted --
    > it's as characters and not hex or decimal numbers or...
    >
    > Anyway, we're on the same page: there's no built-in way to do this.
    >

    <snip>
    > I see three problems: first you need to find a way that does not add
    > characters to the format. I need to be able to write "%sc" and have a
    > 'c' come after the string. Secondly, you'd want to borrow as much power
    > as you can from the existing formats. I should be able to use field
    > widths, and precision, and alternate forms, and padding characters for
    > my multiple hex bytes (or, indeed, my multiple ints or whatever). The
    > third is that one would often want to include literal characters as well
    > (printing multiple bytes in hex is problematic to read unless there is a
    > separator).
    >

    Fortran formats, on which C's *printf/scanf were loosely based, allow
    a given format specifier to be repeated, or format specifiers and/or
    literal data grouped by parens to be repeated. This might give us
    something like (with unneeded string concats added for emphasis):
    printf ("%16:" "%(" "%02hhx" "," "%)" "\n", array_of_16_bytes);
    I think you could even do a half-decent job on complex numbers, now
    that C99 has them, but haven't actually worked out the details.

    Aside: OP has %repeat# downthread, but # is already used as a flag
    (for octal or hex force 0 or 0x/0X, for floating force decimal point
    even if not needed and for g/G force trailing zeros). There aren't
    many good punctuation characters left, and not all that many letters
    that aren't already either conversions or modifiers. If we could
    require 8859-1 (instead of just near-ASCII) for C source, we'd have
    dozens of new characters to make format strings even more cryptic and
    illegible. Wait -- what was the benefit of this plan again?

    > I started sketching a design, but it turned into "no one expects the
    > Spanish Inquisition". I added a third problem above and then came up
    > with two more mid way through. It's an interesting exercise and I'd use
    > it as a teaching problem if I were still in that game.
    >

    Then can I get the comfortable pillow?

    > <snip>
    David Thompson, Oct 3, 2011
    #11
  12. mathog

    Gene Guest

    Re: fprintf internally supports print character array in hex?

    On Sep 23, 1:23 pm, mathog <> wrote:
    > Consider this code snippet:
    >
    > char buffer[128];
    > int  i,len;
    >    /* code that puts data in buffer and sets len */
    > for(i=0;i<len;i++){fprintf(stdout,"%x",buffer[i]);}
    >
    > With the newer formatting options in fprintf (like "*" and "%m") can the
    > for loop now be replaced with a single fprintf like:
    >
    >     fprintf(stdout,"%????x",buffer,len);
    >
    > where ???? is some combination of the newer format options.


    No and - trust me - you don't want to reinvent anything like Common
    Lisp's format specifiers. At least not if you ever must read anyone
    else's code that uses them ... or your own after more than a month.
    Gene, Oct 3, 2011
    #12