Non-constant constant strings

Discussion in 'C Programming' started by Rick C. Hodgin, Jan 19, 2014.

  1. Rick C. Hodgin

    Seebs Guest

    Sure it can. It could involve a call to a function called __addquad().

    Seebs, Jan 29, 2014

  2. As I mentioned in the following paragraph (though the part you quoted
    was not quite correct).
    Keith Thompson, Jan 29, 2014

  3. Rick C. Hodgin

    James Kuyper Guest

    So his first statement was clearly intended to distinguish between an
    actual C function call and an implicit function call. I would have chosen
    different wording to make that distinction clearer, but he did address
    the issue you raise.
    James Kuyper, Jan 29, 2014
  4. No. My position is that C is exceedingly mechanical ... so much so that
    there is nearly a 1:1 ratio between the things it does and the things the
    CPU must do to conduct the workload. That closeness is known to all who
    know C and assembly. The fact that other C developers may not know it as
    a matter of stored knowledge is immaterial because the relationship exists.

    I also prefer Colby Jack on pizza in lieu of Mozzarella. I'm not sure
    what bearing that has on this discussion, but it seemed appropriate to
    add as a drop in (in case we ever want to get together and have a pizza
    party). :)

    Best regards,
    Rick C. Hodgin
    Rick C. Hodgin, Jan 29, 2014
  5. Rick C. Hodgin

    James Kuyper Guest

    The term ratio really implies numbers; in the only senses I can figure
    out for which numbers might apply, 1:1 is clearly false. For instance,
    the number of lines of assembler is almost completely unrelated to the
    number of lines of C code. It can be quite a bit larger or smaller,
    depending upon what the C code actually says.

    I assume that what you really mean is "correspondence", rather than
    "ratio". For some simple low-level languages, it is possible to set up a
    1-to-1 correspondence between language constructs and the generated
    assembly code. However, such a language is really nothing more than a
    high-level assembler. C is not, and never has been, such a language,
    though the correspondence was closer in the early days of C than it is
    now. You're underestimating the looseness of the correspondence between
    C code and assembly; combined with the fact that you've also proposed
    eliminating some of that looseness, this suggests that you may be
    unaware of just how much advantage the typical C compiler takes of that
    looseness. Turn optimization up high on a sophisticated modern C
    compiler targeting a well-known platform that has been around for a
    while. If you really believe that there is (or at least, should be) a
    1-to-1 correspondence, you're going to be quite shocked at how hard it
    is to identify the corresponding elements.

    One of the more common events in this newsgroup is someone posting a
    message complaining about the fact that he can't figure out how to make
    a C compiler generate assembly language that matches that person's
    opinion of how the assembler should be written. This complaint is based
    upon the same mistaken 1-to-1 assumption that you're making. Such people
    should either abandon that assumption, or write in assembler; the C
    standard, as a matter of deliberate design, goes far out of its way to
    make sure that the correspondence does NOT have to be 1-to-1.
    The fact that some people mistakenly assume a 1-to-1 correspondence is
    immaterial to the fact that, as a matter of very deliberate design, it
    is not actually required to be 1-to-1, and usually isn't.
    James Kuyper, Jan 29, 2014
  6. Do you ignore optimization?

    A compiler can, for example, generate no code at all for a given
    statement, or for an entire function, if it can prove that that
    statement or function has no effect.

    If I write:

    #include <stdio.h>
    int main(void) {
        int x = 2;
        int y = 2;
        printf("2 + 2 = %d\n", x + y);
    }

    I expect the program, when I run it, to print:

    2 + 2 = 4

    I don't care (except as a matter of idle curiosity) whether the
    compiler generates code that performs the addition and calls printf,
    or reduces the 3 lines of code to the equivalent of

    puts("2 + 2 = 4");

    I care about the behavior of the running program. Machine or
    assembly language is nothing more or less than a means to that end.

    (If I cared for some reason about the existence of an ADD instruction in
    the generated code, then I'd use assembly language.)
    Keith Thompson, Jan 29, 2014
  7. Rick C. Hodgin

    Phil Carmody Guest

    You had the opportunity to snip the rest of the drivel too.
    This discussion, and all like it, will never lead anywhere.

    Phil Carmody, Jan 29, 2014
  8. Agreed.

    Best regards,
    Rick C. Hodgin
    Rick C. Hodgin, Jan 29, 2014
  9. I said "nearly a 1:1 ratio between the things it does and the things the
    CPU must do to conduct the workload."

    int main(void)
    {
        int a, b;

        populate_my_variables(&a, &b);
        printf("The product is: %d\n", a, b);
        return 0;
    }

    ; Off the top of my head, please forgive any mistakes:
    ; // int main(void)
    ; // {
    ; // int a, b
    enter 8,0
    ; [ebp-4] - a
    ; [ebp-8] - b

    ; // populate_my_variables(&a, &b);
    ; // [implicit return][populate_my_variables][address_of a][address_of b]
    lea eax,[ebp-8] ; [address_of b]
    push eax
    lea eax,[ebp-4] ; [address_of a]
    push eax
    call populate_my_variables ; [populate_my_variables]
    add esp,8 ; [implicit return]

    ; // printf("The product is: %d\n", a, b);
    ; // [printf]["The..\n"][a][b]
    push dword ptr [ebp-8] ; [b]
    push dword ptr [ebp-4] ; [a]
    push address_of "The product is: %d\n" ; ["The..\n"]
    call printf ; [printf]
    add esp,12

    ; // return 0
    mov eax,0

    ; // }
    leave
    ret

    In this case, there are 14 separate things that must be considered
    for conversion:

    [1]int [2]main([3]void)
    {
        int [4]a, [5]b;

        [6]populate_my_variables([7]&a, [8]&b);
        [9]printf([10]"The product is: %d\n", [11]a, [12]b);
        [13]return [14]0;
    }

    These are translated to 15 separate things done in assembly (including
    function overhead), and this in wholly un-optimized mode.

    ; Off the top of my head, please forgive any mistakes:
    ; // int main(void)
    ; // {
    ; // int a, b
    01: enter 8,0
    ; [ebp-4] - a
    ; [ebp-8] - b

    ; // populate_my_variables(&a, &b);
    ; // [implicit return][populate_my_variables][address_of a][address_of b]
    02: lea eax,[ebp-8] ; [address_of b]
    03: push eax
    04: lea eax,[ebp-4] ; [address_of a]
    05: push eax
    06: call populate_my_variables ; [populate_my_variables]
    07: add esp,8 ; [implicit return]

    ; // printf("The product is: %d\n", a, b);
    ; // [printf]["The..\n"][a][b]
    08: push dword ptr [ebp-8] ; [b]
    09: push dword ptr [ebp-4] ; [a]
    10: push address_of "The product is: %d\n" ; ["The..\n"]
    11: call printf ; [printf]
    12: add esp,12

    ; // return 0
    13: mov eax,0

    ; // }
    14: leave
    15: ret

    My compiler actually does things notably differently so I never have to
    pass more than one parameter (which is register passed). But, that's
    a whole separate discussion.
    I apologize if the wrong idea was conveyed through my wording. I hope
    the example above now makes it clearer.
    Best regards,
    Rick C. Hodgin
    Rick C. Hodgin, Jan 29, 2014
  10. This should be:

    int main(void)
    {
        int a, b;

        populate_my_variables(&a, &b);
        printf("The values are a:%d b:%d\n", a, b);
        return 0;
    }

    Best regards,
    Rick C. Hodgin
    Rick C. Hodgin, Jan 29, 2014
  11. So you've completely reversed your position?
    Keith Thompson, Jan 29, 2014
  12. Not at all. You've misunderstood a great deal about what I've been
    talking about, and there are aspects of the things you say which I do
    agree with, but these are secondary to the remaining components which
    are not being accurately conveyed between us because we are not using
    a common language.

    Best regards,
    Rick C. Hodgin
    Rick C. Hodgin, Jan 29, 2014
  13. Rick C. Hodgin

    James Kuyper Guest

    Well, it's the optimized mode that's really most relevant, and this code
    is far too simple to allow significant opportunities for optimization.

    You'll have to provide a platform-dependent definition of how to count
    the number of things that a given piece of C code must do, in order to
    make your comment meaningful. If you do, I guarantee that whatever
    definition you choose, it will be trivial to identify some combination
    of source code, platform, compiler, and compiler options for which the
    ratio is NOT 1:1. In fact, it will be far easier to identify such
    combinations than to identify ones for which it is 1:1.

    It's feasible, in code this simple, to take each line of assembly code
    and associate it with a unique part of the original source code. If you
    call those the "things" that the C code must do, then of course the
    numbers will match. But different compilers will produce different sets
    of assembly language instructions when targeting different platforms,
    and when different optimizations are turned on. Those can't all be in a
    1-to-1 relationship to the same "thing" count; which renders your claim
    that there must be such a relationship nonsense. I've seen the same C
    code converted into 10 assembly language instructions by one compiler,
    and 500 instructions by another. Both sets of "things" to do were fully
    consistent with the requirements of the C standard. Which number
    constituted the correct count of the "things" that the C code was
    supposed to do? Note that some of the relevant optimizations take things
    like a+b*c and convert them into a single floating point instruction
    that takes three arguments.

    On the flip side, on a platform where there is no native support for
    floating point or for any data type larger than 32 bits, a simple
    statement a=b, which does a maximum of three "things" as far as C is
    concerned, can involve a MUCH larger list of assembly language if a and b
    have the types "long long" and "long double complex", respectively.
    Actually, the fact that your compiler does something different is the
    norm, not the exception, which is precisely what is being discussed.
    James Kuyper, Jan 29, 2014
  14. Optimized code is nearly always shorter than un-optimized code.
    In general, everything that has a name definition, has an operator, is
    separated by a comma, is part of an assignment, or is part of a logic test.
    No doubt. I'm not proposing a standard. It's just the way things are.
    It works the same in all code (except for certain types of parallel
    heterogeneous code, some types of parallel homogeneous code, and
    some code that works in serial on parallel items (such as SSE code)
    due to the requirements of filling data horizontally in preparation
    for vertical processing).
    And the same platforms.
    The 1:1 ratio is on un-optimized code. It typically only gets better
    when optimizations are added.
    Let's go with the upper echelon of leading compilers in our comparisons,
    shall we?
    The one that is not obviously taxed by inadequacy.
    Yes. That's exactly my point. Optimizations generally only improve
    upon the 1:1 ratio, but the ratio generally holds true across all
    generated assembly instructions on 32-bit x86 platforms. It may not
    be true on other platforms which contain more general purpose registers,
    or other hardware features which alter the way the C code is translated.
    There are a million ways you can bring in unique conditions which destroy
    my argument. It doesn't change the merit of it on the whole.
    To this point, I had not mentioned mechanics of my compiler's implementation,
    but only outward syntax and theory.

    Best regards,
    Rick C. Hodgin
    Rick C. Hodgin, Jan 30, 2014
  15. Rick C. Hodgin

    Seebs Guest

    .... Uh. I am pretty sure I know a number of people who know C and assembly
    who don't "know" that. Possibly because it's not true.

    Seebs, Jan 30, 2014
  16. Rick C. Hodgin

    Kaz Kylheku Guest

    Firstly, there is no upper bound on the length of a piece of code to solve a
    task, or the time that it takes; there is no uniquely determined "unoptimized"

    For any program which we call "optimized", we can make a program which is
    longer, and slower and call that one "unoptimized". Q.E.D.: unoptimized
    programs are longer.

    However, the changes which make a given program faster, do not always make it
    shorter.

    If we insert NOP instructions so that branch targets are cache-aligned, the
    program grows longer.

    If we unroll loops for speed, the program grows longer.

    If we inline functions for speed, all the places where they are inlined grow
    longer.

    If we use static lookup tables instead of computing something at run-time, the
    program may grow larger.

    The "meat and potatoes" optimizations performed on crudely compiled code
    usually do make it shorter, because crudely compiled code does some obviously
    poor things, like load values into registers which are then not subsequently
    used and such: a consequence of translating the various pieces of the program
    in isolation, according to fixed translation templates that "dovetail"
    together according to rules that make the code generation easy.
    Shorter characterization: anything that has a node in the abstract syntax tree.
    Kaz Kylheku, Jan 30, 2014
  17. I suppose, but some of that went away with the ANSI standard.

    For one, ANSI allows initializing auto arrays, which involve a
    fair amount of work each time. For another, passing struct by
    value. K&R allowed passing a pointer to a struct, but not the
    value of the struct. Again, more hidden work.

    But I still miss an exponential operator.


    -- glen
    glen herrmannsfeldt, Jan 30, 2014
  18. (snip)
    I do remember when 2**32 was close enough to infinity.
    Maybe greater than the MTBF for many processors, or at least
    more than the CPU time you would want to pay for.

    Some years ago, I had a book on digital logic where one of the
    projects was a 40 bit counter counting at 1Hz. Next to each
    light was the time that it would first come on.

    As most people here can figure out those times, I won't mention
    them, but the 40th one is a year starting with 19 and three more
    digits after that.

    -- glen
    glen herrmannsfeldt, Jan 30, 2014
  19. Rick C. Hodgin

    BartC Guest

    I once implemented an actual /machine-oriented/ language which was quite low
    level (lower than C), and even then there wasn't a 1:1 correspondence with
    machine instructions, although it was extremely easy to see how any
    statement mapped to actual instructions (it didn't need a stack for example
    as there was no operator precedence, and there was no optimisation).

    The execution model however was directly linked to the machine, with the
    data-types being the available word-sizes, and you could refer to registers
    by name.

    But it wasn't a high-level assembler in my opinion, because it wasn't
    possible to directly express machine instructions (iirc).

    C implements a more abstract execution model, with data-types that may or
    may not be available in the hardware, although the model is still simple:
    int types of various widths, floating point, and pointers. It tries to be
    independent of the hardware, but invariably an 'int' type might still be a
    machine word in width.

    It can still be possible, with the simpler types and detailed knowledge of
    the target hardware, to guess what machine instructions *might* be used to
    implement a statement, and thereby get some idea of its efficiency or
    otherwise. But optimising compilers make that more difficult. The 700 pages
    of the C standard which sets out how any construct has to behave don't make
    it any easier either.

    I might actually be agreeing with you for once...
    BartC, Jan 30, 2014
  20. Rick C. Hodgin

    James Kuyper Guest

    On 01/29/2014 07:59 PM, Rick C. Hodgin wrote:
    They're not particularly unique, if you think there's as few as a
    million of them. Personally, I think there's a lot more than that. I
    think that some variant of those "unique conditions" applies in almost
    every imaginable case. Therefore, the way you had to adjust your claim
    to deal with each of those issues pretty much completely demolishes your

    What you really seem to be claiming is that it might be possible to
    define a pseudo-generic assembly language for which there can be a
    1-to-1 correspondence between C constructs and instructions in that
    assembly language, if no optimizations are performed. I'm not willing to
    concede the feasibility of defining such an assembly language unless and
    until you actually provide a precise definition for it, but it might be
    possible. However, if it can be done, I suspect that many aspects of it
    would look more like a complicated encoding of C than a real-world
    assembly language. But even if you can define such a language, the only
    way to accommodate real world assembly languages is to dismiss the
    differences in the way they handle things from the way the
    pseudo-generic one does as mere "optimizations", as you have already
    done. Your claim becomes pretty pointless if such optimizations are
    sufficient to render it irrelevant, because your claim was in response
    to statements Keith was making that included optimized translations of C
    James Kuyper, Jan 30, 2014