[OT] pointer vs. index notation

Discussion in 'C Programming' started by Ark Khasin, Nov 12, 2007.

  1. Ark Khasin

    Ark Khasin Guest

    So, p[i] is *(p+i) is (for fun) *(i+p) is i[p].
    Yet I observed more than once that the equivalent constructs yield
    different code generated (and sometimes of different efficiency).
    Any idea why? Does notation serve as a hint to the compiler?
     
    Ark Khasin, Nov 12, 2007
    #1

  2. Ark Khasin

    santosh Guest



    It has been mentioned that pointer notation leads to slightly more
    efficient code generation, but I suppose compilers these days generate
    the same code for both forms.

    <OT>
    On this machine (Pentium Dual Core), gcc *does* generate different assembler code
    for the two notations. The array notation seems to give rise to a full
    base-indexed displacement instruction while the pointer notation uses a
    simple indirection.
    </OT>
     
    santosh, Nov 12, 2007
    #2
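To repeat the comparison santosh describes, a pair of functions along these lines could be compiled with `gcc -S` at a few optimization levels and the assembler output compared. This is only a sketch: the function names are made up for illustration, and whether the output differs depends on the compiler version and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints using index notation. */
long sum_index(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Sum n ints using pointer notation. */
long sum_pointer(const int *a, size_t n)
{
    long total = 0;
    for (const int *p = a; p != a + n; p++)
        total += *p;
    return total;
}

/* Both notations must compute the same value, whatever code is generated. */
void check_sums_agree(void)
{
    int v[] = { 1, 2, 3, 4 };
    assert(sum_index(v, 4) == 10);
    assert(sum_pointer(v, 4) == 10);
}
```

Calling `check_sums_agree()` from `main` confirms that the two functions agree on the result; only the generated instructions may differ.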

  3. Ark Khasin

    Chris Torek Guest



    Generally speaking, it should not. However:
    It depends on the compiler.

    Most compilers today build trees (and maybe additional data structures
    as well, but they at least start out with trees) to represent
    expressions. The expression p[i] "should" (logically at least,
    and using Lisp notation for the tree) turn into (* (+ p i)). The
    expression i[p] would turn into (* (+ i p)).

    Given a compiler that uses trees, it may then run the optimization
    pass on those trees. It is possible that whoever wrote this pass
    thought to look for (* (+ p i)) patterns and optimize them, but
    forgot to include (* (+ i p)) patterns. Those thus escape at least
    some optimizations and survive to the code-generation portion of
    the compiler.

    As long as the code-generation part of the compiler handles the
    second pattern (instead of crashing with "internal compiler error:
    cannot figure out how to produce code for expression" or whatever),
    you will still get machine (or assembly or whatever output format)
    code for both constructs.

    If you look inside actual C compilers that do optimization with
    expression trees, there is usually some horrible thousands-of-lines
    switch statement or "if/else" chain or some such to match particular
    trees (though some use tables, and some have hybrids of tables
    *and* horrible 3000-line switch statements :) ), and it is easy
    for the programmer to forget to put in symmetric cases. Fortunately
    it is also easy to add them afterward -- so you just need to point
    out to the compiler-writers that p[i] works well and i[p] does not,
    and ten coding minutes (and 3 hours to compile the compiler) later,
    they both work equally well.
     
    Chris Torek, Nov 12, 2007
    #3
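The forgotten symmetric case Chris Torek describes can be sketched with a toy expression tree. Everything here (the `node` struct, the `K_*` kinds, the matcher names) is hypothetical and not taken from any real compiler; it assumes well-formed trees in which a dereference has one operand and an addition has two.

```c
#include <assert.h>
#include <stddef.h>

enum kind { K_PTR, K_INT, K_ADD, K_DEREF };

struct node {
    enum kind kind;
    const struct node *left;   /* operand of K_DEREF, left operand of K_ADD */
    const struct node *right;  /* right operand of K_ADD, otherwise NULL */
};

/* Matches only (* (+ pointer integer)); the mirrored form was forgotten. */
int matches_indexed_load(const struct node *n)
{
    return n->kind == K_DEREF
        && n->left->kind == K_ADD
        && n->left->left->kind == K_PTR
        && n->left->right->kind == K_INT;
}

/* Same matcher with the symmetric case (* (+ integer pointer)) added. */
int matches_indexed_load_fixed(const struct node *n)
{
    if (n->kind != K_DEREF || n->left->kind != K_ADD)
        return 0;
    const struct node *a = n->left->left;
    const struct node *b = n->left->right;
    return (a->kind == K_PTR && b->kind == K_INT)
        || (a->kind == K_INT && b->kind == K_PTR);
}

/* Builds the trees for p[i] and i[p] and checks what each matcher sees. */
void demo_missed_case(void)
{
    struct node p = { K_PTR, NULL, NULL };
    struct node i = { K_INT, NULL, NULL };
    struct node p_plus_i = { K_ADD, &p, &i };
    struct node i_plus_p = { K_ADD, &i, &p };
    struct node p_index_i = { K_DEREF, &p_plus_i, NULL };  /* p[i] */
    struct node i_index_p = { K_DEREF, &i_plus_p, NULL };  /* i[p] */

    assert(matches_indexed_load(&p_index_i));
    assert(!matches_indexed_load(&i_index_p));   /* escapes the optimization */
    assert(matches_indexed_load_fixed(&p_index_i));
    assert(matches_indexed_load_fixed(&i_index_p));
}
```

In this sketch the first matcher lets the `i[p]` tree fall through to whatever generic code generation comes next, which is one way the two spellings could end up with different output.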


  4. I do not know how *any* compiler could dereference an integer value (as
    the i[p] notation suggests).
    On my x86_64 machine, using gcc 4.1.2, -O3 optimization it does not.
    Code is completely equal.

    Greetings,
    Johannes
     
    Johannes Bauer, Nov 12, 2007
    #4
  5. Ark Khasin

    Chris Dollin Guest



    It may suggest it, but it doesn't mean it. `i[p]` is (as was written above)
    `*(i+p)`, which commutes to `*(p+i)`, for which `p[i]` is (just) shorthand.
    In all four of these expressions, it's the sum of `p` and `i` which is
    dereferenced, not either of the individual values.
     
    Chris Dollin, Nov 12, 2007
    #5
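The equivalence of the four spellings can be checked directly. A minimal sketch, with a made-up function name, assuming `p` points into an array of at least `i + 1` elements:

```c
#include <assert.h>
#include <stddef.h>

/* Checks that p[i], *(p+i), *(i+p) and i[p] all name the same object. */
void check_equivalence(const int *p, size_t i)
{
    /* All four forms dereference the same address, the sum of p and i. */
    assert(&p[i] == p + i);
    assert(p + i == i + p);
    assert(&i[p] == &p[i]);

    /* So they necessarily yield the same value. */
    assert(p[i] == *(p + i));
    assert(*(p + i) == *(i + p));
    assert(*(i + p) == i[p]);
}
```

For example, with `int a[] = { 10, 20, 30, 40 };`, the call `check_equivalence(a, 2)` passes, and `2[a]` evaluates to 30.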


  6. I fully understand your line of argumentation. And indeed, they are
    equivalent. I just tried out

    char moo[] = "Test.";
    printf("%c\n", 2[moo]);

    And was quite surprised, to be honest, that it compiled cleanly and
    yielded the expected result. Knowing C for 10 years I'm still sometimes
    surprised what a sick yet beautiful language it is ;-)

    Greetings,
    Johannes
     
    Johannes Bauer, Nov 12, 2007
    #6
  7. Ark Khasin

    Willem Guest

    Johannes wrote:
    ) I fully understand your line of argumentation. And indeed, they are
    ) equivalent. I just tried out
    )
    ) char moo[] = "Test.";
    ) printf("%c\n", 2[moo]);

    Try:

    printf("%c\n", 2["Test."]);

    ;)


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
     
    Willem, Nov 12, 2007
    #7
  8. Ark Khasin

    dj3vande Guest

    Is there a good reason why they don't do a normalization pass first, to
    convert symmetric operations into a canonical form? It seems to me
    that that would be an easy and cheap way to reduce the size of the
    massive switch-or-table, though some care would have to be taken to
    avoid normalizing asymmetric operations like subtraction.


    dave
     
    dj3vande, Nov 12, 2007
    #8
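A normalization pass of the kind dave suggests might look like the following on a toy tree. This is a sketch under assumptions: the `node` struct, the `K_*` kinds and the "pointer operand goes on the left" ordering rule are all invented for illustration and do not describe gcc or any other real compiler.

```c
#include <assert.h>
#include <stddef.h>

enum kind { K_PTR, K_INT, K_ADD, K_SUB, K_DEREF };

struct node {
    enum kind kind;
    struct node *left;
    struct node *right;
};

/* Only addition is treated as commutative; subtraction must not be reordered. */
static int is_commutative(enum kind k)
{
    return k == K_ADD;
}

/* Rewrites (+ int ptr) as (+ ptr int) everywhere in the tree, bottom-up. */
void normalize(struct node *n)
{
    if (n == NULL)
        return;
    normalize(n->left);
    normalize(n->right);
    if (is_commutative(n->kind) && n->left != NULL && n->right != NULL
        && n->left->kind == K_INT && n->right->kind == K_PTR) {
        struct node *tmp = n->left;
        n->left = n->right;
        n->right = tmp;
    }
}

void check_normalize(void)
{
    struct node p = { K_PTR, NULL, NULL };
    struct node i = { K_INT, NULL, NULL };
    struct node add = { K_ADD, &i, &p };          /* (+ i p) */
    struct node deref = { K_DEREF, &add, NULL };  /* (* (+ i p)), i.e. i[p] */
    /* Not valid C as an expression; only here to show that a
       non-commutative node is left alone by the pass. */
    struct node sub = { K_SUB, &i, &p };

    normalize(&deref);
    assert(add.left == &p && add.right == &i);    /* now (* (+ p i)) */

    normalize(&sub);
    assert(sub.left == &i && sub.right == &p);    /* unchanged */
}
```

After such a pass, a later matcher would only need the `(* (+ p i))` pattern, at the cost of one more walk over the tree.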
  9. Ark Khasin

    Chris Torek Guest

    I imagine some do, although the last time I was digging around
    inside gcc, I think it did not, and I have not seen it in the
    innards of one or two other compilers I have touched. (I should
    note that I have not gone spelunking in gcc since the 2.95 days.)
    Yes. It also means one more pass over the tree, which has some
    time cost.
     
    Chris Torek, Nov 12, 2007
    #9

  10. [...]

    See question 6.11 in the comp.lang.c FAQ, <http://www.c-faq.com/>.
     
    Keith Thompson, Nov 12, 2007
    #10
  11. Ark Khasin

    CBFalconer Guest

    A reference of that type is converted into a pointer (the p) and an
    integer (the i). As long as there is one of each the results can
    be added and then dereferenced (assuming within range). Note that
    the integer is multiplied by (sizeof *p).
     
    CBFalconer, Nov 12, 2007
    #11
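The scaling by `sizeof *p` can be observed by measuring the distance in bytes between `p` and `p + i`. A small sketch, with made-up function names, assuming `p + i` stays within the array or one past its end:

```c
#include <assert.h>
#include <stddef.h>

/* Distance in bytes between p and p + i for a pointer to int. */
size_t byte_offset(const int *p, size_t i)
{
    return (size_t)((const char *)(p + i) - (const char *)p);
}

/* Adding i to an int pointer advances the address by i * sizeof(int) bytes. */
void check_scaling(void)
{
    int a[8] = { 0 };
    assert(byte_offset(a, 0) == 0);
    assert(byte_offset(a, 3) == 3 * sizeof *a);
    assert(byte_offset(a, 8) == 8 * sizeof *a);  /* one past the end is allowed */
}
```

The same holds whichever operand is written first, since `i + p` and `p + i` denote the same pointer.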