[OT] pointer vs. index notation

Discussion in 'C Programming' started by Ark Khasin, Nov 12, 2007.

  1. Ark Khasin

    Ark Khasin Guest

    So, p is *(p+i) is (for fun) *(i+p) is i[p].
    Yet I observed more than once that the equivalent constructs yield
    different code generated (and sometimes of different efficiency).
    Any idea why? Does notation serve as a hint to the compiler?
    --
    Ark
    Ark Khasin, Nov 12, 2007
    #1
    1. Advertising

  2. Ark Khasin

    santosh Guest

    In article <z7RZi.10725$h61.9895@trndny02>, Ark Khasin
    <> wrote on Monday 12 Nov 2007 10:32 am:

    > So, p is *(p+i) is (for fun) *(i+p) is i[p].
    > Yet I observed more than once that the equivalent constructs yield
    > different code generated (and sometimes of different efficiency).
    > Any idea why? Does notation serve as a hint to the compiler?


    It has been mentioned that the pointer notation leads to slightly more
    efficient code generation, but I suppose compilers these days generate
    the same code for both the forms.

    <OT>
    On this machine (Pentium Dual Core), gcc *does* different assembler code
    for both the notations. The array notation seems to give rise to a full
    base-indexed displacement instruction while the pointer notation use a
    simple indirection.
    </OT>
    santosh, Nov 12, 2007
    #2
    1. Advertising

  3. Ark Khasin

    Chris Torek Guest

    In article <z7RZi.10725$h61.9895@trndny02>
    Ark Khasin <> wrote:
    >So, p is *(p+i) is (for fun) *(i+p) is i[p].
    >Yet I observed more than once that the equivalent constructs yield
    >different code generated (and sometimes of different efficiency).


    Generally speaking, it should not. However:

    >Any idea why? Does notation serve as a hint to the compiler?


    It depends on the compiler.

    Most compilers today build trees (and maybe additional data structures
    as well, but they at least start out with trees) to represent
    expressions. The expression p "should" (logically at least,
    and using Lisp notation for the tree) turn into (* (+ p i)). The
    expression i[p] would turn into (* (+ i p)).

    Given a compiler that uses trees, it may then run the optimization
    pass on those trees. It is possible that whoever wrote this pass
    thought to look for (* (+ p i)) patterns and optimize them, but
    forgot to include (* (+ i p)) patterns. Those thus escape at least
    some optimizations and survive to the code-generation portion of
    the compiler.

    As long as the code-generation part of the compiler handles the
    second pattern (instead of crashing with "internal compiler error:
    cannot figure out how to produce code for expression" or whatever),
    you will still get machine (or assembly or whatever output format)
    code for both constructs.

    If you look inside actual C compilers that do optimization with
    expression trees, there is usually some horrible thousands-of-lines
    switch statement or "if/else" chain or some such to match particular
    trees (though some use tables, and some have hybrids of tables
    *and* horrible 3000-line switch statements :) ), and it is easy
    for the programmer to forget to put in symmetric cases. Fortunately
    it is also easy to add them afterward -- so you just need to point
    out to the compiler-writers that p works well and i[p] does not,
    and ten coding minutes (and 3 hours to compile the compiler) later,
    they both work equally well.
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Nov 12, 2007
    #3
  4. santosh schrieb:
    > In article <z7RZi.10725$h61.9895@trndny02>, Ark Khasin
    > <> wrote on Monday 12 Nov 2007 10:32 am:
    >
    >> So, p is *(p+i) is (for fun) *(i+p) is i[p].
    >> Yet I observed more than once that the equivalent constructs yield
    >> different code generated (and sometimes of different efficiency).
    >> Any idea why? Does notation serve as a hint to the compiler?

    >
    > It has been mentioned that the pointer notation leads to slightly more
    > efficient code generation, but I suppose compilers these days generate
    > the same code for both the forms.


    I do not know how *any* compiler could dereference an integer value (as
    the i[p] notation suggests).

    > <OT>
    > On this machine (Pentium Dual Core), gcc *does* different assembler code
    > for both the notations.


    On my x86_64 machine, using gcc 4.1.2, -O3 optimization it does not.
    Code is completely equal.

    Greetings,
    Johannes

    --
    "PS: Ein Realname wäre nett. Ich selbst nutze nur keinen, weil mich die
    meisten hier bereits mit Namen kennen." -- Markus Gronotte aka "Makus"
    aka "Kosst Amojan" aka "maqqusz" in de.sci.electronics
    <45608268$0$5719$-online.net>
    Johannes Bauer, Nov 12, 2007
    #4
  5. Ark Khasin

    Chris Dollin Guest

    Johannes Bauer wrote:

    > santosh schrieb:
    >> In article <z7RZi.10725$h61.9895@trndny02>, Ark Khasin
    >> <> wrote on Monday 12 Nov 2007 10:32 am:
    >>
    >>> So, p is *(p+i) is (for fun) *(i+p) is i[p].
    >>> Yet I observed more than once that the equivalent constructs yield
    >>> different code generated (and sometimes of different efficiency).
    >>> Any idea why? Does notation serve as a hint to the compiler?

    >>
    >> It has been mentioned that the pointer notation leads to slightly more
    >> efficient code generation, but I suppose compilers these days generate
    >> the same code for both the forms.

    >
    > I do not know how *any* compiler could dereference an integer value (as
    > the i[p] notation suggests).


    It may suggest it, but it doesn't mean it. `i[p]` is (as is wrote above)
    `*(i+p)`, which commutes to `*(p+i)`, for which `p` is (just) shorthand.
    In all four of these expressions, it's the sum of `p` and `i` which is
    dereferences, not either of then individual values.

    --
    Chris 1["CJD"] Dollin

    Hewlett-Packard Limited registered no:
    registered office: Cain Road, Bracknell, Berks RG12 1HN 690597 England
    Chris Dollin, Nov 12, 2007
    #5
  6. Chris Dollin schrieb:

    >> I do not know how *any* compiler could dereference an integer value (as
    >> the i[p] notation suggests).

    >
    > It may suggest it, but it doesn't mean it. `i[p]` is (as is wrote above)
    > `*(i+p)`, which commutes to `*(p+i)`, for which `p` is (just) shorthand.
    > In all four of these expressions, it's the sum of `p` and `i` which is
    > dereferences, not either of then individual values.


    I fully understand your line of argumentation. And indeed, they are
    equivalent. I just tried out

    char moo[] = "Test.";
    printf("%c\n", 2[moo]);

    And was quite surprised, to be honest, that it compiled cleanly and
    yielded the expected result. Knowing C for 10 years I'm still sometimes
    surprised what a sick yet beautiful language it is ;-)

    Greetings,
    Johannes

    --
    "PS: Ein Realname wäre nett. Ich selbst nutze nur keinen, weil mich die
    meisten hier bereits mit Namen kennen." -- Markus Gronotte aka "Makus"
    aka "Kosst Amojan" aka "maqqusz" in de.sci.electronics
    <45608268$0$5719$-online.net>
    Johannes Bauer, Nov 12, 2007
    #6
  7. Ark Khasin

    Willem Guest

    Johannes wrote:
    ) I fully understand your line of argumentation. And indeed, they are
    ) equivalent. I just tried out
    )
    ) char moo[] = "Test.";
    ) printf("%c\n", 2[moo]);

    Try:

    printf("%c\n", 2["Test."]);

    ;)


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
    Willem, Nov 12, 2007
    #7
  8. Ark Khasin

    Guest

    In article <>,
    Chris Torek <> wrote:

    >If you look inside actual C compilers that do optimization with
    >expression trees, there is usually some horrible thousands-of-lines
    >switch statement or "if/else" chain or some such to match particular
    >trees (though some use tables, and some have hybrids of tables
    >*and* horrible 3000-line switch statements :) ), and it is easy
    >for the programmer to forget to put in symmetric cases.


    Is there a good reason why they don't do a normalization pass first, to
    convert symmetric operations into a canonical form? It seems to me
    that that would be an easy and cheap way to reduce the size of the
    massive switch-or-table, though some care would have to be taken to
    avoid normalizing asymmetric operations like subtraction.


    dave
    , Nov 12, 2007
    #8
  9. Ark Khasin

    Chris Torek Guest

    >In article <>,
    >Chris Torek <> wrote:
    >>If you look inside actual C compilers that do optimization with
    >>expression trees, there is usually some horrible thousands-of-lines
    >>switch statement or "if/else" chain or some such to match particular
    >>trees (though some use tables, and some have hybrids of tables
    >>*and* horrible 3000-line switch statements :) ), and it is easy
    >>for the programmer to forget to put in symmetric cases.


    In article <fha9vk$clq$>,
    <> wrote:
    >Is there a good reason why they don't do a normalization pass first, to
    >convert symmetric operations into a canonical form?


    I imagine some do, although the last time I was digging around
    inside gcc, I think it did not, and I have not seen it in the
    innards of one or two other compilers I have touched. (I should
    note that I have not gone spelunking in gcc since the 2.95 days.)

    >It seems to me that that would be an easy and cheap way to reduce
    >the size of the massive switch-or-table, though some care would
    >have to be taken to avoid normalizing asymmetric operations like
    >subtraction.


    Yes. It also means one more pass over the tree, which has some
    time cost.
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Nov 12, 2007
    #9
  10. Johannes Bauer <> writes:
    > santosh schrieb:
    >> In article <z7RZi.10725$h61.9895@trndny02>, Ark Khasin
    >> <> wrote on Monday 12 Nov 2007 10:32 am:
    >>
    >>> So, p is *(p+i) is (for fun) *(i+p) is i[p].
    >>> Yet I observed more than once that the equivalent constructs yield
    >>> different code generated (and sometimes of different efficiency).
    >>> Any idea why? Does notation serve as a hint to the compiler?

    >>
    >> It has been mentioned that the pointer notation leads to slightly more
    >> efficient code generation, but I suppose compilers these days generate
    >> the same code for both the forms.

    >
    > I do not know how *any* compiler could dereference an integer value (as
    > the i[p] notation suggests).

    [...]

    See question 6.11 in the comp.lang.c FAQ, <http://www.c-faq.com/>.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Looking for software development work in the San Diego area.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Nov 12, 2007
    #10
  11. Ark Khasin

    CBFalconer Guest

    Johannes Bauer wrote:
    > santosh schrieb:
    >

    .... snip ...
    >>
    >> It has been mentioned that the pointer notation leads to slightly
    >> more efficient code generation, but I suppose compilers these days
    >> generate the same code for both the forms.

    >
    > I do not know how *any* compiler could dereference an integer value
    > (as the i[p] notation suggests).


    A reference of that type is converted into a pointer (the p) and an
    integer (the i). As long as there is one of each the results can
    be added and then dereferenced (assuming within range). Note that
    the integer is multiplied by (sizof *p).

    --
    Chuck F (cbfalconer at maineline dot net)
    <http://cbfalconer.home.att.net>
    Try the download section.



    --
    Posted via a free Usenet account from http://www.teranews.com
    CBFalconer, Nov 12, 2007
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Angela
    Replies:
    3
    Views:
    353
    Artie Gold
    Oct 27, 2003
  2. Grey Squirrel

    Hungarian Notation Vs. Pascal Notation?

    Grey Squirrel, Mar 19, 2007, in forum: ASP .Net
    Replies:
    6
    Views:
    1,292
    Steve C. Orr [MCSD, MVP, CSM, ASP Insider]
    Mar 21, 2007
  3. Tameem
    Replies:
    454
    Views:
    11,754
  4. Robert Mark Bram

    Dot notation V Bracket notation

    Robert Mark Bram, Jul 4, 2003, in forum: Javascript
    Replies:
    3
    Views:
    465
    Robert Mark Bram
    Jul 5, 2003
  5. Tomasz Chmielewski

    sorting index-15, index-9, index-110 "the human way"?

    Tomasz Chmielewski, Mar 4, 2008, in forum: Perl Misc
    Replies:
    4
    Views:
    275
    Tomasz Chmielewski
    Mar 4, 2008
Loading...

Share This Page