An unusual use of case/switch

Discussion in 'C Programming' started by Xavier Roche, Sep 16, 2013.

  1. Xavier Roche

    Rosario1903 Guest

    this is wrong because foo has ten char spaces
    and index is 49 in foo[index]
    what is wrong about above is: the mult operations
    pointer mult pointer
    integer or pointer mult pointer
    are rare

    the only operations with pointer are not rare are + and -
    pointer + or - unsigned
    pointer - pointer
    pointer % unsigned

    so would be
    against:
    u32 foo = 0xC0084001;
    u32 v=7;

    v=v*v;
    bar(foo[v]);

    that means
    bar( *(u32*) ((u8*)foo+v*sizeof(u32)) )

    that has one chances to be right
    if i not wrote something wrong and understand the post above...
     
    Rosario1903, Sep 27, 2013
    #41
    1. Advertisements

  2. Xavier Roche

    Rosario1903 Guest

    wrong:
    u8 *foo = 0xC0084001;
    u32 v=7;

    v=v*v;
    bar(foo[v]);

    that means
    bar( * (foo+v*sizeof(u32)) )
     
    Rosario1903, Sep 27, 2013
    #42
    1. Advertisements

  3. Xavier Roche

    Rosario1903 Guest

    wrong

    wrong:
    u8 *foo = 0xC0084001;
    u32 v=7;

    v=v*v;
    bar(foo[v]);

    that means
    bar( * ( foo + v ) )
     
    Rosario1903, Sep 27, 2013
    #43
  4. Xavier Roche

    Maxim Fomin Guest

    27.09.2013 12:07, Rosario1903 пишет:
    Thanks. Without your help nobody would have spotted the bug.
     
    Maxim Fomin, Sep 27, 2013
    #44
  5. I'll try to make my point again without being quite so dismissive.
    The fact that something is represented as a bit-string does not imply
    that it's an integer. In computer systems, bit-strings are used to
    represent everything that can be represented. Yes, addresses are
    represented as bit-strings; so are floating-point numbers.

    There is a mapping between addresses and integers, exposed in C by the
    semantics of pointer-to-integer and integer-to-pointer conversions.
    On many systems, that mapping is trivial. On others it may not be.

    If you're talking about a particular low-level system, outside the
    context of C, it may well be reasonable to treat addresses as integers.
    That doesn't mean that addresses *are* integers, and it certainly
    doesn't mean that addresses are ints (where "int" is a specific type
    that exists in C).

    A side point: it's quite common for addresses and "int" to have
    different sizes.
    The result of the multiplication is well defined (it's 49, obviously).
    If "index" were a pointer, that operation would be about as meaningful
    as the square root of beige.
    I'm sure misusing an integer value as an out-of-bounds array index makes
    some point, but I don't know what it is.

    I can imagine cases where the result of an integer multiplication could
    sensibly be used as an array index.
    I presume u32 is an unsigned 32-bit integer type. If so, the result of
    the multiplication is well defined by C, but if the operands are integer
    representations of memory addresses it's probably not meaningful.

    In C terms, 0xC0084001 is not an address. It may well be the result of
    converting an address to u32, and converting 0xC0084001 to some pointer
    type would yield an address. And one can certainly informally talk
    about, say, "the address 0xC0084001" -- assuming that makes sense for
    the particular system being discussed.
    I would ask you to reconsider putting words in my mouth. In particular,
    I've said that multiplying two addresses is meaningless, and that
    addresses are not integers (in C, of course); I don't know how an
    example of multiplying two integers is relevant to that.

    To summarize:

    In C, addresses (i.e., values of pointer type) are distinct from
    integers. They can be converted back and forth; the result of such a
    conversion is "intended to be consistent with the addressing structure
    of the execution environment", whatever that may be. Very commonly this
    makes the conversion trivial, just reintrepreting the bits.

    In contexts other than C, it may well make sense to treat addresses and
    integers as the same thing, or as very similar things. An address might
    logically be an integer index into an "array" that corresponds to the
    machine's entire (virtual or physical) memory. Or it might be something
    else altogether; it depends on the system.

    Since this is a C newsgroup, it is my opinion that it's a good idea to
    make the distinction between integers and addresses clear. Among other
    things, it might prevent inexperienced C programmers from wondering why

    void *ptr = 0xC0084001;

    is rejected.

    I'm aware that a lot of this is stuff that you know as well as I do.
    I'm restating some of it in the hope of figuring out just where we
    disagree.

    Do you disagree with any of that?
     
    Keith Thompson, Sep 27, 2013
    #45
  6. Xavier Roche

    Seebs Guest

    I am not convinced this is actually the case. Or rather, I am not convinced
    it is absolutely, always, without exception, and universally the case.

    Let's take, just as an example, a fairly obscure architecture used for
    some chips a while back made by some little company based out of Portland
    or thereabouts, which used to have "addressing modes", which had names like
    "near" and "far" and such. These denoted states in which an address might
    be, say, a pair of 16-bit values which were combined together in such a way
    that multiple distinct sets of 32 bits might represent the same address.
    Which is to say... I do not think it fair to say that they were "integers"
    in the usual sense. Because you couldn't just do integer arithmetic on the
    32-bit value, or on either 16-bit value, and expect sensible results in all
    cases.
    I think that's a bit unfair. Keith's claim wasn't "it is always and without
    exception appropriate and sane to multiply an integer by another integer", but
    merely "multiplication is a defined operation on integers".

    A rebuttal would look something like:

    void *v1 = malloc(1);
    void *v2 = malloc(1);

    intptr_t a1 = (intptr_t) v1;
    intptr_t a2 = (intptr_t) v2;

    intptr_t result = a1 * a2;

    Is it remotely coherent to talk about multiplying or dividing addresses?
    If not, they're not conceptually integers, even if they happen to map onto
    memory in a way that makes "integer" a useful way to think about them in
    some contexts.

    -s
     
    Seebs, Sep 29, 2013
    #46
  7. Xavier Roche

    Phil Carmody Guest

    I apologise, to Keith too, for my perhaps over-the-top flippancy.
    That falls under the "not necessarily atomic" clause. The paragraphs
    were integral, the offsets were integral. The idea that went through
    my mind as I was writing the above was even less flat than the 8086
    paragraph/offset model, it was that of paged memory, where there
    wasn't even an address wide enough for the whole address, first you
    had to page in the appropriate page (with an integral page number),
    and then you could address just that page (with an integral offset).

    I don't know anyone who works in chip design who doesn't think of
    addresses as anything other than (not-necessarily-contiguous ranges
    of) integers. C's job is to model that conveniently. Sometimes, in the
    case of memory mapped I/O where accesses might not be performed
    identically to how real memory is accessed (e.g. real RAM may have
    different byte-endianness than the memory-mapped I/O regions), the C
    programmer will wish to absolutely prevent the dereferencing of such
    "pointers", and may chose to not use C pointers at all. Others, for
    example in linux, may use a pointer, but have it annotated for the
    compiler as non-dereferenceable. Others just call it a void* and
    hope. There's no right way, there's just a collection of not-
    dreadfully-wrong ways, and to me the existence of these
    contradictory/incompatible approaches shows that C is not perfect in
    modelling real world hardware. I therefore don't like to just say
    "C has defined what a pointer is, that's the end of the story"

    Phil
     
    Phil Carmody, Oct 1, 2013
    #47
  8. Cheerfully accepted!
     
    Keith Thompson, Oct 1, 2013
    #48
  9. Xavier Roche

    Seebs Guest

    Yeah. But... Even though it may be structurally like an integer, I find
    it pretty useless to try to think of it as an integer.

    Integers are numeric values. It makes sense to talk about multiplying or
    dividing them. It makes sense to talk about the average of a set. None of
    these make sense when talking about pointers in C.
    I don't know many people who work in chip design to begin with. However,
    I'm not sure it's necessarily trhe case that a way of thinking about things
    that's useful for chip design is a good approach when thinking about C.
    I'd agree, and I'd furthermore state that part of the reason for this is
    that real world hardware doesn't all have compatible conceptual designs.
    What I've found is that I can write a whole heck of a lot of code thinking
    about it that way and have no problems. And that I see a lot of buggy code
    coming from people who are thinking too hard about what they "know" about
    the underlying machine, and not enough about the slightly different rules C
    imposes in order to let compilers make good optimization choices.

    -s
     
    Seebs, Oct 2, 2013
    #49
  10. Indeed. A concrete example: some years ago I saw some code that
    was intended to compute the distance in bytes between the memory
    locations pointed to by two pointers. The computation was something
    like:

    char *ptr0 = ...;
    char *ptr1 = ...;
    int difference = (uintptr_t)ptr1 - (uintptr_t)ptr0;

    The person who wrote the code presumably was thinking of pointers
    in a way that goes beyond what C says about them. And it would
    not work correctly on at least one of the platforms on which the
    code needed to run.
     
    Keith Thompson, Oct 2, 2013
    #50
  11. Xavier Roche

    Tim Rentsch Guest

    Don't these numbers strike you as suspicious, in particular the
    discrepancy between the two functions under the sun compiler?
    The difference is well over a factor of 10, and given the counts
    involved suggests an operation rate (in the faster measurement)
    of more than 25 billion instructions per second. Clearly there
    is something unusual going on here - perhaps the result of
    targeting a particular benchmark, or the compiler playing a
    little bit fast and loose with the semantics, but in any case
    surely more than just a better job of unrolling the loop.
    Moreover the sun compiler was worse on the duff's device function
    than gcc, which shows the sun compiler to be significantly uneven
    in the results it achieves. I don't think these results show
    very much except that there is something funny going on and other
    variations need to be measured before any stronger conclusions
    should be drawn.
     
    Tim Rentsch, Oct 22, 2013
    #51
  12. Xavier Roche

    Phil Carmody Guest

    Naive loop:

    Duff's Device:
    A simple 16-chars-at-a-time vectorisation of the naive loop
    (which gcc *should* have been able to do too, they specifically
    look for loops like these) could easily provide nearly 16x
    speed-up on some architectures (with next-to-zero loop overhead).
    Add in an uncooperative cache architecture, and the byte writes
    could be even more expensive. I wish my Alpha was still running,
    as that was notorious for (lack of) byte accesses.

    Personally, I'm happy to see compilers remove the need for
    uglifying micro-optimisations whereever possible.

    I would be interested to see Ian's disassembly, though. (all 4)

    Phil
     
    Phil Carmody, Oct 22, 2013
    #52
  13. Xavier Roche

    jgh Guest

    Ah ha! I remember now. It's the difference between:

    loop: movb (r0)+,(r1)+
    sob r2,loop

    and:
    mov r2,r3
    bic -8,r3
    asl r3
    add pc,r3 ; branch according to r2 AND 7 to go0-go7
    loop: movb (r0)+,(r1)+
    go7: movb (r0)+,(r1)+
    go6: movb (r0)+,(r1)+
    go5: movb (r0)+,(r1)+
    go4: movb (r0)+,(r1)+
    go3: movb (r0)+,(r1)+
    go2: movb (r0)+,(r1)+
    go1: movb (r0)+,(r1)+
    sub r2,#8
    bne loop

    (untested)

    JGH
     
    jgh, Oct 22, 2013
    #53
  14. Xavier Roche

    Tim Rentsch Guest

    I don't think this conclusion can be right, unless you're
    talking about something different than what I think you mean.
    A non-optimized version of the original loop would look like

    load indirect
    increment pointer
    store indirect
    decrement count
    test and branch

    which times 16 would be 80 "instructions". Now suppose the
    optimizer magically eliminates all but the fetching of
    source bytes. We still are left with 16 "instructions" for
    one pass through the unrolled loop, which seems unlikely to
    be 16 times faster than the original. [snip]

    I admit the performance predictor of "instructions" that I'm
    using here is simplified and crude, but even so I wouldn't
    expect it to be off by the factor of three that would be
    needed to reconcile the discrepancy. Did I misunderstand
    your suggestion?
    Me too, definitely. Unfortunately how well they do is
    pretty uneven, so it still can't be taken for granted.
    I'd like to see compilers do a better job of smoothing
    out the worst cases, and put less emphasis on doing
    fantastically well on contrived benchmark cases.
    Ditto. I also would like to see some results of some
    similar test cases (for example, *--p = *q++, and other
    simple variations), to get a better sense of what's really
    going on here.

    Incidentally, in my own tests, there was a huge difference
    between a loop doing *p++ = *q++ for its body, and another
    loop doing *--p = *--q for its body.
     
    Tim Rentsch, Oct 24, 2013
    #54
  15. Xavier Roche

    Phil Carmody Guest

    Nope, vectorisation a la SIMD, not mere unrolling.

    Preamble
    Load 16 bytes
    Increment pointer by 16
    Store 16 bytes
    Decrement count by 16
    Test and branch
    Remnants

    GCC specifically attempts to do this on some architectures/platforms,
    but clearly fails on Ian's, as this *p++=*q++ case is presumably one
    of the most basic examples.

    Still looking forward to an objdump... ;-)

    Phil
     
    Phil Carmody, Oct 25, 2013
    #55
  16. Xavier Roche

    Tim Rentsch Guest

    Oh I see, you assume 16 bytes can be loaded in parallel
    (and the number 16 was not just picked at random but
    relates to memory architecture). And this may be how
    the performance gain was achieved. At the same time,
    it illustrates the need for trying other code sequences,
    eg, those which are not (easily) parallelizble in this
    way.
    Re-ditto.
     
    Tim Rentsch, Oct 27, 2013
    #56
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.