Implementation-defined behavior

Discussion in 'C Programming' started by amit.codename13@gmail.com, May 6, 2009.

  1. Guest

    Will the following code have implementation-defined behavior?

    int main()
    {
    int *i,j;
    i=(int*)10;
    return 0;
    }

    is it certain as to what value is stored in i???
     
    , May 6, 2009
    #1

  2. On 6 May 2009 at 22:02, wrote:
    > int main()
    > {
    > int *i,j;
    > i=(int*)10;
    > return 0;
    > }
    >
    > is it certain as to what value is stored in i???


    Yes it is. (The representation of that value is not certain: it may be
    little or big endian depending on the architecture the code is compiled
    for.)

    Pointers are interchangeable with the signed integer type intptr_t, and
    /any/ integer type is guaranteed to be large enough to store 10 without
    overflow.

    In practice, of course, intptr_t will be either int32_t or int64_t on a
    modern non-embedded implementation.
     
    Antoninus Twink, May 6, 2009
    #2

  3. Flash Gordon Guest

    Antoninus Twink wrote:
    > On 6 May 2009 at 22:02, wrote:
    >> int main()
    >> {
    >> int *i,j;
    >> i=(int*)10;
    >> return 0;
    >> }
    >>
    >> is it certain as to what value is stored in i???

    >
    > Yes it is.


    <snip>

    Wrong. It could be a trap and if it isn't the conversion is
    implementation defined. I can't be bothered to correct all the other
    errors you made.
    --
    Flash Gordon
     
    Flash Gordon, May 7, 2009
    #3
  4. On 6 May 2009 at 23:15, Flash Gordon wrote:
    > Wrong. It could be a trap and if it isn't the conversion is
    > implementation defined.


    Let's break it up into an extra step then, to be clear about what's
    going on.

    int *ip;
    intptr_t i;
    i = 10;
    ip = (int *) i;
    assert((intptr_t) i == 10);

    Would you like to name me an implementation where this code compiles and
    the assertion fails when it runs?
     
    Antoninus Twink, May 7, 2009
    #4
  5. Guest

    On May 7, 4:15 am, Antoninus Twink wrote:
    > On  6 May 2009 at 23:15, Flash Gordon wrote:
    >
    > > Wrong. It could be a trap and if it isn't the conversion is
    > > implementation defined.

    >
    > Let's break it up into an extra step then, to be clear about what's
    > going on.
    >
    > int *ip;
    > intptr_t i;
    > i = 10;
    > ip = (int *) i;
    > assert((intptr_t) i == 10);
    >
    > Would you like to name me an implementation where this code compiles and
    > the assertion fails when it runs?


    thats exactly what i need the answer for...
     
    , May 7, 2009
    #5
  6. On 7 May 2009 at 19:55, Eric Sosman wrote:
    > (Unfortunately, one careless respondent has given Question B's answer
    > for Question A.)

    [snip]

    I think we are in complete agreement here, Eric. Why not leave the
    stupid polemics to the Heathfield-Thomson-Falconer trolling machine?
     
    Antoninus Twink, May 7, 2009
    #6
  7. Guest

    On May 7, 12:55 pm, Eric Sosman wrote:
    > wrote:
    > > On May 7, 4:15 am, Antoninus Twink wrote:
    > >> On  6 May 2009 at 23:15, Flash Gordon wrote:

    >
    > >>> Wrong. It could be a trap and if it isn't the conversion is
    > >>> implementation defined.
    > >> Let's break it up into an extra step then, to be clear about what's
    > >> going on.

    >
    > >> int *ip;
    > >> intptr_t i;
    > >> i = 10;
    > >> ip = (int *) i;
    > >> assert((intptr_t) i == 10);

    >
    > >> Would you like to name me an implementation where this code compiles and
    > >> the assertion fails when it runs?

    >
    > > thats exactly what i need the answer for...

    >
    >      Note that you've changed your question.  You began  with
    > "Is it certain?" and now you've switched to something more like
    > "Is it likely?"  You should not be surprised when different
    > questions get different answers.  (Unfortunately, one careless
    > respondent has given Question B's answer for Question A.)
    >
    >      "Is it certain" that converting 10 to an int* gives a known
    > value?  No, it is not.  Ints can be converted to pointers (and
    > vice versa), but everything about the conversion is implementation-
    > defined.  Even the validity of the converted value is up to the
    > implementation.
    >
    >      "Is it likely" that converting 10 to an int* gives a known
    > value?  Yes, it is.  On most machines, pointers behave very much
    > like some flavor of integer, and there is a natural correspondence
    > between pointer values and integer values.  The value 10 is almost
    > certainly included in the range of the correspondence.
    >
    >      A question you didn't ask, but might have: "Is the converted
    > value a valid int* pointer value?"  Possibly, but probably not.
    > On many systems, the int corresponding to an int* must be a multiple
    > of four.  Even on those with more relaxed alignment requirements,
    > it is common to find that "low core" addresses are off-limits and
    > inaccessible.
    >
    > --
    >


    That was my mistake... you have answered all the questions that I
    wanted answered.

    thanks...

    I misinterpreted what Antoninus said because I was a little biased toward
    thinking it is *guaranteed* that the assertion that a value of 10 is
    stored in i would never fail.
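
    Eric Sosman's alignment point, quoted above, can be sketched in C. This is only an illustration, assuming that a converted integer is used directly as a byte address, which the standard does not promise. The function name is_multiple_of_alignment is made up for this example.

```c
/* Returns 1 if `value`, read as a byte address, is a multiple of
 * `alignment`, and 0 otherwise. `alignment` must be nonzero.
 * Hypothetical helper: it models the "must be a multiple of four" rule
 * and says nothing about what a real implementation does with the cast. */
int is_multiple_of_alignment(unsigned long value, unsigned long alignment)
{
    return value % alignment == 0;
}
```

    With an alignment of 4, the value 10 is rejected while 12 passes. In C11 the alignment actually required for int can be obtained with _Alignof(int).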
     
    , May 8, 2009
    #7
  8. James Kuyper Guest

    wrote:
    ....
    > I misinterpreted what Antoninus said because I was a little biased toward
    > thinking it is *guaranteed* that the assertion that a value of 10 is
    > stored in i would never fail.


    A key point to understand is that a value of 10 cannot be stored in a
    pointer. The value of a pointer is the location in memory that it points
    at. That location might have an address of 10, and that address might be
    stored in the representation of the pointer, but the actual value of the
    pointer is not 10.

    On a more complicated level, Antoninus' comments to the contrary
    notwithstanding, converting a value of 10 into a pointer is not
    guaranteed to produce a pointer that, when converted back to an integer
    type, will have a value of 10. That's possible, and commonplace, but not
    guaranteed.

    It is guaranteed that converting a valid pointer to void to an intptr_t
    (if available) and back to void * will produce a pointer
    value that compares equal to the original. It might seem that this
    guarantee implies the other guarantee, but it doesn't; multiple
    different integer values might convert to the same pointer value,
    without violating the guarantee, but the reverse conversion can produce
    only one of those integer values, which need not be the same as the one
    you started with.
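
    The one direction that is guaranteed can be written down directly. This is a minimal sketch, assuming the implementation provides intptr_t (it is optional); the function name is hypothetical.

```c
#include <stdint.h>

/* Converts a valid pointer to intptr_t and back, then reports whether the
 * result compares equal to the original pointer. This is the round trip
 * the standard guarantees for void * when intptr_t exists. */
int pointer_survives_round_trip(void *p)
{
    intptr_t n = (intptr_t)p;   /* pointer -> integer */
    void *q = (void *)n;        /* integer -> pointer */
    return q == p;
}
```

    Nothing here says what the integer n actually is, and nothing guarantees the opposite trip when starting from an arbitrary integer such as 10.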
     
    James Kuyper, May 8, 2009
    #8
  9. On 8 May 2009 at 13:25, James Kuyper wrote:
    > Multiple different integer values might convert to the same pointer
    > value, without violating the guarantee, but the reverse conversion can
    > produce only one of those integer values, which need not be the same
    > as the one you started with.


    I repeat my invitation to name an implementation for which this is the
    case.
     
    Antoninus Twink, May 8, 2009
    #9
  10. Ike Naar Guest

    Antoninus Twink wrote:
    >Let's break it up into an extra step then, to be clear about what's
    >going on.
    >
    >int *ip;
    >intptr_t i;
    >i = 10;
    >ip = (int *) i;
    >assert((intptr_t) i == 10);


    You're converting i, an intptr_t that has the value 10, to type intptr_t,
    and then asserting that the converted value equals 10. What's the point?

    Or did you mean ``assert((intptr_t) ip == 10);'' ?
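
    With the comparison applied to ip rather than i, the test would look something like the sketch below. The function name is invented here. Whether it returns 1 is implementation-defined, and on an implementation where the converted value is a trap representation even evaluating it may misbehave.

```c
#include <stdint.h>

/* Returns 1 if the value 10 survives integer -> int * -> integer on this
 * implementation, and 0 if it does not. The result is not portable. */
int ten_survives_round_trip(void)
{
    intptr_t i = 10;
    int *ip = (int *)i;          /* implementation-defined conversion */
    return (intptr_t)ip == 10;   /* compares ip, not i */
}
```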
     
    Ike Naar, May 8, 2009
    #10
  11. Richard Bos Guest

    "" <> wrote:

    > I misinterpreted what Antoninus said


    That might be because Antoninus is indeed very much like a twink:
    creamy, and splurging, but when push comes to shove, not very full of
    real value.

    Richard
     
    Richard Bos, May 9, 2009
    #11
  12. Guest

    Antoninus Twink wrote:
    > On 8 May 2009 at 13:25, James Kuyper wrote:
    > > Multiple different integer values might convert to the same pointer
    > > value, without violating the guarantee, but the reverse conversion can
    > > produce only one of those integer values, which need not be the same
    > > as the one you started with.

    >
    > I repeat my invitation to name an implementation for which this is the
    > case.


    Any word addressed machine where 10 is not correctly aligned for an int
    and the pointer to int mapping produces a byte offset from address 0. I
    believe the Cray falls into that camp.
    --
    Larry Jones

    Hmph. -- Calvin
     
    , May 13, 2009
    #12
  13. Guest

    On May 13, 1:24 pm, wrote:
    > Antoninus Twink wrote:
    > > On  8 May 2009 at 13:25, James Kuyper wrote:
    > > > Multiple different integer values might convert to the same pointer
    > > > value, without violating the guarantee, but the reverse conversion can
    > > > produce only one of those integer values, which need not be the same
    > > > as the one you started with.

    >
    > > I repeat my invitation to name an implementation for which this is the
    > > case.

    >
    > Any word addressed machine where 10 is not correctly aligned for an int
    > and the pointer to int mapping produces a byte offset from address 0.  I
    > believe the Cray falls into that camp.

    <snip>

    It's also fairly easy to run into problems with 64 bit CPUs running 32
    bit ABIs.

    The 16 core MIPS64 NPU sitting here on my desk is one such example.
    Any 32 bit code using the KSEG/SSEG kernel segments must take care to
    construct sign-extended pointer values. As an example, take the KSEG0
    base address. If you end up with:

    0x00000000 80000000 (incorrect)

    in a 64 bit register rather than:

    0xFFFFFFFF 80000000 (correct)

    ... you'll crash upon a dereference. Depending on your toolchain and
    your configuration of it, a statement such as:

    int *p = (int *) 0x80000000;

    ... can generate either one! Perhaps even worse, the resulting code
    may or may not work depending on the processor's current mode and
    whether or not the SR(UX) bit is set in the CPU's status register.

    Generally speaking, the newer and nicer toolchain setups will sign
    extend for you (because it's what you probably want), and this is fine
    because that's what everyone is talking about when they say
    "implementation defined". The vanishingly few people that actually
    want 0x00000000 80000000 can simply use a uint64_t / unsigned long
    long / (u)intptr_t type before converting it to a pointer.

    Something to look out for is crossing the sign extension boundary with
    arithmetic. This is virtually certain to not be handled properly. If
    you think about it, the reason why makes perfect sense: the region
    that 32-bit code sees as continuous is in fact composed of the very
    discontinuous bottom (0x0000000000000000-0x000000007fffffff) and top
    (0xffffffff80000000-0xffffffffffffffff) of the 64 bit space.

    For normal 32-bit legacy userspace programs, 0x80000000 and above is
    off limits and accesses to it "can't happen", so nobody really cares
    except kernel / system software engineers, who are expected to be
    aware of the situation.

    I've seen an instance in the field of a bug where two pointers printf()
    to the same value (in a 32 bit environment) but do not compare
    equal with == due to defects in the way that one of them was
    constructed. I almost lost my voice from having to tell them "STFU,
    it's __not__ a compiler bug".
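
    The two register patterns shown in this post can be reproduced with plain integer arithmetic. This is a sketch only: the function names are made up, and it models the widening of a 32-bit address value, not what any particular MIPS toolchain emits.

```c
#include <stdint.h>

/* Widens a 32-bit address by copying bit 31 into the upper 32 bits. */
uint64_t widen_sign_extended(uint32_t addr32)
{
    uint64_t wide = addr32;
    if (addr32 & UINT32_C(0x80000000))
        wide |= UINT64_C(0xFFFFFFFF00000000);
    return wide;
}

/* Widens a 32-bit address by filling the upper 32 bits with zeros. */
uint64_t widen_zero_extended(uint32_t addr32)
{
    return (uint64_t)addr32;
}
```

    For 0x80000000 the first function gives 0xFFFFFFFF80000000 and the second gives 0x0000000080000000, while for addresses below 0x80000000 the two agree. Adding 1 to the widened form of 0x7fffffff in 64-bit arithmetic yields 0x0000000080000000, not the sign-extended form of 0x80000000, which is the boundary-crossing problem described above.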


    Mark F. Haigh
     
    , May 14, 2009
    #13
  14. In article <> writes:
    ....
    > Any word addressed machine where 10 is not correctly aligned for an int
    > and the pointer to int mapping produces a byte offset from address 0. I
    > believe the Cray falls into that camp.


    The Cray has many surprises in pointers, but this is not one of them. 10
    is a perfect word address, as is 11. It is when you come to byte addresses
    that things are different. 10 is a byte address, the next byte is at
    281474976710666 ;-).
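
    The odd-looking number is 10 plus 2 to the power 48, which suggests the byte offset within a word sits in the bits from 48 upward, with the word address in the low bits. The sketch below assumes exactly that and nothing more; the shift value is inferred from the figure in this post, not taken from Cray documentation, and the names are hypothetical.

```c
#include <stdint.h>

/* Assumed position of the byte-offset field, inferred from
 * 281474976710666 - 10 == 2^48. */
#define BYTE_OFFSET_SHIFT 48

/* Advances a byte address to the next byte within the same word by bumping
 * the assumed high-order offset field. Moving on to the next word is not
 * handled here. */
uint64_t next_byte_in_word(uint64_t byte_addr)
{
    return byte_addr + (UINT64_C(1) << BYTE_OFFSET_SHIFT);
}
```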
    --
    dik t. winter, cwi, science park 123, 1098 xg amsterdam, nederland, +31205924131
    home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
     
    Dik T. Winter, May 14, 2009
    #14
  15. Guest

    Dik T. Winter wrote:
    >
    > The Cray has many surprises in pointers, but this is not one of them. 10
    > is a perfect word address, as is 11. It is when you come to byte addresses
    > that things are different. 10 is a byte address, the next byte is at
    > 281474976710666 ;-).


    Ah, so the pointer/integer mapping leaves the bits alone. I thought
    they got rotated to put the byte offset into the low-order bits where it
    belongs. :)
    --
    Larry Jones

    I always have to help Dad establish the proper context. -- Calvin
     
    , May 14, 2009
    #15
  16. wrote:
    > Something to look out for is crossing the sign extension boundary with
    > arithmetic. This is virtually certain to not be handled properly. If
    > you think about it, the reason why makes perfect sense: the region
    > that 32-bit code sees as continuous is in fact composed of the very
    > discontinuous bottom (0x0000000000000000-0x000000007fffffff) and top
    > (0xffffffff80000000-0xffffffffffffffff) of the 64 bit space.


    AMD faced this when defining their 64-bit extensions; they "solved" the
    problem by declaring pointers to be signed and mandating sign-extension
    when a 32-bit value was assigned to a register. The upper bits can (and
    must) be ignored by 32-bit code, but they're still there and must have
    the correct values in case some 64-bit code examines them (e.g. because
    it's running on a 64-bit OS).

    (In contrast, Intel mandated zero extension when a 16-bit value was
    assigned to a 32-bit register; they also mandated no default extension
    at all when assigning a value to an 8-bit half of a 16-bit register, but
    both sign-extending and zero-extending instructions were available. Oh,
    the fun compiler writers must have keeping track of all that...)

    > For normal 32-bit legacy userspace programs, 0x80000000 and above is
    > off limits and accesses to it "can't happen", so nobody really cares
    > except kernel / system software engineers, who are expected to be
    > aware of the situation.


    This is where AMD's signed pointers become particularly useful. Today's
    OS kernels can live comfortably in 2GB of RAM, whether in 32-bit or
    64-bit modes. Therefore, rather than making them live in the "top" half
    of memory, which might be at 2GB:4GB or 8,589,934,592:17,179,869,184GB,
    they live at -2GB:0 in either mode. User space is 0:2GB in 32-bit mode
    or 0:8,589,934,592GB in 64-bit mode -- but most kernel code doesn't need
    to care about that.
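
    The "-2GB:0" view can be illustrated by reading a 32-bit address as a signed quantity. This is a rough sketch of the idea as described in this post, with an invented function name; it is not a statement about how any particular kernel or CPU lays out memory.

```c
#include <stdint.h>

/* Interprets a 32-bit address as a signed value widened to 64 bits:
 * addresses with bit 31 set come out negative. */
int64_t signed_view_of_address(uint32_t addr32)
{
    int64_t wide = (int64_t)addr32;
    if (addr32 >= UINT32_C(0x80000000))
        wide -= INT64_C(1) << 32;
    return wide;
}
```

    Under this reading, 0x80000000 becomes -2147483648 (that is, -2GB) and 0xFFFFFFFF becomes -1, so the top half of the 32-bit space lands in the range -2GB:0 whatever the register width.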

    S

    --
    Stephen Sprunk "Stupid people surround themselves with smart
    CCIE #3723 people. Smart people surround themselves with
    K5SSS smart people who disagree with them." --Isaac Jaffe
     
    Stephen Sprunk, Jun 6, 2009
    #16
  17. On 6 Jun 2009 at 22:39, Stephen Sprunk wrote:
    > [good stuff]


    Interesting and informative post - thanks!
     
    Antoninus Twink, Jun 7, 2009
    #17