Stack organisation locals/args

Discussion in 'C Programming' started by datenpunk@gmail.com, Jul 11, 2013.

  1. Guest

    hi,

    here is some rather simple code with a complex question ...

    the code - called from main:

    int long call_me(int long a, int short b) {
    int long c;
    c = 0;
    c = c + a;
    c = c - b;
    return c;
    }

    when debugging this I see that the stack has this order (low to high)

    1. a (EBP + 8)
    2. Return address (EBP +4)
    3. EBP
    4. c (EBP -4)
    5. b (EBP - 20) /* some debugger stuff between b and c)

    the question is: could it be that C mixes locals with params? I would expect to have a and b and then c but they are mixed up and b is after/ahead of EBP. Thanks in advance!

    Daniel Khan
    , Jul 11, 2013
    #1

  2. On Thursday, July 11, 2013 12:19:45 AM UTC+1, wrote:

    > the question is: could it be that C mixes locals with params? I would
    > expect to have a and b and then c but they are mixed up and b is
    > after/ahead of EBP. Thanks in advance!
    >
    >

    You're passing b as a short. What I suspect is that it's being passed as
    16 bits, then transferred to a 32 bit variable for efficiency reasons.

    But the way to be sure is to compile to assembly. Debuggers don't necessarily
    tell the unvarnished truth, because what they're doing is treating machine
    code as something that can be viewed at source level.
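Whatever the calling convention does with a 16-bit argument, the language itself converts b to long before the subtraction (integer promotion followed by the usual arithmetic conversions), so the result does not depend on how b was stored. A minimal sketch of that point; widen_and_subtract is a name made up for this illustration, not something from the original post:

```c
#include <assert.h>

/* Hypothetical helper with the same signature as the thread's call_me(). */
long widen_and_subtract(long a, short b)
{
    /* In "a - b" the short operand is promoted to int and then converted
       to long, so the expression has the size of a long, not of a short. */
    assert(sizeof(a - b) == sizeof(long));
    return a - b;
}
```

Calling widen_and_subtract(100000L, 3) subtracts in long arithmetic, whether the compiler passed b in a 16-bit slot, a 32-bit slot, or a register.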
    Malcolm McLean, Jul 11, 2013
    #2

  3. Eric Sosman Guest

    On 7/10/2013 7:19 PM, wrote:
    > hi,
    >
    > here is some rather simple code with a complex question ...
    >
    > the code - called from main:
    >
    > int long call_me(int long a, int short b) {
    > int long c;
    > c = 0;
    > c = c + a;
    > c = c - b;
    > return c;
    > }
    >
    > when debugging this I see that the stack has this order (low to high)
    >
    > 1. a (EBP + 8)
    > 2. Return address (EBP +4)
    > 3. EBP
    > 4. c (EBP -4)
    > 5. b (EBP - 20) /* some debugger stuff between b and c)
    >
    > the question is: could it be that C mixes locals with params?


    Yes.

    As a quite common case, all three of a,b,c might reside
    in registers and have no memory addresses at all.

    But the main point is this: The code mentions three
    variables a,b,c and performs assorted operations on them.
    The definition of the C language specifies what results the
    operations must produce (barring things like overflow), and
    *any* stratagem the implementation adopts is okay so long as
    it produces those results.
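As a rough illustration of that freedom: the two functions here must return the same value for every input that does not overflow, so a compiler may treat the first as if it were the second and never give c a memory location. call_me_folded and check_forms_agree are names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* The function as posted: a local c is updated step by step. */
long call_me(long a, short b)
{
    long c;
    c = 0;
    c = c + a;
    c = c - b;
    return c;
}

/* One form a compiler may reduce it to: no local object is needed. */
long call_me_folded(long a, short b)
{
    return a - b;
}

/* Compare both forms over a few sample inputs (none of them overflow). */
void check_forms_agree(void)
{
    static const long  as[] = { 0L, 1L, -7L, 100000L };
    static const short bs[] = { 0, 5, -3, 32767 };
    size_t i, j;

    for (i = 0; i < sizeof as / sizeof as[0]; i++)
        for (j = 0; j < sizeof bs / sizeof bs[0]; j++)
            assert(call_me(as[i], bs[j]) == call_me_folded(as[i], bs[j]));
}
```

Nothing in the observable behaviour tells the two apart, which is all the language definition cares about.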

    > I would expect [...]


    ... things you shouldn't.

    --
    Eric Sosman
    Eric Sosman, Jul 11, 2013
    #3
  4. James Kuyper Guest

    On 07/10/2013 07:19 PM, wrote:
    > hi,
    >
    > here is some rather simple code with a complex question ...
    >
    > the code - called from main:
    >
    > int long call_me(int long a, int short b) {
    > int long c;
    > c = 0;
    > c = c + a;
    > c = c - b;
    > return c;
    > }
    >
    > when debugging this I see that the stack has this order (low to high)
    >
    > 1. a (EBP + 8)
    > 2. Return address (EBP +4)
    > 3. EBP
    > 4. c (EBP -4)
    > 5. b (EBP - 20) /* some debugger stuff between b and c)
    >
    > the question is: could it be that C mixes locals with params? I would expect to have a and b and then c but they are mixed up and b is after/ahead of EBP. Thanks in advance!


    The standard only specifies how C code must behave; it doesn't specify
    the details about how the compiler arranges for that behavior to occur.
    Different compilers for different platforms arrange it in different
    ways. In particular, the standard specifies nothing about how memory for
    function parameters and local variables is allocated, other than the
    fact that the lifetime of both the function parameters and variables
    local to the outermost block of the function ends when the function
    returns. That makes it reasonable (but not necessary) for all of those
    variables to be stored in the same general location, but the standard
    says nothing to suggest what order they might be in.

    Even if your expectations about the locations of those variables had
    been correct, they would only have been correct for a particular
    platform and a particular compiler - you couldn't count on them being
    correct in any other context.
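If you want to see what one particular compiler did, a small probe like the following records where a, b and c live during a call. It is only a sketch: frame_probe, probe_call_me and print_probe are made-up names, uintptr_t is assumed to be available, and taking the addresses forces the variables into memory, which can itself change the layout. The only thing the code relies on is that distinct live objects have distinct addresses; it assumes nothing about their order.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Addresses observed during one call, plus the computed result. */
struct frame_probe {
    uintptr_t addr_a, addr_b, addr_c;
    long result;
};

/* Same arithmetic as the posted call_me(), with the addresses recorded. */
struct frame_probe probe_call_me(long a, short b)
{
    struct frame_probe p;
    long c = 0;

    c = c + a;
    c = c - b;

    /* Guaranteed: a and c are different objects, so their addresses differ.
       Not guaranteed: which one is higher, or how far apart they are. */
    assert((void *)&a != (void *)&c);

    p.addr_a = (uintptr_t)(void *)&a;
    p.addr_b = (uintptr_t)(void *)&b;
    p.addr_c = (uintptr_t)(void *)&c;
    p.result = c;
    return p;
}

/* Print the recorded addresses; the output differs between platforms. */
void print_probe(const struct frame_probe *p)
{
    printf("a at 0x%" PRIxPTR "\n", p->addr_a);
    printf("b at 0x%" PRIxPTR "\n", p->addr_b);
    printf("c at 0x%" PRIxPTR "\n", p->addr_c);
}
```

Running it at different optimisation levels, or with a different compiler, is a quick way to convince yourself that the order is not something to rely on.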
    --
    James Kuyper
    James Kuyper, Jul 11, 2013
    #4
  5. Joe Pfeiffer Guest

    writes:

    > hi,
    >
    > here is some rather simple code with a complex question ...
    >
    > the code - called from main:
    >
    > int long call_me(int long a, int short b) {
    > int long c;
    > c = 0;
    > c = c + a;
    > c = c - b;
    > return c;
    > }
    >
    > when debugging this I see that the stack has this order (low to high)
    >
    > 1. a (EBP + 8)
    > 2. Return address (EBP +4)
    > 3. EBP
    > 4. c (EBP -4)
    > 5. b (EBP - 20) /* some debugger stuff between b and c)
    >
    > the question is: could it be that C mixes locals with params? I would
    > expect to have a and b and then c but they are mixed up and b is
    > after/ahead of EBP. Thanks in advance!


    Others have commented on this in the context of C (and pointed out,
    rightly, that it would be legal); as I look at it from the context of
    ia32 it looks really weird. Do you know how high an optimization level
    is being used? Without enough optimization turned on to start passing
    parameters in registers or something, it pretty much has to push b, then
    push a, then perform a call pushing the return address, and then reserve
    space for the old EBP and c.
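That sequence can be replayed on a toy stack to see which EBP-relative offsets fall out of it. This is a sketch under stated assumptions, not real machine state: 4-byte stack slots, a downward-growing stack, b widened to 32 bits when pushed, and a made-up placeholder value for the return address. All toy_* names are invented here.

```c
#include <assert.h>
#include <stdint.h>

enum { WORD = 4, STACK_BYTES = 64 };

/* Toy model of a downward-growing 32-bit stack. */
struct toy_stack {
    uint32_t mem[STACK_BYTES / WORD];
    int esp;   /* byte offset of the current top of stack */
    int ebp;   /* byte offset of the frame base */
};

void toy_push(struct toy_stack *s, uint32_t value)
{
    assert(s->esp >= WORD);              /* the toy stack must not overflow */
    s->esp -= WORD;
    s->mem[s->esp / WORD] = value;
}

/* Read the word at EBP + offset; offset may be negative. */
uint32_t toy_load(const struct toy_stack *s, int offset)
{
    return s->mem[(s->ebp + offset) / WORD];
}

/* Replay an unoptimised, stack-based call of call_me(a, b). */
void toy_call(struct toy_stack *s, int32_t a, int16_t b)
{
    s->esp = STACK_BYTES;
    s->ebp = 0;
    toy_push(s, (uint32_t)(int32_t)b);   /* caller: push b, widened */
    toy_push(s, (uint32_t)a);            /* caller: push a */
    toy_push(s, 0x08048000u);            /* call: placeholder return address */
    toy_push(s, (uint32_t)s->ebp);       /* callee: push ebp */
    s->ebp = s->esp;                     /* callee: mov ebp, esp */
    s->esp -= WORD;                      /* callee: reserve c at EBP - 4 */
    s->mem[(s->ebp - WORD) / WORD] = (uint32_t)(a - b);   /* c = a - b */
}
```

Under these assumptions a ends up at EBP + 8, b at EBP + 12, the return address at EBP + 4 and c at EBP - 4, so a debugger that reports b at EBP - 20 is showing something other than the pushed argument.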
    Joe Pfeiffer, Jul 11, 2013
    #5
  6. Joe Pfeiffer <> wrote:
    > writes:


    (snip)
    >> int long call_me(int long a, int short b) {
    >> int long c;
    >> c = 0;
    >> c = c + a;
    >> c = c - b;
    >> return c;


    (snip)

    >> 1. a (EBP + 8)
    >> 2. Return address (EBP +4)
    >> 3. EBP
    >> 4. c (EBP -4)
    >> 5. b (EBP - 20) /* some debugger stuff between b and c)


    (snip)
    > Others have commented on this in the context of C (and pointed out,
    > rightly, that it would be legal); as I look at it from the context of
    > ia32 it looks really weird. Do you know how high an optimization level
    > is being used?


    Compilers are doing a lot of optimization these days, including
    inlining and tail call optimization. (Well, not the latter for this.)

    > Without enough optimization turned on to start passing parameters
    > in registers or something, it pretty much has to push b, then
    > push a, then perform a call pushing the return address, and
    > then reserve space for the old EBP and c.


    If you inline it, though, and/or pass parameters in registers
    it could easily do something like that.

    -- glen
    glen herrmannsfeldt, Jul 11, 2013
    #6
  7. Guest

    Thank you all for the explanation. This really helped a lot already.

    For me to understand it completely:

    >> If you inline it, though, and/or pass parameters in registers
    >> it could easily do something like that.


    If b is stored inside a register - where does the system store this information (how does the system know where to find b)?

    Meanwhile I am starting to think that Eclipse has a bug here.
    I went through the whole stack frame and b is in fact where I expected it: right at the beginning of the stack frame, at EBP + 0xC. Only the location Eclipse shows for the variable is a "wrong" address, and the value stored at that address, 0x08040001, looks like garbage.

    If someone is interested - this is what it looks like: https://www.evernote.com/shard/s16/...2761053e9252/56a1d31c6c418c19f07df4ac9678f7d2

    Again thanks a lot.

    Daniel Khan
    , Jul 11, 2013
    #7
  8. On Thu, 11 Jul 2013 00:08:17 -0700, datenpunk wrote:

    > Thank you all for the explanation. This really helped a lot already.
    >
    > For me to understand it completely:
    >
    >>> If you inline it, though, and/or pass parameters in registers it could
    >>> easily do something like that.

    >
    > If b is stored inside a register - where does the system store this
    > information (how does the system know where to find b)?


    Inside a function, the compiler just has to remember which variable is
    currently stored in which register.
    For passing function arguments, both the caller and the callee have to
    follow the same convention. Such a convention can be that the parameters
    are pushed onto the stack from right to left. But equally valid
    conventions are that the parameters are pushed onto the stack from left
    to right or that the first 5 parameters are passed in registers r4 to r9
    and that the return address is stored in r15.
    Depending on the convention for passing function arguments, the compiler
    knows where to look for the second argument to the function.

    >
    > Daniel Khan


    Bart v Ingen Schenau
    Bart van Ingen Schenau, Jul 11, 2013
    #8
  9. Guest

    On Thursday, 11 July 2013 12:54:44 UTC+2, Bart van Ingen Schenau wrote:

    > > If b is stored inside a register - where does the system store this
    > > information (how does the system know where to find b)?


    > Inside a function, the compiler just has to remember which variable is
    > currently stored in which register.


    Thanks a lot - that makes sense.

    Daniel Khan
    , Jul 11, 2013
    #9
  10. Noob Guest

    Noob, Jul 11, 2013
    #10
  11. Guest

    On Thursday, 11 July 2013 13:35:51 UTC+2, Noob wrote:

    > Please note that this entire discussion is off-topic here.
    > comp.compilers and comp.lang.asm.x86 might be good places
    > to ask these questions.
    >


    Thank you.
    , Jul 11, 2013
    #11
  12. James Kuyper Guest

    On 07/11/2013 03:08 AM, wrote:
    > Thank you all for the explanation. This really helped a lot already.
    >
    > For me to understand it completely:
    >
    >>> If you inline it, though, and/or pass parameters in registers
    >>> it could easily do something like that.

    >
    > If b is stored inside a register - where does the system store this information (how does the system know where to find b)?


    That information is stored in your program itself, in the form of
    instructions to retrieve the value of b from the appropriate register.
    The compiler needed to keep track of the location of b when generating
    those instructions, but the instructions themselves are all that is left
    of that information, once the code has been generated.

    > Meanwhile I am thinking that eclipse has a bug here.

    I'll let someone who knows something about eclipse respond to that issue.
    --
    James Kuyper
    James Kuyper, Jul 11, 2013
    #12
  13. Noob <root@127.0.0.1> writes:
    > Daniel wrote:
    >> Thank you all for the explanation. This really helped a lot already.

    >
    > Please note that this entire discussion is off-topic here.


    Some of it, but certainly not all of it. It's illuminating some
    important points about how C is defined, particularly that (unlike
    assembly language) a C program defines behavior, not machine code.

    A question is not a bad one just because the answer is no.

    >> If b is stored inside a register - where does the system store this
    >> information (how does the system know where to find b)?

    >
    > comp.compilers and comp.lang.asm.x86 might be good places
    > to ask these questions.
    >
    > See also
    >
    > https://en.wikipedia.org/wiki/X86_calling_conventions
    > https://en.wikipedia.org/wiki/Application_binary_interface


    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Jul 11, 2013
    #13
  14. Bart van Ingen Schenau <> wrote:

    (snip)
    >>>> If you inline it, though, and/or pass parameters in registers
    >>>> it could easily do something like that.


    (snip)
    >> If b is stored inside a register - where does the system store this
    >> information (how does the system know where to find b)?


    > Inside a function, the compiler just has to remember which variable
    > is currently stored in which register.
    > For passing function arguments, both the caller and the callee have
    > to follow the same convention. Such a convention can be that
    > the parameters are pushed onto the stack from right to left.
    > But equally valid conventions are that the parameters are pushed
    > onto the stack from left to right


    It is a little more complicated if you allow for varargs.

    As I understand it, ANSI C allows for a different convention for
    varargs and non-varargs, but the systems I know (not all that
    many) use the same convention. Pushing an unknown number of
    arguments right to left means that the callee can find the
    leftmost argument(s) without knowing how many there are.
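On the C side this is what <stdarg.h> depends on: the called function reads its leftmost, fixed argument first and learns from it how many more to fetch. A minimal sketch, with sum_ints as a made-up example function:

```c
#include <assert.h>
#include <stdarg.h>

/* Sum `count` int arguments; `count` is the leftmost, fixed argument. */
long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;
    int i;

    assert(count >= 0);
    va_start(ap, count);          /* start right after the last fixed argument */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int); /* fetch the next variable argument as int */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) only works because the function can locate count without knowing the total number of arguments in advance.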

    The early 808x compilers (for Pascal and Fortran) used a convention
    where the arguments are pushed left to right, and the called routine
    pops them off the stack with a special form of RET. As that is not
    compatible with varargs, different conventions were used when C
    compilers appeared, where the calling routine pops the arguments.
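The difference between the two push orders shows up as soon as you compute where an argument lands. This sketch assumes a 32-bit frame like the one in this thread (saved EBP at EBP + 0, return address at EBP + 4, one 4-byte slot per argument); real 808x Pascal or Fortran frames differed in detail, and both function names are invented for the illustration.

```c
#include <assert.h>

enum { WORD = 4, FIRST_ARG = 8 };   /* EBP+0: saved EBP, EBP+4: return address */

/* Pushed right to left: argument 0 is pushed last, so it always sits
   at EBP + 8, however many arguments follow it. */
int arg_offset_right_to_left(int index, int nargs)
{
    assert(index >= 0 && index < nargs);
    return FIRST_ARG + WORD * index;
}

/* Pushed left to right: the last argument is nearest the return address,
   so the position of argument 0 depends on the argument count. */
int arg_offset_left_to_right(int index, int nargs)
{
    assert(index >= 0 && index < nargs);
    return FIRST_ARG + WORD * (nargs - 1 - index);
}
```

With right-to-left pushing the first argument is at EBP + 8 for one argument or for three; with left-to-right pushing it moves from EBP + 8 to EBP + 16, which a varargs callee has no way of knowing.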

    > or that the first 5 parameters are passed in registers r4 to r9


    Many systems that pass in registers still save stack space for them.

    > and that the return address is stored in r15.


    No, the return address is always in R14!

    BALR R14,R15

    > Depending on the convention for passing function arguments,
    > the compiler knows where to look for the second argument to
    > the function.


    -- glen
    glen herrmannsfeldt, Jul 11, 2013
    #14
  15. BartC Guest

    <> wrote in message
    news:...

    > int long call_me(int long a, int short b) {
    > int long c;
    > c = 0;
    > c = c + a;
    > c = c - b;
    > return c;
    > }
    >
    > when debugging this I see that the stack has this order (low to high)
    >
    > 1. a (EBP + 8)
    > 2. Return address (EBP +4)
    > 3. EBP
    > 4. c (EBP -4)
    > 5. b (EBP - 20) /* some debugger stuff between b and c)
    >
    > the question is: could it be that C mixes locals with params? I would
    > expect to have a and b and then c but they are mixed up and b is
    > after/ahead of EBP. Thanks in advance!


    *I* would expect, with a simple-minded non-optimising compiler, that the
    order on the stack might be, low to high:

    c, return address, old frame pointer, a, b (with b widened to the machine
    word size).

    However, this is not really predictable, so it's unwise to rely on any
    particular ordering.

    --
    Bartc
    BartC, Jul 11, 2013
    #15
  16. Lew Pitcher Guest

    On Thursday 11 July 2013 13:35, in comp.lang.c, wrote:

    > Bart van Ingen Schenau <> wrote:
    >
    > (snip)
    >>>>> If you inline it, though, and/or pass parameters in registers
    >>>>> it could easily do something like that.

    >
    > (snip)
    >>> If b is stored inside a register - where does the system store this
    >>> information (how does the system know where to find b)?

    >
    >> Inside a function, the compiler just has to remember which variable
    >> is currently stored in which register.
    >> For passing function arguments, both the caller and the callee have
    >> to follow the same convention. Such a convention can be that
    >> the parameters are pushed onto the stack from right to left.
    >> But equally valid conventions are that the parameters are pushed
    >> onto the stack from left to right

    >
    > It is a little more complicated if you allow for varargs.
    >
    > As I understand it, ANSI C allows for a different convention for
    > varargs and non-varargs, but the systems I know (not all that
    > many) use the same convention. Pushing an unknown number of
    > arguments right to left means that the caller can find the left
    > most argument(s) without knowing how many there are.
    >
    > The early 808x compilers (for Pascal and Fortran) used a convention
    > where the arguments are pushed left to right, and the called routine
    > pops them off the stack with a special form of RET. As that is not
    > compatible with varargs, different conventions were used when C
    > compilers appeared, where the calling routine pops the arguments.
    >
    >> or that the first 5 parameters are passed in registers r4 to r9

    >
    > Many systems that pass in registers still save stack space for them.
    >
    >> and that the return address is stored in r15.

    >
    > No, the return address is always in R14!
    >
    > BALR R14,R15


    Which loaded the address of the next sequential instruction into R14, and
    then branched to the address held in R15. Alternatively, with a named
    entrypoint, you would code
    BAL 14,ENTRYPOINT
    (where ENTRYPOINT was the symbolic name or external name of the instruction
    that started the subroutine).

    To get back to the (previous) next sequential instruction, you would execute
    a
    BR 14
    which would effectively branch to the address held in R14

    Of course, for proper linkage, the subroutine would save the "callers"
    registers on entry (that is, as the first part of the logic pointed to by
    R15), and establish its own base register and SAVEAREA. On exit, the
    subroutine would restore all the saved registers immediately prior to the
    BR 14. The address of the save/restore area (aka the SAVEAREA) was always
    pointed to by R13

    So, the caller would do something like...
             LA    13,SA1
             ...
             BAL   14,DOIT            CALL DOIT SUBROUTINE
             ...
    SA1      DS    18F

    and the callee would do something like
    DOIT     STM   14,12,12(13)       SAVE CALLER REGS EXCEPT R13 IN HIS SA
             BALR  12,0               LOAD NSI ADDRESS INTO R12
             USING *,12               R12 NOW OUR BASE REG
             ST    13,SA2+4           SAVE CALLERS R13 IN OUR SAVEAREA
             LA    13,SA2             R13 IS NOW OUR SAVEAREA
             ...
             L     13,SA2+4           R13 POINTS TO CALLER SAVEAREA
             LM    14,12,12(13)       RESTORE CALLERS REGISTERS
             BR    14                 RETURN TO CALLER
    SA2      DS    18F


    FWIW, many s360/370/390/etc apps stored the caller's arguments in space
    following the BAL or BALR that invoked the callee. Thus, the caller might
    look like
             LA    13,SA1
             ...
             BAL   14,DOIT            CALL DOIT SUBROUTINE
    ARG1     DS    F
    ARG2     DS    F
    ARG3     DS    H
             DS    H
    ARG4     DS    CL3
             ...
    SA1      DS    18F

    and the callee would access these parameters as offsets from the caller's
    R14. Of course, this meant that the callee had to adjust the caller's R14
    to account for the arguments /prior/ to performing the BR 14.

    >> Depending on the convention for passing function arguments,
    >> the compiler knows where to look for the second argument to
    >> the function.

    >
    > -- glen


    --
    Lew Pitcher
    "In Skills, We Trust"
    Lew Pitcher, Jul 11, 2013
    #16
  17. Lew Pitcher <> wrote:

    (snip, someone wrote)
    >>>> If b is stored inside a register - where does the system store this
    >>>> information (how does the system know where to find b)?


    (snip)
    >>> and that the return address is stored in r15.


    (then I wrote)
    >> No, the return address is always in R14!


    >> BALR R14,R15


    > Which loaded the address of the next sequential instruction
    > into R14, and then branched to the address held in R15.
    > Alternatively, with a named entrypoint, you would code
    > BAL 14,ENTRYPOINT
    > (where ENTRYPOINT was the symbolic name or external name of
    > the instruction that started the subroutine).


    For external routines, you load R15 from an address constant.
    For internal ones, you could do that.

    > To get back to the (previous) next sequential instruction,
    > you would execute a
    > BR 14
    > which would effectively branch to the address held in R14


    Not only that, but there is an IBM utility program named IEFBR14.

    In its original implementation it contained just one instruction,
    but later an SR 15,15 was added such that the return code would
    be zero.

    > Of course, for proper linkage, the subroutine would save the "callers"
    > registers on entry (that is, as the first part of the logic pointed to by
    > R15), and establish it's own base register and SAVEAREA. On exit, the
    > subroutine would restore all the saved registers immediately prior to the
    > BR 14. The address of the save/restore area (aka the SAVEAREA) was always
    > pointed to by R13


    Leaf routines don't need to provide a save area, but do need to
    save registers.

    > So, the caller would do something like...
    > LA 13,SA1
    > ...
    > BAL 14,DOIT CALL DOIT SUBROUTINE
    > ...
    > SA1 DS 18F


    > and the callee would do something like
    > DOIT STM 14,12,12(13) SAVE CALLER REGS EXCEPT R13 IN HIS SA
    > BALR 12,0 LOAD NSI ADDRESS INTO R12
    > USING *,12 R12 NOW OUR BASE REG
    > ST 13,SA2+4 SAVE CALLERS R13 IN OUR SAVEAREA
    > LA 13,SA2 R13 IS NOW OUR SAVEAREA
    > ...
    > L 13,SA2+4 R13 POINTS TO CALLER SAVEAREA
    > LM 14,12,12(13) RESTORE CALLERS REGISTERS
    > BR 14 RETURN TO CALLER
    > SA2 DS 18F


    For those used to a stack, this should be a double linked list.
    To do that, you instead load the new save area address into a
    register other than 13, save that in the previous save area,
    then copy to R13.

    > FWIW, many s360/370/390/etc apps stored the caller's arguments in space
    > following the BAL or BALR that invoked the callee. Thus, the caller might
    > look like
    > LA 13,SA1
    > ...
    > BAL 14,DOIT CALL DOIT SUBROUTINE
    > ARG1 DS F
    > ARG2 DS F
    > ARG3 DS H
    > DS H
    > ARG3 DS CL3
    > ...
    > SA1 DS 18F


    > and the callee would access these parameters as offsets from the caller's
    > R14. Of course, this meant that the callee had to adjust the caller's R14
    > to account for the arguments /prior/ to performing the BR 14.


    Some might have done that, but again it won't work for varargs.
    Ones I remember, which I believe includes many system macros,
    use BAL 1,AROUND to load the address into R1 while branching
    around the arguments. R1 is the usual argument list register.

    But for internal subroutines you could do that.

    And, in case anyone noticed, to allow for recursion you must
    dynamically allocate the new save area. Most IBM utilities and
    compiled Fortran code used static allocation and static save areas.
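For readers more at home in C, the chained save areas can be modelled as a doubly linked list that is allocated per call, which is what makes recursion safe. This is a loose sketch only: struct save_area, chain_depth and factorial are names made up here, the real save area is 18 fullwords with 32-bit chain pointers rather than C pointers, and nothing in this code touches actual registers.

```c
#include <assert.h>
#include <stdlib.h>

/* Loose model of an 18-word save area. */
struct save_area {
    unsigned long reserved;       /* word 0 */
    struct save_area *back;       /* word 1: caller's save area */
    struct save_area *forward;    /* word 2: callee's save area */
    unsigned long regs[15];       /* words 3..17: R14, R15, R0..R12 */
};

/* Count the save areas from `sa` back to the outermost caller. */
int chain_depth(const struct save_area *sa)
{
    int depth = 0;
    while (sa != NULL) {
        depth++;
        sa = sa->back;
    }
    return depth;
}

/* Recursive factorial; every activation allocates its own save area and
   links it into the chain. *deepest records the longest chain seen. */
unsigned long factorial(unsigned n, struct save_area *caller, int *deepest)
{
    struct save_area *mine;
    unsigned long result;
    int depth;

    assert(deepest != NULL);
    mine = calloc(1, sizeof *mine);
    if (mine == NULL)
        abort();                      /* out of memory: give up */

    mine->back = caller;              /* back chain to the caller */
    if (caller != NULL)
        caller->forward = mine;       /* forward chain from the caller */

    depth = chain_depth(mine);
    if (depth > *deepest)
        *deepest = depth;

    result = (n <= 1) ? 1UL : n * factorial(n - 1, mine, deepest);

    if (caller != NULL)
        caller->forward = NULL;       /* unlink before releasing */
    free(mine);
    return result;
}
```

With a single static save area per routine, the second activation would overwrite the registers saved by the first, which is why statically allocated save areas rule out recursion.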

    >>> Depending on the convention for passing function arguments,
    >>> the compiler knows where to look for the second argument to
    >>> the function.


    -- glen
    glen herrmannsfeldt, Jul 11, 2013
    #17
