Variables allocated on stack

Bogdan

Hi

I was asked this question during an interview: which variables are
allocated on the stack in a function? I am unsure how the output
variable is handled. Is it also allocated on the stack and popped
when the function exits?
Here is an example:

char *some_function(char c)
{
    char *out = malloc(10);
    return out;
}

Are there one char and two pointers to char on the local stack?

Bogdan
 
Ian Collins

The question was bogus: either it was a trick, or the interviewer had a
rather myopic viewpoint.

There isn't a correct answer. On some machines (that do have a stack),
nothing will be on the stack. Both c and out will be in registers.

Then there are machines that don't have a stack...
 
jacob navia

On 24/11/11 22:28, Bogdan wrote:
[... question and example as in the first post ...]

Conceptually, a function has a stack that holds argument values
and local variables. In your example the "c" argument and the
"out" pointer are in that stack.

The return value is normally in a register, but it can also
be understood as part of that stack.

In your example, a register would hold the value of the "out" pointer
after the execution of the function.

Today on a normal x86-64 based PC (Linux, Macintosh or Windows) the first
parts of the execution stack are held in registers, which ones varying
with the calling conventions of the operating system in question.

Under 64-bit Windows, the first 4 arguments are held in registers; under
Linux or Macintosh the first 6 integer arguments go into registers, when
possible.

Do not forget that some arguments never go into registers because
they are too big for them. For instance, when you have a 476-byte
structure and you pass it by value, the value goes onto the stack
even if maybe the first 128 bytes would fit into a series of registers.
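
A minimal sketch of what passing such a structure by value means at the C level. The names `struct big`, `sum_and_clobber` and `caller_copy_untouched` are made up for illustration. C only guarantees that the callee gets its own copy; whether that copy sits on a stack is up to the ABI and the compiler.

```c
#include <stddef.h>
#include <string.h>

/* 476 bytes, the size used in the post above; the name is illustrative. */
struct big {
    unsigned char bytes[476];
};

/* The callee works on its own copy of the argument. Where that copy lives
   (stack slot, caller-provided memory) is decided by the ABI, not by C. */
unsigned sum_and_clobber(struct big b)
{
    unsigned sum = 0;
    for (size_t k = 0; k < sizeof b.bytes; k++) {
        sum += b.bytes[k];
        b.bytes[k] = 0;          /* changes only the callee's copy */
    }
    return sum;
}

/* Returns 1 if the caller's object is unchanged after the call. */
int caller_copy_untouched(void)
{
    struct big original;
    memset(original.bytes, 1, sizeof original.bytes);
    unsigned sum = sum_and_clobber(original);
    return sum == 476u
        && original.bytes[0] == 1
        && original.bytes[475] == 1;
}
```

Calling `caller_copy_untouched()` from a test program should return 1, since the callee zeroes only its private copy.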

Depending on the specific machine architecture, some types have a
preferred set of registers where they are passed to functions, instead
of the general purpose registers. Specifically, double/long double values
go into the xmm or the FPU registers (on an x86 CPU) or into the FP unit
(PowerPC) etc.

The point to retain is that C implies (because it allows
recursive functions) a stack-like organization of the activation frames
of each function.
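
A small sketch of why recursion forces one activation record per live call. The names `frames`, `descend` and `activations_are_separate` are invented for this example. Each recursive call has its own `local` object, and all of them are alive at the same time at the deepest point, so they cannot share storage. The sketch says nothing about where the implementation keeps those records.

```c
#include <stddef.h>

#define MAX_DEPTH 8

/* Addresses of the `local` object of each live activation (illustrative). */
static const volatile int *frames[MAX_DEPTH];

static int descend(int depth, int limit)
{
    volatile int local = depth;   /* address-taken, so it must live in memory */
    int ok = 1;

    frames[depth] = &local;

    if (depth + 1 < limit) {
        ok = descend(depth + 1, limit);
    } else {
        /* Deepest call: every recorded object is still alive here. */
        for (int i = 0; i <= depth; i++) {
            if (*frames[i] != i)
                ok = 0;           /* an activation lost its own value */
            for (int j = i + 1; j <= depth; j++)
                if (frames[i] == frames[j])
                    ok = 0;       /* two activations shared storage */
        }
    }

    frames[depth] = NULL;         /* do not leave a dangling pointer behind */
    return ok;
}

/* Returns 1 if `limit` nested activations each had a separate `local`. */
int activations_are_separate(int limit)
{
    if (limit < 1 || limit > MAX_DEPTH)
        return 0;
    return descend(0, limit);
}
```

`activations_are_separate(5)` should return 1 on any conforming implementation, because distinct live objects must have distinct addresses.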
 
Jens Thoms Toerring

Bogdan said:
[... question and example as in the first post ...]

This can actually be a clever question, at least if it's meant
to test your understanding on several levels (assuming that the
person asking it is aware of that).

The correct answer to the question (unless further qualified)
is, of course, that it's impossible to answer. The C standard
doesn't mandate the existence of a "stack", "heap" etc. (none
of these words appear in the standard at all). It is left to
the discretion of the compiler where it gets the memory it
needs.

On the other hand, on all machines I am familiar with, compilers
use a stack and a heap. And (non-static) local variables like
'out' are allocated (if necessary) on the stack (while
allocations with malloc() are made from the heap).
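
To make the distinction concrete, here is a hedged sketch built on the function from the question. The helper `block_survives_return` and the fill pattern are additions for illustration only. The pointer variable `out` stops existing when the function returns, its value is handed back to the caller, and the block obtained from malloc() stays valid until free() is called.

```c
#include <stdlib.h>
#include <string.h>

/* Same shape as the function in the question. `out` is an automatic
   variable; the block it points to has allocated storage duration. */
char *some_function(char c)
{
    char *out = malloc(10);
    if (out != NULL) {
        memset(out, c, 9);   /* fill so the caller can check the contents */
        out[9] = '\0';
    }
    return out;              /* the pointer value is copied to the caller */
}

/* Hypothetical helper: returns 1 if the block is intact after the call. */
int block_survives_return(void)
{
    char *p = some_function('x');
    if (p == NULL)
        return 0;            /* allocation failed, nothing to check */
    int ok = (strlen(p) == 9 && p[0] == 'x' && p[8] == 'x');
    free(p);
    return ok;
}
```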

Things get even more interesting since there is a chance
that a local variable might not actually reside in memory at
all. E.g. in this function the compiler (at least when allowed
to do a bit of optimization) will quite likely not need room
on the stack for 'out' but will be able to get along with a CPU
register - the function could easily be rewritten as

char * some_function( char c )
{
    return malloc( 10 );
}

and then there's no 'out' variable left.

The next question is how values are passed between the caller
and the function. In the simplest case (again, on machines one
typically encounters - remember that there is no requirement
that a stack exists at all) arguments and return values are
put on the stack. But if there aren't too many of them,
compilers often pass them in CPU registers. Thus for this
function there is a good chance that neither 'c' nor 'out'
ever makes it to the stack and instead they just reside in CPU
registers (which, again, might depend on the settings for
the compiler; google for e.g. "calling convention").

So, except for the first statement that the question can't
be answered, it all depends on how the compiler works (and
the architecture the program is compiled for). It can only
be determined for sure by inspecting the assembler code the
compiler generates. But that, of course, is then not a
question about C but about the way a certain compiler
implements what is required by the C standard (though it's
also an interesting question that a competent C programmer
is not unlikely to have at least spent a bit of time
wondering about).

Regards, Jens
 
Malcolm McLean

A modern compiler would probably just optimise that function to a call
to malloc().

However the abstract C compiler needs a stack for the function calls,
and a stack for variables. When you call a function, the address of
the instruction to return to in caller is pushed onto the call stack.
Then the parameters are pushed onto the variable stack (in this case,
c). Then the local variables are pushed onto the variable stack (in
this case, out). The function then does its calculations, the variable
stack is popped, the return address is popped off the call stack and
the function returns. So what happens to the return value? Almost
always, it is returned in a register.

However pushing and popping stacks is not very efficient. So compilers
use registers as much as possible. On most processors, the first four
parameters will be passed in registers. Also, the call stack can be
the same as the variable stack.
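
The abstract model above can be sketched as a toy simulation. Everything here is illustrative: the two arrays, the `push`/`pop` helpers, the label 1234 standing in for a return address, and the simulated function `add_one` are all made up, and no real compiler is required to work this way. There is no overflow checking, to keep the sketch short.

```c
#include <stddef.h>

#define STACK_MAX 64

/* Two explicit stacks, as in the abstract model described above. */
static int call_stack[STACK_MAX];   /* return "addresses" (plain labels here) */
static int var_stack[STACK_MAX];    /* parameters and local variables */
static size_t call_top, var_top;
static int return_register;         /* models the register holding the result */

static void push(int *stack, size_t *top, int v) { stack[(*top)++] = v; }
static int  pop(int *stack, size_t *top)         { return stack[--(*top)]; }

/* Models a call to: int add_one(int x) { int tmp = x + 1; return tmp; } */
int simulate_add_one(int arg)
{
    push(call_stack, &call_top, 1234);   /* return address, arbitrary label */
    push(var_stack, &var_top, arg);      /* parameter x */
    push(var_stack, &var_top, 0);        /* local variable tmp */

    var_stack[var_top - 1] = var_stack[var_top - 2] + 1;   /* tmp = x + 1 */
    return_register = var_stack[var_top - 1];              /* return tmp */

    (void)pop(var_stack, &var_top);      /* discard tmp */
    (void)pop(var_stack, &var_top);      /* discard x */
    (void)pop(call_stack, &call_top);    /* "jump" back to the caller */
    return return_register;
}

/* Number of entries still on either stack; 0 once every call has returned. */
size_t stacks_in_use(void)
{
    return call_top + var_top;
}
```

After `simulate_add_one(41)` returns 42, `stacks_in_use()` is back to 0: the parameter and the local are gone, and only the value in the "register" survives.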
 
BartC

Malcolm McLean said:
A modern compiler would probably just optimise that function to a call
to malloc().

However the abstract C compiler needs a stack for the function calls,
and a stack for variables. When you call a function, the address of
the instruction to return to in caller is pushed onto the call stack.
Then the parameters are pushed onto the variable stack (in this case,
c). Then the local variables are pushed onto the variable stack (in
this case, out). The function then does its calculations, the variable
stack is popped, the return address is popped off the call stack and
the function returns. So what happens to the return value? Almost
always, it is returned in a register.

If you're going to talk about abstract call and variable stacks, then why
not put the return value onto the abstract variable stack too?

However pushing and popping stacks is not very efficient. So compilers
use registers as much as possible. On most processors, the first four
parameters will be passed in registers.

That's fine until the registers are needed for something else or you need to
make another function call. Then they need to be saved, perhaps by pushing
onto the stack, then popping again. Except this might be done a million
times within the function, instead of just once.

Also, the call stack can be
the same as the variable stack.

Some compilers have an option to display the code they generate. Then it is
possible to see some actual code.
 
jacob navia

On 25/11/11 12:05, BartC wrote:
That's fine until the registers are needed for something else or you
need to make another function call. Then they need to be saved, perhaps
by pushing onto the stack, then popping again. Except this might be done
a million times within the function, instead of just once.

Unlikely. In most cases the compiler knows if the function calls
another function and will save the registers at the start
in the function prologue...
 
BartC

jacob navia said:
On 25/11/11 12:05, BartC wrote:

Unlikely. In most cases the compiler knows if the function calls
another function and will save the registers at the start
in the function prologue...

Fair enough. In that case it's just another way to push those parameters,
via registers, leaving it to the callee to do the actual pushing. With the
advantage that the callee will know when it is possible to avoid pushing
onto the stack.

The disadvantage for the caller might be a little more effort, for the
compiler, in evaluating complex parameters and ensuring those registers
already loaded are preserved. For example:

fn(f(p,q,r), g(s), h(t,u,v), i());

Here, purely stack-based parameters would be far simpler.
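
A sketch of what the compiler has to arrange for such a call. The bodies of `f`, `g`, `h`, `i` and `fn` are invented stand-ins. Each inner result must be kept somewhere safe (a preserved register or a stack slot) while the remaining arguments are evaluated; the explicit temporaries below show one possible schedule. Note that C leaves the order of evaluation of function arguments unspecified, so left-to-right is only one choice.

```c
/* Illustrative stand-ins for the functions in the example above. */
static int f(int p, int q, int r) { return p + q + r; }
static int g(int s)               { return s * 2; }
static int h(int t, int u, int v) { return t * u * v; }
static int i(void)                { return 7; }
static int fn(int a, int b, int c, int d) { return a - b + c - d; }

/* The call as written in the source. */
int nested(void)
{
    return fn(f(1, 2, 3), g(4), h(5, 6, 7), i());
}

/* One possible schedule: every intermediate result gets a named home
   that must survive the calls that follow it. */
int spelled_out(void)
{
    int t1 = f(1, 2, 3);
    int t2 = g(4);        /* t1 must survive this call */
    int t3 = h(5, 6, 7);  /* t1 and t2 must survive this call */
    int t4 = i();         /* t1, t2 and t3 must survive this call */
    return fn(t1, t2, t3, t4);
}
```

Because the stand-in functions have no side effects, both versions produce the same value whatever order the compiler picks.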
 
jacob navia

On 25/11/11 12:40, BartC wrote:
Fair enough. In that case it's just another way to push those
parameters, via registers, leaving it to the callee to do the actual
pushing. With the advantage that the callee will know when it is
possible to avoid pushing onto the stack.

The disadvantage for the caller might be a little more effort, for the
compiler, in evaluating complex parameters and ensuring those registers
already loaded are preserved. For example:

fn(f(p,q,r), g(s), h(t,u,v), i());

Here, purely stack-based parameters would be far simpler.

Yes, I agree with you 100%. Having implemented Linux's calling
conventions, where you have no fewer than 6 integer registers that
can receive parameters, those calls with embedded calls in the argument
evaluation are a plain nightmare...

For me.

But in the big picture, where the compiler writer's trouble is something
that nobody cares about, this speeds up the code in MOST cases,
which is the important point anyway.
 
Seebs

Fair enough. In that case it's just another way to push those parameters,
via registers, leaving it to the callee to do the actual pushing. With the
advantage that the callee will know when it is possible to avoid pushing
onto the stack.

Yes.

The disadvantage for the caller might be a little more effort, for the
compiler, in evaluating complex parameters and ensuring those registers
already loaded are preserved. For example:
fn(f(p,q,r), g(s), h(t,u,v), i());
Here, purely stack-based parameters would be far simpler.

Right.

But you don't always get a vote -- the ABI may have specified the rules
for you. And in fact, often has.

-s
 
Seebs

I had during an interview this question: which variables are
allocated on the stack in a function?

That is a very poor question for an interviewer to use.

I am unsure how the output
variable is handled. Is this one too allocated on the stack and popped
when the function exists ?

Maybe. Often, the answer is "no".

char* some_function(char c)
{
    char *out = malloc(10);
    return out;
}
On the local stack are there one char and 2 pointers to chars ?

There may or may not be a "local stack" in any meaningful sense. Since
we know that this function is functionally a leaf node (it never calls
anything that could possibly recurse to it), it would be quite reasonable
for a compiler to use a calling convention based on register values.
Since some ABIs use registers by convention for return values, and the
argument is ignored, the function might well consist entirely of shoving
an immediate 10 into a register and calling malloc, which would stash its
return value in the same register this function would have used, and then
there's nothing more to do.

-s
 
Bogdan

A modern compiler would probably just optimise that function to a call
to malloc().

However the abstract C compiler needs a stack for the function calls,
and a stack for variables. When you call a function, the address of
the instruction to return to in caller is pushed onto the call stack.
Then the parameters are pushed onto the variable stack (in this case,
c). Then the local variables are pushed onto the variable stack (in
this case, out). The function then does its calculations, the variable
stack is popped, the return address is popped off the call stack and
the function returns. So what happens to the return value? Almost
always, it is returned in a register.

However pushing and popping stacks is not very efficient. So compilers
use registers as much as possible. On most processors, the first four
parameters will be passed in registers. Also, the call stack can be
the same as the variable stack.


I guess that it depends on the size of the output variable. When we need
to return a structure, I guess that the variable stack should always be
used.
 
BartC

Seebs said:
Right.

But you don't always get a vote -- the ABI may have specified the rules
for you. And in fact, often has.

Surely compiler writers can do what they like - within their own
implementation? Calling conventions are only important when calling foreign
and OS functions, and being called from outside.
 
Eric Sosman

[... function linkage mechanisms ...]

I guess that it depends on the output variable size. When we need to
return a structure I guess that the variable stack should be always
used.

On one compiler I've used, struct-valued functions were handled
rather differently. The function was rewritten like a void function
with an extra, invisible argument pointing to a temporary struct that
would receive the value. That is, you'd write

struct foo func(int bar) {
    struct foo result;
    ...
    return result;
}

struct foo baz = func(42);

... and the compiler would rewrite to something like

void func(struct foo* _output, int bar) {
    struct foo result;
    ...
    *_output = result;
}

struct foo _temp;
struct foo baz;
func(&_temp, 42);
baz = _temp;

(Why use _temp instead of just func(&baz, 42)? Because the
compiler had to guard against the possibility that func() might
use baz directly -- as a global, say, or via another pointer.)

The lesson here is that words like "always" are risky in this
connection. Platforms differ, and present implementors with
differing challenges, and those challenges are met with different
strategies. This is probably why the Standard says so very little
about such matters: To give the implementors maximum freedom to
exploit the strengths and avoid the weaknesses of the platforms.
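
A runnable sketch of the rewrite described above, done by hand. The members of `struct foo`, the body of `func`, and the names `func_rewritten` and `same_result` are assumptions made for illustration; the original post elides them. This is not what any particular compiler emits, only the same idea expressed in source form.

```c
/* Illustrative layout; the original post does not show the members. */
struct foo {
    int a;
    int b;
    double c;
};

/* What the programmer writes: a function returning a struct by value. */
struct foo func(int bar)
{
    struct foo result;
    result.a = bar;
    result.b = bar * 2;
    result.c = bar / 2.0;
    return result;
}

/* Hand-written equivalent of the described rewrite: the result is
   delivered through an extra pointer argument instead of a return value. */
void func_rewritten(struct foo *output, int bar)
{
    struct foo result;
    result.a = bar;
    result.b = bar * 2;
    result.c = bar / 2.0;
    *output = result;
}

/* Returns 1 if both forms deliver the same value to the caller. */
int same_result(int bar)
{
    struct foo direct = func(bar);

    struct foo temp;            /* the temporary from the post */
    struct foo via_pointer;
    func_rewritten(&temp, bar);
    via_pointer = temp;

    return direct.a == via_pointer.a
        && direct.b == via_pointer.b
        && direct.c == via_pointer.c;
}
```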
 
Markus Wichmann

Surely compiler writers can do what they like - within their own
implementation? Calling conventions are only important when calling foreign
and OS functions, and being called from outside.

Yeah, that's right!

Only... every function not defined "static" may be called "from the
outside". Every call to a function only declared (and not qualified
"static") but not defined may be to a foreign function.

Plus, when you ask a compiler writer whether he wants to include
_two_different_ calling conventions in his compiler, with the added
maintenance and stuff, guess what he'll say. Writing a register
allocator is hard enough without having to differentiate internal and
external functions and function calls.

CYA,
Markus
 
BartC

Markus Wichmann said:
Yeah, that's right!

Only... every function not defined "static" may be called "from the
outside". Every call to a function only declared (and not qualified
"static") but not defined may be to a foreign function.

If no special attributes are defined, then it is just the private calling
convention used by this compiler. Although it might have a few other
conventions up its sleeve, for example when creating exported functions
accessed via dynamic library by other languages, or calling the OS.

I'm used anyway to declaring in my non-C language:

clang function puts (istring)
windows function Sleep(int)

That is, specify the calling conventions used by these imported functions
(in this case, also imported via DLL files). Although I believe Windows was
originally written in C, it uses a different call convention from actual C
programs.

Fortunately the only C functions I've had to import so far all use a
consistent call convention. So for functions shared across applications,
this would be a good idea. But should this be defined by the ABI? I
understood an ABI was mainly for interfacing to the OS; it was not
necessarily a standard for C language compilers. Or, since other languages
are available, for *every* implementation of *every* language for that
platform.

Plus, when you ask a compiler writer whether he wants to include
_two_different_ calling conventions in his compiler with the added
maintenance and stuff, guess what he'll say.

Isn't that what happens anyway, when calling a function in a different
language? (pascal, stdcall and so on.) (And I'm working right now on a
project with I think 4 different conventions to take care of. It's not that
big a deal. More difficult is matching and converting datatypes across
languages.)

Writing a register
allocator is hard enough without having to differentiate internal and
external functions and function calls.

Most code generation should be for functions following the chosen
call-convention of the compiler. This may well have been chosen to be easier
to compile for. Only actual calls to other conventions, or 'callback'
functions, might need some work.
 
Ian Collins

Isn't that what happens anyway, when calling a function in a different
language? (pascal, stdcall and so on.)

No. At least not on any of the platforms I'm familiar with. Other
languages know how to call C (so they can access the platform's services).

(And I'm working right now on a
project with I think 4 different conventions to take care of. It's not that
big a deal. More difficult is matching and converting datatypes across
languages.)

Yet again, the convention is to describe those structures according to
the platform's C ABI.
 
Stefan Ram

Ian Collins said:
No. At least not on any of the platforms I'm familiar with. Other
languages know how to call C (so they can access the platform's services).

In this case, wouldn't »know how to call the platform's ABI«
be a more appropriate wording than »know how to call C«?
 
BartC

Ian Collins said:
No. At least not on any of the platforms I'm familiar with.

So how do you call Pascal, Fortran, (or C++) etc from C on your platform? Do
you not need to tell it that that imported function is in a different
language?

Other languages know how to call C (so they can access the platform's
services).

The platform interface should be language-independent. This is where a
binary interface becomes useful. (C has done a good job here in the past, in
defining such interfaces, but there is often confusion; is 'long' 32-bits or
64-bits for example? gcc for Windows says 32, gcc for Linux says 64.)
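
A short sketch of the `long` problem and the usual workaround. The names `long_bits`, `struct wire_record` and `widths_are_fixed` are invented here. The width of `long` is whatever the platform's ABI says, while the exact-width types from `<stdint.h>` (where the implementation provides them) have the same width everywhere. Padding and byte order of the struct still depend on the platform.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width of `long` in bits on the implementation compiling this file.
   Typically 32 on 64-bit Windows and 64 on 64-bit Linux. */
size_t long_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}

/* A record meant to cross a binary interface: exact-width members
   avoid the "how big is long?" question. Illustrative only. */
struct wire_record {
    int32_t id;
    int64_t timestamp;
};

/* Returns 1 if the exact-width types have the widths their names promise. */
int widths_are_fixed(void)
{
    return sizeof(int32_t) * CHAR_BIT == 32
        && sizeof(int64_t) * CHAR_BIT == 64;
}
```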

But should a complex, register-based interface (I believe this might be for
x86-64) be also used as the standard when one C function calls another C
function in the same application?
 
Ian Collins

So how do you call Pascal, Fortran, (or C++) etc from C on your platform? Do
you not need to tell it that that imported function is in a different
language?

Fortran and C++ have standard mechanisms for C interoperability. I
don't know about Pascal, I haven't used it since the 80s...

The platform interface should be language-independent. This is where a
binary interface becomes useful. (C has done a good job here in past, in
defining such interfaces, but there is often confusion; is 'long' 32-bits or
64-bits for example? gcc for Windows says 32, gcc for Linux says 64.)

The sizes of types are defined by the platform's ABI. When in Rome and
all that...

But should a complex, register-based interface (I believe this might be for
x86-64) be also used as the standard when one C function calls another C
function in the same application?

Why not? The AMD64 calling convention offers significant performance
advantages.
 
