Stack space, global space, heap space

Shuo Xiang

Greetings:

I know that variables declared on a stack definitely do not reside in heap
space so there is only a very limited amount of stuff that you can store in
the stack of a function. But what about the global space? Is that the same
as the heap space or is that still a form of special stack? Because once
I've seen someone declare a 1 megabyte char array in global space like this:

char s[1000000];

and it worked.


Regards,

Shuo Xiang
 
Eric Sosman

Shuo said:
Greetings:

I know that variables declared on a stack definitely do not reside in heap
space so there is only a very limited amount of stuff that you can store in
the stack of a function. But what about the global space? Is that the same
as the heap space or is that still a form of special stack? Because once
I've seen someone declare a 1 megabyte char array in global space like this:

char s[1000000];

and it worked.

First, the C answer. C has no "stack," no "heap," and no
"global space." C supports data objects with three kinds of
"storage duration," namely, automatic, dynamic, and static.
The details of where these objects are created and how their
lifetimes are managed are entirely up to the implementation; the
C Standard specifies how the objects behave, but says nothing
about how the behavior is achieved. The implementation may (or
may not) impose different size limitations on objects of the
three different storage durations; again, the C Standard is
silent on what form such strictures might take.
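As a rough illustration of the three storage durations, here is a minimal sketch. The names count_call and make_counter are invented for this example, and note that the Standard's own term for what is called "dynamic" above is "allocated" storage duration. Nothing here says where an implementation actually puts the objects.

```c
#include <stdlib.h>

/* Static storage duration: exists for the whole run of the program
   and starts out as zero. */
static int calls_so_far;

/* Returns how many times it has been called. The count survives
   between calls because calls_so_far has static storage duration. */
int count_call(void)
{
    int step = 1;            /* automatic: created afresh on every call */
    calls_so_far += step;
    return calls_so_far;
}

/* Allocated storage duration: the object lives from malloc() until
   free(), regardless of which function is executing.
   Returns NULL if the allocation fails. */
int *make_counter(int start)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = start;
    return p;
}
```

The caller owns whatever make_counter returns and is expected to pass it to free() when done.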

Second, the practical answer. Many C implementations use
a stack for automatic-duration variables, and many implementations
impose stricter size limits for stack-resident objects than for
other kinds of objects. The details of these limits (including
whether and how they can be adjusted) are implementation-specific,
and can only be answered by consulting the implementation's
documentation and/or gurus.
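One common way to stay clear of such limits is to give a large object static or allocated storage instead of automatic storage. The sketch below assumes the one-million-byte size from the question; BIG_SIZE, fill_static and fill_allocated are hypothetical names. An allocation failure can at least be detected at run time, whereas an oversized static object typically fails at build or load time, if at all.

```c
#include <stdlib.h>
#include <string.h>

#define BIG_SIZE 1000000u    /* same size as the array in the question */

/* Static storage duration, like the "global" array in the question. */
static char big_static[BIG_SIZE];

/* Fills the static array with value and returns its last byte. */
char fill_static(char value)
{
    memset(big_static, value, sizeof big_static);
    return big_static[BIG_SIZE - 1];
}

/* Does the same with allocated storage. Stores the last byte in *last
   and returns 0, or returns -1 if malloc() cannot supply the memory. */
int fill_allocated(char value, char *last)
{
    char *buf = malloc(BIG_SIZE);
    if (buf == NULL)
        return -1;
    memset(buf, value, BIG_SIZE);
    *last = buf[BIG_SIZE - 1];
    free(buf);
    return 0;
}
```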

Finally, note that a C implementation is not required to be
able to support a one-megabyte array at all, regardless of what
storage duration is used. Many implementations do in fact handle
objects of this size and larger, but the Standard does not require
"minimalist" C to be able to do so.
 
Chris Dollin

Shuo said:
I know that variables declared on a stack definitely do not reside in
heap space

You can't declare variables "on a stack" in C. The best you can do is make
them [non-static] function-locals, which *allows* them to be implemented on
a stack but certainly doesn't require it. [They could, for example, be
implemented in a garbage-collected heap. This might be mad, but it is
possible.]

so there is only a very limited amount of stuff that you can
store in the stack of a function.

Why? Stacks can be as big as the implementation likes. Caution suggests we
don't overdo it, since we know some popular implementations are a little ...
weedy ... but (unless the Standard says otherwise) that's a property of
implementations, not of the language.

Because once I've seen someone declare a 1 megabyte char array in global
space like this:

char s[1000000];

and it worked.

Machine must have had enough room for it, then.
 
Malcolm

Chris Dollin said:
Why? Stacks can be as big as the implementation likes. Caution suggests
we don't overdo it, since we know some popular implementations are a
little ... weedy ... but (unless the Standard says otherwise) that's a property
of implementations, not of the language.

There are however good reasons for providing small stacks. Firstly stack
usage increases only as the logarithm of the size of a structured program.
Secondly, most processors implement cache schemes and if the stack is small
enough to ensure that it is always held in the cache, there is likely to be
a performance improvement. Thirdly, it is unusual to know the size of a big
data item at compile time, and pre-C99 C doesn't allow variable size stack
arrays.
 
E. Robert Tisdale

Shuo said:
I know that variables declared on a stack

You probably meant *automatic storage*.

definitely do not reside in heap space

You probably meant *free storage*.

so there is only a very limited amount of stuff
that you can store in the stack of a function.

You probably meant *local storage*.

But what about the global space?
Is that the same as the heap space?
or is that still a form of special stack? Because once
I've seen someone declare a 1 megabyte char array in global space
like this:

char s[1000000];

and it worked.

So does this:
> cat stack.c
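int main(int argc, char* argv[]) {
    const int n = 1000000;
    /* n is not a constant expression in C, so s is a C99
       variable length array with automatic storage duration */
    unsigned char s[n];
    for (int j = 0; j < n; ++j) {
        s[j] = j%256;
    }
    return 0;
}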
> gcc -Wall -std=c99 -pedantic -O2 -o stack stack.c

Your program stack size can be set by the operating system
through a [UNIX] shell command (limit for [t]csh)
or a compile time option appropriate for your compiler.
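On POSIX systems, the limit that the shell's limit (csh) or ulimit -s (Bourne-style shells) command adjusts can also be read from inside a program with getrlimit(). The following is a minimal sketch that assumes a POSIX environment, so it is not Standard C; stack_limits is a made-up name.

```c
#include <sys/resource.h>    /* POSIX header, not part of Standard C */

/* Stores the soft and hard stack limits, in bytes, in *soft and *hard.
   Returns 0 on success, -1 if getrlimit() fails.
   A stored value equal to RLIM_INFINITY means "no limit". */
int stack_limits(unsigned long long *soft, unsigned long long *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft = (unsigned long long)rl.rlim_cur;
    *hard = (unsigned long long)rl.rlim_max;
    return 0;
}
```

The soft limit is the one currently in force; a process may raise it up to the hard limit.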

In the *typical implementation*, virtual memory looks like

00000000 text segment
.
.
.
XXXXXXXX data segment
.
.
.
ZZZZZZZZ free storage
.
.
.
FFFFFFFF bottom of stack

Your code is stored in the *text segment*.
The *data segment* extends from the end of the text segment
to the bottom of virtual memory.
Global and static variables and constants
may be embedded in the text segment,
placed in the data segment immediately after the text segment
or placed at the bottom of the stack.
The program stack grows up toward free storage
and the free store grows down toward the top of the stack.
The typical implementation of automatic storage is the program stack.
The typical implementation of free storage is a *free list*.
The term *heap* comes from the whimsical name that IBM programmers
gave to their original implementation of a free list.

The amount of "stuff" that you can put on the stack
is limited only by the virtual memory space available
between the top of the stack and the current extent of the free list.
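To see where a particular implementation places things, one can record the address of one object of each storage duration and compare them. This is only a sketch with invented names (address_snapshot, take_snapshot); it assumes the optional type uintptr_t is available, and the numbers it produces are specific to the implementation and may change from run to run.

```c
#include <stdint.h>
#include <stdlib.h>

static int static_object = 42;    /* static storage duration */

/* One recorded address per storage duration. */
struct address_snapshot {
    uintptr_t static_addr;
    uintptr_t allocated_addr;
    uintptr_t automatic_addr;
};

/* Fills *out with the addresses of a static, an allocated and an
   automatic object. Returns 0 on success, -1 if malloc() fails. */
int take_snapshot(struct address_snapshot *out)
{
    int automatic_object = 0;
    int *allocated_object = malloc(sizeof *allocated_object);

    if (allocated_object == NULL)
        return -1;
    out->static_addr = (uintptr_t)(void *)&static_object;
    out->allocated_addr = (uintptr_t)(void *)allocated_object;
    out->automatic_addr = (uintptr_t)(void *)&automatic_object;
    free(allocated_object);
    return 0;
}
```

The values can be printed with the PRIxPTR macro from <inttypes.h>; whether they fall in the order the diagram suggests depends entirely on the platform.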
 
Eric Sosman

Malcolm said:
There are however good reasons for providing small stacks. Firstly stack
usage increases only as the logarithm of the size of a structured program.
Secondly, most processors implement cache schemes and if the stack is small
enough to ensure that it is always held in the cache, there is likely to be
a performance improvement. Thirdly, it is unusual to know the size of a big
data item at compile time, and pre-C99 C doesn't allow variable size stack
arrays.

Troll? Or merely tripe? None of the three points
seems to make any sense at all:

- "Stack grows as the logarithm of program size"
First, there's no hint of how "program size" is
to be measured. Lines of code? Value of some
fundamental parameter (e.g., number of items to
sort)? Either way, it's dead easy to point to
plenty of existing counterexamples.

Of course, the counterexamples might be dismissed
on the grounds of not being "structured," but no
definition of "structured" is evident. Perhaps we
should say that "a structured program is one whose
stack size grows as the logarithm of its own size,"
but then the whole argument degenerates to tautology.

- "Stack is small so as to fit in a cache"
Balderdash. Cache friendliness (which involves far
more than mere size, by the way) is just as important
for code and for non-stack data as for stack-resident
data, so this argument leads to no special criterion
for stack size that wouldn't apply to everything else.

- "Stack is small because VLAs are new in C99"
Nonsense. VLAs have been with us for years and years,
albeit not in Standard C. But where did anybody get
the notion that machines and their operating systems
are designed solely with C in mind? Fortran, anyone?
Pascal? C with extensions like alloca()? This
argument isn't right; it's not even wrong.

Size limitations on stack-resident data (if either the stack
or limitations on it exist) are entirely implementation-dependent,
and are imposed (or not) for reasons the implementor considers
important. What those reasons might be are of no concern to the
C programmer, and the pseudo-reasons advanced above are simply
useless. Or worse.
 
Malcolm

Eric Sosman said:
- "Stack grows as the logarithm of program size"
First, there's no hint of how "program size" is
to be measured. Lines of code? Value of some
fundamental parameter (e.g., number of items to
sort)? Either way, it's dead easy to point to
plenty of existing counterexamples.

Of course, the counterexamples might be dismissed
on the grounds of not being "structured," but no
definition of "structured" is evident. Perhaps we
should say that "a structured program is one whose
stack size grows as the logarithm of its own size,"
but then the whole argument degenerates to tautology.

A structured program consists of a roughly balanced call tree of functions.
As it grows, the tree becomes deeper rather than the functions growing more
complicated.
It follows that such a program will use stack space proportionate to the
logarithm of its size (number of lines of code).
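The difference between a balanced call tree and the kind of counterexample mentioned earlier can be measured with a small experiment. The sketch below is illustrative only: the walkers do no real work, n stands in for "program size", and every name in it is hypothetical.

```c
/* Call-depth bookkeeping shared by both walkers. */
static unsigned long depth_now, depth_max;

static void enter(void) { if (++depth_now > depth_max) depth_max = depth_now; }
static void leave(void) { --depth_now; }

/* Balanced call tree: each call splits its n units of work in two. */
static void visit_balanced(unsigned long n)
{
    enter();
    if (n > 1) {
        visit_balanced(n / 2);
        visit_balanced(n - n / 2);
    }
    leave();
}

/* Chain-shaped call tree: each call peels off a single unit of work. */
static void visit_chain(unsigned long n)
{
    enter();
    if (n > 1)
        visit_chain(n - 1);
    leave();
}

/* Runs one walker on n units and reports the deepest nesting reached. */
unsigned long deepest_call(void (*visit)(unsigned long), unsigned long n)
{
    depth_now = depth_max = 0;
    visit(n);
    return depth_max;
}
```

With n = 1024 the balanced walker peaks at a depth of 11 calls and the chain walker at 1024, so the logarithmic figure holds only when the call tree really is balanced.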
- "Stack is small so as to fit in a cache"
Balderdash. Cache friendliness (which involves far
more than mere size, by the way) is just as important
for code and for non-stack data as for stack-resident
data, so this argument leads to no special criterion
for stack size that wouldn't apply to everything else.

No it isn't, because the stack data tends to be used more intensively.
Take a totally typical line of code

for(i=0;i<N;i++)
    employee[i].wage += 1000.0;

The array employee is probably on the heap. If N is large and the cache
small it is probably inevitable that there will be some misses. However if i
and N aren't in the cache, then we are really in trouble.

- "Stack is small because VLAs are new in C99"
Nonsense. VLAs have been with us for years and years,
albeit not in Standard C. But where did anybody get
the notion that machines and their operating systems
are designed solely with C in mind? Fortran, anyone?
Pascal? C with extensions like alloca()? This
argument isn't right; it's not even wrong.

The stack is something imposed by the compiler, not the OS. An OS might make
it easy and natural to implement, say, an 8K stack using a special register,
but if you want to implement a larger stack you can do so.

Size limitations on stack-resident data (if either the stack
or limitations on it exist) are entirely implementation-dependent,
and are imposed (or not) for reasons the implementor considers
important. What those reasons might be are of no concern to the
C programmer, and the pseudo-reasons advanced above are simply
useless. Or worse.

If you understand why the stack might be small, you will remember that
stacks are small more easily. Also, when a technological change comes along
and makes those reasons irrelevant, you will find it easier to adapt to the
new environment, because you have some idea what is going on rather than
just parroting "stack size is implementation dependent and often small".
 
John Devereux

Malcolm said:
Take a totally typical line of code

for(i=0;i<N;i++)
    employee[i].wage += 1000.0;

The array employee is probably on the heap. If N is large and the cache
small it is probably inevitable that there will be some misses. However if i
and N aren't in the cache, then we are really in trouble.


Surely even if i and N are not in the cache at first, they will be as
soon as the loop is started. Since N is large, the delay caused by one
cache miss would be insignificant, wouldn't it?
 
Malcolm

John Devereux said:
Surely even if i and N are not in the cache at first, they will be as
soon as the loop is started. Since N is large, the delay caused by one
cache miss would be insignificant, wouldn't it?

Generally, yes, though not on some platforms (the Sony PlayStation, for
example, has a special reserved area of memory that has fast access;
variables aren't promoted to it on read/write).

The point is that it is important for stack variables to be in the cache - i,
N and the pointer "employee" are all on the stack and are accessed on every
cycle, whilst the "wage" variables are each accessed only once.

It would be a pretty puny cache that failed to keep all three variables in
cache, however they were distributed in memory. However make the loop more
complicated, maybe calling a subroutine to calculate the wage increase
rather than giving 1000.0 to everybody, and you could soon get a situation
where i, N, and employee are in the cache when they are close together in
memory, but not kept in when they are widely separated.
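For reference, the two loops under discussion might look like this when written out in full. This is a sketch only: the record layout, the function names and the raise rule in wage_increase are all invented for illustration. In both functions i, n and the pointer employee are automatic objects touched on every iteration, while each wage is touched once.

```c
#include <stddef.h>

struct employee_record {
    double wage;
};

/* Hypothetical rule for the "more complicated" case. */
double wage_increase(double wage)
{
    return wage < 2500.0 ? 1000.0 : 500.0;
}

/* The simple loop: everybody gets the same amount. */
void raise_flat(struct employee_record *employee, size_t n, double amount)
{
    size_t i;

    for (i = 0; i < n; i++)
        employee[i].wage += amount;
}

/* The more complicated loop: a subroutine decides each increase. */
void raise_computed(struct employee_record *employee, size_t n,
                    double (*rule)(double))
{
    size_t i;

    for (i = 0; i < n; i++)
        employee[i].wage += rule(employee[i].wage);
}
```

Whether the loop variables stay in registers, in cache, or anywhere near each other in memory is up to the compiler and the hardware; the source code does not decide it.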
 
John Devereux

Malcolm said:
Generally, yes, though not on some platforms (the Sony PlayStation, for
example, has a special reserved area of memory that has fast access;
variables aren't promoted to it on read/write).

OK, although I would call that "special reserved area of memory that
has fast access"; it's not really a cache as I understand the term.
You would indeed probably put the stack there...
 
Bryan Bullard

the heap and "global data section" locations are determined by the host
system's process or task memory map. for instance, on both linux and windows
running on the intel chip (ia-32), each process has a 4 GB virtual address
space provided by the hardware memory paging feature (similar on other archs
and systems). many host systems will actually let you have many heaps per
process or stacks per thread or task. as for the global data section,
its location is platform dependent but generally follows the code
section, and it is also where static data is stored. note that this location
is typically made known to the linker at link time via some type of script.
also, there is generally no difference between global and static data at run
time. the distinction is enforced only at compile time. if you really want to
see what's going on, disassemble your linked executable image.

-bryan
 
