Why does an array's address on the stack change on every execution?

CICAP

Why does this simple program, compiled with gcc and run on Linux kernel
2.6.32, show different behavior on every execution?

int main (int argc, char **argv)
{
int a[10];
printf("%d\n", a[1000]);
return 0;
}

execution 1: 1668312366
execution 2: 0
execution 3: segmentation fault
execution 4: 0

etc...
 
Denis McMahon

Why does this simple program, compiled with gcc and run on Linux kernel
2.6.32, show different behavior on every execution?

int main (int argc, char **argv)
{
int a[10];
printf("%d\n", a[1000]);
return 0;
}

You wrote a program that deliberately tries to break things, and you
want to know why you get inconsistent behaviour?

Lots of phrases come to mind, none of them particularly complimentary to
your cognitive abilities.

Rgds

Denis McMahon
 
robertwessel2

Why does this simple program, compiled with gcc and run on Linux kernel
2.6.32, show different behavior on every execution?

int main (int argc, char **argv)
{
  int a[10];
  printf("%d\n", a[1000]);
  return 0;

}

execution 1: 1668312366
execution 2: 0
execution 3: segmentation fault
execution 4: 0

etc...


Referencing past the end of an array results in undefined behavior.
Anything can happen.

Assuming the common Linux/GCC implementations, and typical downward
growing stack, you're attempting to display the contents of the memory
typically past the start of the stack. And there you are entirely at the
mercy of whatever pages may or may not be allocated there, and what
they've been filled with.

And you're *not* displaying any addresses; as written, the code
displays the contents of the 1001st element of the array "a" (which,
of course, doesn't actually exist).
 
CICAP

Why does this simple program, compiled with gcc and run on Linux kernel
2.6.32, show different behavior on every execution?

int main (int argc, char **argv)
{
  int a[10];
  printf("%d\n", a[1000]);
  return 0;
}

execution 1: 1668312366
execution 2: 0
execution 3: segmentation fault
execution 4: 0

Referencing past the end of an array results in undefined behavior.
Anything can happen.

I know that. I was wondering because on some systems the behavior is
always the same and on others it is not.

Assuming the common Linux/GCC implementations, and typical downward
growing stack, you're attempting to display the contents of the memory
typically past the start of the stack.  And there you are entirely at the
mercy of whatever pages may or may not be allocated there, and what
they've been filled with.

And you're *not* displaying any addresses; as written, the code
displays the contents of the 1001st element of the array "a" (which,
of course, doesn't actually exist).

I finally discovered that it was caused by ASLR. With ASLR (stack
randomization) disabled, the behavior is always the same.
Do you know how stack randomization works?
 
robertwessel2

Why does this simple program, compiled with gcc and run on Linux kernel
2.6.32, show different behavior on every execution?
int main (int argc, char **argv)
{
  int a[10];
  printf("%d\n", a[1000]);
  return 0;
}
execution 1: 1668312366
execution 2: 0
execution 3: segmentation fault
execution 4: 0
etc...
Referencing past the end of an array results in undefined behavior.
Anything can happen.

I know that. I was wondering because on some systems the behavior is
always the same and on others it is not.

Assuming the common Linux/GCC implementations, and typical downward
growing stack, you're attempting to display the contents of the memory
typically past the start of the stack.  And there you are entirely at the
mercy of whatever pages may or may not be allocated there, and what
they've been filled with.

And you're *not* displaying any addresses; as written, the code
displays the contents of the 1001st element of the array "a" (which,
of course, doesn't actually exist).

I finally discovered that it was caused by ASLR. With ASLR (stack
randomization) disabled, the behavior is always the same.
Do you know how stack randomization works?


You know, 15 seconds with Google would have answered this for you...

But basically ASLR randomizes where things are loaded and allocated in
memory. The details are implementation specific, but commonly it
causes executables to be loaded at different locations as well as
stacks to be allocated at different locations. Often the OS memory
blocks passed to the C heap manager are subject to some randomization
as well. The idea is to make the typical stack smashing (or buffer
overflow) attacks ineffective, since it becomes very difficult to both
inject code and generate a branch to that code, or to generate a
branch to an abusable routine in the program, since those addresses
are now unpredictable.

This is unlike the traditional non-ASLR case, where your program is
always loaded at (say) 0x10000, and the first stack consistently starts
just below 0x80000000 (or wherever).

In any event, this is all OT for this group.
 
CICAP

But basically ASLR randomizes where things are loaded and allocated in
memory.

That is what I thought from reading Wikipedia etc. But what I am
experiencing in Linux is strange, and I would like to know how it is
possible that the stack pointer is randomized INSIDE the stack. In your
opinion (the last question because, as you said, I am OT), is the
following stack scheme possible?

Execution 1: |S _ _ _ a _ |
Execution 2: |_ S _ _ _ a |

where "S" is the stack pointer and "a" is the array from the example.
 
Ben Bacarisse

CICAP said:
Why does this simple program, compiled with gcc and run on Linux kernel
2.6.32, show different behavior on every execution?

int main (int argc, char **argv)
{
int a[10];
printf("%d\n", a[1000]);
return 0;
}

execution 1: 1668312366
execution 2: 0
execution 3: segmentation fault
execution 4: 0

From the C point of view, the program can do anything at all. The C
language does define what happens when you call printf without a valid
prototype in scope, and it does define what 'a[1000]' means when 'a' is
a ten element array.

gcc has chosen to produce a set of instructions from this invalid code
and these instructions appear to do different things at different times but
all the results you see are equally "correct". You could look at the
generated instructions and study the way your version of Linux sets up
the execution environment of the program to determine exactly what the
possible range of behaviours is, but I would suggest there are better
ways to learn about C and Linux.
 
robertwessel2

That is what I thought from reading Wikipedia etc. But what I am
experiencing in Linux is strange, and I would like to know how it is
possible that the stack pointer is randomized INSIDE the stack. In your
opinion (the last question because, as you said, I am OT), is the
following stack scheme possible?

Execution 1: |S _ _ _ a _ |
Execution 2: |_ S _ _ _ a |

where "S" is the stack pointer and "a" is the array from the example.


Well, it depends on exactly which ASLR implementation you're
using, but no; what happens is that the whole stack is moved. In some
implementations that's just on a page basis, so the first word* on the
stack would always be at 0xXXXXXffc, with only the page, and the
surrounding allocations, being randomized. IOW, the 1MB of address
space for the stack is reserved as a sequential set of 256 page frames
somewhere in the address space, and the initial stack pointer is set
to the end of that. Others allocate a large region for the stack in a
fixed location, say 8MB, and start the stack at one of roughly half a
million possible locations: basically any multiple of sixteen at
least 1MB into that reserved area. Other implementations are possible
as well.

At least one *application* that I'm aware of has implemented partial
ASLR by doing a random sized alloca() at startup, which would more or
less match what you described, but I don't know of any OSes that have
implemented it that way.


* assuming a 32-bit machine with a downward-growing stack
 
Mark Bluemel

... what I am
experiencing in Linux is strange, and I would like to know how it is
possible that the stack pointer is randomized INSIDE the stack. In your
opinion (the last question because, as you said, I am OT), is the
following stack scheme possible?

Execution 1: |S _ _ _ a _ |
Execution 2: |_ S _ _ _ a |

where "S" is the stack pointer and "a" is the array from the example.

Try this thought experiment. Would my question be significantly
different if my program were written in C++, Fortran, Haskell or
brainf*ck?

As the answer is "no", you are clearly off-topic. If you can't
find the documentation for the Linux implementation of ASLR that you
are using, I suggest asking somewhere like comp.unix.programmer as a
starting point.
 
Keith Thompson

Ben Bacarisse said:
From the C point of view, the program can do anything at all. The C
language does define what happens when you call printf without a valid
prototype in scope, and it does define what 'a[1000]' means when 'a' is
a ten element array.

I think you mean "does not define" in both cases.

[...]
 
Ben Bacarisse

Keith Thompson said:
Ben Bacarisse said:
From the C point of view, the program can do anything at all. The C
language does define what happens when you call printf without a valid
prototype in scope, and it does define what 'a[1000]' means when 'a' is
a ten element array.

I think you mean "does not define" in both cases.

Yes, thanks. Fortunately "does define" is odd in this context so the
reader might be able to guess that there is a "not" missing.
 
Barry Schwarz

CICAP said:
Why does this simple program, compiled with gcc and run on Linux kernel
2.6.32, show different behavior on every execution?

int main (int argc, char **argv)
{
int a[10];
printf("%d\n", a[1000]);
return 0;
}

execution 1: 1668312366
execution 2: 0
execution 3: segmentation fault
execution 4: 0

etc...

As best I can determine, you are addressing a location approximately 32000
bytes away from one you should.

This implies that an int occupies 32 bytes. Unlikely.

Despite the hostile comments made by other posters, I believe that your
question is a sane one. However, it might best be posed to
comp.unix.programmer.

The question is sane in two ways:

a) On a virtual machine (as the x86 platform is nowadays), the addresses and
page offsets (relative to the start of a page) would presumably be the same
on every invocation of the program. The amount of memory allocated would
presumably also be the same. It is a bit unexpected that you would get a
segmentation fault sometimes but not other times.

Undefined behavior is not required to be consistent.

b) It is also a bit unexpected that you would get different values in memory
on different invocations. I don't know much about Linux internals, but I
would have thought that there would be some effort to prevent a program from
reading the memory contents left over by another program (this is a security
threat). Maybe I'm wrong and a program is responsible for sanitizing its
own memory. In any case, it is a bit unexpected.

Expecting undefined behavior to be consistent is, at best, wishful
thinking.
 
