Problems With Global Variables (memory allocations)

raashid bhatt

Hi.

Consider the Following Code.

=========================================================================
#include <stdio.h>
#include <string.h>

/* For these Variables The Linker Only Allocates 4 Bytes each in the
memory Like
DB 77 -> w
DB 00
DB 00
DB 00
DB 61 -> a
DB 00
DB 00
DB 00
*/

char a[] = "w";

char b[] = "a";

int main(int argc, char **argv)
{
    printf("%s", a);

    /*
    Copy 7 bytes, which should not cause data to be overwritten, as I have
    declared a as a char[]. 'welc' is copied to a (as a has been allocated
    4 bytes), but 'ome' gets written to b, as b is adjacent to it.
    */
    strcpy(a, "welcome");

    printf("%s", b); /* output 'ome' */

    return 0;
}

============================================================================

I tried it with nearly all compilers and all produced the same output.
My question is: why are only 4 bytes allocated for a char[], when it
should have space for an unlimited number of bytes?
 
Ben Bacarisse

raashid bhatt said:
Consider the Following Code.
char a[] = "w";

char b[] = "a";

int main(int argc, char **argv)
{
strcpy(a, "welcome");

printf("%s", b); /* output 'ome' */

return 0;
}

============================================================================

i tried it with nearly all compilers and all produced the same output.
my point is
why for a char[] only 4bytes are allocated while as it has space for
unlimited num of bytes.

All compilers should make space for two characters: the 'w' and a null
byte at the end. Any extra (in your case the extra 2 bytes) are an
accident due to alignment.

In C, arrays like a and b can't grow to accommodate what you put in
them. You have to say how big they are at the declaration (and in
C90 that size must be a constant integer expression). Writing [] is
simply shorthand that tells the compiler "reserve enough space
for the initialiser".
 
raashid bhatt

No, a has just TWO bytes allocated to it. They are a[0] and a[1].
strcpy(a, "welcome");

No. In your example code, you defined two arrays, each two bytes in size,
not four bytes.


Richard, two bytes aren't what is allocated when I disassemble the
program and look at the .data section:

DB 77 -> w = byte1
DB 00 = byte 2
DB 00 = byte 3
DB 00 = byte 4
DB 61 -> a another variable ie b
DB 00
DB 00
DB 00 and same 4 bytes for b
 
raashid bhatt

So the a array has size 2 (one for 'w' and one for '\0'), and the b array
has size 2 (one for 'a' and one for '\0').

You say the array has size 2; that is absolutely wrong.

Variable a has been allocated 4 bytes, which is the same as

char a[] = "\x77\x00\x00\x00"; /* 'w' followed by NULs */

If the array had only two bytes, then only 'we' (2 bytes) would have been
copied into a, and b would have contained 'lcome'.
...when you do that, you are no longer dealing with the rules of C, but
with the under-the-hood details of a particular implementation.

As I said, I compiled with nearly all compilers; I am not using any
particular implementation.
 
Richard Tobin

raashid bhatt said:
Richard Two bytes arent allocated when i disassemble the program
looking at .data section

DB 77 -> w = byte1
DB 00 = byte 2
DB 00 = byte 3
DB 00 = byte 4
DB 61 -> a another variable ie b

The fact that the variables are 4 bytes apart doesn't mean that 4
bytes have been allocated to the first one. 2 bytes have been
allocated, and they are followed by two unused bytes because the
compiler wants to align the second variable on a 4-byte boundary.

The other two bytes are "allocated" in the sense that they exist in
your program, but they are not allocated in the sense of being part of
the C variable.

You may find that most compilers do that (or even leave, say, 6 bytes
between the variables), but you can't rely on it.

-- Richard
 
Richard

Richard Heathfield said:
raashid bhatt said:
Hi.

Consider the Following Code.

=========================================================================
#include <stdio.h>
#include <string.h>

char a[] = "w";

char b[] = "a";

int main(int argc, char **argv)
{

printf("%s",a);
/*
Copy 7bytes which should not cause data to be overwritten as a have
declared a as a char[] 'welc' copied(as a has been allocated
4 bytes) to a but 'ome' Get written to b as b is adjacent to it*/

No, a has just TWO bytes allocated to it. They are a[0] and a[1].

You, for some reason, refuse to mention that his confusion is probably
caused by alignment issues.

Since he doesn't mention what the "same output" is, I can't comment on
that. But on any real-world system I would be surprised to see any
crashes (assuming the contents of a and b do not cause it) if there is
4-byte alignment of two consecutive globals. And a quick gcc/gdb check
confirms no crash, but interesting results.

raashid: Google "element alignment in C".
 
Herbert Rosenau

You say array has size 2 that is absolutely wrong.

No, that is absolutely correct. Your implementation

- blindly fills the whole data segment with 0, as a security
requirement, to overwrite anything the last user of that memory area
left there before it gives that area to the application.
Then it gives control to the runtime of your C application, which
- reserves 2 bytes for the first array
- initialises that array with 'w' followed by '\0'
- rounds up the address to the next 4-byte alignment
- reserves 2 bytes for the second array
- initialises that array with 'a' followed by '\0'
- calls main() to give control to the code you've written

variable a has been allocated 4 bytes in which are same as

No, it has not.
char a[] = "\x77\x00\x00\x00"
w NULL's
if the array had only two bytes then only 'we'(2bytes) had been copied
in b and b would have contained 'lcome'


as i said i compiled with nearly all compilers i am not using any
particular implementation.

No, it's you who are misinterpreting what you see in your debugger.

Then your code causes a buffer overflow by copying more than 2 bytes
into the array you named a.
That copy does not just overwrite the whole array a but writes beyond
its bound, hitting undefined behavior.

C has no bounds checking. So it blindly does what you ask it to
do: hit undefined behavior.


--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of German eComStation
eComStation 1.2R German is here!
 
Chris Torek

[we are given the source fragment:

char var1[] = "w", var2[] = "a";

where these are file-scope variables]

Richard Two bytes arent allocated when i disassemble the program
looking at .data section

DB 77 -> w = byte1
DB 00 = byte 2
DB 00 = byte 3
DB 00 = byte 4
DB 61 -> a another variable ie b
DB 00
DB 00
DB 00 and same 4 bytes for b

What you continue to miss here, which is actually very important,
is that the two-byte "gap" between variables "a" and "b" is not
actually allocated to the variable "a".

You might then reasonably ask: "what's the difference between a
two-byte zero-filled gap following a two-byte variable var1, and
four-byte zero-filled variable var1"? There are multiple answers,
but perhaps the most important one is that the compiler *may* in
the future decide to stuff some other two-byte entity into that
two-byte gap. For instance:

char var1[] = "w", var2[] = "a";
short var3 = 0x1234;

might wind up generating the equivalent of:

DB 77 [var1 occupies
DB 00 these two bytes]
DB 34 [then var3 takes
DB 12 up the next two]
DB 61 [and finally var2
DB 00 uses the last two]
DB 00 [while these two
DB 00 are reserved for later]

(or perhaps the two-byte gap after var2 will also get filled in
with more one-or-two byte variables).

Different compilers (and linkers) will do this differently at
different times, so the result may change when you upgrade the
compiler or change the optimization flags (in particular if you
optimize for size instead of speed). In general, the better the
compiler, the more likely it is to "fill in the gaps" (to avoid
wasting resources) -- so if you are seeing gaps, it means your
compiler is not as good as it could be. (But sometimes leaving
gaps -- perhaps even much bigger ones, such as aligning things
to 8 or even 32 byte boundaries -- can improve performance, so
again it may depend on whether you optimize for speed or space.)
 
Ike Naar

=========================================================================
#include <stdio.h>
#include <string.h>

/* For these Variables The Linker Only Allocates 4 Bytes each in the
memory Like
DB 77 -> w
DB 00
DB 00
DB 00
DB 61 -> a
DB 00
DB 00
DB 00
*/

char a[] = "w";

char b[] = "a";

int main(int argc, char **argv)
{

printf("%s",a);
/*
Copy 7bytes which should not cause data to be overwritten as a have
declared a
as a char[] 'welc' copied(as a has been allocated 4 bytes) to a but
'ome' Get written
to b as b is adjacent to it */

strcpy(a, "welcome");

printf("%s", b); /* output 'ome' */

return 0;
}

============================================================================

i tried it with nearly all compilers and all produced the same output.
my point is
why for a char[] only 4bytes are allocated while as it has space for
unlimited num of bytes.

If you try it with Sun C on a sparc, you will find that
two bytes, not four, are allocated for each of a and b
(in my setup, the addresses of a and b are 0x20e30 and 0x20e32, respectively).
The ``printf("%s", b);'' prints ``lcome''.

Regards,
Ike
 
Ben Bacarisse

=========================================================================
#include <stdio.h>
#include <string.h>

char a[] = "w";

char b[] = "a";
i tried it with nearly all compilers and all produced the same output.
my point is
why for a char[] only 4bytes are allocated while as it has space for
unlimited num of bytes.

If you try it with Sun C on a sparc, you will find that
two bytes, not four, are allocated for each of a and b .

This answer muddies the waters by suggesting that some compilers do
allocate four bytes to a or b. The presence of apparently unused
space near a variable does mean that extra space has been allocated
*to* it. char a[] = "w"; is and always must be a 2-byte object.
 
Ike Naar

Ike Naar said:
char a[] = "w";
char b[] = "a";
i tried it with nearly all compilers and all produced the same output.
my point is
why for a char[] only 4bytes are allocated while as it has space for
unlimited num of bytes.

If you try it with Sun C on a sparc, you will find that
two bytes, not four, are allocated for each of a and b .

This answer muddies the waters by suggesting that some compilers do
allocate four bytes to a or b. The presence of apparently unused
space near a variable does mean that extra space has been allocated
*to* it. char a[] = "w"; is and always must be a 2-byte object.

(Did you forget a "not" between "does" and "mean" ?)

You are correct: only two bytes are allocated to a itself, even if
four bytes are reserved for ``a plus padding''.
Had I read your response to Raashid Bhatt elsethread before
I posted my answer, I would probably have chosen my words more carefully.

The Sun C example was not meant to muddy the waters, but to show that
different compilers can do things differently for implementation
details that are not specified by the language.

Raashid seemed to believe that every compiler in the world would reserve
four bytes for a-with-padding. Clearly, this is not the case.
 
Ben Bacarisse

Ike Naar said:
char a[] = "w";
char b[] = "a";
i tried it with nearly all compilers and all produced the same output.
my point is
why for a char[] only 4bytes are allocated while as it has space for
unlimited num of bytes.

If you try it with Sun C on a sparc, you will find that
two bytes, not four, are allocated for each of a and b .

This answer muddies the waters by suggesting that some compilers do
allocate four bytes to a or b. The presence of apparently unused
space near a variable does mean that extra space has been allocated
*to* it. char a[] = "w"; is and always must be a 2-byte object.

(Did you forget a "not" between "does" and "mean" ?)

Grrrr... Yes, thanks.

The Sun C example was not meant to muddy the waters, but to show that
different compilers can do things differently for implementation
details that are not specified by the language.

Raashid seemed to believe that every compiler in the world would reserve
four bytes for a-with-padding. Clearly, this is not the case.

Agreed -- and that may well be more persuasive than all the logic in
the world.
 
John Bode

Hi.

Consider the Following Code.

=========================================================================
#include <stdio.h>
#include <string.h>

/* For these Variables The Linker Only Allocates 4 Bytes each in the
memory Like
DB 77 -> w
DB 00
DB 00
DB 00
DB 61 -> a
DB 00
DB 00
DB 00
*/

char a[] = "w";

char b[] = "a";

int main(int argc, char **argv)
{

printf("%s",a);
/*
Copy 7bytes which should not cause data to be overwritten as a have
declared a
as a char[] 'welc' copied(as a has been allocated 4 bytes) to a but
'ome' Get written
to b as b is adjacent to it */

strcpy(a, "welcome");

printf("%s", b); /* output 'ome' */

return 0;

}

============================================================================

i tried it with nearly all compilers and all produced the same output.
my point is
why for a char[] only 4bytes are allocated while as it has space for
unlimited num of bytes.

This is why I keep telling you to get a good C book. The declaration

char a[] = "w"

does *not* declare a to be an unlimited number of bytes wide. It
declares a to be as wide as necessary to store the contents of the
initializer, which in this case is two bytes ('w' and 0). Note that
this is the only time you can declare an array without an explicit
size. If you simply wrote

char a[]; // a is incomplete and will not be created
...
strcpy(a, "w");

that would be an error.

C does not support the concept of "unlimited" arrays. All arrays are
of a fixed, finite size. Note that you can only get away with not
explicitly sizing the array when using an initializer. The
declaration

Your compiler/linker did not *allocate* 4 bytes for a or b; each is
*allocated* 2 bytes by virtue of the language definition, but the
linker *aligned* them on 4-byte boundaries. There is a difference.
 
Kenneth Brody

[... the code: char a[]="w"; char b[]="a"; ...]
You say array has size 2 that is absolutely wrong.

No, it's absolutely true. Add this and see what you get:

size_t x = sizeof a;

[...]
as i said i compiled with nearly all compilers i am not using any
particular implementation.

Even if you tested it with literally every compiler, the answer
is still that both a[] and b[] are allocated 2 bytes each.

Also, my implementation generates:

_DATA SEGMENT
_a DB 'w', 00H
ORG $+2
_b DB 'a', 00H
_DATA ENDS

Each array has 2 bytes, and 2 bytes have been added for alignment
before the second array.

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>
 
Anand Hariharan

(...)
If you simply wrote

char a[]; // a is incomplete and will not be created ...

As a declaration (as opposed to a function parameter) like you have
shown, is this syntax even legal?

strcpy(a, "w");

that would be an error.

You seem to suggest that it is an error because of buffer overflow, not a
compilation error.


thank you for clarifying,
- Anand
 
Chris Torek

If you simply wrote
char a[]; // a is incomplete and will not be created ...

As a declaration (as opposed to a function parameter) like you have
shown, is this syntax even legal?

It is in some cases. (Syntactically, it is always valid; the
problem here is the word "legal".) Consider:

#include <stdio.h>

char arr1[];

int main(void) {
    char arr2[]; /* ERROR - diagnostic required */

    printf("arr1 = %p\n", (void *)arr1);
    printf("arr2 = %p\n", (void *)arr2);
    return 0;
}

This program contains one error that requires a diagnostic, on the
marked line. The array arr2 cannot be a tentative definition:
the line is declaring a variable with block scope, and hence no
linkage, and identifiers with no linkage cannot be tentative
definitions.

If you insert a size for arr2, however, the code will compile and
run. The array "arr1" has an incomplete type at the point where
we compute the address of its first element (the first printf()),
but it is OK to take the address of that element anyway. (One can
also take the address of the entire array with &arr1, or the
address of any element -- but see next paragraph.)

At the end of the translation unit, the tentative definition of
arr1 is turned into an actual definition. The compiler supplies
the size at that point, and the size it supplies is 1. This
makes the array have a single element, arr1[0]. No diagnostic
is required, but gcc here says:

t.c:3: warning: array `arr1' assumed to have one element

(with -W -Wall anyway) to alert you that you probably forgot to
specify the size. (Arrays of size 1 are unusual, though I know
one programmer who likes to use them instead of unary &, as in:

time_t now[1];
char *str;

time(now);
str = ctime(now);

instead of:

time_t now;
char *str;

time(&now);
str = ctime(&now);

I find this annoying, myself.)
 
Anand Hariharan

If you simply wrote
char a[]; // a is incomplete and will not be created ...

As a declaration (as opposed to a function parameter) like you have
shown, is this syntax even legal?

It is in some cases. (Syntactically, it is always valid; the problem
here is the word "legal".) Consider:

#include <stdio.h>

char arr1[];

int main(void) {
char arr2[]; /* ERROR - diagnostic required */

printf("arr1 = %p\n", (void *)arr1);
printf("arr2 = %p\n", (void *)arr2);
return 0;
}

This program contains one error that requires a diagnostic, on the
marked line. The array arr2 cannot be a tentative definition: the line
is declaring a variable with block scope, and hence no linkage, and
identifiers with no linkage cannot be tentative definitions.

If you insert a size for arr2, however, the code will compile and run.
The array "arr1" has an incomplete type at the point where we compute
the address of its first element (the first printf()), but it is OK to
take the address of that element anyway. (One can also take the address
of the entire array with &arr1, or the address of any element -- but see
next paragraph.)

At the end of the translation unit, the tentative definition of arr1 is
turned into an actual definition. The compiler supplies the size at
that point, and the size it supplies is 1. This makes the array have a
single element, arr1[0]. No diagnostic is required, but gcc here says:

t.c:3: warning: array `arr1' assumed to have one element

(with -W -Wall anyway) to alert you that you probably forgot to specify
the size.

Chris -

Sincere thanks for the patient and thorough explanation. Much
appreciated.

sincerely,
- Anand Hariharan
 
James Kuyper

Anand said:
(...)
If you simply wrote

char a[]; // a is incomplete and will not be created ...

As a declaration (as opposed to a function parameter) like you have
shown, is this syntax even legal?

It is not allowed at block scope, without an extern specifier. It is
allowed for file scope declarations with external linkage. In either
case, it declares 'a' with an incomplete type, indicating that some
other declaration will actually complete the type; this other
declaration may be in a different translation unit.

It is also allowed inside structure definitions, in which case it
declares what is called a flexible array member. However, I don't think
you're ready yet for an explanation of flexible array members.

The type of an array element must be a complete type, so the following
are NOT allowed, even at file scope:

char b[5][]; // Constraint violation! 6.7.5.2p1
char c[][]; // Constraint violation! 6.7.5.2p1

The only thing I can think of that you might want to do with an array,
that you can't do with an incomplete array, is apply sizeof to it.
There's probably something else that I've missed.
 
