memcpy() with uninitialised memory

Spiros Bousbouras · Jan 14, 2009

#include <string.h>

int main(void) {
    int a[10] , b[10] ;
    memcpy(a,b,10) ;
    return 0 ;
}

Is this undefined behaviour? If so, how does that follow from the
standard?

qarnos · Jan 14, 2009

#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;

}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

If you are having problems with that code, it's because you are
forgetting to use the sizeof operator.

#include <string.h>

int main(void) {
    int a[10], b[10];
    memcpy(a, b, sizeof(int) * 10);
    return 0;
}
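
To make the difference between the two byte counts concrete, here is a minimal sketch. The helper count_copied_bytes and the constant N are hypothetical names introduced only for this illustration. Both arrays are filled with known byte patterns first, so nothing uninitialised is read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { N = 10 };  /* number of int elements, as in the thread's example */

/* Hypothetical helper: fills dst with 0x00 bytes and src with 0xFF bytes,
   copies n bytes with memcpy, and returns how many bytes of dst changed. */
static size_t count_copied_bytes(size_t n)
{
    int dst[N], src[N];
    const unsigned char *p = (const unsigned char *)dst;
    size_t i, changed = 0;

    assert(n <= sizeof dst);        /* never copy past the arrays */
    memset(dst, 0x00, sizeof dst);
    memset(src, 0xFF, sizeof src);

    memcpy(dst, src, n);            /* n is a byte count, not an element count */

    /* Inspect the result byte by byte through unsigned char. */
    for (i = 0; i < sizeof dst; i++)
        if (p[i] == 0xFF)
            changed++;
    return changed;
}
```

With this sketch, count_copied_bytes(10) reports 10 changed bytes, while count_copied_bytes(sizeof(int) * N) reports every byte of the array. Writing the count as sizeof dst would avoid repeating the element type and length.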

Ben Pfaff · Jan 14, 2009

Spiros Bousbouras said:
#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;
}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

No, it is not undefined behavior. memcpy copies an object as an
array of unsigned char. The values of b's elements are
indeterminate, but unsigned char has no trap representation, so
their values are merely unspecified.
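
As an illustration of that reasoning, a copy written purely in terms of unsigned char might look like the sketch below. copy_as_bytes is a hypothetical name, and this is not a claim about how any particular library implements memcpy. It only shows that every read and write goes through an lvalue of type unsigned char, which is the case the trap-representation rule exempts under the C99 wording discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for memcpy: copies n bytes, one unsigned char
   at a time. The regions must not overlap, as with memcpy. */
static void *copy_as_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++)
        d[i] = s[i];   /* read and write only through unsigned char */
    return dst;
}
```

Called as copy_as_bytes(a, b, sizeof a), it never evaluates an element of b as an int, so whether the bytes of b form valid int values does not matter to the copy itself.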

Tomás Ó hÉilidhe · Jan 14, 2009

#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;

}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

This is one of the places where I get lazy and don't bother
thinking about it because I never do it. Same goes for bitwise
operations on signed integers -- I haven't bothered learning about it
because I'll never do it.

For what it's worth though, I've seen very proficient programmers
on this newsgroup do stuff like copy uninitialised arrays of unsigned
char and say it's OK, so my first guess would be that it's OK. Not
that you'd have a reason to do it, of course.

Of course, if you try to access the uninitialised data as if it
were ints, it'd be UB.

user923005 · Jan 14, 2009

#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;

}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

6.2.6 Representations of types
6.2.6.1 General

5 Certain object representations need not represent a value of the
object type. If the stored value of an object has such a
representation and is read by an lvalue expression that does not have
character type, the behavior is undefined. If such a representation is
produced by a side effect that modifies all or any part of the object
by an lvalue expression that does not have character type, the
behavior is undefined.41) Such a representation is called a trap
representation.

Footnote 41) Thus, an automatic variable can be initialized to a trap
representation without causing undefined behavior, but the value of
the variable cannot be used until a proper value is stored in it.

And I guess you meant:

/* No undefined behavior here. */
#include <string.h>

int main(void) {
    int a[10], b[10] = {0};
    memcpy(a, b, sizeof a);
    return 0;
}

Sjouke Burry · Jan 14, 2009

qarnos said:
#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;

}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

If you are having problems with that code, it's because you are
forgetting to use the sizeof operator.

int main(void) {
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10);
return 0;
}

Although you are still copying uninitialized data,
which is as useful as carrying water to the sea.

user923005 · Jan 14, 2009

Ben Pfaff said:
Spiros Bousbouras said:

#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;
}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

No, it is not undefined behavior. memcpy copies an object as an
array of unsigned char. The values of b's elements are
indeterminate, but unsigned char has no trap representation, so
their values are merely unspecified.

I see nothing about the mechanics of the copy operation here:
7.21.2 Copying functions
7.21.2.1 The memcpy function
Synopsis
1 #include <string.h>
void *memcpy(void * restrict s1,
const void * restrict s2,
size_t n);
Description
2 The memcpy function copies n characters from the object pointed to
by s2 into the object pointed to by s1. If copying takes place between
objects that overlap, the behavior is undefined.
Returns
3 The memcpy function returns the value of s1.

The reference to "n characters" only implies size.

Mechanically, most library source I know of does not actually use char
anyway. If there is some rule that says the memcpy() function must
behave as if it is moving unsigned characters, then you are right.

See, for instance:
http://www.pell.portland.or.us/~orc/Code/libc/libc-current/string/memcpy.c
http://www.koders.com/c/fidE40953362C44848125DB7B62E480E2E1675F7166.aspx?s=mdef:insert

Ben Pfaff · Jan 14, 2009

user923005 said:
I see nothing about the mechanics of the copy operation here: ....
2 The memcpy function copies n characters from the object pointed to
by s2 into the object pointed to by s1. If copying takes place between
objects that overlap, the behavior is undefined. ....
The reference to "n characters" only implies size.

Why do you think so? It seems pretty clear to me that it copies
characters, since that is what the plain text of the standard
says.

The memmove description is even more explicit about characters
being involved:

Copying takes place as if the n characters from the
object pointed to by s2 are first copied into a
temporary array of n characters that does not overlap
the objects pointed to by s1 and s2, and then the n
characters from the temporary array are copied into the
object pointed to by s1.

TC2 adds this paragraph to 7.21.1 "String function conventions":

For all functions in this subclause, each character
shall be interpreted as if it had the type unsigned
char (and therefore every possible object
representation is valid and has a different value).

This is a result of DR 274:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_274.htm

Mechanically, most library source I know of does not actually use char
anyway. If there is some rule that says the memcpy() function must
behave as if it is moving unsigned characters, then you are right.

This falls under the "as if" rule.
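
A minimal sketch of the memmove description quoted above, assuming a literal temporary buffer. memmove_via_temp is a hypothetical reference model, not any real library's code. A real memmove may copy whole words, copy backwards, or use special instructions, as long as the observable result matches this model. Unlike the real function, the sketch can fail: it returns NULL if malloc fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference model of memmove: copies the n source bytes
   into a temporary array first, then from the temporary array into the
   destination, so overlapping regions are handled correctly.
   Returns s1 on success, or NULL if the temporary array cannot be
   allocated (an error the real memmove does not have). */
static void *memmove_via_temp(void *s1, const void *s2, size_t n)
{
    unsigned char *tmp;

    assert(s1 != NULL && s2 != NULL);
    if (n == 0)
        return s1;

    tmp = malloc(n);
    if (tmp == NULL)
        return NULL;

    memcpy(tmp, s2, n);   /* source -> temporary (no overlap) */
    memcpy(s1, tmp, n);   /* temporary -> destination (no overlap) */
    free(tmp);
    return s1;
}
```

Any faster implementation only has to produce the same bytes in the destination as this model would; how it gets there is up to the implementation.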

Ben Pfaff · Jan 14, 2009

Han from China - Master Troll said:
So those three tell us that we need only bother exploring further to see
what these pesky trap representations imply for us.

7.21.1{3}:
For all functions in this subclause, each character shall be interpreted
as if it had the type unsigned char (and therefore every possible object
representation is valid and has a different value).

Does your copy of C99 have TC1 and TC2 pre-applied, then? Where
did you get it?

user923005 · Jan 14, 2009

Why do you think so? It seems pretty clear to me that it copies
characters, since that is what the plain text of the standard
says.

The memmove description is even more explicit about characters
being involved:

Copying takes place as if the n characters from the
object pointed to by s2 are first copied into a
temporary array of n characters that does not overlap
the objects pointed to by s1 and s2, and then the n
characters from the temporary array are copied into the
object pointed to by s1.

TC2 adds this paragraph to 7.21.1 "String function conventions":

For all functions in this subclause, each character
shall be interpreted as if it had the type unsigned
char (and therefore every possible object
representation is valid and has a different value).

I guess it is time for me to get a copy of TC2. The above is very
clear.
It looks like n1256.pdf has TC1 + TC2 + TC3.

user923005 · Jan 14, 2009

Does your copy of C99 have TC1 and TC2 pre-applied, then? Where
did you get it?

This document has C99 + TC1 + TC2 + TC3:
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf

Ben Pfaff · Jan 14, 2009

user923005 said:
This document:
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
Has C99 + TC1 + TC2 + TC3

Thanks.

Bartc · Jan 14, 2009

Sjouke Burry said:
qarnos said:

#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;

}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

If you are having problems with that code, it's because you are
forgetting to use the sizeof operator.

int main(void) {
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10);
return 0;
}

Although you are still copying uninitialized data,
which is as useful as carrying water to the sea.

It ensures both a and b have the same contents; this could possibly be
significant.

(Now of course someone will say this is not guaranteed by the C standard,
but that wouldn't surprise me.)

Wolfgang Draxinger · Jan 14, 2009

qarnos said:
If you are having problems with that code, it's because you are
forgetting to use the sizeof operator.

int main(void) {
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10);
return 0;
}

Since the C standard states that

sizeof(char) <= sizeof( any_other_type )

the only thing that might happen is that "too few" elements are
copied. Otherwise the code shows no undefined behaviour:

* 'a' and 'b' don't overlap
* 'a' and 'b' are allocated (automatically)

Surely the contents are uninitialized, but sometimes one might
_want_ to read out the contents of uninitialized memory (either
to initialize some entropy pool^1, or for data forensics^2).

Wolfgang Draxinger

[1]: OpenSSL does this, and the Debian folks "corrected" it away,
resulting in the Debian-OpenSSL disaster.

[2]: For example: inject some shell code into an application and call
the forensics function, exploiting knowledge about the
implementation, e.g. that this particular implementation uses a
stack, with the following code:

void foo()
{
    unsigned char test[32];
    /* do something on test */
}

void bar()
{
    unsigned char gotcha[1024];
}

void baz()
{
    foo(); /* somehow inject a call to bar after foo here */
    /* -> */ bar(); /* On certain architectures utilizing a stack,
                       like the x86, bar's 'gotcha' will now
                       contain the last contents of
                       foo's 'test' */
}

This technique is useful if you can't run a debugger on the
system, but can inject shellcode (e.g. through some exploit).

jameskuyper · Jan 14, 2009

Bartc said:
Sjouke Burry said:

qarnos wrote: ....

int main(void) {
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10);
return 0;
}

Although you are still copying uninitialized data,
which is as useful as carrying water to the sea.

Or more precisely, it's as useful as replacing sea water with other sea
water.

It ensures both a and b have the same contents; this could possibly be
significant.

(Now of course someone will say this is not guaranteed by the C standard,
but that wouldn't surprise me.)

That the contents are identical is indeed guaranteed. But I still
don't see why it would be important for a correctly written program
that two objects contain identical copies of uninitialized memory.

Ben Pfaff · Jan 14, 2009

Han from China - Master Troll said:
I'm surprised you've missed Chuck Fucking Falconer dumping the
damn link in every thread.

I usually just read threads that have very few posts. There's
rarely anything left to contribute to popular threads.

jameskuyper · Jan 14, 2009

Wolfgang Draxinger wrote:
....

Surely the contents are uninitialized, but sometimes one might
_want_ to read out the contents of uninitialized memory (either
to initialize some entropy pool^1, or for data forensics^2).

Wolfgang Draxinger

[1]: OpenSSL does this, and the Debian folks "corrected" it away,
resulting in the Debian-OpenSSL disaster.

It sounds like such code makes unwarranted assumptions about the
entropy of uninitialized memory. Let's put it this way: the compiler I
use most frequently has the "feature" that uninitialized memory is
always filled with zeros. This feature does not render it
non-conforming. Would that be a problem for such applications?

Falcon Kirtaran · Jan 14, 2009

Wolfgang Draxinger wrote:
...

Surely the contents are uninitialized, but sometimes one might
_want_ to read out the contents of uninitialized memory (either
to initialize some entropy pool^1, or for data forensics^2).

Wolfgang Draxinger

[1]: OpenSSL does this, and the Debian folks "corrected" it away,
resulting in the Debian-OpenSSL disaster.

It sounds like such code makes unwarranted assumptions about the
entropy of uninitialized memory. Let's put it this way: the compiler I
use most frequently has the "feature" that uninitialized memory is
always filled with zeros. This feature does not render it
non-conforming. Would that be a problem for such applications?

It's kind of silly to try to use that as a source of entropy in the
first place. malloc() is often implemented in such a way that
unallocated memory might contain information used for memory
allocation within the program's heap, which makes the data there much
more predictable.

Richard Tobin · Jan 14, 2009

jameskuyper said:
That the contents are identical is indeed guaranteed. But I still
don't see why it would be important for a correctly written program
that two objects contain identical copies of uninitialized memory.

Suppose a struct contains members that are not used in all cases. It
may be convenient to compare them without knowledge of which members
are used, so when making a copy you might choose to copy all the
members regardless of whether they have been initialised. Of course,
it might well be better to always initialise the data to zero.
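
A small sketch of that approach, assuming a hypothetical struct record whose extra member is only meaningful in some cases. All names here (record, record_init, record_copy, record_same) are made up for the illustration. The sketch zeroes the whole object up front, as suggested in the last sentence above, so the whole-object copy and byte-wise comparison never touch indeterminate bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record type: "extra" is only meaningful when
   has_extra is nonzero. */
struct record {
    int id;
    int has_extra;
    int extra[4];
};

/* Zero every byte first, so members that are never assigned still
   hold known contents. */
static void record_init(struct record *r, int id)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
    r->id = id;
}

/* Copy all bytes of the object, whether or not each member is in use. */
static void record_copy(struct record *dst, const struct record *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}

/* Byte-wise comparison that needs no knowledge of which members are
   in use. Returns 1 if the two objects hold identical bytes. */
static int record_same(const struct record *a, const struct record *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof *a) == 0;
}
```

After record_init(&x, 42) and record_copy(&y, &x), record_same(&x, &y) holds even though extra was never assigned. One caveat: if a struct has padding, storing to a member may leave the padding bytes with unspecified values, so a byte-wise comparison is only dependable between an object and an untouched memcpy copy of it.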

-- Richard
