memcpy() with uninitialised memory

jameskuyper · Jan 14, 2009

Richard said:
Suppose a struct contains members that are not used in all cases. It
may be convenient to compare them without knowledge of which members
are used, so when making a copy you might choose to copy all the
members regardless of whether they have been initialised. Of course,
it might well be better to always initialise the data to zero.

In this case, the object being copied is an array, not a struct, and
it's the entire array that's uninitialized, not just parts of it, so
that argument doesn't apply.

Harald van Dijk · Jan 14, 2009

Bartc said:
Bartc said:

qarnos wrote:
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10);

[...]
It ensures both a and b have the same contents; this could possibly be
significant.

(Now of course someone will say this is not guaranteed by the C
standard, but that wouldn't surprise me.)

That the contents are identical is indeed guaranteed. [...]

Does the standard really guarantee that an uninitialised value doesn't
change? The rule is that "[a]n object exists, has a constant address, and
retains its last-stored value throughout its lifetime." It seems to me
that if there is no last-stored value, there is nothing to retain.

jameskuyper · Jan 14, 2009

Harald said:
Bartc said:

qarnos wrote:
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10); [...]
It ensures both a and b have the same contents; this could possibly be
significant.

(Now of course someone will say this is not guaranteed by the C
standard, but that wouldn't surprise me.)

That the contents are identical is indeed guaranteed. [...]

Does the standard really guarantee that an uninitialised value doesn't
change? The rule is that "[a]n object exists, has a constant address, and
retains its last-stored value throughout its lifetime." It seems to me
that if there is no last-stored value, there is nothing to retain.

I think you've got a point there. As a practical matter, taking the
address of the object (as must be done to make use of memcpy()),
should force an implementation to assign the object to a fixed
location in memory. However, until the first write into that piece of
memory, it might be used for other purposes as well, which might cause
its contents to change.

Falcon Kirtaran · Jan 14, 2009

Harald said:
Harald said:

Bartc wrote:
qarnos wrote:
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10); [...]
It ensures both a and b have the same contents; this could possibly be
significant.

(Now of course someone will say this is not guaranteed by the C
standard, but that wouldn't surprise me.)
That the contents are identical is indeed guaranteed. [...]

Does the standard really guarantee that an uninitialised value doesn't
change? The rule is that "[a]n object exists, has a constant address, and
retains its last-stored value throughout its lifetime." It seems to me
that if there is no last-stored value, there is nothing to retain.

I think you've got a point there. As a practical matter, taking the
address of the object (as must be done to make use of memcpy()),
should force an implementation to assign the object to a fixed
location in memory. However, until the first write into that piece of
memory, it might be used for other purposes as well, which might cause
its contents to change.

Linux tends to do just that as a matter of course, does it not?

-- Falcon Darkstar Kirtaran
OpenPGP: (7902:4457) 9282:A431

Wolfgang Draxinger · Jan 14, 2009

jameskuyper said:
It sounds like such code makes unwarranted assumptions about
the entropy of uninitialized memory. Let's put it this way: the
compiler I use most frequently has the "feature" that
uninitialized memory is always filled with zeros. This feature
does not render it non-conforming. Would that be a problem for
such applications?

OpenSSL uses uninitialized memory as one of several sources for
entropy - it's not the only one.

If the memory is preinitialized, that just removes that little
bit of entropy, but if the previous contents remain, it adds some
degree of randomness.

The problem with the Debian OpenSSL disaster was that the Debian
folks disabled an entire code path, namely the one that caused
warnings in Valgrind. Unfortunately, this code path was the
entropy collector, which also implements a cryptography-grade
PRNG, accesses /dev/urandom, and such. Disabling this code path
made the XOR of pid, uid and gid the only entropy source, with
the result that only 2^16 of the possible 2^2048 keys were
generated.

Wolfgang Draxinger

Richard Tobin · Jan 14, 2009

I think you've got a point there. As a practical matter, taking the
address of the object (as must be done to make use of memcpy()),
should force an implementation to assign the object to a fixed
location in memory. However, until the first write into that piece of
memory, it might be used for other purposes as well, which might cause
its contents to change.

Linux tends to do just that as a matter of course, does it not?

Many operating systems don't allocate memory for a page until it's
accessed. But that's accessed, not written, and they generally zero
the page, because you don't want to reveal one program's data to
another.

-- Richard

CBFalconer · Jan 14, 2009

Spiros said:
#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;
}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

Yes. For one example, consider that the array b (uninitialized)
may contain trap int values. Thus the memcpy operation can be
interrupted and the program aborted. Thus you don't know what the
program will do.

Keith Thompson · Jan 14, 2009

CBFalconer said:
Spiros said:

#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;
}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

Yes. For one example, consider that the array b (uninitialized)
may contain trap int values. Thus the memcpy operation can be
interrupted and the program aborted. Thus you don't know what the
program will do.

Did you read the rest of the thread, in which it was persuasively
argued that memcpy() copies values of type unsigned char and therefore
cannot be affected by trap representations? If so, how do you refute
that argument?

CBFalconer · Jan 14, 2009

Ben said:
Spiros Bousbouras said:

#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;
}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

No, it is not undefined behavior. memcpy copies an object as an
array of unsigned char. The values of b's elements are
indeterminate, but unsigned char has no trap representation, so
their values are merely unspecified.

I retract my earlier answer. Since the objects are treated as
arrays of unsigned char, not as ints, there is no interruption of
memcpy.
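
The argument accepted here can be made concrete with a minimal sketch. The function name copy_bytes is hypothetical and this is not how any particular library implements memcpy(); it only shows what "copying as an array of unsigned char" means. The source is never read through an int lvalue, so whatever bit patterns the ints hold are never interpreted as int values.

```c
#include <stddef.h>

/* Hypothetical stand-in for memcpy(): copies n bytes from src to dst.
   Every access goes through unsigned char, which has no trap
   representations. The two regions must not overlap. */
void *copy_bytes(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;   /* byte-by-byte; no int is ever evaluated */
    return dst;
}
```

Called as copy_bytes(a, b, sizeof a), this copies all ten ints of the example, whereas a length of 10 would copy only ten bytes.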

Spiros Bousbouras · Jan 15, 2009

qarnos said:
qarnos said:

#include <string.h>
int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;
}
Is this undefined behaviour ? If yes how does it follow from the
standard ?

If you are having problems with that code, it's because you are
forgetting to use the sizeof operator.

int main(void) {
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10);
return 0;
}

Although you are still copying uninitialized data,
which is as useful as carrying water to the sea.

The precise code above is of course useless for anything other
than a theoretical exercise, but after I posted it I started
wondering whether there might be occasions where copying
uninitialised memory might be useful. A scenario I came up with is
that the programme shares memory with 2 other processes and it
copies memory initialised by one of those processes to memory
read by the other. So the memory will be initialised, but not by
the programme executing memcpy(). Since the other process might
not be written in C and it might put into memory values a C
programme might not like, it's useful to know that memcpy() can
handle anything.

CBFalconer · Jan 15, 2009

CBFalconer said:
.... snip ...

I retract my earlier answer. Since the objects are treated as
arrays of unsigned char, not as ints, there is no interruption of
memcpy.

Something is ridiculous. This reply shows up with a time of 18:34
(local), and my earlier reply shows up with a time of 18:45. Since
they were sent from the same machine, and the same session of the
newsreader, I don't know whom to blame. I didn't reset the clock.

On further investigation, both messages are stored in my 'sent
news' folder, with the same times. I just do not understand.

CBFalconer · Jan 15, 2009

Keith said:
.... snip ...

Did you read the rest of the thread, in which it was persuasively
argued that memcpy() copies values of type unsigned char and
therefore cannot be affected by trap representations? If so, how
do you refute that argument?

I later read some of 'the rest' and issued a retracting message.
The times on those messages have caused me further confusion. :-(

No, I don't refute it.

CBFalconer · Jan 15, 2009

Spiros said:
Sjouke Burry said:

qarnos wrote:
.... snip ...

int main(void) {
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10);
return 0;
}

Although you are still copying uninitialized data,
which is as useful as carrying water to the sea.

The precise code above is of course useless for anything other
than a theoretical exercise, but after I posted it I started
wondering whether there might be occasions where copying
uninitialised memory might be useful. A scenario I came up with is
that the programme shares memory with 2 other processes and it
copies memory initialised by one of those processes to memory
read by the other. So the memory will be initialised, but not by
the programme executing memcpy(). Since the other process might
not be written in C and it might put into memory values a C
programme might not like, it's useful to know that memcpy() can
handle anything.

Actually just that sort of use used to be quite prevalent.
Consider the actions in single-user MSDOS. Normally a program, on
launch, was assigned all available memory. The program could cut
the memory assignment down, without using the eventually
non-assigned space, and then launch another program in that freed
space. Just that sort of memcpy was required.

When such code is in a searchable library, and then loaded into the
executing program, it has limited use. It is more useful (in a
particular machine) when available in some immediately accessible
place, such as actual memory, or even a Blindows Dll, because then
it can just be called when needed.

Ben Pfaff · Jan 15, 2009

CBFalconer said:
Something is ridiculous. This reply shows up with a time of 18:34
(local), and my earlier reply shows up with a time of 18:45. Since
they were sent from the same machine, and the same session of the
newsreader, I don't know whom to blame. I didn't reset the clock.

You must have invoked undefined behavior.

Keith Thompson · Jan 15, 2009

Spiros Bousbouras said:
qarnos said:

#include <string.h>

int main(void) {
int a[10] , b[10] ;
memcpy(a,b,10) ;
return 0 ;

}

Is this undefined behaviour ? If yes how does it follow from the
standard ?

If you are having problems with that code, it's because you are
forgetting to use the sizeof operator.

int main(void) {
int a[10], b[10];
memcpy(a, b, sizeof(int) * 10);
return 0;
}

Although you are still copying uninitialized data,
which is as useful as carrying water to the sea.

The precise code above is of course useless for anything other
than a theoretical exercise, but after I posted it I started
wondering whether there might be occasions where copying
uninitialised memory might be useful.

[snip]

A simpler scenario is a counted array type:

struct counted_array {
int count;
int data[MAX];
};

For a given object of this type, only elements 0..count-1 of the data
array will be relevant, but it's easier to copy the whole structure
than to compute how much actually needs to be copied. (Though it
might be worth the effort if MAX is large).
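
The trade-off described above can be sketched as follows. The value of MAX and the function names copy_whole and copy_used are assumptions made for illustration; the post does not give them. copy_whole copies the entire structure, unused tail included, while copy_used computes how many bytes are actually relevant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX 1024   /* assumed capacity; not given in the post */

struct counted_array {
    int count;
    int data[MAX];
};

/* Simple version: copy everything, including elements at or beyond
   count that may never have been written. */
void copy_whole(struct counted_array *dst, const struct counted_array *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Cheaper for large MAX: copy the header plus elements 0..count-1. */
void copy_used(struct counted_array *dst, const struct counted_array *src)
{
    assert(src->count >= 0 && src->count <= MAX);
    memcpy(dst, src,
           offsetof(struct counted_array, data)
           + (size_t)src->count * sizeof src->data[0]);
}
```

Using offsetof rather than sizeof(int) for the header keeps the byte count right even if an implementation inserts padding before data.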

Spiro Trikaliotis · Jan 15, 2009

Hello namesake,

Spiros said:
[...] but after I posted it I started
wondering whether there might be occasions where copying
uninitialised memory might be useful.

OpenSSL uses this in order to make the random seed a little bit more
random.

Essentially, this was what led to the Debian-specific bug (DSA 1571,
http://www.debian.org/security/2008/dsa-1571):

With the help of valgrind, it was found out that openssl uses
uninitialised memory, and this was removed
(http://marc.info/?t=114651088900003&r=1&w=2) - removing almost all
randomness from the random seed. I think the rest of the story is known.

It can be argued if the randomness of the uninitialised memory helps a
lot (and IIRC, there is some comment in the sources saying something
along the lines of "... but it does not hurt, either"), but it is an
approach.

To come back to OnT: Invoking UB might be a good way to start some kind
of randomness in the first place, anyway, especially when the machine
explodes. - SCNR.

Regards,
Spiro.

Guest · Jan 15, 2009

Hello namesake,
Spiros Bousbouras wrote:

[...] but after I posted it I started
wondering whether there might be occasions where copying
uninitialised memory might be useful.

OpenSSL uses this in order to make the random seed a little bit more
random.

sounds like a potential security hole

Essentially, this was what led to the Debian-specific bug (DSA 1571, http://www.debian.org/security/2008/dsa-1571):

ah, right...

With the help of valgrind, it was found out that openssl uses
uninitialised memory, and this was removed
(http://marc.info/?t=114651088900003&r=1&w=2) - removing almost all
randomness from the random seed. I think the rest of the story is known.

It can be argued if the randomness of the uninitialised memory helps a
lot

I don't see how you can be sure uninitialised memory is "random"

(and IIRC, there is some comment in the sources saying something
along the lines of "... but it does not hurt, either"),

it sounds like it could hurt. But I suppose the implementors of
Debian know what is (or isn't) in "uninitialised" memory.

but it is an approach.

To come back to OnT: Invoking UB might be a good way to start some kind
of randomness in the first place, anyway, especially when the machine
explodes. - SCNR.

UB doesn't have to result in random data either...

--
Nick Keighley

In a sense, there is no such thing as a random number;
for example, is 2 a random number?
(D.E.Knuth)

dj3vande · Jan 15, 2009

Spiros said:
Hello namesake,

Spiros said:

[...] but after I posted it I started
wondering whether there might be occasions where copying
uninitialised memory might be useful.

OpenSSL uses this in order to make the random seed a little bit more
random.

Essentially, this was what led to the Debian-specific bug (DSA 1571,
http://www.debian.org/security/2008/dsa-1571):

With the help of valgrind, it was found out that openssl uses
uninitialised memory, and this was removed
(http://marc.info/?t=114651088900003&r=1&w=2) - removing almost all
randomness from the random seed. I think the rest of the story is known.

No, the problem was a little bit subtler than that.

There were *two* places where code was commented out.
One of them was assimilating the (possibly uninitialized) contents of a
buffer into the entropy pool before filling that buffer with random
bytes. This was the "can't hurt, and may help" one. Removing this one
should have silenced the valgrind warnings and would have left a secure
random number generator, just one that sometimes had slightly less
entropy.
There was another place where the same operation was done (with
identical code) when the caller requested that the contents of a buffer
be added to the entropy pool. That one was also (incorrectly) removed
by an overenthusiastic patcher, and that's the one that turned the RNG
into a NRNG.

ObC: What would the DS9k[1] have done with the uninitialized buffer?
I think it's been established that it's required to accept it, since
the memory is being examined as unsigned char.
But what would the OpenSSL library actually see when it looked at the
buffer?

dave

[1] For those who weren't around a few years back: The DeathStation
9000 is a hypothetical machine that accepts and does The Right
Thing with conforming and portable C code, but fails in creative
and spectacular ways constrained only by the laws of physics when
given code that is incorrect in any way.

dj3vande · Jan 15, 2009

On 15 Jan, 08:07, Spiro Trikaliotis wrote:

sounds like a potential security hole

Not if the people hacking on the code know what they're doing.

1571, http://www.debian.org/security/2008/dsa-1571):

ah, right...

The security hole wasn't caused by the OpenSSL developers' decision to
write code that used uninitialized data; it was caused by the Debian
developers *removing* code that used *initialized* data.

I don't see how you can be sure uninitialised memory is "random"

If the RNG is competently built, adding non-random data to the entropy
pool *can't* hurt it (i.e. as long as it gets enough input that *is*
random, it will generate adequately random output, no matter how much
non-random input it gets). If it could, an attacker could force it to
degrade just by sending it enough all-bits-zero data; this is a
weakness we know how to avoid, so a RNG that has it is (by definition)
not competently built.
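
A toy sketch of the structural point, under loud assumptions: toy_pool and toy_pool_absorb are hypothetical names, the mixing step is the non-cryptographic FNV-1a update, and none of this resembles OpenSSL's actual pool. What it illustrates is that each mixing step is invertible in the state, so absorbing any amount of attacker-known data (all-bits-zero included) can never merge two different pool states into one.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit "entropy pool". NOT cryptographic; illustration only. */
typedef struct {
    uint64_t state;
} toy_pool;

/* Mix len bytes into the pool using the FNV-1a step: XOR the byte in,
   then multiply by an odd constant. XOR with a fixed byte and
   multiplication by an odd number modulo 2^64 are both bijections on
   the state, so known input cannot destroy whatever is already there. */
void toy_pool_absorb(toy_pool *p, const void *data, size_t len)
{
    const unsigned char *bytes = data;

    for (size_t i = 0; i < len; i++) {
        p->state ^= bytes[i];
        p->state *= UINT64_C(1099511628211);   /* FNV-1a 64-bit prime */
    }
}
```

A real generator would use a cryptographic hash for the mixing step; the sketch only shows why feeding in non-random bytes need not reduce what the pool already holds.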

dave

CBFalconer · Jan 15, 2009

Spiro said:
Spiros said:

[...] but after I posted it I started
wondering whether there might be occasions where copying
uninitialised memory might be useful.

OpenSSL uses this in order to make the random seed a little bit
more random.

If they are doing this to their version of the C rand() (and srand)
then they are not meeting the C standard. That requires that the
sequence always start at the same point, barring a srand call.
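
The requirement on rand() itself can be checked with a short sketch. The function name sequence_repeats and the sample size are made up for illustration, and this says nothing about OpenSSL's own generator, which is separate from rand(). The C standard requires that calling srand() again with the same seed repeats the sequence, and that a program which never calls srand() gets the same sequence as after srand(1).

```c
#include <stddef.h>
#include <stdlib.h>

#define SAMPLE 16   /* how many values to compare; arbitrary */

/* Returns 1 if reseeding with the same value reproduces the same
   first SAMPLE results from rand(), 0 otherwise. */
int sequence_repeats(unsigned seed)
{
    int first[SAMPLE];

    srand(seed);
    for (size_t i = 0; i < SAMPLE; i++)
        first[i] = rand();

    srand(seed);                     /* restart the sequence */
    for (size_t i = 0; i < SAMPLE; i++)
        if (rand() != first[i])
            return 0;
    return 1;
}
```

On a conforming implementation this returns 1 for every seed, which is why seeding rand() from uninitialised memory would have to go through an explicit srand() call.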
