stack smashing

frank

Keith said:
Ah, but it is true. Any result is okay in the sense that it's
permitted by the standard. That includes getting consistent results
on a particular system, which is what exploits generally take
advantage of.

It took me a while in this thread to figure out a) what stack-smashing
is and b) why my program was doing it.

When I worked up the example from the wiki in the original post, with
the 10.5 and the 96.1, one notable thing was that the overrun didn't
change that value on my (Ubuntu) implementation, no matter how many
characters I stuffed into the buffer being overrun. The stack-smashing
protection (the compiler's runtime canary check, rather than the OS as
such) detected it and aborted instead of letting the stack data be
written inappropriately.

Also in researching this, I found out about the sentinel values that
stack-protection schemes use to detect this type of hack. My friend
mentioned 0xdeadbeef as one such value.
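For anyone following along, here's a minimal sketch in the spirit of
that wiki example (the buffer size and input string are my own made-up
values; whether my_float gets clobbered, stays untouched, or the program
aborts with a stack-smashing report is entirely implementation-specific):

#include <stdio.h>
#include <string.h>

/* Overrunning c invokes undefined behavior: some implementations
 * silently overwrite my_float, others (e.g. gcc's default stack
 * protector on Ubuntu) hit a canary check and abort. */
static void foo(const char *bar)
{
    float my_float = 10.5f;
    char  c[12];

    strcpy(c, bar);              /* no bounds checking */
    printf("%f\n", my_float);    /* 10.5? garbage? never reached? */
}

int main(void)
{
    foo("a string comfortably longer than twelve characters");
    return 0;
}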
 
jaysome

This is exceptionally poor advice - and as usual, the famous "clc peer
review" that would have seen half a dozen people piling in to point out
that the resulting int might have been a trap representation or some
such nonsense if a newbie had posted this remains strangely silent when
it is one of their chums who's boobooed.

I disagree that it's "exceptionally poor advice", and would go further
to argue that it's sane advice, in most cases.

The C standard guarantees that INT_MAX is at least 32767, and the size
of any scalar type will always be less than this (at least in the real
world). Rather than casting the result of sizeof to "unsigned long",
it's simply easier to cast it to "int". In the instant case, we know
that the result of "sizeof(float)" is guaranteed to fit within type
int (again, in the real world).

In all of my years of development, I've never run into a case where
casting the return value of sizeof to "int" in a printf statement was a
problem. In the cases where sizeof could return a value greater than
32767, I was working on a platform on which the compiler used a 32-bit
int, so it was not a problem. And if I had ever run into a case in which
sizeof returned a value greater than 32767 on a platform whose compiler
used a 16-bit int (e.g., on some embedded devices), then, admittedly, I
had bigger fish to fry (like I don't even have 16K, let alone 4K, of
RAM, and thus the printf code was never executed).

I find printf very useful in test programs to print out the size of my
user-defined (e.g., structure) types, and the pattern I use is:

printf("sizeof(T) is %d\n", (int)sizeof(T));

where T is my user-defined type.
 
Michael Foukarakis

Ah, but it is true.  Any result is okay in the sense that it's
permitted by the standard.  That includes getting consistent results
on a particular system, which is what exploits generally take
advantage of.

Exactly - stack smashing is governed by the C standard only up to the
point where we overrun a buffer; after that, there's a whole other
system of rules to break - not as meticulously defined or standardized
as C (or any programming language), but that's what makes it fun to
break. :)
 
Nick Keighley

[as previous post but with (hopefully!) fewer typos]


If you read the documentation for strncpy() you will find it is rather
odd. strncpy() is designed for copying into small fixed-size arrays of
characters that were not necessarily zero-terminated. I believe Unix has
(or had) such things (the fixed 14-character file names in early
directory entries, for example).

  char thing_name[8] = "AAAABBBB";
the above is not nul terminated in C (C++ is different) because the
number of characters in the initialiser exactly fits.

  strncpy (thing_name, "NEW_NAME", 8);
copies exactly 8 characters. No nul on the end.

  strncpy (thing_name, "SMALL", 8);
copies 5 characters and appends three nuls.

This all makes sense but it often isn't what people expect. The strncpy()
of "NEW_NAME" means that thing_name isn't a valid C string after
the call! There's no terminator.
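
A runnable sketch of that pitfall (the %.8s precision is there only so
the printf itself doesn't read off the end of the array):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char thing_name[8] = "AAAABBBB";   /* exactly fits; no terminator */

    strncpy(thing_name, "NEW_NAME", 8);
    /* thing_name is NOT a valid C string here; a plain %s would read
     * past the array, so bound the output with a precision. */
    printf("%.8s\n", thing_name);
    return 0;
}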

And that padding habit can go wildly wrong.
  char buffer[100];
  strncpy (buffer, "tmp/", 100);

avoids a call of strlen() on "tmp/" (saving 5 character reads) but
tags 96 nuls onto the end of buffer.

Optimising gnats and pessimising camels.
One possibility is to write a function that does what you might expect
strncpy() to do (but don't call it strsomething(), because that invades
a namespace reserved for the implementation).
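
For instance, a minimal sketch of such a function (the name copy_string
is arbitrary, picked only to stay out of the reserved str* namespace; it
truncates when it must and always terminates when size is nonzero):

#include <stddef.h>

/* Copy at most size-1 characters of src into dst, then nul-terminate.
 * Unlike strncpy(), the result is always a valid C string (for
 * size > 0), and no padding nuls are written. */
char *copy_string(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return dst;
    for (i = 0; i + 1 < size && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return dst;
}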

<snip>
 
Nick Keighley

[Where a function takes a pointer to a buffer and the size of the buffer,
it's usually a good idea to use sizeof rather than the macro you used when
defining the buffer. That way, you don't have to update the rest of the
code when you change the declaration.]

As long as we're talking about chars, that is. If the buffer is made up
of some other type, it's often better to pass the number of elements
rather than the size in bytes.

I use
#define ARRAY_SIZE(A) (sizeof(A)/sizeof(A[0]))
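
For example (a trivial sketch; note the macro only works on true arrays,
not on pointers that a function parameter has decayed to):

#include <stdio.h>

#define ARRAY_SIZE(A) (sizeof(A)/sizeof(A[0]))

int main(void)
{
    double buf[32];

    /* element count versus byte count */
    printf("%d elements, %d bytes\n",
           (int)ARRAY_SIZE(buf), (int)sizeof buf);
    return 0;
}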
 
Ersek, Laszlo

The C standard guarantees that INT_MAX is at least 32767, and the size
of any scalar type will always be less than this (at least in the real
world). Rather than casting the result of sizeof to "unsigned long",
it's simply easier to cast it to "int". In the instant case, we know
that the result of "sizeof(float)" is guaranteed to fit within type
int (again, in the real world).

(Topic change.) This "int vs. size_t" question reminds me of what I
don't really like about the printf() family:

- The return value signals the number of characters transmitted (if no
error occurred). While strlen() returns a size_t, printf() and co.
return an int.

- Same for %n.

- Same for the "*" field width and precision.

(C99 7.19.6.1 The fprintf function, p15:

----v----
Environmental limits

The number of characters that can be produced by any single conversion
shall be at least 4095.
----^----

Does this mean one can't portably pass a string to a single %s if
strlen() returns at least 4096 for that string?)
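
If that reading is right, one conservative workaround (my own sketch,
not anything the standard prescribes) is to skip the conversion
machinery entirely for long strings:

#include <stdio.h>

/* fputs() performs no conversions, so the 4095-characters-per-
 * conversion environmental limit quoted above does not apply. */
int print_long_string(FILE *strm, const char *s)
{
    return fputs(s, strm) == EOF ? -1 : 0;
}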

I have the (very superficial) impression that unsigned integers are
historically very under-used in favor of signed integers. For example,
the (not standard C) BSD socket interfaces historically took a lot of
"size parameters" (obvious candidates for the sizeof operator) as int's:

- accept(): 3rd param
- bind(): 3rd param
- connect(): 3rd param
- getpeername(): 3rd param
- getsockname(): 3rd param
- getsockopt(): 5th param
- recvfrom(): 6th param
- sendto(): 6th param
- setsockopt(): 5th param

If one opens the manual page for accept() on a GNU/Linux distribution,
something like this should come up:

----v----
The third argument of accept was originally declared as an `int *' (and
is that under libc4 and libc5 and on many other systems like BSD 4.*,
SunOS 4, SGI); a POSIX 1003.1g draft standard wanted to change it into a
`size_t *', and that is what it is for SunOS 5. Later POSIX drafts
have `socklen_t *', and so do the Single Unix Specification and glibc2.
Quoting Linus Torvalds: _Any_ sane library _must_ have "socklen_t" be
the same size as int. Anything else breaks any BSD socket layer stuff.
POSIX initially _did_ make it a size_t, and I (and hopefully others,
but obviously not too many) complained to them very loudly indeed.
Making it a size_t is completely broken, exactly because size_t very
seldom is the same size as "int" on 64-bit architectures, for example.
And it _has_ to be the same size as "int" because that's what the BSD
socket interface is. Anyway, the POSIX people eventually got a clue,
and created "socklen_t". They shouldn't have touched it in the first
place, but once they did they felt it had to have a named type for some
unfathomable reason (probably somebody didn't like losing face over
having done the original stupid thing, so they silently just renamed
their blunder).
----^----

I believe historically wrong parameter types should be fixed in
standards, and it was right to change those parameter types to size_t in
the first place, because size_t is the return type of the sizeof
operator. If for whatever reason it was necessary to match int's size,
"unsigned" would still fit better than "int".

Now if one writes code simultaneously for SUSv1 (UNIX 95) and SUSv2
(UNIX 98) or later, all such function calls need preprocessor magic or
the following ugly but useful "technique":

{
    struct msghdr dummy;
    struct sockaddr_in addr;
    int acc_sock;

    dummy.msg_namelen = sizeof addr;
    acc_sock = accept(sock, (struct sockaddr *)&addr, &dummy.msg_namelen);
}

Because the type of the msg_namelen member changed from size_t (SUSv1)
to socklen_t (SUSv2+) in parallel to the other parameters listed above.
(Not surprisingly, as msg_namelen communicates an address size
otherwise.)

Similarly, the not standard C fcntl() / F_SETFL lets or requires the
programmer to manipulate a bitmask. Why is the mask represented as an
int, instead of an unsigned? Let's suppose we possibly opened a FIFO in
nonblocking mode and now we want to ensure blocking behavior:

{
    int opts;

    opts = fcntl(fd, F_GETFL);
    if (-1 == opts /* query could have been incorporated here */
        || -1 == fcntl(fd, F_SETFL, opts & ~O_NONBLOCK)) {
        (void)fprintf(stderr, "%s: fcntl(): %s\n", progname, strerror(errno));
    }
}

O_NONBLOCK is positive (because the value returned by fcntl() / F_GETFL
is positive if no error occurs). It must be representable by an int (see
the return type again), thus it is not promoted above int under the
bitwise complement operator. ~O_NONBLOCK is not a trap representation,
but it will probably be negative. Then we BIT-AND that negative value
with the current flags. Ugly.
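
A sketch of the same flag-clearing done in unsigned arithmetic, the way
I'd prefer it; converting back to int for the call is safe because the
masked value cannot exceed the original, which fit in an int:

#include <fcntl.h>

int clear_nonblock(int fd)
{
    int opts;

    opts = fcntl(fd, F_GETFL);
    if (opts == -1)
        return -1;
    /* complement and mask as unsigned, then convert back for F_SETFL */
    return fcntl(fd, F_SETFL, (int)((unsigned)opts & ~(unsigned)O_NONBLOCK));
}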

-o-

read() and write() take a size_t parameter for the number of bytes to be
read/written, but return ssize_t so that -1 can be returned to signal an
error. Consequently, only min { actual number of bytes, SSIZE_MAX } can
be passed.

I think all such functions should return -1 for error or 0 for success,
and store the actual output through a programmer-supplied pointer.

int made_up_fprintf(size_t *wr, FILE *strm, const char *fmt, ...);
int made_up_read(size_t *rd, int fd, void *buf, size_t nbyte);

("restrict" omitted for simplicity.)

fcntl() / F_GETFL takes a variable number of arguments anyway, so it
could return the current file status flags and file access modes through
a pointer using the current prototype. The type carrying the flags
should be unsigned int.

-o-

If I'm already talking about what I perceive as illogical interfaces,
the hierarchy of the BSD socket address structures / functions is wrong.
Consider:

{
    struct sockaddr_in addr;

    addr.sin_family = AF_INET;
    addr.sin_port = htons(12345);
    addr.sin_addr.s_addr = inet_addr("192.168.1.3");
}

Part of the TCP/IP (v4) protocol stack looks like this:
1) internet layer (eg. IP addresses and transport protocol selection)
2) transport layer (eg. TCP/UDP ports, dependent on the protocol
selected above)

The very existence of the "port" notion depends on the protocol selected
at the internet layer. There are other IP-based protocols besides TCP
and UDP, with a different "port" notion or none at all (eg. ICMP). The
order of "specialization" in an API should reflect the underlying
protocol structure.

1a) internet layer: address(es)
1b) internet layer: transport protocol
--------
2) transport layer: protocol-specific stuff, eg. port(s)

In reality, we have

socket(AF_INET, SOCK_STREAM, 0);
or socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

or

socket(AF_INET, SOCK_DGRAM, 0);
or socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

or similar. Then we provide local or remote IP address(es) and TCP/UDP
port(s) in the second step (bind() / connect()). This corresponds to

1b) internet layer: transport protocol
--------
1a) internet layer: address(es)
2) transport layer: protocol-specific stuff, eg. port(s)

I'm not saying this doesn't work -- it does, and I like network
programming (see http://lacos.hu for my few tiny toys, which are
nevertheless hugely useful to me). I claim that most of the BSD socket
interface is a very non-intuitive black book of magic incantations. See
"Hobbit"'s comments in the netcat source (IIRC), for example.

.... I think this was a bit off-topic, sorry.

Cheers,
lacos
 
Ben Pfaff

jaysome said:
In all of my years of development, I've never run into a case where
casting the return value of sizeof to "int" in a printf statement has
ever been a problem. If there were cases in which sizeof returned a
value greater than 32767, I was working on a platform in which the
compiler used 32-bit int, so it was not a problem. [...]

Desktop and server platforms are all moving to 64-bit address
spaces, but many of these still have 32-bit int, so you may soon
run into a platform where "(unsigned) int" is not adequate for
the size of an object.
 
Antoninus Twink

In the instant case, we know that the result of "sizeof(float)" is
guaranteed to fit within type int (again, in the real world).

Regular readers of this group will know that I'm more than pragmatic
when it comes to using features or constructions that will work fine in
the real world even if they are not ISO C.

However, even I baulk at a completely gratuitous use of something
non-portable for no gain or convenience whatsoever...
Rather than casting the result of sizeof to "unsigned long",
it's simply easier to cast it to "int".

....except saving a few characters of typing. It smacks to me of
carelessness, and that's a bad attribute in a programmer whether they
are real-world pragmatists or clc pie-in-the-sky dreamers.
I find printf very useful in test programs to print out the size of my
user-defined (e.g., structure) types, and the pattern I use is:

printf("sizeof(T) is %d\n", (int)sizeof(T));

If you're writing test programs and not production code, then do
whatever you like, of course! It makes no difference either way.

Even so, doesn't it appeal to your lazy instincts to be able to save a
whole 3 characters' worth of typing by using

printf("sizeof(T) is %zu\n", sizeof(T));
 
Richard Tobin

Antoninus Twink said:
If you're writing test programs and not production code, then do
whatever you like, of course! It makes no difference either way.

How often does the question of printing out the value of a sizeof
expression come up in production code?
Even so, doesn't it appeal to your lazy instincts to be able to save a
whole 3 characters' worth of typing by using

printf("sizeof(T) is %zu\n", sizeof(T));

There's laziness and laziness... for many people it will be the choice
between typing 3 extra characters and looking up the format string
for a size_t.

-- Richard
 
Ben Pfaff

How often does the question of printing out the value of a sizeof
expression come up in production code?

Fairly often, at least in code that emits log messages, for
production code that deals heavily with network protocol parsing.
(Of course, this code is generally not 100% comp.lang.c-compliant
anyhow, since it typically represents protocol entities with C
structures whose data sizes are not 100% portable.)
 
lawrence.jones

Ben Pfaff said:
Desktop and server platforms are all moving to 64-bit address
spaces, but many of these still have 32-bit int, so you may soon
run into a platform where "(unsigned) int" is not adequate for
the size of an object.

Even with a 64-bit address space, it's exceedingly rare to have a single
object whose size won't fit in 32 bits. The value of a large address
space is usually the ability to have *lots* of objects rather than very
big ones.
 
Seebs

How often does the question of printing out the value of a sizeof
expression come up in production code?

Depends on how heavily instrumented or logged it is. :)

(More generally, while it's rarely sizeof, I print a lot of size_t
values.)

-s
 
Ben Pfaff

Even with a 64-bit address space, it's exceedingly rare to have a single
object whose size won't fit in 32 bits. The value of a large address
space is usually the ability to have *lots* of objects rather than very
big ones.

Most of the time, yes. But sometimes you want to do something
like map an entire hard drive (or hard drive virtual image) into
your address space, simulate a virtual machine with lots of
memory, etc. And certainly one can read a large file into memory
as well.
 
Nick Keighley

If you're writing test programs and not production code, then do
whatever you like, of course! It makes no difference either way.

I've found if I apply this too literally then I spend all my time
debugging test code, because my test code is buggily claiming my
production code is buggy!
 
John Bode

AFAIK, neither C89 nor C90 (which are for all practical purposes
identical) defines %zu.

Grr; that's supposed to be C99 instead of C90. There's another
synapse popped...
 
jaysome

Most of the time, yes. But sometimes you want to do something
like map an entire hard drive (or hard drive virtual image) into
your address space, simulate a virtual machine with lots of
memory, etc. And certainly one can read a large file into memory
as well.

Ben,

In these cases, wouldn't you most likely (always?) be using a pointer
to memory rather than something like a fixed-size array to refer to
the entire hard drive or virtual machine? If not, how do you know what
size the fixed-size array should be?

That's how typical "map" functions work--they return a pointer and a
size (or you specify the size). In such a case, sizeof is of no use.
Certainly in these types of cases, you would not use sizeof and would
not use "%d" to print out the size. For example:

unsigned char *p;
size_t size;

if (MapHardDrive("/dev/hd0", &p, &size))
{
    // For C90
    printf("Mapped hard drive is size %lu.\n", (unsigned long)size);
    // For C99
    printf("Mapped hard drive is size %zu.\n", size);
}
 
jaysome

printf("sizeof(T) is %zu\n", sizeof(T));

#include <stdio.h>

int main(void)
{
    printf("sizeof(int)is %zu\n", sizeof(int));
    return 0;
}

This is the output I get with VC++ 6.0:

sizeof(int)is zu

That's because the "%zu" conversion specifier is new in C99, and VC++
6.0, like most of the dozen or so compilers I use, is not C99-compliant.
I think it will always be this way. YMMV.

So rather than have to deal with portability issues, I simply write:

#include <stdio.h>

int main(void)
{
    printf("sizeof(int) is %d\n", (int)sizeof(int));
    return 0;
}
 
Ben Pfaff

jaysome said:
In these cases, wouldn't you most likely (always?) be using a pointer
to memory rather than something like a fixed-size array to refer to
the entire hard drive or virtual machine?

Yes.

But you should not use "int" or "unsigned int" to print the value of a
size_t that measures the size of such a large object.
 
