Bounds checking and safety in C

Peter J. Holzer · Aug 2, 2007

This is the case where the operating system traps access to addresses not
owned by the process.

This is probably what he meant, but not what he wrote. He wrote:

"You read or write data outside bounds. It generates an exception." That
is the very definition of bounds checking (indeed he uses exactly the
same words for defining "bounds checking" below). The bounds are
checked, and if you try to access data outside, you get an exception.
(Actually, you might get an exception just by computing an out-of-bounds
pointer, but let's not be too picky.) The memory maps maintained by the OS
usually don't achieve that: An OS "segment" usually contains a lot of
objects as well as unused space around them (e.g., memory which has been
freed, or unused stack space), so you can very often read or write data
outside of the bounds of your objects without getting an exception.

hp
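
The distinction drawn above can be sketched in code: the OS only traps accesses outside the memory mapped for the process, whereas bounds checking in the sense discussed here is done per object. The following is a hand-written illustration under that assumption; `checked_get` is a hypothetical helper, not part of any library or of the checking tools mentioned in this thread.

```c
#include <stddef.h>

/* Hypothetical helper: read element `index` of an array of `count` ints.
 * Returns 0 and stores the value in *out on success, or -1 if the index
 * lies outside the bounds of the object. An unchecked access at the same
 * index would not necessarily be trapped by the OS, because the
 * neighbouring memory usually belongs to the same process. */
int checked_get(const int *array, size_t count, size_t index, int *out)
{
    if (array == NULL || out == NULL || index >= count)
        return -1;          /* out of bounds: report it, touch nothing */
    *out = array[index];
    return 0;
}
```

A caller would test the return value before using `*out`, which is roughly what a bounds-checking compiler inserts automatically at each access.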

Richard Bos · Aug 3, 2007

Peter J. Holzer said:
What is unclear about "This slowdown is still too large for the checks
to be used in released software"? The authors are clearly of the opinion
that the speedup from using OMD instead of PMD is not enough. (I don't
share that opinion: For most software an overhead of 63.9% won't matter
at all).

I must disagree with that. For most software, an overhead of 63.9% will
matter a lot, except in the bits where the computer is waiting for user
or disk input. One could argue that most programs are bounded by those
anyway, but that's not really true. Or rather, it's only true if you
look at it from the computer's POV, not from the user's POV.
To the user, time spent typing is not the same as time spent waiting for
the computer to do its job; and of the time spent waiting, the part
where the computer just sits there and appears to do nothing is much
more irritating than the part where it's clearly doing some hard work in
the background because the disk light is flickering like nobody's
business.
Therefore, _to the user_, who is the most important person in a computer
program's life, 63.9% added to the time he spends looking at a seemingly
frozen screen is a big deal; a much bigger deal than, for example, a
63.9% addition to the time between keystrokes.

Richard

Peter J. Holzer · Aug 3, 2007

I must disagree with that. For most software, an overhead of 63.9% will
matter a lot, except in the bits where the computer is waiting for user
or disk input.

Your "except" is "most software" in my experience. Interactive software
waits for the user most of the time (and for the disk most of the rest
of the time), and non-interactive software waits for disk (or network
connections) most of the time.

CPU usage only accounts for a small part of the run time of a program,
so increasing the CPU usage by 63.9% (or conversely, reducing it by 39%)
doesn't change the overall runtime much. (As an example, look at all
those programs written in scripting languages: They typically have a lot
more overhead than 60%. Yet the performance is adequate. As another
example, look at different computers: My old 800 MHz PIII is a lot
slower (much more than 60%) than my new 1.8 GHz Core2. But in everyday
work I don't notice much difference).

Of course there are programs (or libraries) where CPU usage does
matter and where 60% more CPU use is unacceptable. For these you may
want to turn off bounds checking. Or you may want to switch to Fortran.
Or find a better algorithm, which is 100 times faster, so your 60% are
completely insignificant.

One could argue that most programs are bounded by those anyway, but
that's not really true. Or rather, it's only true if you look at it
from the computer's POV, not from the user's POV.

Actually, I do look at it from the user's POV. If the response time
doesn't change by a noticeable amount, the user simply doesn't care how
much work the computer is doing (well, a howling fan might be a bit
distracting). Only when the program gets noticeably faster or slower does
the user care.

hp
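
The arithmetic behind this argument can be written down as a small sketch. It assumes a simple model in which only the CPU-bound share of the run time is slowed by the checks; the function names and the 10% CPU share used in the note below are illustrative assumptions, not figures from the thread or the paper.

```c
/* Total run time relative to the unchecked program, assuming only the
 * CPU-bound fraction of the run time is slowed down.
 *   cpu_fraction: share of run time spent computing, 0.0 to 1.0 (assumed)
 *   overhead:     extra CPU time caused by checking, e.g. 0.639 for 63.9% */
double relative_runtime(double cpu_fraction, double overhead)
{
    return 1.0 + cpu_fraction * overhead;
}

/* Fraction of CPU time saved by removing the checks again:
 * 1 - 1/(1 + overhead), which is about 0.39 for an overhead of 0.639. */
double saving_when_removed(double overhead)
{
    return 1.0 - 1.0 / (1.0 + overhead);
}
```

With an assumed CPU share of 10%, `relative_runtime(0.10, 0.639)` gives roughly 1.064, so the total run time grows by about 6.4%. For a fully CPU-bound stretch, `relative_runtime(1.0, 0.639)` gives the full 1.639, which is closer to the situation Richard Bos describes: time the user spends waiting on computation.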

Antoninus Twink · Aug 5, 2007

I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);

As Richard H. pointed out, this is a QoI issue. For many years I have
been using a development environment that supports run time bounds and
leak checking and I probably wouldn't use one that didn't.

There are alternatives to C if you want performance and better memory
safety.

santosh · Aug 5, 2007

Antoninus Twink wrote:

[ Please don't top-post. Fixed. ]

I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);

C is used in a huge variety of situations where using GTK would not be an
option.

Ian Collins · Aug 5, 2007

Antoninus Twink wrote:

Please don't top post.

I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);

While helpful on some platforms, there are places where C is used where
GTK isn't appropriate.

<OT>I think your example is wrong, GTK+ 2 uses GString as its string
type</OT>

William Hughes · Aug 6, 2007

I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);

This seems similar to the suggested extension by
Jacob Navia. It has the same limitations.
While the "cast" (possibly overloading of the cast operator)
is more convenient than

old_string=get_old_string_from_new_string(new_string)

we still have the same problems. We can play pointer games
with old_string which are not possible with new_string.
If the storage for the strings overlaps, then modifying old_string
may corrupt new_string. If not, then there is the problem
of memory allocation and cleanup (can we "free" old_string,
should we?)

- William Hughes
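
The two conversion strategies described above, and the problem each one brings, can be sketched as follows. `NewString`, `borrow_old_string`, and `copy_old_string` are hypothetical names invented for this illustration; they are not the proposed extension or any GLib API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical "new" string type, for illustration only. */
typedef struct {
    char  *data;
    size_t len;
} NewString;

/* Option 1: hand out the internal buffer. Cheap, but the returned pointer
 * aliases the new string, so writes through it change the new string,
 * and the caller must not free it. */
char *borrow_old_string(NewString *ns)
{
    return ns->data;
}

/* Option 2: hand out a copy. No aliasing, but the caller now owns the
 * result and must free() it. Returns NULL if allocation fails. */
char *copy_old_string(const NewString *ns)
{
    char *copy = malloc(ns->len + 1);
    if (copy != NULL) {
        memcpy(copy, ns->data, ns->len);
        copy[ns->len] = '\0';
    }
    return copy;
}
```

Neither option removes the problem: the first leaves the storage shared, the second leaves the question of who frees the copy.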

Ben Pfaff · Aug 6, 2007

Antoninus Twink said:
I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);

In the header files I have here for GTK+ 2.0, gchar and char are
identical types:
typedef char gchar;
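
A typedef introduces another name for an existing type, not a new type, so no cast is needed between the two names. The sketch below uses a stand-in typedef (`gchar_standin`, an assumption made so that the example compiles without the GLib headers) in place of the real `gchar`.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the typedef quoted above, so the sketch compiles
 * without GLib. */
typedef char gchar_standin;

/* Length of a string declared with the typedef name. No cast is needed:
 * gchar_standin * and char * are the same type, so strlen() accepts the
 * argument directly. */
size_t standin_length(const gchar_standin *s)
{
    return strlen(s);
}
```

It follows that a pointer declared with the typedef name gets no more bounds protection than a plain `char *`.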

Antoninus Twink · Aug 6, 2007

I believe you are mistaken. We have
typedef struct {
gchar *str;
gsize len;
gsize allocated_len;
} GString;

So if s is a GString then I believe the typecast (char *)s can fail in
the case that a GString struct has padding before its first member.

Antoninus Twink wrote:

Please don't top post.
While helpful on some platforms, there are places where C is used where
GTK isn't appropriate.

<OT>I think your example is wrong, GTK+ 2 uses GString as its string
type</OT>

CBFalconer · Aug 6, 2007

Antoninus said:
I believe you are mistaken. We have
typedef struct {
gchar *str;
gsize len;
gsize allocated_len;
} GString;

So if s is a GString then I believe the typecast (char *)s can fail in
the case that a GString struct has padding before its first member.

Which is specifically forbidden by the C standard.

Please do not top-post. Your answer belongs after (or intermixed
with) the quoted material to which you reply, after snipping all
irrelevant material. See the following links:

--
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/> (taming google)
<http://members.fortunecity.com/nnqweb/> (newusers)

santosh · Aug 6, 2007

Antoninus Twink wrote:

Please don't top post.

I believe you are mistaken. We have
typedef struct {
gchar *str;
gsize len;
gsize allocated_len;
} GString;

So if s is a GString then I believe the typecast (char *)s can fail in
the case that a GString struct has padding before its first member.

Look up 6.7.2.1(13) in n1124.pdf.

There shall be no padding at the beginning of a struct object.
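
The guarantee can be checked directly. The struct below is a stand-in that mirrors the layout quoted above (with `char *` and `size_t` substituted for the GLib names, which is an assumption of this sketch); it is not the GLib header.

```c
#include <stddef.h>

/* Stand-in mirroring the quoted layout; not the real GString. */
typedef struct {
    char  *str;
    size_t len;
    size_t allocated_len;
} StringStandin;

/* The first member of a struct always sits at offset 0. */
size_t offset_of_first_member(void)
{
    return offsetof(StringStandin, str);
}

/* A pointer to the struct, suitably converted, points to its first
 * member. Note that the converted type is char **, not char *: the first
 * member is a pointer to the characters, not the characters themselves. */
char *first_member_via_cast(StringStandin *s)
{
    return *(char **)s;
}
```

So the cast cannot fail because of padding, but a plain `(char *)` cast of the struct pointer would address the bytes of the `str` member itself rather than the string it points to.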

Default User · Aug 6, 2007

Antoninus said:
I believe you are mistaken.

Please don't top-post. Your replies belong following or interspersed
with properly trimmed quotes. See the majority of other posts in the
newsgroup, or:
<http://www.caliburn.nl/topposting.html>

Stephen Sprunk · Aug 6, 2007

Again, please don't top-post.

Antoninus Twink said:
I believe you are mistaken. We have
typedef struct {
gchar *str;
gsize len;
gsize allocated_len;
} GString;

Your post above said (gchar *) is GTK+'s "safe" string type. gchar, per the
official GLib documentation, is just a typedef for char; the same is true
for gint and int, gshort and short, and glong and long. They're "included
for completeness", not because they add any particular value. (gchar *) is
no safer than (char *) because they're the same thing.

Now you're talking about GString; that's different and brings an entirely
different set of problems...

So if s is a GString then I believe the typecast (char *)s can fail in
the case that a GString struct has padding before its first member.

The standard guarantees zero padding before the first element.

However, casting a GString to a (char *) or (gchar *) can, depending on what
you do with the result, get the GString's len and allocated_len members out
of sync with the string's actual contents. Doing that is just begging for
bugs, unless you are absolutely, positively sure that the function you're
passing it to won't modify the string.

S
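
The out-of-sync problem described above can be reproduced with a few lines. `CountedString`, `truncate_raw`, and `length_in_sync` are hypothetical names for this sketch; the struct only imitates the quoted layout and no GLib function is used.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in with the same members as the quoted struct; not GLib. */
typedef struct {
    char  *str;
    size_t len;
    size_t allocated_len;
} CountedString;

/* A function that knows nothing about the struct and shortens the
 * string through a plain char *. */
void truncate_raw(char *s, size_t new_len)
{
    if (new_len < strlen(s))
        s[new_len] = '\0';
}

/* Returns 1 if the cached length still matches the contents, else 0. */
int length_in_sync(const CountedString *s)
{
    return strlen(s->str) == s->len;
}
```

After a call such as `truncate_raw(cs.str, 2)` on a five-character string, `cs.len` still says 5 while the contents are two characters long, so any later code that trusts `len` works from stale information.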

Antoninus Twink · Aug 6, 2007

Re: top-posting - the ancients understood that it's OK when they said,
"De Gustibus non est disputandum".

If what you say were true, then I believe you wouldn't need a typecast
at all - if gchar is a typedef for char, then couldn't a gchar * be
converted implicitly to a char *?

Again, please don't top-post.

Your post above said (gchar *) is GTK+'s "safe" string type. gchar, per the
official GLib documentation, is just a typedef for char; the same is true
for gint and int, gshort and short, and glong and long. They're "included
for completeness", not because they add any particular value. (gchar *) is
no safer than (char *) because they're the same thing.

Now you're talking about GString; that's different and brings an entirely
different set of problems...

The standard guarantees zero padding before the first element.

However, casting a GString to a (char *) or (gchar *) can, depending on what
you do with the result, get the GString's len and allocated_len members out
of sync with the string's actual contents. Doing that is just begging for
bugs, unless you are absolutely, positively sure that the function you're
passing it to won't modify the string.

S

Flash Gordon · Aug 6, 2007

Antoninus Twink wrote, On 06/08/07 19:59:

Re: top-posting - the ancients understood that it's OK when they said,
"De Gustibus non est disputandum".

<snip>

The ancients did not have Usenet access. If you ignore the conventions
then you will find that although you have access you will not get much
help on a lot of groups because many people will decide you are not
worth the effort. So your choice, but you risk talking to only yourself.

santosh · Aug 6, 2007

Antoninus said:
Re: top-posting - the ancients understood that it's OK when they said,
"De Gustibus non est disputandum".

Indeed, nor is there any accounting for perverted tastes.

Top posting is fundamentally illogical and annoying in a Usenet post,
since one has to scroll all the way to the bottom to find the context
of the poster's words, when in a proper post all he would have had to do
is glance at the text immediately above the poster's.

[snip]

Default User · Aug 6, 2007

Antoninus said:
Re: top-posting - the ancients understood that it's OK when they said,
"De Gustibus non est disputandum".

*plonk*

Brian

CBFalconer · Aug 6, 2007

Antoninus said:
Re: top-posting - the ancients understood that it's OK when they
said, "De Gustibus non est disputandum".

If what you say were true, then I believe you wouldn't need a
typecast at all - if gchar is a typedef for char, then couldn't
a gchar * be converted implicitly to a char *?

If you continue to top-post in this news group you will just be
PLONKED. Your choice.

Keith Thompson · Aug 6, 2007

Antoninus Twink said:
Re: top-posting - the ancients understood that it's OK when they said,
"De Gustibus non est disputandum".

[...]

<http://www.caliburn.nl/topposting.html>
<http://www.cpax.org.uk/prg/writings/topposting.php>

Richard Bos · Aug 7, 2007

Peter J. Holzer said:
Your "except" is "most software" in my experience. Interactive software
waits for the user most of the time (and for the disk most of the rest
of the time), and non-interactive software waits for disk (or network
connections) most of the time.

CPU usage only accounts for a small part of the run time of a program,
so increasing the CPU usage by 63.9% (or conversely, reducing it by 39%)
doesn't change the overall runtime much.

No, but that's not the point. It changes the _waiting_ time much more.
And that's what irritates users. They don't care if they themselves type
slowly, but they do care if they have to wait for the computer when
they're not typing.

(As an example, look at all those programs written in scripting
languages: They typically have a lot more overhead than 60%. Yet
the performance is adequate.

Not IME.

And when push comes to shove: the "It doesn't matter, because it's only
60%" attitude is culpable for Windows starting up slower than MS-DOS.

Richard
