Bounds checking and safety in C

Peter J. Holzer

This is the case where the operating system traps access to addresses not
owned by the process.

This is probably what he meant, but not what he wrote. He wrote:

"You read or write data outside bounds. It generates an exception." That
is the very definition of bounds checking (indeed he uses exactly the
same words for defining "bounds checking" below). The bounds are
checked, and if you try to access data outside, you get an exception.
(Actually, you might get an exception just by computing an out-of-bounds
pointer, but let's not be too picky.) The memory maps maintained by the OS
usually don't achieve that: an OS "segment" usually contains a lot of
objects as well as unused space around them (e.g., memory which has been
freed, or unused stack space), so you can very often read or write data
outside of the bounds of your objects without getting an exception.
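
A minimal sketch of that last point, assuming nothing beyond standard C. The names struct record, unchecked_fill and checked_fill are made up for illustration. Two arrays live side by side in memory the process owns, so writing past the end of the first one lands in the second and the OS has no reason to object. The unchecked version deliberately goes through a pointer to the whole struct so that the example itself stays well defined; it models what an unchecked strcpy into the first array would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two objects that sit next to each other in memory the process owns. */
struct record {
    char name[8];
    char role[8];
};

/*
 * Copy src into the bytes of *r starting at name, with no regard for
 * where name ends. Addressing the struct as a whole keeps this well
 * defined, but the effect is that of an unchecked copy into r->name.
 */
void unchecked_fill(struct record *r, const char *src)
{
    size_t n = strlen(src) + 1;
    assert(n <= sizeof *r);            /* stay inside the struct */
    memcpy((unsigned char *)r + offsetof(struct record, name), src, n);
}

/* The check the OS cannot do for us: compare against the object size. */
int checked_fill(struct record *r, const char *src)
{
    size_t n = strlen(src) + 1;
    if (n > sizeof r->name)
        return -1;                     /* would overrun name */
    memcpy(r->name, src, n);
    return 0;
}
```

Filling a record with a ten-character string through unchecked_fill silently changes role, while checked_fill refuses the same input. Neither call involves the memory map at all.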

hp
 
Richard Bos

Peter J. Holzer said:
What is unclear about "This slowdown is still too large for the checks
to be used in released software"? The authors are clearly of the opinion
that the speedup from using OMD instead of PMD is not enough. (I don't
share that opinion: For most software an overhead of 63.9% won't matter
at all).

I must disagree with that. For most software, an overhead of 63.9% will
matter a lot, except in the bits where the computer is waiting for user
or disk input. One could argue that most programs are bounded by those
anyway, but that's not really true. Or rather, it's only true if you
look at it from the computer's POV, not from the user's POV.
To the user, time spent typing is not the same as time spent waiting for
the computer to do its job; and of the time spent waiting, the part
where the computer just sits there and appears to do nothing is much
more irritating than the part where it's clearly doing some hard work in
the background because the disk light is flickering like nobody's
business.
Therefore, _to the user_, who is the most important person in a computer
program's life, 63.9% added to the time he spends looking at a seemingly
frozen screen is a big deal; a much bigger deal than, for example, a
63.9% addition to the time between keystrokes.

Richard
 
Peter J. Holzer

I must disagree with that. For most software, an overhead of 63.9% will
matter a lot, except in the bits where the computer is waiting for user
or disk input.

Your "except" is "most software" in my experience. Interactive software
waits for the user most of the time (and for the disk most of the rest
of the time), and non-interactive software waits for disk (or network
connections) most of the time.

CPU usage only accounts for a small part of the run time of a program,
so increasing the CPU usage by 63.9% (or conversely, reducing it by 39%)
doesn't change the overall runtime much. (As an example, look at all
those programs written in scripting languages: They typically have a lot
more overhead than 60%. Yet the performance is adequate. As another
example, look at different computers: My old 800 MHz PIII is a lot
slower (much more than 60%) than my new 1.8 GHz Core2. But in everyday
work I don't notice much difference).
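
The arithmetic behind this can be sketched as follows. The function name and the simple model (only the CPU-bound share of the run time gets slower, everything else is unchanged) are assumptions for illustration, not something taken from the paper under discussion.

```c
#include <assert.h>

/*
 * Relative increase in total run time when only the CPU-bound share of
 * the run gets slower.
 *   cpu_fraction: share of wall-clock time spent computing (0.0 to 1.0)
 *   cpu_overhead: relative CPU slowdown, e.g. 0.639 for 63.9%
 */
double total_slowdown(double cpu_fraction, double cpu_overhead)
{
    assert(cpu_fraction >= 0.0 && cpu_fraction <= 1.0);
    return cpu_fraction * cpu_overhead;
}
```

With a hypothetical CPU share of 5%, total_slowdown(0.05, 0.639) is roughly 0.032, i.e. about 3% more wall-clock time. With a CPU share of 100% the full 63.9% shows up, so the whole disagreement is about which cpu_fraction is typical for the stretch of time the user actually cares about.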

Of course there are programs (or libraries) where CPU usage does
matter and where 60% more CPU use is unacceptable. For these you may
want to turn off bounds checking. Or you may want to switch to Fortran.
Or find a better algorithm, which is 100 times faster, so your 60% are
completely insignificant.

One could argue that most programs are bounded by those anyway, but
that's not really true. Or rather, it's only true if you look at it
from the computer's POV, not from the user's POV.

Actually, I do look at it from the user's POV. If the response time
doesn't change by a noticeable amount, the user simply doesn't care how
much work the computer is doing (well, a howling fan might be a bit
distracting). Only when the program gets noticeably faster or slower does
the user care.

hp
 
Antoninus Twink

I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);


As Richard H. pointed out, this is a QoI issue. For many years I have
been using a development environment that supports run time bounds and
leak checking and I probably wouldn't use one that didn't.

There are alternatives to C if you want performance and better memory
safety.


--
 
santosh

Antoninus Twink wrote:

[ Please don't top-post. Fixed. ]
I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);

C is used in a huge variety of situations where using GTK would not be an
option.
 
Ian Collins

Antoninus Twink wrote:

Please don't top post.
I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);

While helpful on some platforms, there are places where C is used where
GTK isn't appropriate.

<OT>I think your example is wrong, GTK+ 2 uses GString as its string
type</OT>
 
William Hughes

I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);


This seems similar to the suggested extension by
Jacob Navia. It has the same limitations.
While the "cast" (possibly overloading of the cast operator)
is more convenient than

old_string=get_old_string_from_new_string(new_string)

we still have the same problems. We can play pointer games
with old_string which are not possible with new_string.
If the storage for the strings overlaps, then modifying old_string
may corrupt new_string. If not, then there is the problem
of memory allocation and cleanup (can we "free" old_string,
should we?)

- William Hughes
 
Ben Pfaff

Antoninus Twink said:
I'd recommend using GTK+ 2. It is highly portable and has safe types as
alternatives to C types. Eg. gchar * is a string type that can be used
safely without worrying about overwriting memory etc. But if you need to
use APIs with non-bounds-checked strings, you can directly typecast, eg.

gchar *s="hello";
gint i=strlen((char *) s);

In the header files I have here for GTK+ 2.0, gchar and char are
identical types:
typedef char gchar;
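
A small sketch of what that typedef implies. To build without GLib installed it repeats the typedef locally as a stand-in, and the helper names IS_CHAR_PTR and gchar_length are made up for illustration. It needs a C11 compiler for _Generic.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the GLib typedef quoted above, so this builds without GLib. */
typedef char gchar;

/* Evaluates to 1 if the argument has type char *, 0 otherwise (C11). */
#define IS_CHAR_PTR(x) _Generic((x), char *: 1, default: 0)

size_t gchar_length(gchar *s)
{
    assert(s != NULL);
    /* gchar * and char * are one and the same type: no cast needed. */
    return strlen(s);
}
```

Since a typedef only introduces another name for an existing type, a gchar * gets no bounds checking that a char * lacks, and the cast in the quoted example does nothing.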
 
Antoninus Twink

I believe you are mistaken. We have
typedef struct {
gchar *str;
gsize len;
gsize allocated_len;
} GString;

So if s is a GString then I believe the typecast (char *)s can fail in
the case that a GString struct has padding before its first member.


Antoninus Twink wrote:

Please don't top post.
While helpful on some platforms, there are places where C is used where
GTK isn't appropriate.

<OT>I think your example is wrong, GTK+ 2 uses GString as its string
type</OT>


--
 
CBFalconer

Antoninus said:
I believe you are mistaken. We have
typedef struct {
gchar *str;
gsize len;
gsize allocated_len;
} GString;

So if s is a GString then I believe the typecast (char *)s can fail in
the case that a GString struct has padding before its first member.

Which is specifically forbidden by the C standard.

Please do not top-post. Your answer belongs after (or intermixed
with) the quoted material to which you reply, after snipping all
irrelevant material. See the following links:

--
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/> (taming google)
<http://members.fortunecity.com/nnqweb/> (newusers)
 
santosh

Antoninus Twink wrote:

Please don't top post.
I believe you are mistaken. We have
typedef struct {
gchar *str;
gsize len;
gsize allocated_len;
} GString;

So if s is a GString then I believe the typecast (char *)s can fail in
the case that a GString struct has padding before its first member.

Look up 6.7.2.1(13) in n1124.pdf.

There shall be no padding at the beginning of a struct object.
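
A sketch that checks this at compile time, assuming C11 (for static_assert) and using a local stand-in, FakeGString, with the same members as the definition quoted above, so that it builds without GLib. The function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in with the same layout as the GString definition quoted above. */
typedef struct {
    char  *str;
    size_t len;
    size_t allocated_len;
} FakeGString;

/* The first member always sits at offset 0. */
static_assert(offsetof(FakeGString, str) == 0, "no leading padding");

/*
 * A pointer to the struct, converted to a pointer to the type of its
 * first member, points at that member. Here that member is the char *
 * field, not the characters, so one dereference is still needed.
 */
const char *text_via_first_member(const FakeGString *gs)
{
    char *const *first = (char *const *)gs;
    return *first;
}
```

Note what this does and does not show: the struct pointer and the address of str coincide, but a plain (char *) cast of the struct pointer would point at the bytes of the str field itself rather than at the text. Writing gs->str is the straightforward way to get the text.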
 
Stephen Sprunk

Again, please don't top-post.

Antoninus Twink said:
I believe you are mistaken. We have
typedef struct {
gchar *str;
gsize len;
gsize allocated_len;
} GString;

Your post above said (gchar *) is GTK+'s "safe" string type. gchar, per the
official GLib documentation, is just a typedef for char; the same is true
for gint and int, gshort and short, and glong and long. They're "included
for completeness", not because they add any particular value. (gchar *) is
no safer than (char *) because they're the same thing.

Now you're talking about GString; that's different and brings an entirely
different set of problems...

So if s is a GString then I believe the typecast (char *)s can fail in
the case that a GString struct has padding before its first member.

The standard guarantees zero padding before the first element.

However, casting a GString to a (char *) or (gchar *) can, depending on what
you do with the result, get the GString's len and allocated_len members out
of sync with the string's actual contents. Doing that is just begging for
bugs, unless you are absolutely, positively sure that the function you're
passing it to won't modify the string.
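
A minimal sketch of that failure mode, again with a local stand-in struct instead of the real GString so that it builds without GLib. first_word is a hypothetical legacy routine, and the sketch assumes plain text without embedded NUL bytes, so that strlen is a fair measure of the contents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in with the same members as the GString definition quoted above. */
typedef struct {
    char  *str;
    size_t len;
    size_t allocated_len;
} FakeGString;

/* Hypothetical legacy routine: cuts the string at the first space. */
void first_word(char *s)
{
    char *space = strchr(s, ' ');
    if (space != NULL)
        *space = '\0';
}

/* Returns 1 while the cached length still matches the actual contents. */
int length_in_sync(const FakeGString *gs)
{
    return strlen(gs->str) == gs->len;
}

/* One way to repair the bookkeeping after such a call. */
void resync_length(FakeGString *gs)
{
    gs->len = strlen(gs->str);
}
```

After first_word(gs.str) on "hello world", len still says 11 while the text is 5 characters long, and any later operation that trusts len works on stale data until something like resync_length is called.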

S
 
Antoninus Twink

Re: top-posting - the ancients understood that it's OK when they said,
"De Gustibus non est disputandum".

If what you say were true, then I believe you wouldn't need a typecast
at all - if gchar is a typedef for char, then couldn't a gchar * be
converted implicitly to a char *?

Again, please don't top-post.



Your post above said (gchar *) is GTK+'s "safe" string type. gchar, per the
official GLib documentation, is just a typedef for char; the same is true
for gint and int, gshort and short, and glong and long. They're "included
for completeness", not because they add any particular value. (gchar *) is
no safer than (char *) because they're the same thing.

Now you're talking about GString; that's different and brings an entirely
different set of problems...


The standard guarantees zero padding before the first element.

However, casting a GString to a (char *) or (gchar *) can, depending on what
you do with the result, get the GString's len and allocated_len members out
of sync with the string's actual contents. Doing that is just begging for
bugs, unless you are absolutely, positively sure that the function you're
passing it to won't modify the string.

S


--
 
Flash Gordon

Antoninus Twink wrote, On 06/08/07 19:59:
Re: top-posting - the ancients understood that it's OK when they said,
"De Gustibus non est disputandum".

<snip>

The ancients did not have Usenet access. If you ignore the conventions
then you will find that although you have access you will not get much
help on a lot of groups because many people will decide you are not
worth the effort. So your choice, but you risk talking to only yourself.
 
santosh

Antoninus said:
Re: top-posting - the ancients understood that it's OK when they said,
"De Gustibus non est disputandum".

Indeed, nor is there any accounting for perverted tastes.

Top-posting is fundamentally illogical and annoying in a Usenet post,
since one has to scroll all the way to the bottom to find the context
of the poster's words when, in a proper post, a glance at the text
immediately above the reply would have been enough.

[snip]
 
CBFalconer

Antoninus said:
Re: top-posting - the ancients understood that it's OK when they
said, "De Gustibus non est disputandum".

If what you say were true, then I believe you wouldn't need a
typecast at all - if gchar is a typedef for char, then couldn't
a gchar * be converted implicitly to a char *?

If you continue to top-post in this news group you will just be
PLONKED. Your choice.
 
Richard Bos

Peter J. Holzer said:
Your "except" is "most software" in my experience. Interactive software
waits for the user most of the time (and for the disk most of the rest
of the time), and non-interactive software waits for disk (or network
connections) most of the time.

CPU usage only accounts for a small time of the run time of a program,
so increasing the CPU usage by 63.9% (or conversely, reducing it by 39%)
doesn't change the overall runtime much.

No, but that's not the point. It changes the _waiting_ time much more.
And that's what irritates users. They don't care if they themselves type
slowly, but they do care if they have to wait for the computer when
they're not typing.

(As an example, look at all those programs written in scripting
languages: They typically have a lot more overhead than 60%. Yet
the performance is adequate.

Not IME.

And when push comes to shove: the "It doesn't matter, because it's only
60%" attitude is culpable for Windows starting up slower than MS-DOS.

Richard
 
