Malcolm's new book


Ben Bacarisse

Malcolm McLean said:
Yes. About two years later Steve Summit edited the FAQ to point out
that fgets() is problematic if lines overflow the buffer length, which
was the point I was making all along. Never said I was right.

You are curiously casual about language for someone who works with
computers. All C functions that put data into buffers are problematic
if the data "overflows" the buffer length, but I don't think you meant
what you wrote. It would be either an almost meaningless remark about
the operation of the function, or a gross understatement of the
problem, depending on how precisely you are using the term "overflow".

I won't argue the point if you wish to call it "problematic" that
fgets puts no more data into the buffer than there is room for (to me,
it's just what fgets does), but I would welcome some examples of programs
that would be safer if they used gets() rather than fgets(). There
must be lots since you claim it is true of "most programs [that use
fgets()]".
 

Charlton Wilbur

(to Malcolm McLean)

B> You are curiously casual about language

.... full stop.

One would think that after however many threads where Malcolm says
something absurd, gets corrected, and argues about it, claiming "I was
only being informal, something you non-linguists would not
understand," or "I was only trying to write so that the completely
ignorant would not be confused" -- while sometimes restating what he
originally said in ways that subtly move it closer to the correction,
and claiming that that's what he meant all along, sometimes
selectively editing the words of his interlocutors to suggest that
they agree with him -- he would learn to be more precise in his
language.

Or, possibly, that he would learn *something* from the corrections.

Alas, one is frequently disappointed. I suspect it's the manager's
disease: where being *perceived* as being right is far more important
than *being* right, and so Malcolm spends a lot more energy trying to
argue that he was right in the first place than he would by just
admitting his error and learning from it.

Charlton
 

Kenny McCormack

No. fgets is not problematic. fgets behaves in a defined, non-problematic
way.

Um, er, that's like saying that starving to death is not problematic.
After all, it behaves in a defined (etc.) way.
 

Richard

No. fgets is not problematic. fgets behaves in a defined, non-problematic
way.

Um, er, that's like saying that starving to death is not problematic.
After all, it behaves in a defined (etc.) way.

Except fgets isn't the cause of the problem. The handling of it might
be. fgets is well defined to work as documented.
 

Keith Thompson

Malcolm McLean said:
Yes. About two years later Steve Summit edited the FAQ to point out
that fgets() is problematic if lines overflow the buffer length, which
was the point I was making all along. Never said I was right.

Of course he never said you were right. You weren't.

Your claims are:

(1) "fgets() is too hard for the average programmer to use
correctly"

(2) "most programs would be safer if they replaced fgets() with
gets()"

Steve Summit's statement in the FAQ, assuming it's as you portray it,
does not in any way support your first claim, and it most certainly
doesn't support your second, which is directly contradicted by
question 12.23.

A Google search indicates that the word "problematic" does not appear
anywhere under the c-faq.com domain. Can you provide a reference to
Steve Summit's statement about fgets that you were referring to?
 

Flash Gordon

CBFalconer wrote, On 30/08/07 00:13:
You also missed that ggets is immune to this type of data loss, in
that it returns a filled buffer with an error signal. A new call
to ggets will continue the line, provided there is memory available
at that time. I don't know about other routines.

I was only commenting about Malcolm's code, not yours. So that is saying
I did not hit something I was not aiming at.
 

Malcolm McLean

Richard Heathfield said:
The only FAQ comment I can find that's even remotely relevant is "(If
long lines are a real possibility, their proper handling must be
carefully considered.)" - which is certainly true but says nothing
about overflowing buffer length.
The previous version showed "how to replace gets() with a call to fgets()",
then threw away the newline, making it impossible for the caller to tell whether
or not a full line of input had been read. That wasn't an improvement.
However it got rid of the undefined behaviour.
Pointing out that it wasn't an improvement created massive protests. It got
rid of the undefined behaviour, so it must be better, mustn't it?
Hold that thought.
Some people aren't very mannered. But at least the FAQ is now right.
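
For reference, the replacement idiom under discussion looks roughly like
this (a reconstruction of the kind of code the FAQ showed, not a
quotation from it):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[256];

        /* instead of gets(buf): */
        if (fgets(buf, sizeof buf, stdin) != NULL) {
            buf[strcspn(buf, "\n")] = '\0';  /* strip the newline, if any */
            printf("read: %s\n", buf);
        }
        return 0;
    }

Once the newline is stripped, the caller can no longer tell whether the
whole line fit in the buffer - that is the complaint above. But no input
can ever overflow buf, which is the improvement everyone else is
pointing at.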
 

Keith Thompson

Malcolm McLean said:
The previous version showed "how to replace gets() with a call to
fgets()", then threw away the newline, making it impossible for the caller
to tell whether or not a full line of input had been read. That wasn't
an improvement. However it got rid of the undefined behaviour.
Pointing out that it wasn't an improvement created massive
protests. It got rid of the undefined behaviour, so it must be better,
mustn't it?

Yes. Getting rid of undefined behavior is an improvement. A program
that incorrectly interprets a long line as two shorter lines is an
improvement over a program that reads that same long line and incurs a
buffer overflow, causing it to behave in some arbitrarily bad way.
(If you're very very lucky, the resulting undefined behavior might
cause it to interpret the long line as two shorter lines.)

There was still room for more improvement, of course, but that's not
what you claimed.
Some people aren't very mannered. But at least the FAQ is now right.

Yes, it is. Are you seriously claiming credit for that?
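
The further improvement is not hard to sketch: when the newline is
missing, read and discard the rest of the overlong line (and say so)
instead of silently treating it as two lines.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[64];

        while (fgets(buf, sizeof buf, stdin) != NULL) {
            size_t len = strlen(buf);
            int truncated = len > 0 && buf[len - 1] != '\n' && !feof(stdin);

            if (truncated) {
                int c;
                while ((c = getchar()) != EOF && c != '\n')
                    ;               /* discard the tail of the long line */
                fprintf(stderr, "warning: line truncated\n");
            }
            fputs(buf, stdout);
            if (truncated)
                putchar('\n');
        }
        return 0;
    }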
 

Kelsey Bjarnason

[snips]

In practice fgets() is too hard for the average programmer to use correctly,
as has time after time been demonstrated here. I caused outrage by
suggesting that most programs would be safer if they replaced fgets() with
gets(). However I was right.

The only one here I can recall ever claiming to use gets safely was (from
memory) Dan Pop. However, that was (again, from memory) in a rather
limited set of circumstances.

The use of gets is *never* warranted except, possibly, in very rare cases
where you absolutely control all possible inputs, as without that, the
function simply cannot be used safely.

It is a function to be used only by experts, and only in very unusual
cases; I can't see any way you could have recommended it and not been
simply wrong.
The real answer is not to use gets() but something like ggets()

Which, presumably, can be used safely. Or just learn how to use fgets.
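
Worth noting: on POSIX systems this whole argument is largely moot,
since getline() (a GNU extension at the time of this thread, later
standardized in POSIX.1-2008; it is not ISO C) grows the buffer itself
and returns the byte count, so neither truncation nor silent data loss
arises. A sketch:

    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        char *line = NULL;
        size_t cap = 0;
        ssize_t len;

        while ((len = getline(&line, &cap, stdin)) != -1)
            printf("%ld bytes: %s", (long)len, line);

        free(line);
        return 0;
    }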
 

Kelsey Bjarnason

[snips]

You only mentioned part of the problem, namely that you cannot
distinguish between the two conditions. You do not mention the problem
of losing the input that has been received.

He did, elsewhere. His take? It is better for *him* to decide to throw
away the data than to pass potentially incomplete data on to you and let
you decide what to do with it.

Just what we need: a library writer who thinks he is god.
 

Kelsey Bjarnason

[snips]

The OS ought to tell a process how much memory it can "reasonably" have.

Sounds good. I'm currently running 148 processes, ranging from mixers to
window managers to browsers to multiple DB servers. If a DB server needs,
oh, 256MB to process a large query, is that reasonable? How does the OS
decide?
It is very bad design to allow something to gobble up huge portions of
swap space, slowing every other part of the system to a crawl, except in
the relatively unusual circumstances where the process does genuinely
need all of the system's resources.

It's a bad design to have the current process suck up swap at all,
generally, as the usually appropriate solution is to swap low-usage
or low-priority processes rather than active ones, thus freeing the memory
to be used by the process actually doing things.
So the strategy of asking malloc() for memory until failure ought to be
the best one.

Yet it almost never is, as even between one malloc and the next, the state
of the system can change significantly.
The reality is that it is not. Most systems will happily
hand out memory that cannot be afforded, and that almost certainly isn't
really wanted, allowing exploiters to tie up system resources. However
there isn't an easy answer, which is why we are having this sub-thread.

There is an easy solution: use single-process operating systems.
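
Sarcasm aside, one middle ground between "malloc until failure" and
trusting the OS to police things is an application-chosen ceiling. A
sketch - the wrapper and the cap are purely illustrative, not a
standard facility:

    #include <stdlib.h>

    #define APP_MEM_CEILING (64UL * 1024 * 1024)  /* arbitrary app-chosen cap */

    static size_t total_allocated = 0;

    /* capped_malloc: fail early rather than dragging the whole system
       into swap.  A real version would also hook free() to credit the
       total back. */
    void *capped_malloc(size_t n)
    {
        void *p;

        if (n > APP_MEM_CEILING - total_allocated)
            return NULL;
        p = malloc(n);
        if (p != NULL)
            total_allocated += n;
        return p;
    }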
readline() doesn't provide the answer either. It doesn't try.

It doesn't even try to be useful - discarding perfectly useful data, for
example.
It
provides something that is good enough for casual use

Actually, it doesn't. It provides something which serves best as a guide
of what not to do with an input routine.
reader into the way of writing functions. If he can improve on
readline() then the chapter has done its job.

If he can improve it, he's obviously doing better than the author is...
and if he can, he probably doesn't need readline in the first place.
one. For instance a function that writes an error message to stderr, and
then terminates, might be exactly what is wanted in the context of a
particular program, though it is no good for a general-purpose function.

A function which gracefully handles errors - out of memory, end of file,
whatever - and returns a status indicating what's going on, rather than
just deciding "Hah, your data doesn't matter, so I'm going to discard it"
would be better still - and would be useful as a GP function. It'd also
make good fodder for an algorithms book, as in "Here's how to develop a
sensible algorithm to handle input."
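
A function of the shape described might look like this - the name, the
status codes, and the interface are all hypothetical, purely to
illustrate the idea of returning a status and handing the data back:

    #include <stdio.h>
    #include <stdlib.h>

    enum rl_status { RL_OK, RL_EOF, RL_NOMEM };

    /* read_line: read a line of any length.  On RL_OK the caller owns
       and frees *out.  On RL_NOMEM whatever was read so far is still
       handed back - the caller, not the library, decides its fate. */
    enum rl_status read_line(FILE *fp, char **out)
    {
        size_t cap = 64, len = 0;
        char *buf = malloc(cap), *tmp;
        int c;

        *out = NULL;
        if (buf == NULL)
            return RL_NOMEM;
        while ((c = getc(fp)) != EOF && c != '\n') {
            if (len + 1 >= cap) {
                tmp = realloc(buf, cap *= 2);
                if (tmp == NULL) {
                    buf[len] = '\0';
                    *out = buf;           /* partial data, caller's call */
                    return RL_NOMEM;
                }
                buf = tmp;
            }
            buf[len++] = (char)c;
        }
        buf[len] = '\0';
        if (c == EOF && len == 0) {
            free(buf);
            return RL_EOF;
        }
        *out = buf;
        return RL_OK;
    }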

Too bad we don't have someone interested in writing algorithm books - and
competent to do so.
 

Richard

Kelsey Bjarnason said:
[snips]
In practice fgets() is too hard for the average programmer to use correctly,
as has time after time been demonstrated here. I caused outrage by
suggesting that most programs would be safer if they replaced fgets() with
gets(). However I was right.

The only one here I can recall ever claiming to use gets safely was (from
memory) Dan Pop. However, that was (again, from memory) in a rather
limited set of circumstances.

The use of gets is *never* warranted except, possibly, in very rare cases
where you absolutely control all possible inputs, as without that, the
function simply cannot be used safely.
It is a function to be used only by experts, and only in very unusual
cases; I can't see any way you could have recommended it and not been
simply wrong.

Would I recommend it? Not really/necessarily, but a lot of code is written
where the input limits are set by the run-time setup. It seems to be a
favorite icon for people wanting to demonstrate their knowledge of buffer
overruns.
Which, presumably, can be used safely. Or just learn how to use
fgets.

ggets() is a mess. It does nothing particularly well - its increment is
poorly thought out, is non-configurable, and has no idea of how to bail
out. It is a far better idea to use fgets() and do it properly.
 

Richard

Richard said:
[snips]

ps, here is another boring thread with exactly the same points and the
same people pimping their libraries:

http://www.thescripts.com/forum/thread222573.html
 

Kelsey Bjarnason

[snips]

On Sun, 02 Sep 2007 19:34:56 +0200, Richard wrote:

[re gets]
Would I recommend it? Not really/necessarily, but a lot of code is written
where the input limits are set by the run-time setup. It seems to be a
favorite icon for people wanting to demonstrate their knowledge of buffer
overruns.

Yes, I've seen it used in several places when people are showing how to
exploit systems/apps/whatever. Just goes to show, if you don't use the
barking thing in the first place, it makes their lives just that much more
difficult - and I'm all in favour of annoying crackers. :)
ggets() is a mess. It does nothing particularly well - its increment is
poorly thought out, is non configurable and has no idea of how to bail
out. It is a far better idea to use fgets() and do it properly.

I've not really looked at ggets, so I can't really comment.

This all strikes me as something akin to what I did for a socket library a
while back. Basically, you may have to read the data in multiple blocks
and stitch them together. All very good, but what if some bonehead tries
to send 4GB this way?

Easy; let the library user specify a reasonable increment and a reasonable
top limit for his application. *I* certainly don't know what *he*
considers reasonable limits, so my setting them is kinda pointless. Maybe
he's got 32GB RAM and 4GB in a buffer is perfectly sensible. Maybe not.
Either way, if he can set the limits and the code simply obeys them, the
problem goes away.
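
A sketch of that shape of interface - the names and types here are
illustrative only, and read_block() stands in for the actual transport
(a recv() wrapper, say) returning 0 at end of stream:

    #include <stdlib.h>

    /* recv_all: stitch incoming blocks together, growing the buffer by
       `increment` bytes at a time, never exceeding `max_total`. */
    size_t recv_all(char **out, size_t increment, size_t max_total,
                    size_t (*read_block)(char *dst, size_t want))
    {
        char *buf = NULL, *tmp;
        size_t cap = 0, len = 0, got;

        for (;;) {
            if (len == cap) {
                if (cap >= max_total)
                    break;              /* caller's ceiling reached */
                cap += increment;
                if (cap > max_total)
                    cap = max_total;
                tmp = realloc(buf, cap);
                if (tmp == NULL)
                    break;              /* no memory: return what we have */
                buf = tmp;
            }
            got = read_block(buf + len, cap - len);
            if (got == 0)
                break;                  /* end of stream */
            len += got;
        }
        *out = buf;
        return len;
    }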
 

David Thompson

On Sun, 26 Aug 2007 21:16:29 +0200, "Peter J. Holzer" wrote:
Yes. In fact, there are some other signals (SIGXCPU, SIGXFSZ) used to
signal excessive resource usage which can be caught or ignored. Using
SIGKILL for exceeding memory usage was probably a bad design decision.
After all, a process can free memory, but it can't decrease CPU usage.
Why not? One of the examples we typically use/see for wanting to
handle mem-alloc problems is "if there's not enough memory to
read/display/whatever a huge graphic allow the user to continue
without it, presumably with text -- or even a smaller/simpler graphic;
or just shut down gracefully, saving current data, rather than die."
Analogous cases seem to apply to CPU: if say we are rendering a huge
complicated video, perhaps we could substitute a quarter-size at
reduced framerate. And if we're doing something like SETI or
primesearch in the background, we could just suspend it completely
until some later time. (Unless of course the user takes the attitude
that primesearch is more important than "real" work, but such a user
presumably can disable this feature.)

Indeed, since CPU is (on systems I know about) totally fungible, this
actually works better. free()ing _some_ C level chunks may not fully
release entire pages or regions to the OS, but not executing N cycles
worth of instructions definitely makes those cycles available. (I say
cycles rather than instructions since on modern machines there is
often not a fixed relation of insn X = N(X) cycles.) Although another
process/task/whatever may not get quite the entire benefit of N cycles
because it suffers context-switch or cold-cache costs.
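
On POSIX systems the mechanics are exactly that: lower the soft CPU
limit with setrlimit(), catch SIGXCPU, and let the main loop shed work
when the flag trips. A sketch (the 5-second budget and the rendering
loop are placeholders):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/resource.h>

    static volatile sig_atomic_t cpu_budget_hit = 0;

    static void on_xcpu(int sig)
    {
        (void)sig;
        cpu_budget_hit = 1;        /* main loop notices and degrades */
    }

    int main(void)
    {
        struct rlimit rl;
        struct sigaction sa;

        sa.sa_handler = on_xcpu;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGXCPU, &sa, NULL);

        getrlimit(RLIMIT_CPU, &rl);
        rl.rlim_cur = 5;           /* soft limit: 5 CPU-seconds */
        setrlimit(RLIMIT_CPU, &rl);

        while (!cpu_budget_hit) {
            /* ... render full-size frames (busy work) ... */
        }
        fprintf(stderr, "CPU budget hit; dropping to reduced quality\n");
        /* ... continue at quarter size / reduced framerate ... */
        return 0;
    }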

- formerly david.thompson1 || achar(64) || worldnet.att.net
 

Peter J. Holzer

On Sun, 26 Aug 2007 21:16:29 +0200, "Peter J. Holzer" wrote:

Why not?

Because CPU usage is cumulative. The process gets a signal after it
has used n CPU-seconds. Obviously it cannot give back CPU time it has
already consumed.
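
Hence the asymmetry: free() can hand pages back, but for CPU the only
lever left is the budget itself. A process that catches SIGXCPU can, if
its hard limit permits, raise its own soft limit and carry on, or save
state and exit cleanly before the hard limit brings SIGKILL. A sketch,
intended to be called from the main loop after the handler sets a flag
(setrlimit() is not async-signal-safe):

    #include <sys/resource.h>

    /* extend_cpu_budget: buy another 10 CPU-seconds if permitted */
    void extend_cpu_budget(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_CPU, &rl) == 0
            && (rl.rlim_max == RLIM_INFINITY
                || rl.rlim_cur + 10 <= rl.rlim_max)) {
            rl.rlim_cur += 10;
            setrlimit(RLIMIT_CPU, &rl);
        }
    }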

hp
 

Kelsey Bjarnason

[snips]

Why not? One of the examples we typically use/see for wanting to
handle mem-alloc problems is "if there's not enough memory to
read/display/whatever a huge graphic allow the user to continue
without it, presumably with text -- or even a smaller/simpler graphic;
or just shut down gracefully, saving current data, rather than die."
Analogous cases seem to apply to CPU: if say we are rendering a huge
complicated video, perhaps we could substitute a quarter-size at
reduced framerate.

Might reduce memory, but as to CPU, if you're processing, oh, a million
pixels per second on the large video, you're presumably also processing a
million pixels per second on the smaller.

Total runtime might decrease, but CPU consumption during processing
shouldn't change unless you add some sort of "delay and give the CPU back
to the OS" code in there.
 

Kelsey Bjarnason

[snips]

Guess what? We use compilers, a language and other tools that _do_
conform to the standards of the day.

You do? Where did you find a portable, fully C99-compliant toolset?

As far as C usage is concerned, the standard of the day is C89.

My point exactly.
Newsflash: nobody but you seems to be confused about that.

Okay, I'm baffled. You seem to be arguing against yourself now. Did you
do a reality check recently?

Yes. Perhaps if you learned to read, this wouldn't be a problem.

Lost on you apparently.

Hey, you're the one arguing mindless adherence. Do tell us when you plan
to start relying on gets for input.

Meanwhile, the rest of us - the ones who prefer intelligent adherence -
will happily not use something *despite* the standard including it, if it
is, in our opinion, a bad idea.

Funny thing, I had vague recollections of you being moderately sane. My
bad. In the bin with the other drooling morons.
 
