A solution for the allocation failures problem


Paul Hsieh

Occasionally it makes sense to have adjustable buffer sizes. But only rarely.

Every object created by the Better String Library is resizable.
[...] A
theoretical result is that you _can_ perform any computation with just a
disk file and a finite list of states, but generally if a small amount of
memory is not made available, you might as well forget about the operation.

Interesting choice.

So I was making an application that downloaded a massive stream from
the internet that *far* exceeded the memory of my system. The
information came down in small chunks, but processing them one chunk
at a time was way too much overhead, especially since many of these
chunks are linked together. The idea of simply sucking down as much
as would fit into a single bstring until I ran out of memory, then
processing what I'd got before getting more stuff, appealed to me so
much that I actually coded it up. It works: I max out my internet
connection, my CPU, and my total memory all at the same time. And of
course the application runs as fast as I know how to make it.

Without a "run until you run out of memory" scheme, I would have had
to tune some fixed amount of space to be both fast and reasonable on
whatever machine architecture, and basically lose a little on
performance, and fail on certain machines with too little memory. But
by simply scaling to the amount of memory that's actually available on
a machine, the program just works.
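That "fill memory, then drain" loop can be sketched in plain C. This is illustrative only: the chunk-appending function and the doubling growth policy here are invented for the example, not Bstrlib's actual API.

```c
#include <stdlib.h>
#include <string.h>

/* Append `len` bytes of `chunk` to a growing buffer. On allocation
 * failure the buffer is left untouched and -1 is returned, which is
 * the caller's cue to process what it has already accumulated, reset
 * `used` to 0, and carry on reading. Returns 0 on success. */
static int append_chunk(char **buf, size_t *used, size_t *cap,
                        const char *chunk, size_t len)
{
    if (*used + len > *cap) {
        size_t newcap = *cap ? *cap * 2 : 4096;
        while (newcap < *used + len)
            newcap *= 2;
        char *p = realloc(*buf, newcap);
        if (p == NULL)
            return -1;          /* out of memory: time to drain */
        *buf = p;
        *cap = newcap;
    }
    memcpy(*buf + *used, chunk, len);
    *used += len;
    return 0;
}
```

A driver loop would call append_chunk until it reports failure, process the buffered data, set `used` back to 0, and continue; the working set then scales itself to whatever memory the machine actually has.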
You've instantly doubled you development costs by insisting on this.

The development cost for *using* Bstrlib is about 15 minutes to
download, install, try a few examples, and basically understand it.
It did take me a while to develop it, but that's only because people
before just didn't quite have the insight to see the importance of
implementing Bstrlib the way it is. And of course, its development
time invested is amortized for all future projects I make that use
Bstrlib.

I have other libraries for data structures that are basically the
same. And you are right that development cost is increased (that's C
for you), but it's not doubled.
 

Malcolm McLean

Kelsey Bjarnason said:
I trimmed the quoted part so that you can ponder how "once the habit has
been developed", being dropped from your rejoinder, makes your response
look somewhat foolish.
That phrase is silly and arrogant, because it implies that people who
disagree with you must have poor programming habits.
In fact it reveals your own inexperience. Errors can be propagated up the
chain in simple programs, not always as easily in complex ones. A "simple
program" here doesn't necessarily mean a "short program". Why do you think X
does not always return error conditions when the server runs out of memory?
Because the designers were poor programmers, or because that cure was
worse than the disease?
 

Malcolm McLean

CBFalconer said:
You seem to have a reading impediment. It asked for
18446744073709551613 bytes. I repeat: 18446744073709551613 bytes.

To me, 18446744073709551613 is not a negative number. If the
malloc system has such a memory block available, it should mark it
as used and return a pointer to it.
xmalloc() was asked for a negative number of bytes.
Because it asserts on the argument, if asserting is disabled then the
negative value will be converted to a size_t and passed to malloc(). Except
on a very unusual system, the result can only be allocation failure.

You can make a case against assert(), but xmalloc() isn't it.
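A minimal sketch of the behaviour being argued about (illustrative, not Malcolm's actual xmalloc()): with assertions enabled a negative request trips the assert; with NDEBUG defined the assert vanishes and the int is converted to size_t by the usual modular rule, so -3 becomes SIZE_MAX - 2 and the allocation almost certainly fails.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sketch of an xmalloc that takes a signed size so a negative request
 * (usually the sign of an overflowed size computation) can be caught.
 * Under NDEBUG the assert is compiled out and the negative value is
 * silently converted: (size_t)-3 == SIZE_MAX - 2. */
void *xmalloc(int size)
{
    assert(size >= 0);           /* gone when NDEBUG is defined */
    return malloc((size_t)size);
}
```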
 

Malcolm McLean

CBFalconer said:
Due to my natural intransigence, when machines 'demand' I tend to
ignore them. This has been known to lead to program failure. :)
If the user says "it is better that this program should terminate than be
given 64 bytes more of memory" then that is his rightful province.
 

Malcolm McLean

Eric Sosman said:
Can you cite any measurement or any research or even
any opium-induced dreams to substantiate this claim? Or
is this just the latest in your long string of unsupported
quantitative assertions?
Elementary. We have algorithm one, which takes N bytes of memory.
It is pointed out that there must exist an algorithm 2, which takes
arbitrarily less memory, for every algorithm that does memory allocation,
even if it is more complicated or slower than algorithm one.
So algorithm two is written, as Paul Hsieh thinks it should be, and the
development costs approximately double.
 

Malcolm McLean

Paul Hsieh said:
The development cost for *using* Bstrlib is about 15 minutes to
download, install, try a few examples, and basically understand it.
It did take me a while to develop it, but that's only because people
before just didn't quite have the insight to see the importance of
implementing Bstrlib the way it is. And of course, its development
time invested is amortized for all future projects I make that use
Bstrlib.
That's the development cost for using Bstrlib in a hobby environment.

In a corporate environment things aren't usually so easy. Whilst I wouldn't
discourage bstrlib - I even recommended it for non-trivial string processing
applications a thread back - the costs of deciding to accept it into the
corporate codebase are much higher than that.
 

Bartc

Keith Thompson said:
p=heap(N);

Changing the syntax for memory allocation doesn't magically solve
anything.

Perhaps more than you think. With cleaner code you can concentrate on your
task.

Suppose there is another language X imposed on C (with C being a translation
target or acting as interpreter). Then it is easy to see the difference
between:

a=b+c;

Where a,b and c are, for example, strings, and b+c could involve implicit
memory allocation that could fail; and:

p=loadhugedatafile(file);
if (p==ERROR_RETURN)...

In the former case, if there is an allocation failure, there is no error
return back to that statement. And besides the programmer of this language
doesn't want to be bothered with what amounts to checking the status of
practically every line of code. He'd rather get on with the programming task
and leave this stuff to the runtime.

The runtime can do a lot of sophisticated things, but if ultimately there is
no more memory, then it must fail the operation, and there are many things
it can do including aborting. (Although, if X is built of independent
dynamic modules, it could mean closing down that one module, then
continuing.)

The whole point (getting back to C) is delegating all these checks -- when
used for ordinary allocations, especially when not expected to fail except
in extreme circumstances -- to a purpose-made handler.

Those who want to dutifully check every one of thousands of allocations
in-place and have code that can unwind all the way back to start of
execution (is that even possible?) are to be commended, but that seems the
wrong approach to me.
It's *always* possible for an attempted memory allocation to fail.
The possible ways to handle this possibility are:
1. Ignore it and keep going (dangerous).

Never ignore it. But if the allocation failure is unlikely (or you can't do
anything with the failure) then put the check elsewhere. I'm surprised you
even include this option.
2. Use an allocator that immediately aborts the program on failure.

That would be a lazy allocator. (My allocator does that, but my (toy)
software is not critical, and most allocation is done at the beginning
of execution, so if it fails little is lost.)

However that wouldn't be unreasonable behaviour when the programmer can
choose to use it or not.
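The "purpose-made handler" idea can be sketched as a wrapper that consults an installable out-of-memory callback before falling back to abort(). The names here are invented for the example, not any particular library's API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Installable out-of-memory handler; NULL means "just abort". A real
 * handler might free caches and return, after which we retry once. */
static void (*oom_handler)(size_t) = NULL;

void set_oom_handler(void (*h)(size_t)) { oom_handler = h; }

void *checked_malloc(size_t n)
{
    void *p = malloc(n ? n : 1);      /* sidestep malloc(0) ambiguity */
    if (p == NULL) {
        if (oom_handler)
            oom_handler(n);           /* give the handler a chance */
        p = malloc(n ? n : 1);        /* one retry after the handler */
        if (p == NULL) {
            fprintf(stderr, "out of memory (%zu bytes)\n", n);
            abort();                  /* the "lazy allocator" case */
        }
    }
    return p;
}
```

Callers then write `p = checked_malloc(n);` with no in-place check, and the policy for rare failures lives in exactly one place.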
 

Eric Sosman

Malcolm said:
Elementary. We have algorithm one, which takes N bytes of memory.
It is pointed out that there must exist an algorithm 2, which takes
arbitrarily less memory, for every algorithm that does memory
allocation, even if it is more complicated or slower than algorithm one.
So algorithm two is written, as Paul Hsieh thinks it should be, and the
development costs approximately double.

No, because no development time has been spent on the
rejected algorithm one. If the design choice is between
algorithm one, which uses CPU and memory resources wastefully
and doesn't always work, versus algorithm two, which uses
a fixed amount of memory and less CPU and *does* work, who
(present company excepted) wastes time implementing algorithm one?
 

Malcolm McLean

Eric Sosman said:
No, because no development time has been spent on the
rejected algorithm one. If the design choice is between
algorithm one, which uses CPU and memory resources wastefully
and doesn't always work, versus algorithm two, which uses
a fixed amount of memory and less CPU and *does* work, who
(present company excepted) wastes time implementing algorithm one?
I've worked in environments where malloc() was banned. It is possible to
implement virtually every function without it and without the drastic step
of using a disk as a Turing machine. However functions that operate on
dynamic memory have certain advantages. They are easier to write, and
usually they execute faster. That's why malloc() has found a place in the
standard library.

Secondly, when writing non-simple programs, it is generally a good idea to
have an algorithm one. I have a naive slow Fourier transform, for example,
which I use for testing Fast Fourier transforms. In some cases, algorithm
one proves to be adequate.
 

Malcolm McLean

Bartc said:
Those who want to dutifully check every one of thousands of allocations
in-place and have code that can unwind all the way back to start of
execution (is that even possible?) are to be commended, but that seems the
wrong approach to me.
I think usually the wrong approach. Sometimes development costs and time
taken to test do not matter, for instance if the software is to control a
spaceship. I got a rather silly reply from Kelsey to this one. Clearly
if programming costs are beginning to dominate overall project costs
then you'll have to keep them down, but beyond that level, NASA would
rather have perfect software than cheap software, or software now.

Secondly, if the program is trivial, then unwinding the stack will also be
trivial. malloc wrappers and/or exceptions don't gain you much here.
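In a small program that "unwinding" amounts to the ordinary goto-cleanup idiom. An illustrative sketch (the pair structure is invented for the example):

```c
#include <stdlib.h>
#include <string.h>

struct pair { char *first; char *second; };

/* Build a pair of duplicated strings, undoing earlier allocations if
 * a later one fails. Whatever step failed, the caller sees a single
 * NULL and can report the error itself. */
struct pair *make_pair(const char *a, const char *b)
{
    struct pair *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->first = malloc(strlen(a) + 1);
    if (p->first == NULL)
        goto fail_first;
    strcpy(p->first, a);
    p->second = malloc(strlen(b) + 1);
    if (p->second == NULL)
        goto fail_second;
    strcpy(p->second, b);
    return p;

fail_second:
    free(p->first);
fail_first:
    free(p);
    return NULL;
}
```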
 

Eric Sosman

Malcolm said:
I've worked in environments where malloc() was banned.

... in which case, your "algorithm one" is never even on
the docket for consideration, and certainly there is no effort
at all expended on its implementation. So again: How does this
justify your claim that implementing algorithm two "instantly
doubled you [sic] development costs?"
It is possible to
implement virtually every function without it and without the drastic
step of using a disk as a Turing machine. However functions that operate
on dynamic memory have certain advantages. They are easier to write, and
usually they execute faster. That's why malloc() has found a place in
the standard library.

Yet another quantitative statement ("usually") without
substantiation. As a counter-example, I refer you to your own
line-numbering program from a related recent thread: It thrashes
the allocator and deallocator for no reason other than to waste
CPU time, and is subject to failure in cases where a fixed-storage
algorithm works. Why would anybody write that code in the first
place?
Secondly, when writing non-simple programs, it is generally a good idea
to have an algorithm one. I have a naive slow Fourier transform, for
example, which I use for testing Fast Fourier transforms. In some cases,
algorithm one proves to be adequate.

... and if algorithm one is adequate, then algorithm two
never gets written. Again, why do you claim that this has
"instantly doubled you [sic] development costs?"
 

santosh

Malcolm said:
xmalloc() was asked for a negative number of bytes.
Because it asserts on the argument, if asserting is disabled then the
negative value will be converted to a size_t and passed to malloc().
Except on a very unusual system, the result can only be allocation
failure.

You can make a case against assert(), but xmalloc() isn't it.

assert() is for use during development only. If you want to do argument
validation you should use the usual if/else construct, optionally
surrounded by #ifdefs so that the checks can be turned on or off
individually.
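That suggestion might look like this in practice. The function name and the NO_ARG_CHECKS macro are invented for the example:

```c
#include <stdlib.h>

/* Explicit argument validation that survives release builds, with a
 * compile-time switch (-DNO_ARG_CHECKS) to disable it where profiling
 * shows the check matters. */
#ifdef NO_ARG_CHECKS
#define CHECK_ARGS 0
#else
#define CHECK_ARGS 1
#endif

void *checked_alloc(long nbytes)
{
    if (CHECK_ARGS && nbytes < 0)
        return NULL;    /* reject; unlike assert, this also runs in
                           release builds unless explicitly disabled */
    return malloc((size_t)nbytes);
}
```

Unlike an assert, the rejection path here is ordinary code: it can be kept in shipping builds, tested, and turned off per translation unit rather than globally via NDEBUG.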
 

Flash Gordon

Malcolm McLean wrote, On 01/02/08 08:27:
xmalloc() was asked for a negative number of bytes.

So you think that "ptr = xmalloc(18446744073709551613)" is a request for
a negative number of bytes. Interesting philosophy, but I'm not sure
many mathematicians will agree with you.
Because it asserts on the argument, if asserting is disabled then the
negative value will be converted to a size_t and passed to malloc().
Except on a very unusual system, the result can only be allocation failure.

You can make a case against assert(), but xmalloc() isn't it.

Chuck is not arguing against assert in general, he is arguing that your
usage of it is inappropriate.
 

Malcolm McLean

Flash Gordon said:
Malcolm McLean wrote, On 01/02/08 08:27:

So you think that "ptr = xmalloc(18446744073709551613)" is a request for a
negative number of bytes. Interesting philosophy, but I'm not sure many
mathematicians will agree with you.
xmalloc() takes an integer as an argument. Whilst C lays no requirements
on the size of an integer, it can accept negative arguments and thus
detect a bugged program.
This might be because a legitimate request has overflowed the range of an
int. Since the program's behaviour is now undefined, we can no longer
assert anything about it for certain, except that it is bugged.
 

Malcolm McLean

Eric Sosman said:
Malcolm said:
I've worked in environments where malloc() was banned.

... in which case, your "algorithm one" is never even on
the docket for consideration, and certainly there is no effort
at all expended on its implementation. So again: How does this
justify your claim that implementing algorithm two "instantly
doubled you [sic] development costs?"
It is possible to
implement virtually every function without it and without the drastic
step of using a disk as a Turing machine. However functions that operate
on dynamic memory have certain advantages. They are easier to write,
and usually they execute faster. That's why malloc() has found a place in
the standard library.

Yet another quantitative statement ("usually") without
substantiation. As a counter-example, I refer you to your own
line-numbering program from a related recent thread: It thrashes
the allocator and deallocator for no reason other than to waste
CPU time, and is subject to failure in cases where a fixed-storage
algorithm works. Why would anybody write that code in the first
place?
Because the alternative is the hacker's answer to the problem. If you don't
understand this, it is because you are a hacker yourself, and can't see what
you are doing.
Sometimes the hacker's answer is the right one. But probably not when
dealing with lines of text intended to be human-readable. The point of the
example was to show the simplicity and expressivity of the new string
routines, despite the fact that they are extravagant with memory. I
specifically mentioned Paul Hsieh's string library as a more appropriate
alternative, if string processing was the bottleneck in the application.
Secondly, when writing non-simple programs, it is generally a good idea
to have an algorithm one. I have a naive slow Fourier transform, for
example, which I use for testing Fast Fourier transforms. In some cases,
algorithm one proves to be adequate.

... and if algorithm one is adequate, then algorithm two
never gets written. Again, why do you claim that this has
"instantly doubled you [sic] development costs?"
Paul Hsieh's position was that one should provide two algorithms wherever
malloc() is called, one of which uses less or no dynamic memory. This is
always possible. As you correctly point out, if algorithm two is also
superior on all counts (including the readability and expressivity criteria)
then algorithm one becomes redundant. However it still has to be written to
demonstrate this. Deleted code costs as much to write as code which is
retained.

Simple mathematics should tell you that if you write two algorithms that is
likely to cost twice as much as writing one.
 

CBFalconer

Malcolm said:
xmalloc() was asked for a negative number of bytes. Because it
asserts on the argument, if asserting is disabled then the
negative value will be converted to a size_t and passed to
malloc(). Except on a very unusual system, the result can only
be allocation failure.

You can make a case against assert(), but xmalloc() isn't it.

assert doesn't exist when NDEBUG is defined. Now consider the handling
of negative values passed to a size_t parameter.
 

Keith Thompson

Malcolm McLean said:
xmalloc() takes an integer as an argument. Whilst C lays no
requirements on the size of an integer, it can accept negative
arguments and thus detect a bugged program.

malloc() also takes an integer as an argument.

Please learn the difference between "int" and "integer".
 

Malcolm McLean

CBFalconer said:
assert doesn't exist when NDEBUG is set. Now consider the handling
of negative values passed to a size_t parameter.
I don't see the problem. The call will either succeed or, most probably,
fail, but the program is bugged anyway. It's not xmalloc()'s job to detect
every possible bug in calling code.

There is a case for always leaving asserts in source, I would agree.
 

Kelsey Bjarnason

Due to my natural intransigence, when machines 'demand' I tend to ignore
them. This has been known to lead to program failure. :)

There's also the issue of "what user?"

If the app is a DB server and it cannot allocate resources for, say, a
select, does it require that the client side pop up an error saying
"Please log onto the server machine and free up resources"? How can it
require this? Or maybe it does so on the server itself, popping up a
dialog box - which nobody looks at, because it's a server, quite possibly
headless.

'Course, that'd be even more special. 200 users on a DB server, all with
allocated resources which worked, and user 201 comes along, does a large
select, and the DB server pukes and dies, taking out 200 users' data sets
instead of simply detecting the error and reporting it to the single user.

Yeah, crash and burn on allocation failure is just such a good strategy.
 

Kelsey Bjarnason

[snips]


Hmm. Part of writing functions which work on dynamic memory is handling
allocation failure. Here we're told this is _easier_ than the
alternatives, yet the whole point to the existence of tools such as
xmalloc is that it's supposed to be _harder_ to write such functions.

Don't make much sense, do it?
Yet another quantitative statement ("usually") without
substantiation. As a counter-example, I refer you to your own
line-numbering program from a related recent thread: It thrashes the
allocator and deallocator for no reason other than to waste CPU time,
and is subject to failure in cases where a fixed-storage algorithm
works. Why would anybody write that code in the first place?
Secondly, when writing non-simple programs, it is generally a good idea
to have an algorithm one. I have a naive slow Fourier transform, for
example, which I use for testing Fast Fourier transforms. In some
cases, algorithm one proves to be adequate.

... and if algorithm one is adequate, then algorithm two
never gets written. Again, why do you claim that this has "instantly
doubled you [sic] development costs?"

'Course, that said, if his FFT code - which only ever needs to be written
once - is actually tested and shown to work, it can simply be dropped
into any application needing a Fourier Transform and benefit from the
speed enhancements, with next to zero cost - it's already been written,
debugged, tested, shown to be reliable. Why dick around with the slow
version?
 
