bad alloc

N

none

One, it's not a given that it aborts other client connections, after
all. There could be a higher level mechanism that provides the
illusion of a persistent connection even after failover. Second, it
may be unnecessary, but just because it's unnecessary it doesn't
follow that:
1) The value of trying to handle OOM, instead of terminating, exceeds
its cost.
2) OOM can be handled in a robust fashion.

This argument can go both ways, which you seem to refuse to accept: it
doesn't follow that:

1) The cost of trying to handle *some* OOM error, instead of
terminating, exceeds the value.
2) That no OOM errors whatsoever can be handled in a robust fashion.

Yannick
 
N

none

Yes, if you want to isolate failures in processing one request from
another (esp. in a threaded system), you set limits on how much input
can be provided with each request. You reject requests that exceed
the limit.

However, this doesn't mean the limits are set artificially low.
Usually memory isn't your bounding constraint, so you'll run out of
database handles, CPU, etc. long before you run out of memory. Per-
request memory limits can be generous and not create an issue.

Of course, I'm personally fine with unbounded input as long as the
user understands the system will break at some point and they get to
keep both of the pieces.


No, I'm not sure why you think this follows in the least. I also
think I've explained why this isn't the case several times already.
If you have per-request bounds and the OS can't give you memory when
you ask for it you either need to rewrite your code (so you'll be
terminating anyway) or the OS is likely to terminate (so you'll be
terminating anyway).

Given a multitasking OS running more than one process, there is
no point in time at which you can know how many resources are available.
What was true a clock cycle ago may now be false.

So assuming that processing the inputs requires some resources that
vary with the input complexity, there is no point in time at which you
can *know* that there will be enough resources to process even the
simplest of inputs, unless processing it does not require you to
acquire any new resources at all.

Ergo: in order to validate an input and be sure that you will be able
to process it, you need to already have acquired all the resources
you will ever need to process this input.

Ergo: in order to document the input limits, you need to have already
acquired the resources that you will ever need in order to process the
limit.
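
(To make concrete what acquiring everything up front would mean, a
minimal sketch with invented worst-case figures; note that an
overcommitting kernel can still defeat it:)

#include <cstddef>
#include <vector>

// Invented worst-case figures, purely for illustration.
std::size_t const maxConcurrentJobs = 8;
std::size_t const maxBytesPerJob    = 64 * 1024 * 1024;

std::vector<char> workingPool;   // all the memory the jobs will ever use

void acquireWorstCaseUpFront()
{
    // Throws std::bad_alloc at startup if the documented limit cannot
    // actually be met, instead of at some arbitrary point later on.
    // (Zero-initialising the vector also touches the pages, which helps
    // against lazy commit, but an overcommitting kernel can still lie.)
    workingPool.resize(maxConcurrentJobs * maxBytesPerJob);
}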

Yan
 
A

Adam Skutt

This argument can go both ways, which you seem to refuse to accept: it
doesn't follow that:

1) The cost of trying to handle *some* OOM error, instead of
terminating, exceeds the value.

Then provide a reliable mechanism to distinguish OOM errors. Thus
far, you have been unable to do so. Otherwise, a reliable mechanism
must handle them all and treat them as the worst-case situation.
2) That no OOM errors whatsoever can be handled in a robust fashion.

I have never at any point said that or anything close to that. All
I've said is that it's rarely, if ever, worth it, and that it's not
nearly as easy to do robustly as most people here seem to believe.

Adam
 
A

Adam Skutt

Given a multitasking OS running more than one process, there is
no point in time at which you can know how many resources are available.
What was true a clock cycle ago may now be false.

There's no way for the application to generally know. There's plenty
of ways for the system designer to know or make more than reasonable
assurances, if it comes to that. The reality of the matter is one can
grant sufficient resources to the system such that exhaustion would be
considered catastrophic. Hence my second reason why you terminate:
the machine ran out of resources and some human being needs to add
more resources.

That's the best you can do, even preallocating resources. You don't
know if the kernel will overcommit, you don't know if the kernel will
run out of resources for itself, after all. Preallocation only works
if everything your program requires to run preallocates too. Modern,
mainstream operating system kernels do not preallocate most of their
resources. You should take that into serious consideration when
deciding what applications should do.
Ergo: in order to document the input limits, you need to have already
acquired the resources that you will ever need in order to process the
limit.

And that may prevent the program from running altogether. If your goal is
to serve as many requests as possible before failure, then this is worse
than what I suggested, not better. If there's not enough memory, then
you'll serve 0. I'll serve some number between 0 and all of them.

Adam
 
A

Adam Skutt

[limiting inputs] is what other engineering disciplines do,
Actually, they don't.  There's a good reason why soldiers are
required to break step when crossing a bridge.
And if you think that reason is a counterexample to what I said, then
you're simply crazy.  Walking over a bridge where you don't know how
much weight it is designed to support (nor can you) isn't relevant here;

it's not about weight, it's about resonance.

That's a common statement, but it's not really true. Perhaps
counter-intuitively, when bridges start to sway, people tend to match
the swaying, potentially exacerbating the problem anyway. While it is
certainly possible for marching cadence to cause resonance on a badly
built or damaged bridge, it's possible for it to happen even when step
is broken due to human nature. As a result, bridges have collapsed on
troops, most likely due to resonance, even when they heeded this
advice.
Though how the officer
estimates the bridge's resonant frequency is beyond me...

If the swaying is getting worse, you have a problem. Unless the
bridge is about to structurally fail anyway (in which case it won't
support the dynamic load), it will get pretty severe before the
failure. Unsurprisingly, when this has happened to marching troops in
history, there were plenty of warning signs before the actual
collapse.

Nevertheless, it's still not the least bit relevant to my point.

Adam
 
P

Paul

"Paul"  wrote in message

This is probably not part of the design of these programs. They run on a virtual
memory OS. If a program does not fit in RAM, the OS replaces some RAM with swap
space. Swap space is usually hard disk, which is thousands of times slower than
RAM. If RAM is added, the whole program fits in RAM, which makes it much
faster. No design work in these programs is needed to use virtual memory, because
it is an OS feature, not a program feature.
But let's say we have 3 scenarios:

1) 512MB physical RAM + 1GB swap space = 1.5GB total available RAM.
2) 2GB physical RAM + 2GB swap space = 4GB total available RAM.
3) 1.5GB physical RAM + 512MB swap space = 2GB total available RAM.

For each of these scenarios the program will have a different amount
of memory available for dynamic allocation. I don't know how this
works because the OS must reserve memory for other things, but let's
just say, for example:
Available within program for dynamic allocation:
1) 0.5GB
2) 2GB
3) 1GB

Yes, the page swapping thing can be an issue, but on a system with 4GB
physical RAM and 16MB swap space the page swapping is not the big
problem.
Even with Windows virtual memory paging there is still not an
unlimited supply of memory. A program whose performance is restricted
by low memory must check for bad allocations or it would crash all the
time.

low memory = long load time and lots of memory = fast load time.
How can it be faster to load more data into more memory?
 
A

Adam Skutt

rather than wait for OOM you could detect low memory conditions and
take evasive action (e.g. reject any further requests and concentrate
on the ones you've got).

If it were possible to reliably detect low memory conditions then I
don't think we'd be having this discussion at all. No, generally the
only way to find out if you can allocate memory is to try it.
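
(In code, "try it" is about the only primitive available; a minimal
sketch with an invented helper name, where even a positive answer is
stale the moment the function returns:)

#include <cstddef>
#include <new>

// Invented helper: returns true if 'n' bytes could be allocated at this
// very instant. Even then, nothing guarantees the next allocation will
// succeed -- another thread or process may take the memory first.
bool canAllocateRightNow(std::size_t n)
{
    void* p = ::operator new(n, std::nothrow);
    if (p == 0)
        return false;
    ::operator delete(p);
    return true;
}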

Adam
 
P

Paul

On Sep 6, 2:43 pm, yatremblay@bel1lin202.(none) (Yannick Tremblay)
wrote:
On 09/ 4/11 11:20 AM, James Kanze wrote:
On 09/ 2/11 04:37 PM, Adam Skutt wrote:
[...]
I agree. On a decent hosted environment, memory exhaustion is usually
down to either a system wide problem, or a programming error.
Or an overly complex client request.
Not spotting those is a programming (or specification) error!
And the way you spot them is by catching bad_alloc:).
No, you set upfront bounds on allowable inputs.  This is what other
engineering disciplines do, so I'm not sure why computer programmers
would do something different.  Algorithms that permit bounded response
to unbounded input are pretty rare in the grand scheme of things.
Even when they exist, they may carry tradeoffs that make them
undesirable or unsuitable (e.g., internal vs. external sort).
So, if I understand you correctly, you are saying that you must always
set up some artificial limits on the external inputs, and set them
artificially low, so that no matter what is happening in the rest of
the system, the program will never run out of resources....
This seems like a very bad proposition to me.  The only way to win is
to reserve and grab at startup time all of the resources you might
potentially ever need in order to meet the worst-case scenario of your
inputs.
This is not possible in the situation where a program is limited by
system memory. As a crude example a text editor opening a new window
to display each text file, the number of windows is limited by
available system RAM.

Not at all. Say that you simply load said text into memory (crude
approach, works for a massive amount of uses). If the file is 3 bytes,
chances are, you'll open thousands. If the file is a couple of megs,
you won't get there.
But the file size is unknown until, say, a user selects from a dialog
window.
You can't predict how many buffers and what size each buffer will need
to be.

The only way this system can really work is if you grab a memory pool
and then have some kind of allocation handler that processes
allocation/deallocation from the pool.
I don't know how you would know the size of this pool to initially
grab. Perhaps a trial and error until a bad alloc is thrown . :)
 
G

Goran

On Sep 6, 2:43 pm, yatremblay@bel1lin202.(none) (Yannick Tremblay)
wrote:
On 09/ 4/11 11:20 AM, James Kanze wrote:
On 09/ 2/11 04:37 PM, Adam Skutt wrote:
[...]
I agree. On a decent hosted environment, memory exhaustion is usually
down to either a system wide problem, or a programming error.
Or an overly complex client request.
Not spotting those is a programming (or specification) error!
And the way you spot them is by catching bad_alloc:).
No, you set upfront bounds on allowable inputs.  This is what other
engineering disciplines do, so I'm not sure why computer programmers
would do something different.  Algorithms that permit bounded response
to unbounded input are pretty rare in the grand scheme of things.
Even when they exist, they may carry tradeoffs that make them
undesirable or unsuitable (e.g., internal vs. external sort).
So, if I understand you correctly, you are saying that you must always
set up some artificial limits on the external inputs, and set them
artificially low, so that no matter what is happening in the rest of
the system, the program will never run out of resources....
This seems like a very bad proposition to me.  The only way to win is
to reserve and grab at startup time all of the resources you might
potentially ever need in order to meet the worst-case scenario of your
inputs.
This is not possible in the situation where a program is limited by
system memory. As a crude example a text editor opening a new window
to display each text file, the number of windows is limited by
available system RAM.
Not at all. Say that you simply load said text into memory (crude
approach, works for a massive amount of uses). If the file is 3 bytes,
chances are, you'll open thousands. If the file is a couple of megs,
you won't get there.

But the file size is unknown until, say, a user selects from a dialog
window.
You can't predict how many buffers and what size each buffer will need
to be.

The only way this system can really work is if you grab a memory pool
and then have some kind of allocation handler that processes
allocation/deallocation from the pool.
I don't know how you would know the size of this pool to initially
grab. Perhaps a trial and error until a bad alloc is thrown . :)

Sorry, I poorly explained myself there. I was arguing with something
that wasn't written.

What I wanted to say is that the number of windows you'll get to open
will vary wildly depending on the file size. I agree that one
can't predict anything.

Therefore, the best (and simplest if you ask me) way to proceed is to
try to allocate and do __not__ "handle" bad_alloc. I pretty much agree
with Skutt that "handling" OOM is impossible, especially not at the
spot where it occurred, because memory is possibly tightest there. In
the imaginary editor, imagine sequence of events:

ask the user which file to open
create "frame" window for the file
create, I dunno, borders, toolbar, whatever
create widget to host text
load the text (say, whole file into a buffer that you pass to the
widget for display)

In pseudo-C++, that might be:

auto_ptr<Frame> openFile(const char* name)
{
  auto_ptr<Frame> frame(new Frame());
  frame->Decorate();
  EditorWidget& e = frame->GetEditor();
  vector<char> text = LoadFile(name);
  e.SetText(text);
  return frame;
}

In the above, you allocate all sorts of stuff: frame, "decorations",
editor widget inside the frame. (I presume that frame "owns" that, and
GetEditor creates actual EditorWidget e.g. on demand, therefore it
gives it out as a reference). I also presume that LoadFile is a
function that loads a file into vector<char>. I presume that any
function you see throws an exception in case of any problem.

I say that the above code is resilient to resource shortage, and that,
if there is a resource shortage at any point bar the "new Frame()" line,
it will nicely clean up behind itself and leave you with at least some resources.
You can call this as much as you like and you'll be fine. No arena
allocators, no pools, no try/catch, no nothing. I further say that C++
makes it +/- easy to write similarly correct code.

Finally, I say: boy, did I go off on a tangent here...

Goran.
 
P

Paul

On Sep 6, 2:43 pm, yatremblay@bel1lin202.(none) (Yannick Tremblay)
wrote:
On 09/ 4/11 11:20 AM, James Kanze wrote:
On 09/ 2/11 04:37 PM, Adam Skutt wrote:
[...]
I agree. On a decent hosted environment, memory exhaustion is usually
down to either a system wide problem, or a programming error.
Or an overly complex client request.
Not spotting those is a programming (or specification) error!
And the way you spot them is by catching bad_alloc:).
No, you set upfront bounds on allowable inputs.  This is what other
engineering disciplines do, so I'm not sure why computer programmers
would do something different.  Algorithms that permit bounded response
to unbounded input are pretty rare in the grand scheme of things.
Even when they exist, they may carry tradeoffs that make them
undesirable or unsuitable (e.g., internal vs. external sort).
So, if I understand you correctly, you are saying that you must always
set up some artificial limits on the external inputs, and set them
artificially low, so that no matter what is happening in the rest of
the system, the program will never run out of resources....
This seems like a very bad proposition to me.  The only way to win is
to reserve and grab at startup time all of the resources you might
potentially ever need in order to meet the worst-case scenario of your
inputs.
This is not possible in the situation where a program is limited by
system memory. As a crude example a text editor opening a new window
to display each text file, the number of windows is limited by
available system RAM.
Not at all. Say that you simply load said text into memory (crude
approach, works for a massive amount of uses). If the file is 3 bytes,
chances are, you'll open thousands. If the file is a couple of megs,
you won't get there.
But the file size is unknown until, say, a user selects from a dialog
window.
You can't predict how many buffers and what size each buffer will need
to be.
The only way this system can really work is if you grab a memory pool
and then have some kind of allocation handler that processes
allocation/deallocation from the pool.
I don't know how you would know the size of this pool to initially
grab. Perhaps a trial and error until a bad alloc is thrown . :)

Sorry, I poorly explained myself there. I was arguing with something
that wasn't written.

What I wanted to say is that the number of windows you'll get to open
will vary wildly depending on the file size. I agree that one
can't predict anything.

Therefore, the best (and simplest if you ask me) way to proceed is to
try to allocate and do __not__ "handle" bad_alloc. I pretty much agree
with Skutt that "handling" OOM is impossible, especially not at the
spot where it occurred, because memory is possibly tightest there. In
the imaginary editor, imagine sequence of events:

ask the user which file to open
create "frame" window for the file
create, I dunno, borders, toolbar, whatever
create widget to host text
load the text (say, whole file into a buffer that you pass to the
widget for display)

In pseudo-C++, that might be:

auto_ptr<Frame> openFile(const char* name)
{
  auto_ptr<Frame> frame(new Frame());
  frame->Decorate();
  EditorWidget& e = frame->GetEditor();
  vector<char> text = LoadFile(name);
  e.SetText(text);
  return frame;
}

In the above, you allocate all sorts of stuff: frame, "decorations",
editor widget inside the frame. (I presume that frame "owns" that, and
GetEditor creates actual EditorWidget e.g. on demand, therefore it
gives it out as a reference). I also presume that LoadFile is a
function that loads a file into vector<char>. I presume that any
function you see throws an exception in case of any problem.

I say that the above code is resilient to resource shortage, and that,
if there is a resource shortage at any point bar the "new Frame()" line,
it will nicely clean up behind itself and leave you with at least some resources.
You can call this as much as you like and you'll be fine. No arena
allocators, no pools, no try/catch, no nothing. I further say that C++
makes it +/- easy to write similarly correct code.

Finally, I say: boy, did I go off on a tangent here...
Well TBH I don't know WTF you are talking about but if you had such an
app I would think the sensible resolution would be to stop opening
windows. Display a message to the user and say no more windows until you
close some.
 
G

Goran

On Sep 6, 2:43 pm, yatremblay@bel1lin202.(none) (Yannick Tremblay)
wrote:
On 09/ 4/11 11:20 AM, James Kanze wrote:
On 09/ 2/11 04:37 PM, Adam Skutt wrote:
[...]
I agree. On a decent hosted environment, memory exhaustion is usually
down to either a system wide problem, or a programming error.
Or an overly complex client request.
Not spotting those is a programming (or specification) error!
And the way you spot them is by catching bad_alloc:).
No, you set upfront bounds on allowable inputs.  This is what other
engineering disciplines do, so I'm not sure why computer programmers
would do something different.  Algorithms that permit bounded response
to unbounded input are pretty rare in the grand scheme of things.
Even when they exist, they may carry tradeoffs that make them
undesirable or unsuitable (e.g., internal vs. external sort).
So, if I understand you correctly, you are saying that you must always
set up some artificial limits on the external inputs, and set them
artificially low, so that no matter what is happening in the rest of
the system, the program will never run out of resources....
This seems like a very bad proposition to me.  The only way to win is
to reserve and grab at startup time all of the resources you might
potentially ever need in order to meet the worst-case scenario of your
inputs.
This is not possible in the situation where a program is limited by
system memory. As a crude example a text editor opening a new window
to display each text file, the number of windows is limited by
available system RAM.
Not at all. Say that you simply load said text into memory (crude
approach, works for a massive amount of uses). If the file is 3 bytes,
chances are, you'll open thousands. If the file is a couple of megs,
you won't get there.
But the file size is unknown until, say, a user selects from a dialog
window.
You can't predict how many buffers and what size each buffer will need
to be.
The only way this system can really work is if you grab a memory pool
and then have some kind of allocation handler that processes
allocation/deallocation from the pool.
I don't know how you would know the size of this pool to initially
grab. Perhaps a trial and error until a bad alloc is thrown . :)
Sorry, I poorly explained myself there. I was arguing with something
that wasn't written.
What I wanted to say is that the number of windows you'll get to open
will vary wildly depending on the file size. I agree that one
can't predict anything.
Therefore, the best (and simplest if you ask me) way to proceed is to
try to allocate and do __not__ "handle" bad_alloc. I pretty much agree
with Skutt that "handling" OOM is impossible, especially not at the
spot where it occurred, because memory is possibly tightest there. In
the imaginary editor, imagine sequence of events:
ask the user which file to open
create "frame" window for the file
create, I dunno, borders, toolbar, whatever
create widget to host text
load the text (say, whole file into a buffer that you pass to the
widget for display)
In pseudo-C++, that might be:
auto_ptr<Frame> openFile(const char* name)
{
  auto_ptr<Frame> frame(new Frame());
  frame->Decorate();
  EditorWidget& e = frame->GetEditor();
  vector<char> text = LoadFile(name);
  e.SetText(text);
  return frame;
}

In the above, you allocate all sorts of stuff: frame, "decorations",
editor widget inside the frame. (I presume that frame "owns" that, and
GetEditor creates actual EditorWidget e.g. on demand, therefore it
gives it out as a reference). I also presume that LoadFile is a
function that loads a file into vector<char>. I presume that any
function you see throws an exception in case of any problem.
I say that the above code is resilient to resource shortage, and that,
if there is a resource shortage at any point bar the "new Frame()" line,
it will nicely clean up behind itself and leave you with at least some resources.
You can call this as much as you like and you'll be fine. No arena
allocators, no pools, no try/catch, no nothing. I further say that C++
makes it +/- easy to write similarly correct code.
Finally, I say: boy, did I go off on a tangent here...

Well TBH I don't know WTF you are talking about but if you had such an
app I would think the sensible resolution would be to stop opening
windows. Display a message to the user and say no more windows until you
close some.

I'll try to clarify (my snippet is full of presumptions, I thought
they were +/- obvious; I am pretty much certain they are reasonable).

Suppose that this app has a "generic" exception handler (UI toolkits
do have such a thing in their UI-handling message loops). Typically,
said loop would receive a command-type message ("open a file") from
the user. That would end up in some function that asks for the file
name and then, possibly, my openFile would get called. Say that
openFile should return a pointer to a "Frame" (window) object that
displays the file.

So what happens if something, anything, goes wrong in openFile? Well,
nothing bad. An exception is thrown, hopefully containing, one way or
another, info on what has gone wrong. See that auto_ptr<Frame> there?
That ensures that, if an exception is thrown, the newly allocated Frame
instance will be deleted. See that vector<char> returned by LoadFile?
That ensures that whatever storage might have been allocated for the file
contents will be freed. I claim: similar logic can be +/- trivially
applied to any bit of code for it to be error-resilient (a.k.a.
exception-safe).
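
(For illustration only, a minimal, toolkit-free sketch of that "generic
handler in the message loop" idea; the command handler and its
deliberately absurd allocation are invented stand-ins:)

#include <cstddef>
#include <exception>
#include <iostream>
#include <string>
#include <vector>

// Invented stand-in for a command handler that may throw anything,
// including std::bad_alloc, while processing one user command.
void handleCommand(const std::string& cmd)
{
    if (cmd == "big")
    {
        // Deliberately absurd request so the allocation fails.
        std::vector<char> buffer(static_cast<std::size_t>(-1) / 2);
        buffer[0] = 0;
    }
    std::cout << "handled: " << cmd << std::endl;
}

int main()
{
    std::string cmd;
    // The "message loop": a failed command is reported and dropped,
    // everything already open stays open, and the loop carries on.
    while (std::getline(std::cin, cmd) && cmd != "quit")
    {
        try
        {
            handleCommand(cmd);
        }
        catch (const std::exception& e)
        {
            std::cout << "command failed: " << e.what() << std::endl;
        }
    }
    return 0;
}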

About your "no more windows until you close some" idea: there's no
__need__ for that. What if the file was too big for the current system
state, and what if the user could open a smaller one? There's no __need__
to prevent further file opening, because:

1. there is __no__ harm in trying
2. some other file might open fine.

The key thing here, and it has been from the very start: the exception safety
guarantees of any bit of code must be correct. For example, the
openFile function has the "strong" exception guarantee: either it works as
described, or there has been an error and any temporary changes
that might have been made to the program were rolled back (e.g. the
allocated Frame, EditorWidget, and file contents buffer were all freed).

C++, in particular, offers enough mechanisms to make writing code,
well-designed WRT exception safety guarantees, a reasonably easy
affair.

Goran.
 
N

none

There's no way for the application to generally know. There's plenty
of ways for the system designer to know or make more than reasonable
assurances, if it comes to that. The reality of the matter is one can
grant sufficient resources to the system such that exhaustion would be
considered catastrophic. Hence my second reason why you terminate:
the machine ran out of resources and some human being needs to add
more resources.

That's the best you can do, even preallocating resources. You don't
know if the kernel will overcommit, you don't know if the kernel will
run out of resources for itself, after all. Preallocation only works
if everything your program requires to run preallocates too. Modern,
mainstream operating system kernels do not preallocate most of their
resources. You should take that into serious consideration when
deciding what applications should do.


And that may prevent the program from running altogether. If your goal is
to serve as many requests as possible before failure, then this is worse
than what I suggested, not better. If there's not enough memory, then
you'll serve 0. I'll serve some number between 0 and all of them.

I am confused. You are the one who said that in order to avoid OOM
errors due to input complexity, all you need to do is to set up upfront
limits on allowable input (since this is what other engineering
disciplines do).

My reply simply highlighted that this is impossible to do correctly
unless you pre-acquire all resources you will ever need.

Are you now suggesting that you should not set upfront limits?
 
A

Adam Skutt

I am confused.  You are the one who said that in order to avoid OOM
errors due to input complexity, all you need to do is to set up upfront
limits on allowable input (since this is what other engineering
disciplines do).

My reply simply highlighted that this is impossible to do correctly
unless you pre-acquire all resources you will ever need.  

And my reply demonstrated how and why this is incorrect.
Are you now suggesting that you should not set upfront limits?

No, I'm suggesting that trying to preallocate the resources for those
limits is normally pointless. I'm saying that if you choose to serve
a max of 100 connections transferring no more than 5MB of data (so
roughly a 500MB limit) and then run it on a system that doesn't have
at least that much memory, you deserve what you get. This is obvious
and goes without saying, even if I didn't mention it explicitly
before.
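
(For concreteness, the check itself is the easy part; a minimal sketch
with invented names, where making sure the host really has the ~500MB is
a deployment decision:)

#include <cstddef>

std::size_t const maxRequestBytes = 5 * 1024 * 1024;   // the 5MB from above

// Invented helper: reject an over-limit request before allocating a
// single byte for it.  Whether the machine really has room for 100
// connections at this size is decided when the system is deployed.
bool acceptRequest(std::size_t declaredSize)
{
    return declaredSize <= maxRequestBytes;
}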

Preventing OOM from occurring is a systems-level problem. You cannot
meaningfully solve it while discussing a single software application
removed from the system it runs on. You have to be able to make
certain assumptions about systems-level behavior in order to come up
with a workable solution.

Adam
 
N

none

Then provide a reliable mechanism to distinguish OOM errors. Thus
far, you have been unable to do so. Otherwise, a reliable mechanism
must handle them all and treat them as the worst-case situation.

Sorry for the long discussion below, but short, simplified, and not fully
explicit answers have previously been met with dismissal based on
generalities:

Thus far, I have never attempted to provide a mechanism to
automatically distinguish OOM errors with no other information
whatsoever. Thus far I have posted a small code sample with a catch()
following a new that was known to be potentially large. You simply
questioned that you didn't know what "large" was, hence the example
was invalid. My answer is that I know it is potentially large because
it was designed that way.

Someone has posted experiences of observing an application recovering
from OOM errors. I have also done the same and repeatedly tested it
by purposefully triggering OOM errors (yes, feeding purposefully
designed inputs to a real application in such a way that the application
eventually uses all of the available memory on a system and making
sure it still recovers).

The key is design. Design your application so that you *know*
where the safe points of failure are. You *know* how to cancel a job
safely. You *know* where the application will attempt to allocate a
large amount of memory and design this area in such a way that
recovery is possible *if* the failure is due to the requested
allocation being too large *and* much larger than what is normal.

You are the designer. You should be able to design the code in such a
way that you can ensure that "large" *potentially* recoverable
allocation happens at a known location in the code.

You keep saying "how can you distinguish recoverable vs
non-recoverable allocation?" The answer is design. You design your
application in such a way that you *plan* where the recoverable
allocations will happen.

So here is a compilable example that recovers safely from bad_alloc:

----------------------------------------------
#include <cstddef>
#include <iostream>
#include <new>

size_t const multiplier = 1000000;

bool doIt(size_t size)
{
    int * p = 0;
    try
    {
        p = new int[multiplier * size];
    }
    catch(std::bad_alloc &e)
    {
        // NOTE: I purposefully use iostream here.
        // I realise that this may allocate memory underneath
        // and is not a guaranteed nothrow operation, but here
        // it highlights that memory is available.
        // In practice, you would probably not do it and just
        // try to log in a *safe* way and
        // start unrolling the stack.
        std::cout << "Error: " << e.what() << std::endl;
        // allocate a bit anyway.
        int *z = new int[10];
        std::cout << "Wow, new[] still works" << std::endl;
        delete[] z;
        return false;
    }
    int * q = new int[5];

    delete[] q;
    delete[] p;
    std::cout << "Did it " << size << std::endl;
    return true;
}

int main()
{
    size_t size;

    for(;;)
    {
        std::cout << "Please enter allocation size: " << std::endl;
        std::cin >> size;

        if(!doIt(size))
        {
            std::cout << "Error happened for " << size << std::endl;
        }
        else
        {
            std::cout << "Job done for " << size << std::endl;
        }
    }
}
-------------------------------------------

This is a simplistic example but the principles are usable in a larger
application and even in a multithreaded application and it would not
matter if the try/catch was five functions up the stack or directly around
the new.

Note that if the std::bad_alloc happens anywhere outside the
purposefully *designed* *potentially recoverable* area (i.e. when
allocating for q or z, or within iostream, even including the one in the
catch), then the program will terminate. This is also as-designed.

So the result:

The program can recover from OOM errors if they occur at a
particular location in the code where the designer planned to do
large allocations.

The program behaves as you argued for when the OOM error occurs
elsewhere, and terminates.

The program can process all inputs while only being limited by the
actual current resource limits on the host (not some artificial
limits).

The program can *potentially* recover from OOM errors if they are due
to input complexity. If recovery is successful, it can continue
processing new, less complex inputs safely. If recovery is not
successful, it will simply terminate.

The program will terminate on OOM errors that happen elsewhere, which are
*really* unexpected and which the designer could not know how to handle.

IMO, the value of implementing this purposefully localised and
targeted OOM error recovery exceeds its cost.

I have never at any point said that or anything close to that. All
I've said is that it's rarely, if ever, worth it, and that it's not
nearly as easy to do robustly as most people here seem to believe.

IMNSHO, this is not as difficult as you suggest if you carefully
design the system for this purpose.

Yannick
 
A

Adam Skutt

Thus far, I have never attempted to provide a mechanism to
automatically distinguish OOM errors with no other information
whatsoever.  Thus far I have posted a small code sample with a catch()
following a new that was known to be potentially large.  You simply
questioned that you didn't know what "large" was, hence the example
was invalid.  My answer is that I know it is potentially large because
it was designed that way.

Which is what I told you. It's still not sufficiently general to
support the counterargument, "You only have to handle some of them".
Besides, the reality of the matter is that only having to handle some of
them does not make things one iota easier, and I'm not sure why you
think it would make things easier. The difficulty is in figuring out
what goes inside the catch block, not in figuring out where to place
the damn thing (though it's hardly as simple as you think it is).

Regardless, like it or not, the burden on you to support your claim is
as I described. It's not my fault you're making claims you cannot
support.
The key is design.  Design your application so that you *know*
where the safe points of failure are.  You *know* how to cancel a job
safely.

There are plenty of systems where such things are impossible or simply
more difficult than they are worth. This is why we have multiple
levels of exception safety guarantees. There may not be such thing as
a "safe point of failure". All of my exception safety guarantees may
be weak.
You *know* where the application will attempt to allocate a
large amount of memory and design this area in such a way that
recovery is possible *if* the failure is due to the requested
allocation being too large *and* much larger than what is normal.

You are the designer. You should be able to design the code in such a
way that you can ensure that "large" *potentially* recoverable
allocation happens at a known location in the code.

I don't know why you keep returning to this point. 'Too large' is not
the only reason to see an allocation failure. On modern operating
systems with virtual memory, it may not even be the primary reason
to see an allocation failure[1]. Heap fragmentation / allocator
limitations are far more likely to cause an OOM condition. Such
issues equally affect small and large allocations: it depends entirely
on the algorithm used to allocate memory. Do you know what your
std::allocator does? It probably doesn't do what you think it does.
Likewise, malloc() probably doesn't behave the way you think it does
either.

I may never, ever see an error because the allocation was "too
large". It's poor justification for going through the effort of
writing an OOM handler. Size alone does not tell me which allocations
in the application will fail.
You keep saying "how can you distinguish recoverable vs
non-recoverable allocation?"  The answer is design.  You design your
application in such a way that you *plan* where the recoverable
allocations will happen.  

As I've stated many times: where one thinks the allocation failures
will happen and where they will actually happen are two entirely
different things. Especially when executing threaded code. Simply
*saying* design does not tell me what I need to know. You haven't yet
told me all the factors I need to consider in my design. There are
plainly more factors than how much memory a particular request makes
and why the request is being made.

It's no different from optimization: just because you claim the
hotspot is 'X' does not actually mean the hotspot is 'X'. Just
because you say, "The program will run out of memory here" does not
mean that will ever actually happen in practice.
     // NOTE: I purposefully use iostream here.
     // I realise that this may allocate memory underneath
     // and is not a guaranteed nothrow operation, but here
     // it highlights that memory is available.

No, it doesn't highlight that memory is available in general. It
highlights memory was available in whatever asinine test cases you
came up with, or that memory was not allocated in that particular
situation. I'm not sure why you think an inductive proof has any
value here whatsoever. I'm also not sure why you think such a
simplistic example has any value whatsoever.

The goal is to improve robustness by handling OOM. It was already
stated that using iostreams as-is will not do this, so I'm not sure
what you hoped to prove by writing this example. Write an example
that actually improves robustness. And since you claimed this can be
done in a multi-threaded program without impacting the other threads,
do that too. Otherwise, you haven't demonstrated anything of any
value whatsoever.

Just for grins, try allocating all that space one byte at a time (go
ahead and leak it), so you actually fill up the freestore before
making the failed allocation. Then see how much space you have
available, if you don't outright crash your computer[2][3].
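
(For the curious, the experiment described above is only a few lines; a
sketch, assuming you cap the process first, e.g. with ulimit -v, since it
exhausts memory on purpose and an overcommitting kernel may kill the
process instead of throwing:)

#include <iostream>
#include <new>

int main()
{
    // Leak one-byte allocations until the freestore is exhausted.
    unsigned long long count = 0;
    try
    {
        for (;;)
        {
            new char;          // deliberately leaked
            ++count;
        }
    }
    catch (const std::bad_alloc&)
    {
        // Note: even this reporting may fail if memory is truly gone.
        std::cout << "bad_alloc after " << count
                  << " one-byte allocations" << std::endl;
    }

    // Now see what, if anything, can still be allocated.
    try
    {
        delete[] new char[1024];
        std::cout << "a 1KiB allocation still succeeded" << std::endl;
    }
    catch (const std::bad_alloc&)
    {
        std::cout << "even a 1KiB allocation now fails" << std::endl;
    }
    return 0;
}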
This is a simplistic example but the principles are usable in a larger
application and even in a multithreaded application and it would not
matter if the try/catch was five functions up the stack or directly around
the new.

Such a simplistic handler will not protect other threads from failing
during stack unwind for the original std::bad_alloc. How can it
possibly do so?
The program can process all inputs while only being limited by the
actual current resource limits on the host (not some artificial
limits).

Nope. Again, do you know what your std::allocator does?
IMNSHO, this is not as difficult as you suggest if you carefully
design the system for this purpose.

You haven't yet designed a system for this purpose, so your opinion is
worth nothing.

Adam

[1] On a modern 64-bit operating system, it's possible to never see an
allocation failure for this reason. The address space of your process
almost certainly exceeds your commit limit by several orders of
magnitude (40 to 48-bits vs. 3X bits). Your process has good chance
of just crashing the system or getting stopped by an OOM killer
instead of seeing the failure. What will exactly happen depends on a
huge number of factors not worth listing here (tbh, I'm not 100%
confident I can even list them all).

[2] When I ran such a test on one system I have access to, I could not
allocate even a single int on the freestore once std::bad_alloc was
thrown. It's also worth noting that the operating system claimed the
process had filled the virtual address space, but the amount of memory
my code allocated was considerably less (3GiB vs. ~800MiB). Asking
for quarter pages (1KiB on my system) raised the amount returned to my
code to very close to the virtual address space limit. This simply
goes to show, further, that making assumptions about how operator new/malloc
behave can be quite dangerous. It means it is quite possible for
"big" allocations to succeed when "little" allocations fail (proof
left as an exercise to the reader)!

[3] This is of course one reason that restarting is inherently
superior to handling OOM: if your program does leak memory, then
restarting the program will get that memory back. Plus, you will
eventually have to take it out of service anyway to plug the leak for
good.
 
N

none

On Sep 13, 1:06pm, yatremblay@bel1lin202.(none) (Yannick Tremblay)
wrote:
Which is what I told you. It's still not sufficiently general to
support the counterargument, "You only have to handle some of them".

You are being obtuse and refuse to even try to understand.

I design an application.
I specifically do the design in such a way that I know where "large"
memory allocations occur. In fact I design it so that I
purposefully do one large allocation rather than allocating drip-by-drip.

There is no need to be able to recover from all allocation failures.
You seem to be the one who claims that because it is impossible to
recover from all possible allocation failures, then you should never
ever under any circumstances even consider attempting to recover from
any allocation failure whatsoever. I disagree, and the code sample
posted demonstrates that this is possible.

You said it yourself, if memory is really exhausted and a small
allocation really fails, then attempting to recover is really hard. So
you are correct that attempting to recover from all allocation
failures is a bad idea.

However, I can design an application so that it can recover from some
specific allocation failure if I use my brain to design it correctly.
Besides, the reality of the matter is that only having to handle some of
them does not make things one iota easier, and I'm not sure why you
think it would make things easier. The difficulty is in figuring out
what goes inside the catch block, not in figuring out where to place
the damn thing (though it's hardly as simple as you think it is).

See the code sample supplied, which you purposefully ignore (well, almost
ignore).

1- It works
2- Something is being done in the catch block

No problem. Not difficult. I am not sure what your problem with
it is.
Regardless, like it or not, the burden on you to support your claim is
as I described. It's not my fault you're making claims you cannot
support.

FFS! Code supplied. Claim supported. Did you run it? It works. More
than one person has told you that they have tested a similar thing in
the real world.

My claim that it is possible to recover from some allocation failures
is fully supported.

I have no idea what claim you claim that I made that you claim that I
cannot support.

Clarification before you try to claim that I claim other things:
My claims:
"It is possible to recover from some allocation failures"
"It is possible to design an application in such a way so that there
are suitable recovery points"
"It is possible to design an application in such a way that you plan
to do potentially large allocations in one particular area and,
given this knowledge and this planning, it is safe to *attempt* to
recover"
"On a case by case basis, the cost of handling some OOM error may be
justified by the value"

As far as I can understand, you seem to make the generic claim that
recovering from any allocation failure whatsoever is impossible to do
and never worth doing. This is invalid. You can disprove an "it's
impossible to do" by simply supplying one example where it works.

BTW: I am not disputing that it is very difficult (or virtually
impossible) to write a generic allocation failure handler that will
always successfully recover from all allocation failures regardless of
the cause.
There are plenty of systems where such things are impossible or simply
more difficult than they are worth. This is why we have multiple
levels of exception safety guarantees. There may not be such thing as
a "safe point of failure". All of my exception safety guarantees may
be weak.

I never claimed that it is always possible for all systems. I claim
that one can design a system where it is possible in restricted cases.

The fact that some system may not have any "safe point of failure"
does not mean that it is impossible to design a specific system to
have safe points of failure.

Are you claiming that it is impossible to design any system with safe
points of failure?
I don't know why you keep returning to this point.

Because it seems you are refusing to understand.
'Too large' is not
the only reason to see an allocation failure.

In the system as designed, you *know* that in this particular location,
the allocation size is possibly too large (because it depends on external
input). You *know* that because you designed it that way.

Hence two possibilities:

1- The allocation failure was due to the requested allocation being
too large. Then the recovery attempt will succeed and it is fine to
continue.

2- The allocation failure was not due to the requested allocation
being too large and instead due to allocation being totally impossible
on the system now. Then the recovery attempt will fail and the
program will terminate.

Essentially you are advocating assuming that you are, and always will
be, in situation #2. I am advocating that, given good
design, expertise, and knowledge, you can know where it is worth
checking whether you are in situation #1 and where it may be worth attempting to
recover.

Of course, that also leaves allocation failures that happen elsewhere in
the program. In that case, the result will be as you advocate and the
application will terminate. As designed.
On modern operating
systems with virtual memory, it may not even be the primary reason
to see an allocation failure[1]. Heap fragmentation / allocator
limitations are far more likely to cause an OOM condition. Such
issues equally affect small and large allocations: it depends entirely
on the algorithm used to allocate memory. Do you know what your
std::allocator does? It probably doesn't do what you think it does.
Likewise, malloc() probably doesn't behave the way you think it does
either.

Can we at least agree that the application can't know if an allocation
failed because of heap fragmentation, allocator limitations, maximum
per-process OS-enforced limits, or the OS actually having run out of memory
altogether? The visible result for the application will be the same.
So this is irrelevant to the discussion. For what it's worth, the
posted code will work as advertised on a system with little physical
RAM and disabled virtual memory.

The included code demonstrates that on a modern operating system, it is
possible to design an application that may have to allocate memory
depending on external input (unknown at compile time), on a system with
unknown currently available resources, and that it can be designed in such
a way that in some particular location in the code, allocation failures are
likely to be primarily caused by "large" allocation requests.

The posted code does not in any way attempt to demonstrate that the
application will always be able to recover from all possible causes of
OOM errors at any possible place in the code.
I may never, ever see an error because the allocation was "too
large". It's poor justification for going through the effort of
writing an OOM handler. Size alone does not tell me which allocations
in the application will fail.

For your application, this may be the case. For other applications,
this may not be the case.
As I've stated many times: where one thinks the allocation failures
will happen and where they will actually happen are two entirely
different things.

You choose to quit at the first hurdle. I choose to at least attempt
to jump. If I fail, no loss. I am no worse than you. If I succeed,
I live to fight another day.

See the posted code. The application is designed so that allocation
failures due to input complexity happen where planned.

Other allocation failures may happen elsewhere. The way an allocation
failure happening elsewhere is handled will be different than the way
it will be treated if it happened in the purposefully designed
"recoverable" section.
Especially when executing threaded code. Simply
*saying* design does not tell me what I need to know. You haven't yet
told me all the factors I need to consider in my design. There are
plainly more factors than how much memory a particular request makes
and why the request is being made.

I am not the designer of your application so I can't know all
the factors that need to be considered for *your* design. I have no
idea of *your* requirements.

I will leave it to you as an exercise to modify the code previously
posted so that it can run multithreaded and have recovery points.
It's no different from optimization: just because you claim the
hotspot is 'X' does not actually mean the hotspot is 'X'. Just
because you say, "The program will run out of memory here" does not
mean that will ever actually happen in practice.

Do you like trying to put claims in other people's mouths?

I am saying: I design the program so that potentially large allocation
happens here and given this design I choose to attempt to recover *if*
the allocation failure happened there.

Question: In the posted simplistic example, does the hotspot happen
where planned?

If the program has an allocation failure elsewhere, the result is the
same as what you preach: terminate. So no loss whatsoever for other
allocation failures. Gain for some specific allocation failures.
No, it doesn't highlight that memory is available in general. It
highlights memory was available in whatever asinine test cases you
came up with, or that memory was not allocated in that particular
situation. I'm not sure why you think an inductive proof has any
value here whatsoever. I'm also not sure why you think such a
simplistic example has any value whatsoever.

A simplistic example can disprove a generality.
You argue against ever attempting to recover from OOM because it's so
much more difficult than anyone can imagine. That it's so difficult
that it is never ever worth even attempting.

The simplistic example demonstrates that it is perfectly possible and
not necessarily complex in specific cases if you design your
application this way.

If you choose to design your application in such a way that nowhere is
it possible to recover from an allocation failure, it is *your* choice.
The goal is to improve robustness by handling OOM. It was already
stated that using iostreams as-is will not do this, so I'm not sure
what you hoped to prove by writing this example. Write an example
that actually improves robustness. And since you claimed this can be
done in a multi-threaded program without impacting the other threads,
do that too. Otherwise, you haven't demonstrated anything of any
value whatsoever.

Bullshit! (sorry for the rudeness but you asked for it)

The example supplies improved robustness, since even after an OOM error
the program recovers and can keep processing further inputs.

The example can be extended to multiple threads using the same principle.
It's simple to do. Can you do it?
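
(For what it's worth, a minimal sketch of what such an extension might
look like, with invented sizes and thread count, using C++11 threads for
brevity; this sketches the pattern, it does not prove its robustness:)

#include <cstddef>
#include <iostream>
#include <new>
#include <thread>
#include <vector>

// Same idea as doIt() above: the one known-large, input-driven
// allocation is the only place recovery is attempted.
bool processJob(std::size_t elements)
{
    int* p = 0;
    try
    {
        p = new int[elements];        // the planned "large" allocation
    }
    catch (const std::bad_alloc&)
    {
        return false;                 // reject this job; the thread lives on
    }
    // ... do the actual work here ...
    delete[] p;
    return true;
}

int main()
{
    // One deliberately impossible request in the middle.
    std::size_t const sizes[] =
        { 1000, static_cast<std::size_t>(-1) / sizeof(int), 2000 };

    std::vector<std::thread> workers;
    for (std::size_t s : sizes)
        workers.push_back(std::thread([s] {
            std::cout << (processJob(s) ? "job done\n" : "job rejected\n");
        }));

    for (std::size_t i = 0; i < workers.size(); ++i)
        workers[i].join();
    return 0;
}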

You keep claiming that everyone else is giving unsupported claims.
I gave you an example supporting my claims. The example demonstrates
that it is possible to recover from some OOM errors. What about you
proving your claims?
Just for grins, try allocating all that space one byte at a time (go
ahead and leak it), so you actually fill up the freestore before
making the failed allocation. Then see how much space you have
available, if you don't outright crash your computer[2][3].

So you are advocating that I should design my application in such a
way that OOM errors are always fatal. In such a way that I
purposefully micro-allocate lots and lots of memory so that failure
will most likely happen at totally random places. Euh?!? Well, the
consequence will be that OOM errors will always be fatal. That's the
way the application is designed.

I chose to design the application so that some OOM errors can be
handled. Is there a law against good design and an obligation to
always enforce stupid design?

I am not sure I understand why, in order to support your argument that
it is never possible to ever recover from an OOM error, everyone
should always design their applications so that they are purposefully
unable to recover from an OOM error.

The key to the design is precisely that it does not allocate memory
one byte at a time. Because of this design, the application will never
run out of memory altogether purely due to input complexity. It
doesn't matter if it is multi-threaded or single threaded. The
application uses malloc/new/allocator in a similar way as you
suggested to use predefined arbitrary limits: instead of checking that
the input are of lower complexity than X, it "check that I have enough
resources to process this input and if so, reserve them immediately".
The std::bad_alloc is just the allocator answering "no" to the first
question. The failed "new" does not affect the available resources,
it fails to acquire them.
Such a simplistic handler will not protect other threads from failing
during stack unwind for the original std::bad_alloc. How can it
possibly do so?

Given your claimed superior knowledge of allocators, please enlighten us
on why the same pattern would fail in a multithreaded setup.

Can you clarify why thread #1 *failing* to allocate a large block of
memory directly stops thread #2 from being able to allocate a small
block of memory? Are your allocators not thread-safe?

It will work as advertised. Assuming you design your multithreaded
application intelligently and build in places where recovery is
possible. If the allocation failure was due to the size of the allocation,
the other threads will keep working fine. If the allocation failure
was due to another reason, the program will terminate.

You choose to believe it is not possible. I know it is possible (in
limited circumstance, for specific cases, if you design carefully,
even in multithreaded applications).

Obviously, if you design your application to leak memory one byte at a
time, the application is likely to fail at any random point and will
most probably not be able to recover. That's your design and your
choice.
You haven't yet designed a system for this purpose, so your opinion is
worth nothing.

This would be worth a rude reply but I'll skip it.

BTW: all your claims so far are unsupported. Kettle, pot, black?
[3] This is of course one reason that restarting is inherently
superior to handling OOM: if your program does leak memory, then
restarting the program will get that memory back. Plus, you will
eventually have to take it out of service anyway to plug the leak for
good.

If your program leaks memory, you should fix it. Not rely on periodic
restarts. Sorry, but IMO, crashing and restarting is inherently
inferior to not crashing and staying in a fully stable state. We will
have to agree to disagree, but I doubt users seeing the app crash will
be particularly happy.

Are you now recommending that just in case an application may have
been written by an incompetent programmer that leaks memory, every
persistent service application in the world should always be restarted
periodically?

BTW: designing your application so that it attempts to recover from
some errors does not mean that you can't also have a monitor that
restarts the application if it crashes.

Sorry about the tone of some of my comments but the style of your
answer annoyed me. I think the discussion is worthwhile and
interesting.

Yannick
 
A

Adam Skutt

You are being obtuse and refuse to even try to understand.  

No, I understand your code and your point perfectly. You're simply
wrong, because you have zero evidence that any real application will
behave like your little example. Real applications can behave in a
myriad of ways, including in the two I suggested: total failure to
allocate memory, and where "little" allocations fail while
"large" allocations do not.

Your example plainly does not work as intended in either of those two
situations. More importantly, it does not achieve your original
stated goal: to improve the robustness of the application and isolate
the failure of one piece of processing from impacting the others.

Your lack of understanding about how programs, runtimes, and operating
systems allocate memory and refusal to consider the situations I've
posed does not make me obtuse.

The fact you're unwilling or unable to accept that the code may never
throw std::bad_alloc where you claim it will is simply gobsmacking.
You've provided zero evidence to believe that the code will fail where
you claim it will with any regularity. As such, we have no reason
whatsoever to believe your design will improve robustness. Even if we
believed you have managed to place the catch block correctly (you
didn't), the entire discussion started over the difficulty of writing
what goes inside the catch block! What you've written is plainly
insufficient, so you still haven't met the burden of proof upon you.

Until you actually present something that will demonstrably improve
robustness in a real application under a wide variety of failure cases
(or demonstrate their irrelevance), there's no reason to discuss this
any further than this e-mail. You've been presented with plenty of
reasoning as to why your views are invalid and your example will not
work in a real application. If what you've been given isn't sufficient
proof, then I doubt anything will be sufficient proof.
There is no need to be able to recover from every allocation failure.
You seem to be the one claiming that because it is impossible to
recover from all possible allocation failures, you should never,
under any circumstances, even consider attempting to recover from
any allocation failure whatsoever.

No, I say that only handling some failures doesn't buy you anything
because you have no way to tell which allocations will fail. Writing a
catch handler that never runs does nothing but waste my time. You
have yet to tell me how to determine which allocations will fail and
which allocations will not. You keep harping on "large", but I've
already demonstrated that size alone isn't sufficient information.
If your allocator is pooling and doesn't return memory to the
operating system, one failed allocation might be enough to prevent all
future allocations from succeeding, if they use a different allocator.
1- It works
2- Something is being done in the catch block

No problem. Not difficult. I am not sure what your problem with it
is.

It doesn't work if there is no memory remaining, because the iostream
might allocate behind your back. Ergo, it is not robust. Just
because you believe your code proves there must be memory remaining
does not mean it has actually done so. You have not tested all the
possibilities yet. Inductive fallacies are not proof, and they are
just another reason I see no point in discussing this topic with you
further.
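
[To make the point concrete, here is a minimal sketch, not code from
the thread; the function and sizes are made up. The obvious way of
reporting inside a bad_alloc handler can itself allocate, and whether
the fputs alternative allocates is itself implementation-dependent.]

    #include <cstddef>
    #include <cstdio>
    #include <new>
    #include <vector>

    void process(std::size_t n)
    {
        try {
            std::vector<char> buffer(n);   // input-dependent allocation
            // ... process buffer ...
        } catch (const std::bad_alloc&) {
            // Risky: something like
            //     std::cerr << "request too large: " << n << '\n';
            // may itself allocate (and throw) while memory is already
            // exhausted.
            //
            // Lower-risk reporting: a string literal on an unbuffered C
            // stream. Even this is only "usually" allocation-free; the
            // standard does not guarantee it.
            std::fputs("request rejected: out of memory\n", stderr);
        }
    }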
I have no idea what claim you claim that I made that you claim that I
cannot support.  

All of them, ignoring the fact that you've seriously backed off your
original claims. Those original claims are what I still want to see
supported, because they would actually be interesting, as opposed to
what you are trying to do now.
As far as I can understand, you seem to make the generic claim that
recovering from any allocation failure whatsoever is impossible
and never worth doing.

I've never claimed any such thing. I've claimed it's almost never
worth doing because it's simply too hard to do robustly, and it's far
easier to improve robustness in other ways. I've also claimed that it
is considerably harder than its proponents suggest and demonstrated
the issues with the proposed solutions and why they are not actually
robust.
In the system as designed, you *know* that in this particular location,
the allocation size is possibly too large (because it depends on external
input).  You *know* that because you designed it that way.

Hence two possibilities:

1- The allocation failure was due to the requested allocation being
too large. Then the recovery attempt will succeed and it is fine to
continue.

2- The allocation failure was not due to the requested allocation
being too large and instead due to allocation being totally impossible
on the system now.  Then the recovery attempt will fail and the
program will terminate.

Those are not the only two possibilities, which is part of the problem
with your reasoning. Consider:
3. The allocation succeeds but then causes all future allocations to
fail due to its size (i.e., it filled the freestore).

This should plainly be handled in the same manner as the first
possibility, since the size of the user's request caused the failure.
However, your code example will not handle this possibility correctly.
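
[A sketch of possibility #3, not from the thread and with made-up
names: the guarded "large" allocation succeeds, and the failure
surfaces later, outside the handler that was meant to contain it.]

    #include <cstddef>
    #include <new>
    #include <string>
    #include <vector>

    std::vector<char> payload;
    std::vector<std::string> request_log;

    void handle_request(std::size_t n)
    {
        try {
            payload.resize(n);   // the "large", input-dependent allocation
        } catch (const std::bad_alloc&) {
            return;              // only failures of the resize land here
        }

        // If the resize succeeded but consumed nearly all available
        // memory, this "small" allocation can be the one that throws,
        // and nothing above catches it, so the exception escapes.
        request_log.push_back("accepted request of size " + std::to_string(n));
    }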
Essentially you are advocating assuming that you always are, and
always will be, in situation #2.

Nope, I'm advocating that you cannot tell what situation you are in
and that you may well be in a situation where recovery is impossible
or pointless, generally speaking. It's obviously possible to figure
these things out, but it requires substantially more work than you, or
any other advocate, believe it does. Figuring out the situation
typically requires knowing low-level runtime and operating system
details so you can reason about what situations are possible and when
they can occur. Such details aren't trivial by any stretch of the
imagination.
Can we at least agree that the application can't know if an allocation
failed because of heap fragmentation, allocator limitation, maximum
per-process OS enforced limits or OS actually having run out of memory
altogether?  The visible result for the application will be the same.

Your first statement is right and your second statement is wrong. The
visible result for the application will plainly not be the same. If
you've reached the commit limit, then all requests for more memory
from the OS will fail until the limit is increased or memory is
returned to the operating system. If a request fails because it's
simply too large (e.g., it's bigger than your virtual address space),
then reasonable requests will still succeed.

Why the allocation failed matters, whether you like it or not. It
matters because it determines whether there's any value in spending
the time on writing a catch handler.
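
[A sketch of the distinction being drawn, not from the thread: an
absurdly oversized request fails on its own while reasonable requests
keep succeeding, which looks nothing like having hit the commit limit.
Sizes are arbitrary, and behaviour can differ on systems that
overcommit.]

    #include <cstddef>
    #include <cstdio>
    #include <new>

    int main()
    {
        // Far larger than any plausible address space: this fails by
        // itself, without the process being anywhere near the commit
        // limit.
        std::size_t huge = static_cast<std::size_t>(-1) / 2;
        try {
            char* p = new char[huge];
            delete[] p;
        } catch (const std::bad_alloc&) {
            std::puts("oversized request rejected");
        }

        // A reasonable request still succeeds afterwards, unlike the
        // case where the commit limit has genuinely been reached.
        char* q = new char[4096];
        delete[] q;
        std::puts("small request still fine");
        return 0;
    }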
You choose to quit at the first hurdle.  I choose to at least attempt
to jump.  If I fail, no loss. I am no worse than you.  If I succeed,
I live to fight another day.

No, you're far worse off because you've spent a large amount of money
designing, implementing, and (badly) testing code that's effectively
dead. I'm better off because I got to keep all that money. Writing
code is not free.
I am not the designer of your application, so I can't know all the
factors that need to be considered for *your* design. I have no
idea of *your* requirements.

Then you cannot possibly claim it's generally worthwhile to handle OOM
conditions, yet you've attempted to do precisely that several times
over!
Just for grins, try allocating all that space one byte at a time (go
ahead and leak it), so you actually fill up the freestore before
making the failed allocation. Then see how much space you have
available, if you don't outright crash your computer[2][3].
So you are advocating that I should design my application in such a
way that OOM errors are always fatal. In such a way that I
purposefully micro-allocate lots and lots of memory so that failure
will most likely happen at totally random places. Uh?!?

No, I'm suggesting you try test cases that better exercise all of the
conditions under which OOM can occur. Many applications include many
small allocations to go along with their large allocations. Many
applications may never make a single large allocation; perhaps they
structure their data using a linked list or a tree instead of an
array. Your example is simply not realistic, and I gave one example
of how it is unrealistic. The large allocation may be the last
allocation in a long string of small allocations. It may succeed,
filling the address space, causing all future allocations to fail.
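
[A rough sketch of the kind of test being suggested; the details are
not from the thread. Exhaust the freestore with small, leaked
allocations first, then see how the handler around the "large"
allocation behaves. Run it only in a constrained environment, e.g.
under a small address-space or memory limit; on an unconstrained
machine it may thrash or be killed by the OS instead of throwing.]

    #include <cstddef>
    #include <cstdio>
    #include <new>
    #include <vector>

    int main()
    {
        // Deliberately leak small blocks until allocation fails, so the
        // freestore is already nearly full when the "interesting"
        // allocation runs.
        std::size_t leaked = 0;
        try {
            for (;;) {
                new char[64];          // intentionally leaked
                leaked += 64;
            }
        } catch (const std::bad_alloc&) {
            std::fprintf(stderr, "small allocations failed after ~%zu bytes\n",
                         leaked);
        }

        // Now the allocation the catch handler was written for: does
        // "recovery" leave the program able to continue, or do further
        // allocations fail as well?
        try {
            std::vector<char> big(1 << 20);
            std::fputs("large allocation still succeeded\n", stderr);
        } catch (const std::bad_alloc&) {
            std::fputs("large allocation failed as well\n", stderr);
        }
        return 0;
    }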
Given your claimed superior knowledge of allocators, please enlighten
us on why the same pattern would fail in a multithreaded setup.
Can you clarify why thread #1 *failing* to allocate a large block of
memory would directly stop thread #2 from being able to allocate a
small block of memory? Is your allocator not thread-safe?

If the first allocation failed because the virtual address space is
full, the second allocation will fail for the same reason. Your
original claim was that you could isolate failure in one thread from
failure in others simply by catching OOM in the first thread. This is
obviously not possible when the reason for failure has nothing to do
with the size of the request.

Sure, your trivial example won't fail, but an example that considers
only one allocation isn't particularly interesting. Very few
applications are written to process their requests with only one
contiguous dynamic allocation. The problem is that you have no way of
knowing that the failure was due to the size of the request. If the
only reason my program will ever fail is that the virtual address
space is full (i.e., all requests are "reasonable"), then my program
will be terminating anyway.
That makes writing the catch handler pointless.

Remember, this issue originally came up because someone claimed that
stack unwinding would free memory, allowing the other threads to
proceed. That's not true because allocators may not return memory to
the OS or a global pool, and different threads and/or data types may
use different allocators. It's also not true because there just may
not be enough resources left to be freed.
[3] This is of course one reason that restarting is inherently
superior to handling OOM: if your program does leak memory, then
restarting the program will get that memory back. Plus, you will
eventually have to take it out of service anyway to plug the leak for
good.
If your program leaks memory, you should fix it.

Yes, I agree. Doing that means restarting the program. That means I
must have built a system that can tolerate restarting the program.
That seriously diminishes the value of writing code with the sole
purpose of avoiding program restarts.

Never mind the fact that my program may leak due to factors entirely
outside my control, like buggy operating systems or runtime libraries.

Adam
 
G

Goran

No, you're far worse off because you've spent a large amount of money
designing, implementing, and (badly) testing code that's effectively
dead. I'm better off because I got to keep all that money. Writing
code is not free.

We're talking about bad_alloc, hence exceptions. In that light, the
above is wrong. In exception-based code, the cost you're mentioning is
virtually non-existent^^^, because:

* in good C++ code, the number of try/catch statements is __seriously__
small, and the number of "bad_alloc" catches is even smaller.
* in such code, there's absolutely no need to do anything particular
WRT OOM, __except__ in a catch(const bad_alloc&) {}.

In fact, hereby I am putting this to you: you don't know how to work
with exceptions effectively. You are missing the necessary mindset
change, from:

* trying to guess, all the time, whether there can be an error
somewhere, and what are the consequences (bad),

to

* assigning exception safety levels to bits of code (good).

In the former case, indeed, the cost of handling OOM is immense,
because the number of explicit failure points that must be handled is
immense.

In the latter, however, you just don't care about OOM in particular.
You think about your exception safety guarantees, and OOM is in the
same bag as all other exceptions - ultimately, you don't even see
it.

Goran.

^^^ The cost is merely making sure that what's in a (rare) catch(const
bad_alloc&) { ---HERE--- } is either a no-throw operation, or not
the last catch there is (which is a pretty trivial affair).
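
[A sketch of the style being described; the wording and names here are
mine, not Goran's. The request-processing code relies on RAII and
never mentions OOM; the one catch site near the top treats bad_alloc
like any other failure, reports it, and drops the request.]

    #include <cstdio>
    #include <exception>
    #include <string>
    #include <vector>

    struct Request { std::string body; };   // hypothetical request type

    // Stand-in parser: may throw std::bad_alloc from the vector, or
    // whatever else a real parser might throw, and that is fine.
    std::vector<int> parse(const Request& r)
    {
        return std::vector<int>(r.body.begin(), r.body.end());
    }

    void serve_one(const Request& r)
    {
        // No try/catch here: if anything throws, unwinding releases the
        // locals and the exception travels up to the loop below.
        std::vector<int> values = parse(r);
        // ... use values ...
    }

    void serve_loop(const std::vector<Request>& queue)
    {
        for (const Request& r : queue) {
            try {
                serve_one(r);
            } catch (const std::exception& e) {   // std::bad_alloc included
                // Report and move on; string literals on an unbuffered C
                // stream keep allocation in the handler to a minimum.
                std::fputs("request dropped: ", stderr);
                std::fputs(e.what(), stderr);
                std::fputs("\n", stderr);
            }
        }
    }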
 
A

Adam Skutt

We're talking about bad_alloc, hence exceptions. In that light, the
above is wrong. In exception-based code, the cost you're mentioning is
virtually non-existent^^^, because:

* in good C++ code, the number of try/catch statements is __seriously__
small, and the number of "bad_alloc" catches is even smaller.

It's not quantity of handlers that drives cost, but the difficulty in
writing the handler itself. No one has yet shown how to write such
a handler correctly and robustly. It's not something I can read in a
textbook. The cost of writing the handler is high, even if I only
have to write it once, in one place in my code.

I was told I can avoid termination under OOM in general. So I want to
see a handler that does precisely that, and that actually reduces my
likelihood of having to terminate anyway. Showing me code that only
works when attempting to allocate a single array that's absurdly too
large is not the least bit interesting. Most applications don't fail
due to a single allocation that's too big. They fail because they
made a request that was too big given the memory allocations made
before the failing request. Code and techniques to deal with that
situation in a robust fashion would actually be interesting.
* in such code, there's absolutely no need to do anything particular
WRT OOM, __except__ in a catch(const bad_alloc&) {}.

Wrong. Writing a _robust_ handler for out of memory conditions will
require design and coding changes to the rest of the application.
Trivial example: you can't preallocate the resources for the catch
handler inside the catch handler!
In fact, hereby I am putting this to you: you don't know how to work
with exceptions effectively. You are missing the necessary mindset
change, from:

* trying to guess, all the time, whether there can be an error
somewhere, and what are the consequences (bad),

to

* assigning exception safety levels to bits of code (good).

Except the former is what we have to do, because we do not generally
know what exceptions any given piece of code will raise, nor why it
will raise the exceptions. In fact, this is precisely why most
languages provide exception handling semantics: it lets us write most
of our code in a fashion that generally disregards error conditions
altogether and only worry about the error conditions where they can be
handled, if they can be handled at all.

"Exception safety" is about ensuring the code retains a consistent
state when an exception is thrown. It doesn't make the task of
handling an exception any easier. Its purpose is to ensure the
program can continue onward after the exception, in some fashion, if
it chooses to do so. That fashion may or may not be compatible with
continuing onward after any arbitrary error condition.
In the latter, however, you just don't care about OOM in particular.
You think about your exception safety guarantees, and OOM is in the
same bag as all other exceptions - ultimately, you don't even see
it.

Except OOM is not in the same bag as all other exceptions. There are
plenty of exceptions, like those raised by programmer error, where
termination is the only sensible response. There are plenty of
exceptions, OOM and serious I/O errors among them, where termination
may be the only practical response, or it may simply be forced upon
you.

Exception safety guarantees don't necessarily ensure I can continue
onward or retry the failed operation. The weak exception safety
guarantee doesn't provide for either capability per se. The only
general assurance I have is the ability to go off and do something
else, which may not be of any value whatsoever.

Moreover, if what you said was really true, then exception handlers
wouldn't care about the type of the thrown exception. They do care
about type, so clearly the type of exception matters.
^^^ The cost is merely making sure that what's in a (rare) catch(const
bad_alloc&) { ---HERE--- } is either a no-throw operation, or not
the last catch there is (which is a pretty trivial affair).

The cost of doing such a thing robustly is very, very large. There's
a reason why substantial amounts of code do not provide a strong
exception guarantee, never mind a no-throw guarantee. Merely
suppressing the potential exception doesn't mean robustness has been
improved in reality. Even where it is possible, it may not be worth
the actual effort involved.

Adam
 
G

Goran

It's not quantity of handlers that drives cost, but the difficulty in
writing the handler itself.  

No, it's __not__ difficult writing a piece of no-throw code. If
nothing else, it's try{code here}catch(...){swallow}. It's actually
trivial.
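
[What that looks like as a small reusable helper; this sketch is mine.
Swallowing everything is only appropriate for cleanup or reporting
code that must not throw.]

    // Run a cleanup/reporting action and guarantee that nothing escapes.
    template <typename F>
    void run_nothrow(F&& f) noexcept
    {
        try {
            f();
        } catch (...) {
            // deliberately swallowed
        }
    }

    // Usage, e.g. inside a catch block that must not itself throw:
    //     run_nothrow([] { log_failure(); });   // log_failure is hypothetical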
Wrong.  Writing a _robust_ handler for out of memory conditions will
require design and coding changes to the rest of the application.
Trivial example: you can't preallocate the resources for the catch
handler inside the catch handler!

Why on Earth would anyone attempt to do such a thing!? It takes
serious oversight not to realize that an allocation is a no-no inside
a catch(const bad_alloc&) {}. I mean, what do you think, that people
are retarded?
Except the former is what we have to do, because we do not generally
know what exceptions any given piece of code will raise, nor why it
will raise the exceptions. In fact, this is precisely why most
languages provide exception handling semantics: it lets us write most
of our code in a fashion that generally disregards error conditions
altogether and only worry about the error conditions where they can be
handled, if they can be handled at all.

"Exception safety" is about ensuring the code retains a consistent
state when an exception is thrown.  It doesn't make the task of
handling an exception any easier.  It's purpose is to ensure the
program can continue onward after the exception, in some fashion, if
it chooses to do so. That fashion may or may not be compatible with
continuing onward after any arbitrary error condition.


Except OOM is not in the same bag as all other exceptions.

As far as writing exception-safe code goes, this is very, very wrong.
It's wrong because exception safety almost exclusively deals with
recovery and cleanup techniques, which almost never require additional
resources. And if you __do__ have recovery/cleanup that does need
resources, then you preallocate those prior to starting whatever
operation you have, and you're done.
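
[A sketch of that preallocation, with made-up names: whatever the
failure path needs is acquired before the risky work starts, so the
catch block itself allocates nothing.]

    #include <cstddef>
    #include <new>
    #include <string>
    #include <vector>

    void process_with_fallback(std::size_t n)
    {
        // Acquired up front, before anything that can fail for lack of
        // memory.
        std::string error_report;
        error_report.reserve(256);

        std::vector<double> scratch;
        try {
            scratch.resize(n);           // the allocation that may fail
            // ... compute with scratch ...
        } catch (const std::bad_alloc&) {
            // No new allocations here: the report buffer already has
            // capacity, so assigning a short message does not grow it.
            error_report.assign("allocation failed; request skipped");
            // ... hand error_report to pre-existing reporting code ...
        }
    }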
Exception safety guarantees don't necessarily ensure I can continue
onward or retry the failed operation.

True, but I never claimed one should do either. (And I told you that
in a previous post, which you didn't answer; do you think that
reiterating a false argument makes it true? No, it's not like washing
a t-shirt.)
Moreover, if what you said was really true, then exception handlers
wouldn't care about the type of the thrown exception.  They do care
about type, so clearly the type of exception matters.

Yes, in a dozen catch blocks in a 100K-line codebase, 3 of them catch
a specific exception. That's sooo important... IOW, you are overstating
it. Catch handlers are very rare by themselves; those that do need
something particular are even rarer (like catching bad_alloc). Again,
I put this to you: you don't know how to work with exceptions
effectively, otherwise you wouldn't be making such claims.
The cost of doing such a thing robustly is very, very large.  

First, my argument is not related to robustness, only correctness.

Second, even robustness is not as hard as you're making it out to be,
once you accept operating with low resources. You simply prepare the
resources you need up-front and make sure you don't do something
really dumb. IOW, you lower your expectations and are done with it.
There's
a reason why substantial amounts of code do not provide a strong
exception guarantee,

What!? WHAT!? No, in practice, most code needs (and has) __exactly__
that. If the code you write doesn't, then you don't know how to work
with exceptions.
never mind a no-throw guarantee.  Merely
suppressing the potential exception doesn't mean robustness has been
improved in reality.

So what? The no-throw guarantee doesn't imply robustness. It's not by
accident that it's also called "failure TRANSPARENCY". Robustness and
exception safety are theoretically 100% orthogonal.
In practice, they meet very rarely, e.g. in a last-ditch
error-reporting (NOT handling) attempt around a very large amount of
code (in "main", on the "outside" of a message-loop handler of some
GUI code, in the loop of a "forever" thread proc, etc.).

You're just trying to put everything in the same bag and make a mess.
There's no need for that.

Goran.
 
