using exceptions as a "deep" return

Phlip

Noah said:
500 iterations of 520 tests. Can't do more or exception version would
take forever. Guess that settles the speed question, no?

Thanks! But I can't tell what you mean by the results. Which was slower?

And we need Pete Becker to tell us if the depth of the call tree matters,
and if filling its stack frames with huge objects with expensive destructors
matters.
 
Mark P

Phlip said:
Hmm. Someday all of us hope to be good enough at STL to be able to tell the
difference between an elegant solution and STL abuse.

Right now I'm just thinking "wow! functors!", and can't constructively
criticize them!


Why not assign the boolean to a member of the root-most object, then croak
all the inner loops?

If I understand you, I think that's basically what I said in the
preceding paragraph about polling a parameter. As I said though, I find
this unappealing and it feels like manual exception handling.
 
Risto Lankinen

Phlip said:
The topic we have all been avoiding:

Is it possible, somewhere, somehow, that throwing the exception could
actually be faster than returning?

I have implemented a min-max game tree scanner with alpha/beta
cutoff, where Alpha and Beta were implemented as exceptions.

This makes the algorithmic code so clean that there is no strictly
equivalent "orthodox" version that it could be compared against
without adding non-trivial amount of structural code. Comparing
the version having the necessary extra wiring, their performance
was about the same, but the simplicity of the exception-based
algorithm made me stick with it (FWIW, the product was initially
shareware and later "slightly commercialized", e.g. tagged with
some ads of a company who bought its distribution for marketing
purposes).

For background: both Alpha and Beta cutoffs are such that when
they occur, the game tree preorder scanner can prune exactly two
levels of recursion (i.e. to the next level of the _same_ player in a
two-player game). This can be a bit tedious to implement without
using exceptions, since a flag needs to be added in every return
value of the scoring function to indicate whether an Alpha or Beta
cutoff may occur, and this flag also needs to be checked in every
iteration of the level in the middle.
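The structure Risto describes can be sketched in miniature (all names here are hypothetical, and the "tree" is collapsed to a flat scan for brevity): the cutoff is raised as an exception deep in the scan and caught levels above, so no flag needs to travel through the intermediate return values.

```cpp
// A minimal sketch, assuming a simplified two-level scan: a beta
// cutoff deep in the recursion throws, unwinding past the level in
// the middle in one step instead of threading a flag through every
// return value.
struct Cutoff { int bound; };

// Innermost level: raising Cutoff abandons this level AND the level
// in the middle at once.
inline int score_leaf(int v, int beta)
{
    if (v >= beta)
        throw Cutoff{beta};    // beta cutoff: no flag in the return value
    return v;
}

inline int scan_level(const int* begin, const int* end, int beta)
{
    int best = -1000000;
    for (const int* p = begin; p != end; ++p) {
        int s = score_leaf(*p, beta);
        if (s > best)
            best = s;
    }
    return best;
}

// Two levels up: the only place that needs to know a cutoff happened.
inline int scan_root(const int* begin, const int* end, int beta)
{
    try {
        return scan_level(begin, end, beta);
    } catch (Cutoff& c) {
        return c.bound;        // the cutoff value arrives via the throw
    }
}
```

The "orthodox" version would instead return a (score, cutoff-flag) pair from score_leaf and check the flag on every iteration of scan_level, which is exactly the extra wiring Risto mentions.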

Cheers!

- Risto -
 
Noah Roberts

Risto said:
For background: both Alpha and Beta cutoffs are such that when
they occur, the game tree preorder scanner can prune exactly two
levels of recursion (i.e. to the next level of the _same_ player in a
two-player game). This can be a bit tedious to implement without
using exceptions, since a flag needs to be added in every return
value of the scoring function to indicate whether an Alpha or Beta
cutoff may occur, and this flag also needs to be checked in every
iteration of the level in the middle.

Methinks maybe you misunderstand the algorithm and/or have implemented
it inefficiently. Either that or you are talking about a different
algorithm, not the one designed by Knuth to improve the NegaMax. For
one thing, your statement that it, "can prune exactly two levels of
recursion," doesn't ring true. AB narrows the search window as it goes
so that it can drop uninteresting branches (complete branches, even at
the root+1) without searching them completely...as soon as it 'knows'
that the score is at least worse than some other, better move. The
assumption behind AB is that you and your opponent will always make the
best move that is possible.

int Search::search(int depth, int alpha, int beta)
{
    if (depth == _targetDepth) //return evaluator->fullEval(_searchBoard.get());
        return quiescence(alpha, beta);

    MoveList *moves = _moveControl->generateMoves(_searchBoard.get());
    u_int16 m = moves->nextMove();
    while (m && alpha < beta)
    {
        Move *move = new Move((m>>8), (m&0xFF));
        _searchBoard->makeMove(move);
        int score = -search(depth + 1, -beta, -alpha);
        _searchBoard->unmakeMove(move);
        if (score > alpha) alpha = score;
        m = moves->nextMove();
        delete move;
    }
    _moveControl->destroyMoves(moves);
    return alpha;
}

I wrote this a few years ago before really learning that _ was bad so
forgive the icky variable names.

This same search function can be used with MTD(f) and other window
guessing searches. No flags, incredibly simple, and no using
exceptions as the return mechanism. Just a window and a target depth.

quiescence is a depthless version of the same algorithm that uses a
capture or escape only move generator meant to limit the horizon
effect.

Now, there are extensions to the basic AB search that you can add that
I am not doing. Null move heuristic, move ordering, etc... Move
ordering should be done by the generator so that is a non-issue. Null
move is an easy addition. Search extensions is something I hadn't
quite learned all the way and implemented when I last worked on this
guy...they do add a little bit extra in code. But the basic AB search,
like above, doesn't require the things you are talking about.

I've already shown exceptions to be significantly slower, by several
orders of magnitude, than the standard return mechanism, on at least
one widely used architecture/compiler and in an area like this search
that is extremely important...the AB has to be fast or you will never
get deep enough to play decently. They also add nothing to the bargain
so they are a pure loss when used as a return mechanism. Exceptions
are for errors.
 
Bart van Ingen Schenau

Mark said:
I should add one other practical detail to this discussion. In my
chain of function calls, I don't have direct control over some of the
intermediate functions. Specifically, I use a std::set with a custom
comparison functor, and should the functor ever find that two elements
compare equal, it turns out that this implies that the result of my
entire computation is false. So you see that passing back a chain of
return values may not even be possible since the calls to the functor
are dictated by the std::set.

Even though you can't change how std::set calls your comparison functor,
that does not preclude finding out if two elements in the set would
compare equal.
std::set allows you to determine this, because one of the invariants of
the class is that no two elements in a std::set ever compare equal. If
you use the simple std::set::insert(Key) function to add elements to
the set, you can immediately look at the return value to determine if
insertion was successful.
If you use some other method of populating the set, you could verify
that after the insertions, the set contains as many elements as you
tried to insert into it.
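Bart's first suggestion is directly expressible in code; std::set::insert really does return a pair<iterator, bool> whose bool is false exactly when an equivalent element is already present (the function name below is illustrative):

```cpp
#include <set>

// Detect whether any two elements compare equal under the set's
// comparator, using only the return value of std::set::insert.
bool all_distinct(const int* begin, const int* end)
{
    std::set<int> s;
    for (const int* p = begin; p != end; ++p)
        if (!s.insert(*p).second)
            return false;   // the comparator found an equivalent element
    return true;
}
```

Bart's second suggestion, comparing s.size() against the number of attempted insertions after the fact, gives the same answer without checking each insert individually.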

Bart v Ingen Schenau
 
Kaz Kylheku

Noah said:
29673156
29673156
29673156
29675625

500 iterations of 520 tests. Can't do more or exception version would
take forever. Guess that settles the speed question, no?

#include <iostream>

#include <Windows.h>

Oh man, what an incompetent moron.

ANSI C gives us the clock() function for doing this type of thing. No
need for Windows extensions.
void test(bool (*f)())
{
    cout << static_cast<unsigned long>(GetTickCount()) << endl;

Good one, let's sample the tick count just before ostream flushes its
buffer to the operating system.

Let's not care about the time it takes to call GetTickCount().

Let's not take multiple samples of time and average them together.

Let's not parametrize anything interesting, like the call stack depth.
 
Phlip

Kaz said:
Oh man, what an incompetent moron.

ANSI C gives us the clock() function for doing this type of thing. No
need for Windows extensions.

You must be describing yourself. clock() has a resolution in tens of
milliseconds. Everyone knows it's not good for profiling.

Or would you have lost your cool over a POSIX method posted here?
 
Kaz Kylheku

Phlip said:
The topic we have all been avoiding:

Is it possible, somewhere, somehow, that throwing the exception could
actually be faster than returning?

Yes. Here is a benchmark program.

First, the CPU and compiler information. Linux gives us this, so I will
just paste. This box has four of these processors, by the way:

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 2.80GHz
stepping : 1
cpu MHz : 2793.087
cache size : 1024 KB
physical id : 0
siblings : 2
runqueue : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm lm
bogomips : 5570.56

Next, the compiler, GNU C++:

g++ (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-49)

I just used this with -O2 to make an a.out.

The program:

#include <cstdio>
#include <ctime>

using std::printf;
using std::clock_t;
using std::clock;

const int calibrate_iterations = 1000000;
const int return_test_iterations = 100000;
const int exception_test_iterations = 100000;

const int num_call_depths = 10;
const int starting_call_depth = 100;
const int call_depth_increment = 100;

static clock_t start;
static long sample_sum;
static long sample_count;

static void reset_chrono()
{
    start = sample_sum = sample_count = 0;
}

static void start_chrono()
{
    start = clock();
}

static void stop_chrono()
{
    clock_t stop = clock();
    clock_t diff = stop - start;
    sample_sum += diff;
    sample_count++;
}

double get_average_time()
{
    return ((double) sample_sum) / sample_count / CLOCKS_PER_SEC;
}

/*
 * Regular return.
 */

bool rt_top(int, int = 0);
bool rt_middle(int, int);
bool rt_bottom(int, int);

bool rt_test(int max_depth)
{
    return rt_top(max_depth);
}

bool rt_top(int max_depth, int depth)
{
    return rt_middle(max_depth, depth + 1);
}

bool rt_middle(int max_depth, int depth)
{
    return rt_bottom(max_depth, depth + 1);
}

bool rt_bottom(int max_depth, int depth)
{
    if (depth >= max_depth)
        return false;
    return rt_top(max_depth, depth + 1);
}

/*
 * Exception-based return
 */

bool ex_top(int, int = 0);
bool ex_middle(int, int);
bool ex_bottom(int, int);

bool ex_test(int max_depth)
try {
    return ex_top(max_depth);
} catch (...) {
    return false;
}

bool ex_top(int max_depth, int depth)
{
    ex_middle(max_depth, depth + 1);
}

bool ex_middle(int max_depth, int depth)
{
    ex_bottom(max_depth, depth + 1);
}

bool ex_bottom(int max_depth, int depth)
{
    if (depth >= max_depth)
        throw false;
    ex_top(max_depth, depth + 1);
}


int main()
{
    /*
     * First, calibrate test.
     */

    reset_chrono();

    for (int i = 0; i < calibrate_iterations; i++) {
        start_chrono();
        stop_chrono();
    }

    double chrono_bias = get_average_time();

    printf("chrono_bias = %12.9fs\n", chrono_bias);

    /*
     * Now do return speed test.
     */

    for (int depth = 0; depth < num_call_depths; depth++) {
        int max_depth = starting_call_depth + depth * call_depth_increment;

        reset_chrono();

        for (int i = 0; i < return_test_iterations; i++) {
            start_chrono();
            rt_test(max_depth);
            stop_chrono();
        }

        double return_time = get_average_time() - chrono_bias;

        reset_chrono();

        for (int i = 0; i < exception_test_iterations; i++) {
            start_chrono();
            ex_test(max_depth);
            stop_chrono();
        }

        double exception_time = get_average_time() - chrono_bias;

        printf("\n");
        printf("max_depth = %d\n", max_depth);
        printf("return_time = %12.9fs\n", return_time);
        printf("exception_time = %12.9fs\n", exception_time);
    }

    return 0;
}


The results:

chrono_bias = 0.000000400s

max_depth = 100
return_time = 0.000002000s
exception_time = 0.000009000s

max_depth = 200
return_time = 0.000003000s
exception_time = 0.000009400s

max_depth = 300
return_time = 0.000004900s
exception_time = 0.000009900s

max_depth = 400
return_time = 0.000006300s
exception_time = 0.000010000s

max_depth = 500
return_time = 0.000007700s
exception_time = 0.000010800s

max_depth = 600
return_time = 0.000009400s
exception_time = 0.000010500s

max_depth = 700
return_time = 0.000010900s
exception_time = 0.000011300s

max_depth = 800
return_time = 0.000012800s
exception_time = 0.000011100s

max_depth = 900
return_time = 0.000014200s
exception_time = 0.000011900s

max_depth = 1000
return_time = 0.000016400s
exception_time = 0.000012100s


As you can see, as the stack grows deeper, the exception mechanism
eventually becomes faster. The break-even point is somewhere at a depth
of about 700 to 800.

The results suggest that exceptions have a high fixed cost on this
compiler, but a marginal additional cost to jump deeper.

In one experimental run, I moved the start_chrono() call for the
exception test to within the try { } region, and the results weren't
significantly affected. So there isn't much cost to setting up the
catch. The large, fixed overhead is somewhere in the processing of the
exception.
 
Noah Roberts

Kaz said:
Oh man, what an incompetent moron.

Heh. Can't argue on technical merit alone eh?
ANSI C gives us the clock() function for doing this type of thing. No
need for Windows extensions.

Guess you don't know that clock() can return just about anything,
useful or otherwise.

If the fact that I used the windows api instead of clock is the best
you got then you ain't got shit. The question is simply, so what if I
used one over the other? Do you know the differences enough to answer
that?
Good one, let's sample the tick count just before ostream flushes its
buffer to the operating system.

Done in both cases and small enough to not show up in
measurements...irrelevant
Let's not care about the time it takes to call GetTickCount().

irrelevant....less than 1 millisecond...done in both.
Let's not take multiple samples of time and average them together.

5000 is not enough samples??!! I could do more but then I would have
to take the exception test out as it would take way too long.

You can do the math to get the average if you really want...it's
easy...(v2 - v1)/5000
Let's not parametrize anything interesting, like the call stack depth.

Granted, it isn't a perfect test but it also doesn't have to be. With
differences that great you don't need to get into much detail. It is
sufficient to show that in a simple case like above, the one you
described, using exceptions as a return mechanism (something they were
never intended to be) is several orders of magnitude slower than using
the _appropriate_ mechanism built for that purpose. No other factor in
that test can make enough difference to account for the vast difference
in results.

The second and third output show that the time it takes to call
GetTickCount and flush the buffer is so minute that it is not
measurable in milliseconds. Since we are dealing with a difference of
nearly 2500 milliseconds the impact of a variable adding less than 1
millisecond has negligible effects.

But go ahead and continue using the language wrong. Keep writing code
that is incredibly inefficient. Keep calling everyone else a moron.
So long as I don't ever have to fix it I don't care.
 
Noah Roberts

Kaz said:
As you can see, as the stack grows deeper, the exception mechanism
eventually becomes faster. The break-even point is somewhere at a depth
of about 700 to 800.

Of course that is good because after a depth > 800 you get a few
microseconds faster using exceptions on every compiler and architecture.
 
Kaz Kylheku

Phlip said:
You must be describing yourself. clock() has a resolution in tens of
milliseconds.

So does GetTickCount(). ISTR that on some WinCE systems, they crank it
up a bit, but normally it's 100Hz.
Everyone knows it's not good for profiling.

Idiot, you need a high resolution hardware clock in order to time a
/single/ event accurately. If you can repeat the event enough times,
you can use averaging to magnify the precision of a low resolution
clock.

You can even time events that are much shorter than the granularity of
the clock.

Suppose that you have a clock that increments every millisecond. Over
many trials, the event you are timing reads as zero milliseconds about
90% of the time, but 10% of the time, it registers as one millisecond.
From that, you can estimate the duration of the event to be 0.1
milliseconds. One way to look at it is that there is a 10%
probability that the increment of the clock lands in the middle of the
event, which means that the event spans 10% of the clock period. This
estimate does assume that there is no bias introduced by some
dependency between the clock and the timing of event.

I think this has to do with that probability and statistics stuff I had
to pass once upon a time to get a CS degree. What do you think?
Or would you have lost your cool over a POSIX method posted here?

I haven't lost my cool; I just like to insult stupid people, because it
feels good.

Why don't you go net-psychoanalyze someone else.
 
Phlip

Noah said:
Of course that is good because after a depth > 800 you get a few
microseconds faster using exceptions on every compiler and architecture.

You might consider focusing on people you can actually learn from here.

(And also me;)
 
Kaz Kylheku

Noah said:
5000 is not enough samples??!! I could do more but then I would have
to take the exception test out as it would take way too long.

5000 is the number times the function is called, not the number of
times the clock is sampled.

Calling the function isn't sampling. Reading the clock is sampling.
You can do the math to get the average if you really want...it's
easy...(v2 - v1)/5000

That depends on whether v2 and v1 have a good enough resolution
relative to the 5000, since they represent a single sampled clock
delta.

In the case of the return test, you witnessed that there was usually no
increment of the clock at all, so a zero average would be computed. At
odd times, you might catch a single increment of the tick by 10
milliseconds, and at those times, the above formula would yield 1/500th
of a millisecond.

A more accurate estimate is somewhere between zero and 1/500 ms, and can
be obtained with more clock samples.
Granted, it isn't a perfect test but it also doesn't have to be. With
differences that great you don't need to get into much detail. It is
sufficient to show that in a simple case like above, the one you
described, using exceptions as a return mechanism (something they were
never intended to be)

Regardless of their intent, slow or fast, exceptions are a dynamic
non-local return mechanism. That is what they are. This is semantically
proved by the fact that in fact function activations terminate and
control ends up at a higher level.
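That semantic point can be shown in a few lines (names here are illustrative): the throw terminates three nested activations at once, and control lands in the top-level catch without any intermediate return executing.

```cpp
// Exceptions as a dynamic non-local return, in miniature.
struct Found { int value; };

int level3(int x) { if (x > 0) throw Found{x}; return 0; }
int level2(int x) { return level3(x); }
int level1(int x) { return level2(x); }

int find(int x)
{
    try {
        return level1(x);          // normal path: cascading returns
    } catch (Found& f) {
        return f.value;            // throw path: all three frames unwound
    }
}
```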
is several orders of magnitude slower than using the _appropriate_ mechanism built for that purpose.

You don't appear to be qualified to prescribe to others what is an
appropriate mechanism for a given purpose.

I reproduced a situation in which returning to a top level by exception
was faster than cascading returns.

I didn't deserve to be insulted for proposing the skeptical hypothesis
that under some circumstances, it /might/ be faster to return by
exception.

I defended that hypothesis. And thus my wristwatch tells me that I'm,
oh, right about done here.
 
Ian Collins

Kaz said:
Yes. Here is a benchmark program.

bool ex_top(int max_depth, int depth)
{
    ex_middle(max_depth, depth + 1);
}

Doesn't gcc complain about the lack of a return?
The results:

chrono_bias = 0.000000400s

max_depth = 100
return_time = 0.000002000s
exception_time = 0.000009000s

max_depth = 200
return_time = 0.000003000s
exception_time = 0.000009400s

max_depth = 300
return_time = 0.000004900s
exception_time = 0.000009900s

max_depth = 400
return_time = 0.000006300s
exception_time = 0.000010000s

max_depth = 500
return_time = 0.000007700s
exception_time = 0.000010800s

max_depth = 600
return_time = 0.000009400s
exception_time = 0.000010500s

max_depth = 700
return_time = 0.000010900s
exception_time = 0.000011300s

max_depth = 800
return_time = 0.000012800s
exception_time = 0.000011100s

max_depth = 900
return_time = 0.000014200s
exception_time = 0.000011900s

max_depth = 1000
return_time = 0.000016400s
exception_time = 0.000012100s


As you can see, as the stack grows deeper, the exception mechanism
eventually becomes faster. The break-even point is somewhere at a depth
of about 700 to 800.

The results suggest that exceptions have a high fixed cost on this
compiler, but a marginal additional cost to jump deeper.
OK, I'll bite:

Pentium M 2GHz, Solaris snv_b33, Sun Studio 11.

Built with -O2.

chrono_bias = 0.000000540s

max_depth = 100
return_time = 0.000001060s
exception_time = 0.000005960s

max_depth = 200
return_time = 0.000002060s
exception_time = 0.000011260s

max_depth = 300
return_time = 0.000002360s
exception_time = 0.000016460s

max_depth = 400
return_time = 0.000003960s
exception_time = 0.000022360s

max_depth = 500
return_time = 0.000004860s
exception_time = 0.000027460s

max_depth = 600
return_time = 0.000006060s
exception_time = 0.000033260s

max_depth = 700
return_time = 0.000007160s
exception_time = 0.000038560s

max_depth = 800
return_time = 0.000008360s
exception_time = 0.000043560s

max_depth = 900
return_time = 0.000009360s
exception_time = 0.000049660s

max_depth = 1000
return_time = 0.000010260s
exception_time = 0.000054660s

As you can see, with this compiler, exceptions are always slower.
 
Noah Roberts

Kaz said:
5000 is the number times the function is called, not the number of
times the clock is sampled.

Calling the function isn't sampling. Reading the clock is sampling.

WTF are you talking about? You do it that way you DEFINITELY have to
account for all of the following and more:

All calls to timing functions.
All time spent outside the program (OS preemption)
All time reestablishing program counters, etc...

Since each run takes less than your measurable clock there is no way to
account for any of them. No...you run multiple times and take the time
pre/post iterations and then average. Then all the little stuff is
insignificant.

You can't take an average of any unit that is smaller than your
smallest measure. Only after you take enough samples to add up to at
least one of your measuring units can you even think about it.

I'm surprised you passed your statistics class with your methodology.

I could run 5000 iterations several times and take an average and this
would be more accurate, but the variation would not alter results this
vastly different. It is simply not necessary to get that detailed. It
would be like taking several samples of a blade of grass and comparing
to several samples of full grown adult fir trees. You don't have to
sample more than one of each to know what the outcome will be.

Besides, I did run the tests several times to see if there was ever a
significant value change and there wasn't. Exceptions always take at
least 2000x longer on this arch/compiler....no amount of multi-sample
averaging will change that.
 
Kaz Kylheku

Ian said:
OK, I'll bite:

Pentium M 2GHz, Solaris snv_b33, Sun Studio 11.

Built with -O2.

chrono_bias = 0.000000540s

Wouldn't you know it, that awful clock() function that may return "just
about anything, useful or otherwise" is actually working on a
different OS and development platform.
max_depth = 400
return_time = 0.000003960s
exception_time = 0.000022360s

Wow, that is quite interesting. Thanks for the data points.

GNU C++ works more according to my intuition about exceptions. I
suspected that just mashing through the stack would be faster than
actually taking the branches through the return points, regardless of
whatever fixed cost. That's why I wrote the test that way: to see how a
delta in the depth would affect the two approaches.

Still, both compilers could use some work. That fixed cost in g++'s
implementation is quite disturbing. Something happens at the start of
the exception processing or upon its termination which eats up a lot of
time. But once the unwinding loop actually gets going, it's quite
rapid. If that apparently fixed cost could be drastically reduced,
exceptions could compete with normal returns in broader circumstances.

I'm recompiling the program with -pg to get some profiling info out of
it.

The profiling shows that less time is spent in the ex_() functions.
Part of the reason for that is that they never return. The way I set up
the test, the recursion just dives straight down and then throws. The
other half of the work, returning, is never done.

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  us/call  us/call  name
30.40      3.07     3.07  184000000     0.02     0.02  rt_bottom(int, int)
26.24      5.72     2.65  184000000     0.01     0.01  rt_middle(int, int)
24.75      8.22     2.50  184000000     0.01     0.01  rt_top(int, int)
 6.73      8.90     0.68  184000000     0.00     0.00  ex_top(int, int)
 5.94      9.50     0.60  184000000     0.00     0.00  ex_bottom(int, int)
 4.85      9.99     0.49  184000000     0.00     0.00  ex_middle(int, int)
 0.79     10.07     0.08    3000000     0.03     0.03  stop_chrono()
 0.20     10.09     0.02    3000000     0.01     0.01  start_chrono()
 0.10     10.10     0.01    1000000     0.01     1.78  ex_test(int)
 0.00     10.10     0.00    1000000     0.00     8.22  rt_test(int)
 0.00     10.10     0.00         21     0.00     0.00  reset_chrono()
 0.00     10.10     0.00         21     0.00     0.00  get_average_time()
 0.00     10.10     0.00         10     0.00     0.00  putchar

Still, 0.60 seconds spent in all the ex_bottom() calls versus 3.07
seconds in all the rt_bottom() calls. That's a whopping discrepancy
even considering that one of the two actually returns whereas the other
never does. Is it that much more expensive to return from a function
than to enter into it?

It would be nice to profile into the run-time support routines that
support exception handling, which is what I was /really/ after, but it
appears that for that we'd have to get libgcc.a recompiled with
profiling support.
As you can see, with this compiler, exceptions are always slower.

By the way, I'm skeptical of your results. I have it from a reliable
source that

"after a depth > 800 you get a few microseconds faster
using exceptions on every compiler and architecture. "

Hee hee. Sigh.
 
Noah Roberts

Kaz said:
By the way, I'm skeptical of your results. I have it from a reliable
source that

"after a depth > 800 you get a few microseconds faster
using exceptions on every compiler and architecture. "

Hee hee. Sigh.

.... and _I'm_ the moron.
 
Kaz Kylheku

Ian said:
Doesn't gcc complain about the lack of a return?

Ah yes it does! But not if you have -O2 on the command line.

Before I posted the program, I wanted to have a peek at the assembly
language output, to make sure that there was no funny optimization,
like Scheme-like tail calls. But I didn't specify -O2. Now I remember
that the warnings did appear.

The optimized assembly language output shows that gcc has analyzed the
functions and turned the whole thing into tail calls (which I was
hoping wouldn't happen!)

Actually each function builds up and tears down the stack frame, but
then does a jump to the next one. With -fomit-frame-pointer, that
disappears, and here is what ex_top looks like, haha, with identifiers
demangled through c++filt:

        .globl ex_top(int, int)
        .type ex_top(int, int),@function
ex_top(int, int):
.LFB31:
        incl 8(%esp)
.LCFI34:
        jmp ex_middle(int, int)

Increment the depth and jump!

This is why the exception returns appear so fast and don't appear to
become more expensive with increasing depth. There are no additional
stack frames to search through.

Let's re-do the test with -O2 and -fno-optimize-sibling-calls. I
changed the ex_() functions to have void returns:

Now the exception times get quite gross. Your Sun compiler trounces
g++:

max_depth = 100
return_time = 0.000001670s
exception_time = 0.000200370s

max_depth = 200
return_time = 0.000003270s
exception_time = 0.000388070s

Ick!
 
Ian Collins

Kaz said:
Wow, that is quite interesting. Thanks for the data points.

GNU C++ works more according to my intuition about exceptions. I
suspected that just mashing through the stack would be faster than
actually taking the branches through the return points, regardless of
whatever fixed cost. That's why I wrote the test that way: to see how a
delta in the depth would affect the two approaches.
Have you tried higher levels of optimisation? At -O4, CC optimises most
of the code away in the non-exception case, giving a constant time less
than chrono_bias.
Still, both compilers could use some work. That fixed cost in g++'s
implementation is quite disturbing. Something happens at the start of
the exception processing or upon its termination which eats up a lot of
time. But once the unwinding loop actually gets going, it's quite
rapid. If that apparently fixed cost could be drastically reduced,
exceptions could compete with normal returns in broader circumstances.
Considering exceptions are for exceptional circumstances, I'm only really
concerned with normal operation; if the trade-off is expensive throwing, so
be it.
 
