Why does std::stack::pop() not throw an exception if the stack is empty?

Ian Collins · Feb 5, 2011

Here's a Linux test for comparison:

psflor.informatica.com ~$ uname -a
Linux <host-name> 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008
x86_64 x86_64 x86_64 GNU/Linux

Test results for one execution (with output rearranged):

Manually Inlined Optimized Return Code : 5.21
Manually Inlined Return Code : 5.21
Inlineable Global Return Code : 5.21
Inlineable Member Return Code : 5.21
Virtual Return Code : 10.42
Virtual Exception : 9.33
Virtual Return Code Fake Try : 10.41

Sun CC on Solaris gives similar relative results:

Manually Inlined Optimized Return Code : 2.62
Manually Inlined Return Code : 2.61
Inlineable Global Return Code : 2.6
Inlineable Member Return Code : 2.65
Virtual Return Code : 11.79
Virtual Exception : 10.48
Virtual Return Code Fake Try : 13.02

Andre Kaufmann · Feb 5, 2011

...

Ok. Here we go. My test code is at the end. All results are from runs
executed today.

Thank you for the test code and your work. That's quite fair and a good
basis for discussions.

So it's only fair that I invest some time of my own.

[...]
Also, if you see any problems with my test, please say so, so I can
fix it and rerun it.

No problem, but some remarks:

- Windows timers based on TickCount and Clock are quite inaccurate,
since they are only updated on each timer interrupt and therefore have
a resolution of about 15 ms (on most systems). For long-running tests
(such as this one) and for a first impression they are OK.
I prefer performance counters (QueryPerformanceCounter) for
high-precision timings (but for this test I don't think it makes a
difference).

- From the command line parameters I conclude that you are using VS 2008.
I use VS 2010. I don't think they've changed the exception model
completely in VS2010, but who knows; I'm going to check this at work
next week. There is a free version of VS2010 available, but I don't
know whether its code optimization is restricted (AFAIK not, since 2010).
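To illustrate the timer remark: in standard C++ the portable counterpart of a high-resolution performance counter is std::chrono::steady_clock (C++11, so newer than the compilers discussed in this thread). The following is only a sketch; timeSeconds and sumTo are made-up names, and this is not the timing code used in the posted tests.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs `work` once and returns the elapsed time in seconds.
template <typename Work>
double timeSeconds(Work work)
{
    using Clock = std::chrono::steady_clock;  // monotonic clock, never adjusted backwards
    const Clock::time_point start = Clock::now();
    work();
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count();
}

// Illustrative workload: sums 1..n. The volatile accumulator keeps the
// optimizer from folding the loop into a constant.
std::uint64_t sumTo(std::uint64_t n)
{
    volatile std::uint64_t sum = 0;
    for (std::uint64_t i = 1; i <= n; ++i)
        sum = sum + i;
    return sum;
}
```

A call such as timeSeconds([] { sumTo(100000000); }) then gives a duration whose resolution does not depend on the 15 ms tick.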

I ran the tests on my laptop - Intel dual core 2.2 GHz.
I used the same parameters, but with full optimization, which
shouldn't have a significant effect.

C++ command line:
/Zi /nologo /W3 /WX- /Ox /Oi /Ot /GL /D "WIN32" /D "NDEBUG" /D
"_CONSOLE" /D "_UNICODE" /D "UNICODE" /Gm- /EHsc /GS /Gy /fp:precise
/Zc:wchar_t /Zc:forScope /Yu"StdAfx.h" /Fp"x64\Release\ForumCpp.pch"
/Fa"x64\Release\" /Fo"x64\Release\" /Fd"x64\Release\vc100.pdb" /Gd
/errorReport:queue

Linker:

/OUT:"Cpp.exe" /INCREMENTAL:NO /NOLOGO "kernel32.lib" "user32.lib"
"gdi32.lib" "winspool.lib" "comdlg32.lib" "advapi32.lib" "shell32.lib"
"ole32.lib" "oleaut32.lib" "uuid.lib" "odbc32.lib" "odbccp32.lib"
/MANIFEST /ManifestFile:"x64\Release\Cpp.exe.intermediate.manifest"
/ALLOWISOLATION /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /DEBUG
/PDB:"trash\Cpp.pdb" /SUBSYSTEM:CONSOLE /OPT:REF /OPT:ICF /PGD:"Cpp.pgd"
/LTCG /TLBID:1 /DYNAMICBASE /NXCOMPAT /MACHINE:X64 /ERRORREPORT:QUEUE

I've got the following results: (rearranged output)

Command line: 60000 -10 1:

Inlineable Member Return Code : 3.182
Inlineable Global Return Code : 3.151
Manually Inlined Return Code : 3.198
Virtual Return Code : 14.368
Virtual Return Code Fake Try : 17.394
Manually Inlined Optimized Return Code : 3.182
Virtual Exception : 14.243

Command line: 60000 -10:

Inlineable Member Return Code : 3.26
Inlineable Global Return Code : 3.198
Manually Inlined Return Code : 3.198
Virtual Return Code : 14.259
Virtual Return Code Fake Try : 17.459
Manually Inlined Optimized Return Code : 3.183
Virtual Exception : 17.487

Nearly the same results, apart from some negligible differences.
Only one result (Virtual Exception) shows a difference.
After I changed the code

class TestImpl2 : public TestInterface
{
public:
    virtual void virtualCanThrow(int a, int b, int targetSum)
    {
        // cout << "XXX" << endl;
    }
    virtual ReturnCodeT virtualReturnCode(int a, int b, ...
    {
        //cout << "XXX" << endl;
        return Success;
    }
};

into

class TestImpl2 : public TestInterface
{
public:
    virtual void virtualCanThrow(int a, int b, int targetSum)
    {
        if (a + b == targetSum)
            cout << "XXX" << endl;
    }
    virtual ReturnCodeT virtualReturnCode(int a, int b, int ...
    {
        //cout << "XXX" << endl;
        if (a + b == targetSum)
            cout << "XXX" << endl;
        return Success;
    }
};

I got the same results.
So far I don't see any speed differences.
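For readers without the original test code, here is a rough sketch of what the "Virtual Return Code" and "Virtual Exception" cases compare. It is a guess at the structure, not the code that produced the numbers above: the names TestInterface, ReturnCodeT, Success, virtualCanThrow and virtualReturnCode come from the snippets in this thread, while Failure, SketchImpl, the third parameter of virtualReturnCode and both loop functions are assumptions.

```cpp
#include <stdexcept>

enum ReturnCodeT { Success, Failure };  // Failure is an assumed name

class TestInterface
{
public:
    virtual ~TestInterface() {}
    virtual void virtualCanThrow(int a, int b, int targetSum) = 0;
    virtual ReturnCodeT virtualReturnCode(int a, int b, int targetSum) = 0;
};

// Hypothetical implementation: reports failure when a + b hits targetSum.
class SketchImpl : public TestInterface
{
public:
    virtual void virtualCanThrow(int a, int b, int targetSum)
    {
        if (a + b == targetSum)
            throw std::runtime_error("target sum hit");
    }
    virtual ReturnCodeT virtualReturnCode(int a, int b, int targetSum)
    {
        return (a + b == targetSum) ? Failure : Success;
    }
};

// Error reporting via return code: one extra comparison per call.
int countFailuresReturnCode(TestInterface& x, int iterations, int targetSum)
{
    int failures = 0;
    for (int i = 0; i < iterations; ++i)
        if (x.virtualReturnCode(i, i, targetSum) != Success)
            ++failures;
    return failures;
}

// Error reporting via exception: no check on the normal path, a try block instead.
int countFailuresException(TestInterface& x, int iterations, int targetSum)
{
    int failures = 0;
    for (int i = 0; i < iterations; ++i)
    {
        try { x.virtualCanThrow(i, i, targetSum); }
        catch (const std::runtime_error&) { ++failures; }
    }
    return failures;
}
```

With a negative targetSum (as in the "-10" command lines) neither variant ever fails, so timing the two loops measures only the cost of the non-throwing path.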

> However, sadly too many compiler writers and ABI writers missed this
> very important memo, which I think is core to the entire C++

For x86 Windows I agree.
But let's have a look at the assembly code of your other test code:

int main(int argc, char* argv[])
{
    try
    {
        if (argc == 3) throw 1;
    }
    catch(int)
    {
        return -1;
    }
    return 0;
}

Windows x86 code VC2010:

00BA1002 in al,dx
00BA1003 push 0FFFFFFFFh
00BA1005 push offset __ehhandler$_wmain (0BA1980h)
00BA100A mov eax,dword ptr fs:[00000000h]
00BA1010 push eax
00BA1011 mov dword ptr fs:[0],esp
00BA1018 sub esp,8
try
{
if (argc == 3) throw 1;
00BA101B cmp dword ptr [ebp+8],3
00BA101F push ebx
00BA1020 push esi
00BA1021 push edi
00BA1022 mov dword ptr [ebp-10h],esp
00BA1025 mov dword ptr [ebp-4],0
00BA102C jne $LN8+14h (0BA105Dh)
00BA102E push offset __TI1H (0BA22C8h)
00BA1033 lea eax,[ebp-14h]
00BA1036 push eax
00BA1037 mov dword ptr [ebp-14h],1
00BA103E call _CxxThrowException (0BA1970h)
}
catch(int)
{
return -1;
00BA1043 mov eax,offset $LN8 (0BA1049h)
00BA1048 ret
$LN8:
00BA1049 or eax,0FFFFFFFFh
}
return 0;
}
00BA104C mov ecx,dword ptr [ebp-0Ch]
00BA104F mov dword ptr fs:[0],ecx
00BA1056 pop edi
00BA1057 pop esi
00BA1058 pop ebx
00BA1059 mov esp,ebp
00BA105B pop ebp
00BA105C ret
00BA105D mov ecx,dword ptr [ebp-0Ch]
00BA1060 pop edi
00BA1061 pop esi
00BA1062 xor eax,eax
00BA1064 mov dword ptr fs:[0],ecx
00BA106B pop ebx
00BA106C mov esp,ebp
00BA106E pop ebp
00BA106F ret

Windows x64 code VC2010:

{
if (argc == 3) throw 1;
000000013F13100D cmp ecx,3
000000013F131010 jne wmain+2Ch (13F13102Ch)
000000013F131012 mov dword ptr [rsp+20h],1

000000013F13101A lea rdx,[_TI1H (13F1324A0h)]
000000013F131021 lea rcx,[rsp+20h]
000000013F131026 call _CxxThrowException (13F13190Eh)
000000013F13102B nop
}
return 0;

000000013F13102C xor eax,eax
000000013F13102E jmp $LN8+3 (13F131033h)
{
return -1;
000000013F131030 or eax,0FFFFFFFFh
}
000000013F131033 add rsp,38h
000000013F131037 ret

There is a huge difference. The x86 code uses the old implementation:
an exception stack whose head pointer is kept at fs:[0] and has to be
updated on entry to and exit from the function, even when nothing is
thrown.
In the x64 code there is no exception stack anymore, since the compiler
uses a static table for stack unwinding; therefore there is no overhead
if no exception is thrown.
So the implementation should be comparable to that of Linux / Unix
systems and compilers.

Since, as I already mentioned, SEH doesn't add any overhead (if the
compiler ignores SEH exceptions and doesn't need to track them when
they are thrown), it's the compiler's fault if there is a speed
difference.

And I don't think that, for example, GCC under Windows uses a different
exception model than under Linux? Or does it?

>[...]

Juha Nieminen · Feb 5, 2011

Miles Bader said:
But remember that in typical C++ code you'll usually have many more
_implicit_ try-catch blocks, inserted by the compiler to call
the destructors of local objects.

So you can't judge the cost simply by looking for "try" in your source code...

You don't need to judge the cost of throwing (and catching) at all,
because they are *expected* to be somewhat heavy operations (relatively
speaking, at least). What really matters is that if nothing throws, your
code will basically be as fast as if the compiler didn't support exceptions
at all (in other words, no machine opcodes at all will be added by the
compiler to the normal execution path in order to support exceptions).

(The only possible overhead you can get from exceptions in the normal
execution path is caused by the additional conditionals you yourself
explicitly write in order to check the things that could throw.)
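A small sketch of the "implicit try-catch" point quoted above, with made-up names (ScopeLogger, mayThrow, work, run). The function work() contains no "try", yet the compiler still has to arrange for both destructors to run if mayThrow() throws; how much that costs on the non-throwing path depends on the ABI, as discussed in this thread.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical RAII type: appends a marker to a log when it is destroyed.
class ScopeLogger
{
public:
    ScopeLogger(std::string& log, const std::string& name) : log_(log), name_(name) {}
    ~ScopeLogger() { log_ += "~" + name_ + ";"; }
private:
    std::string& log_;
    std::string name_;
};

void mayThrow(bool doThrow)
{
    if (doThrow)
        throw std::runtime_error("failure");
}

// No explicit try block here, but two objects need cleanup on every exit path.
void work(std::string& log, bool doThrow)
{
    ScopeLogger outer(log, "outer");
    ScopeLogger inner(log, "inner");
    mayThrow(doThrow);
    log += "done;";
}

// Runs work() and returns the order in which things happened.
std::string run(bool doThrow)
{
    std::string log;
    try { work(log, doThrow); }
    catch (const std::runtime_error&) { log += "caught;"; }
    return log;
}
```

run(false) yields "done;~inner;~outer;" and run(true) yields "~inner;~outer;caught;": the destructors run during unwinding, before the handler is entered.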

Öö Tiib · Feb 5, 2011

This is an open subject. The only thing forbidden is throwing from a
destructor during exception stack unwinding, and there are ways
to avoid that.

What ways do you mean? Is there something like:

bool std::unwinding(); // true during stack unwinding

There is at least one library where throwing from the destructor is
legitimate (an SQL library IIRC); it uses RAII for commit/execute of
the SQL query. An exception is thrown upon error.

However, I prefer a std::logic_error thrown from a destructor and
being warned (even if it kills the program) to having something
catch it silently and let the program continue to run in an incorrect
state.

Goran is right. *Generally*, throwing from cleanup functions or
destructors is an unexpected feature. I told it to die ... wtf, it throws
now that it has two left legs and so can't die? It should have thrown
when the second left leg was added to it.

Unexpected behavior is always confusing and annoying, even when there
are workarounds. std::stack is a generic standard container, and in
several cases throwing from pop() would be an unexpected complication
for its users, as would throwing from the destructors of the stack's
elements.

I would just avoid using an SQL library with a special design that throws
from destructors; there are plenty of SQL libraries that do not. Using
destructors on exit from scope to silently do a lot of things other
than cleaning up is an abuse of destructors and confusing at best.

Bo Persson · Feb 5, 2011

Juha said:
You don't need to judge the cost of throwing (and catching) at all,
because they are *expected* to be somewhat heavy operations
(relatively speaking, at least). What really matters is that if
nothing throws, your code will basically be as fast as if the
compiler didn't support exceptions at all (in other words, no
machine opcodes at all will be added by the compiler to the normal
execution path in order to support exceptions).

(The only possible overhead you can get from exceptions in the
normal execution path is caused by the additional conditionals you
yourself explicitly write in order to check the things that could
throw.)

Arguably, this isn't overhead either, as you reasonably would have to
check these conditions anyway.

Bo Persson

Miles Bader · Feb 5, 2011

Juha Nieminen said:
You don't need to judge the cost of throwing (and catching) at all,
because they are *expected* to be somewhat heavy operations (relatively
speaking, at least). What really matters is that if nothing throws, your
code will basically be as fast as if the compiler didn't support exceptions
at all (in other words, no machine opcodes at all will be added by the
compiler to the normal execution path in order to support exceptions).

Yes, I know that's true for some ABIs (modern gcc on Linux, etc.), but
in this case, the subject was VC++'s 32-bit ABI, where try-catch blocks
do incur a runtime cost, even when no exception is thrown.

-Miles

Öö Tiib · Feb 5, 2011

No, my point is that in a VC++ 32-bit style ABI there's overhead in the
no-exception-thrown path for every use of RAII if an exception can
_possibly_ occur, compared to a more modern ABI.

Is it really so? I think the stack is untouched when an exception flows
through. If nothing catches it, then no destructors run despite your
raii-shmaii, and the application exits.

Since use of RAII is of course very common in typical C++ code, this
can be an issue.

Of course; but this path _is_ often affected, because of RAII (despite
the lack of explicit "try" statements).

Sounds like a strange concept. Which MS compiler does that?

Michael Doubez · Feb 5, 2011

What ways do you mean? Is there something like:

bool std::unwinding(); // true during stack unwinding

There is std::uncaught_exception(). If it returns true, then you are
in stack unwinding due to an exception and you can refrain from
throwing.
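A minimal sketch of that idea, under some assumptions: it uses std::uncaught_exceptions() (the counting successor added in C++17, since the bool version was later deprecated), and the class Committer and function runScenario are invented for illustration. Note that since C++11 a destructor has to be declared noexcept(false) to be allowed to throw at all.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical commit-on-destruction class, loosely modelled on the SQL
// library mentioned in this thread (not its real API).
class Committer
{
public:
    explicit Committer(bool commitFails)
        : commitFails_(commitFails),
          exceptionsAtEntry_(std::uncaught_exceptions()) {}

    ~Committer() noexcept(false)
    {
        if (!commitFails_)
            return;
        // More uncaught exceptions than at construction means this destructor
        // is running as part of stack unwinding: throwing now would terminate.
        if (std::uncaught_exceptions() > exceptionsAtEntry_)
            return;  // swallow the commit error
        throw std::runtime_error("commit failed");
    }

private:
    bool commitFails_;
    int exceptionsAtEntry_;
};

// Returns the message of whichever exception reaches the handler.
std::string runScenario(bool bodyThrows)
{
    try
    {
        Committer c(true);  // the commit will fail in both scenarios
        if (bodyThrows)
            throw std::logic_error("body failed");
    }
    catch (const std::exception& e)
    {
        return e.what();
    }
    return "no exception";
}
```

runScenario(false) reports "commit failed" (the destructor throws on normal scope exit), while runScenario(true) reports "body failed" (the destructor stays silent during unwinding). Whether silently dropping the commit error is acceptable is exactly the design question debated here.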

Goran is right. *Generally*, throwing from cleanup functions or
destructors is an unexpected feature. I told it to die ... wtf, it throws
now that it has two left legs and so can't die? It should have thrown
when the second left leg was added to it.

Are you sure that all your destructors only call functions that have
the nothrow guarantee? I am not so sure about mine.

Unexpected behavior is always confusing and annoying, even when there
are workarounds. std::stack is a generic standard container

nitpicking: it is a standard container adapter.

By the way, if I change the underlying container, it could well be a
container that throws upon pop_back/pop_front (for example, one
specially designed to allow it).
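As a sketch of that remark: std::stack forwards pop() to the pop_back() of whatever container it adapts, so a container with a checking pop_back() makes pop() throw on an empty stack. CheckedVector, CheckedStack and popThrows are hypothetical names, and deriving from std::vector is done here only to keep the example short.

```cpp
#include <stack>
#include <stdexcept>
#include <vector>

// Hypothetical container: a vector whose pop_back() checks for emptiness.
template <typename T>
class CheckedVector : public std::vector<T>
{
public:
    // Hides std::vector<T>::pop_back(); std::stack calls this one because
    // it holds the container by its exact type.
    void pop_back()
    {
        if (this->empty())
            throw std::out_of_range("pop_back() on empty container");
        std::vector<T>::pop_back();
    }
};

typedef std::stack<int, CheckedVector<int> > CheckedStack;

// Returns true if popping from s raised std::out_of_range.
bool popThrows(CheckedStack& s)
{
    try { s.pop(); }
    catch (const std::out_of_range&) { return true; }
    return false;
}
```

With this adapter, popping the last element succeeds and the next pop() throws instead of invoking undefined behaviour.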

Well, that's purely theoretical and I agree that not throwing from a
destructor is the rule unless you have a very good reason to break it.

and in
several cases throwing from pop() would be an unexpected complication
for its users, as would throwing from the destructors of the stack's
elements.

I would just avoid using an SQL library with a special design that throws
from destructors; there are plenty of SQL libraries that do not. Using
destructors on exit from scope to silently do a lot of things other
than cleaning up is an abuse of destructors and confusing at best.

I have not used it personally, but I remember looking at the
documentation; it made sense and it was quite a nice, expressive
syntax.

Michael Doubez · Feb 5, 2011

There has been a proposal IIRC, but there are some cases where it is a
valid design.

There are certain, very special cases where it is appropriate
for a destructor to throw. But only in such very special cases.
Such classes can't be put into a standard container, since
(§17.4.3.6):

In particular, the effects are undefined in the
following cases:
[...]
if any replacement function or handler function or
destructor operation throws an exception, unless
specifically allowed in the applicable Required
behavior paragraph.

A type used in a standard container is not allowed to exit
a destructor via an exception.

Thanks, I didn't know that. Well, that cuts my argument.

Does this apply to standard adaptors?

Joshua Maurice · Feb 5, 2011

Interesting. I'll get the assembly for my tests and post it. I wonder
if I'm measuring random noise or something for my Windows AMD 64
test.

Also, your deduction is correct that I am using Visual Studio 2008. I
should have mentioned that. I wonder if 2010 makes a difference.
That'll likely have to wait until Monday or Tuesday though.

Michael Doubez · Feb 5, 2011

Wrong, std::stack::pop() calls std::stack::c.pop_back(); pop_back() on
an empty container is undefined behaviour; invoking undefined behaviour
is a bug which must be fixed.

I missed that. Thanks for the correction.
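For code that does want an exception instead of undefined behaviour, the check can be written on the caller's side. This is a minimal sketch; checkedPop is a made-up helper, not part of the standard library.

```cpp
#include <stack>
#include <stdexcept>
#include <utility>

// Hypothetical helper: removes and returns the top element, or throws
// std::out_of_range if the stack is empty.
template <typename T, typename Container>
T checkedPop(std::stack<T, Container>& s)
{
    if (s.empty())
        throw std::out_of_range("checkedPop: stack is empty");
    T value = std::move(s.top());  // take the element before removing it
    s.pop();
    return value;
}
```

Usage is simply int x = checkedPop(myStack);. The caller pays for the emptiness test only where it asks for it, which matches the point made earlier in the thread that such conditionals are the only cost on the normal path.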

Paul · Feb 5, 2011

This will help you understand the Win32 ASM code, and possibly the other
ASM code as well:
http://win32assembly.online.fr/Exceptionhandling.html

Please note the part that says:
"C programmers will use various shortcuts provided by their compilers by
including in their source code statements such as _try, _except, _finally,
_catch and _throw.
One real disadvantage in relying on the compiler's code is that it can
enlarge the final exe file enormously. "

But I guess you guys are trying to establish if C++ EH produces an overhead
in terms of number of opcodes processed and not program size.

Andre Kaufmann · Feb 6, 2011

On 05.02.2011 22:52, Joshua Maurice wrote:
[...]

I've logged in remotely at work and the results are as I had suspected.
The same results (performance of x64 exception code == performance x64
no exception code), besides some slight differences in the test "Virtual
Exception". After I've changed the code slightly, to force the optimizer
to generate (significant) code, I've got the same results.

>[...]

Interesting. I'll get the assembly for my tests and post it. I wonder
if I'm measuring random noise or something for my Windows AMD 64
test.

The generated assembly code of the small test program is the same as
under VS2010.
I wonder why you got different results; your settings are correct and
the compiler version is sufficient too. Curious.

Also, your deduction is correct that I am using Visual Studio 2008. I
should have mentioned that. I wonder if 2010 makes a difference.
That'll likely have to wait until Monday or Tuesday though.

I used VS2008 SP1 for the tests, but I don't think that matters anyway.

Joshua Maurice · Feb 6, 2011

I used VS2008 SP1 for the tests, but I don't think that matters anyways.

Dunno. I'll look at the generated assembly when I get back to work.

James Kanze · Feb 8, 2011

Thanks, I didn't know that. Well, that cuts my argument.

Does this apply to standard adaptors?

I would guess that it was meant to, but I'm not sure that the
standard is clear about it. Fundamentally, I'd say that there's
no reason for it to impose any requirements that the underlying
container didn't impose.

On the other hand, in all but very special cases, what does it
mean for a destructor to throw? That you haven't successfully
destructed the object. That would imply that it still exists.
Which wouldn't work very well if the object had auto lifetime,
or was contained in an object with auto lifetime.

(In the only reasonable cases I've seen for a destructor
throwing, the object was designed to be used exclusively as a
temporary, and the destructor throwing was a means of deferring
the throw until some other operations -- generally formatting
the message -- had finished.)
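A sketch of the kind of temporary described in the parenthesis above, with invented names (ThrowOnDestroy, formatAndCatch); it is one possible reading of the idiom, not code from any particular library. The message is built with operator<<, and the throw is deferred to the destructor, which runs at the end of the full expression.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, meant to be used only as a temporary:
//     ThrowOnDestroy() << "text " << value;
class ThrowOnDestroy
{
public:
    ThrowOnDestroy() {}
    ThrowOnDestroy(const ThrowOnDestroy&) = delete;
    ThrowOnDestroy& operator=(const ThrowOnDestroy&) = delete;

    // The destructor does the actual throw, once formatting has finished.
    ~ThrowOnDestroy() noexcept(false)
    {
        throw std::runtime_error(message_.str());
    }

    template <typename T>
    ThrowOnDestroy& operator<<(const T& value)
    {
        message_ << value;
        return *this;
    }

private:
    std::ostringstream message_;
};

// Builds an error message in one expression and returns what was caught.
std::string formatAndCatch(int code)
{
    try
    {
        ThrowOnDestroy() << "error code " << code << " while parsing";
    }
    catch (const std::runtime_error& e)
    {
        return e.what();
    }
    return "";
}
```

One caveat of this sketch: if the formatting itself throws, the destructor runs during unwinding and its own throw terminates the program, which is part of why such classes stay a very special case.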

Joshua Maurice · Feb 9, 2011

On 05.02.2011 22:52, Joshua Maurice wrote:
[...]

Click to expand...

I've logged in remotely at work and the results are as I had suspected.
The same results (performance of x64 exception code == performance x64
no exception code), besides some slight differences in the test "Virtual
Exception". After I've changed the code slightly, to force the optimizer
to generate (significant) code, I've got the same results.

[...]
Interesting. I'll get the assembly for my tests and post it. I wonder
if I'm measuring random noise or something for my windowws AMD 64
test.

The generated assembly code of the small test program is the same as
under VS2010.
I wonder why you got different results, your settings are correct and
the compiler version is sufficient too. Curious.

I used VS2008 SP1 for the tests, but I don't think that matters anyways.

Dunno. I'll look at the generated assembly when I get back to work.

Close to the release now. I just got some time. I reran the tests, and
I still got similar results.

???>test_solution.exe 50000 -10
Virtual Exception : 8.255
Inlineable Global Return Code : 1.686
Inlineable Member Return Code : 1.661
Manually Inlined Optimized Return Code : 1.721
Virtual Return Code Fake Try : 9.104
Virtual Return Code : 6.736
Manually Inlined Return Code : 1.686

???>test_solution.exe 50000 -10
Manually Inlined Optimized Return Code : 1.645
Inlineable Member Return Code : 1.667
Inlineable Global Return Code : 1.717
Virtual Return Code : 6.574
Manually Inlined Return Code : 1.62
Virtual Exception : 8.2
Virtual Return Code Fake Try : 9.113

???>test_solution.exe 50000 -10
Inlineable Global Return Code : 1.717
Inlineable Member Return Code : 1.658
Virtual Return Code Fake Try : 9.137
Virtual Return Code : 6.544
Virtual Exception : 8.228
Manually Inlined Optimized Return Code : 1.702
Manually Inlined Return Code : 1.695
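
For readers without the original test code at hand, here is a minimal sketch of what the "Virtual Return Code" and "Virtual Exception" variants compare. Only the names TestInterface, virtualReturnCode and Failure appear in the posted fragments; TestImpl, virtualException, the two counting functions, secondsFor and the failure condition (a + b == failIfEquals) are assumptions made up for this illustration, not the actual benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>

enum ReturnCode { Success, Failure };

struct TestInterface {
    virtual ~TestInterface() {}
    virtual ReturnCode virtualReturnCode(int a, int b, int failIfEquals) = 0;
    virtual void virtualException(int a, int b, int failIfEquals) = 0;
};

// Hypothetical implementation: fails when a + b equals failIfEquals.
struct TestImpl : TestInterface {
    ReturnCode virtualReturnCode(int a, int b, int failIfEquals) override {
        return (a + b == failIfEquals) ? Failure : Success;
    }
    void virtualException(int a, int b, int failIfEquals) override {
        if (a + b == failIfEquals) throw std::runtime_error("failure");
    }
};

// Error reporting via return code: every call site tests the result.
long countFailuresReturnCode(TestInterface& x, int n, int failIfEquals) {
    long failures = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            if (x.virtualReturnCode(i, j, failIfEquals) == Failure) ++failures;
    return failures;
}

// Error reporting via exception: the success path has no explicit test.
long countFailuresException(TestInterface& x, int n, int failIfEquals) {
    long failures = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            try {
                x.virtualException(i, j, failIfEquals);
            } catch (const std::runtime_error&) {
                ++failures;
            }
        }
    return failures;
}

// Wall-clock seconds taken by one call of f.
template <typename F>
double secondsFor(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Passing a negative failIfEquals such as -10 means the failure branch is never taken, so only the cost of the success path is measured. A real benchmark would also have to stop the optimizer from devirtualizing or discarding the calls.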

Here's the disassembly output from the Visual Studio debugger. I'm
not sure what all of that means. I can barely follow it even with
googling some of the opcodes. I ask you to please make sense of it.

void testVirtualException(TestInterface *& x, int loopIterations, int
failIfEqualsNumber)
{ for (int i=0; i<loopIterations; ++i)
000000013FBE1780 mov dword ptr [rsp+18h],r8d
000000013FBE1785 mov dword ptr [rsp+10h],edx
000000013FBE1789 mov qword ptr [rsp+8],rcx
000000013FBE178E push rbx
000000013FBE178F push rsi
000000013FBE1790 push rdi
000000013FBE1791 push r12
000000013FBE1793 push r13
000000013FBE1795 sub rsp,30h
000000013FBE1799 mov qword ptr [rsp+28h],0FFFFFFFFFFFFFFFEh
000000013FBE17A2 mov r12d,r8d
000000013FBE17A5 mov esi,edx
000000013FBE17A7 mov r13,rcx
000000013FBE17AA xor edi,edi
000000013FBE17AC mov dword ptr [i],edi
000000013FBE17B0 cmp edi,esi
000000013FBE17B2 jge $LN15+26h (13FBE1800h)

void testVirtualReturnCode(TestInterface *& x, int loopIterations, int
failIfEqualsNumber)
{ for (int i=0; i<loopIterations; ++i)
000000013F6B16D0 test edx,edx
000000013F6B16D2 jle testVirtualReturnCode+0A6h (13F6B1776h)
000000013F6B16D8 push rbp
000000013F6B16D9 push rsi
000000013F6B16DA push rdi
000000013F6B16DB sub rsp,20h
000000013F6B16DF mov qword ptr [x],rbx
000000013F6B16E4 mov qword ptr [loopIterations],r12
000000013F6B16E9 mov qword ptr [failIfEqualsNumber],r13
000000013F6B16EE lea r13,[TestImpl2::`vftable' (13F6B4698h)]
000000013F6B16F5 mov edi,edx
000000013F6B16F7 mov rsi,rcx
000000013F6B16FA xor ebp,ebp
000000013F6B16FC lea r12d,[r8+6]
{ for (int j=0; j<loopIterations; ++j)
000000013F6B1700 xor ebx,ebx
{ for (int j=0; j<loopIterations; ++j)
{ try
{ if (Failure == x->virtualReturnCode(i, j, failIfEqualsNumber+5))
{ cout << "virtual return code with fake try, returned failure" <<
endl;
x = new TestImpl2();
}
} catch (...)
{ cout << "ERROR impossible exception caught" << endl;
x = new TestImpl2();
}
}
}
}

void testVirtualReturnCodeFakeTry(TestInterface *& x, int
loopIterations, int failIfEqualsNumber)
{ for (int i=0; i<loopIterations; ++i)
000000013F6B15D0 mov dword ptr [rsp+18h],r8d
000000013F6B15D5 mov dword ptr [rsp+10h],edx
000000013F6B15D9 mov qword ptr [rsp+8],rcx
000000013F6B15DE push rbx
000000013F6B15DF push rsi
000000013F6B15E0 push rdi
000000013F6B15E1 push r12
000000013F6B15E3 push r13
000000013F6B15E5 push r14
000000013F6B15E7 sub rsp,38h
000000013F6B15EB mov qword ptr [rsp+28h],0FFFFFFFFFFFFFFFEh
000000013F6B15F4 mov r14d,r8d
000000013F6B15F7 mov r12d,edx
000000013F6B15FA mov r13,rcx
000000013F6B15FD xor esi,esi
000000013F6B15FF mov dword ptr [i],esi
000000013F6B1603 lea rdi,[TestImpl2::`vftable' (13F6B4698h)]
000000013F6B160A nop word ptr [rax+rax]
000000013F6B1610 cmp esi,r12d
000000013F6B1613 jge $LN18+3Dh (13F6B16BFh)
{ cout << "inlineable member return code, vector size comparison true"
<< endl;
x = new TestImpl2();
}
}

And ack. Remind me to copy the full disassembly next time. Forgot to,
and I don't have access to that computer for a while now.

Andre Kaufmann · Feb 10, 2011

On Feb 5, 10:27 pm, Andre Kaufmann wrote:

[...]
Here's the disassembly output from the visual studios debugger. I'm
not sure what all of that means. I can barely follow it even with
googling some of the opcodes. I ask you to please make sense of it.

Thank you for posting the outputs.
So far the code seems to be ok.

There are some minor differences (different registers etc.), but none
of the typical exception stack initialization and tracking of created
objects that you get with VC++ on Windows x86.

I still wonder why you see differences between calling your test program
with different parameters under Windows x64 -
perhaps we are discussing different things ?

Do you get different timings with these two sets of parameters?

60000 -10

and

60000 -10 1

(I deactivated the outputs, but used a global variable to fake the
optimizer).
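
A minimal sketch of that kind of trick, assuming a volatile global is used as the sink. The names g_sink and sumOfProducts are made up for this illustration; the actual change to the test program was not posted.

```cpp
#include <cassert>

// Hypothetical global sink. Because it is volatile, the store below is an
// observable side effect, so the compiler may not discard the computed result.
volatile long long g_sink = 0;

// Some work whose result would otherwise be unused in a benchmark loop.
long long sumOfProducts(int n) {
    long long total = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            total += static_cast<long long>(i) * j;
        }
    }
    g_sink = total;   // publish the result instead of printing it
    return total;
}
```

This only guarantees that the final value is produced. A clever optimizer may still simplify how it gets there, so looking at the generated assembly remains the safer check.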

The only difference in my settings: I deactivated the security
cookie code generation.

And ack. Remind me to copy the full disassembly next time. Forgot to,
and I don't have access to that computer for a while now.

Some code is missing but I don't think it's relevant.

E.g. for the code:

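#include <cstdio>
#include <tchar.h>   // _tmain (VC++ specific)

void throwit() { throw 1; }

struct v
{
    // Throws as soon as the argument exceeds 3
    v(int argc) { if (argc > 3) throwit(); }
    ~v() { printf("destroyed\r\n"); }
};

int _tmain(int argc, char* argv[])
{
    try
    {
        // The compiler has to know which of these objects already
        // exist when one of the constructors throws
        v v1(argc+1);
        v v2(argc+2);
        v v3(argc+3);
        v v4(argc+4);
        v v5(argc+5);
    }
    catch(...)
    {
        printf("Exception");
    }
    return 0;
}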

The relevant x86 assembly code would be:

Initialization of the exception stack: [fs] cpu segment register

mov eax,dword ptr fs:[00000000h]
push eax
mov dword ptr fs:[0],esp

And to keep track of created objects

mov byte ptr [ebp-4], x [x=0..]

(I omit the code for deinitialization - restoration of exception stack
at the end of the function)

This surely is the old "Use exception stack to keep track of created
objects" method. In x64 code this tracking is not used, therefore there
should be no (significant) overhead between code using exceptions and
code which doesn't use exceptions.

And this should be the same (comparable) implementation as for GCC under
Linux. At least I don't know any better (faster) code, than code which
isn't executed ;-)
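
To show what that bookkeeping has to achieve at run time, regardless of how the compiler implements it, here is a small sketch that logs which objects are constructed and destroyed when a constructor throws. The names g_log, Tracked and run are made up for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical event log, filled by the constructors and destructors below.
std::string g_log;

struct Tracked {
    int id;

    // Throws its own id when it matches failAt; in that case the object
    // never comes into existence and its destructor will not run.
    Tracked(int id_, int failAt) : id(id_) {
        if (id_ == failAt) throw id_;
        g_log += "ctor" + std::to_string(id_) + " ";
    }

    ~Tracked() { g_log += "dtor" + std::to_string(id) + " "; }
};

// Constructs three objects in a try block and returns the resulting log.
std::string run(int failAt) {
    g_log.clear();
    try {
        Tracked t1(1, failAt);
        Tracked t2(2, failAt);
        Tracked t3(3, failAt);
    } catch (int id) {
        g_log += "caught" + std::to_string(id);
    }
    return g_log;
}
```

For example, assert(run(3) == "ctor1 ctor2 dtor2 dtor1 caught3") should hold: only the two fully constructed objects are destroyed, in reverse order. The x86 code records that state with explicit stores, while the table-based x64 scheme derives it from the instruction address at the point of the throw.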

Andre
