Why does std::stack::pop() not throw an exception if the stack is empty?

  • Thread starter Debajit Adhikary

Ian Collins

Here's a linux test for comparison:

psflor.informatica.com ~$ uname -a
Linux <host-name> 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

Test results for one execution (with output rearranged):

Manually Inlined Optimized Return Code : 5.21
Manually Inlined Return Code           : 5.21
Inlineable Global Return Code          : 5.21
Inlineable Member Return Code          : 5.21
Virtual Return Code                    : 10.42
Virtual Exception                      : 9.33
Virtual Return Code Fake Try           : 10.41

Sun CC on Solaris gives similar relative results:

Manually Inlined Optimized Return Code : 2.62
Manually Inlined Return Code           : 2.61
Inlineable Global Return Code          : 2.6
Inlineable Member Return Code          : 2.65
Virtual Return Code                    : 11.79
Virtual Exception                      : 10.48
Virtual Return Code Fake Try           : 13.02
 
Andre Kaufmann

...

> Ok. Here we go. My test code is at the end. All results are from runs
> executed today.

Thank you for the test code and your work. That's quite fair and a good
basis for discussion.

So it's only fair that I invest some time of my own.

> [...]
> Also, if you see any problems with my test, please say so, so I can
> fix it and rerun it.

No problem, but some remarks:

- Windows timers based on TickCount and Clock are quite inaccurate,
  since they are updated on each timer interrupt and therefore have a
  resolution of about 15 ms (on most systems). For long-running tests
  (such as this one) and to get a first impression they are OK.
  I prefer performance counters for high-precision timings
  (but for this test I don't think it makes a difference).

- From the command line parameters I conclude you are using VS 2008.
  I use VS 2010. I don't think that they've changed the exception model
  completely in VS 2010, but who knows; I'm going to check this at work
  next week. There is a free version of VS 2010 available, but I don't
  know if the code optimization is restricted (AFAIK no, since 2010).

I ran the tests on my laptop (Intel dual core, 2.2 GHz).
I used the same parameters, but with full optimization, which shouldn't
have a significant effect.

C++ command line:
/Zi /nologo /W3 /WX- /Ox /Oi /Ot /GL /D "WIN32" /D "NDEBUG" /D
"_CONSOLE" /D "_UNICODE" /D "UNICODE" /Gm- /EHsc /GS /Gy /fp:precise
/Zc:wchar_t /Zc:forScope /Yu"StdAfx.h" /Fp"x64\Release\ForumCpp.pch"
/Fa"x64\Release\" /Fo"x64\Release\" /Fd"x64\Release\vc100.pdb" /Gd
/errorReport:queue

Linker:

/OUT:"Cpp.exe" /INCREMENTAL:NO /NOLOGO "kernel32.lib" "user32.lib"
"gdi32.lib" "winspool.lib" "comdlg32.lib" "advapi32.lib" "shell32.lib"
"ole32.lib" "oleaut32.lib" "uuid.lib" "odbc32.lib" "odbccp32.lib"
/MANIFEST /ManifestFile:"x64\Release\Cpp.exe.intermediate.manifest"
/ALLOWISOLATION /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /DEBUG
/PDB:"trash\Cpp.pdb" /SUBSYSTEM:CONSOLE /OPT:REF /OPT:ICF /PGD:"Cpp.pgd"
/LTCG /TLBID:1 /DYNAMICBASE /NXCOMPAT /MACHINE:X64 /ERRORREPORT:QUEUE

I've got the following results: (rearranged output)

Command line: 60000 -10 1:

Inlineable Member Return Code          : 3.182
Inlineable Global Return Code          : 3.151
Manually Inlined Return Code           : 3.198
Virtual Return Code                    : 14.368
Virtual Return Code Fake Try           : 17.394
Manually Inlined Optimized Return Code : 3.182
Virtual Exception                      : 14.243


Command line: 60000 -10:

Inlineable Member Return Code          : 3.26
Inlineable Global Return Code          : 3.198
Manually Inlined Return Code           : 3.198
Virtual Return Code                    : 14.259
Virtual Return Code Fake Try           : 17.459
Manually Inlined Optimized Return Code : 3.183
Virtual Exception                      : 17.487


Nearly the same results, apart from some negligible differences.
Only one result ("Virtual Exception") shows a difference.
After I changed the code

class TestImpl2 : public TestInterface
{
public:
    virtual void virtualCanThrow(int a, int b, int targetSum)
    {
        // cout << "XXX" << endl;
    }
    virtual ReturnCodeT virtualReturnCode(int a, int b, ...
    {
        // cout << "XXX" << endl;
        return Success;
    }
};

into

class TestImpl2 : public TestInterface
{
public:
    virtual void virtualCanThrow(int a, int b, int targetSum)
    {
        if (a + b == targetSum)
            cout << "XXX" << endl;
    }
    virtual ReturnCodeT virtualReturnCode(int a, int b, int ...
    {
        // cout << "XXX" << endl;
        if (a + b == targetSum)
            cout << "XXX" << endl;
        return Success;
    }
};

I got the same results.
So far I don't see any speed differences.
> However, sadly too many compiler writers and ABI writers missed this
> very important memo, which I think is core to the entire C++

For x86 Windows I agree.
But let's have a look at the assembly code of your other test code:

int main(int argc, char* argv[])
{
    try
    {
        if (argc == 3) throw 1;
    }
    catch(int)
    {
        return -1;
    }
    return 0;
}

Windows x86 code VC2010:

00BA1002  in          al,dx
00BA1003  push        0FFFFFFFFh
00BA1005  push        offset __ehhandler$_wmain (0BA1980h)
00BA100A  mov         eax,dword ptr fs:[00000000h]
00BA1010  push        eax
00BA1011  mov         dword ptr fs:[0],esp
00BA1018  sub         esp,8
        try
        {
                if (argc == 3) throw 1;
00BA101B  cmp         dword ptr [ebp+8],3
00BA101F  push        ebx
00BA1020  push        esi
00BA1021  push        edi
00BA1022  mov         dword ptr [ebp-10h],esp
00BA1025  mov         dword ptr [ebp-4],0
00BA102C  jne         $LN8+14h (0BA105Dh)
00BA102E  push        offset __TI1H (0BA22C8h)
00BA1033  lea         eax,[ebp-14h]
00BA1036  push        eax
00BA1037  mov         dword ptr [ebp-14h],1
00BA103E  call        _CxxThrowException (0BA1970h)
        }
        catch(int)
        {
                return -1;
00BA1043  mov         eax,offset $LN8 (0BA1049h)
00BA1048  ret
$LN8:
00BA1049  or          eax,0FFFFFFFFh
        }
        return 0;
}
00BA104C  mov         ecx,dword ptr [ebp-0Ch]
00BA104F  mov         dword ptr fs:[0],ecx
00BA1056  pop         edi
00BA1057  pop         esi
00BA1058  pop         ebx
00BA1059  mov         esp,ebp
00BA105B  pop         ebp
00BA105C  ret
00BA105D  mov         ecx,dword ptr [ebp-0Ch]
00BA1060  pop         edi
00BA1061  pop         esi
00BA1062  xor         eax,eax
00BA1064  mov         dword ptr fs:[0],ecx
00BA106B  pop         ebx
00BA106C  mov         esp,ebp
00BA106E  pop         ebp
00BA106F  ret


Windows x64 code VC2010:

        {
                if (argc == 3) throw 1;
000000013F13100D  cmp         ecx,3
000000013F131010  jne         wmain+2Ch (13F13102Ch)
000000013F131012  mov         dword ptr [rsp+20h],1
000000013F13101A  lea         rdx,[_TI1H (13F1324A0h)]
000000013F131021  lea         rcx,[rsp+20h]
000000013F131026  call        _CxxThrowException (13F13190Eh)
000000013F13102B  nop
        }
        return 0;
000000013F13102C  xor         eax,eax
000000013F13102E  jmp         $LN8+3 (13F131033h)
        {
                return -1;
000000013F131030  or          eax,0FFFFFFFFh
        }
000000013F131033  add         rsp,38h
000000013F131037  ret


There is a huge difference. The x86 code uses the old implementation:
an exception handler stack whose top pointer is kept at fs:[0] and has
to be updated on entry to and exit from the function.
In the x64 code there is no exception stack anymore, since the compiler
uses a static table for stack unwinding; therefore there is no overhead
if no exception is thrown.
So the implementation should be comparable to Linux / Unix systems and
compilers.

As I already mentioned, SEH doesn't add any overhead (if the
compiler ignores SEH exceptions and doesn't need to track them when
thrown), so it's the compiler's fault if there is a speed difference.

And I don't think that, for example, GCC under Windows uses a different
exception model than under Linux. Or does it?

 
Juha Nieminen

Miles Bader said:
> But remember that in typical C++ code you'll usually have many more
> _implicit_ try-catch blocks, inserted by the compiler to call
> the destructors of local objects.
>
> So you can't judge the cost simply by looking for "try" in your source code...

You don't need to judge the cost of throwing (and catching) at all,
because they are *expected* to be somewhat heavy operations (relatively
speaking, at least). What really matters is that if nothing throws, your
code will basically be as fast as if the compiler didn't support exceptions
at all (in other words, no machine opcodes at all will be added by the
compiler to the normal execution path in order to support exceptions).

(The only possible overhead you can get from exceptions in the normal
execution path is caused by the additional conditionals you yourself
explicitly write in order to check the things that could throw.)
 
Öö Tiib

> This is an open subject. The only thing forbidden is throwing from a
> destructor during exception stack unwinding, and there are ways
> to avoid that.

Which ways do you mean? Is there something like:

bool std::unwinding(); // true during stack unwinding

> There is at least one library where throwing from the destructor is
> legitimate (an SQL library IIRC); it uses RAII for commit/execute of
> the SQL query. An exception is thrown upon error.
>
> However, I prefer a std::logic_error thrown from a destructor and
> being warned (even if it kills the program) to having something
> catch it silently and let the program continue to run in an incorrect
> state.

Goran is right. *Generally*, throwing from cleanup functions or
destructors is an unexpected feature. I told it to die ... wtf, it throws
now that it has two left legs and so can't die? It should have thrown
when the second left leg was added to it.

Unexpected behavior is always confusing and annoying, even when there
are workarounds. std::stack is a generic standard container, and in
several cases throwing from pop() would be an unexpected complication
for its users, as would be throwing from the destructors of the
stack's elements.

I would just avoid using an SQL library with a special design that throws
from destructors; there are plenty of SQL libraries that do not. Using
destructors on exit from scope to silently do a lot of things other than
cleaning up is an abuse of destructors and confusing at best.
 
Bo Persson

Juha said:
> You don't need to judge the cost of throwing (and catching) at all,
> because they are *expected* to be somewhat heavy operations
> (relatively speaking, at least). What really matters is that if
> nothing throws, your code will basically be as fast as if the
> compiler didn't support exceptions at all (in other words, no
> machine opcodes at all will be added by the compiler to the normal
> execution path in order to support exceptions).
>
> (The only possible overhead you can get from exceptions in the
> normal execution path is caused by the additional conditionals you
> yourself explicitly write in order to check the things that could
> throw.)

Arguably, this isn't overhead either, as you reasonably would have to
check these conditions anyway.


Bo Persson
 
Miles Bader

Juha Nieminen said:
> You don't need to judge the cost of throwing (and catching) at all,
> because they are *expected* to be somewhat heavy operations (relatively
> speaking, at least). What really matters is that if nothing throws, your
> code will basically be as fast as if the compiler didn't support exceptions
> at all (in other words, no machine opcodes at all will be added by the
> compiler to the normal execution path in order to support exceptions).

Yes, I know that's true for some ABIs (modern gcc on Linux, etc.), but
in this case the subject was VC++'s 32-bit ABI, where try-catch blocks
do incur a runtime cost, even when no exception is thrown.

-Miles
 
Öö Tiib

> No, my point is that in a VC++ 32-bit style ABI there's overhead in the
> no-exception-thrown path for every use of RAII if an exception can
> _possibly_ occur, compared to a more modern ABI.

Is it really so? I think that the stack is untouched when an exception
flows through. If nothing catches, then no destructors run, despite your
raii-shmaii, and the application exits.

> Since use of RAII is of course very common in typical C++ code, this
> can be an issue.
>
> Of course; but this path _is_ often affected, because of RAII (despite
> the lack of explicit "try" statements).

Sounds like a strange concept. Which MS compiler does it?
 
Michael Doubez

> Which ways do you mean? Is there something like:
>
>  bool std::unwinding(); // true during stack unwinding

There is std::uncaught_exception(). If it returns true, then you are
in stack unwinding due to an exception and you can refrain from
throwing.

> Goran is right. *Generally*, throwing from cleanup functions or
> destructors is an unexpected feature. I told it to die ... wtf, it throws
> now that it has two left legs and so can't die? It should have thrown
> when the second left leg was added to it.

Are you sure that all your destructors only call functions that have
the nothrow guarantee? I am not so sure about mine.

> Unexpected behavior is always confusing and annoying, even when there
> are workarounds. std::stack is a generic standard container

Nitpicking: it is a standard container adapter.

By the way, if I change the underlying container, it could well be a
container that throws upon pop_back/pop_front (for example, one specially
designed to allow it).

Well, that's purely theoretical, and I agree that not throwing from
destructors is the rule unless you have a very good reason to break it.

> and in several cases throwing from pop() would be an unexpected
> complication for its users, as would be throwing from the destructors
> of the stack's elements.
>
> I would just avoid using an SQL library with a special design that throws
> from destructors; there are plenty of SQL libraries that do not. Using
> destructors on exit from scope to silently do a lot of things other than
> cleaning up is an abuse of destructors and confusing at best.

I have not used it personally, but I remember looking at the
documentation; it made sense and it was quite a nice, expressive
syntax.
 
Michael Doubez

> > There has been a proposal IIRC but there are some cases where it is a
> > valid design.
>
> There are certain, very special cases where it is appropriate
> for a destructor to throw.  But only in such very special cases.
> Such classes can't be put into a standard container, since
> (§17.4.3.6):
>
>     In particular, the effects are undefined in the
>     following cases:
>         [...]
>         if any replacement function or handler function or
>         destructor operation throws an exception, unless
>         specifically allowed in the applicable Required
>         behavior paragraph.
>
> A type used in a standard container is not allowed to exit
> a destructor via an exception.

Thanks, I didn't know that. Well, that cuts my argument.

Does this apply to standard adaptors?
 
Joshua Maurice

[...]

Interesting. I'll get the assembly for my tests and post it. I wonder
if I'm measuring random noise or something for my Windows AMD64
test.

Also, your deduction is correct that I am using Visual Studio 2008. I
should have mentioned that. I wonder if 2010 makes a difference.
That'll likely have to wait until Monday or Tuesday though.
 
Michael Doubez

> Wrong, std::stack::pop() calls std::stack::c.pop_back(); pop_back() on
> an empty container is undefined behaviour; invoking undefined behaviour
> is a bug which must be fixed.

I missed that. Thanks for the correction.
 
Paul

[...]

This will help you understand the Win32 ASM code, and possibly the other
ASM code as well:
http://win32assembly.online.fr/Exceptionhandling.html

Please note the part that says:
"C programmers will use various shortcuts provided by their compilers by
including in their source code statements such as _try, _except, _finally,
_catch and _throw.
One real disadvantage in relying on the compiler's code is that it can
enlarge the final exe file enormously. "

But I guess you guys are trying to establish if C++ EH produces an overhead
in terms of number of opcodes processed and not program size.
 
Andre Kaufmann

On 05.02.2011 22:52, Joshua Maurice wrote:
> [...]

I've logged in remotely at work and the results are as I had suspected:
the same results (performance of x64 exception code == performance of x64
no-exception code), apart from some slight differences in the test "Virtual
Exception". After I changed the code slightly, to force the optimizer
to generate (significant) code, I got the same results.

> [...]
> Interesting. I'll get the assembly for my tests and post it. I wonder
> if I'm measuring random noise or something for my Windows AMD64
> test.

The generated assembly code of the small test program is the same as
under VS2010.
I wonder why you got different results; your settings are correct and
the compiler version is sufficient too. Curious.

> Also, your deduction is correct that I am using Visual Studio 2008. I
> should have mentioned that. I wonder if 2010 makes a difference.
> That'll likely have to wait until Monday or Tuesday though.

I used VS2008 SP1 for the tests, but I don't think that matters anyway.
 
Joshua Maurice

[...]

Dunno. I'll look at the generated assembly when I get back to work.
 
James Kanze

> [...]
> Thanks, I didn't know that. Well, that cuts my argument.
> Does this apply to standard adaptors?

I would guess that it was meant to, but I'm not sure that the
standard is clear about it. Fundamentally, I'd say that there's
no reason for it to impose any requirements that the underlying
container didn't impose.

On the other hand, in all but very special cases, what does it
mean for a destructor to throw? That you haven't successfully
destructed the objet. That would imply that it still exists.
Which wouldn't work very well if the object had auto lifetime,
or was contained in an object with auto lifetime.

(In the only reasonable cases I've seen for a destructor
throwing, the object was designed to be used exclusively as a
temporary, and the destructor throwing was a means of deferring
the throw until some other operations -- generally formatting
the message -- had finished.)
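A minimal sketch of that temporary-only pattern, assuming C++11 or later (where a destructor is implicitly noexcept and has to opt out with noexcept(false)). The names DeferredThrow and checkPositive are made up for illustration and do not come from any library or from this thread.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical helper: meant to exist only as a temporary. It collects a
// message through operator<< and throws once the full expression ends.
class DeferredThrow {
public:
    DeferredThrow() = default;
    DeferredThrow(const DeferredThrow&) = delete;
    DeferredThrow& operator=(const DeferredThrow&) = delete;

    template <typename T>
    DeferredThrow& operator<<(const T& value) {
        message_ << value;
        return *this;
    }

    // Since C++11 a destructor is noexcept unless it says otherwise.
    // Caveat: if formatting itself throws, this destructor runs during
    // stack unwinding and the second throw ends in std::terminate.
    ~DeferredThrow() noexcept(false) {
        throw std::runtime_error(message_.str());
    }

private:
    std::ostringstream message_;
};

// A type like this is not nothrow-destructible, so it must not be stored
// in a standard container.
static_assert(!std::is_nothrow_destructible<DeferredThrow>::value,
              "DeferredThrow may throw from its destructor");

// Hypothetical usage: the throw happens after the message is formatted.
void checkPositive(int n) {
    if (n <= 0) {
        DeferredThrow() << "expected a positive value, got " << n;
    }
}
```

Calling checkPositive(-3) throws a std::runtime_error whose what() is the fully formatted message. An object of this type with auto lifetime would be a hazard for exactly the reasons given above, which is why the copy operations are deleted and the class is only ever used as a temporary.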
 
Joshua Maurice

On 05.02.2011 22:52, Joshua Maurice wrote:
[...]
I've logged in remotely at work and the results are as I had suspected.
The same results (performance of x64 exception code == performance x64
no exception code), besides some slight differences in the test "Virtual
Exception". After I've changed the code slightly, to force the optimizer
to generate (significant) code, I've got the same results.
[...]
Interesting. I'll get the assembly for my tests and post it. I wonder
if I'm measuring random noise or something for my Windows AMD64
test.
The generated assembly code of the small test program is the same as
under VS2010.
I wonder why you got different results, your settings are correct and
the compiler version is sufficient too. Curious.
I used VS2008 SP1 for the tests, but I don't think that matters anyways.

Dunno. I'll look at the generated assembly when I get back to work.

Close to the release now. I just got some time. I reran the tests, and
I still got similar results.

???>test_solution.exe 50000 -10
Virtual Exception : 8.255
Inlineable Global Return Code : 1.686
Inlineable Member Return Code : 1.661
Manually Inlined Optimized Return Code : 1.721
Virtual Return Code Fake Try : 9.104
Virtual Return Code : 6.736
Manually Inlined Return Code : 1.686

???>test_solution.exe 50000 -10
Manually Inlined Optimized Return Code : 1.645
Inlineable Member Return Code : 1.667
Inlineable Global Return Code : 1.717
Virtual Return Code : 6.574
Manually Inlined Return Code : 1.62
Virtual Exception : 8.2
Virtual Return Code Fake Try : 9.113

???>test_solution.exe 50000 -10
Inlineable Global Return Code : 1.717
Inlineable Member Return Code : 1.658
Virtual Return Code Fake Try : 9.137
Virtual Return Code : 6.544
Virtual Exception : 8.228
Manually Inlined Optimized Return Code : 1.702
Manually Inlined Return Code : 1.695

Here's the disassembly output from the Visual Studio debugger. I'm
not sure what all of that means. I can barely follow it even with
googling some of the opcodes. I ask you to please make sense of it.


void testVirtualException(TestInterface *& x, int loopIterations, int
failIfEqualsNumber)
{ for (int i=0; i<loopIterations; ++i)
000000013FBE1780 mov dword ptr [rsp+18h],r8d
000000013FBE1785 mov dword ptr [rsp+10h],edx
000000013FBE1789 mov qword ptr [rsp+8],rcx
000000013FBE178E push rbx
000000013FBE178F push rsi
000000013FBE1790 push rdi
000000013FBE1791 push r12
000000013FBE1793 push r13
000000013FBE1795 sub rsp,30h
000000013FBE1799 mov qword ptr [rsp+28h],0FFFFFFFFFFFFFFFEh
000000013FBE17A2 mov r12d,r8d
000000013FBE17A5 mov esi,edx
000000013FBE17A7 mov r13,rcx
000000013FBE17AA xor edi,edi
000000013FBE17AC mov dword ptr ,edi
000000013FBE17B0 cmp edi,esi
000000013FBE17B2 jge $LN15+26h (13FBE1800h)


void testVirtualReturnCode(TestInterface *& x, int loopIterations, int
failIfEqualsNumber)
{ for (int i=0; i<loopIterations; ++i)
000000013F6B16D0 test edx,edx
000000013F6B16D2 jle testVirtualReturnCode+0A6h (13F6B1776h)
000000013F6B16D8 push rbp
000000013F6B16D9 push rsi
000000013F6B16DA push rdi
000000013F6B16DB sub rsp,20h
000000013F6B16DF mov qword ptr [x],rbx
000000013F6B16E4 mov qword ptr [loopIterations],r12
000000013F6B16E9 mov qword ptr [failIfEqualsNumber],r13
000000013F6B16EE lea r13,[TestImpl2::`vftable' (13F6B4698h)]
000000013F6B16F5 mov edi,edx
000000013F6B16F7 mov rsi,rcx
000000013F6B16FA xor ebp,ebp
000000013F6B16FC lea r12d,[r8+6]
{ for (int j=0; j<loopIterations; ++j)
000000013F6B1700 xor ebx,ebx
{ for (int j=0; j<loopIterations; ++j)
{ try
{ if (Failure == x->virtualReturnCode(i, j, failIfEqualsNumber+5))
{ cout << "virtual return code with fake try, returned failure" <<
endl;
x = new TestImpl2();
}
} catch (...)
{ cout << "ERROR impossible exception caught" << endl;
x = new TestImpl2();
}
}
}
}


void testVirtualReturnCodeFakeTry(TestInterface *& x, int
loopIterations, int failIfEqualsNumber)
{ for (int i=0; i<loopIterations; ++i)
000000013F6B15D0 mov dword ptr [rsp+18h],r8d
000000013F6B15D5 mov dword ptr [rsp+10h],edx
000000013F6B15D9 mov qword ptr [rsp+8],rcx
000000013F6B15DE push rbx
000000013F6B15DF push rsi
000000013F6B15E0 push rdi
000000013F6B15E1 push r12
000000013F6B15E3 push r13
000000013F6B15E5 push r14
000000013F6B15E7 sub rsp,38h
000000013F6B15EB mov qword ptr [rsp+28h],0FFFFFFFFFFFFFFFEh
000000013F6B15F4 mov r14d,r8d
000000013F6B15F7 mov r12d,edx
000000013F6B15FA mov r13,rcx
000000013F6B15FD xor esi,esi
000000013F6B15FF mov dword ptr ,esi
000000013F6B1603 lea rdi,[TestImpl2::`vftable' (13F6B4698h)]
000000013F6B160A nop word ptr [rax+rax]
000000013F6B1610 cmp esi,r12d
000000013F6B1613 jge $LN18+3Dh (13F6B16BFh)
{ cout << "inlineable member return code, vector size comparison true"
<< endl;
x = new TestImpl2();
}
}


And ack. Remind me to copy the full disassembly next time. Forgot to,
and I don't have access to that computer for a while now.
 
Andre Kaufmann

On Feb 5, 10:27 pm, Andre Kaufmann wrote:

[...]
Here's the disassembly output from the Visual Studio debugger. I'm
not sure what all of that means. I can barely follow it even with
googling some of the opcodes. I ask you to please make sense of it.

Thank you for posting the output.
So far the code seems to be OK.

There are some minor differences (different registers etc.), but not the
typical exception stack initialization and tracking of created objects
that you get on Windows x86 with VC++.

I still wonder why you see differences between calling your test program
with different parameters under Windows x64 -
perhaps we are discussing different things ?

Do you get different timings with the parameters

60000 -10

and

60000 -10 1

?

(I deactivated the outputs, but used a global variable to fake the
optimizer).
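A rough sketch of that "global variable" trick, for readers who want to try it. Everything here (g_sink, work, runLoop) is a hypothetical stand-in and not the code used for the timings in this thread. The idea is that a write to a volatile global is an observable side effect, so the optimizer cannot throw the loop result away even when all console output is removed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical global sink: volatile forces the compiler to perform the
// write, so the value feeding it cannot be discarded as dead code.
volatile std::uint64_t g_sink = 0;

// Hypothetical workload standing in for the function under test.
inline std::uint64_t work(int i, int j) {
    return static_cast<std::uint64_t>(i) * 31u + static_cast<std::uint64_t>(j);
}

// Runs the nested loop and returns the accumulated value.
std::uint64_t runLoop(int loopIterations) {
    std::uint64_t total = 0;
    for (int i = 0; i < loopIterations; ++i) {
        for (int j = 0; j < loopIterations; ++j) {
            total += work(i, j);
        }
    }
    g_sink = total;  // observable side effect instead of console output
    return total;
}
```

Note that the compiler may still simplify a loop this trivial; the sink only guarantees that the final value has to be produced. With a virtual call in the loop body, as in the tests discussed here, there is much less room for that.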

The only difference in my settings: I deactivated the security
cookie code generation.
And ack. Remind me to copy the full disassembly next time. Forgot to,
and I don't have access to that computer for a while now.

Some code is missing but I don't think it's relevant.

E.g. for the code:

#include <stdio.h>

void throwit() { throw 1; }

struct v
{
    // Throws from the constructor once the argument exceeds 3.
    v(int argc) { if (argc > 3) throwit(); }
    // Runs only for objects whose construction completed.
    ~v() { printf("destroyed\r\n"); }
};

int main(int argc, char* argv[])
{
    try
    {
        v v1(argc+1);
        v v2(argc+2);
        v v3(argc+3);
        v v4(argc+4);
        v v5(argc+5);
    }
    catch(...)
    {
        printf("Exception");
    }
    return 0;
}


The relevant x86 assembly code would be:

Initialization of the exception stack: [fs] cpu segment register

mov eax,dword ptr fs:[00000000h]
push eax
mov dword ptr fs:[0],esp

And to keep track of created objects

mov byte ptr [ebp-4], x [x=0..]

(I omit the code for deinitialization - restoration of exception stack
at the end of the function)


This surely is the old "use the exception stack to keep track of created
objects" method. In x64 code this tracking is not used, therefore there
should be no (significant) overhead between code using exceptions and code
which doesn't use exceptions.

And this should be the same (comparable) implementation as for GCC under
Linux. At least I don't know any better (faster) code, than code which
isn't executed ;-)
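For anyone who wants to check the "no overhead when nothing is thrown" claim on their own compiler, here is a rough sketch. It is not the test program discussed in this thread: the interface and function names are only loosely modelled on the test names that appear in the posted output, and the TestImpl class, g_failures counter, and timing functions are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>

// Hypothetical interface loosely modelled on the tests named in this thread.
enum ReturnCode { Success, Failure };

struct TestInterface {
    virtual ~TestInterface() {}
    virtual ReturnCode virtualReturnCode(int a, int b, int failIfEquals) = 0;
    virtual void virtualException(int a, int b, int failIfEquals) = 0;
};

struct TestImpl : TestInterface {
    ReturnCode virtualReturnCode(int a, int b, int failIfEquals) override {
        return (a + b == failIfEquals) ? Failure : Success;
    }
    void virtualException(int a, int b, int failIfEquals) override {
        if (a + b == failIfEquals) throw std::runtime_error("failure");
    }
};

volatile int g_failures = 0;  // keeps the loops observable to the optimizer

// Returns elapsed seconds for n*n calls that report failure by return code.
double timeReturnCode(TestInterface& x, int n, int failIfEquals) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            if (x.virtualReturnCode(i, j, failIfEquals) == Failure)
                g_failures = g_failures + 1;
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Returns elapsed seconds for n*n calls that report failure by exception.
double timeException(TestInterface& x, int n, int failIfEquals) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            try {
                x.virtualException(i, j, failIfEquals);
            } catch (const std::runtime_error&) {
                g_failures = g_failures + 1;
            }
        }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Passing a negative failIfEquals (such as the -10 used on the command lines above) means no call ever fails, so both functions measure only the non-throwing path. With a table-based exception implementation the two timings would be expected to come out close; any single run is noisy, so repeat the measurement several times before drawing conclusions.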

Andre
 
