GCC 4.4.0 produces crashing code in release mode

Qi

The code compiled by VC 2008 Express and GCC 4.5.2 works fine
in both debug and release mode.
It also works fine in debug mode with GCC 4.4.0 (the MinGW GCC
in the Qt package).
But the code always crashes (segmentation fault, reading address 0x00000004)
in release mode with GCC 4.4.0.

Can it be a bug in GCC 4.4.0?
I'm quite worried that it may be a bug in my code.

However, I can't post a piece of the code here because the code
is a bit complicated and I don't know what causes
the crash.

I spent two hours trying to see what's wrong in my code
but got no result.

Is there any well-known bug in GCC 4.4.0 that produces wrong
code?
 
Qi

Forgot to say, the crash only happens if optimization is enabled
(-O2 or -O3). Without optimization, it doesn't crash.
 
werasm

> Can it be a bug in GCC 4.4.0?
> I'm quite worried that it may be a bug in my code.

More than likely a bug in your code...

> However, I can't post a piece of the code here because the code
> is a bit complicated and I don't know what causes
> the crash.

How many lines? If you can't post it, we can't help you.

> I spent two hours trying to see what's wrong in my code
> but got no result.

Only two hours? ;-)

> Is there any well-known bug in GCC 4.4.0 that produces wrong
> code?

I'd rather trust GCC 4.4.0 than your code, no offence meant.

Regards,

Werner
 
Kai-Uwe Bux

Qi said:
> Guess so too...
>
> It's not just one piece of code.
> It's spread across several templates.
>
> I can write a lot of new code in two hours...
>
> Understood.
> But VC 9 and GCC 4.5.2 work quite fine, which is very weird.

Differences in program behavior under various compilers or optimization
settings are, in the cases I have experienced, most often caused by undefined
behavior within the code.

Hunting for undefined behavior is difficult. You probably had a look at the
crash in a debugger and worked your way up the call stack leading to the
crash. Another thing I would try is to search for errors using tools such
as valgrind.


Best,

Kai-Uwe Bux
 
Dombo

On 29-Mar-11 16:19, Qi wrote:
> The code compiled by VC 2008 Express and GCC 4.5.2 works fine
> in both debug and release mode.
> It also works fine in debug mode with GCC 4.4.0 (the MinGW GCC
> in the Qt package).
> But the code always crashes (segmentation fault, reading address 0x00000004)
> in release mode with GCC 4.4.0.
>
> Can it be a bug in GCC 4.4.0?

It can be a bug in GCC, but chances are the bug is in your code. The
fact that your code works fine with VC 2008 Express and GCC 4.5.2
doesn't mean it is correct; sometimes incorrect code appears to run just
fine. Those are among the nastiest bugs.

My experience is that when release (optimized) builds fail while debug
builds run just fine, it is often caused by the use of uninitialized
variables. Sometimes increasing the warning level of the compiler helps
to locate those bugs. In fact, some compilers find more suspect code when
optimization is enabled.
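
To make the uninitialized-variable case concrete, here is a minimal sketch. The `Counter` class and its members are made up for illustration and are not taken from Qi's code. If the constructor left `count_` unset, a debug build might happen to find zero in that memory while an optimized build finds whatever was left in a register or stack slot. With GCC, building with `-Wall -Wextra` and optimization turned on is a common way to get more of these reported, though it will not catch every case.

```cpp
#include <cassert>

// Hypothetical example type; not from the original poster's code.
class Counter {
public:
    // Initializing every member in the constructor removes the
    // debug/release difference. Leaving out "count_(0)" would make the
    // first call to next() read an indeterminate value.
    Counter() : count_(0) {}

    // Returns 1 on the first call, 2 on the second, and so on.
    int next() { return ++count_; }

private:
    int count_;
};
```

With the initializer in place, the first two calls to `next()` return 1 and 2 regardless of optimization level.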

Just like any other piece of software, a compiler will have bugs. Some of
these bugs may only reveal themselves when certain code is compiled with
certain compiler settings. I have used compilers in the past which, when
optimizing for speed was enabled, did produce incorrect code for certain
code fragments.

> I'm quite worried that it may be a bug in my code.
>
> However, I can't post a piece of the code here because the code
> is a bit complicated and I don't know what causes
> the crash.

Run it in the debugger and see where it crashes. If necessary, look at
the assembly code to figure out what is going on (optimized code tends
to confuse the debugger at times).

> I spent two hours trying to see what's wrong in my code
> but got no result.
>
> Is there any well-known bug in GCC 4.4.0 that produces wrong
> code?

If you want any help you have got to be a bit more specific. No doubt
there are hundreds, if not thousands, of bugs in GCC 4.4.0. Without code it
is impossible to tell whether the compiler or your code is to blame.
Post the smallest amount of code that reproduces the problem.
 
Kevin P. Fleming

> If you want any help you have got to be a bit more specific. No doubt
> there are hundreds, if not thousands, of bugs in GCC 4.4.0. Without code it
> is impossible to tell whether the compiler or your code is to blame.
> Post the smallest amount of code that reproduces the problem.

There is also not much point in trying to determine whether this is a
bug in GCC 4.4.0 or not, as the GCC team won't fix bugs in that
version... it's not even the most recent release in the 4.4.x series.

First, the OP should try using the most recent 4.4.x release before
spending any more 'two hour' blocks of time. Only then is it worth
investigating whether it is a compiler bug or not... but if it is, and
it's fixed in 4.5.x, there's not a lot of value in knowing that.
 
Qi

> Hunting for undefined behavior is difficult. You probably had a look at the
> crash in a debugger and worked your way up the call stack leading to the
> crash. Another thing I would try is to search for errors using tools such
> as valgrind.

valgrind looks good. But it has no Windows version (why?).
It seems it would cost more time to set up valgrind+wine
than to fix my original problem... :(
 
Qi

> First, the OP should try using the most recent 4.4.x release before
> spending any more 'two hour' blocks of time. Only then is it worth
> investigating whether it is a compiler bug or not... but if it is, and
> it's fixed in 4.5.x, there's not a lot of value in knowing that.

I don't really care much whether it's a bug in gcc 4.4.0 (and 4.6.0 is out
already). What I really care about is whether it's a bug in my code.
If I can verify it's a bug in gcc, then it's probably not a bug in my code...

I would like to budget some more "two hours" for that, because this kind
of potential bug always scares me more than a 100% reproducible bug.

Maybe I also need to try a newer 4.4.x.
 
Qi

After another two hours of work, I'm now 99% sure it's a bug in gcc
4.4.0 when optimizing a function with the "stdcall" calling convention.

Below is a piece of disassembled code (dumped from OllyDbg; I don't
know how to debug efficiently with Code::Blocks).

004067EA > CC int3
004067EB . C74424 78 CF07>mov [dword ss:esp+78],7CF
004067F3 . 8B4C24 28 mov ecx,[dword ss:esp+28]
004067F7 . C641 08 00 mov [byte ds:ecx+8],0
004067FB . 8B39 mov edi,[dword ds:ecx]
004067FD . 897C24 2C mov [dword ss:esp+2C],edi
00406801 . 89F8 mov eax,edi
00406803 . 39F9 cmp ecx,edi
00406805 . 0F84 5B010000 je RunTestC.00406966
0040680B . 8D4424 48 lea eax,[dword ss:esp+48]
0040680F . 894424 1C mov [dword ss:esp+1C],eax
00406813 . 90 nop
00406814 > 8B5424 2C mov edx,[dword ss:esp+2C]
00406818 . 8B42 08 mov eax,[dword ds:edx+8]
0040681B . 85C0 test eax,eax
0040681D . 74 10 je short RunTestC.0040682F
0040681F . 8B10 mov edx,[dword ds:eax]
00406821 . 8D4C24 78 lea ecx,[dword ss:esp+78]
00406825 . 894C24 04 mov [dword ss:esp+4],ecx
00406829 . 890424 mov [dword ss:esp],eax
0040682C . FF52 10 call [dword ds:edx+10]
0040682F > 8B7C24 2C mov edi,[dword ss:esp+2C]
00406833 . 8B47 18 mov eax,[dword ds:edi+18]

Note the call at address 0040682C. It's a call to a member function:

void __stdcall callback1(int n) const {
    (void)n;
    TS_TRACE("callback1");
}

And note that callback1 ends with the return instruction "retn 8",
which is the normal return instruction for stdcall.
However, the problem is that in all of the disassembled code above
there is no ordinary "push" instruction to pass the arguments.
Instead, the compiler uses the existing stack frame to pass the
arguments; see address 004067EB, where 7CF is the argument 1999 I passed
to callback1.

Because the arguments are written into the existing frame, the surrounding
code needs ESP at address 0040682F to be the same as at address 00406829
for the stack to stay balanced. But due to the "retn 8" I mentioned above,
the stack gets unbalanced, and then at address 0040682F EDI gets a wrong value.
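
The situation described above can be sketched roughly as follows. This is not Qi's actual code: the `Callback` and `Recorder` types, the `invokeAll` function, and the `SKETCH_STDCALL` macro are hypothetical names, and the list-walking loop is only a guess based on the disassembly (a loop that loads an object pointer, skips it if null, and makes a virtual call through `[edx+10]`). It only shows the shape of code where the caller and an stdcall callee must agree on who removes the arguments from the stack.

```cpp
#include <cassert>
#include <list>

// __stdcall only means something on 32-bit x86 Windows targets;
// elsewhere the macro expands to nothing so the sketch still compiles.
#if defined(_WIN32) && (defined(__i386__) || defined(_M_IX86))
#  define SKETCH_STDCALL __stdcall
#else
#  define SKETCH_STDCALL
#endif

// Hypothetical interface, guessed from the virtual call in the dump.
struct Callback {
    virtual ~Callback() {}
    virtual void SKETCH_STDCALL invoke(int n) const = 0;
};

// Hypothetical callback that just remembers the last argument it saw.
struct Recorder : Callback {
    mutable int last;
    Recorder() : last(0) {}
    virtual void SKETCH_STDCALL invoke(int n) const { last = n; }
};

// Calls every non-null callback with n. Under stdcall the callee pops its
// own arguments on return, so the compiler has to account for that after
// each call when it reuses the same outgoing argument slots.
inline void invokeAll(const std::list<const Callback*>& callbacks, int n) {
    for (std::list<const Callback*>::const_iterator it = callbacks.begin();
         it != callbacks.end(); ++it) {
        if (*it) {
            (*it)->invoke(n);
        }
    }
}
```

Building something like this with `-O2` on the affected MinGW toolchain and comparing the caller's ESP handling with the callee's `retn 8` would be one way to check the analysis. Whether this particular sketch triggers the miscompilation is untested.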

I also examined GCC 4.5.2. In the code it generated for callback1, the last
return instruction is "retn", as if it were not stdcall, so there is
no problem (of course the other code is also different from 4.4.0).

Since this bug has been fixed in newer GCC versions, we can ignore it.
But I'm happy to have spent two "two hours" to learn and verify that:
1. My code is safe and doesn't have the potential nasty bugs I suspected.
2. I learned a little about how C++ compilers optimize code (though the
disassembled code is too hard to read).

I think I should read the release notes of past GCC versions when
I have another two hours. :)
 
Jorgen Grahn

> valgrind looks good. But it has no Windows version (why?).

Valgrind is delicate software. It seems to be a lot of work to port it
even from Linux to some other Unix, or from x86 to some other CPU
architecture.

I'm not sure what is available and inexpensive on Windows.

/Jorgen
 
jacob navia

On 29/03/11 16:32, werasm wrote:
> More than likely a bug in your code...
>
> How many lines? If you can't post it, we can't help you.
>
> Only two hours? ;-)
>
> I'd rather trust GCC 4.4.0 than your code, no offence meant.
>
> Regards,
>
> Werner

It was a bug in GCC.

I have been bitten by so many that your blind trust in that software
looks ridiculous. No offense meant.

:)
 
Virchanza

> Forgot to say, the crash only happens if optimization is enabled
> (-O2 or -O3). Without optimization, it doesn't crash.

I've experienced this before, i.e. my program worked fine until I used
-O3.

I'm willing to bet you have a sequence point violation somewhere in
your code.

My own problem was caused by the following function:

void StrToLower(char *p)
{
    while ( *p++ = tolower( (char unsigned)*p ) );
}
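
For comparison, a version of that function without the problem might look like the sketch below (`StrToLowerFixed` is a made-up name). The original expression both modifies `p` (through `p++`) and reads it (in `*p` on the right-hand side) with no sequence point in between, which was undefined behavior under the C++ rules of the time, so an optimizer was free to pick either order.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Reads *p, converts it, writes it back, and only then advances p, so no
// single expression both modifies p and reads it.
inline void StrToLowerFixed(char *p)
{
    for (; *p != '\0'; ++p) {
        // The cast to unsigned char keeps negative char values out of tolower.
        *p = static_cast<char>(std::tolower(static_cast<unsigned char>(*p)));
    }
}
```

Splitting the read, the write, and the increment into separate steps costs nothing at `-O2` or `-O3` and leaves the compiler no room to reorder them.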
 
Joshua Maurice

> It was a bug in GCC.
>
> I have been bitten by so many that your blind trust in that software
> looks ridiculous. No offense meant.

I'm curious. Can you link to the bug report, or give a short
description of it?
 
jacob navia

On 04/04/11 09:45, Joshua Maurice wrote:
> I'm curious. Can you link to the bug report, or give a short
> description of it?

I don't report errors to them anymore.
The last time I did (it was an error in gdb), I sent them a fix.

There was no answer, and the bug is still there, 2 years later.

Maybe they did not want to fix it, or they did not care. In any case
I just use some workaround or try to use another compiler that is
a better compiler, like Intel's.

If you want to know about gcc's bugs go to their site and look at
the "open bugs" page.

There, you will find those that were reported.
 
Miles Bader

jacob navia said:
> It was a bug in GCC.

Er, the OP (Qi) gave no example, and indeed, no information about what
his problem is. How on earth do you know "it was a bug in GCC"?!

GCC, like every compiler, has bugs, but it's a fairly solid compiler,
and it's far, far, more common for crashes to be caused by problems with
user code. So absent any actual information, it's much more likely that
werasm is correct than you are...

-miles
 
Jorgen Grahn

> Er, the OP (Qi) gave no example, and indeed, no information about what
> his problem is. How on earth do you know "it was a bug in GCC"?!

The OP wrote about it somewhere in the thread, and it sounded
trustworthy ... even though he didn't enable us to repeat his results.

> GCC, like every compiler, has bugs, but it's a fairly solid compiler,
> and it's far, far, more common for crashes to be caused by problems with
> user code. So absent any actual information, it's much more likely that
> werasm is correct than you are...

IIRC, it turned out the OP used MinGW or whatever that Windows port is
called, and some Windows-specific(?) extension called "stdcall".

A compiler bug suddenly sounds less unlikely!

/Jorgen
 
Qi

> The OP wrote about it somewhere in the thread, and it sounded
> trustworthy ... even though he didn't enable us to repeat his results.

Yes.
And indeed the source code is on my personal site as open source.
But it makes no sense for me to post the link here and have people
start *wasting* time on compiling it and trying to reproduce the bug.

It's pointless because I'm now convinced that my code is fine
and 99% sure it's a bug in gcc, so no more wasting time.

It's also pointless to verify that it's a bug in gcc 4.4.0. The bug
has at least been fixed in later gcc versions.

And thanks for your trust. :)

> IIRC, it turned out the OP used MinGW or whatever that Windows port is
> called, and some Windows-specific(?) extension called "stdcall".
>
> A compiler bug suddenly sounds less unlikely!

Yes again.
I disassembled the binary and found the problematic code.
That's why I can say I'm 99% sure.

An off-topic note: stdcall should mostly work fine in MinGW, because
so many applications are developed with MinGW/Qt. It seems the bug only occurs
under some complicated conditions.
 
