Compiler bug in lcc-win32

tm

Hello,

I discovered a compiler bug in lcc-win32 under Windows XP.
The || and && operators give wrong results when -O is used.
The following program shows the bug:

-------------------- Begin file testlcc.c --------------------
/* Compile with:
* lc -O -g5 -A testlcc.c -o testlcc.exe
*/

#include "stdio.h"

typedef enum {ILLEGALCHAR, EOFCHAR, LETTERCHAR, DIGITCHAR,
UNDERLINECHAR, SHARPCHAR, QUOTATIONCHAR, APOSTROPHECHAR,
LEFTPARENCHAR, PARENCHAR, SPECIALCHAR, SPACECHAR,
NEWLINECHAR} charclass;

typedef int booltype;

charclass ch_class[256 - EOF];
booltype ch_op[256 - EOF];

#define char_class(CHARACTER) ch_class [((int)(CHARACTER)) - EOF]
#define op_char(CHARACTER) ch_op [((int)(CHARACTER)) - EOF]

void get_ident (unsigned char *name, unsigned int length)
{
    printf("length:\n%d # should be: 1\n", length);
    printf("name[0]:\n'%c' (%d) # should be: '$' (36)\n",
           name[0], name[0]);
    printf("op_char(name[0]):\n%d # should be: 1\n",
           op_char(name[0]));
    printf("char_class(name[0]):\n%2d # should be: 10\n",
           char_class(name[0]));
    printf("char_class(name[0])==LEFTPARENCHAR:\n");
    printf("%d # should be: 0\n",
           char_class(name[0])==LEFTPARENCHAR);
    printf("op_char(name[0]) || "
           "char_class(name[0])==LEFTPARENCHAR:\n");
    printf("%d # should be: 1\n",
           op_char(name[0]) ||
           char_class(name[0])==LEFTPARENCHAR);
    printf("char_class(name[0])==PARENCHAR:\n");
    printf("%d # should be: 0\n",
           char_class(name[0])==PARENCHAR);
    printf("op_char(name[0]) || "
           "char_class(name[0])==LEFTPARENCHAR || "
           "char_class(name[0])==PARENCHAR:\n");
    printf("%d # should be 1\n",
           op_char(name[0]) ||
           char_class(name[0])==LEFTPARENCHAR ||
           char_class(name[0])==PARENCHAR);
    printf("length==1 && (op_char(name[0]) || "
           "char_class(name[0])==LEFTPARENCHAR || "
           "char_class(name[0])==PARENCHAR):\n");
    printf("%d # should be 1\n",
           length==1 && (op_char(name[0]) ||
           char_class(name[0])==LEFTPARENCHAR ||
           char_class(name[0])==PARENCHAR));
}

int main (int argc, char *argv[])
{
    char_class('$') = SPECIALCHAR;
    op_char('$') = 1;
    get_ident("$", 1);
}
-------------------- End file testlcc.c --------------------

When I compile with

lc -O -g5 -A testlcc.c -o testlcc.exe

and start testlcc.exe, I get the following output:

length:
1 # should be: 1
name[0]:
'$' (36) # should be: '$' (36)
op_char(name[0]):
1 # should be: 1
char_class(name[0]):
10 # should be: 10
char_class(name[0])==LEFTPARENCHAR:
0 # should be: 0
op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR:
0 # should be: 1
char_class(name[0])==PARENCHAR:
0 # should be: 0
op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR ||
char_class(name[0])==PARENCHAR:
0 # should be 1
length==1 && (op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR
|| char_class(name[0])==PARENCHAR):
0 # should be 1

--------------------

The correct result can be obtained with gcc:

gcc testlcc.c -o testlcc

length:
1 # should be: 1
name[0]:
'$' (36) # should be: '$' (36)
op_char(name[0]):
1 # should be: 1
char_class(name[0]):
10 # should be: 10
char_class(name[0])==LEFTPARENCHAR:
0 # should be: 0
op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR:
1 # should be: 1
char_class(name[0])==PARENCHAR:
0 # should be: 0
op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR ||
char_class(name[0])==PARENCHAR:
1 # should be 1
length==1 && (op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR
|| char_class(name[0])==PARENCHAR):
1 # should be 1

--------------------

I hope this helps to find and correct the compiler bug.


Greetings Thomas Mertes

--
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
 
James Kuyper

Hello,

I discovered a compiler bug in lcc-win32 under Windows XP.

Why are you posting this here, instead of contacting the vendor
directly? It gives the (possibly correct) impression that your main
purpose is to embarrass the vendor, rather than simply getting the bug
fixed (assuming that it actually is a bug - I didn't bother to check).
Even if you were justified in making a big public issue about this,
comp.compilers.lcc would have been the more appropriate forum.

While I did not examine your code closely, I did notice one minor issue:
....
#include "stdio.h"

That should be <stdio.h>; it can make a difference which one you use.

....
lc -O -g5 -A testlcc.c -o testlcc.exe ....
gcc testlcc.c -o testlcc

Keep in mind that, like most compilers, neither of the compilers that
you're comparing fully conforms to any version of the C standard in
their default mode. You haven't chosen the correct options to put gcc
into a fully conforming mode - you need to use at least -ansi -pedantic.
I'm not sure which options you should use with lcc, but whichever ones
they are, you should use them before comparing the results with gcc;
otherwise the comparison is meaningless.
 
tm

Why are you posting this here, instead of contacting the vendor
directly?

Because jacob navia often takes part in the discussions in this group.

[snip accusation]

It is not my intention to embarrass anybody.
A compiler is not a religion. Please calm down.

While I did not examine your code closely, I did notice one minor issue:
...


That should be <stdio.h>; it can make a difference which one you use.

For the purpose of showing this bug, this does not make
any difference.

[snip]
You haven't chosen the correct options to put gcc
into a fully conforming mode - you need to use at least -ansi -pedantic.

No. The expression

op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR

where name[0] is '$' and op_char('$') is 1, should
always return 1, independent of K&R, ANSI C89, ANSI C99
and probably also of any future version. The expression

1 || anything

should always evaluate to 1, independent of C version
or compiler. Otherwise it is not C.

I'm not sure which options you should use with lcc, but whichever ones
they are, you should use them before comparing the results with gcc;
otherwise the comparison is meaningless.

I do not agree. See above.
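
A minimal sketch of the guarantee argued above: when the left operand of || is nonzero, the result is 1 and the right operand is not evaluated. The names rhs, rhs_calls, logical_or and check_logical_or are hypothetical and not part of testlcc.c.

```c
#include <assert.h>

/* Counts how often the right-hand operand is evaluated. */
static int rhs_calls = 0;

/* Hypothetical stand-in for "anything": records the call, returns 0. */
static int rhs(void)
{
    rhs_calls++;
    return 0;
}

/* Returns lhs || rhs(); the result of || is always 0 or 1. */
static int logical_or(int lhs)
{
    return lhs || rhs();
}

/* Aborts via assert if either property of || does not hold. */
void check_logical_or(void)
{
    rhs_calls = 0;
    assert(logical_or(1) == 1);  /* nonzero left operand: result is 1 */
    assert(rhs_calls == 0);      /* and rhs() was never evaluated */
    assert(logical_or(0) == 0);  /* zero left operand: rhs() decides */
    assert(rhs_calls == 1);      /* this time rhs() ran exactly once */
}
```

Calling check_logical_or() from main() should pass silently on a compiler that translates || correctly. Whether this reduced form triggers the lcc-win32 problem is not known.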


Greetings Thomas Mertes

 
Eric Sosman

[...]The expression

1 || anything

should always evaluate to 1, independent of C version
or compiler. Otherwise it is not C.

No matter: Neither is your program. Once you invoke undefined
behavior, the implementation is not obliged to live up to anything
the C Standard promises.

I grant that the particular instances of undefined behavior (two)
and of implementation-defined behavior (one) are extremely unlikely
to be responsible for the results you report. Nonetheless, the claim
"Frobozz Magic C doesn't treat my buggy program the way I want, ergo
Frobozz Magic C is at fault" is a bit of a stretch.

I do not agree. See above.

Any implementation conforming to C90 or to C99 is required to
issue diagnostics for your code. Have you complained to the gcc
folks about their compiler's failure to do so?
 
tm

[...]The expression
1 || anything
should always evaluate to 1, independent of C version
or compiler. Otherwise it is not C.

No matter: Neither is your program.

Would you please be so kind as to explain where I
invoked undefined behavior?

AFAIK, the if-statement

if (1 || sub_expression)
print("TRUE\n");

should print TRUE. Why should this be invoke undefined
behavior?

To state it clearly:

**** LCC FAILS TO WRITE TRUE WITH THIS IF STATEMENT ****

[snip]

I think the next time I will not spend my time to
examine a compiler bug, write a test program and
tell about it. I will just refuse to use that compiler.


Greetings Thomas Mertes

 
tm

On 7/17/2011 12:46 PM, tm wrote:
[...]The expression
1 || anything
should always evaluate to 1, independent of C version
or compiler. Otherwise it is not C.
No matter: Neither is your program.

Would you please be so kind as to explain where I
invoked undefined behavior?

AFAIK, the if-statement

if (1 || sub_expression)
  print("TRUE\n");

Before some nitpicker shows up. This should be:

if (1 || sub_expression)
  printf("TRUE\n");

should print TRUE. Why should this be invoke undefined
behavior?

And this should be:

Why should this invoke undefined behavior?

To state it clearly:

**** LCC FAILS TO WRITE TRUE WITH THIS IF STATEMENT ****

[snip]

I think the next time I will not spend my time to
examine a compiler bug, write a test program and
tell about it. I will just refuse to use that compiler.

Greetings Thomas Mertes

 
jacob navia

Hi Thomas

Thank you for your bug report.

I am unable to fix it right now because I am on holiday at the beach in
Grière plage, Vendée. (Atlantic coast of France)

It is nice here, wind, sea, sun, kids, ice-cream, swimming pool, no
internet connection, I have to go to some cyber cafe to get one...

Will fix it later.

jacob
 
tm

Hi Thomas

Thank you for your bug report.

I am unable to fix it right now because I am on holiday at the beach in
Grière plage, Vendée. (Atlantic coast of France)

Enjoy your holidays.

It is nice here, wind, sea, sun, kids, ice-cream, swimming pool, no
internet connection, I have to go to some cyber cafe to get one...

Will fix it later.

No problem. I hope that my test program is helpful.
I examined the problem for two days and almost gave up...


Greetings Thomas Mertes

 
Eric Sosman

[...]The expression
1 || anything
should always evaluate to 1, independent of C version
or compiler. Otherwise it is not C.

No matter: Neither is your program.

Would you please be so kind as to explain where I
invoked undefined behavior?

Run the compiler(s) in a conforming mode, and they'll tell you
about one instance of U.B. (the diagnostic is required). I don't
know about lcc, but gcc will tell you about the other instance, too,
if you ask it politely. The instance of implementation-defined
behavior has already been brought to your attention. (It occurs
to me that there are other instances of I-D behavior in your code,
but they're of the extremely far-fetched and nit-picky variety.)
 
Keith Thompson

tm said:
To state it clearly:

**** LCC FAILS TO WRITE TRUE WITH THIS IF STATEMENT ****

[snip]

I think the next time I will not spend my time to
examine a compiler bug, write a test program and
tell about it. I will just refuse to use that compiler.

Why on Earth would you say that?

Nobody is criticizing you for reporting a bug. Reporting bugs is A
Good Thing. We're just suggesting that a newsgroup (comp.lang.c)
that discusses a programming language is not the best place to do
it, since you're not really talking about the language. Especially
since there's a newsgroup, comp.compilers.lcc, that discusses that
particular compiler (both lcc-win32 and the original lcc), and
because it would make more sense to contact the author directly.
Yes, jacob regularly posts here; I'm sure he regularly reads his
e-mail.

It's not even that big a deal. We get a lot of off-topic posts in
clc; yours is far from the worst. Why are you overreacting like
this to the simple suggestion that there are better places for
bug reports?

(Yes, this is cross-posted to a more appropriate newsgroup, but the
original post, and most of the discussion, were just in clc.)
 
James Kuyper

Because jacob navia often takes part in the discussions in this group.

Yes, and so do a few people who regularly post bug reports against lcc,
not too dissimilar from yours, for the sole apparent purpose of baiting
jacob into an intemperate response. A hallmark of those reports is that
the "bug" usually depends in some subtle fashion upon some
well-documented extension to C supported by lcc-win32. I don't know
lcc-win32's extensions well-enough to be sure that your code has not run
afoul of one of them.

For the purpose of showing this bug, this does not make
any difference.

Until you know what caused the bug, you can't be sure. An incompatible
file named stdio.h in a location that is searched when you use "", but
is not searched when you use <>, could in principle be the reason why
printf() doesn't print the value you expected it to print. I consider
this extremely unlikely, which is why I labeled it a "minor issue", but
I still recommend correcting it.

You haven't chosen the correct options to put gcc
into a fully conforming mode - you need to use at least -ansi -pedantic.

No. The expression

op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR

where name[0] is '$' and op_char('$') is 1, should
always return 1, independent of K&R, ANSI C89, ANSI C99
and probably also of any future version. The expression

But in their default modes, like most other compilers, neither of those
compilers implements any of the languages you listed, so even if you're
right about those four languages, that doesn't guarantee that you're
right about the default lcc-win32 or gnu versions of C. Keep in mind,
also, that the problem could be in some entirely different part of the
code. It might not be the implementation of || that is producing
unexpected results, but a problem elsewhere in your code that manifests
itself in this location.

1 || anything

should always evaluate to 1, independent of C version
or compiler. Otherwise it is not C.

That depends upon how you define "C"; jacob's definition is more
flexible than mine, but I doubt that lcc-win32 deliberately implements
|| in a way significantly incompatible with the requirements of the C
standard. However, it's quite possible that some other aspect of your
program is causing these symptoms.
 
glen herrmannsfeldt

(snip)
Why on Earth would you say that?
Nobody is criticizing you for reporting a bug. Reporting bugs is A
Good Thing. We're just suggesting that a newsgroup (comp.lang.c)
that discusses a programming language is not the best place to do
it, since you're not really talking about the language.

I missed the beginning, but sometimes it is hard to know what
is a compiler bug and what is a language (mis)feature. That comes
up pretty often in Fortran as new features are added and it takes
some time for people to understand how to use them.

Especially
since there's a newsgroup, comp.compilers.lcc, that discusses that
particular compiler (both lcc-win32 and the original lcc), and
because it would make more sense to contact the author directly.
Yes, jacob regularly posts here; I'm sure he regularly reads his
e-mail.

If you are sure that it is a compiler bug, then that does
seem a better way.

-- glen
 
tm

[...]
To state it clearly:
**** LCC FAILS TO WRITE TRUE WITH THIS IF STATEMENT ****

I think the next time I will not spend my time to
examine a compiler bug, write a test program and
tell about it. I will just refuse to use that compiler.

Why on Earth would you say that?

Nobody is criticizing you for reporting a bug.  Reporting bugs is A
Good Thing.  We're just suggesting that a newsgroup (comp.lang.c)
that discusses a programming language is not the best place to do
it, since you're not really talking about the language.  Especially
since there's a newsgroup, comp.compilers.lcc, that discusses that
particular compiler (both lcc-win32 and the original lcc), and
because it would make more sense to contact the author directly.
Yes, jacob regularly posts here; I'm sure he regularly reads his
e-mail.

Maybe you saw it. I posted to comp.compilers.lcc shortly
after I heard of it.

It's not even that big a deal.  We get a lot of off-topic posts in
clc; yours is far from the worst.  Why are you overreacting like
this to the simple suggestion that there are better places for
bug reports?

Using the word "overreacting" is a little bit overreacting. :)

Eric said:
No matter: Neither is your program. Once you invoke undefined
behavior, the implementation is not obliged to live up to anything
the C Standard promises.

With -Wall, gcc complains a little:
control reaches the end of the non-void function 'main',
and the function 'get_ident' expects ‘unsigned char *’
but the argument is of type ‘char *’.

These two things have no influence on the compiler bug.
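
For reference, a minimal sketch of one way the signedness complaint could be silenced: take a const unsigned char * parameter and cast the string literal at the call site. The functions first_byte, first_byte_of_dollar and check_first_byte are hypothetical stand-ins and are not the get_ident from testlcc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for get_ident(): returns the first byte of
   name as an int, or -1 when length is 0. */
int first_byte(const unsigned char *name, size_t length)
{
    return length > 0 ? (int)name[0] : -1;
}

/* A string literal is an array of char, so it decays to char *.
   The explicit cast makes the conversion to unsigned char * visible,
   which avoids the "differ in signedness" diagnostic. */
int first_byte_of_dollar(void)
{
    return first_byte((const unsigned char *)"$", 1);
}

/* Aborts if the byte read back is not the '$' that was passed in. */
void check_first_byte(void)
{
    assert(first_byte_of_dollar() == '$');
}
```

The other complaint, about reaching the end of main, goes away with an explicit return 0; at the end of main().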

Instead of telling me these two things straightforwardly, he
started with a quiz: Your program has 'undefined behavior',
find it.

After searching for a compiler bug for two days I was not
interested in a discussion at this level, especially because
I was busy preparing my new Seed7 release. Sorry for my
"overreaction".

I like to hear about errors and undefined behavior in my
programs. If you are interested, you can download Seed7
and check it.

This would be a real challenge. :)


Greetings Thomas Mertes

 
Keith Thompson

glen herrmannsfeldt said:
I missed the beginning, but sometimes it is hard to know what
is a compiler bug and what is a language (mis)feature. That comes
up pretty often in Fortran as new features are added and it takes
some time for people to understand how to use them.

Did you miss the subject header, where it says "Compiler bug in
lcc-win32"?

[...]
 
tm

Yes, and so do a few people who regularly post bug reports against lcc,
not too dissimilar from yours, for the sole apparent purpose of baiting
jacob into an intemperate response. A hallmark of those reports is that
the "bug" usually depends in some subtle fashion upon some
well-documented extension to C supported by lcc-win32. I don't know
lcc-win32's extensions well-enough to be sure that your code has not run
afoul of one of them.

In this case it would be an "extension" of the || operator,
an "extension" that only takes effect when the
program is compiled with optimization (option -O).

Until you know what caused the bug, you can't be sure. An incompatible
file named stdio.h in a location that is searched when you use "", but
is not searched when you use <>, could in principle be the reason why
printf() doesn't print the value you expected it to print. I consider
this extremely unlikely, which is why I labeled it a "minor issue", but
I still recommend correcting it.

There is NO file stdio.h at the place searched with "".

No. The expression
op_char(name[0]) || char_class(name[0])==LEFTPARENCHAR
where name[0] is '$' and op_char('$') is 1, should
always return 1, independent of K&R, ANSI C89, ANSI C99
and probably also of any future version. The expression

But in their default modes, like most other compilers, neither of those
compilers implements any of the languages you listed, so even if you're
right about those four languages, that doesn't guarantee that you're
right about the default lcc-win32 or gnu versions of C. Keep in mind,
also, that the problem could be in some entirely different part of the
code. It might not be the implementation of || that is producing
unexpected results, but a problem elsewhere in your code that manifests
itself in this location.

I am aware of this possibility.
To make it clear: The error surfaced in a program of
approximately 100000 lines. The program posted here was created as a
small demo program that still shows the bug.

That depends upon how you define "C"; jacob's definition is more
flexible than mine, but I doubt that lcc-win32 deliberately implements
|| in a way significantly incompatible with the requirements of the C
standard. However, it's quite possible that some other aspect of your
program is causing these symptoms.

This is possible but improbable, since this is not the
first program I wrote.


Greetings Thomas Mertes

 
Joachim Schmitz

Eric said:
On 7/17/2011 12:46 PM, tm wrote:

[...]The expression

1 || anything

should always evaluate to 1, independent of C version
or compiler. Otherwise it is not C.

No matter: Neither is your program.

Would you please be so kind as to explain where I
invoked undefined behavior?

Run the compiler(s) in a conforming mode, and they'll tell you
about one instance of U.B. (the diagnostic is required). I don't
know about lcc, but gcc will tell you about the other instance, too,
if you ask it politely. The instance of implementation-defined
behavior has already been brought to your attention. (It occurs
to me that there are other instances of I-D behavior in your code,
but they're of the extremely far-fetched and nit-picky variety.)

Please tell us more. I compiled it with two usually very strict
compilers, for C89 and C99 (neither of them gcc, though), and neither
complained nor showed the problem the OP described.

Bye, Jojo
 
gwowen

This is possible but improbable, since this is not the
first program I wrote.

What happens in LCC if you change the first argument to get_ident from
"unsigned char*" (which is not the correct type of pointer to which
"$" decays) to "const char *" (which is)?
 
BartC

gwowen said:
What happens in LCC if you change the first argument to get_ident from
"unsigned char*" (which is not the correct type of pointer to which
"$" decays) to "const char *" (which is)?

Same result.

Tricky bug which seems to depend on the exact nature of the expression x
(and possibly of y and z) in:

x || y || z
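
One way to probe a pattern like this is to compare the operator form against a reference spelled out with if statements, which does not rely on how the compiler translates ||. The following is only a sketch with hypothetical names (or3_reference, or3_operator, count_or3_mismatches, check_or3). Since the bug seems to depend on the exact form of the operands, plain int arguments like these may well not reproduce it.

```c
#include <assert.h>

/* Reference result for x || y || z, written with if statements. */
static int or3_reference(int x, int y, int z)
{
    if (x != 0) return 1;
    if (y != 0) return 1;
    if (z != 0) return 1;
    return 0;
}

/* The expression under test, in the same shape as in testlcc.c. */
static int or3_operator(int x, int y, int z)
{
    return x || y || z;
}

/* Returns the number of 0/1 input combinations (out of 8) for which
   the operator form and the reference disagree. */
int count_or3_mismatches(void)
{
    int mismatches = 0;
    int x, y, z;

    for (x = 0; x <= 1; x++)
        for (y = 0; y <= 1; y++)
            for (z = 0; z <= 1; z++)
                if (or3_operator(x, y, z) != or3_reference(x, y, z))
                    mismatches++;
    return mismatches;
}

/* Aborts if any combination gives a different result. */
void check_or3(void)
{
    assert(count_or3_mismatches() == 0);
}
```

To get closer to the reported case, the int arguments could be replaced by the array lookups and enum comparison from the original program, compiled once with and once without -O.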
 
Eric Sosman

Eric said:
[...]
Run the compiler(s) in a conforming mode, and they'll tell you
about one instance of U.B. (the diagnostic is required). I don't
know about lcc, but gcc will tell you about the other instance, too,
if you ask it politely. The instance of implementation-defined
behavior has already been brought to your attention. (It occurs
to me that there are other instances of I-D behavior in your code,
but they're of the extremely far-fetched and nit-picky variety.)

Please tell us more. I compiled it with two usually very strict
compilers, for C89 and C99 (neither of them gcc, though), and neither
complained nor showed the problem the OP described.

gcc -Wall -W -std=c99 testlcc.c
testlcc.c: In function 'main':
testlcc.c:56: warning: pointer targets in passing argument 1 of
'get_ident' differ in signedness
testlcc.c:16: note: expected 'unsigned char *' but argument is of type
'char *'
testlcc.c:52: warning: unused parameter 'argc'
testlcc.c:52: warning: unused parameter 'argv'

To my surprise, gcc did *not* complain about the other instance
of undefined behavior. I'm surprised; usually it catches that sort
of thing.
 
gwowen

     To my surprise, gcc did *not* complain about the other instance
of undefined behavior.  I'm surprised; usually it catches that sort
of thing.

Using -std=c90 warns about falling off the bottom of main(), -std=c99
doesn't (gcc 4.5.2).
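
The difference comes from the language rules: since C99, reaching the closing brace of main() is equivalent to return 0, while in C90 the termination status is undefined in that case. A small sketch of how code can tell which rule applies, and of the portable alternative; end_of_main_returns_zero, run_test_program and check_exit_status are hypothetical names.

```c
#include <assert.h>

/* Returns 1 when compiled as C99 or later, where reaching the closing
   brace of main() is equivalent to return 0. Returns 0 otherwise, for
   example with gcc -std=c90, where the exit status would be undefined. */
int end_of_main_returns_zero(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical stand-in for the body of main(): an explicit status is
   well defined in every version of the language. */
int run_test_program(void)
{
    return 0;
}

/* Aborts if the explicit status is not the expected 0. */
void check_exit_status(void)
{
    assert(run_test_program() == 0);
}
```

Ending main() with return run_test_program(); (or simply return 0;) keeps the program independent of the -std setting.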
 
