Strange - a simple assignment statement shows an error in VC++ but works in gcc!


Ike Naar

The -Wwrite-strings option can cause gcc to omit the required diagnostic for
the following:

const char (*p)[6] = &"hello";

Without -Wwrite-strings, this would go undiagnosed:

char (*q)[6] = &"hello";

Are there situations where using q would be safe,
while using p would be dangerous?
 

Keith Thompson

The -Wwrite-strings option can cause gcc to omit the required diagnostic for
the following:

const char (*p)[6] = &"hello";

Without -Wwrite-strings, this would go undiagnosed:

char (*q)[6] = &"hello";

Yes, the language doesn't require a diagnostic (though an
implementation is free to issue one anyway).
Are there situations where using q would be safe,
while using p would be dangerous?

Well, the declaration of p, if a compiler accepts it, invokes
undefined behavior because it violates a constraint. But that's
unlikely to be a problem in practice.

Attempting to modify the array created by a string literal invokes UB,
and it's perfectly reasonable for a compiler to warn about it when it
can. The problem with gcc's -Wwrite-strings option is that it does
this by changing the type of the array, in violation of the standard.
It's a quick and dirty way to enable the warnings, but as we've seen
it makes the compiler non-conforming. Presumably it works by setting
an attribute on the compiler's internal data structure that describes
the type, indicating that it's been declared as "const". A more
robust approach would be to set a different attribute on the type,
indicating that it's the type of a string literal. Compatibility
rules involving "const" would then be unaffected, but the compiler
could issue optional diagnostics for code that modifies an object of
such a type, or that may lead to such an object being modified.
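
For reference, here is a minimal translation unit that shows both effects
described above. The gcc invocations and the file name in the comment are
only illustrative; the point is which of the two initializations draws a
diagnostic.

/* Compile with and without -Wwrite-strings, e.g.
 *   gcc -std=c99 -pedantic demo.c                   (warns on p, not q)
 *   gcc -std=c99 -pedantic -Wwrite-strings demo.c   (warns on q, not p)
 */
int main(void)
{
    /* In standard C, "hello" has type char[6], so this initialization
       violates the assignment constraint and requires a diagnostic.
       With -Wwrite-strings the literal becomes const char[6] and the
       required diagnostic disappears. */
    const char (*p)[6] = &"hello";

    /* This is valid standard C (no diagnostic required), but with
       -Wwrite-strings it is the mismatching case and gets the warning. */
    char (*q)[6] = &"hello";

    (void)p;
    (void)q;
    return 0;
}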
 

Ike Naar

[email protected] (Ike Naar) said:
The -Wwrite-strings option can cause gcc to omit the required diagnostic for
the following:

const char (*p)[6] = &"hello";

Without -Wwrite-strings, this would go undiagnosed:

char (*q)[6] = &"hello";

Yes, the language doesn't require a diagnostic (though an
implementation is free to issue one anyway).

Gcc compiles it without any warnings.
Well, the declaration of p, if a compiler accepts it, invokes
undefined behavior because it violates a constraint. But that's
unlikely to be a problem in practice.

If the declaration of p violates a constraint, and the declaration
of q doesn't, one would expect that the language designers had a good
reason to make it that way. At first sight, it looks like q has more
opportunity for misuse than p has. So I am still curious to know about
situations where it's the other way around.
Attempting to modify the array created by a string literal invokes UB,
and it's perfectly reasonable for a compiler to warn about it when it
can. The problem with gcc's -Wwrite-strings option is that it does
this by changing the type of the array, in violation of the standard.
It's a quick and dirty way to enable the warnings, but as we've seen
it makes the compiler non-conforming. Presumably it works by setting
an attribute on the compiler's internal data structure that describes
the type, indicating that it's been declared as "const". A more
robust approach would be to set a different attribute on the type,
indicating that it's the type of a string literal. Compatibility
rules involving "const" would then be unaffected, but the compiler
could issue optional diagnostics for code that modifies an object of
such a type, or that may lead to such an object being modified.

I have been using -Wwrite-strings for a long time and so far have
found it advantageous. Now (to my surprise) it turns out that
it's not such a nice option after all, if it can silently produce
UB, e.g. by running a program that contains a p-like declaration.

So here's a dilemma: should one dump a useful option, just because it
can result in undefined behaviour for certain contrived (sorry) programs?

Perhaps a solution would be to build the code both with and without
the option enabled, and make sure it compiles cleanly in both situations.
 

Keith Thompson

[email protected] (Ike Naar) said:
The -Wwrite-strings option can cause gcc to omit the required diagnostic for
the following:

const char (*p)[6] = &"hello";

Without -Wwrite-strings, this would go undiagnosed:

char (*q)[6] = &"hello";

Yes, the language doesn't require a diagnostic (though an
implementation is free to issue one anyway).

Gcc compiles it without any warnings.

The declaration is potentially dangerous; the gcc developers
apparently decided that the way to avoid the danger is to use
-Wwrite-strings.

Note that the example uses a pointer to a fixed-size array, something
that's rather rare outside the context of multidimensional arrays.
Strings and arrays are much more commonly manipulated using a pointer
to the element type, with some other method (a separate count or a
terminator) defining the length.
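
A quick sketch of those more common idioms (the names are made up for
illustration):

#include <stddef.h>

void example(void)
{
    /* Pointer to the element type, with the end marked by a terminator: */
    const char *msg = "hello";

    /* Pointer to the element type plus a separate count: */
    char buf[6];
    size_t len = sizeof buf;

    /* Pointer-to-array types mostly turn up with 2-D arrays: */
    int grid[4][6];
    int (*row)[6] = grid;    /* row points at an array of 6 int */

    (void)msg; (void)buf; (void)len; (void)row;
}
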
If the declaration of p violates a constraint, and the declaration
of q doesn't, one would expect that the language designers had a good
reason to make it that way. At first sight, it looks like q has more
opportunity for misuse than p has. So I am still curious to know about
situations where it's the other way around.

The *only* reason for making string literals non-const is backward
compatibility. K&R C (pre-ANSI) didn't have the "const" keyword, so
if you had a function like this:

func(s)
char *s;
{
    /* ... */
}

there was no way to specify that the function wouldn't modify the data
pointed to by s. If ANSI had made string literals const, then given the
above definition, this call:

func("hello");

would have become a constraint violation, breaking tons of existing
code. ANSI also added "const", so you could change the function
definition to:

void func(const char *s)
{
    /* ... */
}

but forcing so much code to be changed would likely have greatly
reduced acceptance of the standard.

Few people would disagree that making string literals const would have
been better if the language were being designed from scratch today,
but that's no longer possible. (<OT>C++ did make string literals
const; there wasn't as much concern for maintaining backward
compatibility with C.</OT>)

[...]
I have been using -Wwrite-strings for a long time and so far have
found it advantageous. Now (to my surprise) it turns out that
it's not such a nice option after all, if it can silently produce
UB, e.g. by running a program that contains a p-like declaration.

I wouldn't worry about it. In a sense, -Wwrite-strings causes gcc to
compile a language that's subtly different from C -- and one could
argue that it's a *better* language. One drawback, as with any
language extension, conforming or not, is that it can make it more
difficult to be sure that your code conforms to the standard and will
be portable to other compilers.
So here's a dilemma: should one dump a useful option, just because it
can result in undefined behaviour for certain contrived (sorry) programs?

Perhaps a solution would be to build the code both with and without
the option enabled, and make sure it compiles cleanly in both situations.

That's not a bad idea.
 

Ike Naar

[about const char (*p)[6] = &"hello"; ]
[snip excellent explanation why string literals are non-const]

Thank you for the explanation.
Now taking for granted that string literals are non-const, the question
remains why the declaration for p requires a diagnostic, in other words,
why is the conversion from pointer-to-array-of-char to
pointer-to-array-of-const-char problematic?

CLC FAQ item 11.10 discusses a case that is somewhat similar:
why you can't assign a char** to a const char** .
It shows a series of legal (if the conversion were valid)
steps that would allow you to modify a const object:

const char c = 'x'; /* 1 */
char *p1; /* 2 */
const char **p2 = &p1; /* 3 */
*p2 = &c; /* 4 */
*p1 = 'X'; /* 5 */

(in reality, the compiler would complain at step 3).

In the (const char (*)[6]) = (char (*)[6]) case, the scenario from the
FAQ does not apply, because arrays are not assignable like pointers are.
Offhand, I cannot think of a horror scenario for this case.
Perhaps that will be a nice puzzle for a rainy Sunday afternoon.
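
For what it's worth, here is a sketch of the attempted analogue of the
FAQ's five steps (the names are made up). It stalls at step 4, because an
array is not a modifiable lvalue, so there is nothing to re-seat:

int main(void)
{
    const char hello[6] = "hello";  /* 1: a const array                     */
    char buf[6] = "world";          /* 2: a modifiable array                 */
    const char (*pa)[6] = &buf;     /* 3: the conversion under discussion
                                          (itself the constraint violation)  */
    /* *pa = hello; */              /* 4: ill-formed regardless: arrays are
                                          not assignable, so *pa can never
                                          be made to designate hello         */
    (void)hello; (void)pa;
    return 0;
}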
 

lawrence.jones

Ike Naar said:
In the (const char (*)[6]) = (char (*)[6]) case, the scenario from the
FAQ does not apply, because arrays are not assignable like pointers are.
Offhand, I cannot think of a horror scenario for this case.

There probably isn't one. The C conversion rules are very conservative:
they forbid all of the dangerous stuff, but they also forbid a lot of
perfectly safe stuff. No one has ever gone to the (significant) effort
of deriving more precise rules and convincing the standards committee
that they still forbid all the dangerous stuff. C++ has much better
rules but they aren't directly applicable to C because of the semantic
differences between the two languages and the different terminology and
styles of the two standards.
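
A classic example (not from this thread) of the safe-but-forbidden kind of
conversion Lawrence describes: converting char ** to const char *const *
cannot be used to modify anything through the converted pointer, and C++
accepts it as a qualification conversion, yet in C it still violates the
assignment constraint.

int main(void)
{
    char word[] = "hi";
    char *strings[] = { word, 0 };
    const char *const *view = strings;  /* diagnostic required in C,
                                           although the conversion is safe */
    (void)view;
    return 0;
}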
 

CBFalconer

Keith said:
.... snip ...

Attempting to modify the array created by a string literal
invokes UB, and it's perfectly reasonable for a compiler to warn
about it when it can. The problem with gcc's -Wwrite-strings
option is that it does this by changing the type of the array,
in violation of the standard.

I don't see the problem. There will be no message if the user
doesn't attempt to write into the string. But if he does, he
creates undefined behaviour, and gets a message.
 

CBFalconer

Keith said:
.... snip ...

I wouldn't worry about it. In a sense, -Wwrite-strings causes
gcc to compile a language that's subtly different from C -- and
one could argue that it's a *better* language. One drawback, as
with any language extension, conforming or not, is that it can
make it more difficult to be sure that your code conforms to the
standard and will be portable to other compilers.

Why? It seems to me that if the program compiles with
-Wwrite-strings set it should compile on any standard compiler (at
least the sections involved). However the reverse does not apply.
 

CBFalconer

Ike said:
Keith Thompson said:
[about const char (*p)[6] = &"hello"; ]
[snip excellent explanation why string literals are non-const]
.... snip ...

CLC FAQ item 11.10 discusses a case that is somewhat similar:
why you can't assign a char** to a const char** .
It shows a series of legal (if the conversion were valid)
steps that would allow you to modify a const object:

const char c = 'x'; /* 1 */
char *p1; /* 2 */
const char **p2 = &p1; /* 3 */
*p2 = &c; /* 4 */
*p1 = 'X'; /* 5 */

(in reality, the compiler would complain at step 3).

I don't know if you are referring to this, or something else.
However, p1 has no value assigned to it, which should not bother step
3, but will cause all sorts of trouble in step 5.
 

Ben Bacarisse

CBFalconer said:
Ike said:
Keith Thompson said:
[about const char (*p)[6] = &"hello"; ]
[snip excellent explanation why string literals are non-const]
... snip ...

CLC FAQ item 11.10 discusses a case that is somewhat similar:
why you can't assign a char** to a const char** .
It shows a series of legal (if the conversion were valid)
steps that would allow you to modify a const object:

const char c = 'x'; /* 1 */
char *p1; /* 2 */
const char **p2 = &p1; /* 3 */
*p2 = &c; /* 4 */
*p1 = 'X'; /* 5 */

(in reality, the compiler would complain at step 3).

I don't know if you are referring to this, or something else.
However, p1 has no value assigned to it, which should not bother step
3, but will cause all sorts of trouble in step 5.

But it gets a value in step 4.
 

Ben Bacarisse

CBFalconer said:
I don't see the problem. There will be no message if the user
doesn't attempt to write into the string. But if he does, he
creates undefined behaviour, and gets a message.

Keith explained a couple of messages back, that:

| The -Wwrite-strings option can cause gcc to omit the required diagnostic for
| the following:
|
| const char (*p)[6] = &"hello";

Does that sound familiar?
 

Ike Naar

I don't know if you are referring to this, or something else.
However, p1 has no value assigned to it, which should not bother step
3, but will cause all sorts of trouble in step 5.

In step 3, the address of p1 is assigned to p2, so an assignment to
*p2 changes p1. In step 4, the assignment to *p2 is therefore an
implicit assignment to p1, which gives p1 a value before it is used.
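
Spelling that out, here is the FAQ fragment again with comments marking
where p1 gets its value (assuming, for the sake of argument, that step 3
were allowed):

const char c = 'x';
char *p1;               /* uninitialized, but only written, never read,
                           until step 4 has stored a value into it       */
const char **p2 = &p1;  /* step 3: p2 points at p1 (the diagnosed step)  */
*p2 = &c;               /* step 4: *p2 is p1, so this stores &c into p1  */
*p1 = 'X';              /* step 5: writes through p1 into the const c    */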
 

Ben Bacarisse

CBFalconer said:
Why? It seems to me that if the program compiles with
-Wwrite-strings set it should compile on any standard compiler (at
least the sections involved). However the reverse does not apply.

Keith gave an example elsewhere in the thread:

| The -Wwrite-strings option can cause gcc to omit the required diagnostic for
| the following:
|
| const char (*p)[6] = &"hello";

If you compile with gcc -Wwrite-strings you are likely to miss this
constraint violation. Another compiler is permitted to reject the
translation unit.

This has come up a few times (I count three). Why not just make a
note that -Wwrite-strings makes gcc non-conforming?

By the way, I like the option and I use it all the time, but I accept
that it introduces a problem and I'd want to compile without it for
"production" builds (if I ever did such things anymore) just to check.
 

David Schwartz

Do you have an example of MSVC catching something GCC misses?

Regards.

I had a function that used to return a bool (true=success or
false=failure). It was changed to return zero for success or an error
code on failure. One caller was not changed to reflect this. It still
read something like:

if (!FunctionThatNowReturnsInt(x))
{
    // process error
}

GCC did not issue a warning. MSVC did.

This was code that had been in use on other platforms. Compiling with
MSVC allowed this hidden bug to be fixed before more widespread
deployment.
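
A minimal reconstruction of that kind of change (the error-code value and
the surrounding scaffolding are made up; only the calling idiom is from the
original post):

#include <stdio.h>

/* Old contract: returned 1 (true) on success, 0 on failure.
   New contract: returns 0 on success, a nonzero error code on failure. */
static int FunctionThatNowReturnsInt(int x)
{
    return (x < 0) ? 22 : 0;
}

int main(void)
{
    int x = 42;

    if (!FunctionThatNowReturnsInt(x))
    {
        /* Under the old contract this was the failure branch; under the
           new one it runs exactly when the call succeeds.  The condition
           is still well-formed C, so a compiler need not say anything. */
        puts("\"error\" branch taken on success");
    }
    return 0;
}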

In fairness, MSVC issues a lot of warnings on perfectly legal and
valid code. IMO, it's worth making the change to suppress the warning.

I don't mind changing perfectly valid code to more clearly express its
intent 50 times if in exchange I catch one real bug that would
otherwise have been released. (And we do wind up putting in a lot of
technically superfluous casts and comparisons to suppress those
warnings. But, IMO, the resulting clearer code is worth it anyway.)

DS
 

Martin Ambuhl

David said:
I had a function that used to return a bool (true=success or
false=failure). It was changed to return zero for success or an error
code on failure. One caller was not changed to reflect this. It still
read something like:

if (!FunctionThatNowReturnsInt(x))
{
    // process error
}

GCC did not issue a warning. MSVC did.

This was code that had been in use on other platforms. Compiling with
MSVC allowed this hidden bug to be fixed before more widespread
deployment.

I'm sorry, but I can't understand why you think MSVC's behavior is
desirable. If it issues a warning here, it will issue a warning in
thousands of places where there is no problem at all. An overzealous
compiler that warns about 'if (!x) /* x is an int */' would not survive
long in most programming environments.
 

David Schwartz

I'm sorry, but I can't understand why you think MSVC's behavior is
desirable. If it issues a warning here, it will issue a warning in
thousands of places where there is no problem at all. An overzealous
compiler that warns about 'if (!x) /* x is an int */' would not survive
long in most programming environments.

It would in ours, since our coding standards prohibit implicit
comparisons and specify that the negation operator can only be used on
boolean types. IMO, the two extra characters to make "x!=0" are well
worth it for the improved clarity of the code. For one thing, it makes
it clear that you know that x is an integer.

MSVC also warns on assignments of larger types into smaller types, for
example 'x = y' where the type of 'y' includes values that 'x' cannot
represent. IMO, the extra characters to add an explicit cast are well
worth it, as they make it clear to anyone looking at the code that there
is a conversion and that you intended that conversion.
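
A sketch of that kind of narrowing assignment (the types and names are
chosen only for illustration):

void narrow(long big)
{
    short small;

    small = big;          /* MSVC-style "possible loss of data" warning;
                             gcc, for instance, flags it with -Wconversion */
    small = (short)big;   /* the cast records that the narrowing is
                             intended (the value may still be truncated)   */
    (void)small;
}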

When one person looks over another person's code, it's very important
that if they see non-obvious effects of the code, it be clear that the
original author intended those effects. MSVC does a better job of
warning that your code does something non-obvious where you didn't
make it clear that this non-obvious behavior was intentional.

For every useful warning it gives, I admit, it gives ten useless
ones. But I still think it's worth it to make the code a little bit
clearer ten times, even though it was pretty clear already, to catch
an actual honest to goodness bug.

It also frequently catches cases where the code turns out to be okay,
but catches conditions you didn't think about. In this case, the
effort turns out to be wasted, but you can't blame that on MSVC. (For
example, where you copy a variable into a variable with a smaller
range, and it turns out to be safe, but you didn't even realize you were
doing that.)

DS
 

Martin Ambuhl

David said:
It would in ours, since our coding standards prohibit implicit
comparisons and specify that the negation operator can only be used on
boolean types. IMO, the two extra characters to make "x!=0" are well
worth it for the improved clarity of the code. For one thing, it makes
it clear that you know that x is an integer.


Frankly, this is a stupid rule for stupid programmers hired, no doubt,
by stupid HR people and retained by stupid managers. This rule
prohibits straightforward idioms like:

if (!(printer_state & ready))

I would hope that, contrary to the implication of your post, which you
snipped away, this is a warning that you ask of MSVC, not one that it
gives by default.
 

Harald van Dijk

Frankly, this is a stupid rule for stupid programmers hired, no doubt,
by stupid HR people and retained by stupid managers. This rule
prohibits straightforward idioms like:

if (!(printer_state & ready))

I can't speak for others, but I _want_ to rewrite that to != 0. It's easy,
if you allow implicit conversions from just any int to bool, to miss

if (state & (a | b))

where you really want

if ((state & (a | b)) == (a | b))

Had you written

if ((state & (a | b)) != 0)

in the first place, it would've been more obvious what it meant.

In other words, what if ready == (online | loaded)?
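
A concrete version of that pitfall, with made-up flag values:

#include <stdio.h>

enum { a = 0x1, b = 0x2 };   /* two hypothetical status bits */

int main(void)
{
    int state = a;                       /* only one of the two bits set */

    if (state & (a | b))                 /* true: at least one bit set   */
        puts("loose test passes");

    if ((state & (a | b)) == (a | b))    /* false: not both bits set     */
        puts("strict test passes");      /* not printed                  */

    return 0;
}
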
I would hope that, contrary to the implication of your post, which you
snipped away, this is a warning that you ask of MSVC, not one that it
gives by default.

I can't check, but I suspect it's not enabled by default, but also not
requested explicitly: it's probably enabled at a specific warning level.

[ microsoft.public.vc.language dropped; it seems to be unavailable to me. ]
 

David Schwartz

Frankly, this is a stupid rule for stupid programmers hired, no doubt,
by stupid HR people and retained by stupid managers.  This rule
prohibits straightforward idioms like:

   if (!(printer_state & ready))

Since that code is broken, it should be prohibited. Consider:

#define PRINTER_ON 0x100
#define PRINTER_HAS_PAPER 0x200
#define PRINTER_NO_ERROR 0x400
#define ready (PRINTER_ON|PRINTER_HAS_PAPER|PRINTER_NO_ERROR)

Oops, now the printer is ready if it's on even if it has no paper.
That can't be right.
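
Using those definitions, here is a small sketch (not from the original
post) contrasting the original idiom with an explicit all-bits test:

#include <stdio.h>

#define PRINTER_ON        0x100
#define PRINTER_HAS_PAPER 0x200
#define PRINTER_NO_ERROR  0x400
#define ready (PRINTER_ON | PRINTER_HAS_PAPER | PRINTER_NO_ERROR)

int main(void)
{
    int printer_state = PRINTER_ON;        /* on, but out of paper        */

    if (!(printer_state & ready))          /* original idiom              */
        puts("not ready?");                /* NOT printed: one bit is set */

    if ((printer_state & ready) != ready)  /* explicit all-bits test      */
        puts("not ready");                 /* printed                     */

    return 0;
}
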
I would hope that, contrary to the implication of your post, which you
snipped away, this is a warning that you ask of MSVC, not one that it
gives by default.

It's sort of a bit of both. It activates at the warning level that
would typically be used during debug and development.

DS
 

David Schwartz

Compared to what? Watcom C, Borland C, gcc....

Compared to GCC, at least with the standard warning levels used for
debug and development on both. It may simply be that GCC's standard
warning level is lower than MSVC's. I haven't investigated in detail.

I'm just saying, I've seen real live code that was developed and
debugged for months with GCC, then I've compiled it with MSVC and
quickly caught errors that had been missed. I've seen the same the
other way around, of course. I've also seen the same with ICC.

There are a few morals here, and one is that cross-platform code is
easier to debug than single-platform code.
And /that's/ a big issue, because of the "wood for the trees" effect.

But even though ten of them turn out to be useless, usually even one
or two of those useless ones will be an "Oh, I didn't realize the code
was doing that" moment. It's still, IMO, worth the minute to change
the code so that it makes clear that it's doing the non-obvious thing
you now know it's doing.
Many's the time our Quant team have let slip errors in their code because
MSVC generated a forest of non-errors in Microsoft headers. Fortunately
we recompile on Solaris and Linux using gcc, so those tend to get picked
up later. Still, it extends the development time, and causes fierce
debate between the quants and technology as to what to change the
'broken' code to.

I think MS has done a good job of controlling that lately. But that is
a fair point: when you see warnings in code you cannot or should not
change, you tend to get into the habit of ignoring warnings.

I am not saying MSVC is better than GCC at all. They're pretty
comparable in terms of helping you find problems with code. Just that
more tools are better, since they're different.

DS
 
