UB when flowing off end of value-returning function

James Kanze · Nov 4, 2010

Scott Meyers wrote:

[...]

Some C++ compilers manage to detect a "flow off" apparently
without any trouble, which leads me to believe that this is doable.

Really! Which ones? (Some detect some more or less trivial
cases, and most of those detect cases where in fact there is no
flow off.)

James Kanze · Nov 5, 2010

Which of course means that any one of them could be changed to
"default:" without any problem.

Except rendering the code more difficult to read.

That one keyword will tell both the compiler and a human
reader that all possible paths are covered, without either
having to do any higher order analysis.

Understanding what the code is doing (converting UTF-8) will
tell the human reader that all possible paths are covered.

It seems to me that it is quite easy to *say* that the switch
covers all possible cases, and it is not even that hard to
write toy code that has switches that can be shown to cover
all possible cases with only some effort, but turning one of
the cases into "default:" removes all doubt.

So I ask again, why no default clause? Is the writer of the
code deliberately trying to make his code hard to parse?

Just the opposite. The word "default" has a very definite
meaning in English. Which doesn't apply to any of the cases
here. There is no default; there are four very precise and
distinct cases.

Rui Maciel · Nov 6, 2010

James said:
Scott Meyers wrote:
[...]
Some C++ compilers manage to detect a "flow off" apparently
without any trouble, which leads me to believe that this is doable.

Really! Which ones? (Some detect some more or less trivial
cases, and most of those detect cases where in fact there is no
flow off.)

Both G++ and MS's C++ compiler manage to detect when a method presents a
"flow off" problem. I don't know if their technology is able to detect
every conceivable case but they do throw warnings and errors for that,
depending on the configuration.

Rui Maciel

James Kanze · Nov 7, 2010

James said:
Scott Meyers wrote:

[...]

Some C++ compilers manage to detect a "flow off" apparently
without any trouble, which leads me to believe that this is doable.

Really! Which ones? (Some detect some more or less trivial
cases, and most of those detect cases where in fact there is no
flow off.)

Both G++ and MS's C++ compiler manage to detect when a method
presents a "flow off" problem.

Both G++ and VC++ warn about flowing off the end in cases where
the code can't flow off the end, and fail to warn about it in
some cases where the code could flow off the end. In addition,
where you get warnings from g++ depends on the various
optimization options.

In other words, neither detects "flow off" reliably.

I don't know if their technology is able to detect every
conceivable case but they do throw warnings and errors for
that, depending on the configuration.

Which is why the standard can't require an error. Requiring an
error means being able to reliably detect the condition, always,
and with no false detections.
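
A small sketch of why reliable detection is hard for a compiler that sees one translation unit at a time. The names `reportFatal` and `checkedSqrt` are made up for illustration and are not from the thread. Whether `checkedSqrt` can flow off its end depends entirely on whether `reportFatal` ever returns, and a plain declaration says nothing about that. The C++11 `[[noreturn]]` attribute, which postdates this discussion, lets the programmer state it explicitly.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Hypothetical helper: always throws, so it never returns to its caller.
// Without [[noreturn]], a compiler that sees only a declaration of this
// function cannot know that.
[[noreturn]] void reportFatal(char const* message)
{
    throw std::runtime_error(message);
}

// Every path either returns a value or calls reportFatal, so control
// cannot reach the closing brace.
double checkedSqrt(double x)
{
    if (x >= 0.0) {
        return std::sqrt(x);
    }
    reportFatal("negative argument");
}
```

Removing the attribute does not change what the program does, but it may bring back a warning about control reaching the end of a non-void function, which would be a false positive of the kind described above.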

Rui Maciel · Nov 7, 2010

James said:
Both G++ and VC++ warn about flowing off the end in cases where
the code can't flow off the end, and fail to warn about it in
some cases where the code could flow off the end.

Interesting. Is there an example which is able to demonstrate that?

Rui Maciel

James Kanze · Nov 8, 2010

Interesting. Is there an example which is able to demonstrate that?

I've already posted an example in this thread:

    #include <stdexcept>

    unsigned int getMultiByteUTF8(char const*&);

    unsigned int
    getUTF8(char const*& in)
    {
        switch (*in & 0xC0) {
        case 0x00:
        case 0x40:
            return *in ++;

        case 0x80:
            throw std::runtime_error("Illegal char");

        case 0xC0:
            return getMultiByteUTF8(in);
        }
    }

Both g++ and VC++ warn about flowing off the end, although it's
clearly impossible.

Rui Maciel · Nov 8, 2010

James said:
I've already posted an example in this thread:

    #include <stdexcept>

    unsigned int getMultiByteUTF8(char const*&);

    unsigned int
    getUTF8(char const*& in)
    {
        switch (*in & 0xC0) {
        case 0x00:
        case 0x40:
            return *in ++;

        case 0x80:
            throw std::runtime_error("Illegal char");

        case 0xC0:
            return getMultiByteUTF8(in);
        }
    }

Both g++ and VC++ warn about flowing off the end, although it's
clearly impossible.

That is not true. According to the C++ standard it is very possible for that code to flow off the
end, and for multiple reasons, as not only sizeof(int) is implementation-defined but also the number
of bits in a byte can vary between implementations. Your assertion would only be true if an int was
defined as being exactly 4 bytes and each byte was exactly 8 bits large.

Rui Maciel

James Kanze · Nov 8, 2010

That is not true. According to the C++ standard it is very
possible for that code to flow off the end, and for multiple
reasons, as not only sizeof(int) is implementation-defined but
also the number of bits in a byte can vary between
implementations. Your assertion would only be true if an int
was defined as being exactly 4 bytes and each byte was exactly
8 bits large.

Come again. There is no way the above code can fall off the
end, regardless of the size of an int or a byte. (If int is
less than 21 bits, getMultiByteUTF8 cannot return the correct
value in some cases, but that's another problem.)
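
The claim can be checked mechanically. The sketch below (the function name `maskCoversAllCases` is an assumption for illustration) walks every value a `char` can hold on the current implementation, whatever `CHAR_BIT` happens to be, and confirms that `c & 0xC0` is always one of the four values named in the switch. It assumes `int` is wider than `char`, so that the loop counter can step past `char`'s maximum.

```cpp
#include <cassert>
#include <limits>

// Returns true if (c & 0xC0) is one of 0x00, 0x40, 0x80, 0xC0 for every
// value of type char on this implementation.
bool maskCoversAllCases()
{
    // Iterate in int (assumed wider than char) to avoid overflow at the top.
    int const lo = std::numeric_limits<char>::min();
    int const hi = std::numeric_limits<char>::max();
    for (int value = lo; value <= hi; ++value) {
        char const c = static_cast<char>(value);
        int const masked = c & 0xC0;   // c is promoted to int before the AND
        if (masked != 0x00 && masked != 0x40 &&
            masked != 0x80 && masked != 0xC0) {
            return false;
        }
    }
    return true;
}
```

The result follows from the mask alone: `0xC0` has only bits 6 and 7 set, so the AND can produce at most four distinct values, independent of the width of either operand.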

Alf P. Steinbach /Usenet · Nov 8, 2010

* Rui Maciel, on 08.11.2010 15:33:

That is not true. According to the C++ standard it is very possible for that code to flow off the
end, and for multiple reasons, as not only sizeof(int) is implementation-defined but also the number
of bits in a byte can vary between implementations. Your assertion would only be true if an int was
defined as being exactly 4 bytes and each byte was exactly 8 bits large.

Rui, it's difficult (to say the least) to see what the size of int or bits per
byte has to do with anything here. Could you cite the standard, please? *And*
explain how that relates to your conclusion?

Anyway, your message is interesting as a demonstration of the quoting bugs in
Mozilla MoronBird.

Even when I selected all of the text before hitting "reply", it insisted on
changing "switch (*in & 0xC0)" to "switch (*in& 0xC0)". Jeez. I'd KILL that
bird, except then I'd have to rewrite a useful extension for some other prog.

Cheers,

- Alf

Juha Nieminen · Nov 8, 2010

Paavo Helde said:
Even if this was true, who should know better about the sizes of ints and
numbers of bits than the compiler doing the compilation? And thanks, I do
not want warnings about my code not being portable to some hypothetical
platform with 16-bit ints or 9-bit bytes (I would use other code branches
or typedefs for them if I really need to support such).

I don't see how the behavior of the example code being discussed would
depend on the number of bits in a char or an int. If you take a value of
type char and calculate a bitwise-and with an int literal, you will get
an exact set of possible results, and this does not depend on the bit
sizes of anything. (Even in the hypothetical case that the number of bits
in an int would be smaller than the int literal used in the code, you
would *still* not get a result which would fall outside of the listed
cases.)

I have, in fact, seen some people sometimes having very odd
misconceptions about C (and hence C++), with regards to bit manipulation.
For example, I once saw an open-source program where bit-shifting was
avoided (in a situation where you would normally use it) because the
programmer thought that its result depends on the endianness of the
target architecture, and hence bit-shifting is not portable (this was
stated in the code comments).
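
A short sketch of why that belief is mistaken: shifts are defined on values, not on the bytes in memory. The helper names `byteByShift` and `byteByMemory` are hypothetical. The first gives the same answer on every platform; only the second, which inspects the object representation, varies with byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Extracts byte `index` (0 = least significant) using shifts.
// The result depends only on the value, never on endianness.
std::uint8_t byteByShift(std::uint32_t value, unsigned index)
{
    return static_cast<std::uint8_t>((value >> (8 * index)) & 0xFFu);
}

// Reads byte `index` of the in-memory representation.
// This is the operation whose result differs between architectures.
std::uint8_t byteByMemory(std::uint32_t value, unsigned index)
{
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);
    return bytes[index];
}
```

For example, `byteByShift(0x12345678u, 1)` yields `0x56` everywhere, whereas `byteByMemory(0x12345678u, 0)` depends on how the target stores the four bytes.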

Kevin P. Fleming · Dec 22, 2010

On 11/02/2010 03:34 PM, Marcel Müller wrote:

(resurrecting an old thread as I catch up on this newsgroup)

Btw. The second feature I regularly miss is to declare a function as
side effect free, which implies that the return value is a compile time
constant if all arguments are compile time constant.
I have seen (and used) this on a very old compiler for the inmos
Transputer platform. Looking at the generated code this turned out to
give the compiler room for very advanced optimizations.
Of course, it makes no difference as long as the code is inlined. But
this is not always an option, e.g. because PIMPL.

GCC has this, via the 'const' and 'pure' attributes that can be
specified via __attribute__. They exist solely to help the optimizer do
a better job.
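
A minimal sketch of those two attributes, using GCC's `__attribute__` syntax (Clang accepts it as well). The functions `square` and `sumOf` are invented for illustration. `const` promises that the result depends only on the argument values; `pure` additionally allows the function to read memory, for instance through a pointer, while still promising no side effects.

```cpp
#include <cassert>
#include <cstddef>

// const: the result depends only on the arguments; no memory is read.
__attribute__((const)) int square(int x)
{
    return x * x;
}

// pure: may read memory (here, through `data`) but has no side effects.
__attribute__((pure)) int sumOf(int const* data, std::size_t count)
{
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```

Neither attribute makes a call a constant expression in the language sense; they only give the optimizer permission to merge or hoist repeated calls. The compiler does not verify the promise, so a mislabeled function can be silently miscompiled.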
