Unsigned types are DANGEROUS??

Michael Doubez

On 22/03/2011 16:56, Michael Doubez wrote:
On 3/20/2011 7:42 AM, James Kanze wrote:
James Kanze wrote:
       [...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes? :)
No.  My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention.  The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.
I don't really believe this.  We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation).  I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:
1) it clearly documents that the function is expected to only accept
positive numbers.
And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).
There is nothing horrible about the std::string::npos "notation"; it is
perfectly fine.
Except that it clutters the code and cannot be used as a start
position: you have to test against it along the whole chain for simple
parsing.

Any and all uses of symbols rather than values for constant expressions
"clutters" code.  Had a boss once that told me the same thing about
new-style casts.

Well, new-style casts are ugly, but that has the advantage of discouraging
their use and making them stand out in the code (and you can grep for them).
I use boost::optional on a regular basis.  I probably wouldn't for the
suggested case here, but it has proved itself quite useful.  You
shouldn't discount it so easily.

I use it as a return value and in some cases for variables and/or
asynchronous operation results (somewhat like a homespun future) but
not as a parameter.
I suppose it is a naming issue: Fallible<> is not a good parameter
type name, while optional<> makes some sense.
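
To make the return-value use concrete, here is a minimal sketch. Fallible<T> below is a hypothetical, stripped-down stand-in written only for this illustration (it is neither boost::optional nor the Barton & Nackman class), and find_char is an invented helper. It shows a search reporting "not found" through the return type rather than through a sentinel such as std::string::npos.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal "maybe a value" wrapper, for illustration only.
// Assumes T is default-constructible and copyable.
template <typename T>
class Fallible {
public:
    Fallible() : valid_(false), value_() {}
    explicit Fallible(const T& value) : valid_(true), value_(value) {}

    bool is_valid() const { return valid_; }

    // Precondition: is_valid() is true.
    const T& value() const {
        assert(valid_);
        return value_;
    }

private:
    bool valid_;
    T value_;
};

// Invented helper: position of the first occurrence of c in s, if any.
Fallible<std::size_t> find_char(const std::string& s, char c) {
    const std::string::size_type pos = s.find(c);
    if (pos == std::string::npos) {
        return Fallible<std::size_t>();   // "no result"
    }
    return Fallible<std::size_t>(pos);
}
```

The caller has to ask is_valid() before touching the value, so the "not found" case cannot be mistaken for a very large index.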
 
Öö Tiib

I use it as a return value and in some cases for variables and/or
asynchronous operation results (somewhat like a homespun future) but
not as a parameter.
I suppose it is a naming issue: Fallible<> is not a good parameter
type name, while optional<> makes some sense.

I think Leigh's Fallible<> parameter is a red herring. In C++ it is
possible to limit most interfaces with reference and value parameters.
There seems to be no need for optional<> or pointer parameters.

When an argument is optional, it is usually cleaner to make an
overload without that parameter. It is in C that you have to have
pointer parameters and check them for nullness (because in C there
are neither references nor overloads).
 
Öö Tiib

I was talking about using optional variables as return values rather
than parameters but yeah multiple overloads can be cleaner than using
optional variable parameters; however to avoid code duplication
overloads that differ simply by the omission of a particular parameter
will typically forward to the overload that does the actual work (if the
work is non-trivial) and this overload will have all possible parameters
present making optional variables or pointer parameters still necessary
for this overload.  Code duplication is not clean.

I was not advocating code duplication. I was suggesting splitting
non-trivial differences between different functions. How I have usually
seen it done:
a) If the optional parameter adds responsibilities to the function, then
the additional work is done by the longer overload (so the longer overload
calls the shorter overload and additionally fulfills those
responsibilities).
b) If the optional parameter, when present, carries some "unusual"
behavior, then a "default" can always be constructed that
behaves in the "usual" way (so the shorter overload calls the longer
overload with that "default"). That is, if there are reasons not to put
such a default into the function's signature in the first place.
c) If the behavior of the functions with the parameter present and missing
differs by a lot, then there is always a third way. Both
overloads implement their unique part and call a third function
(usually private) for the common work.

A single long and complex function is often more painful than a family of
shorter ones.
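
Cases (a) and (b) above could look roughly like the following sketch. All function names are invented for the illustration; case (c) is not shown, since it only adds a shared private helper.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Case (b): the longer overload does the work; the shorter one forwards
// to it with a "usual" default. Hypothetical example functions.
std::string join(const std::vector<std::string>& parts,
                 const std::string& separator) {
    std::string result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) {
            result += separator;
        }
        result += parts[i];
    }
    return result;
}

std::string join(const std::vector<std::string>& parts) {
    return join(parts, ", ");
}

// Case (a): the shorter overload does the common work; the longer one
// adds a responsibility (indentation) and then calls the shorter one.
void append_line(std::string& out, const std::string& text) {
    out += text;
    out += '\n';
}

void append_line(std::string& out, const std::string& text,
                 std::size_t indent) {
    out.append(indent, ' ');   // the additional responsibility
    append_line(out, text);    // the common work
}
```

In neither case does an overload need a pointer or an optional<> parameter, and no logic is duplicated.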
 
James Kanze

On 3/20/2011 7:42 AM, James Kanze wrote:
[...]
Furthermore, even if you are right...consensus among experts
is actually not as important as consensus within a project and
within code.

That is exactly what I've been saying all along in this thread.
I've worked in a large number of places, and the consensus has
always been to stick with int. Probably because most
programmers have learned C++ (directly or indirectly) from
Stroustrup, and Meyers has also been a large influence. But the
reason for choosing int is because that's what most programmers
expect.

Obviously, if you find yourself in an environment where the
consensus is different, you should follow the consensus. There
are sound technical reasons for preferring int, but they are far
from being overwhelming; you can write very good, readable and
maintainable code using either convention. What you can't do is
write good, readable and maintainable code being inconsistent
about it.
Obviously, if you're working for google (what started all this) then
you'll use signed numbers for everything.

Not just Google. If you're working anywhere where most of the
programmers have learned C++ from Stroustrup, or have read and
been influenced by Meyers, then you'll use signed numbers for
everything. If the major influence at the learning stage of C++
was someone else, the conventions could be significantly
different. It's just that I've yet to see such places.
On the other hand, there's a huge number of people and
projects that do otherwise and this includes the standard
library. Frankly, the fact that the standard library does a
thing is more important than the fact that some "expert"
disagrees.
Clearly one is more inclined to see consensus with one's own opinion
than otherwise, but I simply have not seen enough evidence for me to
agree that wide, general convention agrees with your position.

I don't know, then. All I know is about the places where I've
worked. Which does include several large companies in the
telecoms field, and several large banks. But obviously doesn't
include all companies.
 
James Kanze

[...]
Using unsigned to indicate the parameter must be positive

Is not a valid argument for anything, unless the valid range is
exactly [0...UINT_MAX]. C++ doesn't have ranged integral
types, so you have to make do. Any type which will hold the
desired range is equally good from that point of view, and the
type of an argument can never be assumed to tell you anything
about the range.

Suppose I want to represent the value of a six sided die which
has been thrown. The legal range is 1...6. Which type,
unsigned or int, tells me more about this? The answer, of
course, is neither.
 
James Kanze

[...]
I use boost::optional on a regular basis. I probably wouldn't for the
suggested case here, but it has proved itself quite useful. You
shouldn't discount it so easily.
I use it as a return value and in some cases for variables and/or
asynchronous operation results (somewhat like a homespun future) but
not as a parameter.
I suppose it is a naming issue: Fallible<> is not a good parameter
type name, while optional<> makes some sense.

And optional is not a good return type name, but Fallible is:).
And neither are really a good name for a nullable type (but
optional will do in a pinch).

The best name (in the sense of being the most generally
applicable) I've seen so far is Maybe<>. But I first saw it
only a year and a half ago, and it hadn't occurred to me before
then. For better or worse, most experienced programmers are
familiar with Fallible (it's hard to imagine an experienced
programmer -- i.e. a programmer with 10 or more years experience
in C++ -- who hasn't read Barton & Nackman).
 
MikeP

James said:
[...]
Using unsigned to indicate the parameter must be positive

Is not a valid argument for anything, unless the valid range is
exactly [0...UINT_MAX]. C++ doesn't have ranged integral
types, so you have to make do. Any type which will hold the
desired range is equally good from that point of view, and the
type of an argument can never be assumed to tell you anything
about the range.

Suppose I want to represent the value of a six sided die which
has been thrown. The legal range is 1...6. Which type,
unsigned or int, tells me more about this? The answer, of
course, is neither.

Focusing on an exact range is a strawman, for it is not relevant to the
discussion of the relative semantic value of using signed or unsigned.
Any suggestion that unsigned does not give more meaning than signed (when
signed is used to represent variables that cannot be negative) is, of
course, incorrect. Using signed for the die value gives rise to questions
about what kind of concoction the programmer developed where a die value
CAN be negative. Such software may not be worth investigating further,
for it can be some bizzarro design, and not worth the time to figure out
or to verify. Using unsigned precludes all such fears (fears that a die
value can be negative).
 
Öö Tiib

   [...]
Using unsigned to indicate the parameter must be positive
Is not a valid argument for anything, unless the valid range is
exactly [0...UINT_MAX].  C++ doesn't have ranged integral
types, so you have to make do.  Any type which will hold the
desired range is equally good from that point of view, and the
type of an argument can never be assumed to tell you anything
about the range.
Suppose I want to represent the value of a six sided die which
has been thrown.  The legal range is 1...6.  Which type,
unsigned or int, tells me more about this?  The answer, of
course, is neither.

Focusing on an exact range is a strawman, for it is not relevant to the
discussion of the relative semantic value of using signed or unsigned.
Any suggestion that unsigned does not give more meaning than signed (when
signed is used to represent variables that cannot be negative) is, of
course, incorrect. Using signed for the die value gives rise to questions
about what kind of concoction the programmer developed where a die value
CAN be negative. Such software may not be worth investigating further,
for it can be some bizzarro design, and not worth the time to figure out
or to verify. Using unsigned precludes all such fears (fears that a die
value can be negative).

You have the strangest fears and nightmares that I have ever heard of.
Why is it not more fearful that a die value may be 42? It is very simple
to construct a physical die with sides -2, -1, 0, 1, 2 and 42 for some
custom board game. Emulating such a game on some web site is, by your
description, "bizzarro" and not worth the time.
 
Gerhard Fiedler

MikeP said:
James said:
[...]
Using unsigned to indicate the parameter must be positive

Is not a valid argument for anything, unless the valid range is
exactly [0...UINT_MAX]. C++ doesn't have ranged integral
types, so you have to make do. Any type which will hold the
desired range is equally good from that point of view, and the
type of an argument can never be assumed to tell you anything
about the range.

Suppose I want to represent the value of a six sided die which
has been thrown. The legal range is 1...6. Which type,
unsigned or int, tells me more about this? The answer, of
course, is neither.

Focusing on an exact range is a strawman, for it is not relevant to
the discussion of the relative semantic value of using signed or
unsigned. Any suggestion that unsigned does not give more meaning
than signed (when signed is used to represent variables that cannot
be negative) is, of course, incorrect. Using signed for the die value
gives rise to questions about what kind of concoction the programmer
developed where a die value CAN be negative. Such software may not be
worth investigating further, for it can be some bizzarro design, and
not worth the time to figure out or to verify. Using unsigned
precludes all such fears (fears that a die value can be negative).

I don't know about you, but I wouldn't want the value to be 0, either.
So the "fear" is not really that it is negative, it is that it is out of
the valid range. Unsigned integers don't help in this case; the only
thing that helps somewhat are freely distributed assertions on the
correct range -- and the assertions look exactly the same, whether
signed or unsigned.
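
As a small sketch of that point (the function names are invented for the illustration), the range check for a die value can be written once and used unchanged with int and with unsigned. The unsigned parameter rules out neither 0 nor 42.

```cpp
#include <cassert>

// Hypothetical range check; the same text serves int and unsigned alike.
template <typename Integer>
bool is_valid_die(Integer value) {
    return value >= 1 && value <= 6;
}

// Two invented functions that differ only in the parameter type.
// The assertion that guards them is identical.
int double_roll_signed(int die) {
    assert(is_valid_die(die));
    return 2 * die;
}

unsigned double_roll_unsigned(unsigned die) {
    assert(is_valid_die(die));
    return 2 * die;
}
```

A caller who passes -1 to the unsigned version gets a silent conversion to a very large value, which the same assertion then rejects.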

If you have a range that starts at 0, the two arguments for the use of
unsigned I've seen so far are the semantic value of the declaration and
that the assertions are simpler. But both really only apply to ranges
0..max.

IMO the most important argument is always convenience when interfacing
with libraries (including the standard lib).

Gerhard
 
James Kanze

James said:
On 22 mar, 18:04, Leigh Johnston wrote:
[...]
Using unsigned to indicate the parameter must be positive
Is not a valid argument for anything, unless the valid range is
exactly [0...UINT_MAX]. C++ doesn't have ranged integral
types, so you have to make do. Any type which will hold the
desired range is equally good from that point of view, and the
type of an argument can never be assumed to tell you anything
about the range.
Suppose I want to represent the value of a six sided die which
has been thrown. The legal range is 1...6. Which type,
unsigned or int, tells me more about this? The answer, of
course, is neither.
Focusing on an exact range is a strawman, for it is not relevant to the
discussion of the relative semantic value of using signed or unsigned.

Which is exactly what I said. Arguments concerning the range
are vacuous, since C++ doesn't support ranged variables.
Any suggestion that unsigned does not give more meaning than
signed (when signed is used to represent variables that cannot
be negative) is, of course, incorrect.

But it doesn't give any additional useful information, in most
cases.
Using signed for the die value gives rise to questions
about what kind of concoction the programmer developed where a die value
CAN be negative.

No. In the places I've worked, using unsigned for the die value
gives rise to questions as to why the programmer wanted modulo
arithmetic, or what sort of bitwise operations he had in mind.

It's a question of expectations, not a language issue, nor a
technical issue. The expectations of most programmers I've met
(and practically all I've worked with) are that unsigned means
bitwise or modulo. I've also pointed out where those
expectations come from.
Such software may not be worth investigating further,
for it can be some bizzarro design, and not worth the time to figure out
or to verify. Using unsigned precludes all such fears (fears that a die
value can be negative).

But leaves the fear that it can be 0? You're being ridiculous.
What the programmer understands from a declaration is what he
learned to understand. In some cases, there are strong reasons
for trying to teach him otherwise: write "char const*", and not
"const char*"; use .hpp or .hh for headers, and not .h. And
even in those cases, I've almost given up. In the case of int
vs. unsigned, there are only very weak reasons either way (and
they favor int), so it's not worth bucking the trend. It only
confuses the reader.
 
Ian Collins

Again you are uttering nonsense (or bullshit: you choose); unsigned
types are not just for "bitwise or modulo": std::size_t is an unsigned
type and is used to represent sizes and indices; char can be an unsigned
type and char is used for representing string characters.

Wow, is this argument still going on?

I think I'll go back to clearing earthquake debris...
 
Michael Doubez

James Kanze wrote:
    [...]
Using unsigned to indicate the parameter must be positive
Is not a valid argument for anything, unless the valid range is
exactly [0...UINT_MAX].  C++ doesn't have ranged integral
types, so you have to make do.  Any type which will hold the
desired range is equally good from that point of view, and the
type of an argument can never be assumed to tell you anything
about the range.
Suppose I want to represent the value of a six sided die which
has been thrown.  The legal range is 1...6.  Which type,
unsigned or int, tells me more about this?  The answer, of
course, is neither.
Focusing on an exact range is a strawman, for it is not relevant to the
discussion of the relative semantic value of using signed or unsigned.
Which is exactly what I said.  Arguments concerning the range
are vacuous, since C++ doesn't support ranged variables.
But it doesn't give any additional useful information, in most
cases.
No.  In the places I've worked, using unsigned for the die value
gives rise to questions as to why the programmer wanted modulo
arithmetic, or what sort of bitwise operations he had in mind.
It's a question of expectations, not a language issue, nor a
technical issue.  The expectations of most programmers I've met
(and practically all I've worked with) are that unsigned means
bitwise or modulo.  I've also pointed out where those
expectations come from.

Again you are uttering nonsense (or bullshit: you choose); unsigned
types are not just for "bitwise or modulo": std::size_t is an unsigned
type and is used to represent sizes and indices; char can be an unsigned
type and char is used for representing string characters.

IMO, the fact that a size type is used for indices is part of the
problem. Shouldn't it be an index type?
An index is likely to be computed and may become negative at
some point (when computing a difference), so it should be a signed
type.

But we want to compare an index to a size, and that led to the index
type being the same as the size type.

I would have preferred it the other way around: the size type the same as
the index type, i.e. signed.
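
A small sketch of the difference problem, with invented helper names: subtracting two std::size_t indices stays in the unsigned type, so a "negative" distance wraps around to a huge value, whereas converting to std::ptrdiff_t first gives the expected result. The signed version assumes both indices fit in std::ptrdiff_t.

```cpp
#include <cstddef>

// Hypothetical helper: distance from index `from` to index `to`,
// computed in a signed type. Assumes both values fit in std::ptrdiff_t.
std::ptrdiff_t signed_distance(std::size_t from, std::size_t to) {
    return static_cast<std::ptrdiff_t>(to) - static_cast<std::ptrdiff_t>(from);
}

// The same subtraction done directly in the unsigned type: the result
// is reduced modulo 2^N, so to < from yields a very large value.
std::size_t unsigned_distance(std::size_t from, std::size_t to) {
    return to - from;
}
```

With from = 5 and to = 2, the first function returns -3 and the second returns the largest std::size_t value minus 2.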
 
werasm

If you investigate the tcmalloc code (by Google), you will find the
following warning:

// NOTE: unsigned types are DANGEROUS in loops and other arithmetical
// places. Use the signed types unless your variable represents a bit
// pattern (eg a hash value) or you really need the extra bit. Do NOT
// use 'unsigned' to express "this value should always be positive";
// use assertions for this.

Is it just their idiom? What's the problem with using unsigned ints in
loops (it seems natural to do so)? Are C++ unsigned ints "broken"
somehow?

This reminds me of a little piece of code that bit me recently (it
crashed):

void ScanPatternDataMdl::moveCursorToNextPeak( double& cursor )
{
    bool peakFound = false;
    for( size_t i = 0; !peakFound && (i < peakGraphData_->size()); ++i )
    {
        double peakX = peakGraphData_->x( i );
        peakFound = (peakX > cursor);
        if( peakFound ){ cursor = peakX; }
    }
}

void ScanPatternDataMdl::moveCursorToPrevPeak( double& cursor )
{
    bool peakFound = false;
    for( size_t i = peakGraphData_->size()-1; !peakFound && (i >= 0); --i )
    {
        double peakX = peakGraphData_->x( i );
        peakFound = (peakX < cursor);
        if( peakFound ){ cursor = peakX; }
    }
}

Spot the bug??? Unsigned types are certainly not symmetrical
in their use. I prefer signed but used unsigned in this case
(ignorantly), as I wanted to hush the compiler warning (in the
increment case).

I'd say the only reason for using unsigned types would be if the
signed range cannot hold the value required.
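
For reference, the crash in the second function comes from its loop condition: i is a size_t, so "i >= 0" is always true, and decrementing i at 0 wraps it around to the largest size_t value, which is then passed to x( i ). With an empty container, size()-1 wraps in the same way. One possible rewrite keeps the unsigned counter but shifts it by one. The sketch below is a hypothetical free-function version: peaks stands in for peakGraphData_, and a std::vector<double> is assumed in place of the real container, whose type the post does not show.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for moveCursorToPrevPeak. Scans from the back and
// moves `cursor` to the first peak position found that is smaller than it.
// Returns true if the cursor was moved.
bool move_cursor_to_prev_peak(const std::vector<double>& peaks,
                              double& cursor) {
    // i runs from size() down to 1 and the element index is i - 1, so the
    // counter never needs to go below zero and an empty vector is handled.
    for (std::size_t i = peaks.size(); i > 0; --i) {
        const double peak_x = peaks[i - 1];
        if (peak_x < cursor) {
            cursor = peak_x;
            return true;
        }
    }
    return false;
}
```

Using a signed counter (or reverse iterators) would work just as well; the point is only that the termination test must not rely on the counter becoming negative.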

Kind regards,

Werner
 
Michael Doubez

On 28/03/2011 23:20, James Kanze wrote:
James Kanze wrote:
     [...]
Using unsigned to indicate the parameter must be positive
Is not a valid argument for anything, unless the valid range is
exactly [0...UINT_MAX].  C++ doesn't have ranged integral
types, so you have to make do.  Any type which will hold the
desired range is equally good from that point of view, and the
type of an argument can never be assumed to tell you anything
about the range.
Suppose I want to represent the value of a six sided die which
has been thrown.  The legal range is 1...6.  Which type,
unsigned or int, tells me more about this?  The answer, of
course, is neither.
Focusing on an exact range is a strawman, for it is not relevant to the
discussion of the relative semantic value of using signed or unsigned.
Which is exactly what I said.  Arguments concerning the range
are vacuous, since C++ doesn't support ranged variables.
Any suggestion that unsigned does not give more meaning than
signed (when signed is used to represent variables that cannot
be negative) is, of course, incorrect.
But it doesn't give any additional useful information, in most
cases.
Using signed for the die value gives rise to questions
about what kind of concoction the programmer developed where a die value
CAN be negative.
No.  In the places I've worked, using unsigned for the die value
gives rise to questions as to why the programmer wanted modulo
arithmetic, or what sort of bitwise operations he had in mind.
It's a question of expectations, not a language issue, nor a
technical issue.  The expectations of most programmers I've met
(and practically all I've worked with) are that unsigned means
bitwise or modulo.  I've also pointed out where those
expectations come from.
Again you are uttering nonsense (or bullshit: you choose); unsigned
types are not just for "bitwise or modulo": std::size_t is an unsigned
type and is used to represent sizes and indices; char can be an unsigned
type and char is used for representing string characters.
IMO, the fact that a size type is used for indices is part of the
problem. Shouldn't it be an index type?
An index is likely to be computed and may become negative at
some point (when computing a difference), so it should be a signed
type.

No; size_type is used for indices and difference_type is used for ..
wait for it.. differences.

Between iterators. Which is much help at this point.

Does the standard guarantee that difference_type works in all cases
with size_t?

Your personal preferences are not really relevant; what is relevant are
the idioms that have been standardized by the C++ Standard.

Like vector<bool>, valarray<> and auto_ptr<>?
 
xiilin

MikeP said:
If you investigate the tcmalloc code (by Google), you will find the
following warning:

// NOTE: unsigned types are DANGEROUS in loops and other arithmetical
// places. Use the signed types unless your variable represents a bit
// pattern (eg a hash value) or you really need the extra bit. Do NOT
// use 'unsigned' to express "this value should always be positive";
// use assertions for this.

Is it just their idiom? What's the problem with using unsigned ints in
loops (it seems natural to do so)? Are C++ unsigned ints "broken"
somehow?


"for (i = 0; i < 100; i++)" may be written as "for (i = 99; i > 0; i--)"
when you want to search from the end to the start. If "i" is a signed
type, there is no problem, but if "i" is an unsigned type, it
will be an endless loop.

Some hidden overflow may happen, so "unsigned types are DANGEROUS in
loops and other arithmetical places."

Sorry for my terrible English.
 
werasm

"for (i = 0; i < 100; i++)" may be written as "for (i = 99; i > 0; i--)"

"for (i = 0; i < 100; ++i)" may be written as "for (i = 99; i >= 0; --i)"

I suppose using pre-increment and pre-decrement could be considered
pedantic. The "i >= 0" is correct, as the expression is only executed
after the statement (see par 6.5.3 of the working draft).

My point exactly...
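
The two loop conditions quoted above behave differently, which a short sketch can make explicit (the function names are invented for the illustration). With an unsigned counter, "i > 0" does terminate but never visits index 0, while "i >= 0" is always true and so never terminates. With a signed counter, "i >= 0" visits every index from n - 1 down to 0.

```cpp
#include <vector>

// Indices visited by "for (i = n - 1; i > 0; --i)" with an unsigned
// counter. Terminates, but index 0 is never visited.
std::vector<unsigned> visit_down_unsigned(unsigned n) {
    std::vector<unsigned> visited;
    if (n == 0) {
        return visited;   // n - 1 would wrap around to UINT_MAX
    }
    for (unsigned i = n - 1; i > 0; --i) {
        visited.push_back(i);
    }
    return visited;
}

// Indices visited by "for (i = n - 1; i >= 0; --i)" with a signed
// counter. Visits n - 1 down to 0 and stops when i becomes -1.
std::vector<int> visit_down_signed(int n) {
    std::vector<int> visited;
    for (int i = n - 1; i >= 0; --i) {
        visited.push_back(i);
    }
    return visited;
}
```

For n = 100 the unsigned version visits 99 indices (99 down to 1) and the signed version visits all 100.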

Regards,

Werner
 
