Unsigned types are DANGEROUS??

James Kanze · Mar 18, 2011

How does the curriculum go? Do they teach unsigned as a bit-based thing
or as the complementary integer type to signed? Surely without such
instruction, the novice will assume unsigned is "the set of positive
numbers that can be represented with the given width".

What about char? char is a distinct type *and* it is up to the
implementation as to whether char is signed or unsigned yet char, I am
sure you will agree, is used to represent characters more than for
anything else including values which require bit manipulation.

Agreed. Several coding guidelines I've seen insist that plain
char always be a character (or a part of a character, if say
you're using UTF-8), and that numeric values use signed char,
and that unsigned char be reserved for raw memory or bitmasks
and the like. And the first part isn't without problems when
plain char is signed (which it usually is), and you're using
latin-1 or UTF-8, where the top bit is often set.
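To make the problem with signed plain char concrete, here is a minimal sketch. The helper names `table_index` and `count_alpha` are invented for illustration and come from nobody in the thread. The point is only that a byte with the top bit set (0xE9 is "é" in latin-1) is negative when plain char is signed, so it has to go through unsigned char before being used as an index or passed to the `<cctype>` functions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: maps a character to a lookup-table index.
// Without the cast, a byte such as 0xE9 would become a negative index
// on implementations where plain char is signed.
inline unsigned table_index(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: counts alphabetic bytes according to the current
// C locale. Passing a negative value other than EOF to std::isalpha is
// undefined behaviour, hence the cast to unsigned char.
inline std::size_t count_alpha(const std::string& text) {
    std::size_t n = 0;
    for (std::string::size_type i = 0; i != text.size(); ++i) {
        if (std::isalpha(static_cast<unsigned char>(text[i]))) {
            ++n;
        }
    }
    return n;
}
```

The casts are needed whether or not a given compiler makes plain char signed, which is part of why the "plain char is always a character" guideline is not free of problems.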

James Kanze · Mar 18, 2011

Leigh Johnston wrote:

[...]
Appealing to authority aside, the Standard agrees with my assertion that
an unsigned type can be used to represent non-negative values:

3.9.1/1

"It is implementation-defined whether a char object can hold negative
values."

Not sure I see what you're getting at. Whether char is signed or
unsigned is implementation defined. If it is signed, a char
object can hold negative values. If it is unsigned, a char
object cannot hold negative values.
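A small sketch of how a program can ask which choice its implementation made. The function names are invented for illustration; `std::numeric_limits` is the standard facility.

```cpp
#include <limits>

// True when plain char is a signed type on this implementation.
inline bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

// True when a plain char object can hold a negative value.
inline bool plain_char_can_be_negative() {
    return std::numeric_limits<char>::min() < 0;
}
```

The two functions always agree with each other; which answer they give depends on the compiler and its options.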

James Kanze · Mar 18, 2011

Along with the following of course:

3.9.1/1
"In any particular implementation, a plain char object can take on
either the same values as a signed char or an unsigned char; which one
is implementation-defined."

*Now* I rest my case.

Which is? I have no problem with the passages you're quoting,
but I'm not sure what you're trying to argue in quoting them.
If it's that plain char can be used to hold numeric values, no
one has ever disputed that: character codes are numeric values.
But in this subthread, we've largely been talking about
programming conventions: what people (and not the compilers)
read into your code, independently of what the standard says
about it.

MikeP · Mar 19, 2011

James said:
The original unsigned weren't modular. At least not in the
specification---both signed and unsigned are modular on most
implementations. (Also, in the original implementations, when
you compared signed with unsigned, the unsigned was converted to
signed. But I don't think the specifications at the time really
made this clear. It was just what the compilers did.)
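For contrast with the early compilers described above, standard C and C++ convert the other way: in a comparison between int and unsigned, the int operand is converted to unsigned. The sketch below spells that conversion out with an explicit cast so that it compiles without sign-comparison warnings. The function names are invented for illustration.

```cpp
// What `s < u` means under the usual arithmetic conversions:
// the signed operand is converted to unsigned first.
inline bool mixed_less(int s, unsigned u) {
    return static_cast<unsigned>(s) < u;
}

// A comparison by mathematical value: any negative int is smaller
// than every unsigned value.
inline bool value_less(int s, unsigned u) {
    return s < 0 || static_cast<unsigned>(s) < u;
}
```

With s = -1 and u = 1, `mixed_less` returns false because -1 converts to the largest unsigned value, while `value_less` returns true.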

The problem is that when the C committee started to standardize,
there was already a lot of existing practice. Often
contradictory. They did what they could, trying to break as few
existing programs as possible, and still make the language as
clean as possible, while still allowing as much performance as
possible. Almost every decision was a compromise.

And do tell, isn't all this stuff driven by the hardware? Doesn't all
this stuff jibe directly with, say, the Intel Architecture Manual? Pretty
much stuck with what is implemented at the hardware level to stay
efficient, yes? Did hardware have carry and overflow flags or the like
way back when? If so, then they didn't fully exploit the possibilities.

MikeP · Mar 19, 2011

James said:
You're misreading my argument. I'm not saying that it's better
because BS does it. I'm saying that BS has a large influence,
and because he does it, a lot of other people do it; it is the
"expected" idiom for most programmers.

(Aside: Apparently the guys below didn't read Stroustrup.)

http://www.viva64.com/en/a/0050/:

"size_t type is usually used for loop counters, array indexing and
address arithmetic."

MikeP · Mar 19, 2011

MikeP said:
(Aside: Apparently the guys below didn't read Stroustrup.)

http://www.viva64.com/en/a/0050/:

"size_t type is usually used for loop counters, array indexing and
address arithmetic."

I posted that too quickly, for they go on to say:

" ptrdiff_t type is usually used for loop counters, array indexing, size
storage and address arithmetic."

MikeP · Mar 19, 2011

James said:

James Kanze wrote:

[...]
Well there's more than that, such as loop counter rollover.
If you insist on using a for loop. When working down, I'd
normally write:
    int n = top;
    while (n > 0) {
        --n;
        // ...
    }
Which, of course, works equally well with unsigned.
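The claim that the loop works equally well with unsigned can be checked with a sketch like the following, which uses `std::size_t` as the counter. The function name `indices_downward` is invented for illustration. The test happens before the decrement, so the counter is never decremented below zero, which is where a `for (i = top - 1; i >= 0; --i)` loop would roll over.

```cpp
#include <cstddef>
#include <vector>

// Visits top-1, top-2, ..., 0 and records each index visited.
// The counter is tested before it is decremented, so it never wraps.
inline std::vector<std::size_t> indices_downward(std::size_t top) {
    std::vector<std::size_t> visited;
    std::size_t n = top;
    while (n > 0) {
        --n;
        visited.push_back(n);
    }
    return visited;
}
```

Calling it with 0 visits nothing, and calling it with 3 visits 2, 1, 0.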

Most of that gets subsumed by iterators now anyway, so even less of a
point is the loop control thing.

And of course, with iterators, you have to write the loop as
above (or use reverse iterators). You can't decrement once
you've encountered begin.
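A minimal sketch of both options for walking a container backwards; the function names are invented for illustration. The first uses the same test-then-decrement shape as the counting loop, so the iterator is never decremented past begin(); the second uses reverse iterators.

```cpp
#include <vector>

// Backwards traversal with ordinary iterators: test first, then decrement.
inline std::vector<int> reversed_copy(const std::vector<int>& in) {
    std::vector<int> out;
    std::vector<int>::const_iterator it = in.end();
    while (it != in.begin()) {
        --it;
        out.push_back(*it);
    }
    return out;
}

// The same result using reverse iterators.
inline std::vector<int> reversed_copy_r(const std::vector<int>& in) {
    return std::vector<int>(in.rbegin(), in.rend());
}
```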

[...]

I don't think it's a provable thing, no matter who says it.
Programmers will have to think for themselves on this one.

Except that in the absence of any killer argument, you have to
go with what the majority of programmers expect. In fact, even
when there is a strong argument for something else, you may end
up having to go with what the majority of programmers expect.

[...]

No small potatoes, for simplifying maintenance is worth more
than "simplifying" original coding. The semantic correctness
and syntactic richness ("semantical" and "syntactical"?) is
what is compelling to me about unsigned.

The semantic correctness is in the eye of the beholder. You can
argue for any meaning you want, but in the end, the only meaning
that counts is what the reader understands. At least if your
goal is communicating.

How does the curriculum go? Do they teach unsigned as a
bit-based thing or as the complementary integer type to
signed?

They never mention it except in conjunction with the bitwise
operators.

Surely without such instruction, the novice will assume
unsigned is "the set of positive numbers that can be
represented with the given width".

Without any instruction, I suspect that the novice will assume
that unsigned behaves like a cardinal. I.e. that all of the
values of an unsigned will fit in an int (integer), and that
subtraction of two unsigned will result in an int (and is
guaranteed to fit in an int).
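What actually happens differs from that assumption, as the sketch below shows. The function names are invented for illustration, and `signed_difference` assumes that long long is wider than unsigned, which holds on mainstream platforms but is not guaranteed by the standard.

```cpp
#include <limits>

// Unsigned subtraction is done modulo 2^N: the result is unsigned
// and never negative.
inline unsigned difference(unsigned a, unsigned b) {
    return a - b;
}

// One way to get the "cardinal" result the novice expects.
// Assumption: long long can represent every unsigned value.
inline long long signed_difference(unsigned a, unsigned b) {
    return static_cast<long long>(a) - static_cast<long long>(b);
}
```

Here `difference(2, 3)` yields the largest unsigned value rather than -1, while `signed_difference(2, 3)` yields -1.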

Of course, without any instruction, the novice will assume that
division of two integers results in a rational number. Or at
least its closest approximation, a float or a double.

And even more "of course": without any instruction, a novice
won't even know that unsigned exists. What he thinks of
unsigned is 100% conditioned by his instruction.

And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes?

MikeP · Mar 19, 2011

Leigh said:
Either size_t or ptrdiff_t is fine; the difference is that one is
unsigned and one is signed.

Well that's obviously why I felt the need to retract the preceding post.
No biggie, it's the basis of the whole thread!

James Kanze · Mar 20, 2011

Which is that in C++ unsigned integral types are not just for performing
bit manipulation and modular arithmetic; they can also be used to
represent non-negative values.

I fail to see any relationship. (Also, of course: it's obvious
that unsigned types can be used to represent non-negative
values. So can signed types, and so can floating point types.
I fail to see where that leads us anywhere, however.)

James Kanze · Mar 20, 2011

On 19/03/2011 00:14, James Kanze wrote:
[...]

Agreed. Several coding guidelines I've seen insist that plain
char always be a character (or a part of a character, if say
you're using UTF-8), and that numeric values use signed char,
and that unsigned char be reserved for raw memory or bitmasks
and the like. And the first part isn't without problems when
plain char is signed (which it usually is), and you're using
latin-1 or UTF-8, where the top bit is often set.

Agreed? Then you have to accept that unsigned integral types are used
for more than just bit manipulation and modular arithmetic; they can
also be used to represent non-negative values.

So can floating point values. All numeric types can be used to
represent non-negative values. So what's your point?

James Kanze · Mar 20, 2011

James Kanze wrote:
[...]

And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes?

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise. The burden of proof is on those
who want to do otherwise.

James Kanze · Mar 20, 2011

MikeP said:

James Kanze wrote:

[...]

http://www.viva64.com/en/a/0050/:
"size_t type is usually used for loop counters, array indexing and
address arithmetic."

I posted that too quickly, for they go on to say:

" ptrdiff_t type is usually used for loop counters, array
indexing, size storage and address arithmetic."

So they contradict themselves? I get a 404 when I try to
follow your link, so I can't say. But one web site hardly makes
a consensus.

MikeP · Mar 22, 2011

James said:
James Kanze wrote:
[...]

And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off
of the cliff and sheep have been following that precedent ever
since, then that should be the overriding basis for decision on what
action to take going forward, rather than the alternative, which is
to think about it and then make a decision. Yes?

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise. The burden of proof is on those
who want to do otherwise.

There is no burden however unless one is trying to convince someone that
their way is somehow better. You seem to think that there is a better way
that all should follow as a rule, whereas some may scoff at jumping off
of that cliff no matter how long you try to lecture them to take that
leap of faith.

MikeP · Mar 22, 2011

James said:
MikeP said:

James Kanze wrote:

[...]

http://www.viva64.com/en/a/0050/:
"size_t type is usually used for loop counters, array indexing and
address arithmetic."

If there were really strong arguments for something else, then do
so. But the burden of proof is on the other side: most
programmers will have their expectations set by BS. Similar
reasoning has made me drop my insistence on not using .h for C++
headers, for example. In that case, there are fairly strong
technical arguments against it. But not enough to justify bucking
the expectations of the everyday programmer.

I posted that too quickly, for they go on to say:

" ptrdiff_t type is usually used for loop counters, array
indexing, size storage and address arithmetic."

So they contradict themselves? I get a 404 when I try to
follow your link, so I can't say. But one web site hardly makes
a consensus.

I think they were just stating facts: some do it one way, others do it
the other way. They take no stance on either being inherently more
correct than the other. The link above works for me. Don't type or copy
the colon on the end. It's from the PVS-Studio website.

Noah Roberts · Mar 22, 2011

James Kanze wrote:
[...]

And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes?

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.

I don't really believe this. We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation). I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:

1) it clearly documents that the function is expected to only accept
positive numbers.

2) it forces the issue and thus doesn't need to be validated (leveraging
the compiler for such is always a better choice). With full warnings
turned on, error on warning, etc... it's not possible to do the wrong
thing except on purpose.

In my opinion, you really need to have all conversion warnings on when
using C++ anyway.
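The role of the warnings in point 2 can be seen in a sketch like this one. `reserve_slots` is a hypothetical interface invented for illustration. The cast is written out so that the example compiles cleanly under strict warning settings. Without it, the same conversion happens implicitly, and the language itself does not reject the call; it is an option such as `-Wsign-conversion` in GCC or Clang that reports it.

```cpp
#include <limits>

// Hypothetical interface: the unsigned parameter documents that a
// count cannot be negative.
inline unsigned reserve_slots(unsigned count) {
    return count;  // stands in for real work; returns what it received
}

// A caller that computed a negative count by mistake. Converting -1 to
// unsigned is well defined and yields the largest unsigned value, so
// the callee cannot detect the error by itself.
inline unsigned call_with_negative() {
    int requested = -1;
    return reserve_slots(static_cast<unsigned>(requested));
}
```

So the guarantee comes from the combination of the unsigned parameter and the conversion warnings treated as errors, not from the type alone.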

Furthermore, even if you are right...consensus among experts is actually
not as important as consensus within a project and within code.
Obviously, if you're working for Google (which started all this), then
you'll use signed numbers for everything. On the other hand, there's a
huge number of people and projects that do otherwise and this includes
the standard library. Frankly, the fact that the standard library does
a thing is more important than the fact that some "expert" disagrees.

Clearly one is more inclined to see consensus with one's own opinion
than otherwise, but I simply have not seen enough evidence for me to
agree that wide, general convention agrees with your position.

Gerhard Fiedler · Mar 22, 2011

MikeP said:
James said:

MikeP wrote:
James Kanze wrote:
[...]

http://www.viva64.com/en/a/0050/:

"size_t type is usually used for loop counters, array indexing and
address arithmetic."

If there were really strong arguments for something else, then do
so. But the burden of proof is on the other side: most
programmers will have their expectations set by BS. Similar
reasoning has made me drop my insistence on not using .h for C++
headers, for example. In that case, there are fairly strong
technical arguments against it. But not enough to justify bucking
the expectations of the everyday programmer.

I posted that too quickly, for they go on to say:

" ptrdiff_t type is usually used for loop counters, array indexing,
size storage and address arithmetic."

So they contradict themselves? I get a 404 when I try to follow
your link, so I can't say. But one web site hardly makes a
consensus.

I think they were just stating facts: some do it one way, others do
it the other way. They take no stance on either being inherently more
correct than the other. The link above works for me. Don't type or
copy the colon on the end. It's from the PVS-Studio website.

AIUI they don't talk about signed vs unsigned; they talk about what
types to use to store pointers in (intptr_t or ptrdiff_t rather than
int, uintptr_t or size_t rather than unsigned int). (unsigned) int is
not guaranteed to be large enough to hold a pointer; that's (IMO) the
issue of this article.
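A minimal sketch of the distinction, with function names invented for illustration. `std::uintptr_t` comes from `<cstdint>` (C++11) and is formally optional, though mainstream implementations provide it.

```cpp
#include <cstdint>

// A pointer converted to std::uintptr_t and back compares equal to
// the original pointer.
inline bool round_trips_through_uintptr(const int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<const int*>(bits) == p;
}

// Whether unsigned int happens to be wide enough on this platform.
// Typically false on 64-bit systems where int is 32 bits.
inline bool unsigned_int_can_hold_pointer() {
    return sizeof(unsigned) >= sizeof(void*);
}
```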

Gerhard

Michael Doubez · Mar 22, 2011

[...]

And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes?

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.

I don't really believe this. We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation). I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:

1) it clearly documents that the function is expected to only accept
positive numbers.

And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).

IMO a good example of harmful unsigned logic is strtoul() which
returns a negated unsigned for negative number representation in
input.
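The strtoul() behaviour is easy to reproduce. The wrapper name `parse_unsigned` is invented for illustration; the call itself is the standard `std::strtoul`. The C standard specifies that when the input begins with a minus sign, the converted value is negated in the return type, so no error is reported.

```cpp
#include <climits>
#include <cstdlib>

// Thin wrapper around std::strtoul, base 10, ignoring the end pointer.
inline unsigned long parse_unsigned(const char* text) {
    return std::strtoul(text, 0, 10);
}
```

With this wrapper, `parse_unsigned("-1")` returns ULONG_MAX: the input is accepted and 1 is negated modulo 2^N, rather than being rejected as out of range.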

Noah Roberts · Mar 22, 2011

James Kanze wrote:

[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes?

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.

I don't really believe this. We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation). I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:

1) it clearly documents that the function is expected to only accept
positive numbers.

And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).

A function to which it is nonsensical to have a negative value suddenly
changing so that it makes sense is a severe enough semantic/design
change that it warrants the forcing of all clients to change. Switching
meaning like that should cause significant warnings and errors or the
interface to your component is simply not expressive enough.

We don't use void* as our parameters just because we might, someday down
the road, wish to pass a different type to the function. Anyone that
did so would rightly be heckled into total shame (so long as their name
is not Paul in which case they're not capable of it). Why then would
anyone expect that changing the meaning of functions and their
parameters so that a different range of values is acceptable should be
any different?

Michael Doubez · Mar 23, 2011

On 3/20/2011 7:42 AM, James Kanze wrote:

James Kanze wrote:
[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes?

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.

I don't really believe this. We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation). I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:

1) it clearly documents that the function is expected to only accept
positive numbers.

And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).

There is nothing horrible about the std::string::npos "notation"; it is
perfectly fine.

Except that it clutters the code and cannot be used as a start
position: you have to test against it along the whole chain for simple
parsing.
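A sketch of the kind of chain meant here, for a line of the form "key=value;". The function `value_of` is invented for illustration. Each `find` result has to be checked before it is used, because `std::string::npos + 1` wraps around to 0 and would silently restart the next search from the beginning of the string.

```cpp
#include <string>

// Hypothetical parser: returns the text between '=' and the next ';'
// (or the end of the line), or an empty string if there is no '='.
inline std::string value_of(const std::string& line) {
    std::string::size_type eq = line.find('=');
    if (eq == std::string::npos) {
        return std::string();      // must test: eq + 1 would wrap to 0
    }
    std::string::size_type end = line.find(';', eq + 1);
    if (end == std::string::npos) {
        end = line.size();         // must test again before computing a length
    }
    return line.substr(eq + 1, end - (eq + 1));
}
```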

There is also the modern approach of using "optional
variables" to represent "invalid" or "unknown" values rather than the
old fashioned approach of using -1.

Barton-Nackman's fallible class has been around for a while but I have
never seen it used as an optional argument.
Even for a variable, it is an overkill; I'd rather use a boolean
unless the construction of the object requires it.
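For readers who have not met the idea, here is a minimal sketch in the spirit of a fallible value. It is not Barton and Nackman's actual class; `Fallible` and `find_digit` as written here are illustrative assumptions, and this version requires T to be default constructible. In effect it packages the boolean mentioned above together with the value.

```cpp
#include <stdexcept>

// Sketch of a "value that may be absent": a validity flag plus a value.
template <typename T>
class Fallible {
public:
    Fallible() : valid_(false), value_() {}
    explicit Fallible(const T& v) : valid_(true), value_(v) {}

    bool is_valid() const { return valid_; }

    // Throws instead of handing back a meaningless value.
    const T& value() const {
        if (!valid_) {
            throw std::logic_error("Fallible: no value");
        }
        return value_;
    }

private:
    bool valid_;
    T value_;
};

// Hypothetical use: position of the first decimal digit, if any.
// No -1 or npos sentinel is needed to report "not found".
inline Fallible<unsigned> find_digit(const char* s) {
    for (unsigned i = 0; s[i] != '\0'; ++i) {
        if (s[i] >= '0' && s[i] <= '9') {
            return Fallible<unsigned>(i);
        }
    }
    return Fallible<unsigned>();
}
```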

However, I am talking about interface change where you cannot afford
to modify every project based on it.

Using unsigned to indicate the parameter must be positive is like
saying you prefer to hire an athlete because they are slimmer. AFAIK
the purpose of unsigned is to have the extended positive range; the
fact that they can represent only positive values is merely a property.

Noah Roberts · Mar 23, 2011

On 3/20/2011 7:42 AM, James Kanze wrote:

James Kanze wrote:

[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes?

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.

I don't really believe this. We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation). I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:

1) it clearly documents that the function is expected to only accept
positive numbers.

And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).

There is nothing horrible about the std::string::npos "notation"; it is
perfectly fine.

Except that it clutters the code and cannot be used as a start
position: you have to test against it along the whole chain for simple
parsing.

Any and all uses of symbols rather than values for constant expressions
"clutters" code. Had a boss once that told me the same thing about
new-style casts.

Barton-Nackman's fallible class has been around for a while but I have
never seen it used as an optional argument.

I use boost::optional on a regular basis. I probably wouldn't for the
suggested case here, but it has proved itself quite useful. You
shouldn't discount it so easily.
