Unsigned types are DANGEROUS??

James Kanze

How does the curriculum go? Do they teach unsigned as a bit-based thing
or as the complementary integer type to signed? Surely without such
instruction, the novice will assume unsigned is "the set of positive
numbers that can be represented with the given width".
What about char? char is a distinct type *and* it is up to the
implementation as to whether char is signed or unsigned yet char, I am
sure you will agree, is used to represent characters more than for
anything else including values which require bit manipulation.

Agreed. Several coding guidelines I've seen insist that plain
char always be a character (or a part of a character, if say
you're using UTF-8), and that numeric values use signed char,
and that unsigned char be reserved for raw memory or bitmasks
and the like. And the first part isn't without problems when
plain char is signed (which it usually is), and you're using
latin-1 or UTF-8, where the top bit is often set.
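To make the signed plain char problem concrete, here is a minimal sketch. The helper names `is_alpha_byte` and `byte_value` are made up for illustration, and the sketch assumes 8-bit bytes. Where plain char is signed, a Latin-1 or UTF-8 byte with the top bit set is negative, so it is converted to unsigned char before being classified or used as a number.

```cpp
#include <cctype>

// Hypothetical helper: classify one byte of a Latin-1 or UTF-8 string.
// Passing a negative plain char straight to std::isalpha is undefined
// behavior, so the byte is converted to unsigned char first.
bool is_alpha_byte(char c)
{
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: the value of a byte as a non-negative int,
// whether plain char is signed or unsigned on this implementation.
int byte_value(char c)
{
    return static_cast<unsigned char>(c);
}
```

With these, `byte_value('\xE9')` is 233 on both kinds of implementation, whereas `int('\xE9')` would be negative wherever plain char is signed.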
 
James Kanze

Leigh Johnston wrote:
[...]
Appealing to authority aside, the Standard agrees with my assertion that
an unsigned type can be used to represent non-negative values:

"It is implementation-defined whether a char object can hold negative
values."

Not sure I see what you're getting at. Whether char is signed or
unsigned is implementation defined. If it is signed, a char
object can hold negative values. If it is unsigned, a char
object cannot hold negative values.
 
James Kanze

Along with the following of course:
3.9.1/1
"In any particular implementation, a plain char object can take on
either the same values as a signed char or an unsigned char; which one
is implementation-defined."
*Now* I rest my case.

Which is? I have no problem with the passages you're quoting,
but I'm not sure what you're trying to argue in quoting them.
If it's that plain char can be used to hold numeric values, no
one has ever disputed that: character codes are numeric values.
But in this subthread, we've largely been talking about
programming conventions: what people (and not the compilers)
read into your code, independently of what the standard says
about it.
 
MikeP

James said:
The original unsigned weren't modular. At least not in the
specification---both signed and unsigned are modular on most
implementations. (Also, in the original implementations, when
you compared signed with unsigned, the unsigned was converted to
signed. But I don't think the specifications at the time really
made this clear. It was just what the compilers did.)


The problem is that when the C committee started to standardize,
there was already a lot of existing practice. Often
contradictory. They did what they could, trying to break as few
existing programs as possible, and still make the language as
clean as possible, while still allowing as much performance as
possible. Almost every decision was a compromise.

And do tell, isn't all this stuff driven by the hardware? Doesn't all
this stuff jibe directly with, say, the Intel Architecture Manual? Pretty
much stuck with what is implemented at the hardware level to stay
efficient, yes? Did hardware have carry and overflow flags or the like
way back when? If so, then they didn't fully exploit the possibilities.
 
MikeP

James said:
You're misreading my argument. I'm not saying that it's better
because BS does it. I'm saying that BS has a large influence,
and because he does it, a lot of other people do it; it is the
"expected" idiom for most programmers.

(Aside: Apparently the guys below didn't read Stroustrup.)

http://www.viva64.com/en/a/0050/:

"size_t type is usually used for loop counters, array indexing and
address arithmetic."
 
MikeP

MikeP said:
(Aside: Apparently the guys below didn't read Stroustrup.)

http://www.viva64.com/en/a/0050/:

"size_t type is usually used for loop counters, array indexing and
address arithmetic."

I posted that too quickly, for they go on to say:

" ptrdiff_t type is usually used for loop counters, array indexing, size
storage and address arithmetic."
 
MikeP

James said:
James said:
James Kanze wrote:
James Kanze wrote:
[...]
Well there's more than that, such as loop counter rollover.
If you insist on using a for loop:). When working down, I'd
normally write:
    int n = top;
    while (n > 0) {
        --n;
        // ...
    }
Which, of course, works equally well with unsigned.
Most of that gets subsumed by iterators now anyway, so even less of a
point is the loop control thing.

And of course, with iterators, you have to write the loop as
above (or use reverse iterators). You can't decrement once
you've encountered begin.
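A minimal sketch of both forms discussed above, with made-up function names: the while loop that tests before decrementing (so it is safe with an unsigned index), and the reverse-iterator version. Neither is claimed to be the poster's actual code.

```cpp
#include <cstddef>
#include <vector>

// Copy v in reverse order using an unsigned index. The test comes before
// the decrement, so n never wraps around below zero.
std::vector<int> reversed_by_index(const std::vector<int>& v)
{
    std::vector<int> out;
    std::size_t n = v.size();
    while (n > 0) {
        --n;
        out.push_back(v[n]);
    }
    return out;
}

// The same traversal with reverse iterators, which avoids decrementing
// an iterator that already equals begin().
std::vector<int> reversed_by_iterator(const std::vector<int>& v)
{
    return std::vector<int>(v.rbegin(), v.rend());
}
```

By contrast, `for (std::size_t i = v.size() - 1; i >= 0; --i)` never terminates, because `i >= 0` is always true for an unsigned type.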

[...]
I don't think it's a provable thing, no matter who says it.
Programmers will have to think for themselves on this one.

Except that in the absence of any killer argument, you have to
go with what the majority of programmers expect. In fact, even
when there is a strong argument for something else, you may end
up having to go with what the majority of programmers expect.

[...]
No small potatoes, for simplifying maintenance is worth more
than "simplifying" original coding. The semantic correctness
and syntactic richness ("semantical" and "syntactical"?) is
what is compelling to me about unsigned.

The semantic correctness is in the eye of the beholder. You can
argue for any meaning you want, but in the end, the only meaning
that counts is what the reader understands. At least if your
goal is communicating.
How does the curriculum go? Do they teach unsigned as a
bit-based thing or as the complementary integer type to
signed?

They never mention it except in conjunction with the bitwise
operators.
Surely without such instruction, the novice will assume
unsigned is "the set of positive numbers that can be
represented with the given width".

Without any instruction, I suspect that the novice will assume
that unsigned behaves like a cardinal. I.e. that all of the
values of an unsigned will fit in an int (integer), and that
subtraction of two unsigned will result in an int (and is
guaranteed to fit in an int).
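For contrast with that expectation, a short sketch of what the language actually does. The function names are invented for the example, and the second function assumes long long is wider than unsigned int, which holds on mainstream platforms but is not guaranteed.

```cpp
#include <limits>

// Difference of two unsigned values: the result type is still unsigned,
// so a mathematically negative result wraps around modulo 2^N.
unsigned int unsigned_difference(unsigned int a, unsigned int b)
{
    return a - b;
}

// Hypothetical alternative: compute the difference in a wider signed type.
// Assumes long long can hold every unsigned int value.
long long signed_difference(unsigned int a, unsigned int b)
{
    return static_cast<long long>(a) - static_cast<long long>(b);
}
```

Here `unsigned_difference(2u, 3u)` yields the largest unsigned int value rather than -1, while `signed_difference(2u, 3u)` yields -1.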

Of course, without any instruction, the novice will assume that
division of two integers results in a rational number. Or at
least its closest approximation, a float or a double.

And even more "of course": without any instruction, a novice
won't even know that unsigned exists. What he thinks of
unsigned is 100% conditioned by his instruction.

And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.

So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes? :)
 
MikeP

Leigh said:
Either size_t or ptrdiff_t is fine; the difference is that one is
unsigned and one is signed.

Well that's obviously why I felt the need to retract the preceding post.
No biggie, it's the basis of the whole thread!
 
James Kanze

Which is that in C++ unsigned integral types are not just for performing
bit manipulation and modular arithmetic; they can also be used to
represent non-negative values.

I fail to see any relationship. (Also, of course: it's obvious
that unsigned types can be used to represent non-negative
values. So can signed types, and so can floating point types.
I fail to see where that leads us anywhere, however.)
 
James Kanze

On 19/03/2011 00:14, James Kanze wrote:
[...]
Agreed. Several coding guidelines I've seen insist that plain
char always be a character (or a part of a character, if say
you're using UTF-8), and that numeric values use signed char,
and that unsigned char be reserved for raw memory or bitmasks
and the like. And the first part isn't without problems when
plain char is signed (which it usually is), and you're using
latin-1 or UTF-8, where the top bit is often set.
Agreed? Then you have to accept that unsigned integral types are used
for more than just bit manipulation and modular arithmetic; they can
also be used to represent non-negative values.

So can floating point values. All numeric types can be used to
represent non-negative values. So what's your point?
 
James Kanze

James Kanze wrote:
[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes? :)

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise. The burden of proof is on those
who want to do otherwise.
 
James Kanze

MikeP said:
James Kanze wrote:
[...]
http://www.viva64.com/en/a/0050/:
"size_t type is usually used for loop counters, array indexing and
address arithmetic."
I posted that too quickly, for they go on to say:
" ptrdiff_t type is usually used for loop counters, array
indexing, size storage and address arithmetic."

So they contradict themselves? I get a 404 when I try to
follow your link, so I can't say. But one web site hardly makes
a consensus.
 
MikeP

James said:
James Kanze wrote:
[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
So, summed up, your position is that IF the first sheep jumped off
of the cliff and sheep have been following that precedent ever
since, then that should be the overriding basis for decision on what
action to take going forward, rather than the alternative, which is
to think about it and then make a decision. Yes? :)

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise. The burden of proof is on those
who want to do otherwise.

There is no burden however unless one is trying to convince someone that
their way is somehow better. You seem to think that there is a better way
that all should follow as a rule, whereas some may scoff at jumping off
of that cliff no matter how long you try to lecture them to take that
leap of faith.
 
MikeP

James said:
MikeP said:
James Kanze wrote:
[...]
http://www.viva64.com/en/a/0050/:
"size_t type is usually used for loop counters, array indexing and
address arithmetic."
If there were really
strong arguments for something else, then do so. But the burden
of proof is on the other side: most programmers will have their
expectations set by BS. Similar reasoning has made me drop my
insistence on not using .h for C++ headers, for example. In that
case, there are fairly strong technical arguments against it.
But not enough to justify bucking the expectations of the
everyday programmer.
I posted that too quickly, for they go on to say:
" ptrdiff_t type is usually used for loop counters, array
indexing, size storage and address arithmetic."

So they contradict themselves? I get a 404 when I try to
follow your link, so I can't say. But one web site hardly makes
a consensus.

I think they were just stating facts: some do it one way, others do it
the other way. They take no stance on either being inherently more
correct than the other. The link above works for me. Don't type or copy
the colon on the end. It's from the PVS-Studio website.
 
Noah Roberts

James Kanze wrote:
[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes? :)

No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.

I don't really believe this. We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation). I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:

1) it clearly documents that the function is expected to only accept
positive numbers.

2) it forces the issue and thus doesn't need to be validated (leveraging
the compiler for such is always a better choice). With full warnings
turned on, error on warning, etc... it's not possible to do the wrong
thing except on purpose.

In my opinion, you really need to have all conversion warnings on when
using C++ anyway.
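A small sketch of why point 2 depends on the warnings rather than on the type alone. The function names are hypothetical. The int to unsigned conversion is well defined (modulo 2^N), so without a diagnostic a negative argument simply arrives as a very large value. With GCC and Clang the relevant flag is -Wsign-conversion, which, as far as I know, is not part of -Wall in GCC; the cast below is written explicitly so the sketch compiles cleanly either way.

```cpp
#include <limits>

// What an int argument turns into when it reaches an unsigned parameter.
// The conversion is well defined: -1 becomes the largest unsigned value.
unsigned int as_count(int n)
{
    return static_cast<unsigned int>(n);
}

// Hypothetical function whose signature "documents" a non-negative count.
// It cannot tell a converted -1 from a genuinely huge request.
unsigned int clamp_count(unsigned int count)
{
    const unsigned int limit = 100u;
    return count > limit ? limit : count;
}
```

So `clamp_count(as_count(-1))` returns 100, exactly as it would for any oversized request; the negative input is only caught if the compiler is asked to flag the conversion.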

Furthermore, even if you are right...consensus among experts is actually
not as important as consensus within a project and within code.
Obviously, if you're working for Google (what started all this) then
you'll use signed numbers for everything. On the other hand, there's a
huge number of people and projects that do otherwise and this includes
the standard library. Frankly, the fact that the standard library does
a thing is more important than the fact that some "expert" disagrees.

Clearly one is more inclined to see consensus with one's own opinion
than otherwise, but I simply have not seen enough evidence for me to
agree that wide, general convention agrees with your position.
 
Gerhard Fiedler

MikeP said:
James said:
MikeP wrote:
James Kanze wrote:
[...]

"size_t type is usually used for loop counters, array indexing and
address arithmetic."
If there were really strong arguments for something else, then do
so. But the burden of proof is on the other side: most
programmers will have their expectations set by BS. Similar
reasoning has made me drop my insistence on not using .h for C++
headers, for example. In that case, there are fairly strong
technical arguments against it. But not enough to justify bucking
the expectations of the everyday programmer.
I posted that too quickly, for they go on to say:
" ptrdiff_t type is usually used for loop counters, array indexing,
size storage and address arithmetic."

So they contradict themselves? I get a 404 when I try to follow
your link, so I can't say. But one web site hardly makes a
consensus.

I think they were just stating facts: some do it one way, others do
it the other way. They take no stance on either being inherently more
correct than the other. The link above works for me. Don't type or
copy the colon on the end. It's from the PVS-Studio website.

AIUI they don't talk about signed vs unsigned; they talk about what
types to use to store pointers in (intptr_t or ptrdiff_t rather than
int, uintptr_t or size_t rather than unsigned int). (unsigned) int is
not guaranteed to be large enough to hold a pointer; that's (IMO) the
issue of this article.
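A hedged sketch of that reading of the article, with made-up function names. It assumes std::uintptr_t is available; the type is optional in the standard but provided by mainstream implementations.

```cpp
#include <cstdint>

// Round-trip a pointer through std::uintptr_t, which (where it exists)
// is wide enough to hold any object pointer value.
bool round_trips(const int* p)
{
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<const int*>(bits) == p;
}

// Reports whether unsigned int happens to be as wide as a pointer on this
// platform. Typically false on 64-bit targets; nothing guarantees true.
bool pointer_fits_in_unsigned_int()
{
    return sizeof(unsigned int) >= sizeof(void*);
}
```

The signed/unsigned choice (intptr_t versus uintptr_t, ptrdiff_t versus size_t) is a separate question from the width question this sketch illustrates.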

Gerhard
 
Michael Doubez

[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes? :)
No.  My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention.  The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.

I don't really believe this.  We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation).  I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:

1) it clearly documents that the function is expected to only accept
positive numbers.

And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).

IMO a good example of harmful unsigned logic is strtoul(), which accepts
a leading minus sign in the input and returns the negated value as an
unsigned long.
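A minimal sketch of that strtoul() behavior. The wrapper names are invented for illustration, and neither wrapper checks for overflow (ERANGE). The second one shows one possible way to reject a leading minus sign up front; it is an assumption about how a caller might guard against it, not something proposed in the thread.

```cpp
#include <cctype>
#include <climits>
#include <cstdlib>

// Hypothetical thin wrapper around std::strtoul, base 10.
unsigned long parse_unsigned(const char* text)
{
    return std::strtoul(text, nullptr, 10);
}

// Hypothetical stricter variant: skip leading whitespace the same way
// strtoul does, then refuse input that starts with a minus sign.
// Returns false if the sign is present or no digits were converted.
bool parse_unsigned_strict(const char* text, unsigned long& out)
{
    const char* p = text;
    while (std::isspace(static_cast<unsigned char>(*p))) {
        ++p;
    }
    if (*p == '-') {
        return false;
    }
    char* end = nullptr;
    out = std::strtoul(p, &end, 10);
    return end != p;
}
```

`parse_unsigned("-1")` returns ULONG_MAX without reporting any error, which is the surprise being described.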
 
Noah Roberts

James Kanze wrote:
[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes? :)
No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.

I don't really believe this. We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation). I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:

1) it clearly documents that the function is expected to only accept
positive numbers.

And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).

A function to which it is nonsensical to have a negative value suddenly
changing so that it makes sense is a severe enough semantic/design
change that it warrants the forcing of all clients to change. Switching
meaning like that should cause significant warnings and errors or the
interface to your component is simply not expressive enough.

We don't use void* as our parameters just because we might, someday down
the road, wish to pass a different type to the function. Anyone that
did so would rightly be heckled into total shame (so long as their name
is not Paul in which case they're not capable of it). Why then would
anyone expect that changing the meaning of functions and their
parameters so that a different range of values is acceptable should be
any different?
 
Michael Doubez

On 3/20/2011 7:42 AM, James Kanze wrote:
James Kanze wrote:
      [...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes? :)
No.  My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention.  The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.
I don't really believe this.  We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation).  I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:
1) it clearly documents that the function is expected to only accept
positive numbers.
And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).

There is nothing horrible about the std::string::npos "notation"; it is
perfectly fine.

Except that it clutters the code and cannot be used as a start
position: you have to test against it along the whole chain for simple
parsing.
There is also the modern approach of using "optional
variables" to represent "invalid" or "unknown" values rather than the
old fashioned approach of using -1.

Barton-Nackman's fallible class has been around for a while but I have
never seen it used as an optional argument.
Even for a variable, it is an overkill; I'd rather use a boolean
unless the construction of the object requires it.

However, I am talking about interface change where you cannot afford
to modify every project based on it.

Using unsigned to indicate the parameter must be positive is like
saying you prefer to hire an athlete because they are slimmer. AFAIK
the purpose of unsigned is to have the extended positive range; the
fact that they can represent only positive values is merely a property.
 
Noah Roberts

On 3/20/2011 7:42 AM, James Kanze wrote:
James Kanze wrote:
[...]
And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
So, summed up, your position is that IF the first sheep jumped off of the
cliff and sheep have been following that precedent ever since, then that
should be the overriding basis for decision on what action to take going
forward, rather than the alternative, which is to think about it and then
make a decision. Yes? :)
No. My point is that if there is a generally accepted
convention, and there are no good arguments against it,
readability favors going along with the convention. The most
widespread general convention is to use int, unless there is a
good reason to do otherwise.
I don't really believe this. We've seen two well known experts agreeing
with you, but that's it, and Meyers doesn't actually do C++ development
(except for his experimental stuff when he's trying to figure things out
or writing a presentation). I've personally seen a lot more application
developers, people using C++ to make products (whom you call "experts")
who believe that 'unsigned int' offers two clear advantages when it's a
reasonable option:
1) it clearly documents that the function is expected to only accept
positive numbers.
And at the next iteration, you may need a singular value; keeping the
parameter signed allows you to use negative values (and you avoid the
horrible std::string::npos notation).

There is nothing horrible about the std::string::npos "notation"; it is
perfectly fine.

Except that it clutters the code and cannot be used as a start
position: you have to test against it along the whole chain for simple
parsing.

Any and all uses of symbols rather than values for constant expressions
"clutters" code. Had a boss once that told me the same thing about
new-style casts.
Barton-Nackman's fallible class has been around for a while but I have
never seen it used as an optional argument.

I use boost::optional on a regular basis. I probably wouldn't for the
suggested case here, but it has proved itself quite useful. You
shouldn't discount it so easily.
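A rough sketch of the optional approach applied to the npos case. It uses std::optional from C++17 as a stand-in for boost::optional, which is an assumption made here for the sake of a self-contained example, and `find_char` is a made-up name.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical wrapper: report "not found" through the return type
// instead of through the sentinel value std::string::npos.
std::optional<std::size_t> find_char(const std::string& s, char c)
{
    const std::size_t pos = s.find(c);
    if (pos == std::string::npos) {
        return std::nullopt;
    }
    return pos;
}
```

A caller then writes `if (auto pos = find_char(line, '=')) { /* use *pos */ }`. Whether that is less cluttered than testing against npos is exactly the matter of taste being argued here.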
 
