Unsigned types are DANGEROUS??

Noah Roberts · Mar 16, 2011

No, it's just that nobody else can be bothered arguing with you about this
again, because we already argued about it a few weeks ago, and most of
the experts disagreed with you.

Just to make it clear, I think you're wrong. I'm certainly not trying
to agree with Leigh because he's annoying or whatever. I've been away
from the group too long to be familiar with either of you. It's simply
the fact that you're demonstrably wrong in almost everything you've said
in this thread to date, and that wrongness has been demonstrated.

It's unfortunate that you can't recognize the truth here. I can see why
you might want to think that we're all disagreeing with you just to make
your opponent happy but you really need to get past your wishes and face
reality.

Noah Roberts · Mar 16, 2011

Like i really give a fuc what I am typiong to you? I just batter the
keypad and deny anything you say now because I dislike you sso much.
You are an idiot and not even a very nice idiot.

*plonk*

SG · Mar 16, 2011

*plonk*

Thanks for pointing this out. Especially the "...deny anything you say
now because I dislike you..." part sounds real mature.

SG

ghartshaw · Mar 16, 2011

<prune>
Playing word games can work both ways, example:
int x;
The actual integer object is just a piece of memory, x is just an
identifier. So are we wrong to say x is an integer? Likewise:
int* arr = new int[16];
The actual array object is just a piece of memory, arr is just an
identifier. So are we wrong to say arr is an array?
int x; /* x is an integer */
int * p = &x; /* p is a pointer to an integer */
int a[3]; /* a is an array of 3 integers */
int * pa = a; /* pa is a pointer to an integer */
int * dx = new int(); /* dx is a pointer to an integer */
int * da = new int[3]; /* da is a pointer to an integer */
Because 'pa' and 'da' both point to the first element of an array, it is
well defined to do pointer arithmetic on them as long as you stay within
the bounds of the underlying array (e.g. 'pa[1]', which is equivalent to
'*(pa + 1)'). With 'a', 'a[1]' is again equivalent to '*(a + 1)'. However,
since a is actually an array, a is implicitly converted to type int*
before operator+ can be applied, so it is equivalent to '*((int*)a + 1)'.
Basically what you are saying is that... yes, it's well defined to create a
non-zero-based array in C++?
Look at this :
int* arr = new int[16];
++arr;
arr[-1] = 10;

int * arr; /* arr has type 'int *' (i.e. pointer to int) */
arr = new int[16]; /* arr points to the first element of an array */
++arr; /* arr points to the second element of the array */
arr[-1] = 10; /* same as *(arr - 1) */

The important thing to note here is that arr does not suddenly become an
array when we assign 'new int[16]' to it. It remains a pointer only (e.g. I
could assign the address of a normal integer to it. This would not be
smart without saving the original value, but nonetheless can be done).
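
One way to check this claim is to ask the compiler what it thinks the two declarations are. The following is only a sketch, assuming a C++11 compiler; `element_count` and `pointer_and_array_are_distinct_types` are hypothetical helper names, not anything from the posts above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: number of elements in a true array object.
// It only binds to arrays; passing an int* does not compile.
template <typename T, std::size_t N>
constexpr std::size_t element_count(T (&)[N]) { return N; }

// Returns true when the run-time checks below hold; the static_asserts
// are verified at compile time.
inline bool pointer_and_array_are_distinct_types() {
    int a[16] = {};
    int* arr = new int[16]();   // arr is a pointer; the array has no name

    static_assert(std::is_array<decltype(a)>::value, "a is an array");
    static_assert(std::is_pointer<decltype(arr)>::value, "arr is a pointer");
    static_assert(!std::is_array<decltype(arr)>::value, "arr is not an array");
    static_assert(sizeof(a) == 16 * sizeof(int), "sizeof sees the whole array");

    int x = 0;
    int* saved = arr;           // keep the original value so we can delete[]
    arr = &x;                   // legal: arr can just as well point at a plain int
    arr[0] = 1;                 // same as *arr = 1
    const bool ok = (x == 1) && (element_count(a) == 16);
    assert(ok);

    delete[] saved;
    return ok;
}
```

Calling `element_count(arr)` with the pointer is rejected by the compiler, which is the same distinction made in prose above.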

I have not (and I don't think anyone else has) said that the argument to
operator[] (E2) must be nonnegative, only that it must stay within the
bounds of the underlying array object. What we have said is that if E1 *is
an array* then it is undefined behavior if you use a negative index. (For
if E1 is an array (not a pointer), any negative value will be out of
bounds (and therefore undefined behavior.))

Yes what you are saying is that with a static array declared like so:
int arr[16];
...cannot have a negative array index because these arrays are always
zero-based. I agree (within the bounds of standard C++).
Good.

But I disagree with you saying E1 is not an array like so:
int* E1 = new int[16];
++E1;
E1[-1]= 6;

In the expression E1[-1].... E1 is used as an array. The name "E1" is the
name for both the pointer and the array.

E1 is of type 'int *' (a pointer). The expression 'new int[16]'
dynamically creates an (unnamed) array and returns a pointer to the first
element. When this pointer is assigned to E1, E1 does not suddenly become
an array (it remains a pointer). The name "E1" identifies only the
pointer. The array has no name.

The expression '++E1;' changes the value of the pointer (the address
that it points to). It is equivalent to the pointer arithmetic
'E1 = E1 + 1;'.

In the expression 'E1[-1] = 6;', E1 is used as a pointer (as that is
what it is). It is equivalent to '*(E1 - 1) = 6;'. This means that the
compiler subtracts 1 from the pointer, then dereferences the new pointer
value, and assigns 6 to whatever is pointed to (which happens to be the
first element of an array).

int arr[16];
arr[2] = 6;

In the expression 'arr[2] = 6;', arr is used as a pointer. This is
equivalent to '*(arr + 2)', but as operator+ is not defined for arrays
but is for pointers, and an array can be converted to a pointer to its
first element, it is further transformed to '*((int *)arr + 2)' (i.e. arr
is first converted to a pointer, and then pointer arithmetic is done on
the result).
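
The conversion described here can be observed directly. This is a small sketch, assuming C++11 for `decltype` and `static_assert`; the function name `subscript_is_pointer_arithmetic` is made up for the illustration.

```cpp
#include <cassert>
#include <type_traits>

inline bool subscript_is_pointer_arithmetic() {
    int arr[16] = {};
    arr[2] = 6;

    // arr + 2 forces the array-to-pointer conversion; the result is an int*.
    static_assert(std::is_same<decltype(arr + 2), int*>::value,
                  "the array operand decays to a pointer before operator+");

    int* first = arr;                 // implicit conversion, no cast needed
    const bool same = *(arr + 2) == 6          // what arr[2] means
                   && first[2] == 6            // same element through a pointer
                   && &arr[2] == first + 2;    // same address either way
    assert(same);
    return same;
}
```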

That is why the standard says that *because of the conversion rules
defined for operator+*, if E1 *is an array* E1[E2] refers to the E2th
element of E1, i.e. because (int *)E1 returns a pointer referring to the
first (1-based) element of E1, doing pointer arithmetic on this value
with a (positive) integer E2 (less than the length of the array) is well
defined behavior and refers to the E2th (0-based) (or (E2 + 1)th
(1-based) if you prefer thinking that way) element of E1.

The text from the standard seems to have been written to explain the
behaviour of both dynamic and static arrays. Please don't try to interpret
it to mean something along the lines of ..E1 is an array and not a pointer.
The standard also states someplace that array indexing with static arrays
is identical to using pointers, so its intended meaning is obvious.

int arr[4];
arr[1] = 6;

int * ptr = arr;
ptr[1] = 6;

Yes, the standard states that 'arr[1]' is the same as
'int * ptr = arr; ptr[1]'.
Yes, this explains the behavior of both static and dynamic arrays (see
below).
No, this does not mean that a pointer to an element of an array (even
to the first element) is itself an array. It remains a pointer.

The name/identifier in the expression arr[-1] is the name of both a pointer
and an array. The expression is converted to *((arr)+(-1)).
That is, in the context of E1[E2], E2 is the -1th member of arr.
Normal people would probably prefer to think that -1, in this case, refers
to the 1st element.
This is the only reasonable way to interpret array indexing as defined in
the C++ standard. To suggest that array indices can never be negative just
seems like a foolish misconception through misinterpretations of the
standards.

The following code is well-defined. There are two integers (one static
and one dynamic), two integer arrays of length 16 (one static and one
dynamic), and four pointers to integers (all static).

Even though the integer and integer array defined statically have names
('i' and 'arr' respectively), they are not known in main.cpp, and we can
only refer to them indirectly through their addresses (returned by the
functions 'get_integer()' and 'get_array()'). The dynamic objects do not
have names of their own, so we can only refer to them indirectly through
their addresses (returned by 'new int()' and 'new int[16]').

The expressions 'ptrn[0]' (with n being one of 1, 2, 3, or 4) are
transformed into '*(ptrn + 0)' which is the same as *ptrn. I hope that
you can see that ptr1 and ptr2 are no more arrays than ptr3 and ptr4
are even though they all are used as operands to operator[].

/* foo.h */
int * get_array();
int * get_integer();

/* foo.cpp */
#include "foo.h"

int arr[16];
int i;

int * get_array() {
    return arr; /* returns the address of the first element of an array */
}
int * get_integer() {
    return &i; /* returns the address of an integer */
}

/* main.cpp */
#include "foo.h"

int main() {
    int * ptr1, * ptr2, * ptr3, * ptr4;
    ptr1 = get_array();   /* first element of the static array */
    ptr2 = new int[16];   /* first element of an unnamed dynamic array */
    ptr3 = get_integer(); /* the static integer */
    ptr4 = new int();     /* an unnamed dynamic integer */

    ptr1[0] = 10;
    ptr2[0] = 10;
    ptr3[0] = 10;
    ptr4[0] = 10;

    delete[] ptr2;
    delete ptr4;
}

Gerhard Fiedler · Mar 16, 2011

MikeP said:
You were doing great until that last "go with the crowd" statement!
(Unless you meant within an existing project rather than a new one,
where either alternative can be chosen. i.e., Consistency is of
course important). Also, I'm not sure whether you consider richness
of semantics a technical thing: using signed, you have one thing
while using unsigned you have "divided and conquered".

For me, one of the most important differences is that range overflow
doesn't exist in unsigned types; arithmetic on them is defined to be
modulo arithmetic. With signed types, range overflow is UB.

Therefore, a (standard-conformant) compiler may produce code to catch
range overflow with signed types but not with unsigned types. And some
actually do. IMO this helps create more robust applications.

To me, this characteristic of unsigned types makes it pretty obvious
that they were designed for modulo arithmetic -- which has its places
and uses, but in most cases where unsigned types are used is not what is
intended. And most programmers simply forget that signed and unsigned
types are different in this (IMO important) aspect.
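
To make the difference concrete, here is a minimal sketch. The function names are invented for the illustration, and the signed case deliberately checks before adding, because evaluating the overflowing expression is already undefined behaviour.

```cpp
#include <cassert>
#include <limits>

// Unsigned arithmetic is defined modulo 2^N, so these can never overflow;
// they simply wrap around.
inline unsigned wrap_increment(unsigned value) { return value + 1u; }
inline unsigned wrap_decrement(unsigned value) { return value - 1u; }

// For int the question has to be asked *before* the addition.
inline bool increment_would_overflow(int value) {
    return value == std::numeric_limits<int>::max();
}

// Hypothetical checked variant: the assertion documents the precondition.
inline int checked_increment(int value) {
    assert(!increment_would_overflow(value));
    return value + 1;
}
```

With these, `wrap_decrement(0u)` yields the largest `unsigned` value and `wrap_increment` of that value yields 0, both fully defined.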

When James sounded as if it was a compromise to use unsigned, it's
probably because it was. In many architectures there is no signed type
that is able to enumerate the full address space with its positive
range, so an unsigned type has to be used where this is required. (Focus
on "has to be used", which already sounds like "can't do better, even if
I wanted to".)

Gerhard

MikeP · Mar 16, 2011

puppi said:
If you investigate the tcmalloc code (by Google), you will find the
following warning:

// NOTE: unsigned types are DANGEROUS in loops and other arithmetical
// places. Use the signed types unless your variable represents a bit
// pattern (eg a hash value) or you really need the extra bit. Do NOT
// use 'unsigned' to express "this value should always be positive";
// use assertions for this.

Is it just their idiom? What's the problem with using unsigned ints
in loops (it seems natural to do so)? Are C++ unsigned ints "broken"
somehow?

Unsigned integers are dangerous in DESCENDING loops due to underflow.
For instance, consider the following function, which takes an integer
array as its argument and modifies it in the following way: if n is
the number of elements in the parameter array A, then the output
A[i], for 0<=i<n, equals A[i]+A[i+1]+A[i+2]+...+A[n-1] in terms of the
original values. Now consider this implementation:

void sum_up(int* A, int n)
{
    for(unsigned int num = n-2; num >= 0; num--)
    {
        A[num] += A[num+1];
    }
}

That's buggy. What we intended it to do was executing from num = n-2
to 0, and then halting. But since num is unsigned, when num == 0 the
instruction num-- causes it to underflow into a very large value, and
thus num >= 0 remains true, and the loop body executes again (and
quite likely causes a segmentation fault, when A is indexed with such
a large figure). I like unsigned integers, but I always have to be
careful. I've written faulty loops such as this one more times than I
can remember...

The above is an example of bad coding, not of why unsigned is bad for
loop counters. Yes, if you want a bit more leeway to code sloppily, then
signed will give you that leeway.
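
For comparison, both positions can be written down as working code. This is a sketch only: `sum_up_unsigned` and `sum_up_signed` are names made up here, and the unsigned version uses the common test-then-decrement idiom so that the counter is never compared after it has wrapped.

```cpp
#include <cassert>
#include <cstddef>

// Suffix sums in place: A[i] becomes A[i] + A[i+1] + ... + A[n-1].
// Unsigned counter: the condition tests num and then decrements it,
// so the body sees n-2, n-3, ..., 0 and the loop ends when num was 0.
inline void sum_up_unsigned(int* A, std::size_t n) {
    if (n < 2) return;                        // nothing to accumulate
    for (std::size_t num = n - 1; num-- > 0; ) {
        A[num] += A[num + 1];
    }
}

// Signed counter: here num >= 0 is a meaningful test.
inline void sum_up_signed(int* A, int n) {
    assert(n >= 0);
    for (int num = n - 2; num >= 0; --num) {
        A[num] += A[num + 1];
    }
}
```

Either version turns {1, 2, 3, 4} into {10, 9, 7, 4}. The unsigned one needs the explicit `n < 2` guard, because `n - 1` itself wraps when n is 0.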

MikeP · Mar 16, 2011

Gerhard said:
When James sounded as if it was a compromise to use unsigned, it's
probably because it was. In many architectures there is no signed type
that is able to enumerate the full address space with its positive
range, so an unsigned type has to be used where this is required.
(Focus on "has to be used", which already sounds like "can't do
better, even if I wanted to".)

Did you just say (e.g.): "The word size is X-bits. The maximum address is
the maximum UNSIGNED integer that will fit in the word size. The maximum
SIGNED integer that can fit in the word size is half of the UNSIGNED max
(duh). Hence UNSIGNED is a compromise"?

MikeP · Mar 16, 2011

Gerhard said:
For me, one of the most important differences is that range overflow
doesn't exist in unsigned types; arithmetic on them is defined to be
modulo arithmetic. With signed types, range overflow is UB.

Therefore, a (standard-conformant) compiler may produce code to catch
range overflow with signed types but not with unsigned types. And some
actually do. IMO this helps create more robust applications.

To me, this characteristic of unsigned types makes it pretty obvious
that they were designed for modulo arithmetic -- which has its places
and uses, but in most cases where unsigned types are used is not what
is intended.

Is the real issue then not of unsigned integers, per se, but only of how
they are implemented (modular behavior)? Now THIS is exactly what I was
trying to get at. Now I have to ask everyone to give their examples of
using this "great" feature (modular unsigned integers) and how useful it
is or how rarely it is used.

MikeP · Mar 16, 2011

Gerhard said:
For me, one of the most important differences is that range overflow
doesn't exist in unsigned types; arithmetic on them is defined to be
modulo arithmetic. With signed types, range overflow is UB.

Neither seems acceptable. It seems like a sparse (half-baked) solution.

(After all the banter, I'm finally starting to see an answer to it all).

Johannes Schaub (litb) · Mar 16, 2011

Leigh said:
That is your opinion; an exact opposite of your opinion can also be
found in that same stackoverflow.com thread (topmost opinion) and it has
more votes than yours.

Eschewing unsigned integral types when programming in C++ is irrational
at worst; n00bish at best as the std::size_t C++ horse bolted long ago.

I'm not "eschewing" unsigned integer types. I'm using them for the things
where they make sense.

That includes bitops and excludes anything like "age", "size" etc.

Johannes Schaub (litb) · Mar 16, 2011

Leigh said:
Not for "size"? That makes no sense: std::size_t is used for sizes and
that is an unsigned integral type; the return type of std::array::size()
is a typedef of std::size_t.

argumentum ad verecundiam.

Peter Remmers · Mar 17, 2011

On 16.03.2011 13:57, Paul wrote:

This is just another example of word games. The identifier is arr in the
following:
int* arr = new int[16];

Yes, it's a pointer, but it's a pointer to an array. And, as it's the only
identifier we have for the array, it's the array's identifier.
An entity without an identifier is not a usable entity.

Of course it's usable. You do have indirect access to it via a pointer to
its members. You don't need an identifier for the array itself to use it.

Of course you need an identifier to use it. Show me how to use an array
without an identifier...

I await your response

You want me to repeat that sentence? Because it's the answer. Here we go:

You do have indirect access to it via a pointer to its members. You
don't need an identifier for the array itself to use it.

Ah, you don't understand that; why not, hmm? Not much more I can say if you
don't understand that. It's pretty simple really: 1 name refers to 2
entities.

The basic property of a name is that it uniquely identifies one entity.
If a name is to refer to more than one entity, its purpose is defeated.
As has been pointed out, the only exception in C++ is overloaded
functions, because they have an additional means of distinction (their
parameter types).

Exactly. The pointer is the only means to access the array; what don't you
understand about that?

I perfectly understand that the pointer is the only means to access the
array.
I don't understand how this automatically implies that the pointer's
identifier must then automatically also be the identifier of the array.
The array has no name and it does not need a name to be usable because
you have a pointer to it that gives you an indirect access to the array.

When a pointer points to a dynamic array, the pointer is a means of
accessing the array; the array itself is the bigger entity.
You seem to think the pointer and array are unrelated. It's not a pointer in
its own right, it's an array pointer because of the way it is used.

Of course it's a pointer in its own right, how can it not be? The *only*
connection to the array is that it happens to point to one of the
array's elements.

int x=5;

Here, x *is not* the value 5, it is a variable that happens to hold the
value 5.

An identifier and a name mean the same thing to me, but more specifically an
identifier is something that *****drumroll***** identifies.

int *arr=new int[10];

And what does "arr" identify? Hint: It does not identify an array.
You seem to carefully stay away from my picture analogies. Maybe they
show how ridiculous it is to think of a pointer as the identifier of a
dynamic array...

By "it" you mean the array.
Do you know the baby game where you put your hands on your eyes and
pretend that no one can see you now because if you don't see them, they
can't see you?
Poor little dynamic array. I don't have a name. No one can see me. Oh
wait. There's a pointer named "arr". And now it's pointing at me! That
means they can see me now. That must mean I have a name now! And that
name must be "arr"!!! Yay me!!

I see. A dynamic array steals the name of some pointer that points to
it. Now I get it. You will be assimilated. Resistance is futile.

Hang on, is there any code here?

I SEE NO CODE

I SEE NO CODE

I SEE NO CODE.

You cannot do it.

void foo(int *blah);

foo(new int[10]); // no identifier here

Example?

C++ arrays are *always* zero-based. If you access a one-dimensional array
directly via the array's identifier, then you can't use negative indices
without UB. I leave the case of multi-dimensional arrays open to
discussion. The only time you can use negative indices is with pointers or
with classes that implement operator[].

Nonsense, in C++ arrays can be non-zero-based. You fail to understand that
dynamic arrays are arrays too.

1. How do you create an array with an index range of, say, -5..15?
2. How would you do that with a dynamic array?

As I said....

int* arr= new int[16];

Step one: create a pointer.
Step two: have it point somewhere inside an array, away from index zero,
creating a new logical origin.

arr[-1]=64;

Step three: happily use negative offsets!! Yay!

But sensei! This is not a C++ array that has negative indices?!? It's
just pointer arithmetic!

arr[-1] is guaranteed to yield the first element of the array. It's not
UB, it's completely defined.

Right.

Yes, I know it's right. I argued with bigger people than you about this 20
years ago, and I got it right then.

Oh you are so wise, sensei. Please teach me the secret of the
nonzero-based C++ array!

No, it's free.

I always thought it wasn't. That must have changed while I wasn't looking...

The C++ language is defined in such a way that it has the flexibility to do
this; if not, I think its userbase would be a fraction of its size. Can you
imagine what a shitty language it would be if we couldn't even create a
non-zero-based array? I would drop it like a brick.

You can only emulate non-zero based arrays via pointers or operator[].
It's not like ada or pascal where you can declare your array with start
index and end index. C++ only allows you to specify a number of elements,
because that's all that is needed, because the starting index is always
implicitly at zero.
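
As an illustration of the operator[] route, here is a rough sketch of a wrapper that answers the earlier "-5..15" question. `OffsetArray` is a hypothetical class invented for this example, not a standard type; the storage underneath is still an ordinary zero-based container.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: storage is zero-based, operator[] shifts the
// caller's index by the chosen lower bound.
class OffsetArray {
public:
    OffsetArray(int first, int last)
        : first_(first),
          data_(static_cast<std::size_t>(last - first + 1)) {
        assert(last >= first);
    }

    int& operator[](int index) {
        assert(index >= first_ && index <= last());
        return data_[static_cast<std::size_t>(index - first_)];
    }

    int first() const { return first_; }
    int last() const { return first_ + static_cast<int>(data_.size()) - 1; }

private:
    int first_;              // logical index of the first element
    std::vector<int> data_;  // zero-based storage
};
```

`OffsetArray a(-5, 15); a[-5] = 1;` writes to element 0 of the underlying vector, which is the emulation being described: the built-in array stays zero-based and the class does the arithmetic.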

You don't understand C++ arrays. If you don't understand, I can't explain it
anymore.

Oh please, sensei. Forgive your ignorant pupil and lead him to the wide
ocean of your wisdom.

You're not "with me" at all. You are a million miles behind me LOL. You
seem brainwashed to have this single-track, narrow-minded view of everything.

Please forgive your pupil, sensei, for being so presumptuous to believe
he can talk with his sensei at the same level.

My television has two wires coming out of the back. A signal cable and a
power cable, both cables carry electricity, and both cables can be called
electric cables.

Uh. Me no understand. Two cables. Both electric but not the same? How
can be?

One cable however carries analogue or digital signals and is more
specifically a frequency carrier. You don't understand the complexities
because in your mind both are simply electricity cables and not signal
cables.
You have lost all sense of connection between electricity and signal.
Same as you lost connection between class and object
Same as you lost connection between pointer and array

Me dumb. Me no understand complex thing like cable. You dumb, too. You
no understand *difference* between pointer and array.

I don't know if you are brainwashed or what but I certainly don't like the
way you think about C++, I think you do not fully understand the language.

I don't fully understand the language. It is very complex and I haven't
yet looked in every corner. So?

You don't seem to be able to grasp simple concepts like pointers and
arrays. And that's far from the complex template metaprogramming stuff.
Arrays and pointers have already been around ages ago when there was no
C++, only C.

GL.

I don't need luck to understand that your snotty attitude is totally
inappropriate.

Peter

MikeP · Mar 17, 2011

[snipped most of good post to address a minor point]

`int' is certainly shorter to type. Perhaps int is closer to the
Platonic form in an idealized language. Those aren't prudent
characteristics from an engineering standpoint given the manifest
constraints.

Does anyone really not have a header in all projects that does stuff
like:

typedef unsigned int uint32;

// compile-time checks to see if size assumptions are correct here.

?

Peter Remmers · Mar 17, 2011

On 17.03.2011 04:42, MikeP wrote:

Does anyone really not have a header in all projects that does stuff
like:

typedef unsigned int uint32;

// compile-time checks to see if size assumptions are correct here.

?

#include <stdint.h>

// use uint32_t etc.

gcc has it, Microsoft's 2010 has it, and even Analog Devices' Blackfin
compiler got it in an update some time ago, and since then we use it in
all our cross-platform code.

Before that, we used to have our own typedefs.
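
A minimal sketch of what that looks like, assuming a C++11 compiler so that `<cstdint>` and `static_assert` are available (with older compilers `<stdint.h>` plays the same role). `narrow_to_u32` is a made-up helper, and the checks just spell out the size assumptions a hand-written typedef header would have to verify itself.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// The compile-time checks a hand-rolled uint32 typedef would need.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32,
              "uint32_t must be exactly 32 bits wide");
static_assert(static_cast<std::uint32_t>(-1) == 0xFFFFFFFFu,
              "uint32_t must wrap modulo 2^32");

// Hypothetical helper: convert a wider value, asserting that it fits.
inline std::uint32_t narrow_to_u32(unsigned long long value) {
    assert(value <= 0xFFFFFFFFull);
    return static_cast<std::uint32_t>(value);
}
```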

Peter

SG · Mar 17, 2011

Does anyone really not have a header in all projects that does stuff
like:

typedef unsigned int uint32;

// compile-time checks to see if size assumptions are correct here.

?

I don't. I rarely require exact width integers. Sometimes, I use the
<climits> header to query implementation-specific properties
(UINT_MAX), for example, when I need at least 32 bits but would prefer
int over long in case int is big enough already.
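
Something along these lines, as a sketch only: the typedef name `u32_at_least` and the `rotl32` helper are invented for the example. Because the chosen type may be wider than 32 bits, code that relies on 32-bit wraparound has to mask explicitly.

```cpp
#include <cassert>
#include <climits>

// Prefer unsigned int when it already has at least 32 value bits;
// otherwise fall back to unsigned long, which the standard guarantees
// to be at least 32 bits wide.
#if UINT_MAX >= 0xFFFFFFFF
typedef unsigned int u32_at_least;    // hypothetical name
#else
typedef unsigned long u32_at_least;   // hypothetical name
#endif

// 32-bit rotate left on a type that may be wider than 32 bits.
inline u32_at_least rotl32(u32_at_least value, unsigned shift) {
    assert(shift < 32);
    value &= 0xFFFFFFFFul;            // discard any bits above bit 31
    if (shift == 0) return value;
    return static_cast<u32_at_least>(
        ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFFul);
}
```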

SG

Paul · Mar 17, 2011


xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

I am not even reading your long-winded explanation of what E1 is above.
I'm simply "telling" you it's an array. I don't care what you think; I'm
telling you what it simply is.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


Gerhard Fiedler · Mar 17, 2011

William said:
Detecting addition overflow is much easier with unsigned types, so
the modulo behavior is useful even when you're not doing strictly
modulo arithmetic. If your implementation catches signed overflows,
great. If you have to detect overflow using well-defined code,
unsigned is more often preferable.

I don't know whether it's actually "more", but yes, sometimes it is.
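
For reference, the two cases look roughly like this when written as well-defined code. It is a sketch, and `add_overflows` and `checked_add` are names chosen here. The unsigned test can run after the addition, while the signed test has to run before it.

```cpp
#include <cassert>
#include <limits>

// Unsigned: add first, then compare. A wrapped sum is smaller than
// either operand.
inline bool add_overflows(unsigned a, unsigned b) {
    return a + b < a;
}

// Signed: test before adding, since the overflowing addition itself
// would be undefined behaviour.
inline bool add_overflows(int a, int b) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    return (b > 0 && a > max - b) || (b < 0 && a < min - b);
}

// Hypothetical checked addition built on the unsigned test.
inline unsigned checked_add(unsigned a, unsigned b) {
    assert(!add_overflows(a, b));
    return a + b;
}
```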

Anyone who has worked at more than a few jobs knows that pet libraries
aren't always well received by others. The more you can stick to
vanilla standard patterns without resort to extraneous code, the
better.

I'm not sure what you mean by "pet libraries". Both GCC under Linux and
VC++ under Windows can let you know about (signed) integer overflow.
They may be my current "pet environments", but they're hardly rare, and
both together cover a quite significant part of C++ activity.

The point is that no standard-compliant compiler is even able to catch
unsigned overflow; it is defined to not exist by the standard.

The majority of programming is grunt work operating on data structures, not
numerical formula.

Exactly. Adding two unsigned values always gives you another valid
unsigned value -- even though it may be wrong, it is valid.

When you're dealing with data structures--sizes of objects, parameters
of complex structures--modulo arithmetic is a nifty thing.

If I understand what you mean here correctly, I've done my share of
modulo arithmetic in these areas, but typically in C on small embedded
processors where every byte of code space counts. On anything that is
big enough to run C++ and its standard libraries, this should IMO be
restricted to very small portions of the code that need to be highly
optimized (as shown by a profiler).

And when you get the calculations wrong by mistake, modulo behavior is
often far more forgiving because you stay within range--you may
corrupt your immediate data, but not some other data or code. Defects
are contained.

?? In most implementations today you get modulo behavior for both.
They're the same in this respect -- with the difference that you can
tell the compiler to catch signed overflow for you, but you can't tell
it to catch unsigned overflow. Defects are recognized earlier.

Simpler. Safer.

Exactly

That doesn't mean I don't use negative indices.

I'm not talking about negative indices. This is an issue that's
completely orthogonal to what I'm talking about. If for whatever
application-domain reason you want negative indices, use them -- and
when you do, you of course /have/ to use signed indices. But this is not
what I'm talking about, and since you don't have a choice here, this is
not the issue of this whole thread.

Gerhard

Gerhard Fiedler · Mar 17, 2011

MikeP said:
Did you just say (e.g.): "The word size is X-bits. The maximum
address is the maximum UNSIGNED integer that will fit in the word
size. The maximum SIGNED integer that can fit in the word size is
half of the UNSIGNED max (duh). Hence UNSIGNED is a compromise"?

Sort of. For those implementations (up to 32 bit word and pointer
sizes), they didn't have much of a choice. If they even thought about
it, it means that they must have seen strong arguments towards using a
signed integer.

However, I'm working mostly on 64-bit code these days (in C++ at least).
For such an environment, this argument doesn't hold anymore in practice
(even though it still holds in theory)... a 64-bit signed integer is
more than enough to hold the sizes of anything I may want to address.

Gerhard

Gerhard Fiedler · Mar 17, 2011

MikeP said:
Is the real issue then not of unsigned integers, per se, but only of
how they are implemented (modular behavior)?

I don't know whether it is /the/ real issue, but IMHO (I rarely use the
"H", but here I do) it is an important issue.

Now THIS is exactly what I was trying to get at. Now I have to ask
everyone to give their examples of using this "great" feature
(modular unsigned integers) and how useful it is or how rarely it is
used.

IMO it is only useful in places where modulo arithmetic is purposefully
used. In highly optimized code for address (and other) calculations, for
example. In encryption and hashing algorithms. There probably are
others; I'm by no means an expert here.
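
As one illustration of purposeful modulo arithmetic in hashing, here is a sketch of the 32-bit FNV-1a hash. The constants are the published FNV offset basis and prime; the function name is mine, and the sketch assumes a platform where int is at most 32 bits wide, so that the multiplication is carried out in unsigned arithmetic.

```cpp
#include <cstdint>
#include <string>

// 32-bit FNV-1a. The multiplication is meant to wrap modulo 2^32;
// with a signed type the same code would have undefined behavior.
inline std::uint32_t fnv1a_32(const std::string & s) {
    std::uint32_t hash = 2166136261u;   // FNV offset basis
    for (unsigned char c : s) {
        hash ^= c;
        hash *= 16777619u;              // FNV prime; result wraps
    }
    return hash;
}
```

Here the wraparound is part of the algorithm's definition, not an error condition to be detected.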

As I see it, in the dawn of the language everything was clear: you used
(signed) int for everything -- except when you needed modulo arithmetic,
which was when you used unsigned int. (Note also that "unsigned" doesn't
mean "not negative" -- it means "has no concept of sign", neither
positive nor negative. Another thing often overlooked.)

In the library things were less clear: since there was no (signed) int
to represent the whole range of addressable bytes, they sort of /had/ to
use unsigned for size_t. And bingo -- we were in the mess that we still
are and are compelled to use (unsigned) size_t for calculations that are
not meant to be modulo arithmetic.
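
A small sketch of the kind of calculation I mean, where the modulo behavior of size_t is not wanted but you get it anyway. The function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Intended meaning: the number of adjacent pairs in v.
// For an empty vector, 0 - 1 wraps to the largest size_t value.
inline std::size_t pair_count_naive(const std::vector<int> & v) {
    return v.size() - 1;
}

// The same calculation with the empty case handled explicitly.
inline std::size_t pair_count(const std::vector<int> & v) {
    return v.empty() ? 0 : v.size() - 1;
}
```

The naive version is perfectly valid as far as the language is concerned, which is exactly why no compiler will flag the wrapped result.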

So there are two orthogonal concepts: modulo arithmetic (the language
says unsigned is modulo, signed is not necessarily) and whether a given
variable may or may not assume negative values. Unluckily these two are
all too often mixed up in the discussion -- and they are mixed up
between the language itself (that makes the clear distinction that
unsigned is for modulo arithmetic and signed is not) and the library
(which uses unsigned for places where modulo arithmetic is not
necessarily desired).

Since all this goes way back to early C (and early C++), I don't see any
of this changing any time soon (or not so soon). So we just have to live
with the mess and find our way around it. It can't be "clean" code,
because we have to live with the library and its compromise (which on
some platforms is a necessary compromise), so we have the various
opinions on how to make the code "less unclean".

The implicit conversions between signed and unsigned didn't help,
either. If I had a compiler with a switch that disabled them, I'd have
this switch on all the time. There should be a signed_cast (similar to
const_cast) for these cases. Luckily, most modern compilers are starting
to see the problem and introducing warnings for these cases -- almost as
good.
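
A sketch of what I have in mind. There is no signed_cast in the language; the checked conversions below are hypothetical helpers that only approximate the idea at run time, and all the names are mine.

```cpp
#include <limits>
#include <stdexcept>

// What the implicit conversion does in "i < u": i is converted to unsigned
// first, so -1 becomes a huge value. Written out explicitly here.
inline bool less_as_converted(int i, unsigned u) {
    return static_cast<unsigned>(i) < u;
}

// A comparison that respects the sign of i.
inline bool less_sign_aware(int i, unsigned u) {
    return i < 0 || static_cast<unsigned>(i) < u;
}

// Hypothetical checked conversions, in the spirit of a "signed_cast".
inline unsigned to_unsigned_checked(int i) {
    if (i < 0) throw std::range_error("negative value has no unsigned equivalent");
    return static_cast<unsigned>(i);
}

inline int to_signed_checked(unsigned u) {
    if (u > static_cast<unsigned>(std::numeric_limits<int>::max()))
        throw std::range_error("value does not fit in int");
    return static_cast<int>(u);
}
```

With these, less_as_converted(-1, 1u) is false while less_sign_aware(-1, 1u) is true. GCC's -Wsign-compare and -Wsign-conversion are examples of the warnings mentioned above.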

Neither seems acceptable. It seems like a sparse (half-baked)
solution.

IMO the "half-bakedness" came in when the library used unsigned in ways
the language definition didn't really support -- for practical reasons
(as so much in C and C++). Because of backwards compatibility (another
common reason for how things are in C and C++), we've been stuck with
this ever since.

Also, the situation with the typically highly optimized code in
relatively slow 8- and 16-bit environments (when all this was defined)
is quite different from how it is now. So what made good sense back then
doesn't seem to make as much sense now -- but the rules from back then
are still valid.

Gerhard

Paul · Mar 17, 2011

Peter Remmers said:
On 16.03.2011 13:57, Paul wrote:

This is just another example of word games. The identifier is arr in the
following:
int* arr = new int[16];

Yes, it's a pointer, but it's a pointer to an array. And, as it's the only
identifier we have for the array, it's the array's identifier.
An entity without an identifier is not a usable entity.

Of course it's usable. You do have indirect access to it via a pointer to
its members. You don't need an identifier for the array itself to use it.

Of course you need an identifier to use it. Show me how to use an array
without an identifier...

I await your response

You want me to repeat that sentence? Because it's the answer. Here we go:

You do have indirect access to it via a pointer to its members. You don't
need an identifier for the array itself to use it.

Ah, you don't understand that. Why not, hmm? Not much more I can say if you
don't understand that. It's pretty simple really: 1 name refers to 2
entities.

The basic property of a name is that it uniquely identifies one entity. If
a name is to refer to more than one entity, its purpose is defeated. As
has been pointed out, the only exception in C++ is overloaded functions,
because they have an additional means of distinction (their parameter
types).

You are obviously an extremely thick person, just like the other idiots who
think this idiotic way.
I don't mean to be rude or nasty, but that is simply how it is: I find you
to be a dumb idiot.
GL with C++, you're going to need it.

<snip>
