Validity of pointer conversions


Ioannis Vranos

Are the following codes guaranteed to work always?


1.

#include <iostream>


inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;

for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}


int main()
{
int array[10][5]= {0};

some_func(array[0], sizeof(array)/sizeof(**array));

std::cout<< std::endl;
}


The above prints 50 zeros. I think it is guaranteed to work, since all
arrays are sequences of their elements.



2.

#include <iostream>


int main()
{
using namespace std;

int array[50]= {0};

int (*p)[5]= reinterpret_cast<int (*)[5]> (&array[0]);

for (size_t i= 0; i< 10; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[i][j]<<" ";

cout<< endl;

}


Here p behaves as a 2-dimensional matrix, that is a 10x5 matrix. I think
it is guaranteed to work for the same reason as the first one, that is
we can treat an array (sequence) of integers as various types of integer
arrays.
 

Salt_Peter

Are the following codes guaranteed to work always?

1.

#include <iostream>

inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;

for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";

}

int main()
{
int array[10][5]= {0};

some_func(array[0], sizeof(array)/sizeof(**array));

std::cout<< std::endl;

}

The above prints 50 zeros. I think it is guaranteed to work, since all
arrays are sequences of their elements.


// Are you sure? try...
int array[10][5]= {99};

// and as far as a function for an array:
template< typename T,
const std::size_t Rows,
const std::size_t Columns >
void some_func(T(& arr)[Rows][Columns])
{
// do stuff
}

// this works. guaranteed
std::vector< std::vector< int > > vvn(10, std::vector<int>(5, 99));
2.

#include <iostream>

int main()
{
using namespace std;

int array[50]= {0};

int (*p)[5]= reinterpret_cast<int (*)[5]> (&array[0]);

for (size_t i= 0; i< 10; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[i][j]<<" ";

cout<< endl;

}

Here p behaves as a 2-dimensional matrix, that is a 10x5 matrix. I think
it is guaranteed to work for the same reason as the first one, that is
we can treat an array (sequence) of integers as various types of integer
arrays.


Anything written in C++ that requires a reinterpret_cast sounds an
alarm here. You can only guess at what the result might be (and
possibly test/check the result with typeid).
Hacking is not programming. Respect your types at all costs. It's
directive #1, no exceptions.
Anytime you fool a compiler you are preventing it from helping you code.
Basically, you'll code as if it is a 10x5 matrix and then one day
something will change that's beyond your control.
You'll come back 6 months from now, look at your code, needing to
modify it (i.e. add features), and reach for the aspirin tablets (imagine
the client-user of your code trying to figure it all out). An apple is
an apple; if you treat it like an orange then you'll eventually fall
into a hole called undefined behaviour. You will, it's a question of
time.
Clients/Customers don't like hacks, and sometimes - that client/
customer ... is you.
 

James Kanze

Are the following codes guaranteed to work always?

#include <iostream>
inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;
for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}

int main()
{
int array[10][5]= {0};
some_func(array[0], sizeof(array)/sizeof(**array));
std::cout<< std::endl;
}
The above prints 50 zeros. I think it is guaranteed to work,
since all arrays are sequences of their elements.

And? I don't see any relationship between what you just said
and any guarantee of working. You have an array bounds
violation, which is undefined behavior. And there have been
(and maybe still are) implementations which detect it, and
treat it as an error condition.
#include <iostream>
int main()
{
using namespace std;
int array[50]= {0};
int (*p)[5]= reinterpret_cast<int (*)[5]> (&array[0]);
for (size_t i= 0; i< 10; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[i][j]<<" ";
cout<< endl;
}

Here p behaves as a 2-dimensional matrix, that is a 10x5
matrix.

Almost nothing involving reinterpret_cast is guaranteed to work.
About the only thing that the standard guarantees is that if you
cast the value back to its original type, and you haven't
violated any alignment restrictions in the intermediate types,
the value will compare equal to the original value (and thus,
designate the same object designated by the original pointer).

From a quality of implementation point of view: the standard
does say that the conversion is expected to be "unsurprising"
for someone familiar with the addressing architecture of the
processor, so I would expect this to work on most
implementations.
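
For illustration, here is a minimal sketch (my own, not from the
standard text) of the one round-trip guarantee mentioned above:

#include <cassert>

int main()
{
    int x = 42;

    // The value of lp is unspecified; reading *lp would be undefined.
    // Assumption: long's alignment is no stricter than int's here,
    // so this is the guaranteed round-trip case.
    long* lp = reinterpret_cast<long*>(&x);

    // Guaranteed: casting back to the original type yields a pointer
    // equal to the original, designating the same object.
    int* ip = reinterpret_cast<int*>(lp);
    assert(ip == &x);
}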
I think it is guaranteed to work for the same reason
as the first one,

It is totally unrelated to the first. reinterpret_cast is quite
different from other conversions.
that is we can treat an array (sequence) of integers as
various types of integer arrays.

If by that you mean that you can play games with the dimensions,
as long as the total number of elements is unchanged, that is
simply false.
 

James Kanze

Are the following codes guaranteed to work always?
1.
#include <iostream>
inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;
for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}
int main()
{
int array[10][5]= {0};
some_func(array[0], sizeof(array)/sizeof(**array));
std::cout<< std::endl;
}
The above prints 50 zeros. I think it is guaranteed to work,
since all arrays are sequences of their elements.

// Are you sure? try...
int array[10][5]= {99};

What does that change? You have different initial values
(array[0][0] == 99, all other elements == 0). But there is
still an array bounds violation in the function, which is
undefined behavior.
// and as far as a function for an array:
template< typename T,
const std::size_t Rows,
const std::size_t Columns >
void some_func(T(& arr)[Rows][Columns])
{
// do stuff
}
// this works. guaranteed
std::vector< std::vector< int > > vvn(10, std::vector<int>(5, 99));

That does something different. It initializes all of the
elements with 99, rather than the first with 99, and all of the
others with 0.
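
For what it's worth, here is one way the quoted template fragment could
be completed and used; this is my own sketch of the guaranteed
alternative, not code from the thread:

#include <cstddef>
#include <iostream>

// Rows and Columns are deduced from the array's real type, so there
// are no casts and no separately-passed sizes to get wrong.
template< typename T, const std::size_t Rows, const std::size_t Columns >
void some_func(T (&arr)[Rows][Columns])
{
    for (std::size_t i = 0; i < Rows; ++i)
        for (std::size_t j = 0; j < Columns; ++j)
            std::cout << arr[i][j] << " ";
    std::cout << std::endl;
}

int main()
{
    int array[10][5] = {0};
    some_func(array);   // prints 50 zeros, with no bounds question at all
}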
2.
#include <iostream>
int main()
{
using namespace std;
int array[50]= {0};
int (*p)[5]= reinterpret_cast<int (*)[5]> (&array[0]);
for (size_t i= 0; i< 10; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[i][j]<<" ";
cout<< endl;
}
Here p behaves as a 2-dimensional matrix, that is a 10x5
matrix. I think it is guaranteed to work for the same reason
as the first one, that is we can treat an array (sequence)
of integers as various types of integer arrays.

Anything written in C++ that requires a reinterpret_cast
sounds an alarm here. You can only guess at what the result
might be (and possibly test/check the result with typeid).

That's not quite true---there are a few things you can do with
reinterpret_cast which have defined behavior. But this isn't
one of them. On the other hand, the expressed intent of
reinterpret_cast in the standard is to support type punning, in
so far as reasonable on the underlying architecture, so from a
quality of implementation point of view, I would expect it to
work on most architectures.
Hacking is not programming. Respect your types at all costs.
It's directive #1, no exceptions.

C++ has reinterpret_cast for a reason. I use it, for example,
when implementing things like malloc or garbage collection. In
such cases, it's a necessary evil.

In anything but such low level (architecture dependent)
programming, of course, it's a guaranteed problem, if only for
reasons of readability.

(For the rest, I very much agree with the part I've cut.
Anything involving reinterpret_cast is a hack, and hacks should
be reserved for the cases where they are absolutely necessary.)
 

Ioannis Vranos

Salt_Peter said:
Are the following codes guaranteed to work always?

1.

#include <iostream>

inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;

for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";

}

int main()
{
int array[10][5]= {0};

some_func(array[0], sizeof(array)/sizeof(**array));

std::cout<< std::endl;

}

The above prints 50 zeros. I think it is guaranteed to work, since all
arrays are sequences of their elements.


// Are you sure? try...
int array[10][5]= {99};



OK, it prints:

[john@localhost src]$ ./foobar-cpp
99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0


as expected. When initialising a built-in array with initial values, the
members of the array that are not explicitly given an initial value are
initialised to 0.

// and as far as a function for an array:
template< typename T,
const std::size_t Rows,
const std::size_t Columns >
void some_func(T(& arr)[Rows][Columns])
{
// do stuff
}

// this works. guaranteed
std::vector< std::vector< int > > vvn(10, std::vector<int>(5, 99));


I am not looking for ways to do it. I am just asking if these specific
uses are guaranteed to work as expected.


Anything written in C++ that requires a reinterpret_cast sounds an
alarm here. You can only guess at what the result might be (and
possibly test/check the result with typeid).
Hacking is not programming. Respect your types at all costs. It's
directive #1, no exceptions.
Anytime you fool a compiler you are preventing it from helping you code.
Basically, you'll code as if it is a 10x5 matrix and then one day
something will change that's beyond your control.
You'll come back 6 months from now, look at your code, needing to
modify it (i.e. add features), and reach for the aspirin tablets (imagine
the client-user of your code trying to figure it all out). An apple is
an apple; if you treat it like an orange then you'll eventually fall
into a hole called undefined behaviour. You will, it's a question of
time.
Clients/Customers don't like hacks, and sometimes - that client/
customer ... is you.


I am asking if it is a *valid* low-level behaviour, and not an undefined
behaviour. We could use

int (*p)[5]= static_cast<int (*)[5]> (static_cast<void *>(&array[0]));

instead of the reinterpret_cast.


My question (actually what I think I know and want others to verify)
is: a built-in array of type T is a sequence of its members of type T,
and thus we can treat it as arrays of various forms.


Consider another example:


#include <iostream>
#include <cstdlib>


int main()
{
using namespace std;

const char *pc= "This is a test.";


// 100% guaranteed to work
const char *p1= pc;

while(*p1)
cout<< *p1++;

cout<< endl;

// 100% guaranteed to work
const char (*p2)[16]= reinterpret_cast<const char (*)[16]>(pc);

for(size_t j= 0; j<sizeof(*p2)/sizeof(**p2); ++j)
cout<< p2[0][j];

cout<< endl;


// ==> Here is my question. Is it 100% guaranteed to work? AFAIK yes.
const char (*p3)[8]= reinterpret_cast<const char (*)[8]>(pc);

for(size_t i= 0; i<2; ++i)
for(size_t j= 0; j<sizeof(*p3)/sizeof(**p3); ++j)
cout<< p3[i][j];

cout<< endl;
}
 

Ioannis Vranos

James said:
Are the following codes guaranteed to work always?

#include <iostream>
inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;
for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}

int main()
{
int array[10][5]= {0};
some_func(array[0], sizeof(array)/sizeof(**array));
std::cout<< std::endl;
}
The above prints 50 zeros. I think it is guaranteed to work,
since all arrays are sequences of their elements.

And? I don't see any relationship between what you just said
and any guarantee of working. You have an array bounds
violation, which is undefined behavior. And there have been
(and maybe still are) implementations which detect it, and
treat it as an error condition.



What exact array bounds violation is there in the code above?

"int array[10][5];" is a sequence of 50 integers.


#include <iostream>
int main()
{
using namespace std;
int array[50]= {0};
int (*p)[5]= reinterpret_cast<int (*)[5]> (&array[0]);
for (size_t i= 0; i< 10; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[i][j]<<" ";
cout<< endl;
}

Here p behaves as a 2-dimensional matrix, that is a 10x5
matrix.

Almost nothing involving reinterpret_cast is guaranteed to work.



OK, consider int (*p)[5]= static_cast<int (*)[5]> (static_cast<void
*>(&array[0])); instead.

If by that you mean that you can play games with the dimensions,
as long as the total number of elements is unchanged, that is
simply false.


Why? In all cases we have the same sequence of ints, that is

int array[50], int array[10][5], int array[5][10] are all implemented as
the same sequence of 50 ints. If they are not implemented in the same
way, where do they differ?


Thanks.
 

jkherciueh

Ioannis said:
James said:
Are the following codes guaranteed to work always?

#include <iostream>
inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;
for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}

int main()
{
int array[10][5]= {0};
some_func(array[0], sizeof(array)/sizeof(**array));
std::cout<< std::endl;
}
The above prints 50 zeros. I think it is guaranteed to work,
since all arrays are sequences of their elements.

And? I don't see any relationship between what you just said
and any guarantee of working. You have an array bounds
violation, which is undefined behavior. And there have been
(and maybe still are) implementations which detect it, and
treat it as an error condition.



What exact array bounds violation is there in the code above?

"int array[10][5];" is a sequence of 50 integers.


No. It is an array of 10 arrays of 5 int. It so happens to be guaranteed that
the total of 50 int are arranged contiguously in memory, but that does not
magically turn int array[10][5] into an array of 50 int. Consequently, an
expression like

array[7][6]

is an out-of-bounds access to the 6th element (which does not exist) of the
7th array of 5 int.

#include <iostream>
int main()
{
using namespace std;
int array[50]= {0};
int (*p)[5]= reinterpret_cast<int (*)[5]> (&array[0]);
for (size_t i= 0; i< 10; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[i][j]<<" ";
cout<< endl;
}

Here p behaves as a 2-dimensional matrix, that is a 10x5
matrix.

Almost nothing involving reinterpret_cast is guaranteed to work.



OK, consider int (*p)[5]= static_cast<int (*)[5]> (static_cast<void
*>(&array[0])); instead.

If by that you mean that you can play games with the dimensions,
as long as the total number of elements is unchanged, that is
simply false.


Why? In all cases we have the same sequence of ints, that is

int array[50], int array[10][5], int array[5][10] are all implemented as
the same sequence of 50 ints. If they are not implemented in the same
way, where do they differ?


They differ in type. This information is known to the compiler and the
compiler is free to detect that

array[7][6]

is an out-of-bounds access. You will not find any guarantee that
type-derived bounds can be moved via pointer casting.



Best

Kai-Uwe Bux
 

Ioannis Vranos

1.
#include <iostream>
inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;
for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}
int main()
{
int array[10][5]= {0};
some_func(array[0], sizeof(array)/sizeof(**array));
std::cout<< std::endl;
}

What exact array bounds violation is there in the code above?

"int array[10][5];" is a sequence of 50 integers.

No. It is an array of 10 arrays of 5 int. It so happens to be guaranteed that
the total of 50 int are arranged contiguously in memory, but that does not
magically turn int array[10][5] into an array of 50 int. Consequently, an
expression like

array[7][6]

is an out-of-bounds access to the 6th element (which does not exist) of the
7th array of 5 int.




Yes I can understand that having defined an array with the name array as:

int array[10][5];

with array[7][6] we are accessing out of the array (and the
implementation sequence).


However in the above code I am using an int pointer to output all
members of the array (50 in total) in a 1-dimensional array fashion. I do
not point to any element after the one past the end, or to any before
the first one. So I am not violating the boundaries of the sequence.


int array[50], int array[10][5], int array[5][10] are all implemented as
the same sequence of 50 ints. If they are not implemented in the same
way, where do they differ?

They differ in type.


Yes I know they differ in type. Also it is 100% guaranteed to consider
any of the array examples above as an array of unsigned char.
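
As a side note, a small sketch of that unsigned char guarantee (my own
illustration, not from the thread):

#include <cstddef>
#include <iostream>

int main()
{
    int array[10][5] = {0};

    // Special aliasing rule: the bytes of any object may be examined
    // through an unsigned char pointer.
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&array);

    unsigned long total = 0;
    for (std::size_t i = 0; i < sizeof array; ++i)
        total += bytes[i];            // every byte of the array is reachable

    std::cout << total << std::endl;  // 0 here: all bytes are zero
}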




This information is known to the compiler and the
compiler is free to detect that

array[7][6]

is an out-of-bounds access. You will not find any guarantee that
type-derived bounds can be moved via pointer casting.



I do not know to which "array" definition you are referring with that.
Can you provide the definition along with your out-of-bounds access
example, and mention where I access out-of-bounds?


Thanks.
 

jkherciueh

Ioannis said:
1.
#include <iostream>
inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;
for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}
int main()
{
int array[10][5]= {0};
some_func(array[0], sizeof(array)/sizeof(**array));
std::cout<< std::endl;
}

What exact array bounds violation is there in the code above?

"int array[10][5];" is a sequence of 50 integers.

No. It is an array of 10 arrays of 5 int. It so happens to be guaranteed
that the total of 50 int are arranged contiguously in memory, but that
does not magically turn int array[10][5] into an array of 50 int.
Consequently, an expression like

array[7][6]

is an out-of-bounds access to the 6th element (which does not exist) of
the 7th array of 5 int.




Yes I can understand that having defined an array with the name array as:

int array[10][5];

with array[7][6] we are accessing out of the array (and the
implementation sequence).


Out of the 7th array, yes; but not out of the implementation sequence:

7*5+6 = 41 < 10*5 = 50

However in the above code I am using an int pointer to output all
members of the array (50 in total) in a 1-dimension array fashion. I do
not point to any element after the one past the end, or to any before
the first one. So I am not violating the boundaries of the sequence.

You _assume_ that 50 int that lie contiguously in memory can be treated as
an array and that pointer-arithmetic can be used to access any of them
given a pointer to the first. That hypothesis, however, is not warranted by
the standard. To get rid of all the function call ramifications in your
example, consider the following:

int array [10][5] = {0};
int* p = &array[0][0]; // line 2
std::cout << p[5] << '\n'; // line 3

This compares to your code since &array[0][0] is what array[0] decays to
when passed as a parameter to some_func.

The question is whether the third line has undefined behavior. Consider the
following hypothetical implementation of pointers: a pointer is a triple
of addresses, the first to the pointee and the other two specifying a
valid range for pointer arithmetic. Similarly, every array (static or
dynamic) has its bounds stored somewhere. When an array decays to a
pointer, these bounds are used to deduce the range for the pointer.
Whenever pointer-arithmetic yields a pointer outside the valid range,
dereferencing triggers a segfault. Such an implementation is not ruled out
by any provisions of the standard that I know of.

Note that in line 2, the compiler has static type information about the rhs.
The rhs is a pointer to the first element in an array of 5. Thus, in the
described implementation, p will be given a range of size 5 and line 3 is
an out-of-bounds access since it dereferences the past-end position of the
5 int sequence array[0] in a way that is obtained through pointer
arithmetic from a pointer into the 5 int sequence array[0].

If you think that a range-checking implementation is not standard
conforming, please provide some language from the standard supporting your
point of view. Note that the guarantee of arrays being contiguous is met by
such an implementation. What such an implementation prevents is just the
reinterpretation of array sizes through pointer casting and pointer
arithmetic.
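
To make the hypothetical concrete, here is a toy model of such a
range-carrying pointer (entirely my own sketch, not a real
implementation):

#include <cassert>
#include <cstddef>

// A pointer that carries the bounds of the array it was derived from.
struct checked_int_ptr
{
    int* pos;    // the pointee
    int* first;  // lowest address valid for arithmetic
    int* last;   // one past the highest valid address

    int& operator[](std::ptrdiff_t i) const
    {
        // A bounds-checking implementation may trap here.
        assert(pos + i >= first && pos + i < last);
        return pos[i];
    }
};

int main()
{
    int array[10][5] = {0};

    // When array[0] decays, the deduced range covers only that row:
    checked_int_ptr p = { &array[0][0], &array[0][0], &array[0][0] + 5 };

    int ok = p[4];    // fine: inside array[0]
    (void)ok;
    int bad = p[5];   // asserts: past the end of array[0], even though
    (void)bad;        // the memory of array[1] follows contiguously
}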


[snip]

Best

Kai-Uwe Bux
 

Salt_Peter

Salt_Peter said:
Are the following codes guaranteed to work always?
1.
#include <iostream>
inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;
for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}
int main()
{
int array[10][5]= {0};
some_func(array[0], sizeof(array)/sizeof(**array));
std::cout<< std::endl;
}
The above prints 50 zeros. I think it is guaranteed to work, since all
arrays are sequences of their elements.

// Are you sure? try...
int array[10][5]= {99};

OK, it prints:

[john@localhost src]$ ./foobar-cpp
99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0


Uh, no it doesn't

99, 0, 0, 0, 0,
99, 0, 0, 0, 0,
-8 more times-

as expected. When initialising a built-in array with initial values, the
members of the array that are not explicitly given an initial value are
initialised to 0.

Indeed, except you have mistaken an array for a 'sequence' of arrays.
There is a critical distinction to be made.
// and as far as a function for an array:
template< typename T,
const std::size_t Rows,
const std::size_t Columns >
void some_func(T(& arr)[Rows][Columns])
{
// do stuff
}
// this works. guaranteed
std::vector< std::vector< int > > vvn(10, std::vector<int>(5, 99));

I am not looking for ways to do it. I am just asking if these specific
uses are guaranteed to work as expected.


Anything written in C++ that requires a reinterpret_cast sounds an
alarm here. You can only guess at what the result might be (and
possibly test/check the result with typeid).
Hacking is not programming. Respect your types at all costs. It's
directive #1, no exceptions.
Anytime you fool a compiler you are preventing it from helping you code.
Basically, you'll code as if it is a 10x5 matrix and then one day
something will change that's beyond your control.
You'll come back 6 months from now, look at your code, needing to
modify it (i.e. add features), and reach for the aspirin tablets (imagine
the client-user of your code trying to figure it all out). An apple is
an apple; if you treat it like an orange then you'll eventually fall
into a hole called undefined behaviour. You will, it's a question of
time.
Clients/Customers don't like hacks, and sometimes - that client/
customer ... is you.

I am asking if it is a *valid* low-level behaviour, and not an undefined
behaviour. We could use

int (*p)[5]= static_cast<int (*)[5]> (static_cast<void *>(&array[0]));

instead of the reinterpret_cast.

My question (actually what I think I know and want others to verify)
is: a built-in array of type T is a sequence of its members of type T,
and thus we can treat it as arrays of various forms.

Here is a typical story about Mr Hacker and Dr Programmer.
It's fictitious, although you'd be surprised how often it happens.

Customer approaches Mr Hacker; he requires a type X to be streamable
to a file, a socket, and to and from various interfaces.

struct X
{
...
};

Type X is a mix of characters, integers, floats and a few other
members, nothing special. Mr Hacker proceeds to reinterpret_cast an
array of X elements into an array of some primitive type (char, Byte)
and streams the data (including padding!) and completes the job in
about 16 hours. Mr Hacker pridefully sends the project back at which
point the Client announces he's got to stream 2 more types as well. Mr
Hacker's jaw drops, client looks at code, gives Mr Hacker shit for
streaming the padding. Mr Hacker needs 3 days to get that job done.
Client now wants a couple more classes to be streamable (and he needs
them NOW). Client sees little benefit in continuing the contract.

Client goes to see Dr Programmer. Dr Programmer completes the project
in about 16 hours too. Client notices that no padding is being
transferred, nice. Client needs 15 new classes to use the system,
however. Mr Programmer looks at one class - derives it from type X,
writes an insertion operator and an extraction operator for X,
modifies the interface as the compiler complains about the required
pure-virtual functions missing.
Client asks when will the project be ready then?
Mr Programmer says: "sir, you don't need to change a single statement
in our code to make it work with your new class, as long as you derive
from this abstract class".
Mr Programmer: "in fact, the code is set up in such a way that the
compilation errors will tell you what's missing, if anything." (That's
what happens when you treat an apple like an apple.)
Client goes back to the office, takes out another one of his 15 derived
types, looks at Mr Programmer's modifications to X's derivative ...
and smiles: "wow, I can extend this code without calling Mr
Programmer, and I can do it effortlessly because the code is crystal
clear".

In your opinion - who gets the next contract opportunity?
Consider another example:

#include <iostream>
#include <cstdlib>

int main()
{
using namespace std;

const char *pc= "This is a test.";

// 100% guaranteed to work
const char *p1= pc;

while(*p1)
cout<< *p1++;

cout<< endl;

// 100% guaranteed to work
const char (*p2)[16]= reinterpret_cast<const char (*)[16]>(pc);

for(size_t j= 0; j<sizeof(*p2)/sizeof(**p2); ++j)
cout<< p2[0][j];

cout<< endl;

// ==> Here is my question. Is it 100% guaranteed to work? AFAIK yes.
const char (*p3)[8]= reinterpret_cast<const char (*)[8]>(pc);

for(size_t i= 0; i<2; ++i)
for(size_t j= 0; j<sizeof(*p3)/sizeof(**p3); ++j)
cout<< p3[i][j];

cout<< endl;

}
 

Ioannis Vranos

Yes I can understand that having defined an array with the name array as:

int array[10][5];

with array[7][6] we are accessing out of the array (and the
implementation sequence).

Out of the 7th array, yes; but not out of the implementation sequence:

7*5+6 = 41 < 10*5 = 50


Yes, you are right, it is out of bounds for the array since it is
defined as int array[10][5]; but my uses are not out of the bounds of
the declared pointer types.


Also consider this. We can initialise the array in both ways:

int array[10][5]=
{{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1}};

and

int array[10][5]=
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};

(as a sequence).



You _assume_ that 50 int that lie contiguously in memory can be treated as
an array and that pointer-arithmetic can be used to access any of them
given a pointer to the first. That hypothesis, however, is not warranted by
the standard. To get rid of all the function call ramifications in your
example, consider the following:

int array [10][5] = {0};
int* p = &array[0][0]; // line 2
std::cout << p[5] << '\n'; // line 3

This compares to your code since &array[0][0] is what array[0] decays to
when passed as a parameter to some_func.

The question is whether the third line has undefined behavior. Consider the
following hypothetical implementation of pointers: a pointer is a triple
of addresses, the first to the pointee and the other two specifying a
valid range for pointer arithmetic. Similarly, every array (static or
dynamic) has its bounds stored somewhere. When an array decays to a
pointer, these bounds are used to deduce the range for the pointer.
Whenever pointer-arithmetic yields a pointer outside the valid range,
dereferencing triggers a segfault. Such an implementation is not ruled out
by any provisions of the standard that I know of.

Note that in line 2, the compiler has static type information about the rhs.
The rhs is a pointer to the first element in an array of 5. Thus, in the
described implementation, p will be given a range of size 5 and line 3 is
an out-of-bounds access since it dereferences the past-end position of the
5 int sequence array[0] in a way that is obtained through pointer
arithmetic from a pointer into the 5 int sequence array[0].

If you think that a range-checking implementation is not standard
conforming, please provide some language from the standard supporting your
point of view. Note that the guarantee of arrays being contiguous is met by
such an implementation. What such an implementation prevents is just the
reinterpretation of array sizes through pointer casting and pointer
arithmetic.

OK, I agree that it is not explicitly guaranteed by the standard, but I
think in reality it always works. But as it was said, it is not
guaranteed by the standard.

Case closed.


I asked a similar question regarding C90 code in comp.lang.c, since with
very few exceptions C90 is a subset of C++03, and here are the question
and the answers I got:


Question:

Are the following guaranteed to work always as *C90* code?

1.

#include <stdio.h>


void some_func(int *p, const size_t SIZE)
{
size_t i;

for(i=0; i<SIZE; ++i)
printf("%d ", p);
}


int main(void)
{
int array[10][5]= {0};

some_func(array[0], sizeof(array)/sizeof(**array));

puts("");

return 0;
}



The above prints 50 zeros. I think it is guaranteed to work, since all
arrays are sequences of their elements.



2.

#include <stdio.h>
#include <stdlib.h>


int main(void)
{
size_t i, j;

int array[50]= {0};

int (*p)[5]= (int (*)[5])(&array[0]);

for (i= 0; i< 10; ++i)
for(j=0; j<5; ++j)
printf("%d ", p[j]);

puts("");

return 0;
}


Here p behaves as a 2-dimensional matrix, that is a 10x5 matrix. I think
it is guaranteed to work for the same reason as the first one, that is
we can treat an array (sequence) of integers as various types of integer
arrays.

=========================================================================

Answer 1:

It is not guaranteed to work.
The problem is that there is no array of int
with more than 5 members declared anywhere.

Pointers to char are allowed to step through the bytes
of any object, but that's by a special rule.

I can't conceive of any mechanism by which your code could fail,
but the guarantee of which you speak is not there.

Your pointer to int
is overrunning the boundaries of an array of 5 int.

-- pete

=========================================================================


Answer 2:

It's not necessarily legal, but it ought to be. There is a related
example at the end of 6.7.5.3 in the C99 Rationale (non-normative of
course),

----------------------------------------------------------------------
The following example demonstrates how to declare parameters
in any order and avoid lexical ordering issues

void g(double *ap, int n)
{
double (*a)[n]= (double (*)[n]) ap;
/* ... */ a[1][2] /* ... */
}

in this case, the parameter ap is assigned to a local
pointer that is declared to be a pointer to a variable
length array. The function g can be called as in

{
double x[10][10];
g(&x[0][0], 10);
}
----------------------------------------------------------------------

This sort of thing is common in numerical software and (some) authors
of the standard intended it to work, but the normative standard doesn't
necessarily guarantee it. Personally I would consider any failing
implementation as broken.


-- pa at panix dot com
 

Ioannis Vranos

Salt_Peter said:
// Are you sure? try...
int array[10][5]= {99};
OK, it prints:

[john@localhost src]$ ./foobar-cpp
99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0

Uh, no it doesn't

99, 0, 0, 0, 0,
99, 0, 0, 0, 0,
-8 more times-


Can you provide the exact code you are using, along with its output?
 

James Kanze

James said:
Are the following codes guaranteed to work always?
1.
#include <iostream>
inline void some_func(int *p, const std::size_t SIZE)
{
using namespace std;
for(size_t i=0; i<SIZE; ++i)
cout<< p[i]<< " ";
}
int main()
{
int array[10][5]= {0};
some_func(array[0], sizeof(array)/sizeof(**array));
std::cout<< std::endl;
}
The above prints 50 zeros. I think it is guaranteed to work,
since all arrays are sequences of their elements.

And? I don't see any relationship between what you just said
and any guarantee of working. You have an array bounds
violation, which is undefined behavior. And there have been
(and maybe still are) implementations which detect it, and
treat it as an error condition.

What exact array bounds violation is there in the code above?
"int array[10][5];" is a sequence of 50 integers.

No. "int array[10][5]" is an array[10] of array[5] of int. The
standard may require that it be physically laid out as a
sequence of 50 integers, but that has nothing to do with the
type. The type is "int [10][5]". When it decays to a
pointer, the type is "int (*)[5]", a pointer to the first of
ten elements, and when you dereference said pointer, the result
is an int*, pointer to the first of five elements.

The authors of the C standard went out of their way to ensure
that an implementation which tracked bounds would be legal.
2.
#include <iostream>
int main()
{
using namespace std;
int array[50]= {0};
int (*p)[5]= reinterpret_cast<int (*)[5]> (&array[0]);
for (size_t i= 0; i< 10; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[i][j]<<" ";
cout<< endl;
}
Here p behaves as a 2-dimensional matrix, that is a 10x5
matrix.

Almost nothing involving reinterpret_cast is guaranteed to work.

OK, consider int (*p)[5]= static_cast<int (*)[5]> (static_cast<void
*>(&array[0])); instead.
If by that you mean that you can play games with the dimensions,
as long as the total number of elements is unchanged, that is
simply false.

The short answer is because the standard says so. The rationale
behind this is to allow bounds checking implementations. Such
implementations have existed, and may still exist. (Centerline
offered one, and I believe that the Centerline compiler is still
on the market.)
In all cases we have the same sequence of ints,

So? What does physical layout have to do with type?
that is int array[50], int array[10][5], int array[5][10] are
all implemented as the same sequence of 50 ints. If they are
not implemented in the same way, where do they differ?

The underlying physical layout may be the same, but that doesn't
mean that they have the same type.
 

James Kanze

(e-mail address removed) wrote:
Yes I can understand that having defined an array with the
name array as:
int array[10][5];
with array[7][6] we are accessing out of the array (and the
implementation sequence).
Out of the 7th array, yes; but not out of the implementation sequence:
7*5+6 = 41 < 10*5 = 50
Yes, you are right, it is out of bounds for the array since it is
defined as int array[10][5]; but my uses are not out of the bounds of
the declared pointer types.
Also consider this. We can initialise the array in both ways:
int array[10][5]=
{{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1},{1,1,1,1,1}};

int array[10][5]=
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
(as a sequence).

That's because the compiler is required to insert the missing
braces, which is an entirely orthogonal issue (and applies to
structs as well, where padding may mean that you don't actually
have contiguous memory).

[from comp.lang.c...]
Answer 2:
It's not necessarily legal, but it ought to be. There is a related
example at the end of 6.7.5.3 in the C99 Rationale (non-normative of
course),
void g(double *ap, int n)
{
double (*a)[n]= (double (*)[n]) ap;
/* ... */ a[1][2] /* ... */
}
in this case, the parameter ap is assigned to a local
pointer that is declared to be a pointer to a variable
length array. The function g can be called as in
{
double x[10][10];
g(&x[0][0], 10);
}
----------------------------------------------------------------------
This sort of thing is common in numerical software and (some) authors
of the standard intended it to work, but the normative standard doesn't
necessarily guarantee it.

Other authors of the C90 standard definitely intended for it to
be undefined behavior. (I've discussed this in person with
them.) There was a definite intention on the part of some to
allow bounds checking implementations (and such implementations
have existed, and probably still do exist).
Personally I would consider any failing implementation as
broken.

Thus, Centerline was broken? Thus, an implementation which did
exactly what some of the authors of the C standard hoped some
implementations would do is broken?
 

Ioannis Vranos

James said:
that is int array[50], int array[10][5], int array[5][10] are
all implemented as the same sequence of 50 ints. If they are
not implemented in the same way, where do they differ?

The underlying physical layout may be the same, but that doesn't
mean that they have the same type.


I didn't say that they are of the same type, I was just wondering if we
can "adjust" arrays to behave the way we want: one-dimensional array,
two-dimensional array, and so forth.

I got the idea from valarray & slice combinations to "create" matrices
as we like (2x4, 3x5 and so on) while the whole thing is really based on
a 1-dimensional valarray.

I thought perhaps we could do the same with built-in arrays by using
pointers since, as you said, the physical implementation is the same.

I think it would be great if this was guaranteed to work, the same way
there is a special guarantee that we can treat POD objects as sequences
of chars/unsigned chars.
 

Erik Wikström

James said:
that is int array[50], int array[10][5], int array[5][10] are
all implemented as the same sequence of 50 ints. If they are
not implemented in the same way, where do they differ?

The underlying physical layout may be the same, but that doesn't
mean that they have the same type.


I didn't say that they are of the same type, I was just wondering if we
can "adjust" arrays to behave the way we want: one-dimensional array,
two-dimensional array, and so forth.

I got the idea from valarray & slice combinations to "create" matrices
as we like (2x4, 3x5 and so on) while the whole thing is really based on
a 1-dimensional valarray.

I thought perhaps we could do the same with built-in arrays by using
pointers since, as you said, the physical implementation is the same.

I think it would be great if this was guaranteed to work, the same way
there is a special guarantee that we can treat POD objects as sequences
of chars/unsigned chars.

Why not just use a one-dimensional array of sufficient size, wrap it in
a class, and use some simple calculations to get the correct element? It
is guaranteed to work and in the end it will probably generate the same
code. Using proxy classes you can even get the same syntax (but I would
prefer to simply overload operator() ).
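
A minimal sketch of that wrapper idea (names and layout are my own
choices, not code from the thread):

#include <cstddef>
#include <iostream>
#include <vector>

// One contiguous buffer; operator() does the index arithmetic.
class Matrix
{
    std::size_t rows_, cols_;
    std::vector<int> data_;
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    int& operator()(std::size_t i, std::size_t j)
    { return data_[i * cols_ + j]; }   // row-major layout

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
};

int main()
{
    Matrix m(10, 5);                   // behaves like int[10][5]
    m(7, 4) = 99;
    for (std::size_t i = 0; i < m.rows(); ++i)
        for (std::size_t j = 0; j < m.cols(); ++j)
            std::cout << m(i, j) << " ";
    std::cout << std::endl;
}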
 

Number774

Are the following codes guaranteed to work always?
<snip>

Ioannis,

They aren't guaranteed to work, and I can think of at least one
architecture in which they probably won't.

In the 8086 memory model, an int is usually 16 bits, and a "segment"
of memory is 64 KB - allowing 32K ints. Segments are 16 bytes apart -
a "paragraph" - and can overlap. You can tell the compiler to work in
several ways to get around this limit.

If the compiler is working in such a mode as to give a different
segment address for each of your 10 arrays, each one will be padded up
to the next paragraph boundary, so you'll have 10 bytes for the 5
integers - and then 6 bytes free (almost the same as padding). The
next array will be on the next paragraph boundary. There's no bounds
checking, so the first 5 integers will be the ones you expect, the
next 3 will be the padding, and only then will you get the 2nd 5.

Ciao
 

Grizlyk

James said:
int array[10][5]= {99};

What does that change?  You have different initial values
(array[0][0] == 99, all other elements == 0).  

I think it is not the best idea to trust that all other elements will be
zero. The default constructor for int will be called, and I have some
compilers which do nothing in that case (trash from previous memory
users will be found in the array). The same goes for pointers
(int*, for example).
C++ has reinterpret_cast for a reason.  I use it, for example,
when implementing things like malloc or garbage collection.  In
such cases, it's a necessary evil.

In anything but such low level (architecture dependent)
programming, of course, it's a guaranteed problem, if only for
reasons of readability.

One can make separate namespaces to hold code specific to each
architecture and later write a using-declaration for the correct one. A
general (suitable for all platforms, but with bad performance) form of
the code can often also exist in its own namespace.

Maksim A. Polyanin
old page about some C++ improvements:
http://grizlyk1.narod.ru/cpp_new
 

Default User

Grizlyk said:
James said:
int array[10][5]= {99};

What does that change?  You have different initial values
(array[0][0] == 99, all other elements == 0).  

I think it is not the best idea to trust that all other elements will be
zero. The default constructor for int will be called, and I have some
compilers which do nothing in that case (trash from previous memory
users will be found in the array). The same goes for pointers
(int*, for example).


That's not correct. If an incomplete initializer is used, all other
elements will be initialized as if they were set to 0. I don't know if
the C++ standard covers it specifically (my copy is at work), as it's
inherited from C. Here's the C99 draft standard on the issue:

[#21] If there are fewer initializers in a brace-enclosed
list than there are elements or members of an aggregate, or
fewer characters in a string literal used to initialize an
array of known size than there are elements in the array,
the remainder of the aggregate shall be initialized
implicitly the same as objects that have static storage
duration.
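
A quick way to see that rule in action (my own snippet, not from the
post):

#include <iostream>

int main()
{
    int a[5] = {99};   // only the first element is given explicitly

    for (int i = 0; i < 5; ++i)
        std::cout << a[i] << " ";   // prints: 99 0 0 0 0
    std::cout << std::endl;
}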




Brian
 

Grizlyk

Ioannis said:
I am asking if it is a *valid* low-level behaviour,
and not an undefined behaviour. We could use

int (*p)[5]= static_cast<int (*)[5]>
(static_cast<void *>(&array[0]));

instead of the reinterpret_cast.

My question (actually what I think I know and I want
others to verify) is: a built-in array of type T
is a sequence of its members of type T,
and thus we can treat it as arrays of various forms.

The standard requires that, for type "char", hardware memory is always
logically (for C++) a continuous area (without holes). Each "char" has
its own unique address (the value of a "char" pointer), and the
addresses are ordered as ordinary consecutive numbers, like 1, 2, 3, etc.

As I understand it, the "char" addresses can be translated behind the
scenes into specific hardware addresses. I am not sure about the
standard's limits on this hidden translation, but hardware memory can be
designed as banks (planes) of memory, as memory with holes, etc.

For example, a "char" pointer with numeric value "1" could be translated
by the compiler into a hardware address with another numeric value,
"23". I am not sure about the return value of reinterpret_cast<int>(char*)
in that case, "1" or "23", but for other purposes the value of the
"char" pointer is "1".

A C-style array is logically (for the programmer) a continuous area
(without holes). I do not know whether the standard requires this
condition, but I do not see compilers that do otherwise.

In accordance with the rules for "char" pointers listed above, and in
accordance with the fact that array[1] means pointer+1, array[2] means
pointer+2, etc., an array of char is logically (for C++ and the
programmer) a continuous area (without holes).

So the area of an array of "char" can be safely divided into any logical
dimensions within the total area size, because access to each member is
independent of the dimensions: a[y][x] is *(&a[0][0] + y*x_size + x). I
do not know why C++ does not allow static_cast for these conversions.

For other types, with size greater than "char", there is alignment. But
once aligned, each element of the array could be accessible via the
initial aligned pointer - all similar to type "char". "Could", because I
do not know whether the standard requires the same conditions (being a
continuous area of memory) for any type and its pointer as it does for
"char" and "char" pointers.

I think the dimensions of a C-style array are logical, and you always
have a solid array of size y_size*x_size.

Maksim A. Polyanin
old page about some C++ improvements:
http://grizlyk1.narod.ru/cpp_new
 
