valarray <vallaray<T> > efficiency

J

john

Hi, in TC++PL 3 on pages 674-675 it is mentioned:


"Maybe your first idea for a two-dimensional vector was something like this:

class Matrix {
valarray< valarray<double> >v;
public:
// ...
};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".


However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.
 
K

kwikius

Hi, in TC++PL 3 on pages 674-675 it is mentioned:

"Maybe your first idea for a two-dimensional vector was something like this:

class Matrix {
      valarray< valarray<double> >v;
public:
       // ...

};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".

However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.

My 2p's worth.. FWIW

Because a valarray is not a compile time fixed size it must? use heap
allocation, hence a matrix as view of list approach will likely only
require one allocation, whereas matrix as sequence of sequence will
require multiple allocations.

I believe even now that compilers find it difficult to do much
optimisation on heap pointers( even wary of stack pointers and
anything complicated in references), so it probably makes sense to use
as few pointers as possible, then the compiler only needs to deal with
a single offset.

Anyway its an interesting issue :)

regards
Andy Little
 
I

Ioannis Vranos

kwikius said:
My 2p's worth.. FWIW

Because a valarray is not a compile time fixed size it must? use heap
allocation, hence a matrix as view of list approach will likely only
require one allocation, whereas matrix as sequence of sequence will
require multiple allocations.

I believe even now that compilers find it difficult to do much
optimisation on heap pointers( even wary of stack pointers and
anything complicated in references), so it probably makes sense to use
as few pointers as possible, then the compiler only needs to deal with
a single offset.

Anyway its an interesting issue :)


Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc are implemented as continuous
sequences,wouldn't the following be more efficient or at least the same
efficient as the usage of valarrays and slices/slice_arrays to
"simulate" a 2-dimensional matrix?


#include <iostream>
#include <valarray>



int main()
{
using namespace std;

valarray<int> vi(1, 10);

int (*p)[5]= reinterpret_cast<int (*)[5]> (&vi[0]);

for (size_t i= 0; i< 2; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[j]<<" ";

cout<< endl;

}


Here p behaves a s a 2-dimensional matrix, that is a 2x5 matrix.
 
V

Victor Bazarov

Ioannis said:
Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc are implemented as continuous
sequences,wouldn't the following be more efficient or at least the
same efficient as the usage of valarrays and slices/slice_arrays to
"simulate" a 2-dimensional matrix?


Given that 'reinterpret_cast' is not supposed to be used that way,
the following has undefined behaviour. Aside from that...
#include <iostream>
#include <valarray>



int main()
{
using namespace std;

valarray<int> vi(1, 10);

int (*p)[5]= reinterpret_cast<int (*)[5]> (&vi[0]);

for (size_t i= 0; i< 2; ++i)
for(size_t j=0; j<5; ++j)
cout<< p[j]<<" ";

cout<< endl;

}


Here p behaves a s a 2-dimensional matrix, that is a 2x5 matrix.


V
 
I

Ioannis Vranos

Victor said:
Given that 'reinterpret_cast' is not supposed to be used that way,


Apart from the approach:

static_cast<int (*)[5]> ( static_cast<void *> (&vi[0]) );


why reinterpret_cast "is not supposed to be used that way"?
 
V

Victor Bazarov

Ioannis said:
Victor said:
Given that 'reinterpret_cast' is not supposed to be used that way,


Apart from the approach:

static_cast<int (*)[5]> ( static_cast<void *> (&vi[0]) );


why reinterpret_cast "is not supposed to be used that way"?

If you reinterpret_cast one object pointer into another object
pointer, the Standard only specifies that the resulting pointer
can be converted back to the original type. Please see Standard,
[expr.reinterpret.cast]/7. Whatever you think you can do with
the pointer, is not in the langauge specification, literally.

As to the two consequitive static_cast operations, the result
of those is also unspecified, since you're not converting the
result of static_cast<void*> back to the original type. See
[expr.static.cast].

V
 
K

kwikius

Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc are implemented as continuous
sequences,wouldn't the following be more efficient or at least the same
efficient as the usage of valarrays and slices/slice_arrays to
"simulate" a 2-dimensional matrix?

The more interesting question to me is how necessary runtime
resizability of matrices is. Because if they are not resizeable then
there is no need to allocate from the heap and valarray is unnecessary
(you can simple use a wrapped c-style array), in which case the
compiler has much better opportunities for optimisation.

And N.B some might call an e.g array of 3dpoints a matrix where I
would see it as an array of matrices where 1 dimension is equal to 1
dependent on the local convention.

regards
Andy Little
 
E

Erik Wikström

Hi, in TC++PL 3 on pages 674-675 it is mentioned:


"Maybe your first idea for a two-dimensional vector was something like this:

class Matrix {
valarray< valarray<double> >v;
public:
// ...
};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".


However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.

For various reasons valarray never became what it was meant to be, and
few people use it. Since few use it the library writers have not
bothered much with it so I would not be surprised if the code is more or
less the same as it was back in 1998. And even if they had it would
still not be able to match the efficiency of slices since a valarray of
valarrays adds an additional layer of indirection.
 
I

Ioannis Vranos

Victor said:
Ioannis said:
Victor said:
Given that 'reinterpret_cast' is not supposed to be used that way,

Apart from the approach:

static_cast<int (*)[5]> ( static_cast<void *> (&vi[0]) );


why reinterpret_cast "is not supposed to be used that way"?

If you reinterpret_cast one object pointer into another object
pointer, the Standard only specifies that the resulting pointer
can be converted back to the original type. Please see Standard,
[expr.reinterpret.cast]/7. Whatever you think you can do with
the pointer, is not in the langauge specification, literally.


There, it is mentioned:

"A pointer to an object can be explicitly converted to a pointer to an
object of different type. 65)".

"65) The types may have different cv-qualifiers, subject to the overall
restriction that a reinterpret_cast cannot cast away const-
ness".


As to the two consequitive static_cast operations, the result
of those is also unspecified, since you're not converting the
result of static_cast<void*> back to the original type. See
[expr.static.cast].


Yes, I think in the standard, conversions from "pointer to object" to
"pointer to an array of n objects" is not explicitly mentioned, but I
think there isn't any decent implementation where it does not work
whenever the pointed object is member of an array (sequence).

Actually I have the feeling the guarantee exists in the standard and is
probably "buried" in some clause. :)


So we have

int main()
{


int array[5]= {0};

int *p= &array[0];

int (*q)[5]= reinterpret_cast<int (*)[5]> (p);
}



Shouldn't the last always work?
 
I

Ioannis Vranos

kwikius said:
The more interesting question to me is how necessary runtime
resizability of matrices is. Because if they are not resizeable then
there is no need to allocate from the heap and valarray is unnecessary
(you can simple use a wrapped c-style array), in which case the
compiler has much better opportunities for optimisation.


valarray is not a typical container in terms of implementation. It is
intended to be heavily optimised, and it can use raw storage from
anywhere or something else AFAIK.
 
I

Ioannis Vranos

Erik said:
Hi, in TC++PL 3 on pages 674-675 it is mentioned:


"Maybe your first idea for a two-dimensional vector was something like this:

class Matrix {
valarray< valarray<double> >v;
public:
// ...
};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".


However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.

For various reasons valarray never became what it was meant to be, and
few people use it. Since few use it the library writers have not
bothered much with it so I would not be surprised if the code is more or
less the same as it was back in 1998. And even if they had it would
still not be able to match the efficiency of slices since a valarray of
valarrays adds an additional layer of indirection.



How about using "pointer to array" pointing to a member of a valarray as
I mentioned in another post in the thread instead of slices, to
simulate a matrix?
 
V

Victor Bazarov

Ioannis said:
Victor said:
Ioannis said:
Victor Bazarov wrote:
Given that 'reinterpret_cast' is not supposed to be used that way,

Apart from the approach:

static_cast<int (*)[5]> ( static_cast<void *> (&vi[0]) );


why reinterpret_cast "is not supposed to be used that way"?

If you reinterpret_cast one object pointer into another object
pointer, the Standard only specifies that the resulting pointer
can be converted back to the original type. Please see Standard,
[expr.reinterpret.cast]/7. Whatever you think you can do with
the pointer, is not in the langauge specification, literally.


There, it is mentioned:

"A pointer to an object can be explicitly converted to a pointer to an
object of different type. 65)".

"65) The types may have different cv-qualifiers, subject to the
overall restriction that a reinterpret_cast cannot cast away const-
ness".


As to the two consequitive static_cast operations, the result
of those is also unspecified, since you're not converting the
result of static_cast<void*> back to the original type. See
[expr.static.cast].


Yes, I think in the standard, conversions from "pointer to object" to
"pointer to an array of n objects" is not explicitly mentioned, but I
think there isn't any decent implementation where it does not work
whenever the pointed object is member of an array (sequence).

Actually I have the feeling the guarantee exists in the standard and
is probably "buried" in some clause. :)

Find it; everybody will be grateful.

V
 
E

Erik Wikström

Erik said:
Hi, in TC++PL 3 on pages 674-675 it is mentioned:


"Maybe your first idea for a two-dimensional vector was something like this:

class Matrix {
valarray< valarray<double> >v;
public:
// ...
};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".


However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.

For various reasons valarray never became what it was meant to be, and
few people use it. Since few use it the library writers have not
bothered much with it so I would not be surprised if the code is more or
less the same as it was back in 1998. And even if they had it would
still not be able to match the efficiency of slices since a valarray of
valarrays adds an additional layer of indirection.



How about using "pointer to array" pointing to a member of a valarray as
I mentioned in another post in the thread instead of slices, to
simulate a matrix?

If you want a matrix there are a number of libraries out there, or you
can write your own. In the end it usually ends up with a contiguous
piece of memory that is somehow divided into rown/columns and some
simple arithmetic is used to get the correct element. Besides, these
kinds of optimisations usually have a smaller impact than expected in
real world applications where operations are usually performed on the
whole matrix and where the real costs comes from creation of
temporaries. That is why expression templates are used by all libraries
aiming for high performance.
 
E

Erik Wikström

valarray is not a typical container in terms of implementation. It is
intended to be heavily optimised, and it can use raw storage from
anywhere or something else AFAIK.

That was the intention, however, as I mentioned elsethread, that is not
the case. I do not know of any high-performance applications (and I
doubt that they exist) where valarrays are used. My understanding is
that valarray is designed to allow for optimisations (such as utilising
SIMD instructions), but no implementation have made those optimisations.
 
I

Ioannis Vranos

Erik said:
That was the intention, however, as I mentioned elsethread, that is not
the case. I do not know of any high-performance applications (and I
doubt that they exist) where valarrays are used. My understanding is
that valarray is designed to allow for optimisations (such as utilising
SIMD instructions), but no implementation have made those optimisations.


So, what kind of container is used for high performance C++ computing
instead? Those template math libraries on the web?
 
K

kwikius

valarray is not a typical container in terms of implementation. It is
intended to be heavily optimised, and it can use raw storage from
anywhere or something else AFAIK.

In practise I reckon you will find that implementation of valarray
uses "new" to allocate, after all if this cool allocation mechanism
exists , you might as well use it for new too..

regards
Andy Little
 
E

Erik Wikström

So, what kind of container is used for high performance C++ computing
instead? Those template math libraries on the web?

For a non-sparse NxM matrix of type double I would assume that they
simply allocate an array of N*M elements. But I have not looked under
the hood on any of them.
 
I

Ioannis Vranos

Victor said:
>
Ioannis said:
There, it is mentioned:

"A pointer to an object can be explicitly converted to a pointer to an
object of different type. 65)".

"65) The types may have different cv-qualifiers, subject to the
overall restriction that a reinterpret_cast cannot cast away const-
ness". > [...]
Actually I have the feeling the guarantee exists in the standard and
is probably "buried" in some clause. :)

Find it; everybody will be grateful.


I think the above quotation from the standard is sufficient.
 
V

Victor Bazarov

Ioannis said:
Victor said:
Ioannis said:
There, it is mentioned:

"A pointer to an object can be explicitly converted to a pointer to
an object of different type. 65)".

"65) The types may have different cv-qualifiers, subject to the
overall restriction that a reinterpret_cast cannot cast away const-
ness". [...]
Actually I have the feeling the guarantee exists in the standard and
is probably "buried" in some clause. :)

Find it; everybody will be grateful.


I think the above quotation from the standard is sufficient.

No, it's not. You conveniently omitted the rest of the paragraph
7, following the "type. 65)". Read it. The result of such conversion
is unspecified. That means no guarantees are given. Where you get
your "feeling the guarantee exists", I cannot imagine. If you find it
elsewhere, do post about it. Until you do, the guarantee that you
speak of does NOT exist, and that is expressed in the second sentence
of the paragraph 7 of [expr.reinterpret.cast].

V
 
I

Ioannis Vranos

Victor said:
No, it's not. You conveniently omitted the rest of the paragraph
7, following the "type. 65)". Read it. The result of such conversion
is unspecified. That means no guarantees are given. Where you get
your "feeling the guarantee exists", I cannot imagine. If you find it
elsewhere, do post about it. Until you do, the guarantee that you
speak of does NOT exist, and that is expressed in the second sentence
of the paragraph 7 of [expr.reinterpret.cast].


Doesn't the standard mention that all kinds of automatic storage/heap
arrays are stored in a sequence?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,053
Latest member
BrodieSola

Latest Threads

Top