class operator optimization

aaragon

Hello everyone,

I've been working on a Matrix class that I borrowed from Bjarne
Stroustrup's book, following the optimization he describes in
Section 22.4.6 (the one that deals with temporaries, copying and
loops). I wanted to implement that kind of optimization because in my
computation I often evaluate

transpose(A) * B * A

Therefore, I thought it would be really cool to optimize this
computation using his approach. First, I changed the class a bit so
that it takes a template parameter, so it now looks like this:

// forward declarations
template <typename T> class MMmul;
template <typename T> class TM;
template <typename T> class TMMmul;

//! Class used for matrix objects
template <typename T = double>
class Matrix
{
  typedef T ValueType;
  valarray<ValueType>* v_; //!< valarray used to store the matrix elements by rows
  size_t m_;               //!< number of rows
  size_t n_;               //!< number of columns

public:

enum MatrixType {I};

// matrix constructors (no need to show definitions for this post)
Matrix();
explicit Matrix(size_t n, ValueType val = ValueType());
explicit Matrix(size_t n, MatrixType type);
Matrix(size_t m, size_t n, ValueType val = ValueType()) ;
Matrix(const Matrix& prototype);
Matrix& operator=(const Matrix& prototype);

// more complicated constructors (for these I wanted to show the definitions too)
  template <typename U>
  Matrix(const MMmul<U>& MM) {
    m_ = MM.M1_.m_;
    n_ = MM.M2_.n_;
    v_ = new valarray<ValueType>(ValueType(), m_*n_);
    if (MM.M1_.n_ != MM.M2_.m_)
      throw RuntimeError("*** ERROR *** Wrong dimensions for multiplication of matrices.");
    // carry out multiplication
    for (size_t i=0; i<MM.M1_.m_; ++i) {       // loop over A.m rows
      for (size_t j=0; j<MM.M2_.n_; ++j) {     // loop over B.n columns
        for (size_t k=0; k<MM.M1_.n_; ++k) {   // loop over A.n columns (or B.m rows)
          (*this)(i,j) += MM.M1_(i,k)*MM.M2_(k,j);
        }
      }
    }
  }
  template <typename U>
  Matrix(const TM<U>& tm) {
    m_ = tm.M_.n_;
    n_ = tm.M_.m_;
    v_ = new valarray<ValueType>(ValueType(), m_*n_);
    for (size_t i=0; i<tm.M_.m_; ++i) {
      for (size_t j=0; j<tm.M_.n_; ++j) {
        (*this)(j,i) = tm.M_(i,j);
      }
    }
  }
  template <typename U>
  Matrix(const TMMmul<U>& TMM) {
    // the result of transpose(A)*B has A.n rows and B.n columns
    m_ = TMM.M1_.n_;
    n_ = TMM.M2_.n_;
    v_ = new valarray<ValueType>(ValueType(), m_*n_);
    if (TMM.M1_.m_ != TMM.M2_.m_)
      throw RuntimeError("*** ERROR *** Wrong dimensions for multiplication of matrices.");
    // carry out multiplication
    for (size_t i=0; i<TMM.M1_.n_; ++i) {      // loop over A.n columns (rows of the result)
      for (size_t j=0; j<TMM.M2_.n_; ++j) {    // loop over B.n columns
        for (size_t k=0; k<TMM.M1_.m_; ++k) {  // loop over A.m rows (or B.m rows)
          (*this)(i,j) += TMM.M1_(k,i)*TMM.M2_(k,j);
        }
      }
    }
  }

  ~Matrix() { delete v_; }

  // ... etc. (rest of the class omitted)
};

and then the operators:

// operator for multiplication of two matrices
template <typename T>
inline Matrix<T> operator*(const Matrix<T>& A, const Matrix<T>& B) {
  return MMmul<T>(A,B);
}

// operator for multiplication of two matrices, one of them transposed
template <typename T>
inline Matrix<T> operator*(const TM<T>& A, const Matrix<T>& B) {
  return TMMmul<T>(A,B);
}

// transpose matrix
template <typename T>
inline Matrix<T> transpose(const Matrix<T>& A) {
  return TM<T>(A);
}

and the special classes:

template <typename T>
struct MMmul {
  const Matrix<T>& M1_;
  const Matrix<T>& M2_;
  MMmul(const Matrix<T>& m1, const Matrix<T>& m2) : M1_(m1), M2_(m2)
  { cout << "inside MMmul constructor" << endl; }
};

template <typename T>
struct TM {
  const Matrix<T>& M_;
  TM(const Matrix<T>& m1) : M_(m1)
  { cout << "inside TM constructor" << endl; }
};

template <typename T>
struct TMMmul {
  const Matrix<T>& M1_;
  const Matrix<T>& M2_;
  TMMmul(const TM<T>& tm, const Matrix<T>& m) : M1_(tm.M_), M2_(m)
  { cout << "inside TMMmul constructor" << endl; }
};
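
To make the test concrete, this is roughly how I exercise the classes (just a small sketch, the sizes and values are arbitrary and don't matter here):

// rough driver for the output shown below (sizes/values are arbitrary)
Matrix<double> A(3, 3, 1.0);
Matrix<double> B(3, 3, 2.0);

Matrix<double> C = transpose(A) * B;      // prints the TM constructor, then the MMmul one
Matrix<double> K = transpose(A) * B * A;  // the expression I ultimately want to optimize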

Now, my question. I have defined three helper classes: one that
represents a transposed matrix, one that multiplies a transposed
matrix by a matrix, and one that multiplies two plain matrices. The
problem is that if I do

transpose(A) * B

I get the following output:

inside TM constructor     // this is good
inside MMmul constructor  // why?? it converted the TM to a regular Matrix and entered MMmul! =/

instead of getting

inside TM constructor
inside TMMmul constructor // this is what I want

I assume the compiler just doesn't realize what I want to do: since
the result of TM ends up being a regular Matrix anyway, it uses the
ordinary MMmul multiplication instead of the more optimized TMMmul
path that I want. Is there a way to tell the compiler "Hey! Could you
please enter HERE and not THERE?"
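
For what it's worth, here is a stripped-down sketch of what I suspect is going on (Mat and TMat are just placeholder names standing in for my Matrix and TM): because transpose() is declared to return a Matrix, the TM object seems to get converted before the multiplication is even looked at, so only the plain Matrix*Matrix overload ever applies.

// stripped-down sketch of my suspicion; Mat/TMat are placeholders for Matrix/TM
#include <iostream>
using namespace std;

struct Mat;

struct TMat { };                        // plays the role of TM<T>

struct Mat {                            // plays the role of Matrix<T>
  Mat() { }
  Mat(const TMat&) { cout << "TMat converted to Mat" << endl; }
};

// plays the role of transpose(): declared to return Mat, not TMat
Mat transpose(const Mat&) { return TMat(); }

Mat operator*(const Mat&, const Mat&)  { cout << "plain multiply (MMmul path)" << endl; return Mat(); }
Mat operator*(const TMat&, const Mat&) { cout << "transposed multiply (TMMmul path)" << endl; return Mat(); }

int main() {
  Mat A, B;
  transpose(A) * B;  // prints the conversion and the plain multiply; the TMMmul path is never taken
  return 0;
}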

Sorry for the huge post and thanks for any suggestions.

 
