[WPI] [cs2223] [cs2223 text] [News] [Syllabus] [Classes]
Matrix multiplication is an associative binary operation. That means that matrices are multiplied two at a time and the order in which a string of matrix multiplications is performed does not alter the result. We usually show this by means of parentheses:

$$(AB)C = A(BC)$$
Also, the number of matrix multiplications remains invariant - N matrices require N-1 matrix multiplications, no matter what the order. What does change with the ordering of the matrix multiplications, however, is the number of individual matrix element multiplications required.
In multiplying two matrices, the number of columns in the first and the number of rows in the second must match - this means that matrix multiplication is generally not commutative (the order counts).
The numbers in parentheses after a matrix name are its order - the number of rows and columns, respectively. Each element of the product is a sum of individual products, and the number of those products equals the common dimension - the number of columns in the first matrix, which is also the number of rows in the second:

$$c_{ij} = \sum_{k} a_{ik}\, b_{kj}$$

And, since there are three possible values of i and three possible values of j in this example, the product has nine elements, and the number of multiplications of matrix elements is nine times the common dimension.
In general, multiplying a (p x q) matrix by a (q x r) matrix requires p·q·r individual multiplications.
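The count can be checked directly by instrumenting the schoolbook algorithm. The following is a minimal sketch, not a tuned implementation: the function name `multiply_counting` and the small (3 x 2) and (2 x 3) test matrices are assumptions made for the illustration.

```python
def multiply_counting(a, b):
    """Multiply two matrices (lists of rows) and count element multiplications."""
    p, q, r = len(a), len(b), len(b[0])
    if any(len(row) != q for row in a):
        raise ValueError("columns of the first must match rows of the second")
    product = [[0] * r for _ in range(p)]
    count = 0
    for i in range(p):              # one row of the product at a time
        for j in range(r):          # one column of the product at a time
            for k in range(q):      # sum over the common dimension
                product[i][j] += a[i][k] * b[k][j]
                count += 1
    return product, count


# Hypothetical example matrices, chosen only for illustration.
a = [[1, 2], [3, 4], [5, 6]]        # (3 x 2)
b = [[1, 0, 2], [0, 1, 3]]          # (2 x 3)
product, count = multiply_counting(a, b)
print(product)
print(count, "element multiplications; p*q*r =", 3 * 2 * 3)
```

The innermost statement runs once for every combination of i, j and k, which is where the product of the three dimensions comes from.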
The order in which we multiply the matrices in a chain of matrix multiplications can greatly influence the number of multiplications which are required. As an example, let's multiply these five matrices:
matrix | order
A | (100 x 1)
B | (1 x 99)
C | (99 x 1)
D | (1 x 85)
E | (85 x 1)
If we consider two ways of splitting up the multiplications, we get two very different numbers of element multiplications:
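To make the contrast concrete, here is a minimal Python sketch that counts element multiplications for a given parenthesization. The two orderings it compares (strictly left to right, and A((BC)(DE))) are chosen for illustration and are an assumption, as are the names `ORDERS` and `chain_cost`.

```python
# Orders of the five matrices from the table, as (rows, columns).
ORDERS = {"A": (100, 1), "B": (1, 99), "C": (99, 1), "D": (1, 85), "E": (85, 1)}


def chain_cost(expr):
    """Return ((rows, cols), multiplications) for a nested-tuple expression.

    A string names a single matrix; a 2-tuple (left, right) is a product.
    """
    if isinstance(expr, str):
        return ORDERS[expr], 0
    left, right = expr
    (p, q), left_cost = chain_cost(left)
    (q2, r), right_cost = chain_cost(right)
    if q != q2:
        raise ValueError("inner dimensions do not match")
    # A (p x q) times (q x r) product costs p*q*r element multiplications.
    return (p, r), left_cost + right_cost + p * q * r


left_to_right = (((("A", "B"), "C"), "D"), "E")   # (((AB)C)D)E
grouped = ("A", (("B", "C"), ("D", "E")))         # A((BC)(DE))

left_to_right_cost = chain_cost(left_to_right)[1]
grouped_cost = chain_cost(grouped)[1]
print("(((AB)C)D)E :", left_to_right_cost)
print("A((BC)(DE)) :", grouped_cost)
```

Both expressions produce the same (100 x 1) result; only the amount of arithmetic differs.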
We want to find the multiplication order which optimizes matrix multiplication - the one which requires the fewest multiplications of matrix elements. To get an idea of how this works, look at the number of ways to do the final multiplication:

$$A(BCDE) \qquad (AB)(CDE) \qquad (ABC)(DE) \qquad (ABCD)E$$

We use the notation mij to denote the optimal number of multiplications for the subsequence of matrices from i to j, so m15 is the optimal number for the whole chain.

And let the vector dk denote the dimensions of the matrices in the list - it has one more entry than the number of matrices, because matrix i has order (d(i-1) x di). In our example, here is the d vector:

$$d = (d_0, d_1, d_2, d_3, d_4, d_5) = (100, 1, 99, 1, 85, 1)$$
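Then there are four possible ways to do the last multiplication and there are four corresponding values for the optimal number of element multiplications:

$$\begin{aligned}
A(BCDE):&\quad m_{11} + m_{25} + d_0 d_1 d_5 \\
(AB)(CDE):&\quad m_{12} + m_{35} + d_0 d_2 d_5 \\
(ABC)(DE):&\quad m_{13} + m_{45} + d_0 d_3 d_5 \\
(ABCD)E:&\quad m_{14} + m_{55} + d_0 d_4 d_5
\end{aligned}$$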
Since we want the optimal value for m15, we need to take the minimum of the above four values. And, of course, we need to find the optimum way to place the parentheses in each of the two sub-sequences to calculate the rest of the values which are combined to find m15.

This problem has two of the elements of dynamic programming: a cost function which is calculated recursively (to avoid having to recalculate intermediate values) and a way to combine optimal sub-solutions to form an optimal solution. The only thing missing is a way to work backwards from the optimal value of m15 and determine which parentheses placement produced the optimal value. We cannot do it from the optimal value alone. The above case shows there is one of four possible routes which led to the optimal solution. In a problem with N matrices, there are N-1 possible last steps.

So, we will use two matrices - one to keep track of the optimal values as we calculate them recursively, and one to keep track of which path led to the optimal solution.
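The above solution for m15 can be generalized to calculate mij, the optimal number of element multiplications necessary to multiply the subsequence of matrices i through j:

$$m_{ij} = \min_{0 \le k \le j-i-1} \left( m_{i,\,i+k} + m_{i+k+1,\,j} + d_{i-1}\, d_{i+k}\, d_j \right), \qquad m_{ii} = 0$$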
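The intermediate value k can take on all values between 0 and j-i-1. If we plug the values i=1 and j=5 into the above equation, we get the four expressions for the last multiplication in our example, one for each of k = 0, 1, 2, 3:

$$m_{15} = \min_{0 \le k \le 3} \left( m_{1,\,1+k} + m_{2+k,\,5} + d_0\, d_{1+k}\, d_5 \right)$$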
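The recursive definition translates almost word for word into code. This is a minimal sketch under stated assumptions: the d vector is indexed from 0 so that matrix i has order (d[i-1] x d[i]), the function name `m` simply mirrors the mij notation, and `functools.lru_cache` stands in for the m matrix as the store of already-calculated values.

```python
from functools import lru_cache

# d vector of the example: matrix i has order (d[i-1] x d[i]).
d = (100, 1, 99, 1, 85, 1)


@lru_cache(maxsize=None)
def m(i, j):
    """Optimal number of element multiplications for matrices i..j (1-based)."""
    if i == j:
        return 0  # a single matrix needs no multiplication
    return min(
        m(i, i + k) + m(i + k + 1, j) + d[i - 1] * d[i + k] * d[j]
        for k in range(j - i)  # k = 0 .. j-i-1
    )


# The four candidate values for the last multiplication (i=1, j=5).
last_step = [m(1, 1 + k) + m(2 + k, 5) + d[0] * d[1 + k] * d[5] for k in range(4)]
for k, value in enumerate(last_step):
    print("k =", k, "->", value)
print("m15 =", m(1, 5))
```

Without the cache, the same sub-problems would be recomputed many times; with it, each (i, j) pair is evaluated once.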
We cannot calculate these four expressions to find the minimum without knowing the other values of mij. We find those values recursively and fill in our m matrix:

$$m = \begin{pmatrix}
m_{11} & m_{12} & m_{13} & m_{14} & m_{15} \\
 & m_{22} & m_{23} & m_{24} & m_{25} \\
 & & m_{33} & m_{34} & m_{35} \\
 & & & m_{44} & m_{45} \\
 & & & & m_{55}
\end{pmatrix}$$

We've left off everything below the main diagonal because a subsequence cannot end before it begins! Note, also, that subsequences of the same length all lie on the same diagonal, because the length is just j-i+1.
Now we fill in the matrix, recursively. The subsequences of length one (main diagonal) all have value zero. No multiplication is involved in dealing with a subsequence of length one (all the magic happens in producing the other subsequence and in the multiplication of the two subsequences).
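The entries of length 2 all involve just the d vector values:

$$\begin{aligned}
m_{12} &= d_0 d_1 d_2 = 100 \cdot 1 \cdot 99 = 9900 \\
m_{23} &= d_1 d_2 d_3 = 1 \cdot 99 \cdot 1 = 99 \\
m_{34} &= d_2 d_3 d_4 = 99 \cdot 1 \cdot 85 = 8415 \\
m_{45} &= d_3 d_4 d_5 = 1 \cdot 85 \cdot 1 = 85
\end{aligned}$$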
Now we use the recursive equation above to find the rest of the values, one length at a time.
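The calculations for the subsequences of length 3 all use k values between 0 and 1:

$$\begin{aligned}
m_{13}:&\quad k=0:\ m_{11} + m_{23} + d_0 d_1 d_3 = 0 + 99 + 100 = 199 \\
       &\quad k=1:\ m_{12} + m_{33} + d_0 d_2 d_3 = 9900 + 0 + 9900 = 19800 \\
m_{24}:&\quad k=0:\ m_{22} + m_{34} + d_1 d_2 d_4 = 0 + 8415 + 8415 = 16830 \\
       &\quad k=1:\ m_{23} + m_{44} + d_1 d_3 d_4 = 99 + 0 + 85 = 184 \\
m_{35}:&\quad k=0:\ m_{33} + m_{45} + d_2 d_3 d_5 = 0 + 85 + 99 = 184 \\
       &\quad k=1:\ m_{34} + m_{55} + d_2 d_4 d_5 = 8415 + 0 + 8415 = 16830
\end{aligned}$$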
The optimal values (minima) are:

$$m_{13} = 199 \ (k=0), \qquad m_{24} = 184 \ (k=1), \qquad m_{35} = 184 \ (k=0)$$
The first equation means that the optimal way to do the multiplication of the first three matrices is

$$A(BC)$$

and 199 is the number of element multiplications required. The value k=0 is what tells us to put the interior parentheses after matrix A, since A is the first matrix in the subsequence and k+1 = 1. Similarly, the other two results tell us the best ways to multiply the other two subsequences of length 3:

$$(BC)D \qquad C(DE)$$
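As we stated above, we need to keep track of how we got each minimum, so let's form a k matrix, too. It has the same layout as the m matrix and holds the k value which produced each minimum. So far its entries are:

$$k_{13} = 0, \qquad k_{24} = 1, \qquad k_{35} = 0$$

We only need the k values for lengths 3 and above (a subsequence of length 2 can be split in only one way), so even more of it is left blank.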
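The calculations for the subsequences of length 4 use k values between 0 and 2:

$$\begin{aligned}
m_{14}:&\quad k=0:\ m_{11} + m_{24} + d_0 d_1 d_4 = 0 + 184 + 8500 = 8684 \\
       &\quad k=1:\ m_{12} + m_{34} + d_0 d_2 d_4 = 9900 + 8415 + 841500 = 859815 \\
       &\quad k=2:\ m_{13} + m_{44} + d_0 d_3 d_4 = 199 + 0 + 8500 = 8699 \\
m_{25}:&\quad k=0:\ m_{22} + m_{35} + d_1 d_2 d_5 = 0 + 184 + 99 = 283 \\
       &\quad k=1:\ m_{23} + m_{45} + d_1 d_3 d_5 = 99 + 85 + 1 = 185 \\
       &\quad k=2:\ m_{24} + m_{55} + d_1 d_4 d_5 = 184 + 0 + 85 = 269
\end{aligned}$$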
The optimal values (minima) are:

$$m_{14} = 8684 \ (k=0), \qquad m_{25} = 185 \ (k=1)$$
These correspond, respectively, to the following ways of multiplying the subsequences of length 4:

$$A((BC)D) \qquad (BC)(DE)$$
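Put these values into the m and k matrices:

$$m = \begin{pmatrix}
0 & 9900 & 199 & 8684 & \\
 & 0 & 99 & 184 & 185 \\
 & & 0 & 8415 & 184 \\
 & & & 0 & 85 \\
 & & & & 0
\end{pmatrix}
\qquad
k = \begin{pmatrix}
 & & 0 & 0 & \\
 & & & 1 & 1 \\
 & & & & 0 \\
 & & & & \\
 & & & &
\end{pmatrix}$$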
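The calculation for the one subsequence of length 5 uses k values between 0 and 3. The equations were given above, so we only have to put in the values, which have now been calculated:

$$\begin{aligned}
k=0:&\quad m_{11} + m_{25} + d_0 d_1 d_5 = 0 + 185 + 100 = 285 \\
k=1:&\quad m_{12} + m_{35} + d_0 d_2 d_5 = 9900 + 184 + 9900 = 19984 \\
k=2:&\quad m_{13} + m_{45} + d_0 d_3 d_5 = 199 + 85 + 100 = 384 \\
k=3:&\quad m_{14} + m_{55} + d_0 d_4 d_5 = 8684 + 0 + 8500 = 17184
\end{aligned}$$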
The optimal value (minimum) is:

$$m_{15} = 285 \ (k=0)$$
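The final matrices are:

$$m = \begin{pmatrix}
0 & 9900 & 199 & 8684 & 285 \\
 & 0 & 99 & 184 & 185 \\
 & & 0 & 8415 & 184 \\
 & & & 0 & 85 \\
 & & & & 0
\end{pmatrix}
\qquad
k = \begin{pmatrix}
 & & 0 & 0 & 0 \\
 & & & 1 & 1 \\
 & & & & 0 \\
 & & & & \\
 & & & &
\end{pmatrix}$$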
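The length-by-length procedure used above can be sketched as a bottom-up loop. This is an illustrative sketch rather than a definitive implementation: the function name `matrix_chain` and the 1-based list-of-lists layout are assumptions, and unlike the hand calculation it also records the (only possible) k=0 for subsequences of length 2.

```python
def matrix_chain(d):
    """Fill the m and k tables one diagonal (subsequence length) at a time.

    d is the dimension vector: matrix i has order (d[i-1] x d[i]).
    Tables are 1-based; row and column 0 are unused.
    """
    n = len(d) - 1
    m = [[None] * (n + 1) for _ in range(n + 1)]
    k_best = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        m[i][i] = 0                      # length 1: no multiplication
    for length in range(2, n + 1):       # lengths 2, 3, ..., n
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(j - i):       # k = 0 .. j-i-1
                cost = m[i][i + k] + m[i + k + 1][j] + d[i - 1] * d[i + k] * d[j]
                if m[i][j] is None or cost < m[i][j]:
                    m[i][j] = cost       # keep the minimum ...
                    k_best[i][j] = k     # ... and the k which produced it
    return m, k_best


m, k = matrix_chain([100, 1, 99, 1, 85, 1])
for i in range(1, 6):
    print([m[i][j] for j in range(1, 6)], [k[i][j] for j in range(1, 6)])
```

Each row printed shows one row of the m table followed by the matching row of the k table; `None` marks the entries below the diagonal that are never used.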
This tells us that a solution to multiplying the five matrices can be found which involves 285 multiplications of matrix elements. The k=0 for m15 tells us that the first set of internal parentheses comes after the first matrix:

$$A(BCDE)$$
Next the subsequences are recursively parsed into their subsequences using the same matrices. The two subsequences correspond to m11 and m25. The first one cannot be subdivided any further (that is what the m11=0 is telling us). The m25 has k=1, which tells us that the next set of internal parentheses goes after matrix 3:

$$A((BC)(DE))$$
The next parsing is of m23, but the fact that there is no entry in the k matrix tells us that we need go no further.
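The backward parse can be sketched as a short recursive function. In this sketch the k values are copied by hand from the worked example into a dictionary, and the names `K`, `NAMES`, `wrap` and `parenthesize` are assumptions made for the illustration.

```python
NAMES = "ABCDE"

# k values for subsequences of length 3 and above, keyed by (i, j).
K = {(1, 3): 0, (2, 4): 1, (3, 5): 0, (1, 4): 0, (2, 5): 1, (1, 5): 0}


def wrap(text):
    """Parenthesize a factor unless it is a single matrix."""
    return text if len(text) == 1 else "(" + text + ")"


def parenthesize(i, j):
    """Return the optimal grouping of matrices i..j (1-based) as a string."""
    if i == j:
        return NAMES[i - 1]                  # a single matrix
    if j == i + 1:
        return NAMES[i - 1] + NAMES[j - 1]   # length 2: no k entry needed
    k = K[(i, j)]
    left = parenthesize(i, i + k)            # matrices i .. i+k
    right = parenthesize(i + k + 1, j)       # matrices i+k+1 .. j
    return wrap(left) + wrap(right)


result = parenthesize(1, 5)
print(result)
```

The recursion stops at exactly the two places the text describes: a single matrix, and a pair for which the k matrix has no entry.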