In linear algebra, the Strassen algorithm, named for Volker Strassen, is an algorithm for matrix multiplication. It is faster than the standard matrix multiplication algorithm and is useful in practice for large matrices, though it is slower than the fastest known algorithms for extremely large matrices.
The Strassen algorithm works over any ring, such as the usual plus/multiply arithmetic, but not over all semirings, such as min-plus or Boolean algebra, where the naive algorithm still works (so-called combinatorial matrix multiplication).
Strassen’s Algorithm
The most mind-blowing fact about this algorithm is that someone attempted it.
Let me show you what I mean.
First, we need to know about matrix multiplication. Let's write down two 3-by-3 matrices:

$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \qquad \begin{pmatrix} 2 & 1 & 1 \\ 2 & 3 & 1 \\ 5 & 2 & 4 \end{pmatrix}$$
Let's call the left matrix $A$ and the right $B$. We define matrix multiplication between these two as a calculation which results in another matrix with the same number of rows as $A$ and the same number of columns as $B$. The definition dictates we calculate each element of the result as:

$$c_{ij} = \sum_{k=1}^{3} a_{ik} b_{kj}$$
Let's call the resulting matrix $C$. So if you'd like to calculate the element of $C$ that's in the second row and first column, you first select out the second row of $A$ and the first column of $B$. Then you multiply each pair of elements together and sum them up:

$$c_{21} = a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} = 4 \cdot 2 + 5 \cdot 2 + 6 \cdot 5 = 48$$

This is similarly defined for all other elements of $C$. The result is:

$$C = \begin{pmatrix} 21 & 13 & 15 \\ 48 & 31 & 33 \\ 75 & 49 & 51 \end{pmatrix}$$
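If it's easier to see in code, here's a minimal Python sketch of this definition (the function name naive_matmul is just mine, not a library routine):

```python
def naive_matmul(A, B):
    """Multiply two square matrices straight from the definition."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Dot product of row i of A with column j of B.
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[2, 1, 1], [2, 3, 1], [5, 2, 4]]
print(naive_matmul(A, B)[1][0])  # second row, first column: 48
```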
To understand why Strassen's Algorithm is interesting, we need to count how many simple add/multiply calculations are involved in producing $C$. Well, that's not hard. To produce that number 48, we had 3 multiplications and 2 additions, for a total of 5 simple calcs. Since there are 9 elements in $C$, we have a total of $9 \cdot 5 = 45$ calculations.
To make progress, we need to get more general. Let's say $A$ and $B$ are $n$-by-$n$ matrices[1]. Each element of the result takes $n$ multiplications and $n - 1$ additions, and there are $n^2$ elements to fill in, so the formula for the total number of calculations is:

$$T(n) = n^2(2n - 1) = 2n^3 - n^2$$
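As a quick sanity check on that formula, we can make Python do the bookkeeping (count_ops is just an illustrative helper):

```python
def count_ops(n):
    """Operation count for the naive algorithm on n-by-n matrices."""
    mults = n * n * n         # n multiplications for each of the n^2 elements
    adds = n * n * (n - 1)    # n - 1 additions for each element
    return mults + adds

print(count_ops(3))    # 45, matching the 3-by-3 count above
print(count_ops(100))  # 1990000, which is 2*100**3 - 100**2
```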
Now, when some mathy people look at this, they want to drop the specifics and consider only the dominant behavior as $n$ gets large. In other words, they care about the generic speed with which this function grows. For this case, they would say $T(n)$ grows in proportion to $n^3$. More technically, they say:
$T(n)$ is in the set $O(n^3)$, which means you could choose a constant $c$ such that $T(n) \le c \cdot n^3$ for all $n$ beyond a certain point. Just think of this as a rigorous way of pointing out the highest-exponent term in $T(n)$.
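For our case, it's easy to see such a constant directly:

$$T(n) = 2n^3 - n^2 \le 2n^3 \quad \text{for all } n \ge 1$$

so $c = 2$ does the job.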
So let's think about that statement: $T(n)$ grows in proportion to $n^3$. It has to, right? The result, $C$, has $n^2$ elements and we have to calculate each in turn. For each turn, we have some near-multiple of $n$ calculations to do. Therefore, this must amount to something on the order of $n^3$ calculations... right?
No.
Strassen's algorithm was the first strike to chip away at the shell, showing us that we could get the cost of matrix multiplication to grow at a rate less than $n^3$.
That is absolutely wild. How the hell are we avoiding the work of matrix multiplication that seems baked into its definition?
I have no clue, and no one suspected it was worth an attempt until Volker Strassen came along. His algorithm showed it could be done with a cost near $O(n^{2.81})$. This sparked a flurry of research, and we've since pushed the exponent down further, to below $2.38$ today.
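If you're curious what the trick looks like in practice, here's a minimal Python sketch of Strassen's recursive scheme, assuming square matrices whose size is a power of two (general shapes are usually handled by padding):

```python
import numpy as np

def strassen(A, B):
    """Strassen's recursive multiplication for 2^k-by-2^k matrices."""
    n = A.shape[0]
    if n == 1:
        return A * B  # base case: a plain scalar multiply
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # Seven half-size products instead of the eight a naive block split needs.
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)

    # Reassemble the four quadrants of the result.
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A = np.random.randint(0, 10, (8, 8))
B = np.random.randint(0, 10, (8, 8))
assert (strassen(A, B) == A @ B).all()  # agrees with NumPy's multiplication
```

Each call replaces eight half-size multiplications with seven, so the cost satisfies $T(n) = 7\,T(n/2) + O(n^2)$, which resolves to $O(n^{\log_2 7}) \approx O(n^{2.81})$.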
I’ll say it again, that is absolutely wild.
Footnotes
[1] Strassen’s algorithm isn’t specific to $n$-by-$n$ matrices. It just simplifies my explanation.