Strassen’s Algorithm - Explained

In linear algebra, the Strassen algorithm, named for Volker Strassen, is an algorithm for matrix multiplication. It is faster than the standard matrix multiplication algorithm and is useful in practice for large matrices, but it is slower than the fastest known algorithms for extremely large matrices.
The Strassen algorithm works over any ring, such as ordinary plus/multiply arithmetic, but not over every semiring, such as min-plus or Boolean algebra, where the naive algorithm still works and is known as combinatorial matrix multiplication.
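To see what a semiring product looks like, here's a quick Python sketch of the min-plus product (the function name is mine, just for illustration). Multiplication becomes addition and summation becomes taking a minimum; Strassen's recursion needs subtraction, which min-plus lacks, so it doesn't apply there.

```python
def min_plus(A, B):
    """The (min, +) 'matrix product': entry (i, j) is the minimum over k
    of A[i][k] + B[k][j]. Applied to a graph's weight matrix, one such
    product extends shortest paths by one edge."""
    n, p = len(A), len(B[0])
    return [[min(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(p)]
            for i in range(n)]
```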

Strassen’s Algorithm
The most mind-blowing fact of this algorithm is that someone attempted it.
Let me show you what I mean.
First, we need to know about matrix multiplication. Let's write down two 3-by-3 matrices:
[Figure: two example 3-by-3 matrices, A on the left and B on the right]
Let's call the left matrix A and the right B. We define matrix multiplication between these two as a calculation which results in another matrix with the same number of rows as A and the same number of columns as B. The definition dictates that we calculate each element of the result as:
$$C_{ij} = \sum_{k} A_{ik}\, B_{kj}$$

Let's call the resulting matrix C. So if you'd like to calculate the element of C that's in the second row and first column, you first select out the second row of A and the first column of B. Then you multiply the corresponding elements together and sum them up. This is similarly defined for all other elements of C. The result is:
[Figure: the resulting 3-by-3 matrix C]
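If a concrete version of that definition helps, here's a minimal Python sketch of naive matrix multiplication (variable names are mine, not from the original post):

```python
def matmul(A, B):
    """Naive matrix multiplication, straight from the definition:
    C[i][j] is the dot product of row i of A and column j of B."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C
```

The three nested loops are exactly where the costs we're about to count come from.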
To understand why Strassen's Algorithm is interesting, we need to count how many simple add/multiply calculations are involved in producing C. Well, that's not hard. To produce one element, say the 48 in the second row and first column, we had 3 multiplications and 2 additions, for a total of 5 simple calcs. Since there are 9 elements in C, we have a total of 9×5=45 calculations.
To make progress, we need to get more general. Let's say A and B are n-by-n matrices[1]. Then the formula for the total number of calculations is:
$$T(n) = n^2\bigl(n + (n-1)\bigr) = 2n^3 - n^2$$
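As a sanity check, here's that count in code (a throwaway sketch, names mine):

```python
def ops(n):
    # Each of the n*n entries of C costs n multiplications
    # and n - 1 additions.
    return n * n * (n + (n - 1))

print(ops(3))                     # 45, matching the 3-by-3 count above
print(ops(3) == 2 * 3**3 - 3**2)  # True: ops(n) == 2n^3 - n^2
```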
Now, when some mathy people look at this, they want to drop the specifics and consider only the dominant behavior as n gets large. In other words, they care about the generic speed with which this function grows. In this case, they would say T(n) grows in proportion to $n^3$. More technically, they say:
$T(n)$ is in the set $O(n^3)$
which means you could choose a constant c such that $T(n) < c\,n^3$ for all n beyond a certain point. Just think of this as a rigorous way of pointing out the highest-exponent term in T(n).
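Spelled out formally, that's:

$$T(n) \in O(n^3) \iff \exists\, c > 0,\ \exists\, n_0 \ \text{such that}\ T(n) < c\,n^3 \ \text{for all}\ n > n_0$$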

So let's think about that statement: T(n) grows in proportion to $n^3$. It has to, right? The result, C, has $n^2$ elements and we have to calculate each in turn. Each turn involves some near-multiple of n calculations. Therefore, this must amount to something on the order of $n^3$ calculations... right?
No.
Strassen's algorithm was the first strike to chip away at the $n^3$ shell, showing us that we could get the cost of matrix multiplication to grow at a rate less than $n^3$.
That is absolutely wild. How the hell are we avoiding the work of matrix multiplication that seems baked into its definition?
I have no clue, and no one suspected it was worth an attempt until Volker Strassen came along. His algorithm showed it could be done with a cost near $O(n^{2.8})$. This sparked a flurry of research, and we've since made significant progress:
[Figure: the best-known exponent for matrix multiplication dropping over time, from 3 to Strassen's ~2.81 and onward]
I’ll say it again, that is absolutely wild.
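For the curious: the trick is to split each matrix into four quadrants and combine them with seven recursive products instead of the obvious eight, which drives the exponent from 3 down to log2(7) ≈ 2.807. Here's a minimal sketch (mine, not from the original post), assuming the matrices are square with power-of-two size; real implementations pad odd sizes and tune the crossover point:

```python
import numpy as np

def strassen(A, B, leaf=64):
    """Strassen's algorithm for n-by-n matrices, n a power of two.
    Below `leaf`, fall back to ordinary multiplication, since the
    recursion's overhead only pays off for large blocks."""
    n = A.shape[0]
    if n <= leaf:
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # Seven recursive products instead of the naive eight.
    M1 = strassen(A11 + A22, B11 + B22, leaf)
    M2 = strassen(A21 + A22, B11, leaf)
    M3 = strassen(A11, B12 - B22, leaf)
    M4 = strassen(A22, B21 - B11, leaf)
    M5 = strassen(A11 + A12, B22, leaf)
    M6 = strassen(A21 - A11, B11 + B12, leaf)
    M7 = strassen(A12 - A22, B21 + B22, leaf)
    # Reassemble the four quadrants of C.
    C = np.empty((n, n), dtype=M1.dtype)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A = np.random.rand(128, 128)
B = np.random.rand(128, 128)
assert np.allclose(strassen(A, B), A @ B)
```

Note the subtractions: they are why this works over rings but not over semirings like min-plus.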
Footnotes
[1] Strassen’s algorithm isn’t specific to n-by-n matrices. The square-matrix assumption just simplifies my explanation.
