
Matrix multiplication testing

We start with a simple computational problem. Given three $n$-by-$n$ matrices $A, B, C$, our goal is to design an algorithm that checks whether $AB = C$. Notice that one way to solve this problem is to perform the matrix multiplication $AB$ and check whether the result equals $C$. This takes $O(n^{\omega})$ time, where $\omega$ is defined as the minimum value $\tau$ such that matrix multiplication can be done in $O(n^{\tau})$ time. The current best known value is $\omega \approx 2.371\ldots$ (proving or disproving that $\omega = 2$ is a long-standing open question). While we are still far from proving that $\omega = 2$, the problem of testing whether $AB = C$ can be solved by a simple randomized algorithm using only $O(n^2)$ scalar multiplications.

The algorithm is simple and illustrates the idea of "random checks". Take a random vector $\mathbf{x} = (x_1, \ldots, x_n) \in \{0,1\}^n$ and check whether $AB\mathbf{x} = C\mathbf{x}$; answer "$AB = C$" if they are equal, and otherwise answer "$AB \neq C$". This algorithm needs only $O(n^2)$ scalar multiplications: to compute $AB\mathbf{x}$, we first compute $B\mathbf{x}$ and then multiply $A$ with the result, so each of the three matrix-vector products costs $O(n^2)$. We prove that the probability of this algorithm failing is at most $1/2$.
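One round of this check can be sketched in plain Python as follows (matrices are represented as lists of lists; the name `freivalds_check` is our own choice, not from the text):

```python
import random

def freivalds_check(A, B, C, n):
    """One round of the random check: draw x uniformly from {0,1}^n and
    test AB x == C x. If AB == C this always returns True; if AB != C it
    wrongly returns True with probability at most 1/2."""
    x = [random.randint(0, 1) for _ in range(n)]
    # Compute A(Bx) via two matrix-vector products instead of forming AB:
    # this uses O(n^2) scalar multiplications rather than O(n^omega).
    Bx = [sum(B[i][k] * x[k] for k in range(n)) for i in range(n)]
    ABx = [sum(A[i][k] * Bx[k] for k in range(n)) for i in range(n)]
    Cx = [sum(C[i][k] * x[k] for k in range(n)) for i in range(n)]
    return ABx == Cx
```

Note that a "True" answer is only probabilistic evidence, while a "False" answer is a certificate: if $AB\mathbf{x} \neq C\mathbf{x}$ for even one $\mathbf{x}$, then certainly $AB \neq C$.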

Lemma 18

Let $\mathbf{x} \in \{0,1\}^n$ be a random vector where each coordinate is chosen independently and uniformly. If $AB \neq C$, then $\mathbb{P}_{\mathbf{x}}[AB\mathbf{x} = C\mathbf{x}] \leq 1/2$.

Proof:

The proof here will illustrate an important concept in probabilistic analysis, namely, the principle of deferred decision. Assume that $AB \neq C$, so there must exist $i \in [n]$ where the $i$-th row of $AB$ differs from the $i$-th row of $C$, or equivalently the $i$-th row of $AB - C$ is non-zero. Denote by $\mathbf{d}_i$ the (non-zero) row vector that corresponds to the $i$-th row of $AB - C$. We know that $\mathbb{P}[(AB - C)\mathbf{x} = 0] \leq \mathbb{P}[\mathbf{d}_i \mathbf{x} = 0]$ (if the vector is $0$ then its $i$-th coordinate is zero).

Let $j$ be an index for which $d_{i,j} \neq 0$. We defer the decision on $x_j$: assume that all $x_k$ for $k \neq j$ were already sampled. We have $\mathbf{d}_i \mathbf{x} = \sum_{k \neq j} d_{i,k} x_k + d_{i,j} x_j$. In other words, $\mathbf{d}_i \mathbf{x} = 0$ if and only if $x_j = -\frac{\sum_{k \neq j} d_{i,k} x_k}{d_{i,j}}$. Since $x_j$ is uniform over $\{0,1\}$ and independent of the other coordinates, and at most one of its two possible values can equal this fixed quantity, this happens with probability at most $1/2$.

One way of decreasing the probability of failure (at the cost of an increased running time) is to repeat the algorithm several times using independent choices of $\mathbf{x}$ each time. In particular, if we run the algorithm $t$ times and report "$AB = C$" if and only if all executions report "$AB = C$", then the running time is $O(t \cdot n^2)$ and the probability of failure is at most $1/2^t$. This is a common method used in randomized algorithms known as boosting (the success probability).
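The boosted version is a short extension of the single-round check; the sketch below (again in plain Python, with `freivalds_boosted` as a hypothetical name) makes the error-probability trade-off explicit:

```python
import random

def freivalds_boosted(A, B, C, n, t):
    """Run t independent rounds of the random check.
    If AB == C this always returns True. If AB != C, each round
    independently misses with probability at most 1/2, so the overall
    failure probability is at most 1/2**t; total cost is O(t * n^2)."""
    for _ in range(t):
        x = [random.randint(0, 1) for _ in range(n)]
        Bx = [sum(B[i][k] * x[k] for k in range(n)) for i in range(n)]
        ABx = [sum(A[i][k] * Bx[k] for k in range(n)) for i in range(n)]
        Cx = [sum(C[i][k] * x[k] for k in range(n)) for i in range(n)]
        if ABx != Cx:
            return False  # certificate: AB != C for sure
    return True  # "AB == C", wrong with probability <= 1/2**t
```

For example, $t = 50$ already drives the failure probability below $2^{-50}$, while the total work remains quadratic in $n$.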

Exercise 124

Revisit the previous proof. Now, we pick $\mathbf{x} \in \{1, \ldots, k\}^n$ where each coordinate is drawn uniformly at random. Prove: if $AB \neq C$, then $\mathbb{P}_{\mathbf{x}}[AB\mathbf{x} = C\mathbf{x}] \leq 1/k$.

Exercise 125

Let $\mathbf{a} \in \{0,1\}^n$ be such that not all coordinates of $\mathbf{a}$ are zero. Pick $\mathbf{x} \in \{0,1\}^n$ where each coordinate is drawn uniformly at random. Prove:

$$\mathbb{P}_{\mathbf{x}}[x_1 a_1 \oplus x_2 a_2 \oplus \ldots \oplus x_n a_n = 0] = 1/2$$

Hint: Use the principle of deferred decision.