
Matrix multiplication testing

We start with a simple computational problem. Given three $n$-by-$n$ matrices $A, B, C$, our goal is to design an algorithm that checks whether $AB = C$. Notice that one way to solve this problem is to perform the matrix multiplication $AB$ and check whether the result equals $C$. This takes $O(n^{\omega})$ time, where $\omega$ is defined as the minimum value $\tau$ such that matrix multiplication can be done in $O(n^{\tau})$ time. The current best known value is $\omega \approx 2.371\ldots$ (proving or disproving that $\omega = 2$ is a long-standing open question). While we are still far from proving that $\omega = 2$, the problem of testing whether $AB = C$ can be solved by a simple randomized algorithm using only $O(n^2)$ scalar multiplications.

The algorithm is simple and illustrates the idea of "random checks". Take a random vector $\mathbf{x} = (x_1, \ldots, x_n) \in \{0,1\}^n$ and check whether $AB\mathbf{x} = C\mathbf{x}$; answer "$AB = C$" if they are equal, and otherwise answer "$AB \neq C$". This algorithm needs only $O(n^2)$ scalar multiplications: to compute $AB\mathbf{x}$, we first compute $B\mathbf{x}$ and then multiply $A$ with the result, so each of the three matrix-vector products costs $O(n^2)$. We prove that the probability of this algorithm failing is at most $1/2$.
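One round of this check can be sketched in plain Python as follows (matrices are represented as lists of lists; the name `freivalds_check` is our own choice, not from the text):

```python
import random

def freivalds_check(A, B, C, n):
    """One round of the random check: draw x uniformly from {0,1}^n and
    test AB x == C x. If AB == C this always returns True; if AB != C it
    wrongly returns True with probability at most 1/2."""
    x = [random.randint(0, 1) for _ in range(n)]
    # Compute A(Bx) via two matrix-vector products instead of forming AB:
    # this uses O(n^2) scalar multiplications rather than O(n^omega).
    Bx = [sum(B[i][k] * x[k] for k in range(n)) for i in range(n)]
    ABx = [sum(A[i][k] * Bx[k] for k in range(n)) for i in range(n)]
    Cx = [sum(C[i][k] * x[k] for k in range(n)) for i in range(n)]
    return ABx == Cx
```

Note that a "True" answer is only probabilistic evidence, while a "False" answer is a certificate: if $AB\mathbf{x} \neq C\mathbf{x}$ for even one $\mathbf{x}$, then certainly $AB \neq C$.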

Lemma 18

Let $\mathbf{x} \in \{0,1\}^n$ be a random vector where each coordinate is chosen independently and uniformly. If $AB \neq C$, then $\mathbb{P}_{\mathbf{x}}[AB\mathbf{x} = C\mathbf{x}] \leq 1/2$.

Proof:

The proof here will illustrate an important concept in probabilistic analysis, namely, the principle of deferred decision. Assume that $AB \neq C$, so there must exist $i \in [n]$ where the $i$-th row of $AB$ differs from the $i$-th row of $C$, or equivalently the $i$-th row of $AB - C$ is non-zero. Denote by $\mathbf{d}_i$ the (non-zero) row vector that corresponds to the $i$-th row of $AB - C$. We know that $\mathbb{P}[(AB - C)\mathbf{x} = 0] \leq \mathbb{P}[\mathbf{d}_i \mathbf{x} = 0]$ (if the vector is $0$ then its $i$-th coordinate is zero).

Let $j$ be an index for which $d_{i,j} \neq 0$. We defer the decision on $x_j$: assume that all $x_k$ for $k \neq j$ were already sampled. We have $\mathbf{d}_i \mathbf{x} = \sum_{k \neq j} d_{i,k} x_k + d_{i,j} x_j$. In other words, $\mathbf{d}_i \mathbf{x} = 0$ if and only if $x_j = -\frac{\sum_{k \neq j} d_{i,k} x_k}{d_{i,j}}$. Since $x_j$ is uniform over $\{0,1\}$ and independent of the other coordinates, and at most one of its two possible values can equal this fixed quantity, this happens with probability at most $1/2$.

One way of decreasing the probability of failure (at the cost of an increased running time) is to repeat the algorithm several times using independent choices of $\mathbf{x}$ each time. In particular, if we run the algorithm $t$ times and report "$AB = C$" if and only if all executions report "$AB = C$", then the running time is $O(t \cdot n^2)$ and the probability of failure is at most $1/2^t$. This is a common method used in randomized algorithms known as boosting (the success probability).
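The boosted version is a short extension of the single-round check; the sketch below (again in plain Python, with `freivalds_boosted` as a hypothetical name) makes the error-probability trade-off explicit:

```python
import random

def freivalds_boosted(A, B, C, n, t):
    """Run t independent rounds of the random check.
    If AB == C this always returns True. If AB != C, each round
    independently misses with probability at most 1/2, so the overall
    failure probability is at most 1/2**t; total cost is O(t * n^2)."""
    for _ in range(t):
        x = [random.randint(0, 1) for _ in range(n)]
        Bx = [sum(B[i][k] * x[k] for k in range(n)) for i in range(n)]
        ABx = [sum(A[i][k] * Bx[k] for k in range(n)) for i in range(n)]
        Cx = [sum(C[i][k] * x[k] for k in range(n)) for i in range(n)]
        if ABx != Cx:
            return False  # certificate: AB != C for sure
    return True  # "AB == C", wrong with probability <= 1/2**t
```

For example, $t = 50$ already drives the failure probability below $2^{-50}$, while the total work remains quadratic in $n$.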

Exercise 124

Revisit the previous proof. Now, we pick $\mathbf{x} \in \{1, \ldots, k\}^n$ where each coordinate is drawn uniformly at random. Prove: if $AB \neq C$, then $\mathbb{P}_{\mathbf{x}}[AB\mathbf{x} = C\mathbf{x}] \leq 1/k$.

Exercise 125

Let $\mathbf{a} \in \{0,1\}^n$ be such that not all coordinates of $\mathbf{a}$ are zero. Pick $\mathbf{x} \in \{0,1\}^n$ where each coordinate is drawn uniformly at random. Prove:

$$\mathbb{P}_{\mathbf{x}}[x_1 a_1 \oplus x_2 a_2 \oplus \ldots \oplus x_n a_n = 0] = 1/2$$

Hint: Use the principle of deferred decision.