What is Randomized Algorithm?
Randomness is a basic and extremely useful concept in algorithm design. Introducing randomness allows us to deal with some very hard questions. Although we cannot provide an accurate result for these hard ones, probablisitic techniques can help us to give a approximated result, and tell to what extent we can believe the given result is closed to the ground-truth.
Intuitive Example - Matrix Multiplication Verification
An intuitive example is the matrix multiplication verification problem. The problem is described as follows: suppose there are three \(n \times n\) matrices \(A,B,C\) , and we want to determine whether the multiplication of first two matrices equals the third one, i.e., \(AB=C\) . A straight-forward strategy is to first calculate the matrix multiplication and do the comparison, which depends on how fast we can multiply two \(n \times n\) matrices. Apparently, doing it directly requires \(O(n^3)\) time. (The current fastest algorithm for matrix multiplication runs in \(O(n^{2.37...})\) time.) The question is, can we find a way to do this job more quickly?
The answer is yes! If we use the randomized algorithm, we only need \(O(n^2)\) time to do the matrix multiplication verification.
Here we use an auxiliary n-dimentional vector \(x\) , and try to verify whether \(ABx=Cx\) . The reason in doing so is that calculate the multiplication of a matrix and a vector only need \(O(n^2)\) time! Notice that \((AB)x = A(Bx)\), we can first calculate \(Bx\) and then multiply the result (the result is also a n-dimentional vector) with \(A\). So, verifying \(ABx=Cx\) is much faster than verifying \(AB=C\) .
It is eazy to induce that if \(AB=C\) , then \(ABx=Cx\) stands for any \(x\) . But once \(AB \neq C\), the equality between \(ABx\) and \(Cx\) fully depends on the choosen of the vector \(x\).
We can design an algorithm to randomly choose this auxiliary vector \(x\) from 0-1 vectors. In other words, each dimension value of \(x\) is decided by flipping a coin - head means 1 and tail means 0. Then this algorihtm verifies whether \(ABx=Cx\) and outputs “YES” if so, or “NO” otherwise. This randomized algorithm (we call it “randomized” because \(x\) is randomly chosen) runs in \(O(n^2)\) time. Interestingly, we can calculate that the probability of algorithm output under different situation, which is summarized in the table below. (The calculation process is omitted of course :) )
Pr(return YES) | Pr(return NO) | |
---|---|---|
\(AB=C\) | 1 | 0 |
\(AB \neq C\) | \(\leq \frac{1}{2}\) | \(\geq \frac{1}{2}\) |
It shows that when \(AB \neq C\) , the probability of algorithm outputing “NO” is at least one half. This is a very useful conclusion, because it provides a lower bound for the algorithm to produce the correct output. Here you may argue that this algorithm is not practical at all! If one algorithm only has half chance to say the right thing, how could we believe the output of this algorithm? In essence, it is not different than randomly choosing the answer from “YES” and “NO”.
So, here comes the magic part, which is crucial in any randomized algorithm - amplification. During the amplification process, we repeat the randomized algorithm several times, and compare their results. By doing so, the probability of correctness can be greatly boosted by sacrificing the runtime. Take our above algorithm for example, suppose we run the algorithm for \(k\) times independently, then the success probability is summarized below.
Pr(return YES in all \(k\) trials) | Pr(return NO in one of \(k\) trials) | |
---|---|---|
\(AB=C\) | 1 | 0 |
\(AB \neq C\) | \(\leq \frac{1}{2^k}\) | \(\geq \frac{1}{2^k}\) |
When \(k\) equals ten, the success probability is already at least 0.999. Basically, you can choose the value of \(k\) by yourself to satisfy the any accuracy requirement.
This randomized algorithm we disucssed is called Frievalds’ Algorithm, taking a runtime of \(O(kn^2)\) , where \(k\) is the numbe r of times the algorithm is executed. It is performed assuming a probability of failure of less than \(2^{-k}\). Here provides the formal description of the matrix multiplication verification problem and more intuitive examples.
More to Discuss
Now, after the illustration of this cute matrix example, I believe you already get the fresh feeling for randomized algorithm! This is a great start to explore more posibilties when combining randomness and algorithm.
In the end, let us recover some crucial steps in the randomized algorithm design. Firstly, we randomly pick some auxiliary struture to simplify the calculation (the vector \(x\) in our case). Secondly, we prove a lower bound for the algorithm to produce a correct result. Lastly, we boost the success probability through multiple runs. Basically, all the randomized algorithms share this structure, more or less. And the most challenging but fascinating part lies in how to get a tight bound of the success probability.