## Tuesday, February 2, 2016

### Parallelization of Pollard's rho Method

Pollard's rho method is probably the most common and well-known algorithm for computing discrete logarithms. However, the standard version is not very useful for large problems. We thankfully understand rather well how to adapt the algorithm in such a way that we can use it in a parallel fashion. We will focus on discrete logarithms in the elliptic curve setting.

To give credit where credit is due, the explanations were primarily adapted from those available in [1]. There are earlier explanations of the concept, but I feel this is the most readily understandable.

### The Standard Version

The discrete logarithm for elliptic curves (ECDLP) is defined as finding a scalar $n$ such that $nP = Q$ for points $P$ and $Q \in G$ where $\langle P \rangle = G$. Standard Pollard rho now works by finding a collision between two distinct linear combinations of $P$ and $Q$. That is, finding a collision in the map
$(a,b) \mapsto aP + bQ.$
Given this collision $(a, a', b, b')$ we can, with high probability, compute the scalar $n$. Finding this collision is done by iterating a function $f$ that, from one linear combination of $P, Q$, outputs another valid linear combination of $P, Q$. The standard version simply computes one long chain of iterations until it detects a cycle in this chain, yielding the necessary collision.

If the output of $f$ produces random linear combinations of $P$ and $Q$ and $l$ is the order of $P$, we expect to need $\sqrt{\pi l / 2}$ iterations to succeed, as per the standard birthday problem. This is known as a random walk.

 Figure 1: A collision in the standard version of Pollard's rho method.

### The Parallel Version

You could theoretically build a parallel version of Pollard rho by simply defining $m$ distinct starting points for $m$ chains. You then would run the algorithm in parallel on all the chains and wait for one to finish. However, we can do better.

While often Pollard's rho method is framed in terms of finding a cycle, we are actually finding a collision. This means we can apply Van Oorschot and Wiener's [2] techniques for parallel collision search. Intuitively, we still take $m$ different starting points $W_0^i$ and have them iterate a function $f$. However, every walk now sends its point to the same central location. Here we simply keep track of all incoming points and when a collision is found we are done.

 Figure 2: A collision in the parallel version of Pollard's rho method.
There is a key secondary consideration though. For non-trivial problem sizes we cannot simply store every point generated in each of these walks. Consider the very realistic number of 100 million points generated per second on one single machine. Then consider the fact that for a 100-bit problem we need approximately $\frac{\sqrt{\pi 2^{100}/4}}{10^8 \cdot 3600 \cdot 24} \approx 115$ days. This costs way too much storage. Supposing that we can store a point for such a problem in only 100 bits we would still require well over 10000 TiB of storage. This estimate even assumes the use of the negation map, hence the division by 4 in the earlier calculation. The negation map is an optimization technique that halves the search space, but we will not discuss it in detail.

Thankfully, we know how to solve this. We simply output only those points which have a particular property that occurs with a certain probability. We call these distinguished points and the related property the distinguished point property. The computation of this property should be efficient and it is very convenient if the probability can be tweaked.
For example, given a point $P = (x,y)$, only output $P$ if the binary representation of $x$ has at least $n$ leading zeroes. This is very fast to compute and it can be trivially adjusted by changing $n$.

Note that we can do this without loss of functionality. Once a collision occurs between two walks, every subsequent point also collides. This means we find a collision at the next distinguished point the converged walks reach, assuming we do not enter a cycle before then.

### References

 [1] Daniel J. Bernstein, Tanja Lange, and Peter Schwabe. On the correct use of the negation map in the Pollard rho method. In Dario Catalano, Nelly Fazio, Rosario Gennaro, and Antonio Nicolosi, editors, Public Key Cryptography -- PKC 2011, volume 6571 of Lecture Notes in Computer Science, pages 128--146. Springer-Verlag Berlin Heidelberg, 2011. http://cr.yp.to/papers.html#negation. [2] Paul C. Van Oorschot and Michael J. Wiener. Parallel collision search with cryptanalytic applications. Journal of cryptology, 12(1):1--28, 1999.