## Monday, January 4, 2016

### Learning problems and cryptography: LPN, LWE and LWR

MathJax TeX Test Page

The beginning of a new year is always a perfect moment to make plans for the future: one can sit down, look at the past year and figure out what to do and where to go during the upcoming one. It is legitimate, then, to take a while and think of new directions in the field of cryptography too. One of such is certainly post-quantum cryptography, that is to say what we should do when quantum computers running Shor's algorithm will break all the cryptosystems based on integer factorisation problem and (elliptic curve) discrete logarithm problem. In the literature, different approaches can be found: Code-based cryptographic systems depend on the hardness of correcting a codeword following a random error-correcting code while the security of lattice-based cryptosystems is based on the hardness of finding the shortest vector of a given lattice and of several other problems, just to mention two widely known and studied branches of post-quantum cryptography. In fact, at the present moment no quantum algorithm is known to solve them significantly better than any classic one. But there is more: hidden behind, there is a vast class of problems known as learning problems.

Essentially, they are computational problems related to learning theory, whose main issue is what kind of functions can be learned efficiently from noisy, imperfect data. Although only in recent years the link between learning theory and cryptography has been deeply studied, it existed since the early '90s. These two subjects are in a sense complimentary since the ease of learning an information, which is desirable in learning theory, excludes the possibility of hiding it, a must in cryptography. Hard learning problems are then good candidates to base cryptographic schemes on and the fact that they are thought to be quantum-resistant makes them even more appealing.

Let us begin the 2016 with a brief description of three well known learning problems: LPN, LWE and LWR.

Learning Parity with Noise (LPN)

The decisional LPN problem asks to find a secret vector $\mathbf{s} \in \mathbb{Z}_2^n$ where $n\in\mathbb{N}$, given samples of the form $$( \mathbf{a},\langle \mathbf{a} , \mathbf{s} \rangle \oplus e) \in \mathbb{Z}_2^n\times\mathbb{Z}_2$$ where $\mathbf{a}\xleftarrow{U}\mathbb{Z}_2^n$ is drawn uniformly at random and $e\leftarrow Ber_\tau$ is drawn from the Bernoulli distribution of parameter $\tau \in (0,1/2)$, i.e. $\mathbb{P}(e=1)=\tau$. Equivalently, LPN can be stated as the problem of solving a system of noisy linear equations with coefficients in $\mathbb{Z}_2$, or to decode the codeword $\mathbf{s}$ affected by error $e$ using the random code generated by $\mathbf{A}$, which is the matrix whose rows are different $\mathbf{a}$ from several samples. More formally, for $\tau\in(0,1/2)$ and $n\in\mathbb{N}$ the LPN$_{\tau,n}$ is $(q,t,\epsilon)$-hard if for every distinguisher $\mathcal{D}$ running in time $t$: $$\Big\lvert \underset{\mathbf{s},\mathbf{A},e}{\mathbb{P}}(\mathcal{D}(\mathbf{A},\mathbf{s}\mathbf{A}\oplus \mathbf{e})=1) - \underset{\mathbf{r},\mathbf{A}}{\mathbb{P}}(\mathcal{D}(\mathbf{A},\mathbf{r})=1) \Big\rvert \leq \epsilon$$ where $\mathbf{s}\xleftarrow{U}\mathbb{Z}_2^n$ is the vector to be discovered, $q\in\mathbb{N}$ is the number of samples, $\mathbf{A}\xleftarrow{U}\mathbb{Z}_2^{n\times q}$, $\mathbf{r}\xleftarrow{U}\mathbb{Z}_2^q$ and $\mathbf{e}\leftarrow Ber_\tau^q$.

Another particularly elegant way of describing the problem is the following. Given a secret $\mathbf{s} \in \mathbb{Z}_2^n$ and $\tau \in (0,1/2)$, I denote by $\Pi_{\tau,n}(\mathbf{s})$ the distribution over $\mathbb{Z}_2^n\times\mathbb{Z}_2$ sampled choosing a vector $\mathbf{a}\xleftarrow{U} \mathbb{Z}_2^n$ and outputting $( \mathbf{a},\langle \mathbf{a} , \mathbf{s} \rangle \oplus e)$ with $e\leftarrow Ber_\tau$. Hence, the hardness of LPN$_{\tau,n}$ is equivalent to $\Pi_{\tau,n}(\mathbf{s})$ and the uniform distribution over $\mathbb{Z}_2^n\times\mathbb{Z}_2$ being indistinguishable.

As an example of usage of LPN to build cryptosystems, I describe the symmetric scheme LPN-C. Let $C : \mathbb{Z}_2^r \rightarrow \mathbb{Z}_2^n$ be an $[n, r, d]$ error-correcting code (i.e. of length $n$, dimension $r$ and minimal distance $d$) with correction capacity $t = \big\lfloor \frac{d-1}{2} \big\rfloor$. The code $C$ is assumed to be publicly known. Let $\mathbf{M}$ be a secret $k\times n$ matrix constituting the secret key of the cryptosystem. To encrypt a $r$-bit vector $\mathbf{x}\in\mathbb{Z}_2^r$, the sender draws $\mathbf{a}\xleftarrow{U}\mathbb{Z}_2^k$ and computes $$\mathbf{y} = C(\mathbf{x}) \oplus \mathbf{a}\mathbf{M} \oplus \boldsymbol\nu$$ where $\boldsymbol\nu\leftarrow Ber_\tau^n$. The ciphertext is $(\mathbf{a},\mathbf{y})$. Thanks to the fact that the receiver knows $\mathbf{M}$ and that $\mathbf{a}$ is public, the decryption algorithm is just the XOR $\mathbf{y} \oplus \mathbf{a}\mathbf{M} = C(\mathbf{x}) \oplus \boldsymbol\nu$ followed by the decoding of $C(\mathbf{x}) \oplus \boldsymbol\nu$, which is always possible in the case $HW(\boldsymbol\nu)\leq t$, otherwise $\bot$ is returned. Since the noise vector $\boldsymbol\nu$ is drawn by the sender, a simple test on its Hamming weight can be done to avoid incorrect decoding.

Learning With Errors (LWE)

The Learning With Errors (LWE) problem is a machine learning problem and its hardness reduces from worst-case lattice problems. Essentially, LWE is a generalisation of LPN to larger moduli and different error distributions, i.e. the secret vector $\mathbf{s}$ and the random vector $\mathbf{a}$ live in $\mathbb{Z}_p^n$ for a certain prime $p$ and natural $n$, while the error is drawn from a generic distribution $\chi$, which is the Bernoulli one in standard LPN. In its simplest (search) formulation, LWE asks to recover a secret vector $\mathbf{s}\xleftarrow{U}\mathbb{Z}_q^n$ from a number of samples of the form (note the extreme similarity with the correspondent equation in LPN): $$( \mathbf{a},\langle \mathbf{a} , \mathbf{s} \rangle + e) \in \mathbb{Z}_q^n\times\mathbb{Z}_q$$ where $\mathbf{a}\xleftarrow{U}\mathbb{Z}_q^n$ is picked uniformly at random, $e\xleftarrow{\chi}\mathbb{Z}_q$ is chosen according a certain distribution $\chi$ and addition is performed modulo $q$.

Such description, however, does not allow the construction of schemes being efficient enough for practical purposes. For this reason, an interesting variant of LWE called Ring-LWE has been developed which significantly decreases the amount of storage needed and speeds up the computations. Informally speaking, vectors of bits in $\mathbb{Z}_q^n$ are seen as coefficients of polynomials in $R_q=\mathbb{Z}_q[x]/(f)$ where $f$ is a polynomial of degree $n$ in $\mathbb{Z}_q[x]$. Hence, one element of the ring $R_q$ can stand for $n$ elements of $\mathbb{Z}_q$, thus reducing public and private keys by a factor of $n$.

A basic public-key scheme based on Ring-LWE works as follows. Let $R=\mathbb{Z}[x]/(x^n+1)$ where $n$ is a power of two and $R_q=R/qR$. Both the secret key $s$ and the error $e$ are chosen according to an error distribution $\chi$, hence $s,e\xleftarrow{\chi}R$. The public key is $(a,b=a\cdot s + e)\in R_q^2$ where $a\xleftarrow{U}R_q$. An $n$-bit message $z\in\{0,1\}^n$ is then seen as an element of $R$ via the following transformation. $(z_1,\dots ,z_n) \mapsto \sum_{i=1}^n z_ix^{i-1}$ Then, three "small" random elements $r,e_1,e_2 \in R$ are chosen according to $\chi$ and are used to compute the ciphertext $(u,v)\in R_q^2$ as follows. \begin{align*} u &= a \cdot r + e_1 \pmod{q} \\ v &= b \cdot r + e_2 + \Big\lfloor \frac{q}{2} \Big\rceil \cdot z \pmod{q} \end{align*} The decryption works as follows (the $\pmod{q}$ is implicit). \begin{align*} v - u\cdot s &= b \cdot r + e_2 + \Big\lfloor \frac{q}{2} \Big\rceil \cdot z - a \cdot s \cdot r - e_1 \cdot s \\ &= a\cdot s \cdot r + e\cdot r + e_2 + \Big\lfloor \frac{q}{2} \Big\rceil \cdot z - a \cdot r \cdot s - e_1 \cdot s \\ & = (e\cdot r + e_2 - e_1 \cdot s)+ \Big\lfloor \frac{q}{2} \Big\rceil \cdot z \\ & = E + \Big\lfloor \frac{q}{2} \Big\rceil \cdot z \end{align*} For a suitable choice of the parameters, $E\in R$ is a polynomial whose coefficients have magnitude less than $q/4$ (this is the reason why $r,e_1,e_2 \in R$ were "small"), so that the bits of $z$ can be recovered by rounding each coefficient of $v - u \cdot s$ back to either $0$ or $\lfloor q/2 \rceil$.

Learning With Rounding (LWR)

Among these three, the Learning With Rounding (LWR) problem is the newest. With respect to LWE, instead of adding random noise to $\langle \mathbf{a} , \mathbf{s} \rangle\in\mathbb{Z}_q$, LWR consists in rounding such value to the nearest element in a subgroup $\mathbb{Z}_p$ of $\mathbb{Z}_q$, hence computing $\lfloor\langle \mathbf{a} , \mathbf{s} \rangle\rceil_p\in\mathbb{Z}_p$. The hardness guarantee follows from the fact that, with a careful choice of the parameters, the following holds with high probability. $$\lfloor\langle \mathbf{a} , \mathbf{s}\rangle +e\rceil_p=\lfloor\langle \mathbf{a} , \mathbf{s}\rangle\rceil_p$$ LWR has the immediate huge advantage that sampling the error $e$ is no longer needed. Moreover, since the operation of rounding is deterministic, a layer of uncertainty is cut down.

LWR can be directly used to build a deterministic encryption scheme, which is a cryptosystem always producing the same ciphertext for a given plaintext and key, even over separate executions of the encryption algorithm. Although this raises security flaws, in some situation is a desired feature (e.g. searchable databases). The parameters are $n$, $m$, $q$ and $p$. First of all, a matrix $\mathbf{A}\in\mathbb{Z}_q^{m\times n}$ statistically closed to uniform is chosen as public key while the secret key is a trapdoor function $T$. The existence of an algorithm $Invert$ that returns a secret $\mathbf{s}\in\mathbb{Z}_q^n$ given $\mathbf{A}$, $T$ and the LWE sample $\mathbf{c}=\mathbf{A}\mathbf{s} + \mathbf{e}\in\mathbb{Z}_q^m$ is assumed. Then, the encryption of an $n$-bits message $s\in\{0,1\}^n$ is just $\mathbf{c}=\lfloor\mathbf{A}\mathbf{s}\rceil_p$. The decryption algorithm takes as input the ciphertext $\mathbf{c}$ and the secret key $T$ and works in two steps:

1. $\mathbf{c}$ is transformed from a LWR sample to a LWE one by applying $\lfloor(q/p)\cdot \mathbf{c}\rceil\in\mathbb{Z}_q^m$. Indeed: \begin{align*} \lfloor(q/p)\cdot \mathbf{c}\rceil &= \lfloor (q/p)\lfloor \mathbf{A}\mathbf{s} \rceil_p\rceil \\ &= \lfloor (q/p)\lfloor (p/q)\mathbf{A}\mathbf{s} \rceil\rceil \\ &= \lfloor (q/p)((p/q)\mathbf{A}\mathbf{s} + \mathbf{e}')\rceil \\ &= \mathbf{A}\mathbf{s} + \mathbf{e} \end{align*}
2. the algorithm $Invert$ is applied to $\mathbf{A}$, $T$ and $\lfloor(q/p)\cdot \mathbf{c}\rceil$.

References

[Pie12] is a very broad survey on LPN problem and gives several cryptographic applications as well as security proofs. LWE was first introduced in [Reg05] and its ring variant can be found in [LPR10]. Finally, LWR was introduced in [BPR12] but an improved version, allowing the choice of a bigger class of parameters, is given in [AKPW13].

[AKPW13]
Joel Alwen, Stephan Krenn, Krzysztof Pietrzak, and DanielWichs. Learning with rounding, revisited. In Advances in Cryptology CRYPTO 2013, pages 57-74. Springer, 2013.
[BPR12]
Abhishek Banerjee, Chris Peikert, and Alon Rosen. Pseudorandom functions and lattices. In Advances in Cryptology EUROCRYPT 2012, pages 719- 737. Springer, 2012.
[LPR10]
Vadim Lyubashevsky, Chris Peikert, and Oded Regev. On ideal lattices and learning with errors over rings. In Advances in Cryptology EUROCRYPT 2010, pages 1- 23. Springer, 2010.
[Pie12]
Krzysztof Pietrzak. Cryptography from learning parity with noise. In SOFSEM 2012: Theory and Practice of Computer Science, pages 99- 114. Springer, 2012.
[Reg05]
Oded Regev. On lattices, learning with errors, random linear codes, and cryptography. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 84 -93. ACM, 2005.

Marco