## Sunday, January 10, 2016

### Threshold implementations

Ever since the discovery of differential power attacks (DPA) in late 1990's, cryptographic community has been trying to find effective methods to twarth attempts to extract secret information, i.e. secret key using such attacks. Initial proposals to protect hardware implementations used simple models. Although easy to understand these schemes can still leak sensitive information through power consumption in case of glitches. In most models we assume that Hamming weight of intermediate result is leaked through power consumption.

Masking is one of the first ideas to prevent leaking sensitive information using intermediate values during encryption. In most basic example we split the intermediate value $X$ into two randomized values $X_1$ and $X_2$ such that $X = X_1 \oplus X_2$. We assume that leakage is equal to Hamming weight of the variables $\mathcal{L}(X) = \text{HW}(X_1, X_2)$.  It turns out that first order analysis doesn't reveal input value since mean of the leakage $\mathcal{L}$ is constant for all values of input $x$.

 $x$ $x_1$ $x_2$ $\mathcal{L}(x)$ Mean$(\mathcal{L}(x))$ $0$ $0$ $0$ $0$ $1$ $1$ $1$ $2$ $1$ $1$ $0$ $1$ $1$ $0$ $1$ $1$

In case of second order analysis, however, this scheme will leak. Variance of $\text{HW}(X_1, X_2)$ is different for the case when unmasked value takes value 0 and for the case when it takes value 1. In general, to protect against $n^{\text{th}}$ order attack we would need to decompose original variables into $n + 1$ randomized uniform variables $X_1, ..., X_{n+1}$ such that

X = X_1 \oplus X_2 \oplus \cdots \oplus X_{n+1}

In practice this is done by generating $d$ variables $X_1, .... , X_d$ and deriving $X_{d+1}$ so that it satisfies the upper equation. Each of the variables $X_i$ is referred to as share. Once we have a sharing, we need to perform all the operations using these shares so that in the end we still still have correct output shares. In case of affine transform this is easy - applying linear term to all of the shares and adding constant term to one of the shares. Boolean function that have higher algebraic degree are not so trivial as we need to ensure the order in which the operations in the shares are executed. For example function $f(a, b, c) = a \oplus bc$ can be protected using Boolean masking with 2 shares in following way

f_1(a_1, b_1, c_1) = (a_1 \oplus b_1c_1) \oplus b_1c_2\\
f_2(a_2, b_2, c_2) = (a_2 \oplus b_2c_1) \oplus b_2c_2.

This sharing is not unique and there are multiple possible sharing of the same function.
As noted, order in which operations take place is important. Otherwise, we could expose the unmasked value of $c$ in function $f_1$ because $b_1c_1 \oplus b_1c_2 = b_1(c_1 \oplus c_2) = b_1c$.

Major problem with conventional masking is that it doesn't provide protection when glitches can occur in the circuit, which is inevitable in CMOS technology. Different propagation times of input signals can cause some gates to change state more than once during a single clock cycle, which causes more input current to be drawn from the power source, resulting in increased power consumption during the clock cycle in which glitch occurred. If the increased power consumption can be correlated to the value we want to protect, there is a leak in the device.

Threshold implementations

Threshold implementation, or TI, is a method of protecting against DPA that is resistant to glitches that can occur. It builds upon the classical masking scheme and threshold cryptography. For $d^{\text{th}}$ order security idea is to create shared functions in a way that any linear combination of $d$ shared function is independent of at least one input share. In other words, if we are able to probe any $d$ wires in the design, we wouldn't be able to create correlation with unshared value because we would always be missing at least one input share for correlation.

In order to create a TI of a given function $f(x) : \mathcal{F}_2^n \Rightarrow \mathcal{F}_2^m$ with $s_{in}$ input shares and $s_{out}$ output shares we apply a Boolean masking to input variable $x$, so that we create shares $x_1, ... , x_{s_{in}}$ for which holds $\sum x_i = x$. Now we have a sharing of variable $x$, $\mbox{Sh}(x), \mathbf{x} \in \mathcal{F}_2^{ns_{in}}$. Similarly, we create a sharing of function $f$ by creating $s_{out}$ function components $\mathbf{f} = f_1, ... , f_{s_{out}}, \mathbf{f} \in \mathcal{F}_2^{m s_{out}}$.
Threshold implementation satisfies four key properties:
2. Correctness
3. Non-Completness
4. Uniformity of a sharing
First property simply states that for each possible value of input variable $x$ probability of each valid input sharing is always the same. If we denote the input sharing with $\mathbf{X}$ we could rewrite the previous statement as

(\forall x), (\mathbf{x} \in \mbox{Sh}(x) \Rightarrow \mbox{Pr}(\mathbf{X} = \mathbf{x} | X = x) = p)\,  \land  \, (\mathbf{x} \notin \mbox{Sh}(x) \Rightarrow \mbox{Pr}(\mathbf{X} = \mathbf{x} | X = x) = 0) \\ \land \\ (\sum_{\mathbf{x} \in Sh(x)} \mbox{Pr}(\mathbf{X} = \mathbf{x}) = \mbox{Pr}(X = x))

where $p$ is a constant.

Second property of correctness states that for sharing of a function $f(x)$, $\mathbf{f}(\mathbf{x}) = f_1(\mathbf{x}), .... , f_{s_x}(\mathbf{x})$ XORing all output component gives back the unshared output:

(\forall x) ((\mathbf{x} \in \mbox{Sh}(x), a = f(x), a_i = f_i(\mathbf{x})) \Rightarrow a = \sum_i a_i = \sum_i f_i(\mathbf{x}))

It is simply there to ensure that once we apply the sharing we still get the same result in the output when we perform unmasking.

The previous two properties applies to all masking schemes, not only TI. Property of non-completeness is what ensures $d^{\text{th}}$ order of security:

Any combination of up to $d$ component functions $f_i$ of $\mathbf{f}$ is independent of at least one input share $x_j$.

Being independent of one input share means that even the attacker that knows values of $d$ component functions cannot recreate the original input value $x$. Without all the input shares, the information from all other input shares is independent of the input.

It can be shown that to achieve $d^{\text{th}}$ order TI we need at least $d + 1$ input shares. It can also be shown that there always exists a $d^{\text{th}}$ order TI of a function of algebraic degree $t$ with number of input shares $s_{in} \geq td + 1$ and number of output shares $s_{out} \geq \binom {s_{in}}t$. Also, in order to achieve non-completeness we need at least $t+1$ input shares if degree of the function is $t$.

The final property Threshold implementation requires that the output sharing also need to preserve output distribution, e.g. if the $f$ is a permutation then $\mathbf{f}$ is also a permutation:

$d^{\text{th}}$ order sharing of $\mathbf{f}$ if uniform if and only if

\forall x \in \mathcal{F}^n, \forall a \in \mathcal{F}^m \textit{ where } f(x) = a, \forall \mathbf{a} \in \mbox{Sh}(a) \text{ : }|\mathbf{x} \in \mbox{Sh}(x)  | \mathbf{f}(\mathbf{x}) = \mathbf{a}| = \frac{|\mathcal{F}|^{n(s_{in} - 1)}}{|\mathcal{F}|^{m(s_{out} - 1)}}

It is important when we are cascading TIs of two or more functions, because we need the input sharing of the second function to be uniform. This is almost always the the case, e.g. when we are applying TI to block cipher, output of the previous round is used as input for the next round etc.

Uniformity in practice is non-trivial to achieve and finding a methodical way to get a uniform output sharing is still extensively researched. There is a couple of heuristics, but none of them guarantee a sharing that has uniform output distribution. We will tackle some of them in a future blog.