In this blog-post, I'd like to present a use-case where an Fully Homomorphic Encryption (FHE) scheme achieves unprecedentedly fast classification of encrypted data, and makes scale-invariant homomorphic evaluation of neural networks (NN) possible.
Are privacy-preserving services in the Cloud relevant?At this point, I think, I can skip philosophizing about the ubiquitous utility of machine learning, as we can see its impact everyday all around us. On the other hand the quest for privacy-preserving application of machine learning algorithms to user data is becoming a central topic of discussions recently. It is expected that the General Data Protection Regulation (GDPR), a result of the call for European law (which could serve as paragon internationally) to protect its citizens, its economy, will push forward innovation in this direction too.
Simply put, users of Machine Learning as a Service (MLaaS) in the Cloud, want to only share & upload encrypted images as input to the companies' powerful, pre-trained cognitive models.
Clearly, encrypting content ensures data confidentiality, assuming the associated private key of the public-key encryption scheme never leaves the user’s trusted device. (As a side note; recent news reports suggest that such an assumption for user controlled devices are not always guaranteed. The Cloud operating on FHE encrypted data on the other hand is not possibly vulnerable to leak private user data through the whole class of cache attacks, i.e. Meltdown and Spectre.)
Let's briefly look at the problem setting.
First step when approaching a solution of how to use FHE for NN is defining minimal requirements of the concrete task and knowing what can be considered practical FHE.
Let's briefly look at the task.
We want to showcase fast homomorphic evaluation of a pre-trained NN to classify a depicted shape without leaking privacy of the input data at an 80 [bit] security level, e.g. images of handwritten digits from the MNIST dataset.
The output, given in less than two seconds, shall be encrypted scores assigned to each possible output and the highest score, decrypted by the user, is the most probable label of the input image.
As deep neural networks with $d$ hidden layers give good results in practice, we target this type with a scale-invariant FHE scheme.
Let's look at the problem solution.In an attempt to bringing forth FHE in practice, our C++ code builds on top of an existing Fast Fully Homomorphic Encryption Library over the Torus (TFHE) and introduces a new framework for homomorphic evaluation.
To increase the efficiency, which is an important step paving the way to practicality, the underlying FHE scheme needs to be parametrized once for a given network.
Secondly, a security analysis is another crucial step in vetting the algorithms, ensuring their use maturely resists state-of-the-art cryptanalysis and fulfills the targeted security level.
The main capability of our scheme is that when evaluating a single neuron, the output value can readily be used for the next operation as it is bootstrapped to ensure low error propagation.
|Close-up on a single neuron.|
Scale-invariance means that privacy-preserving evaluation of deep neural networks do not longer pose a hurdle, as computations carried out by every neuron in the network is independent of the total number of neurons and layers and hence scales linearly.
With this approach, we can report the performance result of an experiment to classify 10000 encrypted images from the MNIST dataset with more than 96% accuracy on average taking less than 1.7 seconds, using the TFHE library as a starting point.
|Running an experiment on a trained neural net with 784:100:10--topology deployed in the Cloud.|
I'd like to stress that the good performance of this scale-able approach is not limited to homomorphic evaluation of neural networks with one hidden layer, as depicted above, but can be applied to deep neural networks, that in practice could be composed of possibly a hundred hidden layers or an even broader class of cognitive models.
For a detailed, formal description I refer to the full version of the paper or you may try out the proof-of-concept implementation code, available online that shows how to obtain these research results, applying our generic framework to a trained NN and MNIST dataset inputs as a demonstration.
Let's look at directions for future work and open-questions.Finally, mentioning limits on the functionality of our FHE scheme, and pointing out the applicability to other well-specified domains rounds off this treatment here.
To comfort a potential concern for the service providers that their users might be sending malicious requests, to evaluate private networks with our framework is not dealt with at the moment, although it is in principal possible.
They could either try to learn the company's intellectual property (the weights and the topology of the neural network itself), or try to derive sensitive information encoded therein (which could be a breach into the privacy of the training dataset).
In this latter case a statistical databases studied in the differential privacy literature can be used in the training phase.
An open question is how further performance gains can be achieved by refining the algorithms. Also listing all general cognitive models that are possible is interesting.
So long, stay tuned for faster solutions and more general demonstrations!