In the past few months, I've learned about many different areas of securely outsourcing storage and computation. One particularly interesting idea I've come across is oblivious random-access machines (ORAMs). In this post, I explain the basics of ORAMs and present the simplest version of path ORAM (eprint, ACM DL).
First, what is a random-access machine? It's a model of computation. Essentially, it's a simple computer that has two parts—a control unit and memory—and can run a program stored in its memory by fetching instructions one at a time. The control unit has a small amount of storage, the registers, and it can read from or write to specific addresses in the memory. Already, you might see how it could model cloud storage: the data owner (or client) is like the control unit, and the remote storage is like the memory. In this post, instead of referring to a control unit and memory, I'll refer to a "client" or "data owner" and a "server."
Oblivious RAMs (ORAMs) hide access patterns to the memory or server. They were proposed by Ostrovsky and Goldreich about 25 years ago (Ostrovsky's PhD thesis). Their motivation was software protection: ORAMs can prevent reverse engineering of programs—useful for software companies with proprietary implementations (and for malware authors). But ORAM is useful in settings other than software protection, like searchable encryption and cloud storage.
If you're already familiar with oblivious transfer or PIR, you may wonder how they relate to ORAM. The following table gives an overview of the main differences.
| scheme | parties | access | privacy guarantee |
| --- | --- | --- | --- |
| oblivious RAM (ORAM) | client, server | read and write, private | server doesn't know access pattern |
| private information retrieval (PIR) | client(s), server | read-only, public | requested item unknown to server |
| symmetrically private information retrieval (SPIR) | client, server | read-only, public | requested item unknown to server; non-requested items unknown to client |
| oblivious transfer (OT) | chooser, sender | read-only | requested item(s) unknown to sender; non-requested items unknown to chooser |
What does it mean to hide access patterns? It means that the server (or anyone who sees all client-server communication and what's stored on the server at any time) doesn't learn anything about which data is actually being accessed, whether it is being updated or only read, how frequently it's accessed, whether the same data is accessed twice, etc. The server should be oblivious to what the owner is doing with its data.
ORAM, more generally, is a cryptographic protocol with three algorithms, Init, Read, and Write, between a client and a server:
- Init(B, N): the client allocates some memory on the server to store N data blocks of size B and maybe allocates some local memory.
- Read(i): the server obliviously returns data block i to the client.
- Write(i, d): the server obliviously replaces block i's data with d.
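The three-algorithm interface above can be written down as a small abstract base class. This is just an illustrative sketch of the protocol's client-facing API; the names and types are my own, not from any particular library.

```python
# Sketch of the ORAM client interface: Init, Read, Write.
# Class and method names are illustrative assumptions.
from abc import ABC, abstractmethod


class ORAM(ABC):
    @abstractmethod
    def init(self, B: int, N: int) -> None:
        """Allocate server storage for N data blocks of B bits each
        (and possibly some local client memory)."""

    @abstractmethod
    def read(self, i: int) -> bytes:
        """Obliviously return data block i."""

    @abstractmethod
    def write(self, i: int, d: bytes) -> None:
        """Obliviously replace block i's data with d."""
```

Any concrete ORAM scheme, from the trivial one below to path ORAM, is one way of filling in these three methods.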
An ORAM is pattern-hiding if an adversary who chooses two access sequences of the same length can't distinguish which one is being executed, given a complete transcript of communication between the client and server and the contents of the server's memory at any time.
For example, maybe an adversary chooses the two access sequences
( (read, 2), (read, 2), (read, 2) )
( (read, 5), (write, 7, 0x30e84a13), (read, 1) ).
It sees the resulting communication transcript, which could look like
( (read, 10), (read, 12), (read, 8),
(write, 10, 0x53d35dd9), (write, 12, 0x16701de9), (write, 8, 0x0eae293c),
(read, 6), (read, 8), (read, 1),
(write, 6, 0x30e84a13), (write, 8, 0xffe8d5a4), (write, 1, 0xc113bc8b),
(read, 7), (read, 2), (read, 6),
(write, 7, 0xbb7da98a), (write, 2, 0x08bbfc25), (write, 6, 0xcaf6d2e2) ).
Given this transcript and what's stored on the server at any time, the adversary shouldn't be able to determine which of the two access sequences the ORAM is actually executing with probability non-negligibly greater than 1/2.
ORAM schemes (like path ORAM) tend to ignore exactly how data blocks are encrypted and instead focus on hiding the access pattern. However, all data should be encrypted; otherwise, the server could learn about the access pattern simply by inspecting block contents. Any scheme with IND-CPA security is sufficient.
Here's a trivial ORAM scheme to show that it is indeed possible to hide access patterns. Init(B, N): the server allocates BN bits of memory. Read(i): the client requests all N data blocks from the server, decrypts them, identifies block i among them (e.g., by always prepending a block's identifier i to its data before encrypting it), then re-encrypts all N blocks and sends them back to the server. Write(i, d): the client requests all N data blocks from the server, decrypts them, replaces the data of block i with d, then re-encrypts all N blocks, and sends them back to the server. This scheme is inefficient, but it hides the access pattern!
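Here is the trivial scheme as a toy Python sketch. Encryption is modeled by a stub that prepends a fresh random nonce to each block, so every ciphertext looks new after re-encryption; a real client would use an actual IND-CPA scheme (e.g., AES-CTR with fresh nonces). The class name and layout are my own.

```python
# Toy "read everything, re-encrypt everything" ORAM.
# _enc/_dec are a stand-in for real IND-CPA encryption.
import os


class TrivialORAM:
    def __init__(self, B: int, N: int):
        self.B, self.N = B, N
        # Server memory: N "encrypted" blocks, each tagged with its index.
        self.server = [self._enc(i, bytes(B)) for i in range(N)]

    def _enc(self, i: int, d: bytes) -> bytes:
        # Stub: a fresh 16-byte nonce makes every re-encryption look new.
        return os.urandom(16) + i.to_bytes(4, "big") + d

    def _dec(self, c: bytes):
        return int.from_bytes(c[16:20], "big"), c[20:]

    def read(self, i: int) -> bytes:
        # Fetch and decrypt all N blocks, then re-encrypt and write all back.
        blocks = dict(self._dec(c) for c in self.server)
        self.server = [self._enc(j, blocks[j]) for j in range(self.N)]
        return blocks[i]

    def write(self, i: int, d: bytes) -> None:
        # Identical traffic to read(): fetch all, modify one, write all back.
        blocks = dict(self._dec(c) for c in self.server)
        blocks[i] = d
        self.server = [self._enc(j, blocks[j]) for j in range(self.N)]
```

Every access touches all N blocks in the same way, which is exactly why the server learns nothing, and exactly why the scheme is impractical.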
Suppose the client wants to store N data blocks of size B. In the simplest variant of path ORAM, the storage on the server is treated as a binary tree of height L = ⌈log₂ N⌉ with 2^L leaves. (A binary tree of height L has L + 1 node levels.) Each node in the tree is a bucket that contains Z blocks of size B, where Z is a small constant (e.g., 3–5) independent of N. Buckets are padded with dummy blocks when necessary so they all appear full.
An example of a path ORAM tree with height L = 3 and Z = 3 blocks per bucket, that can store up to N = 8 data blocks. These data blocks are indistinguishable from the dummy blocks that fill up the rest of the tree.
The client stores two data structures: a position map and a stash. The stash stores data blocks during block accesses and data blocks that have overflowed from the tree. It's initially empty. The position map associates each of the N data blocks to one of the 2^L leaves, chosen independently and uniformly at random. What makes path ORAM work is the following rule: at any time, every data block is stored either in the stash or in a node along the path from the root to its assigned leaf.
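A small sketch of the position map and the server path it implies. I'm assuming a heap-style array layout for the tree (node 0 is the root, and the children of node n are 2n + 1 and 2n + 2); the paper doesn't mandate any particular layout.

```python
# Position map and root-to-leaf path, for a tree of height L.
# The heap-style indexing is an illustrative assumption.
import random

L = 3                                                       # height; 2^L = 8 leaves
pos_map = {i: random.randrange(2 ** L) for i in range(8)}   # block id -> leaf


def path_nodes(leaf: int, L: int) -> list:
    """Node indices on the path from the root down to the given leaf."""
    node = 2 ** L - 1 + leaf      # leaves occupy indices 2^L - 1 .. 2^(L+1) - 2
    path = [node]
    while node > 0:
        node = (node - 1) // 2    # parent
        path.append(node)
    return path[::-1]             # root first
```

Each path has exactly L + 1 nodes, one per level, and the invariant says block i lives either in the stash or somewhere on `path_nodes(pos_map[i], L)`.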
An example of a position map for N = 8 items.
To read block i, the client first looks up its associated leaf, PosMap[i]. Then, it sends a read request to the server for the contents of all the buckets along the path from the root to the leaf PosMap[i].
The client decrypts all of these (L+1)Z blocks, discards any dummy blocks, and stores the rest in its stash. It identifies block i among the blocks in the stash. According to the "law" of path ORAM, the block was either already in the stash or was in the path that is now stored in the stash. The client updates block i's assigned leaf in the position map, picking another one of the 2^L leaves uniformly at random.
Next, it tries to empty the stash, encrypting data blocks with fresh randomness and writing them back to the same path—the path from the root to block i's former assigned leaf. Data blocks are put as close to the leaves as they can go while still respecting their assigned leaves: a block can be stored in a bucket only if that bucket is also on the path from its assigned leaf to the root. (Any two paths will intersect at the root bucket, at least.) If there aren't enough eligible data blocks in the stash to fill a bucket, then dummy blocks are added. If a bucket is full of data blocks and there are still eligible blocks in the stash, they will be eligible for the next bucket in the path, one level up. If the root bucket is full of data blocks, then any remaining data blocks will stay in the stash.
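The write-back step can be sketched as a greedy pass from the leaf bucket up to the root. The code below is self-contained: it uses the fact that a bucket at depth d lies on a block's assigned path iff the block's leaf and the accessed leaf agree in their top d bits. The function name and stash representation are illustrative; real encryption and dummy padding are only noted in comments.

```python
# Greedy eviction along the accessed path, deepest bucket first.
# stash: {block id: data}; pos_map: {block id: assigned leaf}.
def evict(stash: dict, pos_map: dict, leaf: int, L: int, Z: int) -> list:
    """Return buckets for depths L (leaf) down to 0 (root), as (id, data) lists."""
    buckets = []
    for d in range(L, -1, -1):
        # Eligible: blocks whose assigned path shares this depth-d bucket,
        # i.e., whose leaf agrees with `leaf` in the top d bits.
        eligible = [i for i in list(stash)
                    if pos_map[i] >> (L - d) == leaf >> (L - d)][:Z]
        buckets.append([(i, stash.pop(i)) for i in eligible])
        # A real client pads each bucket with dummies up to Z blocks and
        # re-encrypts everything before writing the path back to the server.
    return buckets  # any leftover blocks simply stay in the stash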
If you've been following along closely, you might have noticed that after a block is read, it ends up no deeper than the root node with probability at least 1/2, since its freshly assigned leaf lands in the other half of the tree half the time. However, even if this block is accessed twice in a row, the accesses will be indistinguishable from accesses to any other blocks.
To write data d to block i, the client proceeds exactly as if it were reading block i, but simply changes the contents of block i to d in the stash before re-encrypting it and re-writing it to the server.
And that's the basic version of path ORAM. Simple, right? Have a look at my path ORAM simulation on GitHub.
Why is path ORAM "secure" in the sense of hiding the client's access pattern? Its security is mainly due to the property that each data block is assigned a leaf that was chosen independently and uniformly at random, and is updated after each access. Therefore, each data access consists of reading and writing the path from the root to a random leaf. Since all values are re-encrypted with fresh randomness every time they are accessed, only the client can distinguish reads and writes.
The client must store the position map and the stash. The size of the position map is NL bits, or O(N log N) bits. During an access, the stash must store an entire path from the root to a leaf node (transient storage). The transient storage required is (L+1)ZB bits, which is O(log N) data blocks. Between accesses, the stash must store any blocks that overflowed (persistent storage). Stefanov et al. show that for a bucket size of Z = 5, the probability that the stash exceeds a capacity of R blocks is at most 14 · (0.6002)^R, i.e., it decreases exponentially with the stash size. (If you look at the path ORAM paper, you'll notice that the security analysis is half a page, while proving a bound on the persistent stash usage takes 8 pages!)
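To get a feel for that bound, here's a quick calculation of the smallest stash capacity R for which 14 · (0.6002)^R drops below 2⁻⁸⁰ (an illustrative failure target, not one from the paper):

```python
# Smallest R with 14 * 0.6002**R < 2**-80.
R = 1
while 14 * 0.6002 ** R >= 2 ** -80:
    R += 1
# R comes out to a little over a hundred blocks: the persistent stash
# stays small even for a 2^-80 overflow probability, independent of N.
```

So a stash of a few hundred blocks makes overflow astronomically unlikely, regardless of how much data is outsourced.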
The server must store the tree, whose total size is (2^(L+1) − 1)ZB bits, while it is storing NB bits of actual data (i.e., non-dummy data), which corresponds to an overhead of about 2Z.
For each block access, the server must send an entire path from the root to a leaf node, and the client must also send it back. So, the bandwidth cost is 2(L+1)ZB bits.
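Plugging concrete (and purely illustrative) parameters into the 2(L+1)ZB cost gives a sense of scale, say N = 2²⁰ blocks of B = 4096 bits each with Z = 4:

```python
# Bandwidth per access: a full path down and back up.
import math

N, B, Z = 2 ** 20, 4096, 4
L = math.ceil(math.log2(N))              # 20
bits_per_access = 2 * (L + 1) * Z * B    # 2(L+1)ZB
kib_per_access = bits_per_access / 8 / 1024
```

That works out to about 84 KiB of traffic to access a single 512-byte block, a large constant factor, but only logarithmic in N.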
There's a lot more to path ORAM than what I presented in this post. If you're interested in learning more about it, its variants, or ORAMs in general, here are a few recent papers: