Slide 1

Slide 1 text

Memory for Data-Oblivious Computation David Evans University of Virginia oblivc.org

Slide 2

Slide 2 text

Memory for Data-Oblivious Computation David Evans University of Virginia www.cs.virginia.edu/evans

Slide 3

Slide 3 text

Theory and Practice in Computing Quotes from Maurice Wilkes’s Turing Award Lecture (1967) Alan Turing

Slide 4

Slide 4 text

Secure Two-Party Computation Alice holds a; Bob holds b. They run a cryptographic protocol, and both obtain r = f(a, b). Alice learns nothing about b; Bob learns nothing about a.

Slide 5

Slide 5 text

FOCS 1982 FOCS 1986 Note: neither paper actually describes Yao’s protocol. Andrew Yao

Slide 6

Slide 6 text

Yao’s Protocol: Garbled Circuits Function expressed as a Boolean circuit Garbled evaluation: no information leaked Ridiculously expensive (but 10^12 times cheaper than 10 years ago) Garble Encode Evaluate Decode f garbled circuit F Y a b Generator Evaluator

Slide 7

Slide 7 text

Motivating Application: Secure Stable Matching

Slide 8

Slide 8 text

Alice Bob Colleen University Rankings A C B A B C C B A Student Preferences

Slide 9

Slide 9 text

Stable Matching Alice Bob Colleen ACB ABC CBA M = {(s_1, r_1), (s_2, r_2), …} is a stable matching if there is no pair (s_i, r_j) where both s_i and r_j prefer this match over the given match
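The stability condition above can be checked mechanically by searching for a blocking pair. The following is a minimal sketch in plain (non-secure) Python; the function names and the two-by-two example data are hypothetical and are not taken from the talk.

```python
def blocking_pairs(matching, s_prefs, r_prefs):
    """Return every (s, r) that prefer each other over their partners in `matching`.

    matching: dict mapping each s to its matched r (one-to-one, everyone matched).
    s_prefs, r_prefs: dict mapping a participant to a list of the other side,
    most preferred first.
    """
    partner_of_r = {r: s for s, r in matching.items()}
    pairs = []
    for s, prefs in s_prefs.items():
        # Only candidates ranked strictly above s's current match can block.
        for r in prefs[:prefs.index(matching[s])]:
            ranking = r_prefs[r]
            if ranking.index(s) < ranking.index(partner_of_r[r]):
                pairs.append((s, r))
    return pairs


def is_stable(matching, s_prefs, r_prefs):
    """A matching is stable exactly when it has no blocking pair."""
    return not blocking_pairs(matching, s_prefs, r_prefs)


# Hypothetical example data (not from the slides).
S_PREFS = {"s1": ["r1", "r2"], "s2": ["r1", "r2"]}
R_PREFS = {"r1": ["s2", "s1"], "r2": ["s1", "s2"]}
UNSTABLE = {"s1": "r1", "s2": "r2"}   # s2 and r1 would rather be together
STABLE = {"s1": "r2", "s2": "r1"}
```

This check runs in the clear; doing the same thing inside a secure computation would require the data-oblivious techniques discussed in the later slides.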

Slide 10

Slide 10 text

Gale-Shapley Algorithm Lloyd Shapley (1923-2016) accepting Nobel Prize (2012)

Slide 11

Slide 11 text

Stable Matching Applications Public schools in New York, Boston Singapore University Admissions Medical residents in US, Canada, others 35,000 applicants

Slide 12

Slide 12 text

Stable Matching Applications Public schools in New York, Boston Singapore University Admissions Medical residents in US, Canada, others 35,000 applicants Use Trusted Third Party to run matching algorithm: - Receives all private rankings and keeps confidential - Produces correct result - uncorrupted

Slide 13

Slide 13 text

Secure Two-Party Stable Matching Protocol Each group trusts one representative Doug Tsinghua

Slide 14

Slide 14 text

Secure Two-Party Stable Matching Protocol Each group trusts one representative XOR-share to 2 non-colluding parties Doug S T Tsinghua
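XOR-sharing a private value between the two non-colluding parties (S and T on the slide) can be sketched as follows. This is an illustrative sketch only: the function names and the 32-bit default width are assumptions, not part of the talk.

```python
import secrets


def xor_share(value: int, bits: int = 32):
    """Split `value` (which must fit in `bits` bits) into two XOR shares.

    Each share on its own is a uniformly random bit string, so neither
    party learns anything about `value` unless the two collude.
    """
    mask = secrets.randbits(bits)
    return mask, value ^ mask


def xor_reconstruct(share_s: int, share_t: int) -> int:
    """Recombine the two shares; XOR-ing the mask twice cancels it."""
    return share_s ^ share_t
```

In the protocol on the slide the shares would never be recombined in the clear; they would be fed into the two-party computation as the parties' inputs.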

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

Data-dependent lookup in size-n array

Slide 17

Slide 17 text

Data-dependent lookup in size-n array Data-dependent updates to size-n^2 array

Slide 18

Slide 18 text

Data-dependent lookup in size-n array Oblivious conditionals: need to always execute all paths Data-dependent updates to size-n^2 array
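An oblivious conditional evaluates every path and then selects one result with a multiplexer, so that control flow never depends on private data. The sketch below models that pattern in ordinary Python; `mux` and `oblivious_abs_diff` are hypothetical names, and Python itself gives no timing guarantees, so this only illustrates the shape of the computation.

```python
def mux(cond: int, if_true: int, if_false: int) -> int:
    """Select one of two values without branching; `cond` must be 0 or 1."""
    mask = -cond  # 0 stays 0; 1 becomes -1, i.e. all one-bits in Python ints
    return (if_true & mask) | (if_false & ~mask)


def oblivious_abs_diff(a: int, b: int) -> int:
    """Compute |a - b| by evaluating both 'branches' and selecting one."""
    a_ge_b = int(a >= b)       # in a circuit this would be a comparison gadget
    then_path = a - b          # always computed
    else_path = b - a          # always computed
    return mux(a_ge_b, then_path, else_path)
```

The cost is the sum of both paths rather than the cost of the path taken, which is why oblivious conditionals are expensive.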

Slide 19

Slide 19 text

Data-Oblivious Array Access a[i] = x Depends on private data

Slide 20

Slide 20 text

Circuit for Array Update
Linear Scan: need to touch every array element to hide which one is real
i == 0, a[0], x → a'[0]
i == 1, a[1], x → a'[1]
i == 2, a[2], x → a'[2]
i == 3, a[3], x → a'[3]
…
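The linear-scan update can be modeled as one equality test and one multiplexer per array position. The following is a minimal sketch in plain Python, assuming integer elements; `linear_scan_write` is a hypothetical name, and in a real garbled-circuit setting `i` and `x` would be secret values rather than ordinary integers.

```python
def linear_scan_write(a, i, x):
    """Return a' with a'[i] = x, processing every position identically."""
    updated = []
    for j, old in enumerate(a):
        hit = int(j == i)                        # the "i == j" comparison
        updated.append(hit * x + (1 - hit) * old)  # multiplexer: x if hit else old
    return updated
```

Every element is rewritten on every access, so the memory access pattern is the same regardless of `i`; the price is work proportional to the array size.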

Slide 21

Slide 21 text

Linear Scan Doesn’t Scale Writing a single 32-bit integer: 32 logic gates Raw Yao’s performance ≈ 3M gates per second Write speed ≈ 100,000 elements per second (not hiding access pattern) For hiding access pattern, N = 2^17 elements requires > 1 second per access
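The figures on this slide follow from simple arithmetic, reproduced below as a sketch. The constants are the slide's round numbers (32 gates per element, about 3M gates per second), and N is taken to be 2^17 = 131,072 elements; the function names are hypothetical.

```python
GATES_PER_ELEMENT = 32          # one 32-bit write, per the slide
GATES_PER_SECOND = 3_000_000    # approximate raw garbled-circuit throughput


def writes_per_second() -> float:
    """Element writes per second when the access pattern is not hidden."""
    return GATES_PER_SECOND / GATES_PER_ELEMENT


def linear_scan_seconds(n_elements: int) -> float:
    """Time for one access when every one of n_elements must be touched."""
    return n_elements * GATES_PER_ELEMENT / GATES_PER_SECOND
```

With these constants, `writes_per_second()` is a little under 100,000, and `linear_scan_seconds(2**17)` is about 1.4 seconds, matching the slide's "> 1 second per access".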

Slide 22

Slide 22 text

Traditional ORAM Client Untrusted Server [Goldreich 1987] Security property: all initialization and access sequences of the same length are indistinguishable to server. Sublinear client-side state Linear server-side encrypted state Initialize Access

Slide 23

Slide 23 text

RAM-SC [Gordon, Katz, Kolesnikov, Krell, Malkin, Raykova, Vahlis 2012] Alice Bob MPC Protocol Public ORAM state Public ORAM state Encrypted Results Oblivious ORAM state Initialize Access Encrypted ORAM Data

Slide 24

Slide 24 text

Circuit ORAM: State-of-the-ORAM-Art in 2015 Xiao Wang, Hubert Chan, and Elaine Shi. Circuit ORAM: On Tightness of the Goldreich-Ostrovsky Lower Bound. In ACM CCS 2015. [Plot of access time: Circuit ORAM, Θ(log^3 N), compared with linear scan]

Slide 25

Slide 25 text

Classical Square-Root ORAM [Ostrovsky and Goldreich, 1992]

Slide 26

Slide 26 text

Problems with SQ-ORAM Design
- Requires a PRF for each ORAM access (pseudo-random function: a big circuit in MPC)
- Initialization requires PRF evaluations
- Requires oblivious sort twice: shuffling memory according to PRF; removing dummy blocks
Solution strategy: use random permutation instead of PRF

Slide 27

Slide 27 text

Shuffling Network [Waksman 1968] Cost per shuffle: 5B

Slide 28

Slide 28 text

4-Block ORAM

Slide 29

Slide 29 text

4-Block ORAM Cost: 5B + B + 2B + 3B + … = 11B every 3 accesses

Slide 30

Slide 30 text

Linear scan Cost: 4B = 12B/3 Our scheme Cost: 11B/3 Less expensive than linear scan for 4 blocks (8 with overhead)
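The comparison can be reproduced with exact fractions. This sketch works in units of B and assumes one reading of the slide's terms: 5B for the shuffle, then B, 2B and 3B for the three accesses served before the next shuffle. That breakdown and the function names are assumptions, not statements from the talk.

```python
from fractions import Fraction


def four_block_oram_cost_per_access() -> Fraction:
    """Amortized cost per access, in units of B, for the 4-block scheme."""
    shuffle_cost = 5              # 5B per shuffle
    access_costs = [1, 2, 3]      # B + 2B + 3B (assumed per-access terms)
    return Fraction(shuffle_cost + sum(access_costs), len(access_costs))


def linear_scan_cost_per_access(n_blocks: int = 4) -> Fraction:
    """Linear scan touches every block on every access: n_blocks * B."""
    return Fraction(n_blocks)
```

The result is 11/3 B per access against 12/3 B for linear scan, which is the slide's point that the scheme already wins at 4 blocks before implementation overhead is counted.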

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

Logical index/4 Logical index/2

Slide 33

Slide 33 text

Logical index/4 Logical index/2 read a[8] First Access

Slide 38

Slide 38 text

After First Access Used (Public)

Slide 39

Slide 39 text

Second Access read a[9]

Slide 40

Slide 40 text

Second Access read a[9] Randomly select unused element

Slide 41

Slide 41 text

Second Access read a[9] Randomly select unused element Randomly select unused element

Slide 44

Slide 44 text

After Second Access

Slide 45

Slide 45 text

Position map 3 0 2 1 0 1 2 3 1 3 0 2 0 1 2 3

Slide 46

Slide 46 text

Creating position map

Slide 48

Slide 48 text

Inverse permutation Given the shared permutation $\pi$, Alice picks a random masking permutation $\pi_a$. Composed permutation $\pi_b = \pi_a \cdot \pi$ revealed to Bob

Slide 49

Slide 49 text

Inverse permutation With $\pi_b = \pi_a \cdot \pi$, Bob computes $\pi_b^{-1} = \pi^{-1} \cdot \pi_a^{-1}$. Then $\pi_b^{-1} \cdot \pi_a = \pi^{-1} \cdot \pi_a^{-1} \cdot \pi_a = \pi^{-1}$
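The masking trick can be checked with a small simulation in the clear. In this sketch, `pi` is the permutation to invert, `pi_a` is Alice's random mask, and `pi_b` is the composed permutation revealed to Bob; these names, the list representation, and the convention that `compose(f, g)` applies `g` first are all assumptions made for illustration. In the actual protocol the compositions would be carried out on hidden values inside the secure computation, not on plain lists.

```python
import random


def compose(f, g):
    """Return the permutation i -> f[g[i]] (apply g first, then f)."""
    return [f[g[i]] for i in range(len(g))]


def inverse(p):
    """Return q with q[p[i]] = i for every i."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return inv


def masked_inverse(pi, rng=random):
    """Invert `pi` while only ever inverting the masked permutation in the clear."""
    pi_a = list(range(len(pi)))
    rng.shuffle(pi_a)               # Alice's random masking permutation
    pi_b = compose(pi_a, pi)        # pi_b = pi_a . pi, revealed to Bob
    pi_b_inv = inverse(pi_b)        # Bob inverts locally: pi^-1 . pi_a^-1
    return compose(pi_b_inv, pi_a)  # pi^-1 . pi_a^-1 . pi_a = pi^-1
```

Because `pi_a` is uniformly random, `pi_b` is a uniformly random permutation from Bob's point of view and reveals nothing about `pi`.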

Slide 50

Slide 50 text

Scheme
1. Shuffle elements
2. Recreate position map
3. Service T = √(N log N) accesses
Amortized cost: Θ(B √(N log^3 N)) per access
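The three steps on this slide (shuffle, recreate the position map, serve a batch of accesses before repeating) can be illustrated with a toy single-level simulation. This is a sketch under stated assumptions, not the authors' implementation: it runs in the clear, uses Python dictionaries where a secure version would need an oblivious stash scan and a recursively stored position map, and defaults the refresh period to the integer square root of N purely for illustration. The class and attribute names are hypothetical.

```python
import math
import random


class SqrtORAMSketch:
    """Toy model of a shuffle / position-map / stash access cycle."""

    def __init__(self, values, period=None, rng=None):
        self.rng = rng or random.Random()
        n = len(values)
        # Hypothetical default period; must not exceed the number of blocks.
        self.period = min(period or max(1, math.isqrt(n)), n)
        self.trace = []  # physical slots touched, i.e. what an observer sees
        self._reshuffle(list(enumerate(values)))

    def _reshuffle(self, blocks):
        self.rng.shuffle(blocks)                  # step 1: shuffle elements
        self.store = blocks                       # slot -> (logical index, value)
        self.pos = {idx: slot for slot, (idx, _) in enumerate(blocks)}  # step 2
        self.used = [False] * len(blocks)
        self.stash = {}                           # logical index -> value
        self.accesses = 0

    def access(self, index, new_value=None):
        """Read (and optionally overwrite) the block at logical `index`."""
        if index in self.stash:
            # Already fetched this period: touch a random unused slot instead.
            unused = [s for s, u in enumerate(self.used) if not u]
            slot = self.rng.choice(unused)
        else:
            slot = self.pos[index]
        self.trace.append(slot)
        self.used[slot] = True
        fetched_index, fetched_value = self.store[slot]
        self.stash[fetched_index] = fetched_value
        result = self.stash[index]
        if new_value is not None:
            self.stash[index] = new_value
        self.accesses += 1
        if self.accesses == self.period:          # step 3 done: start over
            blocks = [b for s, b in enumerate(self.store) if not self.used[s]]
            blocks += list(self.stash.items())
            self._reshuffle(blocks)
        return result
```

Within one period every physical slot is touched at most once and the slots are randomly placed, which is what makes repeated accesses to the same logical index look like accesses to fresh random locations.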

Slide 51

Slide 51 text

Initialization cost

Slide 52

Slide 52 text

Per-Access Cost (not counting initialization) [Plots of per-access cost for 16-byte blocks and 32-byte blocks] Good enough for National Residency Match?

Slide 53

Slide 53 text

Cost of Matching Best previous result: 128x128 pairs in > 1000 hours [Keller & Scholl 2014] Using Square-Root ORAM: 512x512 pairs in 33 hours Scale needed for national residency match: 35,000 Need 1000x improvement…

Slide 54

Slide 54 text

Scaling to National Match Roth-Peranson: asymmetric matchings – Algorithm that is actually used for NRMP, school matchings, etc. Initialize state by permuting and interleaving Take advantage of data-independent memory patterns: locality, batching, partitioning

Slide 55

Slide 55 text

Oblivious Multilist

Slide 56

Slide 56 text

Phase            Time          Non-Free Gates   Gates/second
Initialization   2.07 hours    34 B             4.57 M
Bidding          15.01 hours   173 B            3.19 M
Total            17.08 hours   207 B            3.36 M
Simulated 2016 US National Medical Residency Match: 35,476 prospective residents matching with 4836 programs with 30,750 total slots
Running between 2 EC2.c4xlarge nodes in same region (1 Gbps)

Slide 57

Slide 57 text

University of Virginia Charlottesville, Virginia Jack Doerner Samee Zahur

Slide 58

Slide 58 text

David Evans www.cs.virginia.edu/evans oblivc.org