Memory for Data-Oblivious Computation

David Evans

June 25, 2016

Transcript

  1. Memory for
    Data-Oblivious
    Computation
    David Evans
    University of Virginia
    oblivc.org

    View Slide

  2. Memory for
    Data-Oblivious Computation
    David Evans
    University of Virginia
    www.cs.virginia.edu/evans

    View Slide

  3. Theory and Practice in Computing
    Quotes from Maurice Wilkes’s Turing Award Lecture (1967)
    Alan Turing

    View Slide

  4. Secure Two-Party Computation
    Alice (input a) and Bob (input b) run a Cryptographic Protocol
    Both obtain r = f(a, b)
    Alice learns nothing about b; Bob learns nothing about a

    View Slide

  5. FOCS 1982
    FOCS 1986
    Note: neither paper actually
    describes Yao’s protocol.
    Andrew Yao

    View Slide

  6. Yao’s Protocol: Garbled Circuits
    Function expressed as a Boolean Circuit
    Garbled evaluation: no information leaked
    Ridiculously expensive (but 10^12 times cheaper than 10 years ago)
    Garble
    Encode
    Evaluate
    Decode
    f
    garbled circuit F
    Y
    a b
    Generator Evaluator

    View Slide

  7. Motivating Application:
    Secure Stable Matching

    View Slide

  8. University Rankings
    Student Preferences
    Alice: A C B
    Bob: A B C
    Colleen: C B A

    View Slide

  9. Stable Matching
    Alice: ACB
    Bob: ABC
    Colleen: CBA
    M = { (s_1, r_1), (s_2, r_2), … } is a stable matching if there is no pair (s_i, r_j) where both s_i and r_j prefer this match over the given match

    View Slide
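The stability condition above can be checked directly by searching for a blocking pair. The following is a minimal sketch: the left-hand preference lists come from the slide, while the right-hand rankings in `r_prefs` and the function name `blocking_pairs` are assumptions made only for illustration.

```python
def blocking_pairs(matching, s_prefs, r_prefs):
    """Return every (s, r) where both s and r prefer each other to their match."""
    partner_of_s = {s: r for s, r in matching}
    partner_of_r = {r: s for s, r in matching}
    pairs = []
    for s, prefs in s_prefs.items():
        for r in prefs:
            if r == partner_of_s[s]:
                break  # everything after this point is ranked below s's match
            ranking = r_prefs[r]
            # s prefers r; r blocks only if it also ranks s above its own match.
            if ranking.index(s) < ranking.index(partner_of_r[r]):
                pairs.append((s, r))
    return pairs


# Preferences of Alice, Bob, Colleen as shown on the slide.
s_prefs = {"Alice": ["A", "C", "B"], "Bob": ["A", "B", "C"], "Colleen": ["C", "B", "A"]}
# Hypothetical rankings for A, B, C (not given in the transcript).
r_prefs = {
    "A": ["Alice", "Bob", "Colleen"],
    "B": ["Bob", "Colleen", "Alice"],
    "C": ["Colleen", "Alice", "Bob"],
}

stable = [("Alice", "A"), ("Bob", "B"), ("Colleen", "C")]
unstable = [("Alice", "C"), ("Bob", "A"), ("Colleen", "B")]
```

A matching is stable exactly when `blocking_pairs` returns an empty list.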

  10. Gale-Shapley Algorithm
    Lloyd Shapley (1923-2016)
    accepting Nobel Prize (2012)

    View Slide
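For reference, a plain (non-secure) version of the Gale-Shapley algorithm fits in a few lines. This is a textbook sketch with complete, equal-length preference lists assumed; the names `gale_shapley`, `proposer_prefs`, and `reviewer_prefs` and the sample rankings are illustrative and not taken from the talk.

```python
def gale_shapley(proposer_prefs, reviewer_prefs):
    """Return a stable matching as a dict proposer -> reviewer."""
    # rank[r][p] is the position of proposer p in reviewer r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in reviewer_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list position to propose to
    engaged_to = {}                               # reviewer -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                     # r was unmatched, accept
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                     # r trades up, old partner is free
            free.append(current)
        else:
            free.append(p)                        # r rejects p, p tries again later
    return {p: r for r, p in engaged_to.items()}


proposers = {"Alice": ["A", "C", "B"], "Bob": ["A", "B", "C"], "Colleen": ["C", "B", "A"]}
reviewers = {
    "A": ["Alice", "Bob", "Colleen"],
    "B": ["Bob", "Colleen", "Alice"],
    "C": ["Colleen", "Alice", "Bob"],
}
result = gale_shapley(proposers, reviewers)
```

The branches and array lookups in this loop depend on the preference data, which is why running it inside a secure computation needs the oblivious memory techniques discussed in the rest of the talk.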

  11. Stable Matching Applications
    Public schools in New York, Boston
    Singapore University Admissions
    Medical residents in
    US, Canada, others
    35,000 applicants

    View Slide

  12. Stable Matching Applications
    Public schools in New York, Boston
    Singapore University Admissions
    Medical residents in
    US, Canada, others
    35,000 applicants
    Use Trusted Third Party to run matching algorithm:
    - Receives all private rankings and keeps confidential
    - Produces correct result - uncorrupted

    View Slide

  13. Secure Two-Party Stable Matching Protocol
    Each group trusts one representative
    Doug Tsinghua

    View Slide

  14. Secure Two-Party Stable Matching Protocol
    Each group trusts one representative
    XOR-share to 2 non-colluding parties
    Doug
    S
    T
    Tsinghua

    View Slide
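XOR sharing splits a private value into two shares that are each uniformly random on their own and only reveal the value when combined. The sketch below assumes a ranking is encoded as a 32-bit integer; the function names and the encoding are hypothetical.

```python
import secrets


def xor_share(value: int, bits: int = 32):
    """Split value into two shares; either share alone is uniformly random."""
    share_s = secrets.randbits(bits)   # goes to the first party
    share_t = value ^ share_s          # goes to the second party
    return share_s, share_t


def xor_reconstruct(share_s: int, share_t: int) -> int:
    """Recombine both shares; only possible if the two parties cooperate."""
    return share_s ^ share_t


ranking = 0b10_01_00          # hypothetical packed preference list
s, t = xor_share(ranking)
recovered = xor_reconstruct(s, t)
```

As long as the two parties holding `s` and `t` do not collude, neither learns anything about `ranking`.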

  15. View Slide

  16. Data-dependent
    lookup in size-n array

    View Slide

  17. Data-dependent
    lookup in size-n array
    Data-dependent
    updates to size-n^2 array

    View Slide

  18. Data-dependent
    lookup in size-n array
    Oblivious conditionals:
    need to always execute
    all paths
    Data-dependent
    updates to size-n^2 array

    View Slide
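An oblivious conditional computes both branches and then selects one result with a multiplexer, so the work done does not depend on the secret condition. The sketch below models that structure in ordinary Python; in a real protocol the condition bit and the operands would be garbled values, and `oblivious_select` and `update_score` are illustrative names.

```python
def oblivious_select(cond_bit: int, if_true: int, if_false: int) -> int:
    """Branch-free multiplexer: if_true when cond_bit is 1, if_false when it is 0."""
    mask = -cond_bit  # 0 stays 0; 1 becomes -1, which is all one bits
    return (if_true & mask) | (if_false & ~mask)


def update_score(is_match: int, score: int) -> int:
    doubled = score * 2        # "then" path, always executed
    decremented = score - 1    # "else" path, always executed
    return oblivious_select(is_match, doubled, decremented)
```

The cost is the sum of both paths plus the multiplexer, rather than the cost of whichever path is taken.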

  19. Data-Oblivious Array Access
    a[i] = x
    Depends on private data

    View Slide

  20. Circuit for Array Update
    Linear Scan: need to touch every array element to hide which one is real
    a'[0] = (i == 0) ? x : a[0]
    a'[1] = (i == 1) ? x : a[1]
    a'[2] = (i == 2) ? x : a[2]
    a'[3] = (i == 3) ? x : a[3]

    View Slide
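The linear-scan circuit can be sketched as a loop that applies the same multiplexer to every element, so an observer cannot tell which index was real. This is a plaintext model only; the helper names `mux`, `oblivious_write`, and `oblivious_read` are assumptions for illustration.

```python
def mux(bit: int, if_one: int, if_zero: int) -> int:
    """Branch-free select between two integers based on a 0/1 bit."""
    mask = -bit
    return (if_one & mask) | (if_zero & ~mask)


def oblivious_write(a, i, x):
    """Return a' with a'[i] = x, touching every element exactly once."""
    return [mux(int(j == i), x, a[j]) for j in range(len(a))]


def oblivious_read(a, i):
    """Return a[i] by scanning the whole array."""
    result = 0
    for j in range(len(a)):
        result = mux(int(j == i), a[j], result)
    return result
```

Both operations cost one comparison and one multiplexer per element, which is the linear overhead the following slides set out to avoid.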

  21. Linear Scan Doesn’t Scale
    Writing a single 32-bit integer: 32 logic gates
    Raw Yao’s performance ≈ 3M gates per second
    Write speed ≈ 100,000 elements per second (not hiding access pattern)
    For hiding access pattern, N = 2^17 elements requires > 1 second per access

    View Slide
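The arithmetic behind these figures can be reproduced in a few lines, reading the array size as N = 2^17 (131,072) elements. The constant names are illustrative; the inputs are the slide's own round numbers.

```python
GATES_PER_WRITE = 32          # one 32-bit integer
GATES_PER_SECOND = 3_000_000  # raw Yao throughput quoted on the slide

# Non-oblivious write speed: roughly the slide's "100,000 elements per second".
writes_per_second = GATES_PER_SECOND / GATES_PER_WRITE

# A linear scan touches all N elements for every single access.
N = 2**17
seconds_per_access = N / writes_per_second
```

With these inputs `writes_per_second` is 93,750 and one hidden access takes about 1.4 seconds, consistent with the slide's "> 1 second per access".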

  22. Traditional ORAM
    [Goldreich 1987]
    Client: sublinear client-side state
    Untrusted Server: linear server-side encrypted state
    Operations: Initialize, Access
    Security property: all initialization and access sequences of the same length are indistinguishable to server.
    View Slide

  23. RAM-SC
    [Gordon, Katz, Kolesnikov, Krell, Malkin, Raykova, Vahlis 2012]
    Alice Bob
    MPC Protocol
    Public
    ORAM
    state
    Public
    ORAM
    state
    Encrypted
    Results
    Oblivious
    ORAM
    state
    Initialize
    Access
    Encrypted
    ORAM
    Data

    View Slide

  24. Circuit ORAM
    Access time
    Xiao Wang, Hubert Chan, and Elaine Shi. Circuit ORAM: On Tightness of the Goldreich-Ostrovsky Lower Bound. In ACM CCS 2015.
    State-of-the-ORAM-Art in 2015
    Θ(log^3 N)
    Linear scan

    View Slide

  25. Classical Square-Root ORAM
    [Ostrovsky and Goldreich, 1992]

    View Slide

  26. Problems with SQ-ORAM Design
    Requires a PRF for each ORAM access
    Pseudo-random function: a big circuit in MPC
    Initialization requires PRF evaluations
    Requires oblivious sort twice:
    Shuffling memory according to PRF
    Removing dummy blocks
    Solution strategy: use random
    permutation instead of PRF

    View Slide

  27. Shuffling Network [Waksman 1968]
    Cost per shuffle: 5B

    View Slide

  28. 4-Block ORAM

    View Slide

  29. 4-Block ORAM
    Cost: 5B + B + 2B + 3B + … = 11B every 3 accesses

    View Slide

  30. Linear scan
    Cost: 4B = 12B/3
    Our scheme
    Cost: 11B/3
    Less expensive than linear scan for 4 blocks (8 with overhead)

    View Slide
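The comparison for four blocks can be written out as a short calculation in units of B. Reading the terms as one shuffle (5B) followed by three accesses costing B, 2B, and 3B is an assumption about how the slide's sum is grouped; the totals themselves are the slide's.

```python
from fractions import Fraction

SHUFFLE_COST = 5           # "Cost per shuffle: 5B"
ACCESS_COSTS = [1, 2, 3]   # the B, 2B, 3B terms (assumed: one term per access)

period_cost = SHUFFLE_COST + sum(ACCESS_COSTS)                 # 11, in units of B
scheme_per_access = Fraction(period_cost, len(ACCESS_COSTS))   # 11/3
linear_scan_per_access = Fraction(4)                           # touch all 4 blocks
```

Since 11/3 is less than 4 = 12/3, the scheme already undercuts a linear scan at four blocks before constant overheads are counted.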

  31. View Slide

  32. Logical index/4
    Logical index/2

    View Slide

  33. Logical index/4
    Logical index/2
    read a[8]
    First Access

    View Slide

  34. Logical index/4
    Logical index/2
    read a[8]
    First Access

    View Slide

  35. Logical index/4
    Logical index/2
    read a[8]
    First Access

    View Slide

  36. Logical index/4
    Logical index/2
    read a[8]
    First Access

    View Slide

  37. Logical index/4
    Logical index/2
    read a[8]
    First Access

    View Slide

  38. After First
    Access
    Used (Public)

    View Slide

  39. Second
    Access
    read a[9]

    View Slide

  40. Second
    Access
    read a[9]
    Randomly select unused element

    View Slide

  41. Second
    Access
    read a[9]
    Randomly select unused element
    Randomly select unused element

    View Slide

  42. Second
    Access
    read a[9]
    Randomly select unused element
    Randomly select unused element

    View Slide

  43. Second
    Access
    read a[9]
    Randomly select unused element
    Randomly select unused element

    View Slide

  44. After Second
    Access

    View Slide

  45. Position map
    3 0 2 1
    0 1 2 3
    1 3 0 2
    0 1 2 3

    View Slide

  46. Creating position map

    View Slide

  47. Creating position map

    View Slide

  48. Inverse permutation
    Alice picks a random masking permutation
    Composed permutation revealed to Bob
    [Permutation formulas not recoverable from the slide export]
    View Slide

  49. Inverse permutation
    Bob computes
    [Inverse-permutation derivation not recoverable from the slide export]

    View Slide
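The masking idea named on these slides can be illustrated in the clear: a secret permutation is composed with a random mask, the composition is revealed and inverted, and the mask is then stripped off again. The sketch below is one reading of that idea, not a transcription of the slide's formulas; the composition order, the helper names, and running everything in plaintext are all assumptions.

```python
import random


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return [p[q[i]] for i in range(len(q))]


def invert(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return inv


def masked_inverse(pi, rng=None):
    rng = rng or random.SystemRandom()
    mask = list(range(len(pi)))
    rng.shuffle(mask)                    # Alice's random masking permutation
    revealed = compose(pi, mask)         # the only permutation Bob sees in the clear
    bob_inverse = invert(revealed)       # equals mask^-1 composed with pi^-1
    return compose(mask, bob_inverse)    # mask cancels, leaving pi^-1


pi = [3, 0, 2, 1]
pi_inverse = masked_inverse(pi)
```

Because `mask` is uniformly random, `revealed` is also a uniformly random permutation and tells Bob nothing about `pi`. In the actual protocol the compositions with secret permutations would be carried out inside the secure computation rather than in plaintext.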

  50. Scheme
    1. Shuffle elements
    2. Recreate position map
    3. Service =
    log accesses
    Amortized cost: Θ log7 per access

    View Slide
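The shuffle, position-map, and access cycle can be modeled with a small plaintext simulation. This is a toy sketch with no cryptography: the class name `ToySqrtOram`, the explicit `period` parameter (standing in for the number of accesses served between shuffles), and the dictionary used as a stash are all assumptions made for illustration.

```python
import random


class ToySqrtOram:
    """Plaintext model of the shuffle / position-map / access cycle."""

    def __init__(self, data, period, seed=None):
        assert 1 <= period <= len(data)
        self.rng = random.Random(seed)
        self.values = list(data)   # logical contents, refreshed at each reshuffle
        self.period = period       # accesses served between shuffles
        self.trace = []            # physical slots touched, as an observer sees them
        self._reshuffle()

    def _reshuffle(self):
        n = len(self.values)
        slots = list(range(n))
        self.rng.shuffle(slots)            # step 1: shuffle elements
        self.position = slots              # step 2: position map, logical -> physical
        self.owner = [0] * n               # physical -> logical
        for logical, physical in enumerate(slots):
            self.owner[physical] = logical
        self.used = [False] * n
        self.stash = {}
        self.served = 0

    def access(self, index, new_value=None):
        if self.served == self.period:     # period over: write back and reshuffle
            for logical, value in self.stash.items():
                self.values[logical] = value
            self._reshuffle()
        if index in self.stash:
            # Already fetched this period: touch a random unused slot instead.
            unused = [p for p, u in enumerate(self.used) if not u]
            physical = self.rng.choice(unused)
        else:
            physical = self.position[index]
        self.used[physical] = True
        self.trace.append(physical)
        fetched = self.owner[physical]     # whatever was fetched moves to the stash
        self.stash.setdefault(fetched, self.values[fetched])
        if new_value is not None:
            self.stash[index] = new_value
        self.served += 1                   # step 3: one more access served
        return self.stash[index]
```

Within one period no physical slot is touched twice, so the observed trace is a sequence of distinct, randomly placed slots regardless of which logical indices were requested. A real construction would also scan the stash obliviously and keep the position map itself hidden.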

  51. Initialization cost

    View Slide

  52. Per-Access Cost (not counting initialization)
    [Plot: per-access cost for 16-byte blocks and 32-byte blocks]
    Good enough for National Residency Match?

    View Slide

  53. Cost of Matching
    Best previous result: 128x128 pairs in > 1000 hours
    [Keller & Scholl 2014]
    Using Square-Root ORAM: 512x512 pairs in 33 hours
    Scale needed for national residency match: 35,000
    Need 1000x improvement…

    View Slide

  54. Scaling to National Match
    Roth-Peranson: asymmetric matchings
    – Algorithm that is actually used for NRMP, school
    matchings, etc.
    Initialize state by permuting and interleaving
    Take advantage of data-independent memory
    patterns: locality, batching, partitioning

    View Slide

  55. Oblivious Multilist

    View Slide

  56. Simulated 2016 US National Medical Residency Match:
    35,476 prospective residents matching with 4,836 programs with 30,750 total slots
    Running between 2 EC2.c4xlarge nodes in same region (1 Gbps)

    Phase           Time         Non-Free Gates  Gates/second
    Initialization  2.07 hours   34 B            4.57 M
    Bidding         15.01 hours  173 B           3.19 M
    Total           17.08 hours  207 B           3.36 M

    View Slide
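The throughput column follows from the time and gate-count columns. The short check below recomputes it from the rounded figures on the slide, so the results agree with the reported rates only to within rounding; the variable names are illustrative.

```python
# (hours, non-free gates) per phase, as reported on the slide.
phases = {
    "Initialization": (2.07, 34e9),
    "Bidding": (15.01, 173e9),
}


def gates_per_second(hours, gates):
    return gates / (hours * 3600)


rates = {name: gates_per_second(h, g) for name, (h, g) in phases.items()}

total_hours = sum(h for h, _ in phases.values())
total_gates = sum(g for _, g in phases.values())
total_rate = gates_per_second(total_hours, total_gates)
```

The recomputed rates land within about 0.02 M gates per second of the reported 4.57 M, 3.19 M, and 3.36 M.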

  57. University of Virginia
    Charlottesville, Virginia
    Jack
    Doerner
    Samee
    Zahur

    View Slide

  58. David Evans
    [email protected]
    www.cs.virginia.edu/evans
    oblivc.org

    View Slide