
Differential Privacy - Data Science with Privacy at Scale


髙橋翼 / Tsubasa TAKAHASHI
LINE Corporation, Senior Research Scientist

2023.6.9 Lecture material for the "Advanced Research (CI)" class at Keio University

Transcript

  1. Differential Privacy
    - Data Science with Privacy at Scale
    2023.6.9 Keio University "Advanced Research (CI)"
    Tsubasa TAKAHASHI
    Senior Research Scientist
    LINE Corp.


  2. • Self Introduction / LINE’s R&D in Privacy Techs (5min)
    • Privacy Risks, Issues, and Case-studies (10min)
    • Differential Privacy (Central Model) (30min)
    • Query Release via Laplace Mechanism
    • Machine Learning via DP-SGD
    • Local Differential Privacy (20min)
    • Stats Gathering via Randomized Response
    • Federated Learning via LDP-SGD
    • Shuffle Model – an intermediate privacy model (10min)
    • QA
    Table of Contents
    2


  3. Tsubasa TAKAHASHI, Ph.D.
    Senior Research Scientist at LINE Data Science Center
    3
    R&D Activity
    • R&D on Privacy Techs (LINE Data Science Center)
    • Differential Privacy / Federated Learning / …
    • R&D on Trustworthy AI
    Selected Publication
    • Differential Privacy @VLDB22 / SIGMOD22 / ICLR22 /
    ICDE21
    • DP w/ Homomorphic Encryption @BigData22
    • Adversarial Attacks @BigData19
    • Anomaly/OOD Detection @WWW17 / WACV23
    Career timeline
    • B.E. / M.E. (CS) and Ph.D. from U. Tsukuba; Visiting Scholar @CMU; Kambayashi Award (上林奨励賞)
    • NEC Central Labs (2010~18): R&D on Data Privacy (2010~15), R&D on AI Security (2016~18)
    • LINE (2018.12~): R&D on Privacy Tech (2019~)


  4. • Publications at major database and machine learning conferences
    • These achievements are based on collaborations w/ academia
    5
    LINE’s R&D on Privacy Techs
    https://linecorp.com/ja/pr/news/ja/2022/4269


  5. • Released in late September 2022
    • The learned sticker recommendation feature is now on your app
    6
    Federated Learning w/ Differential Privacy
    https://www.youtube.com/watch?v=kTBshg1O7b0
    https://tech-verse.me/ja/sessions/124


  6. Privacy Tech is an “Innovation Trigger”
    7
    https://infocert.digital/analyst-reports/2021-gartner-hype-cycle-for-privacy/
    Market trends: the 2021 Gartner Hype Cycle for Privacy


  7. Privacy Risks, Issues, and
    Case-studies
    8


  8. • Even when only statistical information is disclosed, the "difference" can
    reveal the data of specific individuals.
    9
    Difference Attack
    Before: 30 engineers, avg. salary = 7M JPY.
    After Alice retires: 29 engineers, avg. salary = 6.8M JPY.
    Alice’s salary can be revealed by using this simple math:
    7M × 30 − 6.8M × 29 = 12.8M JPY
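    A minimal Python sketch of this arithmetic (values taken from the slide above):

    # Difference attack: recover Alice's salary from two released averages.
    n_before, avg_before = 30, 7.0   # M JPY, while Alice is employed
    n_after, avg_after = 29, 6.8     # M JPY, after Alice retired
    alice_salary = avg_before * n_before - avg_after * n_after
    print(alice_salary)              # ~12.8 (M JPY)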


  9. • From stats delivered to advertisers, various user info could be estimated
    • The vulnerability has been fixed.
    10
    A case study of difference attack: Facebook’s PII-based Targeting
    https://www.youtube.com/watch?v=Lp-IwYvxGpk
    https://www.ftc.gov/system/files/documents/public_events/1223263/p155407privacyconmislove_1.pdf
    Facebook had installed thresholding, rounding, etc. for disclosure control, but
    these safeguards could be bypassed, e.g., estimating a phone number from an
    e-mail address, or estimating web access history.


  10. • Recreation of individual-level data from tabular or aggregate data
    11
    Database Reconstruction
    Reconstruction works like "sudoku": there are rules, algorithms, and dependencies.
    An example: https://www2.census.gov/about/training-workshops/2021/2021-05-07-das-presentation.pdf


  11. • The US Census reports various stats for policy-making and academic research
    • For the 2010 results, it was found that reconstruction attacks are possible
    12
    A case study of reconstruction: US Census 2010
    https://www2.census.gov/about/training-workshops/2021/2021-05-07-das-presentation.pdf


  12. • k-anonymization: every record has at least k−1 other records with the same quasi-identifiers
    • Quasi-identifiers: a (predefined) combination of attributes that has a chance to
    identify individuals
    13
    k-anonymization
    https://dataprivacylab.org/dataprivacy/projects/kanonymity/kanonymity.pdf


  13. • k-anonymization is effective only under assumptions about the adversary’s knowledge
    • By linking external knowledge, an adversary can achieve re-identification
    14
    k-anonymization is vulnerable


  14. • A data analytics competition published a dataset with identifiers removed
    • Unfortunately, the records could be re-identified and linked w/ public data
    15
    A case study of de-anonymization: Netflix Prize
    Anonymous Netflix data (pseudo ID, title, rating, review date), e.g.:
      1023: xxx 5 (20xx/1/1), yyy 5 (20xx/1/1), zzz 2 (20xx/1/1), …, aaa 5 (20xx/3/1)
      20ab: xxx 4, zzz 5, …
      98u7: ddd 2 (20xx/4/5), fff 4 (20xx/4/6)
    can be linked with public IMDB reviews (title, rating, comments, which may include
    political interests), identifying pseudonymous users (e.g., Mr. A, Ms. B) as real persons.
    With 8 ratings, records can be identified with 99% accuracy; with 2 ratings, 68%.


  15. Differential Privacy
    16


  16. What is Differential Privacy?
    17
    “Differential privacy is a research topic in
    the area of statistics and data analytics that
    uses hashing, subsampling and noise
    injection to enable crowdsourced learning
    while keeping the data of individual users
    completely private.”
    At WWDC 2016, Craig Federighi (Apple) said:
    https://www.wired.com/2016/06/apples-differential-privacy-collecting-data/


  17. Disclosure Avoidance in US Census 2020
    18
    https://www.census.gov/about/policies/privacy/statistical_safeguards/disclosure-avoidance-2020-census.html


  18. • A mathematical privacy notion whose guarantees compose across multiple releases
    • Guaranteed by randomized mechanisms (i.e., injecting noise)
    19
    What is Differential Privacy?
    (Figure: a sensitive database D answers queries q1, …, qk; each answer satisfies
    ε1-DP, …, εk-DP. Privacy by randomization; composable privacy.)


  19. ε-Differential Privacy
    20
    A randomized mechanism ℳ: 𝒟 → 𝒮 satisfies ε-DP if, for any two neighboring
    databases D, D′ ∈ 𝒟 such that D′ differs from D in at most one record, and any
    subset of outputs S ⊆ 𝒮, it holds that
      Pr[ℳ(D) ∈ S] ≤ exp(ε) · Pr[ℳ(D′) ∈ S]
    ε: privacy parameter, privacy budget (0 ≤ ε ≤ ∞);
    smaller ε (0.5, 1, 2, …) means stronger privacy, larger ε (4, 8, …) weaker.
    D′: D’s neighboring databases.
    C. Dwork. Differential privacy. ICALP, 2006.


  20. • Any pair of databases that differ in only one record
    • Differential privacy aims to conceal the difference among the neighbors
    21
    Neighboring Databases
    D = { Alice ¥10M, Bob ¥20M, Cynthia ¥5M, David ¥3M, … }
    D’s neighboring databases (examples):
      D + { Eve ¥15M }       (add one record)
      D − { Bob ¥20M }       (remove one record)
      D − { Cynthia ¥5M }    (remove one record)
      D + { Franc ¥100M }    (add one record)
    d_H(D, D′) = 1, where d_H(⋅,⋅) is the Hamming distance.
    In the most standard case, we assume adding/removing one record.


  21. • The most basic randomization for differential privacy
    • Parameters: sensitivity Δf, privacy budget ε
    22
    Laplace Mechanism and Sensitivity
    Laplace Mechanism:
      ℳ(D) = f(D) + Lap(0, Δf/ε)
    Adds noise sampled from the Laplace distribution whose mean is 0 and variance is 2(Δf/ε)².
    ℓ1-sensitivity:
      Δf = sup_{D,D′∈𝒟} ‖f(D) − f(D′)‖₁
    (Plots: noise distributions for ε = 10, ε = 1, ε = 0.1 with Δf = 1.)


  22. Proof: Laplace Mechanism satisfies ε-DP
    23
      Pr[ℳ(D) = y] / Pr[ℳ(D′) = y]
        = ∏ᵢ P_Lap(yᵢ − f(D)ᵢ) / ∏ᵢ P_Lap(yᵢ − f(D′)ᵢ)
        = ∏ᵢ exp( b⁻¹ ( |yᵢ − f(D′)ᵢ| − |yᵢ − f(D)ᵢ| ) )
        ≤ ∏ᵢ exp( b⁻¹ |f(D)ᵢ − f(D′)ᵢ| )
        = exp( b⁻¹ Σᵢ |f(D)ᵢ − f(D′)ᵢ| )
        = exp( b⁻¹ ‖f(D) − f(D′)‖₁ )
        = exp( (ε/Δ₁) ‖f(D) − f(D′)‖₁ )
        ≤ exp(ε)
    using P_Lap(x) = (1/2b) exp(−b⁻¹|x|), b = Δ₁/ε, Δ₁ ≥ ‖f(D) − f(D′)‖₁,
    and the reverse triangle inequality |x₁| − |x₂| ≤ |x₁ − x₂|.


  23. • Compute the average salary over 30 engineers
    24
    Example of Randomization
    Assume: max salary = 30M JPY, ε = 1.
    Δ_avg = (max salary) / n = 30M / 30 = 1M
    b = Δ_avg / ε = 1M
    ℳ(D) = f(D) + Lap(0, 1M)
    30 engineers: avg. salary = 7M JPY → reported as (7 + 0.8)M JPY
    After Alice retires, 29 engineers: avg. salary = 6.8M JPY → reported as (6.8 + 1.3)M JPY


  24. • Easy to implement
    25
    Implementation of Laplace Mechanism
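    The code shown on the slide is not reproduced in this transcript; the following is
    a minimal NumPy sketch of the Laplace mechanism (function and parameter names are illustrative):

    import numpy as np

    def laplace_mechanism(true_value, sensitivity, epsilon, rng=np.random.default_rng()):
        # Noise scale b = sensitivity / epsilon; smaller epsilon -> larger noise.
        scale = sensitivity / epsilon
        return true_value + rng.laplace(loc=0.0, scale=scale, size=np.shape(true_value))

    # Example: a counting query (sensitivity 1) released with epsilon = 1.
    print(laplace_mechanism(42, sensitivity=1.0, epsilon=1.0))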


  25. • Because the injected noise is random, the outputs are probabilistic.
    26
    Behavior of Laplace Mechanism
    (Plots: three independent runs with ε = 1, Δf = 1 produce different outputs.)


  26. • Varying the privacy parameter ε
    27
    Behavior of Laplace Mechanism
    (Plots: output distributions for ε = 0.05, 0.1, 0.5, 2, 10, with Δf = 1.)


  27. • A well-known relaxed version of DP
    • Satisfies (ε,δ)-DP by injecting a particular Gaussian noise
    28
    (ε,δ)-DP & Gaussian Mechanism
    (ε,δ)-differential privacy:
      Pr[ℳ(x₁) ∈ S] ≤ exp(ε) · Pr[ℳ(x₂) ∈ S] + δ,   0 ≤ δ < n⁻¹
    Gaussian Mechanism:
      ℳ(D) = f(D) + 𝒩(0, Δ₂²σ²),   σ = √(2 log(1.25/δ)) / ε
    ℓ2-sensitivity:
      Δ₂ = sup_{D,D′∈𝒟} ‖f(D) − f(D′)‖₂
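    A minimal sketch of the Gaussian mechanism using the calibration above (the classical
    bound, valid for ε < 1; function and parameter names are illustrative):

    import numpy as np

    def gaussian_mechanism(true_value, l2_sensitivity, epsilon, delta,
                           rng=np.random.default_rng()):
        # sigma = sqrt(2 log(1.25/delta)) / epsilon; noise std = Delta_2 * sigma.
        sigma = np.sqrt(2 * np.log(1.25 / delta)) / epsilon
        noise = rng.normal(0.0, l2_sensitivity * sigma, size=np.shape(true_value))
        return true_value + noise

    # Example: releasing a 3-dimensional mean with Delta_2 = 0.1 under (1.0, 1e-5)-DP.
    print(gaussian_mechanism(np.array([0.2, 0.5, 0.3]), 0.1, 1.0, 1e-5))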


  28. • An interpretation of DP in terms of statistical (hypothesis) testing
    • Consider a game that guesses the input source (D or D′) from the randomized output y
    29
    Differential Privacy as Hypothesis Testing
    False positive: the true input was D but the guess is D′.
    False negative: the true input was D′ but the guess is D.
    Empirical differential privacy:
      ε_emp = max( log((1 − δ − FP) / FN), log((1 − δ − FN) / FP) )
    Peter Kairouz, et al. The composition theorem for differential privacy. ICML 2015.
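    A minimal sketch of that empirical bound, where FP and FN are the attacker’s observed
    false-positive and false-negative rates (the function name is illustrative):

    import numpy as np

    def empirical_epsilon(fp, fn, delta=0.0):
        # Empirical epsilon implied by an attacker's error rates (Kairouz et al., ICML 2015).
        return max(np.log((1 - delta - fp) / fn), np.log((1 - delta - fn) / fp))

    # An attacker that errs 20% of the time in both directions:
    print(empirical_epsilon(0.2, 0.2))  # log(0.8 / 0.2) ~ 1.386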


  29. • The theory of composing multiple differentially private mechanisms
    30
    Privacy Composition
    (Figure: a sensitive database D answers queries q1, …, qk with privacy parameters
    ε1, …, εk, each answer satisfying ε1-DP, …, εk-DP; the total privacy budget grows
    with the number of queries.)


  30. Privacy Composition
    31
    Sequential Composition:
      Let ℳ1, …, ℳk satisfy ε1-, …, εk-DP respectively.
      D’s total privacy consumption by ℳ1(D), …, ℳk(D) is ε = Σ_{i∈[k]} εᵢ.
    Parallel Composition:
      Let ℳ1, …, ℳk satisfy ε1-, …, εk-DP respectively, and let D = D1 ∪ ⋯ ∪ Dk
      where Dᵢ ∩ Dⱼ = ∅ for i ≠ j.
      D’s total privacy consumption by ℳ1(D1), …, ℳk(Dk) is ε = max(ε1, …, εk).
    Example: ℳ1 (ε1 = 1), ℳ2 (ε2 = 0.5), ℳ3 (ε3 = 1.5): running them on the same data
    sums the budgets (SUM), while running them on disjoint partitions takes the maximum (max).
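    A minimal budget-accounting sketch of the two rules (function names are illustrative):

    def sequential_composition(epsilons):
        # Mechanisms applied to the same database: the budgets add up.
        return sum(epsilons)

    def parallel_composition(epsilons):
        # Mechanisms applied to disjoint partitions: the largest budget dominates.
        return max(epsilons)

    eps = [1.0, 0.5, 1.5]               # the example on the slide
    print(sequential_composition(eps))  # 3.0
    print(parallel_composition(eps))    # 1.5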


  31. • Sequential composition is the most conservative upper bound → very loose
    • Seeking tighter composition theorems is a core topic of DP research
    32
    Sequential Composition is Loose
    (Plot: ε vs. #compositions. Sequential composition grows fastest; existing tighter
    composition theorems (Strong Composition, Advanced Composition, Rényi Differential
    Privacy (RDP)) sit between it and the ideal curve.)


  32. • A well-known tighter privacy composition based on the Rényi divergence
    • Recent studies compose mechanisms under RDP, then translate the result into (ε,δ)-DP
    33
    Rényi Differential Privacy (RDP)
    A randomized mechanism ℳ: 𝒟ⁿ → 𝒮 is ε-RDP of order λ ∈ (1, ∞) (or (λ, ε)-RDP)
    if for any neighboring databases D, D′ ∈ 𝒟ⁿ, the Rényi divergence of order λ
    between ℳ(D) and ℳ(D′) is upper-bounded by ε:
      D_λ( ℳ(D) ‖ ℳ(D′) ) = (1/(λ−1)) log E_{φ∼ℳ(D′)}[ ( ℳ(D)(φ) / ℳ(D′)(φ) )^λ ] ≤ ε
    where ℳ(D)(φ) denotes the probability that ℳ, taking D as input, outputs φ.
    DP-to-RDP conversion:
      ε(λ) = ε + (1/(λ−1)) log m(λ, ε, δ),
      where m(λ, ε, δ) = min_r [ r^λ (r − δ)^{1−λ} + (1 − r)^λ (e^ε − r + δ)^{1−λ} ]
    Rényi Differential Privacy (RDP): https://arxiv.org/abs/1702.07476
    https://arxiv.org/abs/2008.06529
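    In the reverse direction, the standard RDP-to-(ε,δ)-DP conversion (Mironov, 2017) is
    simple. A minimal sketch, using the known RDP curve of the Gaussian mechanism
    (ε(λ) = λ/(2σ²) for sensitivity 1) as an example:

    import numpy as np

    def rdp_to_dp(rdp_eps, order, delta):
        # An (order, rdp_eps)-RDP mechanism satisfies
        # (rdp_eps + log(1/delta) / (order - 1), delta)-DP.
        return rdp_eps + np.log(1.0 / delta) / (order - 1)

    # Optimizing over orders gives a tighter epsilon than any single fixed order.
    sigma, delta = 2.0, 1e-5
    print(min(rdp_to_dp(o / (2 * sigma ** 2), o, delta) for o in range(2, 128)))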


  33. Q: How can we explore (previously unknown) data and design data analytics while
    preserving privacy, without the query limitation?
    34
    Querying without Privacy Budget Limitation
    (Figure: as queries q1, …, q_{k+1} with privacy parameters ε1, …, ε_{k+1} are answered,
    the total privacy budget keeps accumulating with the number of queries.)


  34. • Construction of an intermediate privatized "view" (P-view) that can answer
    arbitrary queries with smaller noise
    35
    HDPView: A Differentially Private View
    Properties: noise resistance, space efficiency, query-agnostic, analytical reliability.
    Accepted at VLDB2022
    https://arxiv.org/abs/2203.06791


  35. Partitioning Strategy
    36
    (Figure: a range counting query over a 2-D histogram (Age × Salary) with noisy cell
    counts. Data-aware partitioning merges cells into blocks; the answer to a range query
    then suffers an aggregation error (AE) from blocks that only partially overlap the
    range and a perturbation error (PE) from the noise added per block.)
    AE: Aggregation Error, PE: Perturbation Error
    Q. How can we find a partitioning minimizing AE + PE?


  36. • Recursive bisection-based algorithm
    • Scalable / data-distribution-aware / privacy-budget efficient
    37
    Algorithm & Performance of HDPView
    Each block runs two randomized mechanisms:
      1. Random converge: decide whether to stop cutting
      2. Random cut: choose a cutting point
    (Figure: a 2-D count table is recursively bisected into blocks. Plot: size of the p-view.)
    Average Relative Error (ARR) over 8 datasets:
      Identity 1.94×10…, Privtree 7.05, HDMM 35.34, Privbayes 3.79, HDPView (ours) 1.00


  37. • Obtain randomized model parameters by randomizing the gradient
    • Employ gradient clipping, since the sensitivity of gradients is intractable
    38
    DP-SGD: Differentially Private Stochastic Gradient Descent
    Non-private SGD (repeat until convergence), starting from θ₀:
      Sample a batch → compute the gradient gₜ = ∇_{θₜ} ℒ(x; θₜ) → update θₜ₊₁ = θₜ − η gₜ
    https://arxiv.org/abs/1607.00133


  38. • Obtain randomized model parameters by randomizing the gradient
    • Employ gradient clipping, since the sensitivity of gradients is intractable
    39
    DP-SGD: Differentially Private Stochastic Gradient Descent
    Non-private SGD (repeat until convergence), starting from θ₀:
      Sample a batch → compute the gradient gₜ = ∇_{θₜ} ℒ(x; θₜ) → update θₜ₊₁ = θₜ − η gₜ
    DP-SGD (repeat while ε remains), starting from θ₀:
      Random sampling with rate γ → compute per-sample gradients g_{i,t} of each example i
      → clip & add noise → update θₜ₊₁ = θₜ − η ĝₜ
      Clipping:      π_C(g_{i,t}) = g_{i,t} · min(1, C / ‖g_{i,t}‖₂)
                     (the sensitivity is bounded at the constant C)
      Adding noise:  ĝₜ = Σ_{i∈B} π_C(g_{i,t}) + 𝒩(0, (Cσ)² I)
      The total ε is computed from σ, γ, δ, and T.
    https://arxiv.org/abs/1607.00133
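    A minimal NumPy sketch of one DP-SGD step, clipping per-sample gradients and adding
    Gaussian noise as above (names and shapes are illustrative; a real implementation
    would also track the privacy budget with an RDP/moments accountant):

    import numpy as np

    def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr,
                    rng=np.random.default_rng()):
        # per_sample_grads: shape (batch_size, dim), one gradient per example.
        norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
        clipped = per_sample_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        # Sum the clipped gradients and add noise with std C * sigma (sensitivity is C).
        noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, clip_norm * noise_multiplier,
                                                     size=params.shape)
        # Average over the batch and take a gradient step.
        return params - lr * noisy_sum / len(per_sample_grads)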


  39. • Train a data synthesis model that imitates the original sensitive dataset
    • Issue: the training process is complicated, and therefore sensitive to noise
    • Approach: a data embedding that is robust against noise under the DP constraint
    40
    Privacy Preserving Data Synthesis
    (Figure: train a generative model on sensitive data with DP, then synthesize. Samples
    from the naïve method (VAE w/ DP-SGD, ε = 1.0) are much noisier than those from
    P3GM (ours, ε = 1.0 and 0.2) and PEARL (ours, ε = 1.0).)
    Embedding:       Naïve: end-to-end w/ DP-SGD;  P3GM: DP-PCA;  PEARL: characteristic function under DP
    Reconstruction:  Naïve / P3GM: DP-SGD;  PEARL: non-private (adversarial)
    High reconstruction performance under a practical privacy level (ε ≤ 1).
    Accepted at ICDE2021 / ICLR2022
    https://arxiv.org/abs/2006.12101
    https://arxiv.org/abs/2106.04590


  40. Local Differential Privacy
    41


  41. • A privacy-preserving mechanism allows inferring statistics about a population
    while preserving the privacy of individuals
    • No trusted entity is required
    42
    Privacy-Preserving Mechanism for Collecting Data
    (Figure: each client i holds an original value xᵢ ∈ 𝒳, a small set of items, and sends
    a randomized value x̃ᵢ to the server; the randomized reports are indistinguishable.)


  42. Local Differential Privacy
    43
    ε-local differential privacy (ε-LDP):
    A randomized mechanism ℳ: 𝒳 → 𝒮 is said to satisfy ε-LDP if and only if, for any
    input pair x₁, x₂ ∈ 𝒳 and any subset of outputs S ⊆ 𝒮, it holds that:
      Pr[ℳ(x₁) ∈ S] ≤ exp(ε) · Pr[ℳ(x₂) ∈ S]
    The notion of neighbors differs from central DP: e.g., x₁: (1 0 0 0) vs. x₂: (0 0 1 0).
    In LDP, a neighbor is a replacement (= remove & add) of a single user’s value.
    (Figure: each client randomizes its value xᵢ into x̃ᵢ locally; the randomized reports
    are indistinguishable.)


  43. (Central) DP vs Local DP
    44
    Central DP: clients send raw values x₁, …, xₙ to a trusted server, which randomizes
    the outputs; neighboring DBs differ by add/remove.
    Local DP: each client randomizes xᵢ into x̃ᵢ before sending, so the server is not
    required to be trusted; neighboring inputs differ by replacement.


  44. • Randomize an item selection in a differentially private way
    45
    Randomized Response
    For an item set 𝒳 with k = |𝒳| items:
      RR(x) = x                          w.p. exp(ε) / (exp(ε) + k − 1)
      RR(x) = x′ ∼ Uniform(𝒳 ∖ {x})      otherwise (randomly select another item)
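    A minimal sketch of k-ary randomized response (function and parameter names are illustrative):

    import numpy as np

    def k_randomized_response(x, items, epsilon, rng=np.random.default_rng()):
        # Keep the true item with probability e^eps / (e^eps + k - 1);
        # otherwise report another item chosen uniformly at random.
        k = len(items)
        p_keep = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
        if rng.random() < p_keep:
            return x
        others = [v for v in items if v != x]
        return others[rng.integers(len(others))]

    # Example: one report over 4 items with epsilon = 1.
    print(k_randomized_response("A", ["A", "B", "C", "D"], epsilon=1.0))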


  45. • Examples on synthetic data (N randomized reports over a domain of 100 items)
    • Errors are significantly reduced when gathering more randomized reports
    46
    Stats Gathering w/ Privacy at Scale
    (Plots: estimated vs. true item frequencies for N = 10,000 and N = 10,000,000.)
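    Debiasing the collected reports recovers unbiased frequency estimates; a minimal
    sketch under the k-RR mechanism above (names are illustrative):

    import numpy as np

    def estimate_frequencies(reports, items, epsilon):
        # For k-RR, E[observed freq of v] = p * f_v + q * (1 - f_v),
        # with p = e^eps / (e^eps + k - 1) and q = (1 - p) / (k - 1); invert that relation.
        k = len(items)
        p = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
        q = (1 - p) / (k - 1)
        reports = np.asarray(reports)
        observed = np.array([np.mean(reports == v) for v in items])
        return (observed - q) / (p - q)

    # With more reports N, the estimates concentrate around the true frequencies.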


  46. • Probabilistic data structures are very useful for frequency estimation,
    offering noise resistance and communication efficiency
    47
    Rand. Mech. w/ Probabilistic Data Structure
    RAPPOR by Google (Bloom filter):
    https://petsymposium.org/2016/files/papers/Building_a_RAPPOR_with_the_Unknown__Privacy-Preserving_Learning_of_Associations_and_Data_Dictionaries.pdf
    Private Count Mean Sketch by Apple:
    https://machinelearning.apple.com/research/learning-with-privacy-at-scale


  47. • Collaborative learning between a server and clients
    • Raw data never leaves clients’ devices
    48
    Federated Learning
    (Figure: participating clients collaboratively train a global model with the server;
    some devices are non-participants of FL.)
    First FL paper: https://proceedings.mlr.press/v54/mcmahan17a/mcmahan17a.pdf
    Survey paper: https://arxiv.org/abs/1912.04977


  48. Gradient Inversion - Privacy Issues in FL
    49
    (Source) “Inverting Gradients - How easy is it to break privacy in federated learning?”
    https://arxiv.org/abs/2003.14053
    Can we reconstruct an image used in training from a gradient? → Yes.


  49. • Central model: clients send raw gradients and the server aggregates them w/ noise injection
    • Local model: clients send randomized gradients and the server aggregates them
    50
    Federated Learning under Differential Privacy
    (Figure: central model vs. local model; in the central model the server injects noise
    into the aggregate of raw gradients, while in the local model each client randomizes
    its gradient before sending it to the global model.)


  50. • Randomized response via randomizing the gradient’s direction
    • Randomly select the green zone or the white zone, and then uniformly pick
    a vector from the selected zone
    51
    LDP-SGD
    https://arxiv.org/abs/2001.03618


  51. • Empirical privacy measurement with instantiated adversaries for LDP-SGD
    • The worst case, flipping the gradient direction, reaches the theoretical bound
    52
    Empirical Privacy Measurement in LDP-SGD
    https://arxiv.org/abs/2206.09122


  52. • LDP enables us to collect users’ data in a privatized way, but the amount of
    noise tends to be prohibitive
    53
    Issues in Local DP
    (Figure: with LDP, both the collected items and the gradients sent to the global
    model are heavily randomized before leaving the clients.)


  53. Shuffle Model
    – an intermediate privacy model
    54


  54. • An intermediate trusted entity, the "shuffler", anonymizes local users’ identities
    • Each client encrypts its randomized content w/ the server’s public key, and the
    shuffler only mixes their identities w/o looking at the contents
    55
    Shuffle model – an intermediate privacy model
    (Figure: clients randomize their values with ε₀ and send x̃₁, …, x̃ₙ to the shuffler;
    the shuffler sends the shuffled, anonymized batch (x̃₁, x̃₂, …, x̃ₙ) to the server.)


  55. • The shuffler can amplify differential privacy → possibility to decrease local noise
    • The amplification at the shuffler translates LDP at the clients into (central) DP
    56
    Privacy Amplification via Shuffling
    Privacy amplification by "hiding among the clones": https://arxiv.org/abs/2012.12803
    (Figure: clients send ε₀-LDP reports x̃₁, …, x̃ₙ to the shuffler; after shuffling, the
    server-side guarantee is ε < ε₀ (CDP). Example plot for k-randomized response with
    ε₀ = 8 (LDP) and k = 10.)


  56. • Using a shuffler and sub-sampling, FL can also employ privacy amplification
    • Clients randomly check in to federated learning at each iteration
    57
    Shuffle Model in Federated Learning
    Sub-sampling & shuffling in FL: https://arxiv.org/abs/2206.03151
    (Figure: sampled clients send randomized updates x̃ᵢ through the shuffler to the
    aggregator; privacy amplification turns weak local privacy into strong central
    privacy, e.g., ε_ldp = 8 → ε_cdp = 1. Plot: higher accuracy at a strong privacy
    level (smaller ε).)


  57. • Decentralized shuffling via multi-round random walks on a graph
    • In each round, every client relays her randomized reports to one of her
    neighbors (e.g., friends on a social network) via an encrypted channel
    58
    Network Shuffling (Accepted at SIGMOD2022)
    https://arxiv.org/abs/2204.03919
    The larger the graph, the greater the privacy amplification.


  58. Conclusion
    59


  59. • Privacy Risks, Issues, and Case-studies
    • Differential Privacy (Central Model)
    • Query Release via Laplace Mechanism
    • Machine Learning via DP-SGD
    • Local Differential Privacy
    • Stats Gathering via Randomized Response
    • Federated Learning via LDP-SGD
    • Shuffle Model – an intermediate privacy model
    Topics in this lecture
    60

