Slide 1

Differential Privacy - Data Science with Privacy at Scale
2023.6.9, Keio University, "Advanced Research (CI)" lecture
Tsubasa TAKAHASHI, Senior Research Scientist, LINE Corp.

Slide 2

Table of Contents
• Self Introduction / LINE's R&D in Privacy Techs (5 min)
• Privacy Risks, Issues, and Case-studies (10 min)
• Differential Privacy (Central Model) (30 min)
  • Query Release via Laplace Mechanism
  • Machine Learning via DP-SGD
• Local Differential Privacy (20 min)
  • Stats Gathering via Randomized Response
  • Federated Learning via LDP-SGD
• Shuffle Model – an intermediate privacy model (10 min)
• QA

Slide 3

Tsubasa TAKAHASHI, Ph.D.
Senior Research Scientist at LINE Data Science Center

R&D Activity
• R&D on Privacy Techs (LINE Data Science Center): Differential Privacy / Federated Learning / …
• R&D on Trustworthy AI

Selected Publications
• Differential Privacy @VLDB22 / SIGMOD22 / ICLR22 / ICDE21
• DP w/ Homomorphic Encryption @BigData22
• Adversarial Attacks @BigData19
• Anomaly/OOD Detection @WWW17 / WACV23

Career
• B.E. / M.E. (CS) and Ph.D. from U. Tsukuba; Visiting Scholar @CMU; 上林奨励賞
• 2010~18: NEC Central Labs (R&D on Data Privacy 2010~15, R&D on AI Security 2016~18)
• 2018.12~: LINE (R&D on Privacy Tech 2019~)

Slide 4

LINE's R&D on Privacy Techs
• Publications at major database and machine learning conferences
• These achievements are based on collaborations w/ academia
https://linecorp.com/ja/pr/news/ja/2022/4269

Slide 5

Federated Learning w/ Differential Privacy
• Released in late September 2022
• The learned sticker recommendation feature is now on your app
https://www.youtube.com/watch?v=kTBshg1O7b0
https://tech-verse.me/ja/sessions/124

Slide 6

Privacy Tech is an "Innovation Trigger"
Market trend: the 2021 Gartner Hype Cycle for Privacy
https://infocert.digital/analyst-reports/2021-gartner-hype-cycle-for-privacy/

Slide 7

Privacy Risks, Issues, and Case-studies 8

Slide 8

Difference Attack
• Even when only statistical information is disclosed, the "difference" can reveal the data of specific individuals.
• Example: 30 engineers have avg. salary 7M JPY; after Alice retires, the remaining 29 engineers have avg. salary 6.8M JPY. Alice's salary can be revealed with simple math (sketched below):
  7M × 30 − 6.8M × 29 = 210M − 197.2M = 12.8M JPY
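
To make the arithmetic concrete, here is a tiny Python sketch of the attack; the numbers come from the slide, the variable names are mine.

# Difference attack: subtract the two released totals (illustrative sketch).
avg_before, n_before = 7.0, 30    # released before Alice retired (in millions of JPY)
avg_after,  n_after  = 6.8, 29    # released again after Alice retired

alice_salary = avg_before * n_before - avg_after * n_after
print(alice_salary)               # ~12.8 -> Alice's salary (12.8M JPY) is recovered exactly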

Slide 9

A case study of difference attack: Facebook's PII-based Targeting
• From the stats delivered to advertisers, various user info could be estimated (e.g., estimating a phone number from an e-mail address, or a user's web access history)
• Facebook had installed thresholding, rounding, etc. for disclosure control, but these safeguards could be bypassed
• The vulnerability has been fixed
https://www.youtube.com/watch?v=Lp-IwYvxGpk
https://www.ftc.gov/system/files/documents/public_events/1223263/p155407privacyconmislove_1.pdf

Slide 10

Database Reconstruction
• Recreation of individual-level data from tabular or aggregate data
• Reconstruction works like "sudoku": there are rules, algorithms and dependencies
• A worked example is given in the linked US Census training deck
https://www2.census.gov/about/training-workshops/2021/2021-05-07-das-presentation.pdf

Slide 11

A case study of reconstruction: US Census 2010
• The US Census reports various stats for policy-making and academic research
• For the 2010 results, the Census Bureau found that reconstruction attacks are possible
https://www2.census.gov/about/training-workshops/2021/2021-05-07-das-presentation.pdf

Slide 12

k-anonymization
• k-anonymization: every record shares the same quasi-identifier values with at least k−1 other records
• Quasi-identifiers: a (predefined) combination of attributes that could identify individuals
https://dataprivacylab.org/dataprivacy/projects/kanonymity/kanonymity.pdf

Slide 13

k-anonymization is vulnerable
• k-anonymization is effective only under the assumed adversary's knowledge
• By linking external knowledge, an adversary can achieve re-identification

Slide 14

A case study of de-anonymization: Netflix Prize
• Data analytics competition that published a dataset with identifiers removed (pseudo IDs, movie titles, ratings, review dates)
• Unfortunately, the records can be re-identified and linked w/ public data: linking the anonymous Netflix data with public IMDB reviews identified users as real people, exposing e.g. political interests included in their reviews
• 8 ratings → identify w/ 99% acc.; 2 ratings → identify w/ 68% acc.

Slide 15

Differential Privacy 16

Slide 16

What is Differential Privacy?
At WWDC 2016, Craig Federighi (Apple) said:
"Differential privacy is a research topic in the area of statistics and data analytics that uses hashing, subsampling and noise injection to enable crowdsourced learning while keeping the data of individual users completely private."
https://www.wired.com/2016/06/apples-differential-privacy-collecting-data/

Slide 17

Disclosure Avoidance in US Census 2020
https://www.census.gov/about/policies/privacy/statistical_safeguards/disclosure-avoidance-2020-census.html

Slide 18

What is Differential Privacy?
• A mathematical privacy notion whose guarantees can be composed across multiple releases
• Guaranteed by randomized mechanisms (i.e., injecting noise)
(Figure: queries q₁, …, q_k issued against a sensitive database D, each answered under ε₁-DP, …, ε_k-DP: privacy by randomization, composable privacy.)

Slide 19

ε-Differential Privacy
A randomized mechanism ℳ: 𝒟 → 𝒮 satisfies ε-DP if, for any two neighboring databases D, D′ ∈ 𝒟 such that D′ differs from D in at most one record, and any subset of outputs S ⊆ 𝒮, it holds that

  Pr[ℳ(D) ∈ S] ≤ exp(ε) · Pr[ℳ(D′) ∈ S]

ε: privacy parameter, privacy budget (0 ≤ ε ≤ ∞); smaller ε (e.g., 0.5, 1, 2) means stronger privacy, larger ε (e.g., 4, 8, …) weaker.
C. Dwork. Differential privacy. ICALP, 2006.

Slide 20

Neighboring Databases
• Any pair of databases that differ in only one record: d_H(D, D′) = 1, where d_H(·,·) is the Hamming distance
• Differential privacy aims to conceal the difference among the neighbors
• In the most standard case, we assume adding/removing one record
(Figure: a salary table D (Alice ¥10M, Bob ¥20M, Cynthia ¥5M, David ¥3M, …) and example neighbors obtained by adding one record (Eve ¥15M or Franc ¥100M) or removing one record (Bob or Cynthia).)

Slide 21

Laplace Mechanism and Sensitivity
• The most basic randomization for differential privacy
• Parameters: sensitivity Δ_f, privacy budget ε

Laplace Mechanism:  ℳ(D) = f(D) + Lap(0, Δ_f / ε)
i.e., add noise sampled from the Laplace distribution with mean 0 and variance 2(Δ_f / ε)².

ℓ1-sensitivity:  Δ_f = sup_{D, D′ ∈ 𝒟} ||f(D) − f(D′)||₁

(Figure: Laplace densities for ε = 10, 1, 0.1 with Δ_f = 1.)

Slide 22

Proof: Laplace Mechanism satisfies ε-DP

With P_Lap(x) = (1/2b) exp(−|x|/b), b = Δ_f / ε, and Δ_f ≥ ||f(D) − f(D′)||₁:

  Pr[ℳ(D) = y] / Pr[ℳ(D′) = y]
   = Π_i P_Lap(y_i − f(D)_i) / Π_i P_Lap(y_i − f(D′)_i)
   = Π_i exp( (|y_i − f(D′)_i| − |y_i − f(D)_i|) / b )
   ≤ Π_i exp( |f(D)_i − f(D′)_i| / b )        (by |x₁| − |x₂| ≤ |x₁ − x₂|)
   = exp( (1/b) Σ_i |f(D)_i − f(D′)_i| )
   = exp( ||f(D) − f(D′)||₁ / b )
   = exp( (ε / Δ_f) ||f(D) − f(D′)||₁ )
   ≤ exp(ε)
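
Not on the slide, but as a quick sanity check of this bound: the following sketch (using SciPy; the query answers and parameter values are my own choice) evaluates the density ratio numerically for a sensitivity-1 query and confirms it never exceeds exp(ε).

import numpy as np
from scipy.stats import laplace

eps, sensitivity = 1.0, 1.0
b = sensitivity / eps                 # Laplace scale b = Δ_f / ε
f_D, f_Dprime = 30.0, 29.0            # query answers on two neighboring databases

ys = np.linspace(0.0, 60.0, 2001)     # grid of possible outputs
ratio = laplace.pdf(ys, loc=f_D, scale=b) / laplace.pdf(ys, loc=f_Dprime, scale=b)
print(ratio.max(), np.exp(eps))       # the maximum ratio equals e^eps, as in the proof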

Slide 23

Example of Randomization
• Compute the average salary over 30 engineers
• Assume: max salary is 30M JPY, ε = 1
• Sensitivity of the average: Δ_avg = max salary / n = 30M / 30 = 1M JPY
• Noise scale: Δ_avg / ε = 1M, so ℳ(D) = f(D) + Lap(0, 1M)
• Example outputs: 30 engineers, true avg 7M → released (7 + 0.8)M JPY; after Alice retires, 29 engineers, true avg 6.8M → released (6.8 + 1.3)M JPY, so the difference no longer pins down Alice's salary

Slide 24

Implementation of Laplace Mechanism
• Easy to implement (a minimal code sketch is given below)
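
The implementation shown on the original slide is an image; below is a minimal NumPy sketch of the same idea, applied to the average-salary example from the previous slide. The function and variable names are mine, not from the slides.

import numpy as np

rng = np.random.default_rng(0)

def laplace_mechanism(true_value, sensitivity, eps):
    """Release true_value + Lap(0, sensitivity / eps)."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / eps)

# Average salary of 30 engineers, salaries capped at 30M JPY, so Δ_avg = 30 / 30 = 1 (M JPY).
salaries = np.clip(np.array([12.8] + [6.8] * 29), 0.0, 30.0)   # toy data, in millions of JPY
noisy_avg = laplace_mechanism(salaries.mean(), sensitivity=30.0 / len(salaries), eps=1.0)
print(noisy_avg)   # noisy release, e.g. around 7.8 instead of the true 7.0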

Slide 25

Behavior of Laplace Mechanism
• Because the noise is randomly generated, the outputs are probabilistic.
(Figure: three independent runs with ε = 1, Δ_f = 1 give different noisy outputs.)

Slide 26

Behavior of Laplace Mechanism
• Varying the privacy parameter ε
(Figure: outputs with Δ_f = 1 for ε = 0.05, 0.1, 0.5, 2, 10; smaller ε yields larger noise.)

Slide 27

(ε,δ)-DP & Gaussian Mechanism
• A well-known relaxed version of DP
• (ε,δ)-DP is satisfied by injecting a suitably calibrated Gaussian noise

(ε,δ)-differential privacy:  Pr[ℳ(D) ∈ S] ≤ exp(ε) · Pr[ℳ(D′) ∈ S] + δ,  with 0 ≤ δ < 1/n

Gaussian Mechanism:  ℳ(D) = f(D) + 𝒩(0, Δ₂² σ²),  σ = √(2 log(1.25/δ)) / ε

ℓ2-sensitivity:  Δ₂ = sup_{D, D′ ∈ 𝒟} ||f(D) − f(D′)||₂
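
A minimal sketch of the Gaussian mechanism above (names are mine; note the classical calibration of σ shown on the slide assumes ε ≤ 1):

import numpy as np

rng = np.random.default_rng()

def gaussian_mechanism(value, l2_sensitivity, eps, delta):
    """Release value + N(0, (Δ2·σ)^2 I) with σ = sqrt(2 ln(1.25/δ)) / ε (classical calibration)."""
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return value + rng.normal(0.0, l2_sensitivity * sigma, size=np.shape(value))

print(gaussian_mechanism(np.array([7.0, 6.8]), l2_sensitivity=1.0, eps=1.0, delta=1e-5))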

Slide 28

Differential Privacy as Hypothesis Testing
• An interpretation of DP from the viewpoint of statistical (hypothesis) testing
• Assume a game in which an adversary guesses, from the randomized output y = ℳ(·), whether the input was D or D′
• False Positive (FP): true input D, guess D′; False Negative (FN): true input D′, guess D

Empirical differential privacy:  ε_emp = max( log((1 − δ − FP) / FN), log((1 − δ − FN) / FP) )

Peter Kairouz, et al. The composition theorem for differential privacy. ICML 2015.
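
A small sketch of the empirical-ε formula above, assuming FP and FN are the attack's false-positive and false-negative rates measured over many trials (function and parameter names are mine):

import numpy as np

def empirical_epsilon(fp, fn, delta=0.0):
    """Empirical privacy estimate from an attacker's FP / FN rates (Kairouz et al., ICML 2015)."""
    return max(np.log((1.0 - delta - fp) / fn),
               np.log((1.0 - delta - fn) / fp))

print(empirical_epsilon(fp=0.05, fn=0.05))   # ~2.94: a strong attacker implies a large epsilon
print(empirical_epsilon(fp=0.45, fn=0.45))   # ~0.20: a near-random attacker implies a small epsilon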

Slide 29

Privacy Composition
• Theory of composing multiple differentially private mechanisms
(Figure: queries q₁, …, q_k with privacy parameters ε₁, …, ε_k against a sensitive database D; the consumed budget accumulates toward the total privacy budget as more queries are answered.)

Slide 30

Privacy Composition

Sequential Composition
Let ℳ₁, …, ℳ_k satisfy ε₁-, …, ε_k-DP respectively. D's total privacy consumption by ℳ₁(D), …, ℳ_k(D) is
  ε = Σ_i ε_i

Parallel Composition
Let ℳ₁, …, ℳ_k satisfy ε₁-, …, ε_k-DP respectively, and let D = D₁ ∪ ⋯ ∪ D_k where D_i ∩ D_{j≠i} = ∅. D's total privacy consumption by ℳ₁(D₁), …, ℳ_k(D_k) is
  ε = max(ε₁, …, ε_k)

(Example figure: ℳ₁ (ε₁ = 1), ℳ₂ (ε₂ = 0.5), ℳ₃ (ε₃ = 1.5); each record's total consumption is the SUM of the budgets of the mechanisms applied to it, while the max is taken across disjoint parts. See the helper sketched below.)
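
As referenced above, a tiny sketch of the two composition rules (my own helper functions):

def sequential_composition(epsilons):
    """All mechanisms query the same database D: budgets add up."""
    return sum(epsilons)

def parallel_composition(epsilons):
    """Each mechanism queries a disjoint part of D: the max budget dominates."""
    return max(epsilons)

eps = [1.0, 0.5, 1.5]
print(sequential_composition(eps), parallel_composition(eps))   # 3.0 1.5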

Slide 31

Sequential Composition is Loose
• Sequential composition is the most conservative upper bound → very loose
• Seeking tighter composition theorems is a core topic of DP research
• Existing composition theorems: Strong Composition, Advanced Composition, Rényi Differential Privacy (RDP), …
(Figure: total ε vs. #compositions; sequential composition grows fastest, tighter theorems approach the ideal curve.)

Slide 32

Rényi Differential Privacy (RDP)
• A well-known tighter privacy composition based on the Rényi divergence
• Recent studies compose mechanisms in RDP, then translate the result into (ε,δ)-DP

Rényi Differential Privacy (RDP): a randomized mechanism ℳ: 𝒟ⁿ → 𝒮 is ε-RDP of order λ ∈ (1, ∞) (or (λ, ε)-RDP) if, for any neighboring databases D, D′ ∈ 𝒟ⁿ, the Rényi divergence of order λ between ℳ(D) and ℳ(D′) is upper-bounded by ε:

  D_λ( ℳ(D) || ℳ(D′) ) = (1/(λ−1)) log 𝔼_{φ∼ℳ(D′)} [ ( ℳ(D)(φ) / ℳ(D′)(φ) )^λ ] ≤ ε

where ℳ(D)(φ) denotes the probability of ℳ, taking D as input, outputting φ.

DP-to-RDP conversion:
  ε(λ) = ε + (1/(λ−1)) log m(λ, ε, δ),  where  m(λ, ε, δ) = min_r [ r^λ (r − δ)^{1−λ} + (1 − r)^λ (e^ε − r + δ)^{1−λ} ]

https://arxiv.org/abs/1702.07476
https://arxiv.org/abs/2008.06529
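
Not from the slide, but a minimal sketch of the "compose in RDP, then translate into (ε,δ)-DP" workflow, assuming two standard facts: a Gaussian-mechanism release with ℓ2-sensitivity 1 and noise multiplier σ is (λ, λ/(2σ²))-RDP, RDP budgets add under composition, and the basic RDP-to-DP conversion is ε_DP = ε_RDP + log(1/δ)/(λ−1) (Mironov 2017). The parameter values are my own choice.

import numpy as np

def gaussian_rdp(sigma, order):
    """RDP of one Gaussian-mechanism release with l2-sensitivity 1: eps(order) = order / (2 sigma^2)."""
    return order / (2.0 * sigma ** 2)

def rdp_to_dp(rdp_eps, order, delta):
    """Basic conversion from (order, rdp_eps)-RDP to (eps, delta)-DP."""
    return rdp_eps + np.log(1.0 / delta) / (order - 1.0)

# Compose T = 1000 Gaussian releases with sigma = 100 in RDP, then pick the best order.
T, sigma, delta = 1000, 100.0, 1e-5
eps = min(rdp_to_dp(T * gaussian_rdp(sigma, a), a, delta) for a in range(2, 256))
print(eps)   # ~1.6, much tighter than naively summing per-release (eps, delta) guarantees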

Slide 33

Querying without Privacy Budget Limitation
Q: How can we explore data we have not seen before, in order to design data analytics, while preserving privacy and without a limit on the number of queries?
(Figure: queries q₁, …, q_{k+1} keep arriving, and the consumed budget ε₁ + … + ε_{k+1} eventually exceeds the total privacy budget.)

Slide 34

HDPView: A Differentially Private View
• Construct an intermediate privatized "view" (P-view) so that arbitrary query responses can be produced with smaller noise
• Properties: noise resistance, space efficiency, query agnosticism, analytical reliability
• Accepted at VLDB 2022
https://arxiv.org/abs/2203.06791

Slide 35

Partitioning Strategy
• Setting: range counting queries over a 2-D domain (Age × Salary), answered from a data-aware partitioning of the count table
• AE: Aggregation Error (introduced by merging cells into a block); PE: Perturbation Error (introduced by the injected noise)
• A fine partitioning keeps AE ≈ 0 but pays a large total PE; a coarse partitioning pays less PE but AE > 0
• Q. How can we find a partitioning minimizing AE + PE?
(Figure: a count table over Age (~20, 20~30, 30~40, 40~50) × Salary (~10M, 20M, 30M, 40M) and two candidate partitionings with their error trade-offs.)

Slide 36

Algorithm & Performance of HDPView
• Recursive bisection-based algorithm: each block runs two mechanisms, 1. random converge (decide whether to stop) and 2. random cut (choose a cutting point)
• Scalable / data-distribution-aware / privacy-budget efficient

Average Relative Error over 8 datasets (relative to HDPView):
  Identity 1.94×10-, Privtree 7.05, HDMM 35.34, Privbayes 3.79, HDPView (ours) 1.00
(Figure: recursive bisection of a count table, and the size of the resulting P-view.)

Slide 37

DP-SGD: Differentially Private Stochastic Gradient Descent
• Get randomized model parameters by randomizing the gradient
• Employ gradient clipping since the sensitivity of gradients is intractable

Non-private SGD (repeat until convergence):
  1. Sample a batch from the sensitive database D
  2. Compute the gradient g_t = ∇_{θ_t} ℒ(x; θ_t)
  3. Update parameters θ_{t+1} = θ_t − η g_t
https://arxiv.org/abs/1607.00133

Slide 38

DP-SGD: Differentially Private Stochastic Gradient Descent
• Get randomized model parameters by randomizing the gradient
• Employ gradient clipping since the sensitivity of gradients is intractable

DP-SGD (repeat while the budget ε remains):
  1. Randomly sample a batch B with sampling rate γ from the sensitive database D
  2. Compute per-sample gradients g_{i,t} = ∇_{θ_t} ℒ(x_i; θ_t)
  3. Clip and add noise:
       π_C(g_{i,t}) = g_{i,t} · min(1, C / ||g_{i,t}||₂)        (clipping)
       g̃_t = Σ_{i∈B} π_C(g_{i,t}) + 𝒩(0, (Cσ)² I)              (adding noise)
  4. Update parameters θ_{t+1} = θ_t − η g̃_t

• Clipping bounds the sensitivity at the constant C; ε is computed from σ, γ, δ and T
https://arxiv.org/abs/1607.00133
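
A minimal NumPy sketch of the clipping-and-noising step above; per-sample gradients are assumed to be given, and the names and batch shapes are my own. The slide's g̃_t is the noisy sum; Abadi et al. additionally divide by the lot size before the update.

import numpy as np

rng = np.random.default_rng()

def dp_sgd_noisy_gradient(per_sample_grads, clip_norm, noise_multiplier):
    """Clip each per-sample gradient to L2 norm C, sum them, and add N(0, (C*sigma)^2 I)."""
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    clipped = per_sample_grads * np.minimum(1.0, clip_norm / norms)
    noise = rng.normal(0.0, clip_norm * noise_multiplier, size=per_sample_grads.shape[1])
    return clipped.sum(axis=0) + noise

grads = rng.standard_normal((32, 10))          # a batch of 32 per-sample gradients (dim 10)
g_tilde = dp_sgd_noisy_gradient(grads, clip_norm=1.0, noise_multiplier=1.1)
# theta = theta - lr * g_tilde / len(grads)    # parameter update (averaging over the lot)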

Slide 39

Privacy Preserving Data Synthesis
• Train a data synthesis model that imitates the original sensitive dataset (train a generative model, then synthesize)
• Issue: the training process is sensitive to noise, since the process is complicated
• Approach: a data embedding that is robust against noise under the DP constraint

  Naïve method (VAE w/ DP-SGD): embedding trained end-to-end w/ DP-SGD; reconstruction w/ DP-SGD
  P3GM (ours): embedding via DP-PCA; reconstruction w/ DP-SGD
  PEARL (ours): embedding via characteristic functions under DP; reconstruction non-private (adversarial)

• High reconstruction performance under a practical privacy level (ε ≤ 1); the figure compares synthesized samples at ε = 1.0 and ε = 0.2
• Accepted at ICDE 2021 / ICLR 2022
https://arxiv.org/abs/2006.12101
https://arxiv.org/abs/2106.04590

Slide 40

Local Differential Privacy 41

Slide 41

Privacy-Preserving Mechanism for Collecting Data
• A privacy-preserving mechanism allows the collector to infer statistics about populations while preserving the privacy of individuals
• No trusted entity is required
(Figure: each user i randomizes their item x_i ∈ 𝒳 locally with ℳ and sends only the randomized report x̃_i to the server; originals and randomized reports are indistinguishable.)

Slide 42

Local Differential Privacy
ε-local differential privacy (ε-LDP): a randomized mechanism ℳ: 𝒳 → 𝒮 is said to satisfy ε-LDP if and only if, for any input pair x₁, x₂ ∈ 𝒳 and any subset of outputs S ⊆ 𝒮, it holds that:

  Pr[ℳ(x₁) ∈ S] ≤ exp(ε) · Pr[ℳ(x₂) ∈ S]

• The notion of neighbors differs from central DP: in LDP, any two inputs are neighbors, i.e., replacement (= remove & add), e.g., x₁: (1 0 0 0) vs. x₂: (0 0 1 0)

Slide 43

(Central) DP vs Local DP
• Central DP: users send raw data x₁, …, x_n to a trusted server, which runs the randomized mechanism ℳ; neighboring DBs: add/remove one record
• Local DP: each user randomizes x_i with ℳ before sending, so the server is not required to be trusted; neighboring inputs: replacement

Slide 44

Randomized Response
• Randomize an item selection in a differentially private way, where k = |𝒳| (#items)

  RR(x) = x                           w.p. exp(ε) / (exp(ε) + k − 1)
          x′ ∼ 𝒳 ∖ {x} (uniformly)    otherwise
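
A minimal sketch of the k-ary randomized response defined above; items are encoded as integers 0 … k−1, and the names are mine.

import numpy as np

rng = np.random.default_rng()

def k_randomized_response(x, k, eps):
    """Report the true item w.p. e^eps / (e^eps + k - 1); otherwise a uniformly random other item."""
    if rng.random() < np.exp(eps) / (np.exp(eps) + k - 1):
        return x
    other = rng.integers(k - 1)          # uniform over the k-1 items different from x
    return other if other < x else other + 1

print(k_randomized_response(x=3, k=100, eps=2.0))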

Slide 45

Stats Gathering w/ Privacy at Scale
• Examples on synthetic data (N randomized reports over 100 items)
• Errors are significantly reduced when gathering more randomized reports
(Figure: estimated vs. true frequencies for N = 10,000 and N = 10,000,000; a de-biasing estimator for such plots is sketched below.)
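
As referenced above, a sketch of the de-biasing estimator that turns randomized reports back into unbiased frequency estimates; `reports` is assumed to be an array of outputs of the k-ary randomized response sketched on the previous slide.

import numpy as np

def estimate_frequencies(reports, k, eps):
    """Unbiased item-frequency estimates from k-ary randomized-response reports."""
    n = len(reports)
    p = np.exp(eps) / (np.exp(eps) + k - 1)    # prob. of reporting the true item
    q = 1.0 / (np.exp(eps) + k - 1)            # prob. of reporting any specific other item
    observed = np.bincount(reports, minlength=k).astype(float)
    return (observed - n * q) / (p - q)        # the estimation error shrinks as n grows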

Slide 46

Rand. Mech. w/ Probabilistic Data Structure
• Probabilistic data structures are very useful for frequency estimation: they are noise-resistant and communication-efficient
• RAPPOR by Google (Bloom filter): https://petsymposium.org/2016/files/papers/Building_a_RAPPOR_with_the_Unknown__Privacy-Preserving_Learning_of_Associations_and_Data_Dictionaries.pdf
• Private Count Mean Sketch by Apple: https://machinelearning.apple.com/research/learning-with-privacy-at-scale

Slide 47

Federated Learning
• Collaborative learning between a server and clients
• Raw data never leaves clients' devices
• First FL paper: https://proceedings.mlr.press/v54/mcmahan17a/mcmahan17a.pdf
• Survey paper: https://arxiv.org/abs/1912.04977
(Figure: a global model trained across participating clients; non-participants of FL do not contribute.)

Slide 48

Gradient Inversion - Privacy Issues in FL
• Can we reconstruct an image used in training from a gradient? → Yes.
(Source) "Inverting Gradients - How easy is it to break privacy in federated learning?" https://arxiv.org/abs/2003.14053

Slide 49

Federated Learning under Differential Privacy
• Central model: clients send raw gradients to the global model, and the server aggregates them w/ noise injection
• Local model: clients send randomized gradients, and the server aggregates them

Slide 50

LDP-SGD
• Randomized response applied to the gradient's direction
• Randomly select the green zone or the white zone (in the figure), and then uniformly pick a vector from the selected zone
https://arxiv.org/abs/2001.03618

Slide 51

Empirical Privacy Measurement in LDP-SGD
• Empirical measurement with instantiated adversaries for LDP-SGD
• The worst-case adversary, which flips the gradient direction, reaches the theoretical bound
https://arxiv.org/abs/2206.09122

Slide 52

Issues in Local DP
• LDP enables us to collect users' data in a privatized way, but the amount of noise tends to be prohibitive
(Figure: randomized items and randomized gradients sent to the global model carry heavy noise.)

Slide 53

Shuffle Model – an intermediate privacy model 54

Slide 54

Shuffle model – an intermediate privacy model
• An intermediate trusted entity, the "shuffler", anonymizes local users' identities
• Each client encrypts its randomized content (randomized w/ ε₀) with the server's public key; the shuffler only mixes the senders' identities, without looking at the contents, and sends the shuffled, anonymized batch (x̃₁, x̃₂, …, x̃_n) to the server

Slide 55

Privacy Amplification via Shuffling
• The shuffler can amplify differential privacy → possibility to decrease the local noise
• The amplification by the shuffler translates LDP on the clients (ε₀) into a smaller central-DP guarantee (ε < ε₀)
• Amplification bound by "hiding among the clones": https://arxiv.org/abs/2012.12803
(Figure: example amplification for k-randomized response with ε₀ = 8 (LDP) and k = 10.)

Slide 56

Shuffle Model in Federated Learning
• Using a shuffler and sub-sampling, FL can also employ privacy amplification
• Clients randomly check in to federated learning at each iteration (sub-sampling), and the shuffler forwards their randomized reports to the aggregator
• Result: higher accuracy at a strong privacy level (smaller ε), e.g., ε_ldp = 8 amplified to ε_cdp = 1
https://arxiv.org/abs/2206.03151

Slide 57

Network Shuffling
• Decentralized shuffling via multi-round random walks on a graph
• In each round, every client relays her randomized reports to one of her neighbors (e.g., friends on a social network) via an encrypted channel
• The larger the graph, the stronger the privacy amplification
• Accepted at SIGMOD 2022
https://arxiv.org/abs/2204.03919

Slide 58

Conclusion 59

Slide 59

Topics in this lecture
• Privacy Risks, Issues, and Case-studies
• Differential Privacy (Central Model)
  • Query Release via Laplace Mechanism
  • Machine Learning via DP-SGD
• Local Differential Privacy
  • Stats Gathering via Randomized Response
  • Federated Learning via LDP-SGD
• Shuffle Model – an intermediate privacy model
