Value of Privacy

Last Thursday: Joe Calandrino, FTC privacy abuses and regulations
Today: Mechanisms for Privacy
Next Tuesday: Privacy-Aware Mechanism Design
for any two neighboring datasets D and D′:

  Pr[A(D) ∈ S] ≤ e^ε · Pr[A(D′) ∈ S]
  Pr[A(D′) ∈ S] ≤ e^ε · Pr[A(D) ∈ S]

"Neighboring" datasets differ in at most one entry; the definition is symmetric:

  e^(−ε) ≤ Pr[A(D) ∈ S] / Pr[A(D′) ∈ S] ≤ e^ε
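To make the definition concrete, here is a small check (not from the slides) that classic randomized response on a single bit — answer truthfully with probability 1/2, otherwise answer a fair coin flip — satisfies the bound with ε = ln 3. The two "neighboring datasets" are just the two values of the bit.

```python
import math

def randomized_response_probs(bit):
    """Output distribution of classic randomized response:
    with prob 1/2 report the true bit, else report a fair coin flip."""
    p_one = 0.5 * bit + 0.5 * 0.5   # P(report = 1)
    return {1: p_one, 0: 1.0 - p_one}

# Neighboring datasets: the single entry is 1 vs. 0.
p = randomized_response_probs(1)
q = randomized_response_probs(0)

epsilon = math.log(3)
for output in (0, 1):
    ratio = p[output] / q[output]
    # e^(-eps) <= Pr[A(D) in S] / Pr[A(D') in S] <= e^eps
    assert math.exp(-epsilon) <= ratio <= math.exp(epsilon)
```

The worst-case ratio is 0.75 / 0.25 = 3, so the mechanism is exactly (ln 3)-differentially private.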
A promise from the data holder, or curator, to a data subject: "You will not be affected, adversely or otherwise, by allowing your data to be used in any study or analysis, no matter what other studies, data sets, or information sources are available."
Bloom filter: a data structure to record a set S ⊆ U of items.
lookup(x):
  x ∈ S: always returns TRUE
  x ∉ S: likely to return FALSE (but occasionally TRUE)
[note: no privacy goal, and does not guarantee any useful privacy properties!]
Set of k independent hash functions h_i : U → {0, …, m−1}

initialize: for i in {0, …, m−1}: B[i] = 0
insert(x): for i in {0, …, k−1}: B[h_i(x)] = 1
lookup(x): return ⋀_{i=0}^{k−1} B[h_i(x)]

Does this provide differential privacy?
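The pseudocode above can be sketched as a small Python class. Deriving the k hash functions by salting SHA-256 is an implementation choice for illustration, not something the slides specify.

```python
import hashlib

class BloomFilter:
    """Bit-array Bloom filter with k salted-SHA-256 hash functions."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m               # initialize: all bits 0

    def _hashes(self, item):
        # h_i(x) for i in 0..k-1, each mapping into {0, ..., m-1}
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def insert(self, item):
        for h in self._hashes(item):      # set B[h_i(x)] = 1
            self.bits[h] = 1

    def lookup(self, item):
        # AND of B[h_i(x)]: True only if all k probed bits are 1
        return all(self.bits[h] for h in self._hashes(item))

bf = BloomFilter(m=64, k=3)
bf.insert("alice")
assert bf.lookup("alice")                 # members always return TRUE
```

Lookups for absent items usually return False, but can return True when all k probed bits happen to have been set by other insertions — the false-positive case analyzed next.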
After inserting n items into an m-bit array using k hash functions, what is the probability that a given bit is still 0?

  (1 − 1/m)^(kn)

For lookup of an item not present, what is the probability that all k probed bits are 1 (a false positive)?

  (1 − (1 − 1/m)^(kn))^k ≈ (1 − e^(−kn/m))^k
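These formulas are easy to evaluate numerically. The parameters below (n = 1,000 items, m = 10,000 bits, k = 7 hashes) are illustrative, not from the slides; they show the exponential approximation is very tight.

```python
import math

def false_positive_rate(m, k, n):
    """Exact probability that lookup of an absent item returns True."""
    p_zero = (1 - 1 / m) ** (k * n)       # a bit survives all kn writes
    return (1 - p_zero) ** k              # all k probed bits are 1

def false_positive_approx(m, k, n):
    """Approximation using (1 - 1/m)^(kn) ~ e^(-kn/m)."""
    return (1 - math.exp(-k * n / m)) ** k

exact = false_positive_rate(m=10_000, k=7, n=1_000)
approx = false_positive_approx(m=10_000, k=7, n=1_000)
assert abs(exact - approx) < 1e-4         # approximation is very close
```

With 10 bits per stored item and 7 hash functions, the false-positive rate works out to under 1%.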
Machine learning pipeline: Data Collection → Model Training (Hyperparameters) → Trained Model → Deployed Model → Machine Learning Service API → User

Where privacy mechanisms can be applied:
- At collection (on the user's side): Randomized Response, Local Differential Privacy
- During training: Objective Perturbation, Gradient Perturbation
- On the trained model: Output Perturbation
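As a sketch of the collection-stage option (randomized response for local differential privacy): each user perturbs their own bit before sending it, and the server debiases the aggregate to estimate the population mean. The truth probability and population below are illustrative assumptions.

```python
import random

def perturb(bit, p_truth=0.75):
    """Each user reports truthfully with probability p_truth,
    otherwise reports the opposite bit (randomized response)."""
    return bit if random.random() < p_truth else 1 - bit

def estimate_mean(reports, p_truth=0.75):
    """Debias the observed mean:
    E[report] = (1 - p_truth) + mu * (2 * p_truth - 1), solve for mu."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)

random.seed(0)
true_bits = [1] * 7_000 + [0] * 3_000     # true population mean = 0.7
reports = [perturb(b) for b in true_bits] # only perturbed bits leave users
est = estimate_mean(reports)              # should land near 0.7
```

The server never sees a raw bit, yet the debiased estimate concentrates around the true mean as the number of users grows — the accuracy/privacy trade-off behind local differential privacy.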