Rest of the Talk
1. Background on Applying Differential Privacy to Machine Learning
2. Experimental Evaluation of Differentially Private Machine Learning Implementations

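Applying DP to Machine Learning
Pipeline: Data → Machine Learning → Model M
Machine Learning:
  Define Objective Function
  Iterate for T epochs:
    Calculate Gradients
    Update Model
Where noise can be added: Objective Perturbation (objective function), Gradient Perturbation (gradients), Output Perturbation (model M)
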
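The gradient perturbation step in the training loop above can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the talk. The logistic-regression objective, the helper names `clip_and_perturb` and `train`, and the values of `clip`, `sigma`, and `lr` are all assumptions made for the example, and no privacy accounting is performed.

```python
import math
import random

def clip_and_perturb(per_example_grads, clip=1.0, sigma=1.0, rng=None):
    """Average per-example gradients after L2 clipping and Gaussian noise.

    Hypothetical helper: `clip` and `sigma` are illustrative values and are
    not calibrated to any particular (epsilon, delta) guarantee.
    """
    rng = rng or random.Random(0)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for g in per_example_grads:
        # Scale the gradient down so its L2 norm is at most `clip`.
        norm = math.sqrt(sum(v * v for v in g))
        scale = 1.0 / max(1.0, norm / clip)
        for j in range(dim):
            total[j] += g[j] * scale
    # Add Gaussian noise proportional to the clipping norm, then average.
    n = len(per_example_grads)
    return [(total[j] + rng.gauss(0.0, sigma * clip)) / n for j in range(dim)]

def train(X, y, epochs=10, lr=0.1, clip=1.0, sigma=1.0, seed=0):
    """Logistic regression: iterate for `epochs`, perturbing each gradient."""
    rng = random.Random(seed)
    theta = [0.0] * len(X[0])
    for _ in range(epochs):
        grads = []
        for x_i, y_i in zip(X, y):
            # Per-example gradient of the logistic loss.
            z = sum(t * v for t, v in zip(theta, x_i))
            pred = 1.0 / (1.0 + math.exp(-z))
            grads.append([(pred - y_i) * v for v in x_i])
        noisy = clip_and_perturb(grads, clip, sigma, rng)
        # Update the model with the perturbed gradient.
        theta = [t - lr * g for t, g in zip(theta, noisy)]
    return theta
```

Setting `sigma=0.0` recovers ordinary clipped gradient descent, which is a convenient way to check the loop before adding noise.
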
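Applying DP to Deep Learning
Pipeline: Data → Machine Learning → Model M
Machine Learning:
  Define Objective Function
  Iterate for T epochs:
    Calculate Gradients
    Update Model
Where noise can be added: Objective Perturbation (objective function), Gradient Perturbation (gradients), Output Perturbation (model M)
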
Improving Composition
Pipeline: Data → Machine Learning → Model M
Machine Learning:
  Define Objective Function
  Iterate for T epochs:
    Calculate Gradients (Gradient Perturbation)
    Update Model
If each iteration is ϵ-DP, then by composition the model is Tϵ-DP.
With improved composition, the model is (O(√T ϵ), δ)-DP:
  Concentrated DP [Dwork et al. (2016)]
  Zero Concentrated DP [Bun & Steinke (2016)]
  Moments Accountant [Abadi et al. (2016)]
  Rényi DP [Mironov (2017)]

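To make the gap between linear and square-root growth concrete, the sketch below compares naive composition with the standard advanced composition theorem. The advanced composition bound is used here only as a stand-in for the tighter accountants listed on the slide. The function names and the example values of ϵ, T, and δ' are assumptions.

```python
import math

def naive_composition(eps, T):
    """Total budget when T eps-DP iterations are composed naively."""
    return T * eps

def advanced_composition(eps, T, delta_prime):
    """Total epsilon under the advanced composition theorem.

    Composing T eps-DP mechanisms is (eps_total, delta_prime)-DP with
    eps_total = eps * sqrt(2 T ln(1/delta_prime)) + T * eps * (e^eps - 1).
    """
    root_term = eps * math.sqrt(2 * T * math.log(1 / delta_prime))
    linear_term = T * eps * (math.exp(eps) - 1)
    return root_term + linear_term

# Illustrative values (assumptions, not taken from the talk).
eps, T, delta_prime = 0.01, 1000, 1e-5
print(naive_composition(eps, T))
print(advanced_composition(eps, T, delta_prime))
```

With these example values the naive bound is 10, while the advanced bound is roughly 1.6, at the cost of a nonzero δ.
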
Membership Inference Attacks
Privacy Leakage: (TPR − FPR) of the attacker's membership predictions
Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov (S&P 2017): Data Set → models M1, M2, …, Mk → attack model A → Predict Membership for model M
Samuel Yeom, Irene Giacomelli, Matt Fredrikson, Somesh Jha (CSF 2018): uses the Expected Training Loss, (1/n) ∑_{i=1}^{n} ℓ(d_i, θ)

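The leakage metric and a loss-threshold attack in the spirit of Yeom et al. can be sketched as below. The exact decision rule (predict "member" when a record's loss is at most the expected training loss) and the function names are assumptions made for illustration, and the loss values are made up.

```python
def loss_threshold_attack(losses, expected_training_loss):
    """Predict 'member' for each record whose loss is at most the threshold."""
    return [loss <= expected_training_loss for loss in losses]

def privacy_leakage(predictions, is_member):
    """Return TPR - FPR of the membership predictions."""
    pairs = list(zip(predictions, is_member))
    true_pos = sum(1 for p, m in pairs if p and m)
    false_pos = sum(1 for p, m in pairs if p and not m)
    n_members = sum(1 for m in is_member if m)
    n_non_members = len(is_member) - n_members
    return true_pos / n_members - false_pos / n_non_members

# Made-up per-record losses for members and non-members.
member_losses = [0.1, 0.2, 0.3, 1.0]
non_member_losses = [0.2, 0.5, 0.9, 1.2]
threshold = sum(member_losses) / len(member_losses)  # (1/n) * sum of losses

preds = loss_threshold_attack(member_losses + non_member_losses, threshold)
truth = [True] * 4 + [False] * 4
print(privacy_leakage(preds, truth))
```

A leakage of 0 means the attacker does no better than guessing, and a leakage of 1 means members and non-members are perfectly separated.
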
Neural Networks
Input Layer (50 Neurons) → Hidden Layer 1 (256 Neurons) → Hidden Layer 2 (256 Neurons) → Output Layer (100 Neurons)
The NN has 103,936 trainable parameters, so it has more capacity to fit the training data.

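The parameter count can be checked with a few lines of arithmetic. The sketch assumes fully connected layers of sizes 50, 256, 256, and 100. Under that assumption, the 103,936 figure matches the number of weights when bias terms are not counted. The helper name `count_parameters` is hypothetical.

```python
def count_parameters(layer_sizes, include_biases=False):
    """Count the parameters of a fully connected network."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weight matrix between adjacent layers
        if include_biases:
            total += fan_out       # one bias per neuron in the next layer
    return total

layers = [50, 256, 256, 100]
print(count_parameters(layers))  # 103936
```
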
[Figure: Fraction of Data Set (0.00 to 1.00) versus Number of times identified as member (out of 5 runs), for True Members and Non Members. Reference: Random, Independent Predictions. PPV by threshold: >= 0: 0.500, >= 1: 0.656, >= 2: 0.749, >= 3: 0.797, >= 4: 0.817, = 5: 0.822.]

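The quantity plotted in this figure can be computed along the following lines: count how many runs flag each record as a member, then measure PPV among the records flagged at least k times. The data layout and the function name `ppv_by_threshold` are assumptions, and the toy inputs are made up.

```python
def ppv_by_threshold(run_predictions, is_member):
    """Map k to the PPV among records flagged as member in at least k runs.

    run_predictions: one list of boolean predictions per run, all aligned
    with is_member. Returns None for a threshold that flags no records.
    """
    n_runs = len(run_predictions)
    counts = [sum(1 for run in run_predictions if run[i])
              for i in range(len(is_member))]
    result = {}
    for k in range(n_runs + 1):
        flagged = [m for c, m in zip(counts, is_member) if c >= k]
        result[k] = (sum(1 for m in flagged if m) / len(flagged)
                     if flagged else None)
    return result

# Toy example: 2 runs over 4 records (2 members, 2 non-members).
truth = [True, True, False, False]
runs = [
    [True, True, True, False],
    [True, False, True, False],
]
ppv = ppv_by_threshold(runs, truth)
```

At k = 0 every record is flagged, so the PPV equals the fraction of members in the data set, which is 0.5 for a balanced set.
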
Logistic Regression on CIFAR-100
[Figure: Accuracy Loss and Privacy Leakage (each 0.00 to 1.00) versus Privacy Budget ϵ (0.01 to 1000). Series: Theoretical Guarantee, RDP Acc Loss, RDP Leakage, NC Acc Loss, NC Leakage. Plot annotation: 0.55 PPV.]
Non-private model has 0.12 leakage with 0.56 PPV.
Conclusion: There is privacy leakage, but it is not considerable, even for the non-private model.

Neural Network on CIFAR-100
[Figure: Accuracy Loss and Privacy Leakage (each 0.00 to 1.00) versus Privacy Budget ϵ (0.01 to 1000). Series: Theoretical Guarantee, RDP Acc Loss, RDP Leakage, NC Acc Loss, NC Leakage. Plot annotation: 0.74 PPV.]
Non-private model has 0.72 leakage with 0.94 PPV.
Bridging the gap between the theoretical bound on leakage and the leakage of practical attacks.
Conclusion: Privacy doesn't come for free.