A Research Case Study on Privacy-Preserving Data Synthesis at LINE

PEARL: Data Synthesis via Private Embeddings and Adversarial Reconstruction Learning
ICLR 2022 (to appear). Seng Pei Liew, with Tsubasa Takahashi & Michihiko Ueno

Summary in one slide
• Sharing data among organizations or departments may cause privacy issues. How can this be mitigated?
• Privacy-preserving data synthesis (PPDS): we train a generative model with differential privacy (rigorous privacy guarantees) and use the model to generate synthetic data for private data sharing.
• We propose PEARL, a framework that trains generative models at a practical level of privacy and overcomes issues encountered in previous works, which mainly utilize DP-SGD (explained in later parts).
[Figure: PEARL overview. Sensitive Data; Privately Embedded 1 ... k; Aux; Synthesized 1 ... k; Critic; Adv. Recon. Learner; Generator; steps (1)-(4); DP flow (one-shot) vs. training flow]

Content
• Differential Privacy
• Differentially Private Data Synthesis (with generative model)
• Training generative models with differential privacy (and general shortcomings)
• Proposal: PEARL
• Realization of PEARL (embedding, generative model, critic)
• Results

Differential Privacy
• “An algorithm is differentially private if changing a single record does not alter its output distribution by much.” [DN03, DMNS06]
[Figure: Sensitive data; Algorithm; Statistics]

Differential Privacy
• Two datasets are neighbors if they differ in the data of a single record.
• An algorithm $M$ is $\epsilon$-differentially private if, for all neighboring datasets $D, D'$ and all outputs $x$:
$$\Pr[M(D) = x] \le e^{\epsilon} \, \Pr[M(D') = x]$$
• The parameter $\epsilon$ controls the degree of privacy and is often called the privacy budget.
• Note: at small $\epsilon$, $e^{\epsilon} \approx 1 + \epsilon$, so we can instead write
$$\Pr[M(D) = x] \le (1 + \epsilon) \, \Pr[M(D') = x]$$
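To make the definition concrete, here is a minimal sketch using randomized response on a single binary record. Randomized response is not part of the talk; it is chosen here only because its output probabilities can be written down exactly. The function names are hypothetical.

```python
import math

def randomized_response_probs(true_bit: int, epsilon: float) -> dict:
    """Output distribution when one record holds `true_bit`.

    The mechanism reports the true bit with probability e^eps / (1 + e^eps)
    and the flipped bit otherwise (illustrative mechanism, not from the talk).
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return {true_bit: p_truth, 1 - true_bit: 1.0 - p_truth}

def max_probability_ratio(epsilon: float) -> float:
    """Largest ratio Pr[M(D) = x] / Pr[M(D') = x] over outputs x,
    where the neighboring datasets D, D' differ in the single bit."""
    p0 = randomized_response_probs(0, epsilon)
    p1 = randomized_response_probs(1, epsilon)
    return max(max(p0[x] / p1[x], p1[x] / p0[x]) for x in (0, 1))

if __name__ == "__main__":
    for eps in (0.01, 0.5, 2.0):
        # The ratio meets the e^eps bound with equality for this mechanism.
        print(eps, max_probability_ratio(eps), math.exp(eps), 1.0 + eps)
```

For small values such as 0.01, the printed bound e^ε and the approximation 1 + ε are nearly identical, which is the point of the note above.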

Differentially private data synthesis
• From sensitive data, an algorithm produces “fake data” that is private and preserves the characteristics of the real data.
• The synthetic data allows arbitrary usage (e.g., by a data scientist) without privacy violation:
  • Training ML models
  • Exploratory data analysis

Differentially private data synthesis with generative models
• Learning a private generative model allows the generation of “fake” data.
[Figure: Sensitive data; Algorithm; Generative Model]

Training deep generative models with differential privacy
• The most popular method is differentially private stochastic gradient descent (DP-SGD) [ACG+16].
• DP-SGD ensures that each gradient update is private, which in turn guarantees that the network parameters are private.
• Privacy consumption is accumulated with the moments accountant.
[Figure: DP-SGD loop. Sensitive data; sample a batch of data; generative model; calculate loss $L_\theta$; compute gradient $\nabla L_\theta$; clip gradient; add Gaussian noise; update parameters $\theta$]
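The loop in the figure can be sketched as a single update step. This is a minimal NumPy illustration, assuming a linear model with squared-error loss as a stand-in for the generative model; `dp_sgd_step`, `clip_norm` and `noise_multiplier` are hypothetical names, and privacy accounting (the moments accountant) is deliberately left out.

```python
import numpy as np

def dp_sgd_step(theta, x_batch, y_batch, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD-style update for a linear model (illustrative stand-in).

    Per-example loss: 0.5 * (x . theta - y)^2.
    """
    # Compute one gradient per example, shape (batch, dim).
    residuals = x_batch @ theta - y_batch
    per_example_grads = residuals[:, None] * x_batch

    # Clip each per-example gradient to L2 norm at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale[:, None]

    # Add Gaussian noise calibrated to the clipping norm, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=theta.shape)
    noisy_mean_grad = (clipped.sum(axis=0) + noise) / len(x_batch)

    # Update parameters.
    return theta - lr * noisy_mean_grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(32, 4))
    y = x @ np.array([1.0, -2.0, 0.5, 0.0])
    theta = np.zeros(4)
    for _ in range(200):
        theta = dp_sgd_step(theta, x, y, lr=0.1, clip_norm=1.0,
                            noise_multiplier=1.0, rng=rng)
    print(theta)
```

Every call touches the sensitive batch again, so each step consumes additional privacy budget.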

Some examples of differentially private generative models
• Add noise only to the gradient of the discriminator, because the generator has no access to the data.
• Privatize both the encoder and the decoder.
• Refs: [TTC+20], [COF20]

General shortcomings of DP-SGD
1. Training steps are limited. Each access to the data weakens the privacy guarantee.
2. Network size is limited. Large neural networks lead to too much noise being added to the gradient updates.
3. Extensive tuning of hyperparameters (clipping size) is required.
[Figure: DP-SGD loop annotated with (1) multiple accesses to data, (2) noise proportional to network size, (3) tuning of clipping size (to bound sensitivity)]

Proposal: PEARL
Private Embeddings and Adversarial Reconstruction Learning (arXiv: 2106.04590)
1. Project sensitive data to low-dimensional embeddings and add Gaussian noise to make the embeddings differentially private.
2. Obtain auxiliary information useful for training in a differentially private manner.
3. Train a generator by minimizing the embedding distance.
4. Train with an adversarial objective to improve performance.
• One-shot data access.
• No noise added to the gradients. No clipping required.
[Figure: PEARL overview. Sensitive Data; Privately Embedded 1 ... k; Aux; Synthesized 1 ... k; Critic; Adv. Recon. Learner; Generator; steps (1)-(4); DP flow (one-shot) vs. training flow]

General shortcomings of DP-SGD (side by side with PEARL)
• DP-SGD: (1) multiple accesses to data; (2) noise proportional to network size; (3) tuning of clipping size (to bound sensitivity).
• PEARL: one-shot data access; no noise added to the gradients; no clipping required.
[Figure: DP-SGD loop next to the PEARL overview diagram]

Realization of PEARL: Embedding
[Figure: PEARL overview diagram]

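Realization of PEARL: Characteristic Function
• Let $x$ be a random variable with probability distribution $\mathbb{P}$. The corresponding characteristic function (CF) is
$$\Phi_{\mathbb{P}}(t) = \mathbb{E}_{x \sim \mathbb{P}}\left[e^{i t \cdot x}\right] = \int_{\mathbb{R}^d} e^{i t \cdot x} \, d\mathbb{P} \simeq \sum_{x} e^{i t \cdot x} \quad \text{(empirical CF)}$$
• This mathematical operation is equivalent to a Fourier transform from the signal-processing point of view; $t$ is the frequency.
• Also define the characteristic function distance between two distributions:
$$C^2(\mathbb{P}, \mathbb{Q}) = \int \left| \Phi_{\mathbb{P}}(t) - \Phi_{\mathbb{Q}}(t) \right|^2 \omega(t) \, dt$$
• It can be shown that, with an appropriately defined density $\omega(t)$, $C(\mathbb{P}, \mathbb{Q}) = 0 \iff \mathbb{P} = \mathbb{Q}$.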
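The two definitions above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions: the empirical CF is normalized by the number of samples (the sample mean of $e^{i t \cdot x}$), the integral against $\omega(t)$ is replaced by a Monte Carlo average over frequencies assumed to be drawn from $\omega$, and the function names are hypothetical.

```python
import numpy as np

def empirical_cf(samples: np.ndarray, freqs: np.ndarray) -> np.ndarray:
    """Empirical characteristic function.

    samples: (n, d) data points; freqs: (k, d) frequencies t_1..t_k.
    Returns a complex vector of length k, the sample mean of exp(i t . x).
    """
    phases = freqs @ samples.T          # (k, n) matrix of t_i . x_j
    return np.exp(1j * phases).mean(axis=1)

def cf_distance_sq(samples_p, samples_q, freqs) -> float:
    """Monte Carlo estimate of C^2(P, Q), with freqs assumed drawn from omega."""
    diff = empirical_cf(samples_p, freqs) - empirical_cf(samples_q, freqs)
    return float(np.mean(np.abs(diff) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    freqs = rng.normal(size=(64, 2))            # omega assumed standard normal
    p = rng.normal(size=(2000, 2))
    q_same = rng.normal(size=(2000, 2))         # same distribution as p
    q_shifted = rng.normal(size=(2000, 2)) + 2  # different distribution
    print(cf_distance_sq(p, q_same, freqs), cf_distance_sq(p, q_shifted, freqs))
```

The estimate is close to zero for two samples of the same distribution and clearly larger for the shifted one, which mirrors the property $C(\mathbb{P}, \mathbb{Q}) = 0 \iff \mathbb{P} = \mathbb{Q}$.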

Realization of PEARL: Generative model
[Figure: PEARL overview diagram]

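Realization of PEARL: Generative model
• Let $x$ be the sensitive data. We sample a finite number of frequencies $t$ from $\omega(t)$, and add noise to the empirical CF $\hat{\Phi}_{\mathbb{P}}(t)$ to make it DP. The sanitized CF is written $\tilde{\Phi}_{\mathbb{P}}(t)$.
• Define a generator $G_\theta$ that takes a latent vector (noise) $z$ as input and outputs “fake” $x$.
• We can then train the generator with the following objective, which minimizes the CF distance:
$$\inf_{\theta \in \Theta} \sum_{i=1}^{k} \left| \tilde{\Phi}_{\mathbb{P}}(t_i) - \hat{\Phi}_{\mathbb{Q}}(t_i) \right|^2$$
where $k$ is the number of sampled frequencies.
• $\tilde{\Phi}_{\mathbb{P}}(t)$: noise is added to $\hat{\Phi}_{\mathbb{P}}(t)$ for the sampled frequencies $t$.
• $\hat{\Phi}_{\mathbb{Q}}(t) = \sum_{y} e^{i t \cdot y}$ with $y = G_\theta(z)$: no noise is required because this term has no access to the data.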
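The one-shot sanitization and the resulting objective can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes the normalized empirical CF, replace-one neighboring datasets (so each of the $k$ CF values changes by at most $2/n$, giving L2 sensitivity $2\sqrt{k}/n$), and the classical Gaussian-mechanism calibration, which is valid for $\epsilon < 1$. The talk does not state which calibration or accountant is used. All function names are hypothetical, and the generator is represented only by its output samples.

```python
import numpy as np

def empirical_cf(samples, freqs):
    """Sample mean of exp(i t . x) for each frequency t; returns (k,) complex."""
    return np.exp(1j * (freqs @ samples.T)).mean(axis=1)

def sanitize_cf(cf_values, n_records, epsilon, delta, rng):
    """Add Gaussian noise once to the empirical CF of the sensitive data.

    Assumed L2 sensitivity: 2 * sqrt(k) / n (replace-one neighbors).
    Assumed calibration: classical Gaussian mechanism, epsilon < 1.
    """
    k = len(cf_values)
    sensitivity = 2.0 * np.sqrt(k) / n_records
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    # Independent noise on the real and imaginary parts.
    noise = rng.normal(0.0, sigma, size=k) + 1j * rng.normal(0.0, sigma, size=k)
    return cf_values + noise, sigma

def generator_loss(sanitized_cf, fake_samples, freqs):
    """Sum over sampled frequencies of |sanitized CF - CF of generated data|^2.

    The second term touches only generated data, so it needs no noise.
    """
    return float(np.sum(np.abs(sanitized_cf - empirical_cf(fake_samples, freqs)) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sensitive = rng.normal(loc=1.0, size=(5000, 2))
    freqs = rng.normal(size=(32, 2))            # sampled once from omega (assumed Gaussian)
    target, sigma = sanitize_cf(empirical_cf(sensitive, freqs), 5000, 0.9, 1e-5, rng)
    good_fake = rng.normal(loc=1.0, size=(5000, 2))
    bad_fake = rng.normal(loc=-1.0, size=(5000, 2))
    print(sigma, generator_loss(target, good_fake, freqs), generator_loss(target, bad_fake, freqs))
```

After `sanitize_cf` has run, the sensitive data is never read again; training a generator only evaluates `generator_loss` against the fixed sanitized target.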

Realization of PEARL: Critic
[Figure: PEARL overview diagram]

Realization of PEARL: Critic
• We have not discussed $\omega(t)$ much so far. Recall the CF distance:
$$C^2(\mathbb{P}, \mathbb{Q}) = \int \left| \Phi_{\mathbb{P}}(t) - \Phi_{\mathbb{Q}}(t) \right|^2 \omega(t) \, dt$$
• The idea is to treat $\omega(t)$ as an adversarial critic that provides more discriminative features for training $G_\theta$, as in GANs.
• $\omega(t)$ cannot be optimized directly. Known methods, e.g., reparametrization tricks, require access to the data, violating privacy.
• We propose to re-weight the CFs to choose the “best” weight for training $G_\theta$ while preserving privacy, by performing minimax optimization:
$$\inf_{\theta \in \Theta} \sup_{\omega \in \Omega} C_{\omega}(\mathbb{P}, \mathbb{Q}_\theta)$$

Realization of PEARL: Critic
• More concretely, the following minimax optimization is proposed:
$$\inf_{\theta \in \Theta} \sup_{\omega \in \Omega} \sum_{i=1}^{k} \frac{\omega(t_i)}{\omega_0(t_i)} \left| \tilde{\Phi}_{\mathbb{P}}(t_i) - \hat{\Phi}_{\mathbb{Q}}(t_i) \right|^2$$
• Additionally, we are able to show that the above optimization has the following theoretical properties:
1. Continuity and differentiability (allows the generator to be trained via gradient descent)
2. Weak convergence (good for training GAN-like models [ACB’17])
3. Consistency in the infinite sampling limit (ensures the maximization procedure is consistent asymptotically)

Generated image data
• PEARL’s quality is low at the non-private ($\epsilon = \infty$) limit, but the quality does not change much as $\epsilon$ decreases (except at extreme values).

Generated image data
• Evaluating with metrics commonly used for GANs.

Results on tabular data
• We also generate synthetic Adult data. The frequency histogram is shown on the left (compared with another SOTA method); PEARL captures the pattern of the distribution well.
• We use the synthetic data to train ML models for classifying real data. The result on the right also shows that PEARL outperforms the SOTA method.

Wrap-up
PEARL: a new approach to training deep generative models
• Training practical models at reasonable privacy levels while avoiding the difficulties of DP-SGD.
[Figure: PEARL overview. Sensitive Data; Privately Embedded 1 ... k; Aux; Synthesized 1 ... k; Critic; Adv. Recon. Learner; Generator; steps (1)-(4); DP flow (one-shot) vs. training flow]

Appendix

Auxiliary information
• Get auxiliary information privately to train a better generative model.
• Tabular data: use a DP mean to train a Gaussian mixture model, to better model continuous attributes.
• Class imbalance: get the number of samples in each class to perform re-weighting, to train a more balanced model.
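As an illustration of the class-imbalance item, the sketch below releases per-class counts privately and turns them into re-weighting factors. The talk does not say which mechanism is used; the Laplace mechanism here is an assumption, as are the replace-one neighboring relation (which gives the count vector an L1 sensitivity of 2) and the inverse-frequency weighting. The function name is hypothetical.

```python
import numpy as np

def dp_class_weights(labels, n_classes, epsilon, rng):
    """Noisy per-class counts and inverse-frequency weights.

    Assumed mechanism: Laplace noise with scale 2 / epsilon, since replacing
    one record lowers one count by 1 and raises another by 1.
    """
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    noisy = counts + rng.laplace(0.0, 2.0 / epsilon, size=n_classes)
    # Post-processing: keep counts positive so the weights stay finite.
    noisy = np.clip(noisy, 1.0, None)
    weights = noisy.sum() / (n_classes * noisy)
    return noisy, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.choice(3, size=1000, p=[0.7, 0.2, 0.1])
    noisy_counts, weights = dp_class_weights(labels, 3, epsilon=0.5, rng=rng)
    print(noisy_counts, weights)   # rarer classes receive larger weights
```

The budget spent on this release has to be added to the budget spent on the sanitized embeddings.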

Choosing $\omega_0$
• $\omega_0(t)$ is chosen by the median heuristic (pairwise median of data points).
• We estimate the mean privately instead, because it is more tractable.
• The privacy budget for this calculation is accounted for appropriately.

Implementation details

Detailed quantitative Adult results

Frequency histogram for continuous attribute of Adult data
