Toward the Realization of Trustworthy AI

2021.12.17, CCSE2021
Tsubasa TAKAHASHI, Senior Researcher / Manager, Trustworthy AI Team

LINE AI's R&D Vision

What types of experts do you trust?
A. Domain Knowledge with Certificates

What types of AI do you trust?
A. Domain Knowledge with Quality Assurance

Towards Trustworthy AI
• Domain Knowledge: How do we obtain sensitive datasets for training a domain-expert AI?
• Quality Assurance: How do we evaluate an AI's quality, including ethical issues?

Towards Trustworthy AI
• How do we obtain sensitive datasets for training a domain-expert AI? Data Synthesis, Federated Learning
• How do we evaluate an AI's quality, including ethical issues? Ethics Test, Evidence-based Verification

PPDS (Privacy Preserving Data Synthesis)

Data Synthesis as Data Sharing
Goal: share a generative model instead of the raw data.
• Data holder: trains a generative model on the data, then delivers the model.
• Data user: synthesizes data from random numbers with the delivered model.
Q: How can we build a privacy-aware generative model that supports data sharing?
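The train, deliver, synthesize flow can be sketched with a deliberately simple stand-in model. This is a toy under stated assumptions: a per-column Gaussian replaces the deep generative model, the function names and the sample rows are hypothetical, and no privacy protection is applied at this stage.

```python
import random
import statistics

def train_generative_model(rows):
    """Data holder side: fit a per-column Gaussian (toy model, NOT private)."""
    columns = list(zip(*rows))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

def synthesize(model, n, seed=0):
    """Data user side: turn random numbers into n synthetic rows."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for mu, sigma in model] for _ in range(n)]

# Made-up raw data that never leaves the data holder.
raw = [[170.0, 65.0], [160.0, 55.0], [180.0, 80.0], [175.0, 72.0]]

model = train_generative_model(raw)  # only these parameters are delivered
synthetic = synthesize(model, n=3)   # the data user never sees `raw`
```

Even in this toy, the delivered parameters are computed directly from the raw rows, which is why the question of a privacy-aware generative model arises.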

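Privacy Preserving Data Synthesis
Goal: share a privacy-protected generative model instead of the raw data.
Note: the guarantee considered here is Differential Privacy (DP).
Hurdles for practical PPDS
• Under DP, the number of iterations (the number of times the data is accessed) is limited.
• Training a generative model is highly complex and easily affected by noise.
Methods compared (train a generative model with DP, then synthesize)
• Naive method: VAE + DP-SGD
• P3GM (ours): accepted at ICDE2021, joint work with an intern
• PEARL (ours): arXiv:2106.04590
[Figure: synthesized samples per method; panel labels ε=1.0, ε=0.2, ε=1.0, ε=1.0]
Result: comparatively high approximation performance under a practical privacy criterion (ε ≤ 1).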
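The iteration limit can be made concrete with basic sequential composition, a standard DP property that the slide does not state explicitly. The symbols are assumptions for illustration: each of T training iterations is taken to be ε₀-DP, and ε is the total budget. DP-SGD in practice uses tighter accounting, but the qualitative limit is the same: every pass over the data spends budget.

```latex
\varepsilon_{\text{total}} \;\le\; \sum_{t=1}^{T} \varepsilon_0 \;=\; T\,\varepsilon_0
\qquad\Longrightarrow\qquad
T \;\le\; \left\lfloor \frac{\varepsilon}{\varepsilon_0} \right\rfloor
\quad\text{suffices for } \varepsilon_{\text{total}} \le \varepsilon .
```

For example, with ε = 1 and ε₀ = 0.01 this bound allows at most 100 iterations.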

PEARL [Liew+, 2021]
PEARL: Private Embedding & Adversarial Reconstruction Learning
• (1)(2) Differentially private embeddings are derived from sensitive data through characteristic function representations.
• (3) A generator is trained while (4) a critic is optimized to distinguish between the real and generated data.
Advantages
• No limit on the number of learning iterations
• Good reconstruction capability thanks to the critic
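The embedding step can be sketched as an empirical characteristic function evaluated at a few frequencies and perturbed once with Gaussian noise. This is a minimal sketch under assumptions, not the authors' implementation: the data is one-dimensional, the function names are hypothetical, and the noise scale `sigma` is left as a free parameter rather than calibrated to a target ε. The adversarial training of the generator and critic is not shown.

```python
import cmath
import math
import random

def private_cf_embedding(data, freqs, sigma, seed=0):
    """Empirical characteristic function at each frequency, plus Gaussian noise."""
    rng = random.Random(seed)
    n = len(data)
    embedding = []
    for t in freqs:
        # Mean of exp(i*t*x) over the data; each term has modulus 1.
        phi = sum(cmath.exp(1j * t * x) for x in data) / n
        noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        embedding.append(phi + noise)
    return embedding

def l2_sensitivity(n, k):
    """Replacing one record moves each of the k means by at most 2/n."""
    return 2.0 * math.sqrt(k) / n

# Made-up sensitive values and hypothetical frequencies.
data = [0.1, 0.5, 0.9]
noisy = private_cf_embedding(data, freqs=[0.0, 1.0, 2.0], sigma=0.05)
```

Because the sensitive data is read only once to build the noisy embedding, later training against that embedding does not touch the data again, which is consistent with the "no limit on learning iterations" point.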

Good approximation of data distributions
[Figure panels: Adult (original data); PEARL (ours); DP-MERF (baseline)]

Private Federated Learning

Federated Learning: Overview
[Figure labels: Global Model; Non-participants of FL]

Inverting Gradients
Can training data (images) be reconstructed from gradients?
Source: "Inverting Gradients - How easy is it to break privacy in federated learning?" https://arxiv.org/abs/2003.14053
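Why gradients can leak inputs is easiest to see in a special case much simpler than the image reconstruction in the cited paper: a single fully connected layer with a bias, one training example, and a squared-error loss. There the weight gradient is the outer product of the bias gradient and the input, so the input follows by division. The layer, values, and function names below are hypothetical.

```python
def linear_layer_gradients(W, b, x, y):
    """Gradients of 0.5*||Wx + b - y||^2 w.r.t. W and b for one example."""
    z = [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    delta = [zi - yi for zi, yi in zip(z, y)]        # dL/dz
    grad_W = [[d * xj for xj in x] for d in delta]   # outer product delta * x^T
    grad_b = delta[:]                                # dL/db equals dL/dz
    return grad_W, grad_b

def recover_input(grad_W, grad_b):
    """Recover x from shared gradients using the row with the largest bias gradient."""
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    if grad_b[i] == 0.0:
        raise ValueError("all bias gradients are zero; input not recoverable this way")
    return [g / grad_b[i] for g in grad_W[i]]

W = [[0.2, -0.1, 0.4], [0.5, 0.3, -0.2]]
b = [0.1, -0.3]
x_secret = [0.9, 0.1, 0.5]   # the client's private input
y = [1.0, 0.0]

grad_W, grad_b = linear_layer_gradients(W, b, x_secret, y)
x_recovered = recover_input(grad_W, grad_b)  # matches x_secret up to rounding
```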

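FL with Local Differential Privacy
[Figure: clients add noise (+) to their reports before sending; non-participants of FL]
Differential Privacy
• Adding noise limits the difference between outputs (whatever the input, the output looks almost the same).
• The noise scale depends on how strongly the input can affect the output (here, the L2 norm of the gradient).
• Aggregating many reports has the effect of cancelling out the noise.
Local Differential Privacy
• Balances client privacy with server utility.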
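The three points above (bounding influence via the L2 norm, adding noise, and noise cancelling under aggregation) can be sketched as follows. This is a toy under assumptions: every client holds the same made-up gradient, `sigma` is a hypothetical noise multiplier, and the mapping from `sigma` to a concrete ε is not shown.

```python
import math
import random

def clip_l2(grad, clip_norm):
    """Scale the gradient down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def local_report(grad, clip_norm, sigma, rng):
    """Client side: clip, then add Gaussian noise scaled to the clipping norm."""
    clipped = clip_l2(grad, clip_norm)
    return [g + rng.gauss(0.0, sigma * clip_norm) for g in clipped]

def aggregate(reports):
    """Server side: average the noisy reports coordinate by coordinate."""
    n = len(reports)
    return [sum(col) / n for col in zip(*reports)]

rng = random.Random(0)
true_grad = [0.3, -0.4]  # L2 norm 0.5, already within the clip norm of 1.0
reports = [local_report(true_grad, clip_norm=1.0, sigma=2.0, rng=rng)
           for _ in range(20000)]
estimate = aggregate(reports)  # close to true_grad despite heavy per-client noise
```

Each individual report is dominated by noise, while the standard deviation of the average shrinks with the square root of the number of reports.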

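Experiments: FL + LDP + Shuffling
Dataset
• MNIST
FL settings
• Number of clients: 10,000,000
• Samples per client: 5
• Aggregation batch size: 1,000
LDP mechanism
• Fed. DP-SGD: a method based on Gaussian noise
  • DP-SGD adjusted so that it guarantees LDP
  • ε = 8 (local DP)
• Subsampling w/ Shuffler
  • A mechanism that ensures anonymity after noise is added
  • 8-Local DP → 2.7-Central DP
Result: 92% accuracy achieved.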

Ethics Test for Language Model

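Ethics Test for Language Model
[Diagram labels: AI Risk Assessment; Countermeasures; Language Model; Update; xxxx. Risk categories: Toxicity, Privacy, Fairness, Robustness]
Aim: establish test tools and countermeasure techniques to assure the trustworthiness of large language models.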

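Adversarial Trigger [Wallace+, 2019]
• A technique that uses AI to search for triggers that induce a language model to produce harmful expressions
• Searches for the trigger that maximizes the likelihood of harmful expressions
[Figure: an example of a harmful expression that the language model was induced to generate by a trigger learned and searched for with an external model (GPT-2)]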
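The search target stated above can be written as an optimization problem. The notation is an assumption for illustration and does not appear on the slide: t is the trigger token sequence, 𝒴 is a set of harmful target expressions, and p_θ is the model used for the search (GPT-2 in the slide).

```latex
t^{*} \;=\; \arg\max_{t} \; \frac{1}{|\mathcal{Y}|} \sum_{y \in \mathcal{Y}} \log p_{\theta}\!\left(y \mid t\right)
```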

Self Diagnosis
• Have the language model itself rate how harmful an expression is
→ Realized through a prompt to the language model
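One way such a prompt-based check might look is sketched below. Everything here is an assumption for illustration: the prompt wording, the function names, and the idea of comparing the model's log-probabilities for a "Yes" and a "No" continuation. How those log-probabilities are obtained depends on the model API in use and is not shown.

```python
import math

def build_self_diagnosis_prompt(text, attribute="harmful language"):
    """Build a yes/no question asking the model to judge the given text."""
    return (
        f'"{text}"\n'
        f"Question: Does the above text contain {attribute}?\n"
        "Answer:"
    )

def harmfulness_score(logprob_yes, logprob_no):
    """Normalize the log-probabilities of 'Yes' and 'No' into a score in [0, 1]."""
    p_yes, p_no = math.exp(logprob_yes), math.exp(logprob_no)
    return p_yes / (p_yes + p_no)

prompt = build_self_diagnosis_prompt("some generated sentence")
# Hypothetical values standing in for what a model would return for this prompt.
score = harmfulness_score(logprob_yes=math.log(0.6), logprob_no=math.log(0.2))
```

A score near 1 would mean the model itself considers the text harmful; a threshold on this score is a design choice left open here.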

Next Steps
• Continue discussions and developments…
• Lots of ethical measurements
  • Fairness
    • Demographic Parity
    • Counterfactual Fairness
    • …
  • Robustness
    • Consistency against Adversarial Inputs
    • …
  • Toxicity
  • Credibility
  • …
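As one concrete instance of the fairness measurements listed above, demographic parity compares the rate of positive predictions across groups. The sketch below uses made-up predictions and group labels, and the function names are hypothetical.

```python
def positive_rate(predictions):
    """Fraction of predictions that are positive (1)."""
    return sum(predictions) / len(predictions)

def demographic_parity_difference(predictions, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: positive_rate(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates

# Toy data: binary model outputs and a sensitive attribute per example.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_difference(preds, groups)
# rates: {"a": 0.75, "b": 0.25}; gap: 0.5
```

A gap of 0 means every group receives positive predictions at the same rate; how large a gap is acceptable is a policy question rather than a technical one.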

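Conclusion
• Introduced LINE's efforts toward realizing trustworthy AI
• Technology alone is not enough; social consensus also needs to be built
• This is still largely unexplored territory, and much discussion and many collaborators are needed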

Links
• Trustworthy AI team introduction article: https://engineering.linecorp.com/ja/interview/mlprivacy_trustworthyai/
• New privacy-aware technology trends, with a focus on Federated Learning: https://speakerdeck.com/line_developers/federated-learning-with-differential-privacy
• ICDE2021 participation report: https://engineering.linecorp.com/ja/blog/icde2021-participation-report/
• LINE Publication List: https://engineering.linecorp.com/ja/research/
• [Job opening] AI Engineer / Researcher (AI Ethics): https://linecorp.com/ja/career/position/3235

