AI for security, security for AI

Jun Sakuma, U. Tsukuba / RIKEN AIP

Overview of AI/machine learning
[Figure: training data, learning algorithm, model, test data, prediction]

Overview of AI/machine learning: training the model
E.g., a model is trained to recognize face images of “Alice”.
[Figure: training face images labeled Bob, Alice, Bob, Alice]

Overview of AI/machine learning: obtaining a recognition/prediction
[Figure: an unlabeled test image (“???”) is recognized as “Alice”]

Deployment of AI to the real world
[Figure: the learning pipeline (training data, learning algorithm, model, test data, prediction) annotated with its human interfaces: data submission, usage of prediction, interpretation of models]
AI always has interfaces to humans.

Risk of AI in the real world
[Figure: the learning pipeline (training data, learning algorithm, model, test data, prediction) annotated with its human interfaces: data submission, usage of prediction, interpretation of models]
AI might be maliciously affected by adversarial entities.

Adversarial examples (AEs) of image recognition [Goodfellow et al., 2015]

How do AEs work?
[Figure labels: Panda-Gibbon boundary of humans; Panda-Gibbon boundary of a machine; region of AEs]
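The idea of an input that sits on the wrong side of the machine's boundary can be sketched with the fast gradient sign method from Goodfellow et al. (2015). The logistic model, its weights, and the input below are hypothetical toy values, not anything shown in the talk; a real attack would target a deep image classifier.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability that x belongs to class 1 under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """Fast gradient sign method: move every input coordinate by eps in the
    direction that increases the cross-entropy loss for the true label y."""
    p = predict(w, b, x)
    # For logistic regression, d(loss)/d(x_i) = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (assumptions for illustration only).
w, bias = [2.0, -3.0, 1.0, 0.5], 0.0
x, y = [0.3, 0.1, 0.2, 0.1], 1          # classified as class 1
x_adv = fgsm(w, bias, x, y, eps=0.15)   # each coordinate moves by at most 0.15

print("clean:", predict(w, bias, x))
print("adversarial:", predict(w, bias, x_adv))
```

A small, bounded change to every coordinate is enough to push this toy input across the model's decision boundary, which is the region the figure labels as AEs.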

Potential risk: Fooling speaker/face authentication

Image AEs in the physical world

Image AEs in the real world [Evtimov et al.]
• Paint a characteristic pattern on a physical road sign (AE)
• The recognition model misrecognizes the “STOP” sign as “speed limit 45 mph”

Bugs that work as AEs

Adversarial patch that fools YOLOv2
Thys et al., “Fooling automated surveillance cameras: adversarial patches to attack person detection”

Audio AEs in the physical world
• An audio AE works when fed to the model inside a computer, but not when played over the air
• Over-the-air attacks are not easy; none had been reported against RNN-based models
• Can AEs be a realistic risk for speech recognition?
• If an over-the-air attack works, it can affect a large number of devices
[Figure: an audio AE is played back over a broadcast, and several devices each recognize “Order 100 pizza!”]

Audio AE over the air [YS IJCAI’19]
• Audio AE: an attack on audio response systems (e.g., smartphones, smart speakers)
• The world's first audio AE that works against a deep model over the air
• Audio signals spread more widely than images (e.g., via speakers or broadcast)
• An audio AE can therefore affect a larger number of devices than an image AE
[Figure: original audio plus adversarial noise gives the audio AE, which DeepSpeech recognizes as “Hello World”]
https://yumetaro.info/projects/audio-ae/

Information leakage from deep models

Potential risk: Fooling face authentication

Model inversion against face recognition models
• Deep learning gives an abstract label to images
• Given a model, can we extract photorealistic images recognized as a specific label?
• If yes, it can be a risk (e.g., for face authentication)
[Figure: recognition maps a face image to the label “Keanu Reeves”; model inversion asks whether the label can be mapped back to an image]
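The question on this slide can be made concrete with a simple gradient-based inversion: start from a blank input and climb the model's confidence for the target label. This is a minimal sketch on a hypothetical linear softmax model with made-up weights; it is not the GAN-based approach the talk presents next, and on real face models this naive search generally does not yield photorealistic images.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probs(W, x):
    """W holds one weight row per label; returns label probabilities for x."""
    return softmax([sum(wi * xi for wi, xi in zip(row, x)) for row in W])

def invert(W, target, steps=200, lr=0.5):
    """Gradient ascent on the input to maximize log p(target | x),
    keeping every pixel inside [0, 1]."""
    x = [0.5] * len(W[0])                 # start from a uniform gray "image"
    for _ in range(steps):
        p = class_probs(W, x)
        # d log p_target / d x_j = W[target][j] - sum_k p_k * W[k][j]
        grad = [W[target][j] - sum(p[k] * W[k][j] for k in range(len(W)))
                for j in range(len(x))]
        x = [min(1.0, max(0.0, xj + lr * g)) for xj, g in zip(x, grad)]
    return x

# Hypothetical 2-label, 3-pixel model (assumption for illustration only).
W = [[1.0, -1.0, 0.5],
     [-1.0, 1.0, 0.0]]
recovered = invert(W, target=1)
print(recovered, class_probs(W, recovered)[1])
```

The attacker only needs to query the model's confidence and gradient; no training image is ever accessed.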

Basic idea of GANs: counterfeiting currency
• Minimax game between a generator and a discriminator
• Given a random number, the generator tries to generate currency that appears to be real
• Given either fake or real currency, the discriminator tries to distinguish them correctly
• Update the generator and the discriminator alternately until the generator defeats the discriminator
[Figure: random number, generator (generates fake), fake sample, real sample, discriminator (distinguishes fake from real)]

Base technology: generative adversarial network
• Competitive training of two deep neural networks
• Generator: generates fake images that look real
• Discriminator: discriminates between fake images and real images
[Figure: random number, generator, fake face images, real face images, discriminator (discriminates fake from real)]
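The minimax game can be written as the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximize and the generator tries to minimize. The sketch below only evaluates V from discriminator outputs; the output values are hypothetical, and no network is trained.

```python
import math

def gan_value(d_real, d_fake):
    """V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].
    d_real: discriminator outputs on real samples, each in (0, 1).
    d_fake: discriminator outputs on generated samples, each in (0, 1)."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical discriminator outputs (assumptions for illustration only).
confident = gan_value(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])  # D separates them
fooled = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])        # D can only guess

print(confident, fooled)
```

When the generator "defeats" the discriminator, the discriminator outputs 0.5 for everything and V drops to -log 4, lower than the value a confident discriminator achieves.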

PreImageGAN Demo

Deep fake

Style transfer by CycleGAN [Zhu et al., ICCV’17]

Threats of deep fakes
• Spreading disinformation via videos or photos
• Fabrication of evidence

Differential privacy
• Privacy protection with statistical techniques
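One standard statistical technique of this kind is the Laplace mechanism, sketched below for a counting query. The slide does not say which mechanism the talk had in mind, so the choice of mechanism, the records, and the function names here are all assumptions for illustration.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy.
    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon suffices."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)                     # fixed seed only for repeatability
ages = [23, 35, 41, 29, 52, 67, 38, 45]    # hypothetical records
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

A smaller epsilon means more noise and stronger privacy; the released value is unbiased, so aggregate statistics stay useful while any single record's presence is masked.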

Apple introduced “differential privacy” into iOS 10

Neural-attentive malware analysis
Yakura+, The 8th ACM Conference on Data and Application Security and Privacy (CODASPY’18), to appear
• What we usually do:
  • Disassemble samples and obtain instructions
  • Interpret instructions line by line
  • This is very time-consuming
• What we want:
  • Disassemble samples and obtain instructions
  • Find the instructions that characterize the malware automatically
[Figure: disassemble, then analyze by human]

Attention mechanism
• A network architecture that can be embedded into a CNN
• Finds regions in the image that are expected to characterize the target class
[Figure: examples of correspondences between regions in images and the underlined word in captions, obtained from the feature map [Xu+, 2015]]
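How attention singles out regions can be sketched as a softmax over per-region scores. Dot-product scoring against a query vector is an assumption chosen for brevity (it is not claimed to be the scoring used in [Xu+, 2015] or in the talk), and the feature values are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, query):
    """features: one vector per image region; query: vector of the same length.
    Returns (weights, context): one weight per region, summing to 1, and the
    weighted average of the region features."""
    scores = [sum(f * q for f, q in zip(region, query)) for region in features]
    weights = softmax(scores)
    context = [sum(w * region[d] for w, region in zip(weights, features))
               for d in range(len(query))]
    return weights, context

# Hypothetical 3-region feature map and query (assumptions for illustration).
features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query = [2.0, 0.0]
weights, context = attend(features, query)
print(weights, context)
```

The weights form the attention map: regions whose features match the query receive most of the weight, which is what lets an analyst see where the model is looking.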

Proposal: neural-attentive malware analysis
[Figure: disassemble, identify instructions with strong attention, then manual analysis]

Evaluation: Worm:Win32/Gaobot
• A malware family that spread explosively around 2004 to construct a large botnet [Liang+, 2007]
• Executes commands sent over IRC
• Intercepts HTTP/FTP communication to steal PayPal login information [Goel+, 2006]
[Figure: an image of Backdoor.Win32.Agobot.on, a sample belonging to Worm:Win32/Gaobot, and the attention map obtained for it]
Functions annotated on the attention map:
• A function to connect to an IRC server and enter a chat room
• A function to redirect packets to designated destinations to perform DDoS attacks
• A function to ascertain whether the contents of intercepted HTTP communication include strings like “PAYPAL”

AI as attacker, AI as defender
[Figure: a system facing a human attacker and AI as attacker (adversarial examples, model inversion), with AI as defender (malware analysis)]

Security of AI world
[Figure: AI to be secured, system to be secured, user to be secured; AI as attacker, AI as defender]


LINE Developers
