The Art of Black Eating Black: Adversarial Examples against Machine-Learning Malware Detection

Slide 1

Slide 1 text

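The Art of Black Eating Black: Adversarial Examples against Machine-Learning Malware Detection • Presenter: 林殿智 (Tien-Chih Lin) • Advisors: Birdman, PK, Benson, 大Alan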

Slide 2

Slide 2 text

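Whoami • 林殿智 / Tien-Chih Lin • Second-year master's student (graduated), Institute of Computer and Communication Engineering, National Cheng Kung University • Research • Car Security • AI-based Malware Detection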

Slide 3

Slide 3 text

"Black box vs. black box."

Slide 4

Slide 4 text

Machine Learning in Cyber Security https://www.blackhat.com/docs/us-15/materials/us-15-Klein-Defeating-Machine-Learning-What-Your-Security-Vendor-Is-Not-Telling-You.pdf

Slide 5

Slide 5 text

Adversarial Example in Image • "panda" (57.7% confidence) + 0.007 × "nematode" (8.2% confidence) = "gibbon" (99.3% confidence) • Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. "Explaining and harnessing adversarial examples." arXiv preprint arXiv:1412.6572 (2014).
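
The perturbation in this figure comes from the fast gradient sign method of the cited paper, x_adv = x + ε · sign(∇x J(θ, x, y)). The following is a minimal sketch of that formula on a toy logistic-regression model rather than an image network. The weights, the input, and ε = 0.25 are made-up illustrative values, not numbers from the talk or the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """Fast gradient sign method for logistic regression.

    For cross-entropy loss J, dJ/dx_i = (p - y) * w_i, so each input
    coordinate is moved by eps in the direction that increases the loss.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, sign)]

# Hypothetical toy model and input (illustrative values only).
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.3, 0.1, 0.2]            # treated as a class-1 sample (y = 1)
x_adv = fgsm(w, b, x, y=1, eps=0.25)

print(predict(w, b, x))        # above 0.5: classified as class 1
print(predict(w, b, x_adv))    # below 0.5: the bounded perturbation flips the label
```

Each coordinate moves by at most ε, which is why the change is hard to see in an image even though the predicted label changes.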

Slide 6

Slide 6 text

Adversarial Example in Malware Detection • Malicious sample + perturbation = classified as Benign

Slide 7

Slide 7 text

Problem of Perturbation • Image + perturbation = Image • PE + random modification = Broken PE

Slide 8

Slide 8 text

Attack Strategy • Modify feature vector • Modify raw PE • Specially crafted by experts • Gradient-based attack • Reinforcement learning

Slide 9

Slide 9 text

Feature Vector Perturbation Hu, Weiwei, and Ying Tan. "Generating adversarial malware examples for black-box attacks based on GAN." arXiv preprint arXiv:1702.05983 (2017).

Slide 10

Slide 10 text

Specially Crafted by Experts https://skylightcyber.com/2019/07/18/cylance-i-kill-you/

Slide 11

Slide 11 text

• AnalyzeFile hashed C:\Users\Administrator\Desktop\mimikatz_with_slight_modification.exe 143020851E35E3234DBCC879759322E8AD4D6D3E89EAE1F662BF8EA9B9898D05 • LocalAnalyzeItem LocalInfinity.ComputeScore begin • LocalAnalyzeItem, C:\Users\Administrator\Desktop\mimikatz_with_slight_modification.exe score -852 detector execution_control • Detected as 'Unsafe'! path:'C:\Users\Administrator\Desktop\mimikatz_with_slight_modification.exe' hash: 143020851E35E3234DBCC879759322E8AD4D6D3E89EAE1F662BF8EA9B9898D05

Slide 13

Slide 13 text

Gradient-based Attack Demetrio, Luca, et al. "Explaining Vulnerabilities of Deep Learning to Adversarial Malware Binaries." arXiv preprint arXiv:1901.03583 (2019).

Slide 14

Slide 14 text

Reinforcement Learning https://www.blackhat.com/docs/us-17/thursday/us-17-Anderson-Bot-Vs-Bot-Evading-Machine-Learning-Malware-Detection.pdf http://adsabs.harvard.edu/abs/2018arXiv180108917A

Slide 15

Slide 15 text

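Reinforcement Learning • Agent-environment loop (diagram): given state S_t and reward R_t, the Agent takes Action A_t; the Environment returns the next state S_t+1 and reward R_t+1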
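
The loop in this diagram can be sketched in a few lines. This is a toy illustration under stated assumptions, not the agent or environment used in the talk: `ToyDetectorEnv`, its action names, the hidden effect sizes, and the choice of "drop in score" as the reward are all hypothetical, and the agent is a simple epsilon-greedy value estimator. Nothing here touches real files or a real detector.

```python
import random

class ToyDetectorEnv:
    """Hypothetical stand-in for 'sample + detector'.

    The state is a single suspicion score; each action lowers it by a
    hidden amount that the agent has to discover.
    """
    EFFECT = {"rename_section": 0.05, "append_overlay": 0.15, "pack": 0.40}

    def __init__(self, threshold=0.5, max_turns=10):
        self.threshold = threshold
        self.max_turns = max_turns

    def reset(self):
        self.score = 1.0
        self.turns = 0
        return self.score                         # S_0

    def step(self, action):
        previous = self.score
        self.score = max(0.0, previous - self.EFFECT[action])
        self.turns += 1
        reward = previous - self.score            # R_t+1: drop in score (assumption)
        done = self.score < self.threshold or self.turns >= self.max_turns
        return self.score, reward, done           # S_t+1, R_t+1, episode end


def train(env, episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent that tracks the mean reward of each action."""
    rng = random.Random(seed)
    actions = list(env.EFFECT)
    value = {a: 0.0 for a in actions}
    count = {a: 0 for a in actions}
    for _ in range(episodes):
        env.reset()
        done = False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(actions)            # explore
            else:
                action = max(actions, key=value.get)    # exploit
            _, reward, done = env.step(action)          # A_t -> S_t+1, R_t+1
            count[action] += 1
            value[action] += (reward - value[action]) / count[action]
    return value

values = train(ToyDetectorEnv())
print(max(values, key=values.get))
```

The agent never sees the `EFFECT` table; it only observes rewards, which mirrors the black-box setting where the attacker can query the detector but not inspect it.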

Slide 16

Slide 16 text

Agent-environment loop (diagram): Agent, Environment, Action A_t, State S_t / S_t+1, Reward R_t / R_t+1

Slide 17

Slide 17 text

Action • Adding a function to the import address table that is never used • Manipulating existing section names • Creating new (unused) sections • Appending bytes to extra space at the end of sections • Creating a new entry point which immediately jumps to the original entry point • Removing signer information • Manipulating debug info • Packing or unpacking the file • Modifying (breaking) the header checksum • Appending bytes to the overlay (end of the PE file)
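
The last action in this list is the simplest to illustrate, since the overlay is just data that follows the end of the file's last section. The sketch below shows the file operation only, on a 64-byte placeholder file rather than a real PE; the helper name `append_overlay` and the file names are hypothetical.

```python
import os
import tempfile

def append_overlay(src_path, dst_path, payload):
    """Copy src_path to dst_path and append payload after the last byte.

    Returns the size before and after, so the caller can confirm that only
    the tail of the file changed.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(data)
        dst.write(payload)
    return len(data), len(data) + len(payload)

# Demo on a placeholder file (not a real PE image).
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "sample.bin")
modified = os.path.join(workdir, "sample_overlay.bin")
with open(original, "wb") as f:
    f.write(b"MZ" + bytes(62))

old_size, new_size = append_overlay(original, modified, b"\x00" * 16)
print(old_size, new_size)  # 64 80
```

Because the original bytes are left untouched, this kind of change alters static features such as file size and byte histograms without rewriting any existing content.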

Slide 18

Slide 18 text

LIEF for Modify PE https://github.com/lief-project/LIEF

Slide 19

Slide 19 text

State • Static Windows PE file features compressed to 2350 dimensions • General file information (size) • Header info • Section characteristics • Imported/exported functions • Strings • File byte and entropy histograms
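
As an illustration of the last feature group, here is a simplified sketch of a byte histogram and a whole-file Shannon entropy. It is an assumption about the general idea only: the exact feature definitions, windowing, and the compression to 2350 dimensions are not given on the slide and are not reproduced here.

```python
import math
from collections import Counter

def byte_histogram(data):
    """Normalized 256-bin histogram of byte values."""
    counts = Counter(data)
    total = len(data) or 1
    return [counts.get(value, 0) / total for value in range(256)]

def shannon_entropy(data):
    """Shannon entropy in bits per byte, between 0.0 and 8.0."""
    return sum(-p * math.log2(p) for p in byte_histogram(data) if p > 0)

constant = b"\x00" * 1024          # one repeated byte value
uniform = bytes(range(256)) * 4    # every byte value equally often

print(shannon_entropy(constant))   # 0.0
print(shannon_entropy(uniform))    # 8.0
```

Packed or encrypted regions tend toward the high end of this range, which is one reason such histograms are informative for a static classifier and also why packing changes the state the agent observes.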

Slide 20

Slide 20 text

Agent-environment loop (diagram): Agent, Environment, Action A_t, State S_t / S_t+1, Reward R_t / R_t+1

Slide 21

Slide 21 text

Attack Target in Original Work • Static PE malware classifier • Gradient boosted decision tree • Trained on 100,000 malicious and benign samples • ROC-AUC score of 0.993

Slide 22

Slide 22 text

Attack Windows Defender

Slide 23

Slide 23 text

Why Attack Win Defender? • Windows' built-in antivirus • 18% of Windows 7 and Windows 8 machines run Windows Defender • More than 50% of Windows 10 machines run Windows Defender • It earns the full score in AV-TEST • https://windowsreport.com/windows-defender-enterprise-antivirus/ https://www.av-test.org/en/antivirus/home-windows/

Slide 24

Slide 24 text

WannaCry -> upx

Slide 25

Slide 25 text

Porting Win Defender to Linux https://github.com/taviso/loadlibrary

Slide 26

Slide 26 text

Start Training Agent

Slide 27

Slide 27 text

Evade Rate • After 8 hours of training • < 40 actions • Evade rate: 81.2%

Slide 28

Slide 28 text

Before Modification

Slide 29

Slide 29 text

After Modification

Slide 34

Slide 34 text

https://i.blackhat.com/us-18/Thu-August-9/us-18-Bulazel-Windows-Offender-Reverse-Engineering-Windows-Defenders-Antivirus-Emulator.pdf

Slide 35

Slide 35 text

Conclusion • Classifiers have blind spots and hallucinations. • Avoid deploying the detection engine locally. • Restrict how often the engine can be queried. • Do not expose full information (such as the score) in the log.
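
The last two recommendations can be sketched as follows. This is an illustrative sketch, not a design from the talk: `QueryLimiter`, `public_verdict`, the limit of 3 queries per 60 seconds, and the rule that a negative score means "Unsafe" are all assumptions chosen for the example.

```python
from collections import defaultdict, deque

class QueryLimiter:
    """Allow at most max_queries per client within a sliding window (seconds)."""

    def __init__(self, max_queries, window):
        self.max_queries = max_queries
        self.window = window
        self.history = defaultdict(deque)

    def allow(self, client_id, now):
        recent = self.history[client_id]
        while recent and now - recent[0] >= self.window:
            recent.popleft()                  # forget queries outside the window
        if len(recent) >= self.max_queries:
            return False                      # over the limit: reject this query
        recent.append(now)
        return True

def public_verdict(score, threshold=0.0):
    """Return only a coarse label so the raw score never reaches the log."""
    return "Unsafe" if score < threshold else "Safe"

limiter = QueryLimiter(max_queries=3, window=60.0)
decisions = [limiter.allow("client-a", t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
print(decisions)             # [True, True, True, False, True]
print(public_verdict(-852))  # Unsafe
```

Both measures reduce the feedback available to an attacker: fewer queries mean fewer training steps, and a coarse label gives less signal per query than a numeric score.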

Slide 36

Slide 36 text

Thanks for Listening.