Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
マルウェアを機械学習する前に
Search
Yuma Kurogome
February 13, 2016
Programming
3
1.5k
マルウェアを機械学習する前に
Kaggle - Malware Classification Challenge勉強会 connpass.com/event/25007/ 発表資料
Yuma Kurogome
February 13, 2016
Tweet
Share
More Decks by Yuma Kurogome
See All by Yuma Kurogome
The Art of De-obfuscation
ntddk
16
27k
死にゆくアンチウイルスへの祈り
ntddk
55
38k
Windows Subsystem for Linux Internals
ntddk
10
2.9k
なぜマルウェア解析は自動化できないのか
ntddk
6
4k
Linear Obfuscation to Drive angr Angry
ntddk
4
810
CAPTCHAとボットの共進化
ntddk
2
1.1k
Peeling Onions
ntddk
7
3.5k
仮想化技術を用いたマルウェア解析
ntddk
8
27k
An Introduction to Drawbridge(ja)
ntddk
11
3.2k
Other Decks in Programming
See All in Programming
オートマトン学習しろ / Do automata learning
makenowjust
3
110
Some more adventure of Happy Eyeballs
coe401_
2
150
これからの時代の新標準!SwiftTestingへの移行とトラブルシューティング
uetyo
0
460
マルチモジュールにおけるテスト最適化
fxwx23
0
170
Rubyのobject_id
qnighy
6
1.3k
私のEbitengineの第一歩
qt_luigi
0
420
フロントエンドカンファレンス北海道2024 『小規模サイトでも使えるVite 〜HTMLコーディングをよりスマートに〜』長谷川広武(ハム)
h2ham
1
2.5k
ドメイン駆動設計を実践するために必要なもの
bikisuke
3
300
仮想ファイルシステムを導入して開発環境のストレージ課題を解消する
segadevtech
2
420
XStateでReactに秩序を与えたい
gizm000
0
440
快適な開発と高セキュリティを実現するCryptoKitを活用したCoreDataのデータ暗号化術
grandbig
1
310
Playwrightから始めるVisual Regression Testingのススメ by とっと
totto2727
2
1.8k
Featured
See All Featured
Robots, Beer and Maslow
schacon
PRO
157
8.1k
The Invisible Side of Design
smashingmag
295
50k
Six Lessons from altMBA
skipperchong
26
3.3k
How to Ace a Technical Interview
jacobian
275
23k
The Cost Of JavaScript in 2023
addyosmani
39
5.2k
Creatively Recalculating Your Daily Design Routine
revolveconf
215
12k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
22
1.7k
Writing Fast Ruby
sferik
623
60k
The Language of Interfaces
destraynor
153
23k
10 Git Anti Patterns You Should be Aware of
lemiorhan
653
58k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
45
4.8k
GitHub's CSS Performance
jonrohan
1029
450k
Transcript
@ntddk Kaggle - Malware Classification Challenge 2016.02.13 1
• http://ntddk.github.io/ • 2
3
4
Kaggle 5 https://www.kaggle.com/
6 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
7 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
8 There ain't no such thing as a free lunch
http://www.amazon.co.jp/dp/4150117489 http://www.amazon.co.jp/dp/B00GJMUKMG/ http://www.amazon.co.jp/dp/4150312133/
9 There ain't no such thing as a free lunch
http://www.amazon.co.jp/dp/4150117489 http://www.amazon.co.jp/dp/B00GJMUKMG/ http://www.amazon.co.jp/dp/4150312133/
10 http://blog.kaggle.com/
11 x η g a b c x …
12 x η g a b c x …
13 • • A B Satoshi Watanabe, Knowing and Guessing
― Quantitative Study of Inference and Information John Wiley & Sons, 1969.
14 • • A B Satoshi Watanabe, Knowing and Guessing
― Quantitative Study of Inference and Information John Wiley & Sons, 1969.
15 • • • •
16 https://www.av-test.org/en/statistics/malware/
17 http://www.mcafee.com/jp/resources/reports/rp-quarterly-threat-q2-2015.pdf
18 http://www.mcafee.com/jp/resources/reports/rp-quarterly-threat-q2-2015.pdf http://www.mcafee.com/jp/resources/reports/rp-threats-predictions-2016.pdf
19 • KERNEL32!VirtualAllocStub • KERNEL32!VirtualProtectStub • KERNEL32!OpenProcessStub • KERNEL32!OpenThreadStub •
…
20 CSEC: MWS: http://www.iwsec.org/mws/2015/about.html
21 https://www.kaggle.com/c/malware-classification/data 16
22 • https://virusshare.com/ • http://malware-traffic-analysis.net/
23 • • • •
24 • • • • API PE
25 https://github.com/corkami/
26 • • • • • •
27 #include <windows.h> typedef int (WINAPI *LPFNMESSAGEBOXW)(HWND, LPCWSTR, LPCWSTR, UINT);
int main() { HMODULE hmod = LoadLibrary(TEXT("user32.dll")); LPFNMESSAGEBOXW lpfnMessageBoxW = (LPFNMESSAGEBOXW)GetProcAddress(hmod, "MessageBoxW"); lpfnMessageBoxW(NULL, L"Hello, world!", L"Test", MB_OK); FreeLibrary(hmod); return 0; } •
28 { "category": "registry", "status": true, "return": "0x00000000", "timestamp": "2015-05-24
02:46:50,773", "thread_id": "3220", "repeated": 0, "api": "NtOpenKey", "arguments": [ { "name": "DesiredAccess", "value": "33554432" }, { "name": "KeyHandle", "value": "0x00000154" }, { "name": "ObjectAttributes", "value": "¥¥REGISTRY¥¥USER¥¥S-1-5-21-916742657-1382504153-4155998892-1001" } ], "id": 83 },
29 • • • ※ David H. Wolpert, The Supervised
Learning No-Free-Lunch Theorems, In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, pp.25-42, 2001.
30 • AdaBoost, Gradient Boosting • Kaggle
DAF 31 Mohammad M. Masud, Latifur Khan, Bhavani Thuraisingham, A
scalable multi-level feature extraction technique to detect malicious executables, Information Systems Frontiers, Vol.10, Issue.1, pp.33-45, 2008. 16 DAF: Derived Assembly Features BFS: Binary N-gram Features