Automating Vulnerability Analysis with Fuzzing and Triage Techniques

Ren Kimura, CEO, Ricerca Security, Inc.
Security Camp 2021, Threat Analysis Track

Ren Kimura
CEO, Ricerca Security, Inc.
Visiting Researcher, Carnegie Mellon University (Sep. 2018 - Sep. 2019)
CVEs: CVE-2019-14247, CVE-2019-14248, CVE-2019-14249, CVE-2019-14250, CVE-2019-16161, CVE-2019-16162, CVE-2019-16163, CVE-2019-16164, CVE-2019-16165, CVE-2019-16166, CVE-2019-16167, CVE-2019-19725
Reported vulnerabilities: 18; CVEs: 13
Other activities: https://rkx1209.github.io/about/

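Advances in automated cyber attack technology (hacking AI)

As software and hardware become more complex and diverse, finding vulnerabilities by hand is getting harder. In 2016, DARPA, an agency of the U.S. Department of Defense, spent 5.6 billion yen to hold a hacking AI contest, the DARPA Cyber Grand Challenge (CGC).
-> Almost every top-ranked team used techniques called symbolic execution and fuzzing.

[Photo: the servers running each team's hacking AI at the 2016 DARPA Cyber Grand Challenge venue]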

Symbolic Execution: Finding Vulnerabilities with an SMT Solver

Overview of symbolic execution

    int main(int argc, char *argv[]) {
        read(input);
        if (input[0] == 'G') {
            if (input[1] == 'E') {
                if (input[2] == 'T') {
                    ....
                }
            }
        }
        if (!strcmp(input, "HTTP")) {
            ...
            StackBOF(); // Vulnerable Code
        }
        if (isinteger(input)) {
            ...
        }
    }

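Overview of symbolic execution

[Figure: Control Flow Graph (CFG) of the target program. Legend: conditional branch, unconditional jump, path containing the vulnerability. Branch nodes: if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")). Example inputs at the leaves: "GET<\x00", "HTTP", "501", "HTTP1.?"]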

Overview of symbolic execution: walking the CFG

The input (input) is treated as a symbol and the program is executed in an interpreter. Each step below refers to the same CFG, with the example inputs "GET<\x00", "HTTP", "501", and "HTTP1.?".

(1) At if (input[0] == 'G'), add the condition sym_i0 == 'G' and fork.
    SMT solver query: sym_i0 == 'G' (true) / sym_i0 != 'G' (false)

(2) Of the two forked paths, select the true side.
    SMT solver query: sym_i0 == 'G'

(3) At if (input[1] == 'E'), add the condition sym_i1 == 'E' and fork.
    SMT solver query: sym_i0 == 'G' && sym_i1 == 'E' (true) / sym_i0 == 'G' && sym_i1 != 'E' (false)

(4) Of the two forked paths, select the true side.
    SMT solver query: sym_i0 == 'G' && sym_i1 == 'E'

(5) At if (input[2] == 'T'), add the condition sym_i2 == 'T' and fork.
    SMT solver query: sym_i0 == 'G' && sym_i1 == 'E' && sym_i2 == 'T' (true) / sym_i0 == 'G' && sym_i1 == 'E' && sym_i2 != 'T' (false)

(6) Of the two forked paths, select the true side.
    SMT solver query: sym_i0 == 'G' && sym_i1 == 'E' && sym_i2 == 'T'
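The fork-and-accumulate walk above can be sketched in a few lines of Python. This is a toy model rather than how angr or any real engine works: the branch table `BRANCHES`, the tuple encoding of constraints, and the helper `solve` are all assumptions made for illustration, and the stand-in "solver" only understands byte equalities and inequalities.

```python
# Toy model of forking symbolic execution over the nested 'G', 'E', 'T' checks.
# A constraint is (index, op, byte), meaning: input[index] op byte.
BRANCHES = [(0, ord("G")), (1, ord("E")), (2, ord("T"))]


def explore(branches):
    """Return the path condition of every path through the nested ifs."""
    finished = []
    state = []  # path condition accumulated so far
    for index, byte in branches:
        # Fork: the false side leaves the nested ifs, so that path ends here.
        finished.append(state + [(index, "!=", byte)])
        # Keep following the true side.
        state = state + [(index, "==", byte)]
    finished.append(state)  # the path on which all checks passed
    return finished


def solve(path_condition, length=3):
    """Tiny stand-in for an SMT solver: choose one byte per position."""
    out = []
    for i in range(length):
        required = {b for (j, op, b) in path_condition if j == i and op == "=="}
        banned = {b for (j, op, b) in path_condition if j == i and op == "!="}
        if len(required) > 1 or required & banned:
            return None  # UNSAT: contradictory constraints on this byte
        if required:
            out.append(next(iter(required)))
        else:
            out.append(next(v for v in range(ord("a"), 256) if v not in banned))
    return bytes(out)


paths = explore(BRANCHES)
for condition in paths:
    print(condition, "->", solve(condition))
```

Three branches yield four path conditions here because the checks are nested; independent branches would multiply the number of states instead.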

Finding bugs with symbolic execution

On each path, add a condition to the query to check whether the bug manifests there. If a stack buffer overflow can occur on the path, the query with PC == 0x4141414141414141 added becomes satisfiable.

Path through the 'G', 'E', 'T' checks:
    sym_i0 == 'G' && sym_i1 == 'E' && sym_i2 == 'T' && PC == 0x4141414141414141
    -> UNSAT

Path through !strcmp(input, "HTTP"), true side of if (input[5] < 0x10):
    sym_i0 != 'G' && sym_i1-i4 == "HTTP" && sym_i5 < 0x10 && PC == 0x4141414141414141
    -> UNSAT

Path through !strcmp(input, "HTTP"), false side of if (input[5] < 0x10):
    sym_i0 != 'G' && sym_i1-i4 == "HTTP" && sym_i5 >= 0x10 && PC == 0x4141414141414141
    -> SAT, with the input "HTTP1.?AAAAAAAAAAAAAAAAAAAAAAAA"

Finding vulnerabilities with symbolic execution (Exercise 1)

    pip install angr
    sudo sysctl -w kernel.randomize_va_space=0
    gcc -z execstack -no-pie -fno-stack-protector -o stack_bof_tiny stack_bof_tiny.c
    python ./simple_sym_stack_bof.py stack_bof_tiny

Reference: https://github.com/ChrisTheCoolHut/Zeratool

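Summary so far

Symbolic execution:
• A technique for finding vulnerabilities automatically
• Runs the program in an interpreter, analyzes the conditions along the path taken, and turns them into an SMT query
• Adds the condition the developer wants to test (e.g., the condition under which a stack buffer overflow occurs) to the SMT query
• Solves the query with an SMT solver to generate an input that satisfies the condition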

Techniques for Speeding Up Symbolic Execution: Reducing Path Conditions by Assuming the Input's Size and Value

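The path explosion problem in symbolic execution

The symbolic execution used in AEG forks every time it passes a conditional branch that depends on a symbol, so the number of states keeps growing and a combinatorial explosion occurs.

Program:

    #include <stdio.h>

    int main(int argc, char *argv[]) {
        char buf[100] = {};
        int i = 0;
        printf("buf = %p\n", buf);
        char *input = argv[1];
        while (input[i] != '\0') {  // strcpy
            buf[i] = input[i];
            i++;
        }
        puts(buf);
        return 0;
    }

Conditions accumulated by the loop, one fork per iteration:
    input[0] != '\x0', input[1] != '\x0', input[2] != '\x0', ..., input[n] != '\x0'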

Performance overhead of symbolic execution (Exercise 2)

    gcc -z execstack -no-pie -fno-stack-protector -o stack_bof stack_bof.c
    python ./simple_sym_stack_bof.py stack_bof

A technique for speeding up symbolic execution (Preconditioned Symbolic Execution)

This technique prunes unnecessary paths by attaching a "strong" assumption (a precondition) to the symbolic input in advance. Each path condition is ANDed with the precondition, and any path that becomes unsatisfiable is abandoned.

Program: the same strcpy-style loop as on the path explosion slide.

Path conditions from the loop:
    input[0] != '\x0', input[1] != '\x0', input[2] != '\x0', ..., input[n] != '\x0'

Precondition:
    input[0] == '\x0' & input[1] == '\x0' & ...

Speeding up by assuming the input size (with known length)

Assume an input that is likely to trigger a stack buffer overflow: obtain the size of buf (100) by static analysis, then assume the input is larger than that (e.g., 110).

Program: the same strcpy-style loop as on the path explosion slide.

Precondition:
    input[0] != '\x0' & input[1] != '\x0' & ... & input[109] == '\x0'

The precondition is ANDed onto each state, and unsatisfiable state paths are discarded:
    input[0] == '\x0' & (input[0] != '\x0' & ...)  -> UNSAT
    input[1] == '\x0' & (input[0] != '\x0' & input[1] != '\x0' & ...)  -> UNSAT
    ...
    input[109] != '\x0' & (input[0] != '\x0' & input[1] != '\x0' & ... & input[109] == '\x0')  -> UNSAT

Only one path survives:
    input[0] != '\x0' & input[1] != '\x0' & ... & input[109] == '\x0'
    -> solved as "AAAAAAA………\x0"
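The pruning effect of the known-length precondition can be illustrated with a small Python model. It is only a sketch under stated assumptions: each loop state is reduced to the index of its first NUL byte, `MAX_LEN` is a hypothetical bound on the symbolic input, and the satisfiability check is a plain dictionary comparison rather than an SMT query.

```python
# Each state of the strcpy-style loop is identified by k, the index of the
# first NUL byte: input[0..k-1] != 0 and input[k] == 0.
MAX_LEN = 200    # hypothetical bound on the symbolic input length
KNOWN_LEN = 109  # precondition from the slide: the first NUL is at index 109


def path_condition(k):
    """Constraints for the path that leaves the loop at index k."""
    condition = {i: "!=0" for i in range(k)}
    condition[k] = "==0"
    return condition


def satisfiable_with(condition, precondition):
    """AND two conditions; UNSAT if any index would need both ==0 and !=0."""
    return all(precondition.get(i, op) == op for i, op in condition.items())


precondition = path_condition(KNOWN_LEN)
all_paths = [path_condition(k) for k in range(MAX_LEN)]
kept = [c for c in all_paths if satisfiable_with(c, precondition)]

print(len(all_paths), "paths without the precondition,", len(kept), "with it")
```

In a real engine the check would run at fork time, so the discarded states are never expanded in the first place; this model enumerates them only to make the difference countable.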

Speeding up by assuming the input value (Concolic Testing)

This technique prunes unnecessary paths by assuming a single concrete value for the symbolic input as the precondition. Each path condition is ANDed with the precondition, and any path that becomes unsatisfiable is abandoned.

Program:

    int main(int argc, char *argv[]) {
        char buf[100] = {};
        int i = 0;
        printf("buf = %p\n", buf);
        char *input = parse_header(argv[1]);
        while (input[i] != '\0') {  // strcpy
            buf[i] = input[i];
            i++;
        }
        puts(buf);
        return 0;
    }

parse_header contains a huge number of state paths, on the order of thousands, so plain symbolic execution suffers path explosion before it ever reaches the loop conditions input[0] != '\x0', input[1] != '\x0', ..., input[n] != '\x0'.

Concolic testing runs the program with an initial seed (a concrete value) and adds the condition that the symbolic input matches that concrete value exactly. With a well-chosen concrete value, execution gets through parse_header.

Initial seed:
    "\x7fELF\x02\x01\x01……\x0"

Precondition:
    input[0] == '\x7f' & input[1] == 'E' & input[2] == 'L' & input[3] == 'F' & input[4] == '\x02' & input[5] == '\x01' & ...
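A minimal Python sketch of the concolic idea follows: the program runs on a concrete seed, and only the branch conditions on that one concrete path are recorded. The body of `parse_header` here is a hypothetical stand-in that checks the four magic bytes only (the real function is not shown on the slides), and the constraint tuples are an encoding chosen for this example.

```python
from typing import List, Optional, Tuple

Constraint = Tuple[int, str, int]  # (index, "==" or "!=", byte value)
MAGIC = b"\x7fELF"


def parse_header(data: bytes, path_condition: List[Constraint]) -> Optional[bytes]:
    """Hypothetical parser: checks the magic bytes and logs every branch."""
    for i, expected in enumerate(MAGIC):
        matched = i < len(data) and data[i] == expected
        path_condition.append((i, "==" if matched else "!=", expected))
        if not matched:
            return None  # rejected: this concrete path ends here
    return data[len(MAGIC):]


def concolic_run(seed: bytes):
    """Execute on a concrete seed; return the result and its path condition."""
    path_condition: List[Constraint] = []
    body = parse_header(seed, path_condition)
    return body, path_condition


body, condition = concolic_run(b"\x7fELF\x02\x01\x01")
print(body)
print(condition)
```

Because the seed decides every branch, no forking happens inside the parser: a good seed produces exactly one path condition that already reaches the code after the header check.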

Improving Concolic Testing (White box Fuzzing)

    int main(int argc, char *argv[]) {
        read(input);
        if (input[0] == 'G') {
            if (input[1] == 'E') {
                if (input[2] == 'T') {
                    ....
                }
            }
        }
        if (!strcmp(input, "HTTP")) {
            ...
            StackBOF(); // Vulnerable Code
        }
        if (isinteger(input)) {
            ...
        }
    }

[Figure: CFG of the program above, with the example inputs "GET<\x00", "HTTP", "501", "HTTP1.?" at the leaves]

White box Fuzzing (SAGE)

Initial seed: "a"

(1) Run the program with the initial seed "a".
(2) Build the constraint formula from the execution trace:
    (input[0] != 'G') & (strcmp(input, "HTTP")) & (!isinteger(input))
(3) Negate (NOT) one of the constraints in the formula:
    (input[0] == 'G') & (strcmp(input, "HTTP")) & (!isinteger(input))
(4) Solve the formula from (3) with an SMT solver.
    -> Adopt the solution "G" as the next seed.

White box Fuzzing (SAGE): seed "G"

(1) Run the program with the seed "G".
(2) Build the constraint formula from the execution trace:
    (input[0] == 'G') & (input[1] != 'E') & (input[2] != 'T')
(3) Negate (NOT) one of the constraints in the formula:
    (input[0] == 'G') & (input[1] == 'E') & (input[2] != 'T')
(4) Solve the formula from (3) with an SMT solver.
    -> Adopt the solution "GE" as the next seed.

White box Fuzzing (SAGE): seed "GE"

(1) Run the program with the seed "GE".
(2) Build the constraint formula from the execution trace:
    (input[0] == 'G') & (input[1] == 'E') & (input[2] != 'T')
(3) Negate (NOT) one of the constraints in the formula:
    (input[0] == 'G') & (input[1] == 'E') & (input[2] == 'T')
(4) Solve the formula from (3) with an SMT solver.
    -> Adopt the solution "GET" as the next seed.

White box Fuzzing (SAGE): seed "GET"

These three steps generated three inputs in sequence: "G" -> "GE" -> "GET".
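The run / trace / negate / solve loop above can be sketched in Python as follows. This is a simplified model, not SAGE itself: it covers only the nested 'G', 'E', 'T' chain, it negates only the last constraint of each trace (SAGE negates each constraint in turn), and the "solver" exploits the fact that every constraint is a single byte equality. All function names are made up for the example.

```python
def run(data: bytes):
    """Instrumented toy target: returns the branch constraints of this run.
    Each entry is (index, expected_byte, taken)."""
    trace = []
    for index, expected in enumerate(b"GET"):
        taken = len(data) > index and data[index] == expected
        trace.append((index, expected, taken))
        if not taken:
            break  # nested ifs: a failed check ends the chain
    return trace


def negate_last_and_solve(data: bytes, trace):
    """Negate the final, not-taken constraint and produce a matching input."""
    index, expected, taken = trace[-1]
    if taken:
        return None  # every branch was taken; nothing left to negate
    # The earlier constraints are equalities that data already satisfies,
    # so the solution keeps that prefix and fixes byte `index`.
    return data[:index] + bytes([expected])


def whitebox_fuzz(seed: bytes):
    """Repeat run -> trace -> negate -> solve until no new input appears."""
    seeds = [seed]
    while True:
        next_seed = negate_last_and_solve(seeds[-1], run(seeds[-1]))
        if next_seed is None:
            return seeds
        seeds.append(next_seed)


print(whitebox_fuzz(b"a"))
```

Starting from "a", the loop reproduces the sequence on the slides: one solver call per newly reached branch.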

The scheduling problem in white box fuzzing

[Figure: seed tree rooted at "a" (Initial Seed) with the seeds "G", "HTTP", "501", "GE", "GET", "GET<\x00", "HTTP1.?"; the order in which seeds are generated is numbered (1) to (5)]

To hit "HTTP1.?AAAA….", the input that triggers the vulnerability, sooner, the order "a" -> "HTTP" is the better one. Which constraint should be negated next, and which path should be selected next, to generate a seed?
-> This is the path selection scheduling problem.

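Performance overhead of white box fuzzing

White box fuzzing builds its constraint formula from very long state paths, such as loops and recursion, then negates one constraint in a formula made of thousands of terms and uses an SMT solver to generate the next input.

Constraint formula (a conjunction of thousands of terms):
    { C1 & C2 & C3 & C4 & C5 … C100 & C101 & C102 … C1000 & !C1001 & C1002 … C2360 & C2361 & C2362 … }

Performance overhead: after the negation (NOT), solving for the next path can take the solver several days.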

Techniques for speeding up symbolic execution

[Figure: how the techniques and the systems relate. Techniques: Symbolic Execution, Preconditioned Symbolic Execution, Concolic Testing, White box Fuzzing. Systems: AEG [*1], DART [*2], Mayhem [*3], SAGE [*4], Driller (hybrid fuzzing) [*5], simple_aeg_stack_bof.py]

[*1] Thanassis Avgerinos, et al. "AEG: Automatic Exploit Generation" [NDSS'11]
[*2] Patrice Godefroid, et al. "DART: Directed Automated Random Testing" [PLDI'05]
[*3] Sang Kil Cha, et al. "Unleashing MAYHEM on Binary Code" [S&P'12]
[*4] Patrice Godefroid, et al. "Automated Whitebox Fuzz Testing" [NDSS'08]
[*5] Nick Stephens, et al. "Driller: Augmenting Fuzzing Through Selective Symbolic Execution" [NDSS'16]

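Summary so far

Symbolic execution:
• A technique for finding vulnerabilities automatically
• Runs the program in an interpreter, analyzes the conditions along the path taken, and turns them into an SMT query
• Adds the condition the developer wants to test (e.g., the condition under which a stack buffer overflow occurs) to the SMT query
• Solves the query with an SMT solver to generate an input that satisfies the condition
• Because every condition on the path goes into the query, loops and similar constructs produce enormous queries and performance drops

Concolic execution:
• A technique that speeds up symbolic execution by assuming the value of the input, which reduces the number of conditions

White box Fuzzing:
• A technique that performs concolic execution with an assumed input value, negates one of the generated conditions to produce the next input value, and repeats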

Fuzzing: Grey box Fuzzing

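Fuzzing

Like symbolic execution, fuzzing is a technique for automating vulnerability discovery. It generates a large number of inputs and keeps running the program under test with them.
-> The longer it runs, the higher the chance of hitting an input that triggers a vulnerability.

[Figure: a fuzzer feeds inputs such as "a", "\kqfa", "AAAAAAA...", "xjoiau3" to the program under test; one of them triggers the vulnerability. Caption: in 5 hours, 1 vulnerable input found out of 10,000]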

Grey box Fuzzing (AFL)

Mutation operations:
• BitFlip(input, n): invert the n-th bit of input
• ByteFlip(input, n): invert the n-th byte of input
• Arithm(input, n): apply an arithmetic operation to the n-th byte of input
• Insert(input, n, m, s): insert [m, s] at the n-th byte of input
• ...
• Havoc(input): a random combination of the operations above

[Figure: CFG of the target program with the example inputs "GET<\x00", "HTTP", "501", "HTTP1.?"; the initial seed is "a"]

Bit flip (BitFlip mutation)

Inverts the n-th bit of the input. Used by many fuzzers, including AFL and libFuzzer.
    00101001 -> BitFlip -> 00101101

Byte flip (ByteFlip mutation)

Inverts the n-th byte of the input. Used by many fuzzers, including AFL and libFuzzer.
    00101001 -> ByteFlip -> 11010110

Arithmetic (Arithmetic mutation)

Adds, subtracts, multiplies, or divides the n-th byte of the input by various values.
    00101001 -> Arithmetic operation (+ - * /) with 256, U32_MAX, U32_MIN -> 11010110

Insert/Delete (Insert/Delete mutation)

Deletes m bytes starting at the n-th byte of the input, or inserts an m-byte sequence there.
    00101001 -> Insert/Delete -> 00101001 00101001

[Figure: the four mutations above applied side by side to the same byte 00101001]
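The mutation operators described above are small enough to write out directly. The sketch below is an illustrative Python version, not AFL's C implementation: the function names and signatures are chosen for this example, bits are counted from the most significant bit of byte 0, and the range of the arithmetic delta in `havoc` is an arbitrary assumption.

```python
import random


def bit_flip(data: bytes, n: int) -> bytes:
    """Flip bit n, counting from the most significant bit of byte 0."""
    out = bytearray(data)
    out[n // 8] ^= 0x80 >> (n % 8)
    return bytes(out)


def byte_flip(data: bytes, n: int) -> bytes:
    """Invert all eight bits of byte n."""
    out = bytearray(data)
    out[n] ^= 0xFF
    return bytes(out)


def arith(data: bytes, n: int, delta: int) -> bytes:
    """Add delta to byte n, wrapping modulo 256."""
    out = bytearray(data)
    out[n] = (out[n] + delta) % 256
    return bytes(out)


def insert(data: bytes, n: int, chunk: bytes) -> bytes:
    """Insert chunk before byte n."""
    return data[:n] + chunk + data[n:]


def delete(data: bytes, n: int, m: int) -> bytes:
    """Delete m bytes starting at byte n."""
    return data[:n] + data[n + m:]


def havoc(data: bytes, rng: random.Random, rounds: int = 4) -> bytes:
    """Stack several randomly chosen mutations on top of each other."""
    for _ in range(rounds):
        if not data:  # nothing to flip: grow the input instead
            data = insert(data, 0, bytes([rng.randrange(256)]))
            continue
        op = rng.choice(["bit", "byte", "arith", "insert", "delete"])
        n = rng.randrange(len(data))
        if op == "bit":
            data = bit_flip(data, rng.randrange(len(data) * 8))
        elif op == "byte":
            data = byte_flip(data, n)
        elif op == "arith":
            data = arith(data, n, rng.randint(-35, 35))
        elif op == "insert":
            data = insert(data, n, bytes([rng.randrange(256)]))
        else:
            data = delete(data, n, 1)
    return data


print(format(bit_flip(b"\x29", 5)[0], "08b"))   # the BitFlip example byte
print(format(byte_flip(b"\x29", 0)[0], "08b"))  # the ByteFlip example byte
print(arith(b"a", 0, ord("G") - ord("a")))      # "a" moved to "G" by arithmetic
```

Passing a seeded `random.Random` into `havoc` keeps a fuzzing run reproducible, which helps when a crash has to be replayed.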

Grey box Fuzzing (AFL): mutating the initial seed

The mutation operations are the same as listed above (BitFlip, ByteFlip, Arithm, Insert, ..., Havoc). Starting from the initial seed "a", on the same CFG:

• "a" -> Arithm(0) -> "G"
• "a" -> Havoc() -> "HTTP"
• "G" -> ByteFlip(1) -> "GE"

Grey box Fuzzing (AFL)

Seeds that correspond one-to-one with program coverage are stored in a tree structure. Many fuzzers, starting with AFL (American Fuzzy Lop), use coverage-guided fuzzing.

[Figure: seed tree rooted at "a" (initial seed) with the seeds "G", "HTTP", "501", "GE", "GET", "GET<\x00", "HTTP1.?", laid over the CFG of the target program]

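How program coverage is obtained

When the target program is compiled, a random number is embedded in advance into each node (basic block) of the CFG.

[Figure: the CFG nodes if (input[0] == 'G'), if (input[1] == 'E'), if (input[2] == 'T'), if (isinteger(input)), if (!strcmp(input, "HTTP")) annotated with the random numbers 2536, 124, 6210, 8147, 297, 4010]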

Edge coverage in grey box fuzzing

Each edge that execution passes through is hashed from the random numbers embedded in its two basic blocks (prevBB / nextBB) and mapped into memory:

    key = nextBB ^ (prevBB >> 1)
    shared_mem[key] = cnt

[Figure: CFG nodes annotated with the random numbers 2536, 124, 297, 4010, 6210, 8147]
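The edge hashing rule above can be tried out with a short Python sketch. Several details are assumptions made for the example: the bitmap is modeled as a `Counter` instead of a shared-memory byte array, `MAP_SIZE` is an assumed size, and the two paths are hypothetical sequences built from the block IDs in the figure (the slides do not say which ID belongs to which block).

```python
from collections import Counter

MAP_SIZE = 1 << 16  # assumed bitmap size for this sketch


def edge_key(prev_bb: int, next_bb: int) -> int:
    """Hash one edge as on the slide: key = nextBB ^ (prevBB >> 1)."""
    return (next_bb ^ (prev_bb >> 1)) % MAP_SIZE


def trace_to_bitmap(block_ids):
    """Turn the sequence of visited basic-block IDs into edge hit counts."""
    shared_mem = Counter()
    prev = 0  # no predecessor before the first block
    for cur in block_ids:
        shared_mem[edge_key(prev, cur)] += 1
        prev = cur
    return shared_mem


def has_new_coverage(bitmap, seen_keys) -> bool:
    """True if the run touched an edge key that no earlier run touched."""
    return any(key not in seen_keys for key in bitmap)


# Hypothetical paths built from the block IDs shown in the figure.
path_a = [2536, 124, 297]
path_b = [2536, 6210, 8147]

seen = set(trace_to_bitmap(path_a))
print(has_new_coverage(trace_to_bitmap(path_a), seen))  # same path again
print(has_new_coverage(trace_to_bitmap(path_b), seen))  # takes different edges
```

The right shift on prevBB is what makes the hash direction-sensitive: without it, the edges A -> B and B -> A would share a key, and every self-loop would hash to zero.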

Edge coverage in grey box fuzzing: binary-only targets

When no source code is available for the target program, the binary is executed under QEMU and the random numbers are assigned at the moment each basic block is passed. Execution on the emulator is 10 to 100 times slower, or more, than native execution.

[Figure: CFG nodes annotated with the random numbers 2536, 124, 6210, 8147, 297, 4010]

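Grey box Fuzzing: coverage information + genetic algorithm

How the genetic algorithm terms map onto grey box fuzzing:
• Fitness function: discovery of a new program path (an increase in coverage) ⇒ 1.0
• Seed scheduler (selection): chooses which test case (seed) to mutate
• Crossover / mutation: the operations that generate new test cases from the selected test case
• Generation: each round of newly generated seeds

[Figure: seed tree spanning generations 1 to 4, with the labels "\x0\x0\x0\x0" (Initial Seed), "a", "G", "HTTP", "501", "GE", "GET", "GET<\x00", "HTTP1.x"]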
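The loop of select, mutate, evaluate, and keep can be sketched as a tiny coverage-guided fuzzer in Python. This is a toy under explicit assumptions: `target` is a stand-in for the instrumented program and reports branch names directly, the seed scheduler is plain first-in-first-out, and an exhaustive single-byte mutation stage replaces AFL's randomized havoc so that the run is reproducible.

```python
def target(data: bytes):
    """Toy program under test; returns the set of covered branch names."""
    covered = {"entry"}
    if data[:1] == b"G":
        covered.add("G")
        if data[1:2] == b"E":
            covered.add("GE")
            if data[2:3] == b"T":
                covered.add("GET")
    return covered


def mutations(seed: bytes):
    """Deterministic stage: overwrite each byte, or append one byte."""
    for pos in range(len(seed) + 1):
        for value in range(256):
            yield seed[:pos] + bytes([value]) + seed[pos + 1:]


def fuzz(initial: bytes):
    """Keep every mutant that increases coverage as a new seed."""
    queue = [initial]
    seen = set(target(initial))
    i = 0
    while i < len(queue):  # FIFO seed scheduler
        for candidate in mutations(queue[i]):
            coverage = target(candidate)
            if not coverage <= seen:  # fitness: new coverage discovered
                seen |= coverage
                queue.append(candidate)
        i += 1
    return queue


print(fuzz(b"a"))
```

Each retained seed is one generation deeper than the seed it was mutated from, which is how the nested checks are passed one byte at a time.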

Fuzzing with AFL (Exercise 3)

Reference: https://tunnelshade.in/blog/2018/01/afl-internals-compile-time-instrumentation/

Build AFL and compile the target with afl-gcc:

    git clone https://github.com/google/AFL
    cd AFL && make && cd ..
    ./AFL/afl-gcc -z execstack -no-pie -fno-stack-protector -o stack_bof_tiny stack_bof_tiny.c

Check the instrumentation inserted by afl-gcc:

    gdb -q stack_bof_tiny
    (gdb) disas main
    Dump of assembler code for function main:
       0x0000000000031ba0 <+0>:     lea    -0x98(%rsp),%rsp
       0x0000000000031ba8 <+8>:     mov    %rdx,(%rsp)
       0x0000000000031bac <+12>:    mov    %rcx,0x8(%rsp)
       0x0000000000031bb1 <+17>:    mov    %rax,0x10(%rsp)
       0x0000000000031bb6 <+22>:    mov    $0xab10,%rcx
       0x0000000000031bbd <+29>:    callq  0x3caf8 <__afl_maybe_log>

Usage of afl-fuzz:

    ./afl-fuzz -i <initial seed dir> -t <time out> -m <memory limit> -o <output dir> -- <command line>
    ls <output dir>/queue    # seeds
    ls <output dir>/crashes  # inputs that cause crashes

Run it:

    ./prepare-afl.sh
    mkdir initial
    echo "ABC" > initial/seed.in
    ./AFL/afl-fuzz -i initial/ -t 10000 -m 1024 -o output -- ./stack_bof_tiny @@

Repeat the same steps to fuzz stack_bof as well.

Problems with grey box fuzzing

The mutation operations are the same as listed above (BitFlip, ByteFlip, Arithm, Insert, ..., Havoc).

    "a" (initial seed) -> Havoc() -> "HTTP"

Hitting "HTTP" by mutating "a" takes a very large number of attempts.

Problems with grey box fuzzing

Grey box fuzzing cannot get past comparisons against long strings or byte sequences.

    if (input == 0xdeadbeefcafebabe) { crash(); }

• Grey box fuzzing (mutation): "0x0" -> "0xdeadbeefcafebabe". The probability of generating it is 1/(2^64).
• White box fuzzing (SMT solve): "0x0" -> "0xdeadbeefcafebabe". The probability of generating it is 100%.
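To put the 1/(2^64) figure in perspective, the arithmetic can be worked through in Python. The throughput `EXECS_PER_SECOND` is a hypothetical number picked for illustration, and the model assumes each attempt is an independent, uniformly random value of the compared width, which real mutation strategies only approximate.

```python
def search_space(num_bytes: int) -> int:
    """Number of equally likely values for a num_bytes-wide comparison."""
    return 256 ** num_bytes


EXECS_PER_SECOND = 10_000  # hypothetical fuzzing throughput
SECONDS_PER_YEAR = 365 * 24 * 3600

for width in (1, 2, 4, 8):
    values = search_space(width)
    years = values / EXECS_PER_SECOND / SECONDS_PER_YEAR
    print(f"{width} byte(s): 1/{values} per attempt, "
          f"about {years:.3g} years to enumerate")
```

A one-byte check falls almost immediately, while the cost grows by a factor of 256 per extra byte; an SMT solver reads the required constant straight from the constraint instead.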

Combining Fuzzing and Symbolic Execution: Hybrid Fuzzing

Performance of white box fuzzing

White box fuzzing always uses the SMT solver to generate the next input.
-> When the path contains many long constraint sequences, such as those produced by loops, every solve takes a long time.

[Figure: seed tree rooted at "a" (Initial Seed) with the seeds "G", "HTTP", "501", "GE", "GET", "GET<\x00", "HTTP1.?"; every edge is labeled "SMT solve", in the order (1) to (5)]

Running SMT solve 7 times covers all paths and succeeds in finding the vulnerable input.

Performance of grey box fuzzing

Grey box fuzzing uses a randomized algorithm to generate the next input.
-> Simple constraints can be satisfied faster than with an SMT solver, but hard constraints cannot be satisfied at all.

[Figure: seed tree rooted at "a" (Initial Seed) with the seeds "G", "HTTP", "501", "GE", "GET", "GET<\x00", "HTTP1.?"; every edge is labeled "Fuzzing", in the order (1) to (5)]

Running fuzzing 7 times would cover all paths and find the vulnerable input, except that "HTTP" cannot be discovered.

Hybrid Fuzzing (Driller)

When grey box fuzzing fails to find a new seed after a fixed number of loop iterations, switch to white box fuzzing (Fuzzing -> SMT solve).

[Figure: seed tree rooted at "a" (Initial Seed) with the seeds "G", "HTTP", "501", "GE", "GET", "GET<\x00", "HTTP1.?"; five edges are labeled "Fuzzing" and two are labeled "SMT solve", in the order (1) to (5)]

Running SMT solve twice and grey box fuzzing five times covers all paths and succeeds in finding the vulnerable input.

1. Explore paths with grey box fuzzing (AFL).
2. Run white box fuzzing using the seeds generated in step 1.
3. Add the inputs newly generated in step 2 to the seeds and go back to step 1.
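The three-step loop above can be sketched in Python as a toy hybrid fuzzer. It is a heavily simplified illustration and not Driller's implementation: `target` is a stand-in program, the fuzzing stage uses exhaustive single-byte mutations for reproducibility, and `solve_stage` merely returns the constant from the stuck comparison where a real tool would trace the path symbolically and query an SMT solver.

```python
MAGIC = b"HTTP"


def target(data: bytes):
    """Toy program under test; returns the set of covered branch names."""
    covered = {"entry"}
    if data[:1] == b"G":
        covered.add("G")
        if data[1:2] == b"E":
            covered.add("GE")
    if data == MAGIC:  # whole-string comparison: hard for mutation
        covered.add("HTTP")
    return covered


def fuzz_stage(queue, seen) -> bool:
    """Grey box stage: single-byte overwrite or append on every seed."""
    found = False
    for seed in list(queue):
        for pos in range(len(seed) + 1):
            for value in range(256):
                candidate = seed[:pos] + bytes([value]) + seed[pos + 1:]
                coverage = target(candidate)
                if not coverage <= seen:
                    seen |= coverage
                    queue.append(candidate)
                    found = True
    return found


def solve_stage(queue, seen) -> bool:
    """White box stage stand-in: satisfy the comparison the fuzzer is stuck on."""
    candidate = MAGIC  # assumption: taken directly from the comparison operand
    coverage = target(candidate)
    if coverage <= seen:
        return False
    seen |= coverage
    queue.append(candidate)
    return True


def hybrid(initial: bytes):
    """Fuzz until stuck, fall back to solving, then resume fuzzing."""
    queue, seen, log = [initial], set(target(initial)), []
    while True:
        if fuzz_stage(queue, seen):
            log.append("fuzz")
        elif solve_stage(queue, seen):
            log.append("solve")
        else:
            return queue, log


print(hybrid(b"a"))
```

The expensive stage runs only once the cheap stage stops making progress, so solver calls are spent on the comparisons that mutation cannot pass.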

Triage

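Automatic Exploit Generation (AEG)

[Figure: pipeline. Vulnerability discovery ("fuzzing", "symbolic execution", ...) -> crash input -> triage ("exploitability assessment", "crash classification") -> classification result / PoC -> payload generation -> exploit code]

Automatic exploit generation (AEG) takes the vulnerabilities found by fuzzing or symbolic execution and adds deeper analysis stages, carrying the process through to exploitability assessment and exploit code generation.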

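Exploitability Assessment

Within triage, exploitability assessment is the technique that takes a crash input and either scores how exploitable it is or generates actual exploit code.

[Figure: Exploitability Assessment. Inputs: target program, crash input. Output: exploit code / score, e.g. "AAAAAAAAAAAAA\x31\xf6\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x56"]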

Exploit generation with symbolic execution

On each path, add a condition to the query to check whether the bug manifests there. For bug finding, the query on the vulnerable path was:

    sym_i0 != 'G' && sym_i1-i4 == "HTTP" && sym_i5 >= 0x10 && PC == 0x4141414141414141
    -> SAT, with the input "HTTP1.?AAAAAAAAAAAAAAAAAAAAAAAA"

For exploit generation, the PC condition is replaced with conditions that point PC at a symbolic address and place shellcode there:

    sym_i0 != 'G' && sym_i1-i4 == "HTTP" && sym_i5 >= 0x10 && PC == sym_addr && sym_addr1,2,.... == shellcode
    -> SAT, with the input "HTTP1.?AAAAAA\x31\xf6\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x56"

Exploit generation with symbolic execution (Exercise 4)

    gcc -z execstack -no-pie -fno-stack-protector -o stack_bof_tiny stack_bof_tiny.c
    python ./simple_aeg_stack_bof.py stack_bof_tiny

References:
https://github.com/ChrisTheCoolHut/Zeratool
https://github.com/angr/rex

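Crash classification

Within triage, crash classification is the technique that takes crash inputs and groups them by the bug that caused each one.

[Figure: Inputs: target program, crash inputs. Output: crash inputs (classified)]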

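Crash classification (AFL)

AFL classifies crashes on the assumption that two crash inputs with different program paths (bitmaps) are caused by different bugs. (This was the application assignment for Track C.)

[Figure: two different paths through the CFG nodes 2536, 124, 297, 4010, 6210, 8147, labeled bug 1 and bug 2]

Drawback of crash classification (AFL)

Different program paths can still hit one and the same root cause.

[Figure: two different paths through the CFG nodes 2536, 124, 297, 4010, 6210, 8147 that reach the same bug. Wrong classification: two bugs. Correct classification: bug 1 only]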

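State of the art in crash classification

• Classification by a hash generated from the stack trace [*1]
  ◦ Uses the stack trace, rather than AFL-style coverage, as the feature and hashes it
• Root-cause detection by reverse execution + backward taint analysis [*2]
  ◦ Detects, from the memory dump at crash time, the register and memory values that are likely to have caused the crash
  ◦ Puts taint on those locations and propagates it in the direction opposite to program execution to pinpoint where the bug originated
• Root-cause detection by reverse execution + processor trace [*3]
• Root-cause detection by a stack-trace hash + taint analysis [*4]
  ◦ Details are unknown because the tool is closed and only partial documentation exists

[*1] David Molnar, et al. "Dynamic Test Generation To Find Integer Bugs in x86 Binary Linux Programs" [SEC'09]
[*2] Weidong Cui, et al. "RETracer: Triaging Crashes by Reverse Execution from Partial Memory Dumps" [ICSE'16]
[*3] Weidong Cui, et al. "REPT: Reverse Debugging of Failures in Deployed Software" [OSDI'18]
[*4] !exploitable https://archive.codeplex.com/?p=msecdbg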
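The difference between path-based and stack-trace-based grouping can be shown with a small Python sketch. Everything in `CRASHES` is made-up illustration data (the inputs, the block paths, and the function names in the stack traces), and using the exact path tuple as the key is a simplification of AFL's bitmap comparison.

```python
import hashlib
from collections import defaultdict

# Hypothetical crash records for illustration only.
CRASHES = [
    {"input": b"HTTP1.?AAAA", "path": (2536, 124, 297),
     "stack": ("main", "handle_http", "copy_body")},
    {"input": b"HTTP1.?BBBB", "path": (2536, 124, 4010, 297),
     "stack": ("main", "handle_http", "copy_body")},
    {"input": b"501AAAA", "path": (2536, 6210, 8147),
     "stack": ("main", "handle_status", "parse_code")},
]


def classify_by_path(crashes):
    """Path-based grouping: a different path counts as a different bug."""
    groups = defaultdict(list)
    for crash in crashes:
        groups[crash["path"]].append(crash["input"])
    return dict(groups)


def stack_hash(stack, depth=3):
    """Hash the innermost `depth` frames of the crashing stack trace."""
    frames = "|".join(stack[-depth:])
    return hashlib.sha1(frames.encode()).hexdigest()[:12]


def classify_by_stack(crashes, depth=3):
    """Stack-based grouping: crashes with the same top frames share a bucket."""
    groups = defaultdict(list)
    for crash in crashes:
        groups[stack_hash(crash["stack"], depth)].append(crash["input"])
    return dict(groups)


print(len(classify_by_path(CRASHES)), "groups by path")
print(len(classify_by_stack(CRASHES)), "groups by stack hash")
```

The first two crashes take different paths into the same function, so path-based grouping reports them as separate bugs while the stack hash merges them. The `depth` parameter trades precision for recall: fewer frames merge more crashes.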

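Summary

Automating exploitation requires:
• Techniques that find bugs automatically
  ◦ Symbolic execution, which finds bugs by solving constraint formulas
  ◦ Concolic execution, which speeds up symbolic execution by reducing its constraints
  ◦ Fuzzing, which keeps generating random inputs until one hits a bug
  ◦ Hybrid fuzzing, which combines the two
• Techniques that automatically turn the discovered bugs into exploits
  ◦ AEG, which adds exploit-specific constraints to the formula produced by symbolic execution and solves it to generate exploit code automatically
  ◦ Crash classification, which groups crash inputs by the bug that caused them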

https://ricsec.co.jp
