DeClang : Anti-hacking Compiler

DeNA_Tech
October 29, 2020

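Client protection techniques take many forms, including packing, obfuscation, anti-decompilation, and tamper detection. This talk examines the strengths and weaknesses of these approaches and introduces DeClang, our compiler-based client protection tool.
Prior work includes many LLVM-based open source projects. However, most of them remain at the experimental stage and have various shortcomings: they contain latent bugs, do not support ARM, or cannot be applied to the build flow of mobile apps. DeClang overcomes these problems and will be partly open sourced as a production-ready obfuscating compiler.
This talk analyzes the Unity build flow and explains how DeClang can be integrated into it. It also describes how, while making the obfuscating compiler production-ready, we found and fixed a bug that had long been lurking in the obfuscator-llvm project. Through this talk, we hope to make it possible for anyone to protect mobile apps easily.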

Transcript

  1. CODE BLUE 2020 DeClang : Anti-hacking Compiler Mengyuan Wan 万萌遠

  2. Self Introduction
    ≫ Security Engineer at DeNA Co., Ltd.
    ≫ Reverse Engineering / Developing / SOC / Application Pentesting / Cloud Security etc.
    ≫ CISSP
    ≫ Contacts
      − GitHub: https://www.github.com/nevermoe
      − Twitter: @nevermoecom
  3. Agenda
    ≫ Motivation
    ≫ DeClang Introduction & Features
    ≫ An O-LLVM Bug & Fix
    ≫ Conclusion
  4. Motivation
    ≫ Game cheating and app hacking are everywhere
      − Memory Hacking
      − Time Hacking
      − Network Traffic Tampering
      − Binary Tampering
      − Hooking
      − Assets dumping
      − etc.
  5. Motivation
    ≫ Commercial anti-hacking solutions are expensive
      − Packer
      − Anti-hacking Library
      − Obfuscation Compiler
  6. Motivation
    ≫ Can we create a free, open-source anti-hacking solution?
      − You can’t open source a packer or an anti-hacking library.
      − But you can partly open source an obfuscation compiler.
    * Other reasons: https://www.slideshare.net/dena_tech/declang-clang-dena-techcon-2020
  7. Motivation
    ≫ That is DeClang
      − A partly open-source anti-hacking compiler.
      − Based on the LLVM project; extends Obfuscator-LLVM: https://github.com/obfuscator-llvm/obfuscator
      − Free to secure your apps and games. (Apache License 2.0)
      − https://github.com/DeNA/DeClang
  8. Motivation
    ≫ Why DeClang?
      − Compatible with the Unity build flow and mobile app build flows.
      − Cross-platform
        • Host
          ➢ Windows / OSX / Linux
        • Target
          ➢ X86 / X64 / ARM / AArch64
          ➢ ELF / Mach-O / PE
          ➢ Windows / OSX / Linux / Android / iOS
        • Build Flow
          ➢ Unity / Cocos2d / NDK / Xcode / Make / Visual Studio
  9. DeClang Introduction
    ≫ Take Unity for example
  10. DeClang Introduction
    ≫ Unity build flow
    [Diagram: Unity C# → IL2CPP → C++; iOS: Apple Clang (Xcode) → IPA; Android: NDK Clang → APK]
  11. DeClang Introduction
    ≫ How to integrate with DeClang?
    [Diagram: the same Unity build flow as on slide 10]
  12. DeClang Introduction
    ≫ Simply replace official Clang with DeClang!
      − For Android, replace the Clang binary.
      − For iOS, set the CC and CXX environment variables.
    [Diagram: Unity C# → IL2CPP → C++; iOS: DeClang (Xcode) → IPA; Android: NDK DeClang → APK]
  13. DeClang Introduction
    ≫ How to pass config parameters to compiler?
      − Pass -mllvm -fla to compiler
      − Add __attribute((__annotate__(("fla")))) to functions in source file
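
As a rough illustration of the second option on this slide, the sketch below marks one function with the annotation. The macro name `OBF_FLA` and the function `check_license` are hypothetical and exist only for this example; the attribute spelling is the one shown on the slide. The attribute is guarded so that compilers other than Clang simply see an ordinary function.

```cpp
// Sketch only: OBF_FLA and check_license are illustrative names, not part of DeClang.
// The annotation is a Clang extension, so other compilers get an empty macro.
#if defined(__clang__)
#define OBF_FLA __attribute((__annotate__(("fla"))))
#else
#define OBF_FLA
#endif

// A small function with branches and a loop, i.e. some control flow
// that a flattening pass could work on.
OBF_FLA int check_license(int key) {
    int acc = 0;
    for (int i = 0; i < 4; ++i) {
        if ((key >> i) & 1) {
            acc += i + 1;   // bit i set: add its 1-based position
        } else {
            acc -= 1;       // bit i clear: subtract one
        }
    }
    return acc;
}
```

The annotation does not change what the function computes; it only tags the function so that an obfuscation pass can select it.
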
  14. DeClang Introduction
    ≫ How to pass config parameters to compiler?
      − Pass -mllvm -fla to compiler
      − Add __attribute((__annotate__(("fla")))) to functions in source file
    ≫ You cannot control parameters passed to NDK in Unity build flow
    ≫ It’s difficult to modify C++ files generated by IL2CPP every time
  15. DeClang Introduction
    ≫ How to pass config parameters to compiler?
      − Set the environment variable DECLANG_HOME and pass parameters via $DECLANG_HOME/.DeClang/config.json
      − Flexible: all the setup can be done in shell / PowerShell scripts, so it’s easy to integrate DeClang into CI.
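
A minimal sketch of how a tool might locate that file, assuming only what the slide states: the path is `$DECLANG_HOME/.DeClang/config.json`. The function names are hypothetical, and returning an empty string when the variable is unset is an assumption made for this example, not documented DeClang behavior.

```cpp
#include <cstdlib>
#include <string>

// Builds the config path from a DECLANG_HOME value, following the layout on the slide.
std::string config_path_from_home(const std::string& home) {
    return home + "/.DeClang/config.json";
}

// Reads DECLANG_HOME from the environment.
// Assumption for this sketch: an unset variable yields an empty path.
std::string resolve_config_path() {
    const char* home = std::getenv("DECLANG_HOME");
    return home ? config_path_from_home(home) : std::string();
}
```

Because everything hangs off one environment variable, a CI script only has to export `DECLANG_HOME` and place the JSON file before invoking the build.
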
  16. DeClang’s Feature
    ≫ Control Flow Flattening & Split Basic Blocks (Originated from O-LLVM)
        //config.json:
        "flatten": [
          { "name": "PlayerShooting_Shoot_m", "split_level": 2 },
          { "name": "^is_jailbroken$" }
        ]
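
The `^` and `$` anchors in the second entry suggest that `name` is treated as a regular expression. The sketch below shows how such rules could select functions under that assumption, using partial (search) matching. `FlattenRule`, `find_rule`, and `slide_rules` are hypothetical names, the mangled symbol in the usage note is made up, and a `split_level` of 0 for entries that omit the field is also an assumption.

```cpp
#include <regex>
#include <string>
#include <vector>

// One entry of the "flatten" array (illustrative type, not DeClang's own).
struct FlattenRule {
    std::string name;   // pattern taken from the "name" field
    int split_level;    // assumed to be 0 when the field is absent
};

// The two rules shown on the slide.
std::vector<FlattenRule> slide_rules() {
    return { {"PlayerShooting_Shoot_m", 2}, {"^is_jailbroken$", 0} };
}

// Returns the first rule whose pattern matches somewhere in the symbol name,
// or nullptr when no rule applies.
const FlattenRule* find_rule(const std::vector<FlattenRule>& rules,
                             const std::string& symbol) {
    for (const FlattenRule& rule : rules) {
        if (std::regex_search(symbol, std::regex(rule.name))) {
            return &rule;
        }
    }
    return nullptr;
}
```

With search semantics, the unanchored first pattern would also select an IL2CPP-style symbol such as `PlayerShooting_Shoot_m0123ABCD`, while the anchored second pattern selects only a function named exactly `is_jailbroken`.
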
  17. DeClang’s Feature
    ≫ Control Flow Flattening & Split Basic Blocks (Originated from O-LLVM)
  18. DeClang’s Feature
    ≫ Control Flow Flattening & Split Basic Blocks (Originated from O-LLVM)
  19. DeClang’s Feature
    ≫ Indirect Branch (Original Feature)
      − It is applied globally, so you don’t need to select target functions.
      − However, it is weaker.
        //config.json:
        {
          "overall_obfuscation": 100 // obfuscation percentage
        }
  20. DeClang’s Feature
    ≫ Indirect Branch (Original Feature)
      – These code blocks belong to a single function, but IDA recognizes them as different functions.
      – As a result, IDA fails to decompile this code.
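
The slides do not show how DeClang emits indirect branches, so the following is only a source-level analogy of the general idea: a branch whose target is read from a table at run time instead of being encoded in the instruction. It relies on the GNU "labels as values" extension available in GCC and Clang, and the function name is hypothetical.

```cpp
// Source-level analogy only; DeClang works on compiler IR, not on source like this.
// Requires the GNU "labels as values" extension (GCC and Clang).
int is_nonzero_indirect(int x) {
    // Table of block addresses; the jump target is only known at run time.
    static void* const targets[] = { &&zero, &&nonzero };
    goto *targets[x != 0 ? 1 : 0];   // indirect jump through the table
zero:
    return 0;
nonzero:
    return 1;
}
```

A direct `if`/`else` would give a disassembler two explicit branch targets; here the targets sit in data, which is the kind of pattern that makes function boundaries harder to recover statically.
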
  21. DeClang’s Feature
    ≫ Other O-LLVM features can be ported to DeClang easily
      − Instruction Substitution
      − Bogus Control Flow
  22. DeClang’s Feature
    ≫ Features that are not open sourced
      − Function-level anti-tamper
      − Global anti-tamper
      − Root / Jailbreak / Emulator detection
      − global-metadata encryption
  23. DeClang’s Feature
    ≫ Function-level anti-tamper
    [Diagram: functions foo and bar]
  24. DeClang’s Feature
    ≫ Insert tamper detection at the beginning of the function
    [Diagram: foo and bar with tamper detection blocks]
  25. DeClang’s Feature
    ≫ Detecting tamper mutually
    [Diagram: foo and bar, each with tamper detection, linked by "tamper detect" labels]
  26. DeClang’s Feature
    ≫ Detecting tamper mutually
    [Diagram: foo, bar, and baz, each with tamper detection, linked by "tamper detect" labels]
  27. DeClang’s Feature
    ≫ Detecting tamper mutually
    [Diagram: the same as on slide 26]
    A hacker has to remove all tamper detection at once!
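
The anti-tamper feature is not open sourced, so the following is a toy model of the idea on these slides rather than DeClang's implementation. Each "function" is a byte buffer plus a flag saying whether its entry check is still present, and each function with a check verifies a checksum of every other function. All names (`Func`, `Program`, `make_demo`) and the FNV-1a checksum are choices made for this sketch.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a compiled function: its bytes and whether its entry check exists.
struct Func {
    std::vector<std::uint8_t> body;
    bool has_check;
};

// FNV-1a over the body and the check flag, so stripping the check also changes the hash.
std::uint32_t checksum(const Func& f) {
    std::uint32_t h = 2166136261u;
    for (std::uint8_t b : f.body) { h ^= b; h *= 16777619u; }
    h ^= f.has_check ? 1u : 0u;
    h *= 16777619u;
    return h;
}

class Program {
public:
    void add(const std::string& name, std::vector<std::uint8_t> body) {
        funcs_[name] = Func{std::move(body), true};
    }
    // Records the expected checksums, as a build step would.
    void seal() {
        for (const auto& kv : funcs_) expected_[kv.first] = checksum(kv.second);
    }
    Func& at(const std::string& name) { return funcs_.at(name); }
    // Returns true if the call proceeds, false if this function's check sees tampering.
    bool call(const std::string& name) const {
        if (!funcs_.at(name).has_check) return true;   // check was stripped
        for (const auto& kv : funcs_) {
            if (kv.first != name && checksum(kv.second) != expected_.at(kv.first)) {
                return false;
            }
        }
        return true;
    }
private:
    std::map<std::string, Func> funcs_;
    std::map<std::string, std::uint32_t> expected_;
};

// Builds the foo / bar / baz example from the slides with arbitrary bytes.
Program make_demo() {
    Program p;
    p.add("foo", {0x10, 0x20, 0x30});
    p.add("bar", {0x41, 0x42});
    p.add("baz", {0x7f});
    p.seal();
    return p;
}
```

In this model, patching foo is caught by bar and baz, and stripping the checks from foo and bar still leaves baz to notice, which mirrors the slide's point that every check has to be removed at once.
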
  28. An O-LLVM Bug
    ≫ Without Flattening
        uint32_t foo() {
            uint32_t V_0 = 0;
            goto LBL3;
        LBL1:
            if (V_0 == 1) {
                printf("2\n");
                goto LBL2;
            } else
                goto LBL3;
        LBL2:
            printf("3\n");
            goto LBL3;
        LBL3:
            printf("1\n");
            V_0 = getNum(); //getNum() {return first_time_called ? 1 : 0;}
            if (V_0)
                goto LBL1;
            else {
                printf("4\n");
                return V_0;
            }
        }
    Output: 1 2 3 1 4
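
To make the expected output easy to reproduce, here is a self-contained variant of the slide's function. It is a sketch with two deliberate changes: `getNum()` is written out from the slide's one-line comment, and each `printf` is replaced by appending to a trace string so the result can be inspected. The globals and `run_foo` are names introduced for this example.

```cpp
#include <cstdint>
#include <string>

bool g_first_call = true;   // state behind getNum()
std::string g_trace;        // collects what the slide prints with printf

// Returns 1 on the first call and 0 afterwards, as the slide's comment describes.
std::uint32_t getNum() {
    if (g_first_call) {
        g_first_call = false;
        return 1;
    }
    return 0;
}

// Same control flow as the slide's foo(), with printf replaced by trace appends.
std::uint32_t foo() {
    std::uint32_t V_0 = 0;
    goto LBL3;
LBL1:
    if (V_0 == 1) {
        g_trace += "2";
        goto LBL2;
    } else
        goto LBL3;
LBL2:
    g_trace += "3";
    goto LBL3;
LBL3:
    g_trace += "1";
    V_0 = getNum();
    if (V_0)
        goto LBL1;
    else {
        g_trace += "4";
        return V_0;
    }
}

// Resets the state, runs foo() once, and returns the trace.
std::string run_foo() {
    g_first_call = true;
    g_trace.clear();
    foo();
    return g_trace;
}
```

`run_foo()` yields `"12314"`, matching the slide's output of 1 2 3 1 4.
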
  29. An O-LLVM Bug
    ≫ Flattened
    Source: the same foo() as on slide 28.
    Output: 2 3 1 2 3 1 4
    LBL1 executed first?!
  30. An O-LLVM Bug
    ≫ Flattening Logic ①: bb1 ends with “br bb2”
    https://github.com/obfuscator-llvm/obfuscator/blob/llvm-4.0/lib/Transforms/Obfuscation/Flattening.cpp
    [Diagram, before: bb1 (prologue), bb2, bb3, bb4 (epilogue)]
    [Diagram, after: bb1 (prologue): switchVar = 0x1; bb2: switchVar = 0x2; bb3: switchVar = 0x3; switch bb with cases switchVar == 0x1, 0x2, 0x3; bb4 (epilogue)]
  31. An O-LLVM Bug
    ≫ Flattening Logic ②: bb1 ends with conditional branch
    https://github.com/obfuscator-llvm/obfuscator/blob/llvm-4.0/lib/Transforms/Obfuscation/Flattening.cpp#L100
    [Diagram: bb1 (prologue), bb2, bb3, bb4 (epilogue)]
  32. An O-LLVM Bug
    ≫ Flattening Logic ②: bb1 ends with conditional branch
    https://github.com/obfuscator-llvm/obfuscator/blob/llvm-4.0/lib/Transforms/Obfuscation/Flattening.cpp
    [Diagram, original: bb1 (prologue), bb2, bb3, bb4 (epilogue)]
    [Diagram, after split: bb1.1 (prologue), bb1.2 (cond br), bb2, bb3, bb4 (epilogue)]
    [Diagram, after flattening: bb1.1 (prologue); switch bb with bb1.2 (cond br), bb2, bb3; bb4 (epilogue)]
  33. An O-LLVM Bug
    ≫ Flattening Logic ③: What if bb1 ends with “br bb3”?
    [Diagram, before: bb1 (prologue), bb2, bb3, bb4 (epilogue)]
    [Diagram, after: bb1 (prologue): switchVar = 0x1; bb2: switchVar = 0x3; bb3: switchVar = 0x1; switch bb with cases switchVar == 0x1, 0x2, 0x3; bb4 (epilogue)]
  34. An O-LLVM Bug
    ≫ Flattening Logic ③: What if bb1 ends with “br bb3”?
    https://github.com/obfuscator-llvm/obfuscator/blob/llvm-4.0/lib/Transforms/Obfuscation/Flattening.cpp#L119-L122
    [Diagram: the same as on slide 33]
    O-LLVM always assumes that the first bb in the switch (bb2) is executed first.
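
The effect of that assumption can be reproduced with a hand-written dispatcher. The sketch below is a simulation, not O-LLVM or DeClang code: the blocks mirror the unflattened LLVM IR shown later in the deck (`%1`, `%4`, `%5`, `%8`), the `Block` enum and `run_flattened` are names invented here, and `printf` is replaced by a trace string. With `split_first == false`, `switchVar` starts at the first block in layout order, as the slide describes; with `split_first == true`, a split-off `FIRST` block that only carries the entry block's branch is dispatched first, which models the fix of always splitting the first bb.

```cpp
#include <cstdint>
#include <string>

// Dispatch values for the simulated blocks. B1/B4/B5/B8 follow the IR labels.
enum Block : std::uint32_t { FIRST, B1, B4, B5, B8, DONE };

std::string run_flattened(bool split_first) {
    std::string trace;
    bool first_call = true;   // models getNum(): 1 on the first call, then 0
    // The flattener starts at the first block in the dispatch list.
    std::uint32_t switchVar = split_first ? FIRST : B1;
    while (switchVar != DONE) {
        switch (switchVar) {
        case FIRST:           // split-off tail of the entry block: "br label %5"
            switchVar = B5;
            break;
        case B1:              // prints "2" and "3"
            trace += "23";
            switchVar = B4;
            break;
        case B4:              // pass-through block
            switchVar = B5;
            break;
        case B5: {            // prints "1", calls getNum(), then branches on the result
            trace += "1";
            std::uint32_t v = first_call ? 1u : 0u;
            first_call = false;
            switchVar = (v == 1) ? B1 : (v == 0) ? B8 : B4;
            break;
        }
        case B8:              // prints "4" and returns
            trace += "4";
            switchVar = DONE;
            break;
        default:
            switchVar = DONE;
            break;
        }
    }
    return trace;
}
```

`run_flattened(false)` produces `"2312314"`, the wrong output from slide 29, and `run_flattened(true)` produces `"12314"`, the output of the original function.
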
  35. An O-LLVM Bug
    ≫ How could this happen?
      − Usually the prologue only branches to the bb indexed right after it (if the branch instruction is not conditional).
    [Diagram, normal case: bb1 (prologue), bb2, bb3, bb4 (epilogue)]
  36. An O-LLVM Bug
    ≫ How could this happen?
      − However, if you write messy code with a lot of goto...
    [Diagram, abnormal case: bb1 (prologue), bb2, bb3, bb4 (epilogue)]
  37. An O-LLVM Bug
    ≫ How could this happen?
    Source: the same foo() as on slide 28.
    LLVM IR (Without Flattening)
        define i32 @_Z3foov() local_unnamed_addr #1 {
          br label %5
        ; <label>:1: ; preds = %5
          %2 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str.5, i64 0, i64 0))
          %3 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str.6, i64 0, i64 0))
          br label %4
        ; <label>:4: ; preds = %1, %5
          br label %5
        ; <label>:5: ; preds = %4, %0
          %6 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str, i64 0, i64 0))
          %7 = tail call i32 @_Z6getNumv()
          switch i32 %7, label %4 [
            i32 0, label %8
            i32 1, label %1
          ]
        ; <label>:8: ; preds = %5
          %9 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str.4, i64 0, i64 0))
          ret i32 0
        }
  38. An O-LLVM Bug
    ≫ How could this happen?
    Source: the same foo() as on slide 28.
    LLVM IR (Flattened by O-LLVM)
        define i32 @_Z3foov() local_unnamed_addr #1 {
          %1 = alloca i32
          %2 = alloca i32
          %3 = bitcast i32 0 to i32
          store i32 205249092, i32* %1
          br label %4
        ; <label>:4: ; preds = %0, %29
          %5 = load i32, i32* %1
          switch i32 %5, label %6 [
            i32 205249092, label %7
            i32 -1124873994, label %10
            i32 -1130815655, label %11
            i32 -1828373093, label %12
            i32 192667987, label %15
            i32 -599381087, label %19
            i32 -786179519, label %23
            i32 2098306815, label %27
          ]
        ; <label>:6: ; preds = %4
          br label %29
        ; <label>:7: ; preds = %4
          %8 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str.5, i64 0, i64 0))
          %9 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str.6, i64 0, i64 0))
          store i32 -1130815655, i32* %1
          br label %29
          …
        }
    The first non-default bb is always executed first.
  39. An O-LLVM Bug
    ≫ Fix
      − Simply always split first bb!
        // always split first BB
        if ((br != NULL /* && br->isConditional()*/ ) ||
            insert->getTerminator()->getNumSuccessors() > 1) {
            BasicBlock::iterator i = insert->end();
            --i;
            if (insert->size() > 1) {
                --i;
            }
            BasicBlock *tmpBB = insert->splitBasicBlock(i, "first");
            origBB.insert(origBB.begin(), tmpBB);
        }
    https://github.com/obfuscator-llvm/obfuscator/blob/llvm-4.0/lib/Transforms/Obfuscation/Flattening.cpp#L100
  40. An O-LLVM Bug
    ≫ Fix
    Source: the same foo() as on slide 28.
    LLVM IR (Flattened by DeClang)
        define i32 @_Z3foov() local_unnamed_addr #1 {
          br label %1
        ; <label>:1: ; preds = %0
          br label %7
        ; <label>:2: ; preds = %12
          %3 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str.5, i64 0, i64 0))
          %4 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str.6, i64 0, i64 0))
          br label %6
        ; <label>:5: ; preds = %12, %14
          br label %6
        ; <label>:6: ; preds = %5, %2
          br label %7
        ; <label>:7: ; preds = %6, %1
          %8 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str, i64 0, i64 0))
          %9 = tail call i32 @_Z6getNumv()
          br label %10
          …
        }
  41. An O-LLVM Bug
    ≫ Fix
    Source: the same foo() as on slide 28.
    LLVM IR (Flattened by DeClang)
        define i32 @_Z3foov() local_unnamed_addr #1 {
          %1 = alloca i32
          %2 = alloca i32
          %3 = bitcast i32 0 to i32
          store i32 205249092, i32* %1
          br label %4
        ; <label>:4: ; preds = %0, %30
          %5 = load i32, i32* %1
          switch i32 %5, label %6 [
            i32 205249092, label %7
            ...
            i32 192667987, label %13
            ...
          ]
        ; <label>:6: ; preds = %4
          br label %30
        ; <label>:7: ; preds = %4
          store i32 192667987, i32* %1
          br label %30
          ...
        ; <label>:13: ; preds = %4
          %14 = tail call i32 @puts(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @str, i64 0, i64 0))
          %15 = tail call i32 @_Z6getNumv()
          store i32 %15, i32* %2
          store i32 -599381087, i32* %1
          br label %30
  42. An O-LLVM Bug
    ≫ But who the hell writes this kind of messy code?
    Source: the same foo() as on slide 28.
  43. An O-LLVM Bug
    ≫ But who the hell writes this kind of messy code?
      − Unity does!
  44. An O-LLVM Bug
    ≫ C++ files generated by IL2CPP have a lot of “goto”!
        IL2CPP_EXTERN_C IL2CPP_METHOD_ATTR void TutorialInfo_ToggleShowAtLaunch_m3632B30A9CA1D2147A5A71C32AA605C54EBA1E37 (TutorialInfo_t32C32F28F3E107CDDA9A04D4A6B927D7CED565C6 * __this, const RuntimeMethod* method)
        {
            ...
            {
                ...
                if (L_3)
                {
                    G_B2_0 = L_2;
                    goto IL_0021;
                }
            }
            {
                G_B3_0 = 0;
                G_B3_1 = G_B1_0;
                goto IL_0022;
            }
        IL_0021:
            {
                G_B3_0 = 1;
                G_B3_1 = G_B2_0;
            }
        IL_0022:
            {
                ...
                return;
            }
        }
  45. An O-LLVM Bug
    ≫ C++ files generated by IL2CPP have a lot of “goto”!
    Source: the same IL2CPP-generated function as on slide 44.
    To conclude: This is a bug triggered by the GOTO campaign?!
  46. Demo
    ≫ A demo of the function-level anti-tamper feature
      − https://www.youtube.com/watch?v=Y-zkDt2e-pI&feature=youtu.be
  47. Conclusion
    ≫ DeClang motto
      − Cheaper: Everyone can secure their apps for free.
      − Easier: Everyone can integrate DeClang easily.
      − Stronger: Everyone can improve DeClang.
    https://github.com/DeNA/DeClang
  48. Follow @DeNAxTech on Twitter!