SITCON2015 Android Repackaged App Detection System

M157q
March 07, 2015

These slides are licensed under CC BY-SA 3.0 TW.
https://creativecommons.org/licenses/by-sa/3.0/tw/

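This talk tells the blood, sweat, and tears story of how a project partner and I,
after applying for and being selected for a National Science Council (now renamed the Ministry of Science and Technology) project,
used open source tools
to build an automated system for detecting repackaged Android apps.
It covers:
a brief introduction to the related papers,
building crawlers that automatically download apps from the official Google Play and from third-party markets,
analyzing the apps and then comparing the similarity of their program features,
and presenting the results to users as a web visualization.
The system uses Perl, Python, Java, and JavaScript along with related frameworks;
I will briefly introduce the parts we used and share some experience from doing the project.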

Transcript

  1. ABOUT ME 鄭順一 (Shun-Yi Jheng) [email protected] http://m157q.github.io A lover of Python, free software, security, and Arch GNU/Linux. A loser who is currently taking the Compiler course for the third time.
  2. KEYWORDS App, Android, Google Play; Perl, Python, Java, JavaScript; .apk, .dex, .smali, .json; Scrapy, NetworkX, Node.js, D3.js; Data Dependence Graph, Program Slicing; Fuzzy Hashing, Obfuscation, Subgraph Isomorphism
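  3. MOTIVATION It started with an undergraduate project in July 2013. The original plan was to take the lab's various virus samples, compare them against Android apps on the market, and classify and cluster them, but someone had already done that. Later, while we were reading related papers and discussing them, an incident that was big news at the time became the turning point.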
  4. ANDARWIN A scalable approach to detecting similar Android apps based on semantic information. Paper: AnDarwin: Scalable Detection of Semantically Similar Android Applications
  5. DROIDRANGER A malware detection system that compares each app pairwise against malware samples. Paper: Hey, You, Get Off of My Market: Detecting Malicious Apps in Official and Alternative Android Markets
  6. DROIDMOSS Analyzes repackaged apps across official and third-party marketplaces using a fuzzy hashing technique. Paper: Detecting Repackaged Smartphone Applications in Third-Party Android Marketplaces
  7. FUZZY HASHING Also known as Context Triggered Piecewise Hashing. It allows examiners to find documents that are similar but not quite identical. Source: The Digital Standard: Why Fuzzy Hashing is Really Cool http://jessekornblum.com/presentations/htcia06.pdf
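The idea on slide 7 can be illustrated with a heavily simplified sketch. It is not the ssdeep algorithm and not the code used in the project: a plain byte sum over a small window stands in for the real rolling hash, and the `block_size` and `window` defaults, the function names, and the use of `difflib` for comparing signatures are all assumptions made for illustration.

```python
import hashlib
from difflib import SequenceMatcher

# 64 printable characters; each piece of the input contributes one of them.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def fuzzy_hash(data: bytes, block_size: int = 16, window: int = 7) -> str:
    """Simplified context triggered piecewise hash (not ssdeep-compatible)."""
    signature = []
    piece_start = 0
    for i in range(len(data)):
        # The trigger depends only on the last `window` bytes (the "context"),
        # so a local edit can only move nearby piece boundaries.
        context = data[max(0, i - window + 1): i + 1]
        if sum(context) % block_size == block_size - 1:
            # Trigger point: hash the piece ending here and keep one character.
            digest = hashlib.sha256(data[piece_start: i + 1]).digest()
            signature.append(ALPHABET[digest[0] % 64])
            piece_start = i + 1
    if piece_start < len(data):
        # Hash whatever is left after the last trigger point.
        digest = hashlib.sha256(data[piece_start:]).digest()
        signature.append(ALPHABET[digest[0] % 64])
    return "".join(signature)


def similarity(sig_a: str, sig_b: str) -> float:
    """Score two signatures between 0.0 (unrelated) and 1.0 (identical)."""
    return SequenceMatcher(None, sig_a, sig_b).ratio()


if __name__ == "__main__":
    original = bytes(range(256)) * 8
    patched = original[:1000] + b"\xff" + original[1001:]
    print(similarity(fuzzy_hash(original), fuzzy_hash(patched)))
```

Because piece boundaries are chosen from local context only, a small edit changes a few signature characters instead of the whole hash, which is what makes near-duplicate inputs comparable.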
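  8. GOAL Download apps from Google Play and third-party markets. Analyze and compare the similarity between apps. Visualize the analysis results. Provide a website where users can upload apps themselves and get the analysis results.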
  9. NO PROBLEM SHOULD EVER HAVE TO BE SOLVED TWICE. Creative brains are a valuable, limited resource. They shouldn't be wasted on re-inventing the wheel when there are so many fascinating new problems waiting out there.
  10. CRAWLER Google Play: Akdeniz/google-play-crawler, an unofficial API written in Java. Third party: mssun/android-apps-crawler, which uses Scrapy (a Python framework for web crawlers). Scheduling: Perl script + crontab.
  11. ANALYZER Data dependency: SAAF (Static Android Analysis Framework), written in Java; it supports program slicing on smali code. Subgraph isomorphism: NetworkX, a Python library for graph analysis.
  12. DATA DEPENDENCY GRAPH A directed graph representing the dependencies of several objects on each other. There is an edge from a to b iff a must be evaluated before b. References: Dependency graph - Wikipedia, the free encyclopedia; Data dependency - Wikipedia, the free encyclopedia
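To make the definition on slide 12 concrete, here is a minimal sketch that builds such a graph for straight-line code. It is not how SAAF works internally; the `(defined_variable, used_variables)` statement format and the name `build_ddg` are hypothetical, and branches and loops are deliberately ignored.

```python
def build_ddg(statements):
    """Build a data dependency graph for straight-line code.

    `statements` is a list of (defined_variable, used_variables) pairs.
    The result maps each statement index to the set of later statements
    that read the value it defines, i.e. an edge a -> b means that
    statement a must be evaluated before statement b.
    """
    last_definition = {}  # variable name -> index of its latest definition
    edges = {}
    for index, (target, operands) in enumerate(statements):
        edges.setdefault(index, set())
        for name in operands:
            if name in last_definition:
                edges[last_definition[name]].add(index)
        last_definition[target] = index
    return edges


program = [
    ("a", []),          # 0: a = read()
    ("b", ["a"]),       # 1: b = a + 1
    ("c", ["a"]),       # 2: c = a * 2
    ("d", ["b", "c"]),  # 3: d = b + c
    ("e", []),          # 4: e = 5
]
print(build_ddg(program))
```

Statement 4 ends up with no edges in either direction, since nothing it defines is read and it reads nothing.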
  13. PROGRAM SLICING Computation of the set of program statements, the program slice, that may affect the values at some point of interest, referred to as a slicing criterion.
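A backward slice in the sense of slice 13's definition can be sketched as a reverse reachability search over dependency edges. This sketch follows data dependences only, whereas a full slicer also tracks control dependences; the edge format (statement index mapped to the set of statements that depend on it) and the name `backward_slice` are assumptions for illustration.

```python
def backward_slice(edges, criterion):
    """Return the sorted statements that may affect `criterion`.

    `edges[a]` is the set of statements that depend on statement a.
    """
    # Invert the edges so each statement knows what it depends on.
    predecessors = {}
    for source, targets in edges.items():
        for target in targets:
            predecessors.setdefault(target, set()).add(source)

    # Depth-first walk backwards from the slicing criterion.
    in_slice = {criterion}
    stack = [criterion]
    while stack:
        node = stack.pop()
        for previous in predecessors.get(node, ()):
            if previous not in in_slice:
                in_slice.add(previous)
                stack.append(previous)
    return sorted(in_slice)


# Dependencies of a five-statement program: 0 feeds 1 and 2, which feed 3.
dependencies = {0: {1, 2}, 1: {3}, 2: {3}, 3: set(), 4: set()}
print(backward_slice(dependencies, 3))
```

Slicing on statement 3 drops statement 4, which cannot influence the value computed there.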
  14. VISUALIZER Use D3.js, a JavaScript library for data manipulation and visualization. It can render visualizations directly from JSON, and there are many different chart types to choose from.
  15. CRAWLER Download apps from Google Play. Download apps from third-party markets. Update periodically and record app information.
  16. ANALYZER Unpack each downloaded apk file to get the dex code. Use smali to convert the dex code into smali code. Then use SAAF to obtain the Data Dependency Graph of each apk file. Output the Data Dependency Graph in JSON format.
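The slides do not show the JSON layout used for the Data Dependency Graph. As a sketch only, the following writes a graph in a node-link shape of the kind often fed to D3.js layouts; the field names `nodes`, `links`, `id`, `label`, `source`, and `target`, and the function name `ddg_to_json`, are assumptions, not the project's actual schema.

```python
import json


def ddg_to_json(labels, edges):
    """Serialize a dependency graph as node-link JSON (hypothetical schema).

    `labels[i]` describes node i; `edges[a]` is the set of nodes that
    depend on node a.
    """
    nodes = [{"id": index, "label": label} for index, label in enumerate(labels)]
    links = [
        {"source": source, "target": target}
        for source in sorted(edges)
        for target in sorted(edges[source])
    ]
    return json.dumps({"nodes": nodes, "links": links}, indent=2)


labels = ["a = read()", "b = a + 1", "c = a * 2", "d = b + c"]
edges = {0: {1, 2}, 1: {3}, 2: {3}, 3: set()}
print(ddg_to_json(labels, edges))
```

Sorting the edges keeps the output stable between runs, which makes stored results easier to diff.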
  17. .APK, .DEX, .SMALI .apk: Android Application Package. .dex: Dalvik EXecutable. .smali: Smali (assembler in Icelandic), Baksmali (disassembler in Icelandic)
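Since an .apk file is a ZIP archive, the unpacking step that yields the dex code can be sketched with Python's standard `zipfile` module. The helper names `list_dex_entries` and `read_dex` and the example path are hypothetical, and the later conversion from dex to smali is left to the external tool.

```python
import zipfile


def list_dex_entries(apk):
    """Return the names of all .dex files inside an APK (path or file object)."""
    with zipfile.ZipFile(apk) as archive:
        return [name for name in archive.namelist() if name.endswith(".dex")]


def read_dex(apk, entry="classes.dex"):
    """Return the raw bytes of one .dex entry from the APK."""
    with zipfile.ZipFile(apk) as archive:
        return archive.read(entry)


if __name__ == "__main__":
    import sys

    # Usage: python extract_dex.py some_app.apk
    if len(sys.argv) > 1:
        for name in list_dex_entries(sys.argv[1]):
            print(name, len(read_dex(sys.argv[1], name)), "bytes")
```

Listing the entries first matters because an app may ship more than one dex file, so assuming a single `classes.dex` would miss code.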
  18. ANALYZER Load the JSON-format Data Dependency Graph into a self-written Python script, use NetworkX to perform subgraph isomorphism matching and compute the similarity, then output the analyzed graph as JSON to the Visualizer.
  19. SUBGRAPH ISOMORPHISM Two graphs G and H are given as input, and one must determine whether G contains a subgraph that is isomorphic to H. References: Subgraph isomorphism problem - Wikipedia, the free encyclopedia; Graph isomorphism - Wikipedia, the free encyclopedia
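The problem stated on slide 19 can be sketched with a small backtracking search over directed graphs. The project itself used NetworkX for this step, which ships matcher classes such as `DiGraphMatcher`; the dependency-free version below is only an illustration, its function name and adjacency-dict format are assumptions, and it runs in exponential time in the worst case.

```python
def find_subgraph_mapping(g_edges, h_edges):
    """Map every node of H to a distinct node of G so that each edge of H
    is also an edge of G. Returns the mapping, or None if none exists.

    Graphs are dicts mapping a node to the set of its successors.
    """
    g_nodes = sorted(set(g_edges) | {v for vs in g_edges.values() for v in vs})
    h_nodes = sorted(set(h_edges) | {v for vs in h_edges.values() for v in vs})

    def has_edge(edges, a, b):
        return b in edges.get(a, ())

    def consistent(mapping, h_node, g_node):
        # A self-loop in H needs a self-loop at the candidate node in G.
        if has_edge(h_edges, h_node, h_node) and not has_edge(g_edges, g_node, g_node):
            return False
        # Every H edge between h_node and an already mapped node must exist in G.
        for other_h, other_g in mapping.items():
            if has_edge(h_edges, h_node, other_h) and not has_edge(g_edges, g_node, other_g):
                return False
            if has_edge(h_edges, other_h, h_node) and not has_edge(g_edges, other_g, g_node):
                return False
        return True

    def extend(mapping):
        if len(mapping) == len(h_nodes):
            return dict(mapping)
        h_node = h_nodes[len(mapping)]
        for g_node in g_nodes:
            if g_node in mapping.values():
                continue  # keep the mapping injective
            if consistent(mapping, h_node, g_node):
                mapping[h_node] = g_node
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[h_node]  # backtrack
        return None

    return extend({})


G = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}
H = {"x": {"y"}, "y": {"z"}, "z": set()}
print(find_subgraph_mapping(G, H))
```

This sketch only requires that H's edges appear in G, so extra edges among the matched nodes in G are allowed; a stricter check that forbids them would need one more condition in `consistent`.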