ABOUT ME
鄭順一 (Shun-Yi Jheng)
[email protected]
http://m157q.github.io
A Python, Free Software, Security, and Arch GNU/Linux lover.
A loser (魯蛇), currently taking the Compiler course for the third time.
Slide 3
ABOUT MY PARTNER
江泓樂 (Kenny Chiang)
http://kenny5312012.blogspot.tw
A winner and top of his class (卷哥), currently abroad on exchange at ETH Zurich in Switzerland.
Slide 4
DECIDING THIS TALK'S DIFFICULTY LEVEL
Allow me to borrow a scene from Halt and Catch Fire
Slide 5
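Put your hand down when you see a term you have never heard of or do not quite understand
(A quiet recommendation: CS geeks should watch this series)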
Slide 6
KEYWORDS
App, Android, Google Play
Perl, Python, Java, JavaScript
.apk, .dex, .smali, .json
Scrapy, NetworkX, Node.js, D3.js
Data Dependence Graph, Program Slicing
Fuzzy Hashing, Obfuscation, Subgraph Isomorphism
Slide 7
OUTLINE
Motivation & Goal
Related Projects
System Architecture
Related Open Source Tools
Conclusion
RELATED PROJECTS
AnDarwin
SCanDroid
DroidRanger
DroidMOSS
Slide 15
ANDARWIN
A scalable approach to detecting similar
Android apps based on semantic information.
AnDarwin: Scalable Detection of Semantically
Similar Android Applications
Slide 16
SCANDROID
Statically analyzes data flow and permissions.
SCanDroid: Automated Security Certification of
Android Applications
Slide 17
DROIDRANGER
A malware detection system that compares apps
pairwise against malware samples.
Hey, You, Get Off of My Market: Detecting
Malicious Apps in Official and Alternative
Android Markets
Slide 18
DROIDMOSS
Detects repackaged apps between official and
third-party marketplaces using a fuzzy hashing
technique.
Detecting Repackaged Smartphone
Applications in Third-Party Android
Marketplaces
Slide 19
FUZZY HASHING
Also known as Context Triggered Piecewise Hashing (CTPH).
Allows examiners to find documents that are similar
but not quite identical.
The Digital Standard: Why Fuzzy Hashing is Really
Cool
http://jessekornblum.com/presentations/htcia06.pdf
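The idea of context triggered piecewise hashing can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm used by DroidMOSS or any real fuzzy hashing tool: the window size, the trigger rule, the one-character piece hash, and the names `fuzzy_signature` and `similarity` are all assumptions made for this example.

```python
import hashlib
from difflib import SequenceMatcher

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
WINDOW = 7       # bytes of context that decide where a piece ends (assumed)
BLOCK_SIZE = 16  # rough average piece length (assumed, fixed for simplicity)


def piece_char(piece: bytes) -> str:
    """Reduce one piece to a single signature character."""
    return ALPHABET[hashlib.sha256(piece).digest()[0] % 64]


def fuzzy_signature(data: bytes) -> str:
    """Split data at context-triggered boundaries and hash each piece."""
    signature = []
    piece_start = 0
    for i in range(len(data)):
        # The "context" is the last WINDOW bytes. A real implementation
        # updates this value incrementally with a rolling hash.
        context = sum(data[max(0, i - WINDOW + 1): i + 1])
        if context % BLOCK_SIZE == BLOCK_SIZE - 1:  # trigger: end the piece
            signature.append(piece_char(data[piece_start: i + 1]))
            piece_start = i + 1
    if piece_start < len(data):  # trailing piece without a trigger
        signature.append(piece_char(data[piece_start:]))
    return "".join(signature)


def similarity(a: bytes, b: bytes) -> float:
    """Score two inputs by how much of their signatures match (0.0 to 1.0)."""
    return SequenceMatcher(None, fuzzy_signature(a), fuzzy_signature(b)).ratio()
```

Because boundaries depend only on nearby bytes, a local change alters only the pieces around it, so the two signatures still share most of their characters. An ordinary cryptographic hash of the whole file would change completely.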
Slide 20
LISTING THE GOALS
Slide 21
GOAL
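Download apps from Google Play and third-party markets
Analyze and compare the similarity between apps
Visualize the analysis results
Provide a website where users can upload their own apps and get the analysis results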
Slide 22
DIVIDING THE SYSTEM ARCHITECTURE ACCORDING TO THE GOALS
System Architecture
Slide 23
Based on the four goals, the system is divided into:
Crawler
Analyzer
Visualizer
Website
Slide 25
WHICH OPEN SOURCE TOOLS
DID WE USE?
Related Open Source Tools
Slide 26
THE TRUE HACKER
To echo this year's theme
Slide 27
HOW TO
BECOME A HACKER
Eric Steven Raymond
http://www.catb.org/esr/faqs/hacker-howto.html
Slide 28
NO PROBLEM
SHOULD EVER HAVE TO BE
SOLVED TWICE.
Creative brains are a valuable, limited resource.
They shouldn't be wasted on re-inventing the wheel
when there are so many fascinating new problems
waiting out there.
Slide 29
SO I SET OFF ON A JOURNEY
Finding ready-made wheels (?), evaluating and comparing them, and integrating them
Slide 30
CRAWLER
Google Play - Akdeniz/google-play-crawler
Unofficial API, written in Java.
Third Party - mssun/android-apps-crawler
Uses Scrapy (a Python framework for web crawling).
Scheduling
Uses a Perl script + crontab.
Slide 31
ANALYZER
Data Dependency
Uses SAAF (Static Android Analysis Framework),
written in Java.
Supports Program Slicing on smali code.
Subgraph Isomorphism
Uses NetworkX, a Python library for graph analysis.
Slide 32
DATA DEPENDENCY
GRAPH
A directed graph representing the dependencies of several
objects on each other.
There is an edge from a to b iff a must be evaluated before b.
Dependency graph - Wikipedia, the free encyclopedia
Data dependency - Wikipedia, the free encyclopedia
Slide 33
DATA DEPENDENCY
GRAPH
http://knuth.uprrp.edu/blog/wp-content/uploads/2011/12/CCOM3033-DataDependency.pdf
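As a small illustration of a data dependency graph, the sketch below builds one for a made-up straight-line program. The statement format, the `PROGRAM` sample, and the `build_ddg` helper are assumptions for this example and do not reflect SAAF's output. Only read-after-write dependences are tracked.

```python
from collections import defaultdict

# Each statement: (label, variable it defines, variables it reads).
# A hypothetical five-line program, written out in the comments.
PROGRAM = [
    ("s1", "a", []),          # a = 1
    ("s2", "b", []),          # b = 2
    ("s3", "c", ["a", "b"]),  # c = a + b
    ("s4", "a", ["c"]),       # a = c * 2
    ("s5", "d", ["a"]),       # d = a - 1
]


def build_ddg(program):
    """Return {statement: [statements that read a value it defined]}."""
    last_def = {}              # variable -> label of its latest definition
    edges = defaultdict(set)
    for label, target, reads in program:
        edges[label]           # register the node even if it has no successors
        for var in reads:
            if var in last_def:
                # The definition must be evaluated before this use.
                edges[last_def[var]].add(label)
        last_def[target] = label
    return {node: sorted(succ) for node, succ in edges.items()}


ddg = build_ddg(PROGRAM)
```

Here `s5` depends on `s4` rather than `s1`, because `s4` redefines `a` before `s5` reads it.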
Slide 34
PROGRAM SLICING
Computation of the set of program statements,
the program slice, that may affect the values at
some point of interest, referred to as a slicing
criterion.
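A backward slice can be sketched as a walk over data dependences starting from the slicing criterion. This is a minimal illustration on a made-up straight-line program: it ignores control dependences, and the statement format and the `backward_slice` helper are assumptions for this example, not SAAF's smali slicer.

```python
# Each statement: (label, variable it defines, variables it reads).
SAMPLE = [
    ("s1", "total", []),              # total = 0
    ("s2", "count", []),              # count = 0
    ("s3", "x", []),                  # x = read_input()
    ("s4", "total", ["total", "x"]),  # total = total + x
    ("s5", "count", ["count"]),       # count = count + 1
    ("s6", "report", ["total"]),      # report = format(total)
]


def backward_slice(program, criterion):
    """Return the statements that may affect `criterion` through data flow."""
    # For each statement, find the statements whose definitions it reads.
    depends_on = {}
    last_def = {}
    for label, target, reads in program:
        depends_on[label] = {last_def[v] for v in reads if v in last_def}
        last_def[target] = label
    # Follow the dependences backwards from the criterion.
    in_slice, worklist = set(), [criterion]
    while worklist:
        stmt = worklist.pop()
        if stmt not in in_slice:
            in_slice.add(stmt)
            worklist.extend(depends_on[stmt])
    # Report the slice in program order.
    return [label for label, _, _ in program if label in in_slice]
```

Slicing on `s6` keeps `s1`, `s3`, `s4` and `s6`. The statements that only touch `count` drop out because they cannot influence `report`.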
Slide 35
PROGRAM SLICING
Slide 36
DATA VS. FLOW
Data Dependence
Tracks and analyzes the values stored in variables
Flow Dependence
Tracks the program's execution flow
Slide 37
VISUALIZER
Uses D3.js
A JavaScript library for data manipulation and
visualization
Can render visualizations directly from JSON
Offers a wide variety of chart types to choose from
SUBGRAPH
ISOMORPHISM
Two graphs G and H are given as input, and one must
determine whether G contains a subgraph that is
isomorphic to H.
Subgraph isomorphism problem - Wikipedia, the free
encyclopedia
Graph isomorphism - Wikipedia, the free
encyclopedia
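The definition above can be checked directly with a brute-force search over node mappings. This sketch deliberately uses only the standard library instead of NetworkX, runs in exponential time, and is meant for tiny graphs only. The edge-list representation and the `contains_subgraph` name are assumptions for this example.

```python
from itertools import permutations


def contains_subgraph(g_edges, h_edges):
    """Return True if directed graph G has a subgraph isomorphic to H.

    Both graphs are given as lists of (source, target) edges, so isolated
    nodes are not represented.
    """
    g_nodes = sorted({n for edge in g_edges for n in edge})
    h_nodes = sorted({n for edge in h_edges for n in edge})
    g_set = set(g_edges)
    # Try every injective mapping of H's nodes onto nodes of G.
    for image in permutations(g_nodes, len(h_nodes)):
        mapping = dict(zip(h_nodes, image))
        # The mapping works if every edge of H lands on an edge of G.
        if all((mapping[u], mapping[v]) in g_set for u, v in h_edges):
            return True
    return False


G = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
TRIANGLE = [("x", "y"), ("y", "z"), ("z", "x")]
SQUARE = [("p", "q"), ("q", "r"), ("r", "s"), ("s", "p")]
```

With these samples, `contains_subgraph(G, TRIANGLE)` is true because `a -> b -> c -> a` forms a directed triangle. `contains_subgraph(G, SQUARE)` is false because `d` has no outgoing edge, so no directed 4-cycle exists in `G`.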
Slide 45
GRAPH ISOMORPHISM
Slide 46
VISUALIZER
Uses D3.js to present the visualized Data Dependency
Graph on a web page
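One way to hand a dependency graph to a D3.js page is to serialize it as JSON with a list of nodes and a list of links. The sketch below assumes a `nodes`/`links` layout with index-based `source` and `target` fields, which many D3 force-layout examples load. The key names and the `to_node_link_json` helper are assumptions for this example, not the project's actual format.

```python
import json


def to_node_link_json(edges):
    """Serialize a list of (source, target) edges as node-link JSON."""
    names = sorted({n for edge in edges for n in edge})
    index = {name: i for i, name in enumerate(names)}
    graph = {
        "nodes": [{"name": name} for name in names],
        # Links refer to nodes by their position in the "nodes" list.
        "links": [{"source": index[u], "target": index[v]} for u, v in edges],
    }
    return json.dumps(graph, indent=2)


sample_json = to_node_link_json([("s1", "s3"), ("s2", "s3"), ("s3", "s4")])
```

The resulting string can be written to a `.json` file and fetched by the page's script.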
Slide 47
WEBSITE
The site is built with Node.js and lets users upload .apk files for analysis
Slide 48
LIVE DEMO?
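There is probably not enough time left, right?
Actually, a small final part of the website is still unfinished; I have been writing the project's final report lately.
SAAF is licensed under the GNU GPL v3, so the source code will be released later. Stay tuned.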