Slide 1

Slide 1 text

Peeling Onions
@ntddk
AVTOKYO 2015
2015.11.14

Slide 2

Slide 2 text

$ whoami
• http://ntddk.github.io
• Student
• Security Camp Lecturer
• CODE BLUE 2015 Speaker

Slide 3

Slide 3 text

Peeling Onions

Slide 4

Slide 4 text

The Onion Routing

Slide 5

Slide 5 text

Tor
• One of the most common anonymous communication systems

Slide 6

Slide 6 text

Tor
AES encryption

Slide 7

Slide 7 text

Tor
• An exit node can sniff plaintext packets, so you should use SSL.
• An entry node can learn the source IP address.

Slide 8

Slide 8 text

Tor hidden services
• Hidden services can only be accessed through Tor.
• Exit nodes are not used in this case.
• Addresses use the .onion TLD.
• These are also called the “deep web”.

Slide 9

Slide 9 text

Investigation of the underground community in Japan
http://blog.trendmicro.co.jp/archives/12349

Slide 10

Slide 10 text

https://www.facebookcorewwwi.onion
The address is a hash of the service's public key.
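
To make "hash of public key" concrete for 16-character addresses such as facebookcorewwwi.onion: the name is the first 80 bits of the SHA-1 digest of the service's DER-encoded RSA public key, written in base32. The sketch below assumes the key bytes are already at hand; `fake_key` is a hypothetical placeholder, not a real key.

```python
import base64
import hashlib

def onion_v2_address(der_public_key: bytes) -> str:
    """Derive a 16-character .onion name from DER-encoded public key bytes."""
    digest = hashlib.sha1(der_public_key).digest()  # 20-byte SHA-1 digest
    truncated = digest[:10]                         # keep the first 80 bits
    label = base64.b32encode(truncated).decode("ascii").lower()
    return label + ".onion"

# Hypothetical placeholder bytes; a real service would hash its RSA public key.
fake_key = b"not a real DER-encoded key"
address = onion_v2_address(fake_key)
print(address)
```

Because the name is fixed by the key, a readable prefix like "facebook" can only be obtained by generating many keys until the hash happens to start with the wanted characters.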

Slide 11

Slide 11 text

https://www.facebookcorewwwi.onion

Slide 12

Slide 12 text

Tor

Slide 13

Slide 13 text

https://goo.gl/PFWpYn

Slide 14

Slide 14 text

Measurement of drug marketplaces on the Tor network
https://www.usenix.org/conference/usenixsecurity15/technical-sessions/session/measurement

Slide 15

Slide 15 text

Trust is the most important thing in the underground.
One review, one transaction.

Slide 16

Slide 16 text


Slide 17

Slide 17 text

Crawling the Deep Web

Slide 18

Slide 18 text

Kaspersky found 900 hidden services.
http://securelist.com/blog/incidents/58542/tor-hidden-services-a-safe-haven-for-cybercriminals/

Slide 19

Slide 19 text

The Tor Project says there are over 30,000 hidden services.
http://securelist.com/blog/incidents/58542/tor-hidden-services-a-safe-haven-for-cybercriminals/

Slide 20

Slide 20 text

List of 174,523 hidden services
https://bdpuqvsqmphctrcs.onion

Slide 21

Slide 21 text

21 { "aaData": [ [ "0", "torlinkbgs6aabns.onion", "TorLinks | .onion Link List The Hidden Wiki TheHiddenWiki Onion Urls onionland Tor linklist Deepweb", "https://encrypted.google.com/search?q=¥".onion¥"", "1442342899", "1369353102", "388", "2328" ], [ "1", "ci3hn2uzjw2wby3z.onion", "Talk.onion", "https://encrypted.google.com/search?q=¥".onion¥"", "1375548844", "1369353102", "396", … But, how many of them still alive?

Slide 22

Slide 22 text

Scraping
• Confirm the existence of services from the list.

torsocks wget \
  --connect-timeout=10 --tries=1 \
  --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:8.0.1) Gecko/20100101 Firefox/8.0.1" \
  [.onion]

• User agent: same as Tor Browser.
• You should scrape only HTML to avoid downloading child pornography.

Slide 23

Slide 23 text

Scraping
• Confirm the existence of services from the list.
• Found 4,102/174,523 sites still alive!

torsocks wget \
  --connect-timeout=10 --tries=1 \
  --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:8.0.1) Gecko/20100101 Firefox/8.0.1" \
  [.onion]
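
The liveness check can be looped over the whole list. The following is only a sketch of how that might be scripted: it rebuilds the torsocks wget invocation from this slide and, as an assumption of this example, treats exit status 0 as "alive". The names `wget_command`, `is_alive` and `alive_hosts` are hypothetical.

```python
import subprocess
from typing import Callable, Iterable

USER_AGENT = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:8.0.1) "
              "Gecko/20100101 Firefox/8.0.1")

def wget_command(host: str) -> list[str]:
    """Build the torsocks/wget invocation for one host."""
    return ["torsocks", "wget", "--connect-timeout=10", "--tries=1",
            "--user-agent=" + USER_AGENT, host]

def is_alive(host: str) -> bool:
    """Assumption of this sketch: wget exit status 0 means the service answered."""
    return subprocess.run(wget_command(host)).returncode == 0

def alive_hosts(hosts: Iterable[str],
                probe: Callable[[str], bool] = is_alive) -> list[str]:
    """Keep only the hosts for which the probe succeeds."""
    return [host for host in hosts if probe(host)]

# Offline demonstration with a fake probe, so nothing touches the network here.
print(alive_hosts(["a.onion", "b.onion"], probe=lambda host: host.startswith("a")))
```

With the real probe, 4,102 of 174,523 hosts answering corresponds to roughly 2.4% of the list.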

Slide 24

Slide 24 text

Word Cloud
• Now we have the HTML files of the top pages.
• Visualize word frequency.

#!/usr/bin/env python2
import sys
from os import path
from wordcloud import WordCloud

# Resolve the input file (first argument) relative to this script's directory.
d = path.dirname(__file__)
argvs = sys.argv
text = open(path.join(d, argvs[1])).read()

# Render a 2560x1440 word cloud and save it next to the input as <name>.png.
wordcloud = WordCloud(max_font_size=600, width=2560, height=1440).generate(text)
wordcloud.to_file(path.join(d, argvs[1] + ".png"))

Slide 25

Slide 25 text

• Word cloud of all sites

Slide 26

Slide 26 text

• Word cloud of sites that contain “malware”
• Black hat hackers like sophomoric words. lol

Slide 27

Slide 27 text

Clustering
• Strip HTML tags and extract the text.
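
One way to do the tag stripping with only the Python standard library is sketched below. Skipping the contents of script and style elements is an assumption of this example, not something the slide specifies; `TextExtractor` and `strip_tags` are illustrative names.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        # Join the chunks and collapse runs of whitespace to single spaces.
        return " ".join(" ".join(self._chunks).split())

def strip_tags(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()

page = "<html><body><h1>Tor  shop</h1><script>var x=1;</script><p>Buy now</p></body></html>"
print(strip_tags(page))
```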

Slide 28

Slide 28 text

Clustering
• Calculate tf-idf
• tf: term frequency
• idf: inverse document frequency
• tf-idf = tf*log_2(N/df), where N is the number of documents and df is the number of documents that contain the term

Example: https://www.facebookcorewwwi.onion
0.0896242738109923 facebook
0.0811608402477274 checkthis
0.0763369146951766 words
0.0637879047442443 sign
0.0599923671251114 039
0.0494245419919427 try
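
The tf*log_2(N/df) weighting can be sketched as follows. The slide does not say how tf is normalised; this example assumes tf is the term count divided by the document length, and the function name `tf_idf` is illustrative.

```python
import math
from collections import Counter

def tf_idf(documents: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: tf * log2(N / df)} mapping per tokenised document."""
    n_docs = len(documents)
    # df: in how many documents does each term occur at least once?
    df = Counter()
    for words in documents:
        df.update(set(words))
    scores = []
    for words in documents:
        counts = Counter(words)
        total = len(words)
        # Assumption: tf is the relative frequency of the term in the document.
        scores.append({
            term: (count / total) * math.log2(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

docs = [["facebook", "sign", "facebook"], ["market", "sign"]]
weights = tf_idf(docs)
print(weights)
```

A term that appears in every document, like "sign" here, gets weight 0 because log_2(N/df) is 0, which is why site-specific words float to the top of the ranking.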

Slide 29

Slide 29 text

Clustering
• Pearson score
[Figure labels: a, b, c, d, e]
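
For reference, the Pearson score between two equally long weight vectors can be computed as below. Turning it into a distance as 1 - r and returning 0 for a constant vector are assumptions of this sketch, not details taken from the slides.

```python
import math

def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    if denom == 0:
        # Assumption: a constant vector is treated as uncorrelated.
        return 0.0
    return cov / denom

def pearson_distance(x: list[float], y: list[float]) -> float:
    """Smaller means more similar: 0 for r = 1, 2 for r = -1."""
    return 1.0 - pearson(x, y)

print(pearson([1, 2, 3], [2, 4, 6]))
```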

Slide 30

Slide 30 text

Clustering
• 100 sites (randomly chosen)

Slide 31

Slide 31 text

Clustering
• 1,000 sites (randomly chosen)

Slide 32

Slide 32 text

Clustering
• 4,102 sites (all)
• O(N^2)
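
The O(N^2) cost comes from comparing every pair of sites: for N = 4,102 that is N(N-1)/2 = 8,411,151 pairs. Below is a naive agglomerative clustering sketch that makes the pairwise scan explicit. It assumes Pearson distance and centroid merging, which is one common choice and not necessarily what was used for the talk; all names are illustrative.

```python
import math
from itertools import combinations

def pearson_distance(x, y):
    """1 - Pearson correlation; constant vectors count as maximally neutral (1.0)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return 1.0 if den == 0 else 1.0 - cov / den

def pair_count(n: int) -> int:
    """Number of unordered pairs among n items."""
    return n * (n - 1) // 2

def agglomerate(vectors, k):
    """Merge the closest pair of clusters until only k clusters remain."""
    # Each cluster is (member indices, centroid vector).
    clusters = [([i], list(v)) for i, v in enumerate(vectors)]
    while len(clusters) > k:
        # Each pass scans all pairs of current clusters: O(N^2) distances.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: pearson_distance(clusters[p[0]][1], clusters[p[1]][1]),
        )
        wi, wj = len(clusters[i][0]), len(clusters[j][0])
        members = clusters[i][0] + clusters[j][0]
        centroid = [(a * wi + b * wj) / (wi + wj)
                    for a, b in zip(clusters[i][1], clusters[j][1])]
        # j > i, so delete j first to keep index i valid.
        del clusters[j]
        del clusters[i]
        clusters.append((members, centroid))
    return [sorted(members) for members, _ in clusters]

vectors = [[1, 2, 3], [2, 4, 7], [3, 2, 1], [6, 4, 1.5]]
print(agglomerate(vectors, 2))
print(pair_count(4102))
```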

Slide 33

Slide 33 text


Slide 34

Slide 34 text

Tor shops
• Template written in PHP for black markets
• Bitcoin
• Used at 48 sites

Slide 35

Slide 35 text

Clustering (Tor shops)

Slide 36

Slide 36 text

Clustering (Tor shops)

Slide 37

Slide 37 text

Clustering (Tor shops)

Slide 38

Slide 38 text

Clustering (Tor shops)

Slide 39

Slide 39 text

Clustering (Tor shops)

Slide 40

Slide 40 text

Clustering (Tor shops)

Slide 41

Slide 41 text

Clustering (Tor shops)

Slide 42

Slide 42 text

Clustering (Tor shops)

Slide 43

Slide 43 text

Clustering (Tor shops)

Slide 44

Slide 44 text

Clustering (Tor shops)

Slide 45

Slide 45 text

Clustering (Tor shops)

Slide 46

Slide 46 text

Clustering (Tor shops)

Slide 47

Slide 47 text

Clustering (Tor shops)

Slide 48

Slide 48 text

Clustering (Tor shops)

Slide 49

Slide 49 text

Clustering (Tor shops)

Slide 50

Slide 50 text

Clustering (Tor shops)
• About 12 clusters
• Hitman, Drug, Phone, Tablet, Kush, LSD, Cannabis, Cocaine, USD, Hacker, US Passport, UK Passport

Slide 51

Slide 51 text

Previous research on clustering web-based hidden services, but not focused on black markets
https://www.youtube.com/watch?v=-oTEoLB-ses&feature=youtu.be&t=1998

Slide 52

Slide 52 text

Conclusion
• There are Tor hidden services for criminal purposes.
• There are about 12 types of black markets.
• We will do further analysis using k-means in the future.
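
As a pointer for the planned follow-up, a bare-bones k-means loop looks like the sketch below. It assumes Euclidean distance and caller-supplied starting centroids; both are choices made for this example only, and the slides do not describe how the future analysis will be set up.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    centroids = [list(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_assignment = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers = k_means(points, centroids=[points[0], points[2]])
print(labels, centers)
```

Unlike the pairwise hierarchical approach, each iteration costs on the order of N times k distance computations, and with about 12 market types a small k would be a natural starting point.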

Slide 53

Slide 53 text

Running an Exit Node

Slide 54

Slide 54 text

Tor
• An exit node can sniff unencrypted packets, so you should use SSL.
• An entry node can learn the source IP address.

Slide 55

Slide 55 text

Tor
• An exit node can sniff unencrypted packets.
• We can monitor attacks made via Tor.

Slide 56

Slide 56 text

• For the victim, the attacker is YOU.
• An IPS is required.

Slide 57

Slide 57 text


Slide 58

Slide 58 text

Attack on the exit node
• You can specify exit nodes in .torrc.
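
To illustrate what specifying an exit node in .torrc looks like, the helper below renders the two relevant lines, using the `ExitNodes` and `StrictNodes` options. The helper name and the fingerprint are hypothetical and serve only as an illustration.

```python
def torrc_pin_exit(fingerprint: str) -> str:
    """Return torrc lines that restrict circuits to a single exit relay."""
    fp = fingerprint.strip().upper()
    if len(fp) != 40 or any(ch not in "0123456789ABCDEF" for ch in fp):
        raise ValueError("expected a 40-character hex relay fingerprint")
    # ExitNodes selects the relay; StrictNodes 1 tells Tor not to fall back to others.
    return f"ExitNodes ${fp}\nStrictNodes 1\n"

# Hypothetical fingerprint used only for illustration.
snippet = torrc_pin_exit("0123456789abcdef0123456789abcdef01234567")
print(snippet)
```

This is why an exit node operator has to expect targeted traffic: any client can choose to leave the network through one particular relay.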

Slide 59

Slide 59 text

Pass the buck (bad idea)
• Pass the traffic to Tor once more, through an entry node.

Slide 60

Slide 60 text

Appendix

Slide 61

Slide 61 text

https://goo.gl/PFWpYn

Slide 62

Slide 62 text

https://goo.gl/PFWpYn

Slide 63

Slide 63 text

https://goo.gl/PFWpYn

Slide 64

Slide 64 text

https://goo.gl/PFWpYn

Slide 65

Slide 65 text

https://www.reddit.com/r/SilkRoad/

Slide 66

Slide 66 text

https://www.reddit.com/r/DarkNetMarkets

Slide 67

Slide 67 text

“The Dark Net: Inside the Digital Underworld”

Slide 68

Slide 68 text

Session on anonymizing data
https://www.usenix.org/conference/usenixsecurity15/technical-sessions/session/forget-me-not
• BGP/AS-level de-anonymization of Tor
• Fingerprinting hidden service connections

Slide 69

Slide 69 text

Tor Browser on this machine (default settings)
http://panopticlick.eff.org

Slide 70

Slide 70 text

https://torstatus.blutmagie.de
• There are only ~7000 exit nodes.
• Tor is not fully secure.
• Tor research is no longer about whether it can be de-anonymized. It is only about how efficiently.

Slide 71

Slide 71 text

• 187 deep web architectures
• Alternative internets alter the world, and crime as well.
https://github.com/redecentralize/alternative-internet