
Peeling Onions

Yuma Kurogome
November 15, 2015

AVTOKYO 2015 presentation slides


Transcript

  1. Tor (slide 7)

    An exit node can sniff plaintext packets, so you should use SSL. The entry node can learn the source IP address.
  2. Tor hidden services (slide 8)

    Hidden services can only be accessed through Tor, and exit nodes are not used for these connections. They use the .onion TLD. These are also called the “deep web”.
  3. (slide 16)

  4. (slide 21)

        { "aaData": [
          [ "0", "torlinkbgs6aabns.onion", "TorLinks | .onion Link List The Hidden Wiki TheHiddenWiki Onion Urls onionland Tor linklist Deepweb", "https://encrypted.google.com/search?q=\".onion\"", "1442342899", "1369353102", "388", "2328" ],
          [ "1", "ci3hn2uzjw2wby3z.onion", "Talk.onion", "https://encrypted.google.com/search?q=\".onion\"", "1375548844", "1369353102", "396", …

    But how many of them are still alive?
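The listing on this slide looks like a JSON object whose aaData array holds one row per service. A minimal sketch of pulling the hostnames out of such a listing follows. It assumes that the second field of each row is the .onion hostname and the third is the page title; the slides do not say what the remaining fields mean, so they are ignored. `extract_onions` is a hypothetical helper name, and the sample rows are shortened copies of the rows on the slide.

```python
import json


def extract_onions(raw_json):
    """Return (hostname, title) pairs from an aaData-style listing."""
    rows = json.loads(raw_json)["aaData"]
    # Assumption: row[1] is the .onion hostname and row[2] is the page title.
    return [(row[1], row[2]) for row in rows if row[1].endswith(".onion")]


# Two rows modelled on the slide; titles and trailing fields are shortened.
sample = json.dumps({"aaData": [
    ["0", "torlinkbgs6aabns.onion", "TorLinks | .onion Link List"],
    ["1", "ci3hn2uzjw2wby3z.onion", "Talk.onion"],
]})

for host, title in extract_onions(sample):
    print(host, "-", title)
```

The resulting hostname list is what a liveness check would iterate over.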
  5. Scraping (slide 22)

    Confirm the existence of services from the list.

        torsocks wget \
          --connect-timeout=10 --tries=1 \
          --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:8.0.1) Gecko/20100101 Firefox/8.0.1" \
          [.onion]

    The User-Agent is the same as Tor Browser's. You should scrape only HTML to avoid downloading child pornography.
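The command on this slide can be wrapped in a small script to check many hosts. This is a sketch under stated assumptions, not the author's tooling: it is written for Python 3, it assumes torsocks and wget are installed and a Tor daemon is running, and it treats a zero wget exit status as "alive". The function names `build_wget_command`, `is_alive` and `count_alive` are hypothetical.

```python
import subprocess

USER_AGENT = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:8.0.1) "
              "Gecko/20100101 Firefox/8.0.1")


def build_wget_command(onion_host):
    """Build the torsocks wget invocation from the slide for one host."""
    return [
        "torsocks", "wget",
        "--connect-timeout=10", "--tries=1",
        "--user-agent=" + USER_AGENT,
        onion_host,
    ]


def is_alive(onion_host):
    """Assumption: a zero wget exit status means the top page was fetched."""
    result = subprocess.run(build_wget_command(onion_host),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def count_alive(hosts, check=is_alive):
    """Count the hosts for which the liveness check succeeds."""
    return sum(1 for host in hosts if check(host))
```

Passing the check as a parameter keeps the counting logic testable without touching the network.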
  6. Scraping (slide 23)

    Confirm the existence of services from the list, using the same torsocks wget command as on the previous slide.

    Found 4,102/174,523 sites still alive (about 2.35%)!
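  7. Word Cloud (slide 24)

    Now we have the HTML files of the top pages. Visualize word frequency:

        #!/usr/bin/env python2
        import sys
        from os import path
        from wordcloud import WordCloud

        # Directory containing this script; the input path is resolved against it
        d = path.dirname(__file__)
        argvs = sys.argv
        # Read the text file named by the first command-line argument
        text = open(path.join(d, argvs[1])).read()
        # Render a 2560x1440 word cloud and save it as <input>.png
        wordcloud = WordCloud(max_font_size=600, width=2560, height=1440).generate(text)
        wordcloud.to_file(path.join(d, argvs[1] + ".png"))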
  8. Word cloud of sites which contain “malware” (slide 26)

    Black hat hackers like sophomoric words. lol
  9. Clustering (slide 28)

    Calculate tf-idf:
    • tf: term frequency
    • idf: inverse document frequency
    • tf-idf = tf * log_2(N / df), where N is the number of documents and df is the number of documents containing the term

    tf-idf scores for https://www.facebookcorewwwi.onion:

        0.0896242738109923  facebook
        0.0811608402477274  checkthis
        0.0763369146951766  words
        0.0637879047442443  sign
        0.0599923671251114  039
        0.0494245419919427  try
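The tf * log_2(N / df) weighting on this slide can be sketched in a few lines of Python. This is an illustration, not the author's code: it assumes whitespace tokenisation and takes tf to be the term count divided by the document length, neither of which the slides specify. The three documents are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document using tf * log2(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # df: number of documents in which each term appears at least once
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # Assumption: tf is the count normalised by document length
            term: (count / total) * math.log2(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Made-up documents standing in for scraped top pages
docs = [
    "sign up for facebook",
    "buy drugs with bitcoin",
    "sign in to buy",
]
scores = tf_idf(docs)
top = sorted(scores[0].items(), key=lambda kv: kv[1], reverse=True)
print(top)
```

A term that occurs in every document gets a score of zero, because log_2(N / N) = 0, which is why common words drop out of the ranking.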
  10. (slide 33)

  11. Tor shops (slide 34)

    • Template written in PHP for black markets
    • Bitcoin
    • Used at 48 sites
  12. Clustering (Tor shops) (slide 50)

    About 12 clusters: Hitman, Drug, Phone, Tablet, Kush, LSD, Cannabis, Cocaine, USD, Hacker, US Passport, UK Passport
  13. Conclusion (slide 52)

    • There are Tor hidden services for criminal purposes.
    • There are about 12 types of black markets.
    • We will do further analysis using k-means in the future.
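The slides leave the k-means analysis as future work, so the following is only an illustrative sketch of the technique, not the author's implementation. It assumes each site has already been turned into a fixed-length numeric vector (for example tf-idf weights), and it seeds the centroids with the first k points purely to keep the example deterministic. The function name `kmeans` and the sample vectors are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on equal-length numeric vectors.

    Assumption: centroids start at the first k points, which keeps the
    sketch deterministic; real analyses usually use random or k-means++ seeding.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Made-up 2-D vectors standing in for per-site tf-idf features
vectors = [(0.9, 0.1), (0.1, 0.8), (0.8, 0.2), (0.2, 0.9)]
labels, centroids = kmeans(vectors, k=2)
print(labels)
```

With k set to roughly 12, the same procedure would group shop pages by their dominant vocabulary, which is the kind of split the cluster list on the earlier slide suggests.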
  14. Tor (slide 54)

    An exit node can sniff unencrypted packets, so you should use SSL. The entry node can learn the source IP address.
  15. (slide 57)

  16. (slide 70)

    https://torstatus.blutmagie.de

    There are only ~7000 exit nodes. Tor is not fully secure. Tor research is not at the stage of asking whether Tor can be de-anonymized or not; it is only a question of how efficiently.
  17. (slide 71)

    • 187 deep web architectures
    • Alternative internets alter the world, and crime as well.

    https://github.com/redecentralize/alternative-internet