Honey Onions: Exposing Snooping Tor HSDir Relays

Amirali Sanatinia

August 01, 2016

Transcript

  1. Honey Onions: Exposing Snooping Tor HSDir Relays
    Amirali Sanatinia & Guevara Noubir
    {amirali, noubir}@ccs.neu.edu
  2. What is Tor?
    • Tor is a powerful and popular privacy tool
    • Services
      • Allows one to browse the Internet anonymously
      • Allows one to run a hidden service without revealing its physical location
    • Used by a variety of people and applications to protect their privacy
      • Normal users (majority): do not want to be tracked, want to circumvent censorship
      • Journalists
      • Activists & whistleblowers
      • Law enforcement & military
      • Criminals
  3. Hidden Services
    • Allow one to run a hidden service without revealing its physical location
    • Additional side effects: self-authentication (no need for certificates), end-to-end encryption
    • Examples
  4. Tor
    • Tor does not claim to protect against everything
    • Tor Browser: does not mean end-to-end encryption
    • Hidden Services: does not hide the existence of the hidden service
  5. Related Work & Motivation
    • Like most privacy infrastructure, it is also a target of attacks & abuse
      • Cryptography, crypto-currencies
    • Previous research studied the maliciousness of relays/exits
    • Other work looked at the nature of hidden services content
    • We are interested in Hidden Service Directories (HSDirs)
      • Indicator of the presence of malicious actors
    • Related work within the Tor Project (Donncha O'Cearbhaill)
    • Not about breaking the stated properties of Tor/Hidden Services
  6. Questions
    • How many relays with the HSDir flag snoop on hidden services information?
      • Derive a lower bound
    • Which relays snoop?
      • IP address? Location?
    • What else do they do?
      • Passive vs. active
    • Who are they?
  8. Ring of Responsible HSDirs
    • Every day 3 + 3 relays are selected
    • By design, to prevent DoS
    • Side effect: a misbehaving HSDir can snoop and learn .onion addresses
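
The "3 + 3" selection can be sketched as follows. This is a minimal illustration written from memory of the version 2 rendezvous specification, not the Tor implementation: the descriptor ID formula, the function names, the fake fingerprints, and the fake service identifier are all assumptions made for the example.

```python
import hashlib
import struct
from bisect import bisect_right

DAY = 86400  # seconds

def descriptor_id(permanent_id: bytes, now: int, replica: int,
                  cookie: bytes = b"") -> bytes:
    """Descriptor ID for one replica (formula assumed from the v2 spec)."""
    # The period rolls over once a day, at an offset set by the first ID byte.
    time_period = (now + permanent_id[0] * DAY // 256) // DAY
    secret_part = hashlib.sha1(
        struct.pack(">I", time_period) + cookie + bytes([replica])
    ).digest()
    return hashlib.sha1(permanent_id + secret_part).digest()

def responsible_hsdirs(desc_id: bytes, ring: list, spread: int = 3) -> list:
    """The `spread` fingerprints that follow desc_id on the sorted ring."""
    start = bisect_right(ring, desc_id)
    return [ring[(start + k) % len(ring)] for k in range(spread)]

# Hypothetical data: 50 fake relay fingerprints and a fake service identifier.
ring = sorted(hashlib.sha1(f"relay-{i}".encode()).digest() for i in range(50))
permanent_id = hashlib.sha1(b"hypothetical service key").digest()[:10]
now = 1455235200  # an arbitrary Unix timestamp

# Two replicas, three consecutive relays each: the "3 + 3" of the slide.
selected = {
    replica: responsible_hsdirs(descriptor_id(permanent_id, now, replica), ring)
    for replica in (0, 1)
}
for replica, relays in selected.items():
    print(replica, [fp.hex()[:8] for fp in relays])
```

Because the time period is part of the hash, the responsible relays change every day, and any relay holding the HSDir flag eventually gets to store (and possibly read) descriptors of services it was never told about.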
  9. Honey Onions (HOnions)
    • Each HOnion corresponds to a server/process that we don't share
    • Runs on a local IP address (Hidden Service)
    • Serves a white page
    • Accessible only through Tor and not shared anywhere or with anyone
    • Three schedules
      • Daily, Weekly, Monthly
    • #HSDir ≅ 3000 ⇒ #honion = 1500 (cover 95% of HSDir relays)
    • Log the requests/visits for further investigation
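
A rough check of the 95% figure, assuming each HOnion descriptor lands on 6 HSDirs (3 + 3) chosen uniformly and independently. That is a simplification, since the real selection picks consecutive relays on the ring, and the function name is illustrative only.

```python
def expected_coverage(num_hsdirs: int, num_honions: int,
                      per_honion: int = 6) -> float:
    """Expected fraction of HSDirs that host at least one HOnion."""
    # Probability that a given HSDir is missed by every single HOnion.
    p_missed = (1 - per_honion / num_hsdirs) ** num_honions
    return 1 - p_missed

coverage = expected_coverage(num_hsdirs=3000, num_honions=1500)
print(f"expected coverage: {coverage:.1%}")
```

Under these assumptions the miss probability is close to e^-3, which is about 5%, in line with the coverage quoted on the slide.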
  10. HOnions Architecture
    1. Generate honions (ho_i, ho_j)
    2. Place honions on HSDirs (d_i, d_i+1, d_i+2)
    3. Build bipartite graph: on visit, mark the potential HSDirs and add them to the bipartite graph
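
Step 3 can be sketched as below. The data layout (a dictionary from each honion to the HSDirs that hosted its descriptor) and all names are assumptions for illustration; the slides do not describe the authors' actual bookkeeping.

```python
from collections import defaultdict

def build_visit_graph(placements: dict, visited: list) -> dict:
    """Map each suspect HSDir to the set of visited honions it hosted."""
    graph = defaultdict(set)
    for honion in visited:
        # Every HSDir that stored this honion's descriptor could be the leak.
        for hsdir in placements[honion]:
            graph[hsdir].add(honion)
    return dict(graph)

# Hypothetical placement log and visit log.
placements = {
    "ho1": {"dA", "dB", "dC"},
    "ho2": {"dB", "dC", "dD"},
    "ho3": {"dE", "dF", "dG"},  # never visited, so it adds no edges
}
visited = ["ho1", "ho2"]

graph = build_visit_graph(placements, visited)
for hsdir in sorted(graph):
    print(hsdir, sorted(graph[hsdir]))
```

Only visited honions contribute edges, so relays that hosted unvisited honions never enter the graph.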
  11. Minimal Set of HSDir Explaining Visits
    • $HSD = \{d_i : \text{Tor relays with HSDir flag}\}$
    • $HO = \{ho_j : \text{HOnion that was visited}\}$
    • $V = HSD \cup HO$
    • $E = \{(ho_j, d_i) \in HO \times HSD \mid ho_j \text{ was placed on } d_i \text{ and was visited}\}$
    • $\operatorname{argmin}_{S \subseteq HSD} |S| : \forall (ho_j, d_i) \in E,\ \exists d'_i \in S \wedge (ho_j, d'_i) \in E$
    • Can be shown equivalent to a set cover (an NP-complete problem)
    • Can be calculated using approximation algorithms
    • Set cover gives the lower bound on the number of snooping HSDirs
  12. Heuristic Approach: Greedy Algorithm
    • Input: $G(V, E)$: bipartite graph of HOnions to HSDirs
    • Output: $S$: set of HSDirs explaining the visits
    • $S \leftarrow \emptyset$
    • while $V \cap HO \neq \emptyset$ do
      • Pick $d \in V \cap HSD$ with the highest degree and add it to $S$
      • $V \leftarrow V \setminus (\{d\} \cup \{\text{its HOnion neighbors}\})$
    • end
    • Gives $\log |HO|$ approximation ratio
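
A minimal sketch of the greedy heuristic, assuming the bipartite graph is stored as a dictionary from each HSDir to the set of visited honions it hosted. The names and sample data are hypothetical.

```python
def greedy_cover(candidates: dict) -> list:
    """Greedy set cover: repeatedly pick the HSDir explaining most visits."""
    uncovered = set().union(*candidates.values())
    chosen = []
    while uncovered:
        # Highest remaining degree; sorting makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda d: len(candidates[d] & uncovered))
        chosen.append(best)
        # Drop the honions this HSDir explains (its neighbors in the graph).
        uncovered -= candidates[best]
    return chosen

# Hypothetical graph: HSDir -> visited honions it hosted.
candidates = {
    "d1": {"ho1", "ho2"},
    "d2": {"ho2", "ho3"},
    "d3": {"ho1", "ho2", "ho3"},
    "d4": {"ho4"},
}
explaining = greedy_cover(candidates)
print(explaining)
```

Here d3 alone explains three visits and d4 is the only relay that hosted ho4, so two relays suffice for this toy graph.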
  13. Integer Linear Programming (ILP)
    • $\min_{(x_1, \ldots, x_{|HSD|})} \sum_{i=1}^{|HSD|} x_i$
    • subject to $\forall ho_j \in HO:\ \sum_{i : (ho_j, d_i) \in E} x_i \geq 1$
    • Provides a lower bound on the number of snooping HSDirs to explain the visits
    • Fast using Matlab ILP solver
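
The same 0/1 program can be solved exactly on a toy instance by exhaustive search. This is a stand-in for an ILP solver (the slides used Matlab's), written only to make the constraints concrete; it is exponential in the number of HSDirs and the names and data are hypothetical.

```python
from itertools import combinations

def minimum_cover(candidates: dict) -> list:
    """Smallest set of HSDirs such that every visited honion has a host in it."""
    honions = set().union(*candidates.values())
    hsdirs = sorted(candidates)
    # Try every subset, smallest first; choosing a subset sets its x_i to 1.
    for size in range(len(hsdirs) + 1):
        for subset in combinations(hsdirs, size):
            covered = set().union(*(candidates[d] for d in subset))
            if covered == honions:  # every per-honion constraint holds
                return list(subset)
    return hsdirs  # not reached: the full set always covers everything

# Hypothetical graph: HSDir -> visited honions it hosted.
candidates = {
    "d1": {"ho1", "ho2"},
    "d2": {"ho3", "ho4"},
    "d3": {"ho1", "ho3"},
    "d4": {"ho2", "ho4"},
}
lower_bound = minimum_cover(candidates)
print(len(lower_bound), lower_bound)
```

The size of the returned set is the lower bound: at least that many relays must have snooped to account for all observed visits.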
  14. Experiment
    • Period reported on:
      • Start: Feb 12, 2016
      • End: April 24, 2016
    • [Figure labels: Daily, Weekly, Monthly; Unique Hidden Services Stats]
  15. Snoopers' Most Likely Geolocation
    • No snooping HSDirs in China, the Middle East, or Africa, but
      • In many of these countries Tor is blocked
    • Alibaba datacenter in California (15 HSDirs)
  16. More Stats
    • More than 70% of these HSDirs are hosted on cloud infrastructure
    • ~25% are exit nodes (compared to 15% for the average relay)
    • Top 5 cloud providers are
      • Alibaba California (15 detected HSDirs)
      • Digital Ocean (7)
      • Online S.A.S. (7)
      • OVH SAS (6)
      • Hetzner Online GmbH (6)
  17. Snooping Behavior
    • Wide variety of behavior
    • Automated vs. manual probing
      • Mostly automated, but some request favicon.ico (probably manual, via Tor Browser)
    • Aggressive, periodic probing
      • One periodically asked for the server-status page of Apache (mod_status)
    • Attempts to find vulnerabilities
      • SQL injection
        • Targeted information_schema.tables, username enumeration in Drupal
      • Path traversal
        • Looking for boot.ini and /etc/passwd
      • XSS
      • PHP Easter eggs
      • Targeting Drupal and Ruby on Rails
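
Logged requests could be sorted into the categories above with a few patterns. This is a rough sketch: the regular expressions, labels, and sample paths are invented for illustration and are not taken from the authors' logs or tooling.

```python
import re

# Hypothetical patterns, checked in order; the first match wins.
PROBE_PATTERNS = [
    ("sql_injection", re.compile(r"information_schema|union\s+select", re.I)),
    ("path_traversal", re.compile(r"\.\./|boot\.ini|/etc/passwd", re.I)),
    ("xss", re.compile(r"<script|%3Cscript", re.I)),
    ("server_status", re.compile(r"^/server-status")),
    ("favicon", re.compile(r"^/favicon\.ico$")),
]

def classify(path: str) -> str:
    """Return a coarse label for one requested path."""
    for label, pattern in PROBE_PATTERNS:
        if pattern.search(path):
            return label
    return "other"

# Made-up request paths, one per category.
samples = [
    "/favicon.ico",
    "/server-status",
    "/index.php?id=1 UNION SELECT table_name FROM information_schema.tables",
    "/../../etc/passwd",
    "/",
]
for path in samples:
    print(classify(path), path)
```

A favicon request hints at a person using a browser, while the other labels point to scripted scanning.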
  18. Snoopers' Identity
    • Hard to identify the real entity behind the relays
    • More than half of the HSDirs are hosted on cloud platforms
    • The geolocations correspond to the location of the hosting platform, not necessarily the entity running them
    • A number of cloud platforms are located in countries with stronger privacy protection for customers
    • Some cloud platforms accept payments in bitcoin, making it even harder to identify the real actors
  19. Conclusion
    • Tor relies on the honest behavior of volunteer relays
    • The detection, identification, and mitigation of misbehaving relays help improve the privacy and security of Tor
    • This work adds to the previous body of work on detecting misbehaving Tor relays
    • Honey Onions (HOnions) is a framework to detect snooping HSDirs
      • Provides a lower bound on the number of such relays
      • Identifies likely misbehaving HSDir relays
    • Game-theoretic formulation for different types of actors
      • Systematic immediate, delay, randomizers
      • Large delay => miss fresh information
      • Small delay => risk detection
    • Mitigations
      • Next-generation Hidden Services
      • Do not assume that the existence of your Hidden Service cannot be discovered