privacy tool
• Services
  • Allows one to browse the Internet anonymously
  • Allows running a hidden service without revealing its physical location
• Used by a variety of people and applications to protect their privacy
  • Normal users (majority): do not want to be tracked, or want to circumvent censorship
  • Journalists
  • Activists & whistleblowers
  • Law enforcement & military
  • Criminals
is also a target of attacks & abuse
• Cryptography, crypto-currencies
• Previous research studied the maliciousness of relays/exits
• Other work looked at the nature of hidden services content
• We are interested in Hidden Service Directories (HSDirs)
  • Indicator of the presence of malicious actors
  • Related work within the Tor Project (Donncha O'Cearbhaill)
  • Not about breaking the stated properties of Tor/Hidden Services
hidden services information?
  • Derive a lower bound
• Which relays snoop?
  • IP address? Location?
• What else do they do?
  • Passive vs. active
• Who are they?
that we don't share
• Run on local IP address (Hidden Service)
• Serve a white page
• Accessible only through Tor and not shared anywhere or with anyone
• Three schedules
  • Daily, Weekly, Monthly
• $\#\mathrm{HSDir} \cong 3000 \Rightarrow \#\mathrm{honion} = 1500$ (covers 95% of HSDir relays)
• Log the requests/visits for further investigation
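The 1500-honion figure can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes that each honion descriptor lands on 6 HSDirs (2 replicas of 3 relays, as in version 2 hidden services) and that placements are independent and uniform over the ring. Both are simplifying assumptions, and the function name `expected_coverage` is hypothetical.

```python
def expected_coverage(num_honions, num_hsdirs, hsdirs_per_honion=6):
    """Expected fraction of HSDirs that host at least one honion.

    Assumes every honion picks its HSDirs independently and uniformly
    at random, which only approximates the real ring-based placement.
    """
    # Probability that one particular HSDir is missed by a single honion.
    miss_one = 1 - hsdirs_per_honion / num_hsdirs
    # Probability that it is missed by all honions, then complement.
    return 1 - miss_one ** num_honions


coverage = expected_coverage(num_honions=1500, num_hsdirs=3000)
print(f"expected coverage: {coverage:.3f}")
```

Under these assumptions the result is close to 0.95, which agrees with the 95% coverage stated on the slide.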
Place honions on HSDirs
3. Build bipartite graph
[Figure: honion ho_j placed on consecutive HSDirs d_i, d_{i+1}, d_{i+2}. On visit, mark the potential HSDirs and add them to the bipartite graph.]
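The placement and graph-building steps can be sketched as follows. This is a toy model, not the framework's code: fingerprints and descriptor IDs are small integers, only one replica is modelled, and the names `responsible_hsdirs` and `record_visit` are hypothetical.

```python
from bisect import bisect_right


def responsible_hsdirs(ring, descriptor_id, count=3):
    """Return the `count` relays that follow descriptor_id on the ring."""
    ring = sorted(ring)
    start = bisect_right(ring, descriptor_id)
    # Wrap around the end of the ring with the modulo.
    return [ring[(start + k) % len(ring)] for k in range(count)]


def record_visit(edges, honion, placed_on):
    """On a visit, link the honion to every HSDir that held its descriptor."""
    for hsdir in placed_on:
        edges.add((honion, hsdir))


ring = [0x1A, 0x3C, 0x5E, 0x7F, 0x9B, 0xD4]  # toy HSDir fingerprints
placed = responsible_hsdirs(ring, descriptor_id=0x80)

edges = set()  # edge set of the bipartite graph
record_visit(edges, "ho_1", placed)
print(sorted(edges))
```

Any of the three HSDirs could have leaked the address, so a single visit adds all three edges; separating suspects requires many honions with overlapping placements.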
Tor relays with HSDir flag
• $HO = \{ho_j : \text{HOnion that was visited}\}$
• $V = HSD \cup HO$
• $E = \{(ho_j, d_i) \in HO \times HSD \mid ho_j \text{ was placed on } d_i \text{ and was visited}\}$
• $\operatorname*{argmin}_{S \subseteq HSD} |S| : \forall (ho_j, d_i) \in E,\ \exists d'_i \in S \wedge (ho_j, d'_i) \in E$
• Can be shown equivalent to set cover (an NP-complete problem)
• Can be calculated using approximation algorithms
• Set cover gives the lower bound on the number of snooping HSDirs
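One common approximation for set cover is the greedy heuristic, sketched below on made-up data. The slides do not say which approximation algorithm was used, so this is only an illustration. Note that the greedy result is a valid cover that may be larger than the minimum, so by itself it is an estimate rather than the exact lower bound.

```python
def greedy_cover(visited):
    """Pick HSDirs until every visited honion is explained.

    `visited` maps each visited honion to the non-empty set of HSDirs
    that hosted its descriptor (the edges E of the bipartite graph).
    """
    uncovered = set(visited)
    chosen = set()
    while uncovered:
        # Count how many still-unexplained honions each HSDir could explain.
        gain = {}
        for honion in uncovered:
            for hsdir in visited[honion]:
                gain[hsdir] = gain.get(hsdir, 0) + 1
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(gain), key=gain.get)
        chosen.add(best)
        uncovered = {h for h in uncovered if best not in visited[h]}
    return chosen


visited = {
    "ho_1": {"d1", "d2", "d3"},
    "ho_2": {"d3", "d4", "d5"},
    "ho_3": {"d5", "d6", "d7"},
}
cover = greedy_cover(visited)
print(sorted(cover))
```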
$\min_{x_1,\dots,x_{|HSD|}} \sum_{i=1}^{|HSD|} x_i$
• subject to $\forall ho_j \in HO:\ \sum_{i : (ho_j, d_i) \in E} x_i \ge 1$
• Provides a lower bound on the number of snooping HSDirs needed to explain the visits
• Fast using the Matlab ILP solver
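To make the integer program concrete, the sketch below finds the exact minimum on a tiny made-up instance by brute-force enumeration. This swaps the ILP solver for exhaustive search, which is only feasible for toy inputs; the function name `min_snoopers` is hypothetical.

```python
from itertools import combinations


def min_snoopers(visited):
    """Smallest set of HSDirs that explains every visited honion.

    Choosing a subset corresponds to setting its x values to 1; the
    check below is the constraint that each honion has at least one
    chosen HSDir among those that hosted it.
    """
    hsdirs = sorted(set().union(*visited.values()))
    for size in range(len(hsdirs) + 1):  # try the smallest sizes first
        for subset in combinations(hsdirs, size):
            chosen = set(subset)
            if all(chosen & placed for placed in visited.values()):
                return chosen
    return None  # only reached if some honion has no candidate HSDir


visited = {
    "ho_1": {"d1", "d2", "d3"},
    "ho_2": {"d3", "d4", "d5"},
    "ho_3": {"d5", "d6", "d7"},
}
solution = min_snoopers(visited)
print(len(solution), sorted(solution))
```

Here no single HSDir hosted all three honions, so at least two snooping HSDirs are needed to explain the visits.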
hosted on Cloud infrastructure
• ~25% are exit nodes (compared to 15% for the average relay)
• Top 5 cloud providers are
  • Alibaba California (15 detected HSDirs)
  • Digital Ocean (7)
  • Online S.A.S. (7)
  • OVH SAS (6)
  • Hetzner Online GmbH (6)
the relays
• More than half of the HSDirs are hosted on cloud platforms
• The geolocations correspond to the location of the hosting platform, not necessarily to the entity running them
• A number of cloud platforms are located in countries with stronger privacy protection for customers
• Some cloud platforms accept payments in Bitcoin, making it even harder to identify the real actors
relays
• The detection, identification, and mitigation of misbehaving relays help to improve the privacy and security of Tor
• This work adds to the previous body of work on detecting misbehaving Tor relays
• Honey Onions (HOnions) is a framework to detect snooping HSDirs
  • Provides a lower bound on the number of such relays
  • Identifies likely misbehaving HSDir relays
• Game-theoretic formulation for different types of actors
  • Systematic immediate, delay, randomizers
  • Large delay ⇒ miss out on fresh information
  • Small delay ⇒ risk detection
• Mitigations
  • Next generation Hidden Services
  • Do not assume that the existence of your Hidden Service cannot be discovered