Privacy Pass: Anonymously Limiting Abuse

Slide 1

Slide 1 text

Privacy Pass: Anonymously Limiting Abuse George Tankersley https://twitter.com/gtank__

Slide 2

Slide 2 text

Our scene: Cloudﬂare vs Tor, 2016

Slide 3

Slide 3 text

Cloudﬂare vs Tor Cloudﬂare ● Massive global CDN ● 100+ PoPs, “10% of the web” ● Not just a CDN, does extra logic Tor Project ● Distributed anonymity network ● ~6000 relays, ~2M users / day, ~200Gbit/s ● Tor Browser Bundle (Firefox ESR)

Slide 4

Slide 4 text

Cloudﬂare vs Tor Two things to know: 1. Tor hated Cloudﬂare 2. They were very loud about it

Slide 5

Slide 5 text

It was all very reasonable

Slide 6

Slide 6 text

The problem was CAPTCHAs Cloudfare use of IP reputation + Tor exits’ terrible IP rep

Slide 7

Slide 7 text

The problem was CAPTCHAs Averages from 7 days’ trafﬁc: ● 1.6 trillion web requests ● 780 million from Tor exits (0.05% of trafﬁc) ● 16.5B CAPTCHAs served (1.04% of global requests) ● 132.6M CAPTCHAs served to Tor (0.8% of CAPTCHAs, 17% of Tor requests) Disproportionately challenging Tor users.

Slide 8

Slide 8 text

They noticed

Slide 9

Slide 9 text

Social Problem, Technical Problem

Slide 10

Slide 10 text

Technical solutions First, nuke the ofﬁce IP reputation (incentive to improve!) Then ● Adopt reCAPTCHA v2 ● Add option to whitelist Tor network as a “country” ● Alter the internal treatment of Tor trafﬁc ● Invent some clever crypto protocol?

Slide 11

Slide 11 text

Slide 12

Slide 12 text

Technical solution A new protocol needs to meet the requirements of all parties: ● Cloudﬂare wants to protect websites from automated abuse. ● Tor Browser wants to protect the anonymity of its users. ● Users want to solve fewer CAPTCHAs. What if we gave users an untrackable token that proves they solved a CAPTCHA?

Slide 13

Slide 13 text

Privacy Pass Unlinkable, unforgeable, single-use anonymous credential Meets everyone’s needs: ● Allows Cloudflare to rate-limit with CAPTCHAs ● Prevents Cloudflare from tracking users ● Allows users to bypass redundant CAPTCHAs And they’re more efficient in batches!

Slide 14

Slide 14 text

Prerequisite: blind signatures Idea: A user wants a signer to sign a message without learning its contents Functions: sign, verify, blind, unblind blindMessage := Blind(message) blindSignature := Sign(blindedMessage) clearSignature := Unblind(blindSignature) Verify(message, clearSignature) => true

Slide 15

Slide 15 text

Signatures with RSA Public (N, e) and private d N = p*q and d = 1/e mod (N) Generate a signature S by S = Md mod N Verify it by checking that Se == M mod N (Md)e = M(1/e)*e = M1 = M mod N N is the modulus, product of two primes e is the public exponent, d is its inverse is magic, do not let it sense fear

Slide 16

Slide 16 text

Blind Signatures with RSA Generate a blinding factor re Blind the message: M blind = (re)M mod N S blind = (M blind )d mod N r also has an inverse, 1/r To unblind a signature: S = (S blind ) * (1/r) mod N = (M blind )d * (1/r) = (reM)d * (1/r) = (M)d(re)d * (1/r) = Md * red * (1/r) = Md mod N

Slide 17

Slide 17 text

Prerequisite: blind signatures Blind signatures provide unlinkability. Unlinkability means: ● Can’t tell that blinded/unblinded signatures are related ● Signer can’t associate unblinded (message, signature) with the blinded message it signed.

Slide 18

Slide 18 text

The protocol, generally Issuance Phase 1. User encounters a CAPTCHA or other challenge 2. User generates [many] blinded tokens, submits them with solution 3. Server signs tokens and returns them to user with response 4. User unblinds signatures and stores (token, signature) pairs. Redemption Phase 1. User encounters a CAPTCHA or other challenge 2. User submits a (token, signature) pair instead of solving 3. Server validates token, then grants access or rejects

Slide 19

Slide 19 text

Issuance, in detail 1. User generates a random token t and a blinding factor r 2. User calculates T = HashToGroup(t) and M = rT 3. User sends M to the server along with the CAPTCHA solution 4. Server validates solution with the challenger and signs Z = xM = xrT 5. Server generates ZK proof D showing that DLEQ(Z/M == Y/G) 6. Server sends (Z, D) to user 7. User checks the proof D against the sent tokens and the server’s public key commitment Y to establish that the server is using a consistent key. 8. User unblinds Z to calculate N = (1/r)Z = xT and stores (t, N)

Slide 20

Slide 20 text

Redemption, in detail 1. User calculates request binding data R for the request they want to make 2. User chooses unspent token t to redeem and retrieves (t, N) 3. User calculates a shared key sk = Hash(t || N) 4. User sends a pass (t, HMAC(sk, R)) to the server with the HTTP request 5. Server calculates R’ from observed request data 6. Server checks the double-spend list for t 7. Server calculates T = HashToGroup(t), N = xT and sk = Hash(t || N) 8. Server checks that HMAC(sk, R’) matches the user-supplied value 9. If HMAC matches, server processes the request and stores a record of t

Slide 21

Slide 21 text

Issuance and Redemption Flows 1-RTT best we can do; there is no known 0-RTT blind signing

Slide 22

Slide 22 text

Even more detail (for later) Design rationale: https://privacypass.github.io/protocol Protocol spec: github.com/privacypass/challenge-bypass-extension/blob/master/PROTOCOL.md Paper: https://petsymposium.org/2018/ﬁles/papers/issue3/popets-2018-0026.pdf Brave's improved variant: https://github.com/brave-intl/challenge-bypass-ristretto

Slide 23

Slide 23 text

Security & Privacy Guarantees 1. The server can’t track users through the tokens 2. The server can’t segment users with selective key use 3. The server can’t maliciously craft tokens or signatures 4. Users can’t create tokens without the server’s approval 5. Users can’t use a token more than once 6. Users can clear their state at any time

Slide 24

Slide 24 text

Server can’t track users through tokens Because of the blinding, each token spend is unlinkable both from its issuance request and from other tokens in the same issued batch. But if a user tries to spend the same token twice, those two requests will be linkable to each other (still not to issuance!) Extensions: traceability would allow you to link a token to its issued batch on double spend. Complexity tradeoff.

Slide 25

Slide 25 text

Server can’t segment users via keys Server could try to tag users by using unique signing keys. But it can’t go undetected w/ ZK proofs of key consistency Epochs/rotation and publishing key commitments: ● Signed software ● Transparency? Logs, auditors ● Other (e.g. Tor consensus) Tradeoff: number of simultaneously valid keys vs anon set

Slide 26

Slide 26 text

Server can’t maliciously craft tokens or signatures Server might try to tag users with structured responses. Can’t structure tokens: client generates them! Can’t structure signatures: unblinding is keyed randomization!

Slide 27

Slide 27 text

Users can’t create tokens without the server’s approval Token redemption involves (t, N) where N = k*H(t) A valid signature requires k, which is the server’s secret. Type of algorithm called an oblivious pseudorandom function.

Slide 28

Slide 28 text

Users can’t spend a token more than once Token redemption involves (t, N) where N = k*H(t) User incentives: ● t is a unique value. If server sees it twice, can link IPs/requests. Server enforcement: ● Maintains double-spend list to deny reuse during given key epoch.

Slide 29

Slide 29 text

Users can clear their state at any time Especially common for Tor Browser: “New Identity” button Protocol doesn’t assume any long-term user identity or keys Complete state clearance looks like a new user

Slide 30

Slide 30 text

Operational Needs 1. Signing/Validation service (low maintenance, ~stateless) 2. Key management and rotation (tunable; security/privacy) 3. Trusted key publication endpoint or client software (TBB) 4. Double-spend accumulator (tunable; potential SPOF)

Slide 31

Slide 31 text

In practice We redeem tokens for a TTL’d clearance cookie (safe because SOP). ● Token is effectively an anonymous, cross-origin precookie Probabilistic double-spend. We avoid SPOF by allowing occasional false negatives. We use an elliptic curve VOPRF for 10x better bandwidth & speed than blind RSA. ● Was necessary at one point to ﬁt tokens in a header ● Uses NIST P-256 for compatibility. Not optimal. Use Ristretto now. Binding data is Host and Path concatenated.

Slide 32

Slide 32 text

Deployment stats Released Nov 2018, ~108k downloads now (Chrome: 86,738; Firefox: 21, 161) The number of redemptions occurring peaked globally at 2000 qps and for Tor users at 200 qps during the surveyed week. Not an experiment! Privacy Pass is still running. In absolute numbers it’s more popular with VPN users than Tor. Operation Bytes Signing request 57 + 63 * N Signing response 295 + 121 * N Redemption request 396 Bandwidth Overhead* (batch of N) *Median request and response sizes for Tor users were 700–800 bytes and 5–6 KB.

Slide 33

Slide 33 text

Extensions and Future Work ● Performance: faster curves, better impl, batch & vectorize ○ Brave is already doing this! ● Standard AC properties ○ Attributes (open and/or hidden) ○ Tracing ● Asymmetric version (issuer != veriﬁer) ● Composition with macaroons (anon “OAuth” delegation?) ● Post-quantum primitives: lattice, isogeny variants

Slide 34

Slide 34 text

Questions? George Tankersley https://twitter.com/gtank__