Scary JavaScript (and other Tech) that Tracks You Online

luke crouch
November 04, 2016

There are over 5,000 online trackers that use cookies, fingerprinting, and probabilistic device matching to follow you across the web. Some methods are actively used for fraud, malware, and intrusive user tracking; others are commonly used for legitimate purposes. We'll talk about how sites are able to follow users, tracking methods both fair and foul, and how Mozilla protects users from tracking.


  1. Scary JavaScript (and other Tech) That Tracks You Online. Luke Crouch, Mozilla @groovecoder

  2. Luke Crouch • Web Developer at Mozilla • Not an expert in privacy tech (yet?) • Working on privacy & security experiments, prototypes, and studies for Firefox • Has 10 seconds per slide

  3. None
  4. None
  5. Data and Goliath Bruce Schneier

  6. Data that’s a by-product of online activity Browsing History Cookies

  7. Browser History

  8. Browser History Vulnerabilities

  9. CSS History Sniffing

  10. CSS History Sniffing getComputedStyle

  11. CSS History Sniffing 2010

  12. requestAnimationFrame History Sniffing

  13. requestAnimationFrame History Sniffing https://developer.mozilla.org/en-US/docs/Web/API/window/requestAnimationFrame

  14. requestAnimationFrame History Sniffing

  15. requestAnimationFrame History Sniffing

  16. Resource Access Sniffing

  17. https://robinlinus.github.io/socialmedia-leak/

  18. Resource Access Social Media Leak

  19. Cache Timing History Sniffing

  20. Network timing noise

  21. More reliable timing attacks https://tom.vg/2016/08/browser-based-timing-attacks/

  22. Video Parsing Timing Attack

  23. Video Parsing Timing Attack

  24. HSTS History Sniffing yan/@bcrypt https://diracdeltas.github.io/blog/sniffly/

  25. None
  26. None
  27. None
  28. None
  29. Set CSP on server

  30. Time CSP violations on client

  31. None
  32. Cookies

  33. developer.mozilla.org/login

  34. None
  35. None
  36. http://clearcode.cc/2015/12/cookie-syncing/

  37. http://clearcode.cc/2015/12/cookie-syncing/

  38. None
  39. None
  40. None
  41. Clear your cookies

  42. Cookie Re-spawning

  43. Re-spawning/“Supercookies”

  44. Using Flash

  45. Also Silverlight Isolated Storage

  46. HTML localStorage

  47. ETag

  48. Get/Set/Re-spawn on client

  49. Check ETag on Server
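
The ETag re-spawning idea on slides 47-49 can be sketched from the server's side. This is a minimal sketch, assuming the tracker stores the visitor ID in the ETag itself and forces revalidation with `Cache-Control: no-cache`; the function name `handle_request` and the ID format are hypothetical, not taken from the talk.

```python
import uuid
from typing import Dict, Tuple


def handle_request(request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Return (status, response_headers) for a cacheable tracking resource."""
    etag = request_headers.get("If-None-Match")
    if etag:
        # The browser replayed its cached ETag, so the value identifies the
        # visitor even if their cookies were cleared.
        return 304, {"ETag": etag}
    # First visit (or cache cleared): mint a new ID and hide it in the ETag.
    visitor_id = '"' + uuid.uuid4().hex + '"'
    # "no-cache" lets the browser store the response but makes it revalidate,
    # which is what causes If-None-Match to be sent on the next request.
    return 200, {"ETag": visitor_id, "Cache-Control": "private, no-cache"}


first_status, first_headers = handle_request({})
second_status, second_headers = handle_request({"If-None-Match": first_headers["ETag"]})
```

On the second request the server sees the same ID again without any cookie being involved, which is why clearing cookies alone does not stop this technique.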

  50. Cookie Re-spawning is “Illegal”. Or, at least, companies have been sued for it

  51. Cookie Syncing

  52. http://clearcode.cc/2015/12/cookie-syncing/

  53. http://clearcode.cc/2015/12/cookie-syncing/

  54. https://freedom-to-tinker.com/blog/englehardt/the-hidden-perils-of-cookie-syncing/

  55. http://clearcode.cc/2015/12/cookie-syncing/

  56. Cookie Syncing defeats No-respawn

  57. https://freedom-to-tinker.com/blog/englehardt/the-hidden-perils-of-cookie-syncing/

  58. Cookie Syncing = Giant Cookie Databases
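
Cookie syncing, as covered on slides 51-58, can be illustrated with two cooperating trackers. This is a rough sketch, assuming tracker A redirects the browser to tracker B with A's user ID in the query string; the endpoint, the parameter names `partner` and `partner_uid`, and both IDs are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_sync_redirect(partner_endpoint: str, my_user_id: str) -> str:
    """Tracker A: build the redirect URL that hands A's ID to tracker B."""
    query = urlencode({"partner": "tracker-a", "partner_uid": my_user_id})
    return partner_endpoint + "?" + query


def record_sync(sync_url: str, own_cookie_id: str, match_table: dict) -> None:
    """Tracker B: link the ID carried in the URL to the ID in B's own cookie."""
    query = parse_qs(urlparse(sync_url).query)
    key = (query["partner"][0], query["partner_uid"][0])
    match_table[key] = own_cookie_id


match_table = {}
url = build_sync_redirect("https://sync.tracker-b.example/match", "aaa111")
record_sync(url, own_cookie_id="bbb222", match_table=match_table)
```

After one such redirect, B can translate A's ID into its own, so the two companies can merge what they know about the same browser.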

  59. Without cookies

  60. None
  61. None
  62. Passive Fingerprints Don’t require code execution

  63. User-Agent, IP, Accept-Language, etc.
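
Because passive fingerprints need no code on the device, the whole computation can happen on the server. Here is a minimal sketch, assuming a tracker simply hashes the signals named on slide 63; the function name `passive_fingerprint` and the choice of SHA-256 are illustrative assumptions.

```python
import hashlib
from typing import Dict


def passive_fingerprint(headers: Dict[str, str], client_ip: str) -> str:
    """Derive a stable identifier from values every HTTP request already carries."""
    parts = [
        client_ip,
        headers.get("User-Agent", ""),
        headers.get("Accept-Language", ""),
    ]
    # Newlines cannot appear inside header values, so they keep fields separate.
    material = "\n".join(parts).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


request_headers = {"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "en-US,en;q=0.5"}
fingerprint = passive_fingerprint(request_headers, "203.0.113.7")
```

The same browser on the same network yields the same value on every request, with no cookie and no script involved.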

  64. HTTP Header Injection

  65. None
  66. turn.com Re-spawn http://webpolicy.org/2015/01/14/turn-verizon-zombie-cookie/

  67. None
  68. Active Fingerprints JavaScript code executes on your device

  69. Plugin Enumeration

  70. Okay but … enumeration is still possible via sniffing, like …

  71. Font Enumeration http://www.lalit.org/lab/javascript-css-font-detect/

  72. Measure default fonts

  73. Measure dictionary of fonts

  74. Canvas Fingerprint

  75. None
  76. None
  77. WebGL Fingerprinting http://cseweb.ucsd.edu/~hovav/dist/canvas.pdf

  78. AudioContext

  79. None
  80. https://webtransparency.cs.princeton.edu/webcensus/#audio-fp

  81. WebRTC

  82. WebRTC Local Addressing

  83. None
  84. WebVR “eyeprinting”

  85. Device Fingerprints ~= Cookies

  86. Cross-Device Matching

  87. Probabilistic “Householding” FTC Cross-Device Tracking Workshop https://www.ftc.gov/news-events/audio-video/video/cross-device-tracking-part-1

  88. Probabilistic “Tethering” cookie=4qasr4sdf1 cookie=f52dh64dhq Android Advertising Id=0436732361

  89. Probabilistic “Tethering” IP address: (early mornings, evenings, weekends) IP address: (9am-6pm weekdays) IP address: (9am-6pm weekdays) Cellular network (early mornings, evenings, weekends)

  90. Probabilistic “Tethering” Work?? Cell?? Home?? 80% 80% IP address: (9am-6pm weekdays) IP address: (9am-6pm weekdays) Cellular network (early mornings, evenings, weekends) IP address: (early mornings, evenings, weekends)

  91. Probabilistic Matching Work? Cell? Home? Location: 38.883914, -77.020997 Weekday location: 38.883914, -77.020997 Evening location: 38.897634, -77.036544 Location: 38.897634, -77.036544 95% 95%

  92. Probabilistic Matching Work Cell Home Technology news UVa sports Capitol Hill Arsenal football Technology news UVa sports Capitol Hill Arsenal football Technology news UVa sports Capitol Hill Arsenal football 98% 98% cookie=4qasr4sdf1 Android Advertising Id=0436732361 cookie=f52dh64dhq

  93. Device Graph id=4qasr4sdf1 Android Advertising Id=0436732361 id=f52dh64dhq
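
The “tethering” slides suggest that devices seen on the same networks at the same times of day probably belong to one person. Below is a toy sketch of that idea, assuming each device is reduced to a set of (network, time-of-day) observations and pairs are scored with Jaccard similarity; the device labels, IP addresses, and threshold are invented, and real systems use far richer signals.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Share of observations two devices have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_devices(observations: dict, threshold: float) -> list:
    """Return (id_a, id_b, score) for device pairs scoring at least `threshold`."""
    edges = []
    for id_a, id_b in combinations(sorted(observations), 2):
        score = jaccard(observations[id_a], observations[id_b])
        if score >= threshold:
            edges.append((id_a, id_b, score))
    return edges


# Illustrative data: the laptop and phone share both office and home networks,
# while a coworker's browser shares only the office network.
observations = {
    "laptop-cookie": {("203.0.113.7", "weekday-day"), ("198.51.100.9", "evening")},
    "phone-ad-id": {("203.0.113.7", "weekday-day"), ("198.51.100.9", "evening"),
                    ("cellular", "weekend")},
    "coworker-cookie": {("203.0.113.7", "weekday-day"), ("198.51.100.200", "evening")},
}
edges = link_devices(observations, threshold=0.5)
```

Only the laptop and phone end up linked; the coworker's overlap falls below the threshold, which hints at why adding more signals raises the confidence figures shown on the slides.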

  94. First-Party Deterministic You are signed in to their service

  95. None
  96. First-Party Deterministic Matching Login: JustinBrookman Login: JustinBrookman Login: JustinBrookman Third-party sites/apps that embed first-party Third-party sites/apps that embed first-party Third-party sites/apps that embed first-party

  97. “Over 1 billion people use Facebook on their phones every month and more than 80% of the top apps on iOS and Android now use Facebook logins.” –Mark Zuckerberg

  98. “One industry source that spoke with AdExchanger estimated Google’s logged-in cross-device user count as somewhere between 600 million and 1.2 billion, a conclusion based on the numerical intersection between Android users, iOS users, the Google login rate of iOS users and the number of logged-in desktop users for Google products.”

  99. • Email Address, Personally-Identifiable Information (PII) • Email Address, “Google Advertising ID” • Email Address, PII • Email Address,

  100. Note: Trusted Parties

  101. First-Party Deterministic You click their links

  102. Email for First Party Cross-Device Tracking Purchase item at a shopping site as [email protected]

  103. Email for First Party Cross-Device Tracking Purchase item at a shopping site as [email protected] Click on email from shopping site Open email from shopping site Android Advertising Id=0436732361 cookie=4qasr4sdf1 cookie=a035fs35fm

  104. Third-Party Probabilistic Device Matching

  105. Machine Learning Model
    1. Acquire device activity data set: IP addresses, WiFi networks, GPS coordinates, websites browsed, ads displayed, device type, operating system, browser cookies, mobile device IDs, time of day, etc.
    2. Acquire “truth set” of deterministic matching data: “training set” and “test set”
    3. Train ML models on the training set, evaluating accuracy, precision, and recall against the test set
    4. Point ML model at entire device activity data set
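
The evaluation step on the slide above, scoring predicted device pairs against the held-out test set, can be sketched as follows. This assumes device pairs are represented as unordered `frozenset`s; the function name and the sample IDs are hypothetical.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[frozenset], truth: Set[frozenset]) -> Tuple[float, float]:
    """Compare predicted device pairs with known (deterministic) pairs."""
    true_positives = len(predicted & truth)
    # Precision: how many predicted links are real.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: how many real links the model found.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


predicted = {frozenset({"a", "b"}), frozenset({"c", "d"}), frozenset({"e", "f"})}
truth = {frozenset({"b", "a"}), frozenset({"c", "d"}),
         frozenset({"g", "h"}), frozenset({"i", "j"})}
precision, recall = precision_recall(predicted, truth)
```

Here two of the three predicted links are correct and two of the four true links were found.
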
  106. https://www.google.com/policies/privacy/#nosharing, Sep 5 2016

  107. Untrusted Third-Party Deterministic

  108. PII Leaking https://www3.cs.stonybrook.edu/~phillipa/papers/contactus_pets2016.pdf

  109. None
  110. None
  111. None
  112. None
  113. None
  114. None
  115. None
  116. Audio Beaconing for Cross-Device Matching

  117. ec25d046746de3be33779256f6957d8f

  118. Other device privacy vulnerabilities • Visual/IR beaconing for cross-device matching? • Recognizing speech from gyroscope signals (crypto.stanford.edu/gyrophone) • Recognizing gait patterns with accelerometers

  119. Hashed Email for Third-Party Tracking Purchase item at a shopping site as [email protected] Click on email from shopping site Open email from shopping site Advertising Network md5=b16f55bbe0ff554fb40003f8e5f96b99

  120. Does Hashing Make Data Anonymous? https://www.ftc.gov/news-events/blogs/techftc/2012/04/does-hashing-make-data-anonymous

  121. Hash Functions https://blog.varonis.com/the-definitive-guide-to-cryptographic-hash-functions-part-1/
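
Slide 120 asks whether hashing makes data anonymous. A small sketch of why it often does not for email addresses: anyone holding a list of candidate addresses can hash each one and look for matches. The lowercase/strip normalization and the sample addresses are assumptions for illustration.

```python
import hashlib
from typing import Dict, Iterable


def md5_email(email: str) -> str:
    """Hash an email the way a tracker might before sharing it (assumed normalization)."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def reverse_hashes(hashed_ids: Iterable[str], candidate_emails: Iterable[str]) -> Dict[str, str]:
    """Recover emails behind hashes by hashing a list of known addresses."""
    lookup = {md5_email(email): email for email in candidate_emails}
    return {h: lookup[h] for h in hashed_ids if h in lookup}


shared_hash = md5_email("Alice@Example.com")
recovered = reverse_hashes([shared_hash], ["bob@example.com", "alice@example.com"])
```

The hash is a stable pseudonym rather than an anonymous value: every party that hashes the same address gets the same string, so it works as a cross-site join key.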

  122. None
  123. None
  124. None
  125. None
  126. How much tracking is going on?

  127. Web Privacy Census Dec 12, 2015 http://techscience.org/a/2015121502/

  128. Web Privacy Census Dec 12, 2015 http://techscience.org/a/2015121502/

  129. Web Privacy Census Dec 12, 2015 http://techscience.org/a/2015121502/

  130. https://webtransparency.cs.princeton.edu/webcensus/

  131. https://webtransparency.cs.princeton.edu/webcensus/

  132. https://webtransparency.cs.princeton.edu/webcensus/

  133. Canvas Fingerprinting https://webtransparency.cs.princeton.edu/webcensus/

  134. Audio Fingerprinting https://webtransparency.cs.princeton.edu/webcensus/

  135. WebRTC Local Addressing https://webtransparency.cs.princeton.edu/webcensus/

  136. Re-spawning https://securehomes.esat.kuleuven.be/~gacar/persistent/

  137. Cookie Syncing https://securehomes.esat.kuleuven.be/~gacar/persistent/

  138. “in our measurements we found only two trackers (doubleclick.net and googleanalytics.com) that are present on 40% or more of websites. But if we assumed a moderate amount of back-end data sharing (defined in Section 5.3 of our paper), the number of trackers that can observe 40% of users’ browsing history would jump to 161” –Steven Englehardt, Princeton WebTAP

  139. What are the good implications?

  140. Analytics

  141. Personalized Services

  142. Relevant Advertising

  143. Advertising Attribution

  144. Prevent Fraud

  145. Prevent Criminal Activity

  146. National Security

  147. What are the bad implications?

  148. Over-Personalized Services

  149. Creepy Advertising https://blogs.harvard.edu/doc/2014/12/12/is-perfectly-personalized-advertising-perfectly-creepy/

  150. Targeting options for Facebook advertisers https://www.washingtonpost.com/news/the-intersect/wp/2016/08/19/98-personal-data-points-that-facebook-uses-to-target-ads-to-you/

  151. Commit Fraud

  152. Enable Criminal Activity

  153. Enable Criminal Activity

  154. Enable Criminal Activity

  155. National Insecurity from Mass Surveillance

  156. None
  157. False Positive Paradox https://www.crosswise.com/cross-device-learning-center/device-map-accuracy-precision-and-recall/
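
The false positive paradox is easiest to see with numbers. The sketch below assumes a matcher that is right 99% of the time in both directions and a population where only 1 in 10,000 candidates is a true match; all three rates are invented for illustration.

```python
def precision_from_rates(base_rate: float, true_positive_rate: float,
                         false_positive_rate: float) -> float:
    """Fraction of flagged candidates that are real matches (Bayes' rule)."""
    true_positives = base_rate * true_positive_rate
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# 1 in 10,000 candidates is a real match; the matcher catches 99% of real
# matches and wrongly flags 1% of non-matches.
precision = precision_from_rates(base_rate=0.0001,
                                 true_positive_rate=0.99,
                                 false_positive_rate=0.01)
```

Under these assumptions fewer than 1 in 100 flagged candidates is a real match, because the rare true matches are swamped by false alarms from the much larger pool of non-matches.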

  158. www.wired.com/2016/08/shadow-brokers-mess-happens-nsa-hoards-zero-days/

  159. None
  160. (Why?) Aren’t we doing anything?

  161. Privacy Paradox • Consumers are concerned about ways marketers access and use their data • People still release data about themselves that suggests much less concern. The Tradeoff Fallacy, Joseph Turow, Michael Hennessy, University of Pennsylvania; Nora Draper, University of New Hampshire

  162. “Notice and Choice” People are expected to negotiate for privacy protection by reading privacy policies and selecting services consistent with their preferences. Alan Westin’s Privacy Homo Economicus, Chris Hoofnagle & Jennifer Urban, UC Berkeley

  163. None
  164. The Tradeoff Fallacy, Joseph Turow, Michael Hennessy, University of Pennsylvania; Nora Draper, University of New Hampshire. 2015 Survey

  165. What are the options?

  166. As a user

  167. None
  168. None
  169. None
  170. None
  171. None
  172. None
  173. None
  174. Encrypt your drive Windows BitLocker™ Mac FileVault™ LinuxGPL

  175. Check your data-breach status

  176. Use temporary email addresses

  177. As a power user

  178. None
  179. None
  180. None
  181. None
  182. None
  183. None
  184. None
  185. None
  186. None
  187. None
  188. As a developer

  189. HTTPS all the things

  190. None
  191. Secure cookies http://blog.teamtreehouse.com/how-to-create-totally-secure-cookies
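
As one possible reading of “secure cookies”, here is a sketch that builds a `Set-Cookie` value with the `Secure`, `HttpOnly`, and `SameSite` attributes using Python's standard library (`samesite` needs Python 3.8 or later). The cookie name `session` and the attribute choices are assumptions, not the linked article's recipe.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header value with restrictive attributes."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True        # only sent over HTTPS
    cookie["session"]["httponly"] = True      # not readable from JavaScript
    cookie["session"]["samesite"] = "Strict"  # not sent on cross-site requests
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


header_value = build_session_cookie("abc123")
```

`Secure` pairs naturally with the “HTTPS all the things” advice, since the attribute means the cookie is never sent over plain HTTP.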

  192. None
  193. Prevent Account enumeration https://www.troyhunt.com/website-enumeration-insanity-how-our-personal-data-is-leaked/
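
One common way to prevent account enumeration is to answer password-reset requests identically whether or not the address has an account. The sketch below assumes a hypothetical `outbox` list standing in for a mail queue; the names and message text are illustrative.

```python
GENERIC_MESSAGE = "If that address has an account, a reset link has been sent."


def request_password_reset(email: str, known_accounts: set, outbox: list) -> dict:
    """Queue a reset email for real accounts, but reply the same way to everyone."""
    normalized = email.strip().lower()
    if normalized in known_accounts:
        # Only real accounts get mail; the caller cannot observe this branch.
        outbox.append(normalized)
    return {"status": 200, "message": GENERIC_MESSAGE}


accounts = {"alice@example.com"}
outbox = []
known_reply = request_password_reset("alice@example.com", accounts, outbox)
unknown_reply = request_password_reset("mallory@example.com", accounts, outbox)
```

Both replies are byte-for-byte identical, so the response alone does not reveal who has an account. A production version would also need to keep response times similar for both branches.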

  194. AshleyMadison.com

  195. Don’t leak PII https://www.troyhunt.com/website-enumeration-insanity-how-our-personal-data-is-leaked/

  196. strawberrynet.com Please be advised that in surveys we have completed, a huge majority of customers like our system with no password. Using your e-mail address as your password is sufficient security, and in addition we never keep your payment details on our website or in our computers.

  197. As an advocate

  198. reddit.com/r/privacy Note: use tracking protection on reddit.com

  199. None
  200. Deepen Your Understanding http://papers.ssrn.com/sol3/papers.cfm?abstract_id=998565&

  201. None