Scary JavaScript (and other Tech) that Tracks You Online

luke crouch
November 04, 2016

There are over 5,000 online trackers that use cookies, fingerprinting, and probabilistic device matching to follow you across the web. Some methods are actively used for fraud, malware, and intrusive user tracking. Some are commonly used for legit purposes. We'll talk about how sites are able to follow users, tracking methods both fair and foul, and how Mozilla protects users from tracking.

Transcript

  1. Scary JavaScript
    (and other Tech)
    That Tracks You Online
    Luke Crouch, Mozilla
    @groovecoder

  2. Luke Crouch
    • Web Developer at Mozilla
    • Not an expert in privacy tech (yet?)
    • Working on privacy & security experiments,
    prototypes, and studies for Firefox
    • Has 10 seconds per slide

  5. Data and
    Goliath
    Bruce Schneier

  6. Data that’s a
    by-product of
    online activity
    Browsing History
    Cookies
    Fingerprints

  7. Browser History

  8. Browser History
    Vulnerabilities

  9. CSS History Sniffing

  10. CSS History Sniffing
    getComputedStyle

  11. CSS History Sniffing
    2010

  12. requestAnimationFrame
    History Sniffing

  13. https://developer.mozilla.org/en-US/docs/Web/API/window/requestAnimationFrame
    requestAnimationFrame
    History Sniffing

  14. requestAnimationFrame
    History Sniffing

  15. requestAnimationFrame
    History Sniffing

  16. Resource Access
    Sniffing

  17. https://robinlinus.github.io/socialmedia-leak/

  18. Resource Access
    Social Media Leak

  19. Cache Timing History Sniffing

  20. Network timing noise

  21. More reliable timing attacks
    https://tom.vg/2016/08/browser-based-timing-attacks/

  22. Video Parsing Timing Attack

  23. Video Parsing Timing Attack

  24. HSTS History Sniffing
    yan/@bcrypt
    https://diracdeltas.github.io/blog/sniffly/

  29. Set CSP on server

  30. Time CSP violations on client

  32. Cookies

  33. developer.mozilla.org/login

  36. http://clearcode.cc/2015/12/cookie-syncing/

  37. http://clearcode.cc/2015/12/cookie-syncing/

  41. Clear your cookies

  42. Cookie Re-spawning

  43. Re-spawning/“Supercookies”

  44. Using Flash

  45. Also
    Silverlight Isolated
    Storage

  46. HTML localStorage

  47. ETag

  48. Get/Set/Re-spawn on client

  49. Check ETag on Server
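
A minimal sketch of the server half of ETag re-spawning, written in Python for illustration. It assumes the tracker serves a small cacheable resource and uses the ETag value itself as the visitor ID. The function name `handle_request` and the header-dictionary interface are hypothetical, not taken from the talk.

```python
import uuid

def handle_request(headers):
    """Return (status, response_headers, visitor_id) for a tracking resource.

    When the browser revalidates its cached copy it echoes the stored ETag
    in If-None-Match, so the server recovers the ID without any cookie.
    """
    etag = headers.get("If-None-Match")
    if etag:
        visitor_id = etag.strip('"')
        # 304 tells the browser to keep its cached copy (and the ETag).
        return 304, {"ETag": etag}, visitor_id
    # First visit: mint a new identifier and hand it out as the ETag.
    visitor_id = uuid.uuid4().hex
    response_headers = {
        "ETag": f'"{visitor_id}"',
        "Cache-Control": "private, no-cache",  # cache it, but revalidate each time
    }
    return 200, response_headers, visitor_id

# First request has no cache entry; the second replays the stored ETag.
status, resp, first_id = handle_request({})
status2, _, second_id = handle_request({"If-None-Match": resp["ETag"]})
print(status, status2, first_id == second_id)
```

Clearing cookies does not touch the HTTP cache, which is why the same ID comes back until the cache is cleared too.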

  50. Cookie Re-spawning
    is “Illegal”
    Or, at least, companies have been sued for it

  51. Cookie Syncing

  52. http://clearcode.cc/2015/12/cookie-syncing/

  53. http://clearcode.cc/2015/12/cookie-syncing/

  54. https://freedom-to-tinker.com/blog/englehardt/the-hidden-perils-of-cookie-syncing/

  55. http://clearcode.cc/2015/12/cookie-syncing/

  56. Cookie Syncing
    defeats
    No-respawn

  57. https://freedom-to-tinker.com/blog/englehardt/the-hidden-perils-of-cookie-syncing/

  58. Cookie Syncing =
    Giant Cookie Databases
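
A toy model of cookie syncing, sketched in Python under simplifying assumptions: two trackers that each know a browser only by their own cookie ID, and a redirect that carries one tracker's ID to the other in the URL. The class, the URL layout, the domains, and the IDs are all hypothetical.

```python
class Tracker:
    """A toy tracker that knows users only by its own cookie IDs."""

    def __init__(self, name):
        self.name = name
        self.partner_ids = {}  # own_id -> {partner_name: partner_id}

    def sync_url(self, own_id, partner):
        # The tracker redirects the browser to its partner and puts its
        # own ID for this browser in the query string.
        return f"https://{partner.name}/sync?partner={self.name}&partner_uid={own_id}"

    def receive_sync(self, own_id, partner_name, partner_id):
        # The partner reads its own cookie from the request and the other
        # tracker's ID from the URL, then stores the mapping.
        self.partner_ids.setdefault(own_id, {})[partner_name] = partner_id

a = Tracker("tracker-a.example")
b = Tracker("tracker-b.example")

url = a.sync_url("A123", b)                # A redirects the browser to B
b.receive_sync("B456", a.name, "A123")     # B sees its own cookie plus A's ID

print(url)
print(b.partner_ids)
```

Once the mapping exists, the two trackers can join their browsing logs on the back end, which is how the shared databases grow.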

  59. Without cookies

  62. Passive Fingerprints
    Don’t require code execution

  63. User-Agent, IP,
    Accept-Language, etc.
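
A rough sketch of a passive fingerprint in Python: the server hashes values it receives on every request anyway, so no code has to run on the device. The choice of fields, the separator, and the use of SHA-256 are assumptions made for illustration. The sample IP comes from a documentation-only address range.

```python
import hashlib

def passive_fingerprint(ip, headers):
    """Combine request metadata into one stable identifier."""
    fields = [
        ip,
        headers.get("User-Agent", ""),
        headers.get("Accept-Language", ""),
        headers.get("Accept-Encoding", ""),
    ]
    # Join with a separator that cannot appear in header values so that
    # different field combinations cannot collide by concatenation.
    joined = "\x1f".join(fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

request_headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate, br",
}
fp_first = passive_fingerprint("203.0.113.10", request_headers)
fp_again = passive_fingerprint("203.0.113.10", request_headers)
fp_other = passive_fingerprint("203.0.113.10", {**request_headers, "Accept-Language": "de-DE"})
print(fp_first == fp_again, fp_first == fp_other)
```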

  64. HTTP Header Injection

  66. turn.com Re-spawn
    http://webpolicy.org/2015/01/14/turn-verizon-zombie-cookie/

  68. Active Fingerprints
    JavaScript code executes on your device

  69. Plugin Enumeration

  70. Okay but …
    … enumeration is still possible via sniffing, like …

  71. Font Enumeration
    http://www.lalit.org/lab/javascript-css-font-detect/

  72. Measure default fonts

  73. Measure dictionary of fonts
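
The two measuring steps can be sketched as follows. This is a Python simulation, not browser code: `measure` is a hypothetical callback standing in for rendering a test string with a CSS font stack and reading back its size, and the metrics table is made up.

```python
BASE_FONTS = ["monospace", "sans-serif", "serif"]

def detect_fonts(candidates, measure):
    """Return the candidate fonts that appear to be installed."""
    # Step 1: measure the test string in each default (generic) font.
    baseline = {base: measure([base]) for base in BASE_FONTS}
    installed = []
    # Step 2: measure each candidate with a generic fallback behind it.
    for font in candidates:
        # A missing font falls back to the generic one, so the size matches
        # the baseline. Any difference means the candidate was actually used.
        if any(measure([font, base]) != baseline[base] for base in BASE_FONTS):
            installed.append(font)
    return installed

# Made-up (width, height) values for fonts "installed" on this fake device.
FAKE_METRICS = {
    "monospace": (720, 80),
    "sans-serif": (650, 82),
    "serif": (630, 81),
    "Comic Sans MS": (690, 85),
}

def fake_measure(font_stack):
    # Mimic CSS fallback: use the first font in the stack that exists.
    for font in font_stack:
        if font in FAKE_METRICS:
            return FAKE_METRICS[font]
    raise LookupError("no usable font in stack")

found = detect_fonts(["Comic Sans MS", "Futura"], fake_measure)
print(found)
```

The resulting list of installed fonts becomes one more input to a device fingerprint.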

  74. Canvas Fingerprint

  77. WebGL Fingerprinting
    http://cseweb.ucsd.edu/~hovav/dist/canvas.pdf

  78. AudioContext

  80. https://webtransparency.cs.princeton.edu/webcensus/#audio-fp

  81. WebRTC

  82. WebRTC Local Addressing

  84. WebVR “eyeprinting”

  85. Device Fingerprints ~= Cookies

  86. Cross-Device Matching

  87. Probabilistic
    “Householding”
    FTC Cross-Device Tracking Workshop
    https://www.ftc.gov/news-events/audio-video/video/cross-device-tracking-part-1

  88. Probabilistic “Tethering”
    cookie=4qasr4sdf1
    cookie=f52dh64dhq
    Android Advertising Id=0436732361

  89. Probabilistic “Tethering”
    IP address:
    23.64.176.179
    (early mornings,
    evenings, weekends)
    IP address:
    164.62.9.0
    (9am-6pm weekdays)
    IP address:
    164.62.9.0
    (9am-6pm weekdays)
    Cellular network
    23.64.176.179
    (early mornings,
    evenings, weekends)

  90. Probabilistic “Tethering”
    Work?? Cell?? Home??
    80% 80%
    IP address:
    164.62.9.0
    (9am-6pm weekdays)
    IP address:
    164.62.9.0
    (9am-6pm weekdays)
    Cellular network
    23.64.176.179
    (early mornings,
    evenings, weekends)
    IP address:
    23.64.176.179
    (early mornings,
    evenings, weekends)

  91. Probabilistic Matching
    Work? Cell? Home?
    Location:
    38.883914,
    -77.020997
    Weekday location:
    38.883914,
    -77.020997
    Evening location:
    38.897634,
    -77.036544
    Location:
    38.897634,
    -77.036544
    95% 95%

  92. Probabilistic Matching
    Work Cell Home
    Technology news
    UVa sports
    Capitol Hill
    Arsenal football
    Technology news
    UVa sports
    Capitol Hill
    Arsenal football
    Technology news
    UVa sports
    Capitol Hill
    Arsenal football
    98% 98%
    cookie=4qasr4sdf1
    Android Advertising Id=0436732361
    cookie=f52dh64dhq
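
A deliberately simple stand-in for probabilistic matching, sketched in Python: score two devices by the overlap (Jaccard similarity) of the signals observed from each. Real matchers use trained models over many more features, and the percentages on the slides are not produced by this code. The feature sets below reuse values from the slides plus one made-up stranger.

```python
def jaccard(a, b):
    """Share of observed signals that two devices have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

laptop = {
    ("ip", "23.64.176.179"),
    ("topic", "Technology news"),
    ("topic", "UVa sports"),
    ("topic", "Arsenal football"),
}
phone = {
    ("ip", "23.64.176.179"),
    ("ip", "164.62.9.0"),
    ("topic", "Technology news"),
    ("topic", "UVa sports"),
    ("topic", "Capitol Hill"),
    ("topic", "Arsenal football"),
}
stranger = {
    ("ip", "198.51.100.7"),          # hypothetical unrelated device
    ("topic", "Technology news"),
}

score_same = jaccard(laptop, phone)
score_other = jaccard(laptop, stranger)
print(round(score_same, 2), round(score_other, 2))
```

A matcher would link the pair whose score clears some threshold. Where to put that threshold is the precision versus recall trade-off.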

  93. Device Graph
    id=4qasr4sdf1
    Android Advertising Id=0436732361
    id=f52dh64dhq
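
One way to picture a device graph is as a set of identifiers merged into groups whenever a match links two of them. The Python sketch below uses a union-find structure for that. It is an illustrative assumption about the data structure, not a description of any vendor's system, and the `cookie=` and `aaid=` key prefixes are made up.

```python
class DeviceGraph:
    """Groups identifiers that have been linked to the same person."""

    def __init__(self):
        self.parent = {}

    def find(self, ident):
        # Follow parent links to the group's representative identifier.
        self.parent.setdefault(ident, ident)
        while self.parent[ident] != ident:
            self.parent[ident] = self.parent[self.parent[ident]]  # path halving
            ident = self.parent[ident]
        return ident

    def link(self, a, b):
        # Merge the groups that contain a and b.
        self.parent[self.find(a)] = self.find(b)

    def same_user(self, a, b):
        return self.find(a) == self.find(b)

graph = DeviceGraph()
graph.link("cookie=4qasr4sdf1", "aaid=0436732361")
graph.link("aaid=0436732361", "cookie=f52dh64dhq")

# The two browser cookies were never matched directly, only via the phone.
print(graph.same_user("cookie=4qasr4sdf1", "cookie=f52dh64dhq"))
print(graph.same_user("cookie=4qasr4sdf1", "cookie=unrelated"))
```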

  94. First-Party
    Deterministic
    You are signed in to their service

  96. First-Party Deterministic Matching
    Login:
    JustinBrookman
    Login:
    JustinBrookman
    Login:
    JustinBrookman
    Third-party sites/
    apps that embed
    first-party
    Third-party sites/
    apps that embed
    first-party
    Third-party sites/
    apps that embed
    first-party

  97. –Mark Zuckerberg
    “Over 1 billion people use Facebook on their
    phones every month and more than 80% of the
    top apps on iOS and Android now use
    Facebook logins.”

  98. “One industry source that spoke with
    AdExchanger estimated Google’s logged-in
    cross-device user count as somewhere
    between 600 million and 1.2 billion, a
    conclusion based on the numerical intersection
    between Android users, iOS users, the Google
    login rate of iOS users and the number of
    logged-in desktop users for Google products.”

  99. • Email Address, Personally-Identifiable Information (PII)
    • Email Address, PII, “Google Advertising ID”
    • Email Address, PII
    • Email Address, PII, iOS IDFA

  100. Note:
    Trusted Parties

  101. First-Party Deterministic
    You click their links

  102. Email for First Party Cross-Device Tracking
    Purchase item at a
    shopping site as
    [email protected]

  103. Email for First Party Cross-Device Tracking
    Purchase item at a
    shopping site as
    [email protected]
    Click on email
    from shopping
    site
    Open email from
    shopping site
    Android Advertising Id=0436732361
    cookie=4qasr4sdf1
    cookie=a035fs35fm

  104. Third-Party
    Probabilistic
    Device Matching

  105. Machine Learning Model
    1. Acquire device activity data set
    IP addresses, WiFi networks, GPS coordinates, websites
    browsed, ads displayed, device type, operating system,
    browser cookies, mobile device IDs, time of day, etc.
    2. Acquire “truth set” of deterministic matching data
    “training set” and “test set”
    3. Train ML models on the training set, evaluating accuracy,
    precision, and recall against the test set
    4. Point ML model at entire device activity data set
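
The evaluation in step 3 can be sketched as follows. This is a minimal Python illustration of precision and recall over predicted device pairs. The device names and pairs are made up, and a real pipeline would take its truth set from deterministic (logged-in) matches.

```python
def precision_recall(predicted_pairs, true_pairs):
    """Score predicted device matches against a truth set of known matches."""
    # A match is unordered: (laptop, phone) is the same pair as (phone, laptop).
    predicted = {frozenset(p) for p in predicted_pairs}
    truth = {frozenset(p) for p in true_pairs}
    true_positives = len(predicted & truth)
    # Precision: how many predicted matches were right.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: how many real matches were found.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

test_set = [("laptop", "phone"), ("tablet", "phone"), ("desktop", "tv")]
model_output = [
    ("phone", "laptop"),    # correct
    ("tablet", "phone"),    # correct
    ("laptop", "tv"),       # wrong: false positive
    ("desktop", "tablet"),  # wrong: false positive
]

precision, recall = precision_recall(model_output, test_set)
print(precision, round(recall, 3))
```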

  106. https://www.google.com/policies/privacy/#nosharing, Sep 5 2016

  107. Untrusted
    Third-Party
    Deterministic

  108. PII Leaking
    https://www3.cs.stonybrook.edu/~phillipa/papers/contactus_pets2016.pdf

  116. Audio Beaconing for
    Cross-Device Matching

  117. ec25d046746de3be33779256f6957d8f

  118. Other device privacy
    vulnerabilities
    • Visual/IR beaconing for cross-device matching?
    • Recognizing speech from gyroscope signals
    (crypto.stanford.edu/gyrophone)
    • Recognizing gait patterns with accelerometers
    (vtt.fi/inf/julkaisut/muut/2005/ICASSP05.pdf)

  119. Hashed Email for Third-Party Tracking
    Purchase item at a
    shopping site as
    [email protected]
    Click on email
    from shopping
    site
    Open email from
    shopping site
    Advertising Network
    md5=b16f55bbe0ff554fb40003f8e5f96b99

  120. Does Hashing Make
    Data Anonymous?
    https://www.ftc.gov/news-events/blogs/techftc/2012/04/does-hashing-make-data-anonymous
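
A short Python sketch of why a hashed email still works as an identifier: anyone holding a list of candidate addresses can hash each one and compare. The addresses are placeholders, and the lowercase-and-trim normalization is an assumption about how a tracker might canonicalize input.

```python
import hashlib

def email_hash(email):
    """MD5 of a normalized email address, as a tracker might share it."""
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def reidentify(target_hash, candidate_emails):
    """Recover the address behind a hash by hashing known candidates."""
    for email in candidate_emails:
        if email_hash(email) == target_hash:
            return email.strip().lower()
    return None

shared_with_ad_network = email_hash("Alice@Example.com")
known_addresses = ["bob@example.com", "alice@example.com", "carol@example.com"]

print(reidentify(shared_with_ad_network, known_addresses))
print(reidentify(shared_with_ad_network, ["dave@example.com"]))
```

The hash is stable across sites, so it links records just as well as the plain address would. Hashing hides the address only from parties that have no candidate list to test.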

  121. Hash Functions
    https://blog.varonis.com/the-definitive-guide-to-cryptographic-hash-functions-part-1/

  126. How much tracking is
    going on?

  127. Web Privacy Census
    Dec 12, 2015
    http://techscience.org/a/2015121502/

  128. Web Privacy Census
    Dec 12, 2015
    http://techscience.org/a/2015121502/

  129. Web Privacy Census
    Dec 12, 2015
    http://techscience.org/a/2015121502/

  130. https://webtransparency.cs.princeton.edu/webcensus/

  131. https://webtransparency.cs.princeton.edu/webcensus/

  132. https://webtransparency.cs.princeton.edu/webcensus/

  133. Canvas Fingerprinting
    https://webtransparency.cs.princeton.edu/webcensus/

  134. Audio Fingerprinting
    https://webtransparency.cs.princeton.edu/webcensus/

  135. WebRTC Local Addressing
    https://webtransparency.cs.princeton.edu/webcensus/

  136. Re-spawning
    https://securehomes.esat.kuleuven.be/~gacar/persistent/

  137. Cookie Syncing
    https://securehomes.esat.kuleuven.be/~gacar/persistent/

  138. –Steven Englehardt, Princeton WebTAP
    “in our measurements we found only two
    trackers (doubleclick.net and
    google-analytics.com) that are present on 40%
    or more of websites. But if we assumed a
    moderate amount of back-end data sharing
    (defined in Section 5.3 of our paper), the
    number of trackers that can observe 40% of
    users’ browsing history would jump to 161”

  139. What are the
    good
    implications?

  140. Analytics

  141. Personalized Services

  142. Relevant Advertising

  143. Advertising Attribution

  144. Prevent Fraud

  145. Prevent Criminal Activity

  146. National Security

  147. What are the
    bad
    implications?

  148. Over-Personalized Services

  149. Creepy Advertising
    https://blogs.harvard.edu/doc/2014/12/12/is-perfectly-personalized-advertising-perfectly-creepy/

  150. Targeting options for Facebook advertisers
    https://www.washingtonpost.com/news/the-intersect/wp/2016/08/19/98-personal-data-points-that-facebook-uses-to-target-ads-to-you/

  151. Commit Fraud

  152. Enable Criminal Activity

  153. Enable Criminal Activity

  154. Enable Criminal Activity

  155. National Insecurity
    from
    Mass Surveillance

  157. False Positive Paradox
    https://www.crosswise.com/cross-device-learning-center/device-map-accuracy-precision-and-recall/
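
The arithmetic behind the false positive paradox, as a small Python sketch. The rates below are invented for illustration and do not come from the talk: when the thing being searched for is rare, even an accurate test produces flags that are mostly wrong.

```python
def positive_predictive_value(base_rate, sensitivity, false_positive_rate):
    """Probability that a flagged person is a real positive (Bayes' rule)."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# Hypothetical numbers: 1 real target per million people, a test that
# catches 99% of targets and wrongly flags 1% of everyone else.
ppv_rare = positive_predictive_value(
    base_rate=1e-6, sensitivity=0.99, false_positive_rate=0.01
)
# Same test, but half of the population are real targets.
ppv_common = positive_predictive_value(
    base_rate=0.5, sensitivity=0.99, false_positive_rate=0.01
)
print(f"{ppv_rare:.6f} {ppv_common:.2f}")
```

With the rare base rate, only about one flag in ten thousand points at a real target. The test itself is identical in both cases.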

  158. www.wired.com/2016/08/shadow-brokers-mess-happens-nsa-hoards-zero-days/

  160. (Why?) Aren’t we
    doing anything?

  161. Privacy Paradox
    • consumers are concerned about ways
    marketers access and use their data
    • people still release data about themselves
    that suggest much less concern
    The Tradeoff Fallacy
    Joseph Turow, Michael Hennessy, University of Pennsylvania
    Nora Draper, University of New Hampshire

  162. “Notice
    and
    Choice”
    People are expected to
    negotiate for privacy
    protection by reading
    privacy policies and
    selecting services
    consistent with their
    preferences.
    Alan Westin’s Privacy Homo Economicus
    Chris Hoofnagle & Jennifer Urban, UC Berkeley

  164. The Tradeoff Fallacy
    Joseph Turow, Michael Hennessy, University of Pennsylvania
    Nora Draper, University of New Hampshire
    2015 Survey

  165. What are the options?

  166. As a user

  174. Encrypt your drive
    Windows BitLocker™ Mac FileVault™ LinuxGPL

  175. Check your data-breach status

  176. Use temporary email addresses

  177. As a power user

  188. As a developer

  189. HTTPS all the things

  191. Secure cookies
    http://blog.teamtreehouse.com/how-to-create-totally-secure-cookies
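
A minimal sketch of setting the usual cookie hardening flags with Python's standard library, assuming Python 3.8 or later for `samesite` support. The cookie name `session` and the helper name are hypothetical. The linked article may recommend further measures that are not shown here.

```python
from http.cookies import SimpleCookie

def session_cookie_header(session_id):
    """Build a Set-Cookie header value with common hardening attributes."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True      # only sent over HTTPS
    cookie["session"]["httponly"] = True    # hidden from page JavaScript
    cookie["session"]["samesite"] = "Lax"   # withheld on most cross-site requests
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()

header_value = session_cookie_header("abc123")
print("Set-Cookie:", header_value)
```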

  193. Prevent Account
    enumeration
    https://www.troyhunt.com/website-enumeration-insanity-how-our-personal-data-is-leaked/
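
One common way to prevent account enumeration is to give the same response whether or not the address has an account. The Python sketch below shows this for a password-reset form. The function name, message text, and `send_email` callback are hypothetical.

```python
GENERIC_MESSAGE = "If that address has an account, we have sent it a reset link."

def request_password_reset(email, known_accounts, send_email):
    """Handle a reset request without revealing whether the account exists."""
    normalized = email.strip().lower()
    if normalized in known_accounts:
        send_email(normalized)
    # Same status and same message in both cases, so the response cannot
    # be used to test which addresses are registered.
    return 200, GENERIC_MESSAGE

accounts = {"alice@example.com"}
sent = []

existing = request_password_reset("alice@example.com", accounts, sent.append)
missing = request_password_reset("nobody@example.com", accounts, sent.append)
print(existing == missing, sent)
```

Response timing can still leak the difference if sending mail is slow, so a real handler would usually queue the email rather than send it inline. Sign-up and login forms need the same treatment.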

  194. AshleyMadison.com

  195. Don’t leak PII
    https://www.troyhunt.com/website-enumeration-insanity-how-our-personal-data-is-leaked/

  196. strawberrynet.com
    Please be advised that in surveys we have completed, a huge majority
    of customers like our system with no password. Using your e-mail
    address as your password is sufficient security, and in addition we
    never keep your payment details on our website or in our computers.

  197. As an advocate

  198. reddit.com/r/privacy
    Note: use tracking protection on reddit.com

  200. Deepen Your Understanding
    http://papers.ssrn.com/sol3/papers.cfm?abstract_id=998565&
