Elastic{ON} 2018 - Sipping from the Firehose: Scalable Endpoint Data for Incident Response

Enterprises have better sources of endpoint telemetry for responding to intrusions than ever before, yet attackers continue to slip through the cracks, often with surprising ease. Security teams also still struggle to fully scope and remediate compromises, even after they’ve been detected.

This presentation will examine why it's so difficult to gather and maintain the right mix of endpoint data for effective incident response. It will then demonstrate how a blended approach — combining technologies like Elasticsearch with distributed, on-endpoint analysis — can offer comprehensive, high-speed, and efficient visibility at any scale. Examples from real-world breaches (including a few that inspired hacks in the latest season of Mr. Robot) will illustrate lessons learned from the field.

Ryan Kazanciyan | Chief Security Architect | Tanium

Elastic Co

March 01, 2018

Transcript

  1. Tanium February 28, 2018 @ryankaz42 Sipping from the Firehose: Scalable Endpoint Data for Incident Response Ryan Kazanciyan, Chief Security Architect
  2. whoami

  3. 3 Alexandria, VA

  4. 4 2004 - 2009 2009 - 2015 2015 - Present

  5. 5

  6. 6

  7. Why this topic?

  8. 8 • Endpoints are the ultimate security perimeter • We’re in a golden age of endpoint security tools and data… • …but we still struggle with scale, efficiency, and effectiveness
  9. WHAT DATA SHOULD I PRIORITIZE FOR ENDPOINT DETECTION AND RESPONSE… AND WHERE DO I PUT IT?
  10. Common Challenges

  11. 11 You have a much wider variety of endpoint data than you expect
  12. 12 …but all you need is a best-of-breed Endpoint Protection product, right? (wrong)
  13. 13 • “Black box” flight recorder • Limited to the most common event-based data (process execution, file changes, network connections, etc.) • High-volume, high-value EDR telemetry
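
To make “event-based data” concrete, below is a minimal sketch of what a single process-execution record from an EDR sensor might look like; the field names and values are illustrative placeholders, not those of any particular product.

# Illustrative only: every field name and value here is a hypothetical placeholder.
process_event = {
    "event_type": "process_start",
    "timestamp": "2018-02-28T14:07:11Z",
    "hostname": "workstation-042",
    "process": {
        "pid": 4812,
        "name": "powershell.exe",
        "command_line": "powershell.exe -EncodedCommand <base64>",
        "parent_name": "winword.exe",
        "user": "CORP\\jdoe",
    },
}
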
  14. MITRE ATT&CK Framework 14 https://attack.mitre.org/wiki/Technique_Matrix

  15. 15 Data sources per MITRE ATT&CK: • Access Tokens • Anti-virus • API monitoring • Authentication logs • Binary file metadata • BIOS • Browser extensions • Data loss prevention • Digital Certificate Logs • DLL monitoring • EFI • Environment variable • File monitoring • Host network interface • Kernel drivers • Loaded DLLs • MBR & VBR • Netflow • Network device logs • Network protocol analysis • Packet capture • PowerShell logs • Process command-line parameters • Process monitoring • Process use of network • Sensor health and status • Services • SSL/TLS inspection • System calls • Third-party application logs • User interface • Windows Error Reporting • Windows event logs • Windows Registry • WMI Objects
  16. 16 What do most EDR tools focus on? • Access Tokens • Anti-virus • API monitoring • Authentication logs • Binary file metadata • BIOS • Browser extensions • Data loss prevention • Digital Certificate Logs • DLL monitoring • EFI • Environment variable • File monitoring • Host network interface • Kernel drivers • Loaded DLLs • MBR & VBR • Netflow • Network device logs • Network protocol analysis • Packet capture • PowerShell logs • Process command-line parameters • Process monitoring • Process use of network • Sensor health and status • Services • SSL/TLS inspection • System calls • Third-party application logs • User interface • Windows Error Reporting • Windows event logs • Windows Registry • WMI Objects
  17. 17

  18. 18

  19. 19

  20. 20

  21. 21

  22. 22

  23. 23

  24. 24

  25. 25

  26. 26

  27. 27

  28. 28

  29. 29 You cannot capture everything, constantly • What sources of data? • What can be centralized? • What must be examined on-endpoint? • What’s your cadence to collect? • What’s your cadence to analyze?
  30. 30 Centralized approach • Typical endpoint sources: alerting tools, telemetry tools, critical logs (limited to select systems) • Ideal for correlation with non-endpoint sources, aggregate data analysis • Resource constrained by event forwarding and storage over time
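
A minimal sketch of the centralized pattern above, assuming forwarded endpoint events are bulk-indexed into Elasticsearch with the official Python client; the cluster address, index name, and event fields are placeholders.

from elasticsearch import Elasticsearch, helpers

# Cluster address, index name, and event fields below are placeholders.
es = Elasticsearch(["http://localhost:9200"])

def forward_events(events):
    """Bulk-index a batch of forwarded endpoint telemetry for later correlation."""
    actions = (
        # "_type" applies to 6.x-era clusters that still use a single mapping type.
        {"_index": "endpoint-telemetry", "_type": "doc", "_source": event}
        for event in events
    )
    helpers.bulk(es, actions)

forward_events([
    {"@timestamp": "2018-02-28T14:07:11Z", "host": "workstation-042",
     "event_type": "process_start", "process_name": "powershell.exe"},
])
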
  31. 31 On-endpoint evidence • Broadest set of available data: volatile / in-memory, files and artifacts on-disk, locally stored telemetry and logs • Often difficult to efficiently search and collect at-scale
  32. 32 On-endpoint example #1: Searching web server files with Yara

    rule PAS_TOOL_PHP_WEB_KIT {
        meta:
            description = "PAS TOOL PHP WEB KIT FOUND"
        strings:
            $php = "<?php"
            $base64decode = /\='base'\.\(\d+\*\d+\)\.'_de'\.'code'/
            $strreplace = "(str_replace("
            $md5 = ".substr(md5(strrev("
            $gzinflate = "gzinflate"
            $cookie = "_COOKIE"
            $isset = "isset"
        condition:
            (filesize > 20KB and filesize < 22KB) and #cookie == 2 and #isset == 3 and all of them
    }
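
One way to run a rule like this against a web root is via the yara-python bindings; a minimal sketch follows, with the rule file name and target directory as placeholders.

import os
import yara

# Compile the rule shown above (assumed to be saved locally as pas_webkit.yar).
rules = yara.compile(filepath="pas_webkit.yar")

# Walk the web root (placeholder path) and report any matching files.
for root, _dirs, files in os.walk("/var/www"):
    for name in files:
        path = os.path.join(root, name)
        try:
            matches = rules.match(path)
        except yara.Error:
            continue  # skip unreadable files
        if matches:
            print(path, [m.rule for m in matches])
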
  33. 33 On-endpoint example #2: Hunting for a unique event in a non-forwarded log
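
A rough sketch of this kind of on-endpoint hunt: scan a locally stored log that is never forwarded for a single indicator. The log path and search pattern below are hypothetical.

import re

# Hypothetical example: a locally stored application log that never reaches the SIEM.
LOG_PATH = "/var/log/app/audit.log"
# Hypothetical indicator of the unique event being hunted.
PATTERN = re.compile(r"credential export requested by (\S+)")

with open(LOG_PATH, errors="replace") as log:
    for line_no, line in enumerate(log, 1):
        match = PATTERN.search(line)
        if match:
            print(f"{LOG_PATH}:{line_no}: {match.group(0)}")
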
  34. 34 Your endpoints are noisier than you might expect…

  35. 35 Your software is noisy (examining operating system, application, and script usage at-scale) • Different OS versions, add-ons, and regional variants • User applications • Enterprise applications • Randomized file paths, GUIDs, and other per-host unique artifacts • Churn from updates & patches
  36. Small networks (<100k endpoints): 5-7 per host; large networks (>100k endpoints): 1-3 per host (* measured by total unique instances of installed application versions)
  37. 230,000 systems, 400,000 unique application + version pairings

  38. 38 “Using Endpoint Telemetry to Accelerate the Baseline”, McCammon, https://www.sans.org/summit-archives/file/summit-archive-1492181402.pdf

  39. What trade-offs do we make?

  40. TUNNEL VISION

  41. “Let’s focus on critical systems” 41

  42. COGNITIVE LOAD

  43. 43 Your analysts are overwhelmed

  44. 44 Your analysts are overwhelmed

  45. 45 Iterating on a hunting technique: “How often do legitimate Windows applications run PowerShell encoded commands?” 1 Ask a question (“Find all the evil things!”) 2 Get unexpected results (“Oops. 10% of our endpoints produce 1000s of false positives per day. Too much noise.”) 3 Learn and refine (“Let’s apply some client-side filters to the data and try again.”) 4 Add to workflow (“Eureka! Now let’s collect, centralize, and analyze the data over time.”) 5 Success!
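
A minimal sketch of step 1 of that hunt, assuming process telemetry like the events indexed earlier lives in Elasticsearch; the index and field names are placeholders.

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Which parent processes launch PowerShell with -EncodedCommand, and how often?
# Index and field names are placeholders for whatever your pipeline produces.
query = {
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {"term": {"process_name": "powershell.exe"}},
                {"match_phrase": {"command_line": "-EncodedCommand"}},
            ]
        }
    },
    "aggs": {
        "by_parent": {"terms": {"field": "parent_process_name", "size": 20}}
    },
}

result = es.search(index="endpoint-telemetry", body=query)
for bucket in result["aggregations"]["by_parent"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
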
  46. 46 Common inhibitors across the same cycle (1 Ask a question, 2 Get unexpected results, 3 Learn and refine, 4 Add to workflow, 5 Success!): • Expensive or slow to test at-scale • Can only work with pre-selected data • Contend for resources with other workflows “Will this break something?”…“Take too long?”…“I guess I won’t try…”
  47. Taking a balanced approach

  48. 48 Open platform to consolidate and analyze EDR data with other sources of evidence
  49. 49

  50. 50

  51. 51

  52. 52

  53. Finding a story in the data 53

  54. Finding a story in the data (a better way) 54

  55. Integrating … with …

  56. 56 Real-time visibility and control - across any number of endpoints - from a single server
  57. 57

  58. 58 Distributed access to endpoint data • Full-disk index, files at-rest, OS configuration and forensic artifacts • Volatile memory and short-lived / stateful evidence • Historical EDR telemetry, OS logs, application logs • Efficient data aggregation, and a single source of truth
  59. 59 Search, collect, and analyze at-scale (1 Ask a question, 2 Get unexpected results, 3 Learn and refine, 4 Add to workflow, 5 Success!): experiment without penalty, with results in seconds
  60. Demo Video

  61. 61

  62. 62 More Questions? Visit me at the AMA

  63. Tanium February 28, 2018 @ryankaz42 Sipping from the Firehose: Scalable Endpoint Data for Incident Response Ryan Kazanciyan, Chief Security Architect