Building Effective Security Alerting

Kennysan
August 05, 2016

Code available here: fouroneone.io

Transcript

  1. Building Effective Security Alerting. Kai Zhong (@sixhundredns), Ken Lee (@kennysan)

  2. None
  3. OPEN SOURCE!

  4. Who Are We?

  5. KZER: Kai Zhong, Product Security Engineer @ Etsy. Loves tea, cats and netbooks. Twitter: @sixhundredns
  6. I’m a

  7. KLEE: Ken Lee, Senior Product Security Engineer @ Etsy. Spoke at Defcon 21 about Content Security Policy. Loves funny cat gifs. Twitter: @Kennysan
  8. None
  9. What is Etsy?

  10. Some Stats: $2.32 billion GMS marketplace

  11. Some Stats: $2.32 billion GMS marketplace; 1.7 million active sellers

  12. Some Stats: $2.32 billion GMS marketplace; 1.7 million active sellers; 26.1 million active buyers
  13. Some Stats

  14. Engineering Stats: Average 40-60 deploys a day

  15. Engineering Stats: Average 40-60 deploys a day; PHP 7

  16. Engineering Stats: Average 40-60 deploys a day; PHP 7; Native iOS, Android apps
  17. None
  18. What Are We Covering?

  19. History

  20. Our Solution

  21. Alert Management @ Etsy

  22. Demo

  23. But First, Some Terminology

  24. Logs

  25. The ELK Stack

  26. None
  27. Logstash: Data processor and log shipper

  28. Logstash: Data processor and log shipper; Allows you to break out your log data into separate fields

  29. Logstash: Data processor and log shipper; Allows you to break out your log data into separate fields; We use it to ship logs into Elasticsearch!
  30. None
  31. Elasticsearch: Distributed, real-time search engine

  32. Elasticsearch: Distributed, real-time search engine; Allows storing complex, nested documents

  33. Elasticsearch: Distributed, real-time search engine; Allows storing complex, nested documents; Allows generating statistics over your data

  34. Elasticsearch: Distributed, real-time search engine; Allows storing complex, nested documents; Allows generating statistics over your data; We use it for analyzing logs!

  35. Kibana: Data visualization frontend for Elasticsearch

  36. Kibana: Data visualization frontend for Elasticsearch; Log discovery

  37. Kibana: Data visualization frontend for Elasticsearch; Log discovery; Visualizations!

  38. None
  39. History

  40. Switching to ELK: Work started in mid 2014

  41. Switching to ELK: Work started in mid 2014; Finished in mid 2015

  42. Switching to ELK: Work started in mid 2014; Finished in mid 2015; We learned a lot from the migration

  43. Switching to ELK: Work started in mid 2014; Finished in mid 2015; We learned a lot from the migration; And got a bunch of great tools out of it

  44. It Was A Bumpy Road: Hiccups are expected when moving to a new technology

  45. It Was A Bumpy Road: Hiccups are expected when moving to a new technology; Had to deal with annoying, performance-impacting bugs

  46. It Was A Bumpy Road: Hiccups are expected when moving to a new technology; Had to deal with annoying, performance-impacting bugs; Issues with SSDs, kernel-level bugs

  47. It Was A Bumpy Road: Hiccups are expected when moving to a new technology; Had to deal with annoying, performance-impacting bugs; Issues with SSDs, kernel-level bugs; Security needed an alerting solution
  48. None
  49. ESQuery

  50. Features: Superset of the standard Lucene syntax

  51. Features: Superset of the standard Lucene syntax; Syntactically similar to SPL!

  52. Features: Superset of the standard Lucene syntax; Syntactically similar to SPL!; Supports all the functionality we need!!!
  53. Syntax
        Command                 Syntax
        Inline params           $size:20 $sort:user_id $fields:[a,b,c]
        Joins                   * | join source:src_ip target:dst_ip
        Aggregations            * | agg:terms field:src_ip | agg:terms field:user_id
        Variable substitution   src_ip:@internal_ips
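The syntax table above suggests that an ESQuery string is a base query, optionally carrying `$name:value` inline params, followed by commands separated by `|`. The following is a toy sketch of how such a string could be taken apart. It is not the ESQuery parser: the function name `split_esquery` is hypothetical, and the sketch assumes that inline params appear in the first stage and that `|` never occurs inside a quoted value.

```python
import re

# Matches inline params such as $size:20 or $fields:[a,b,c].
INLINE_PARAM = re.compile(r"\$(\w+):(\[[^\]]*\]|\S+)")


def split_esquery(text):
    """Split an ESQuery-style string into (inline params, base query, commands).

    Toy illustration only: splits naively on '|' and does not handle quoting.
    """
    stages = [stage.strip() for stage in text.split("|")]
    base, commands = stages[0], stages[1:]

    params = {}

    def take(match):
        # Record the param and remove it from the base query text.
        params[match.group(1)] = match.group(2)
        return ""

    base = " ".join(INLINE_PARAM.sub(take, base).split())

    parsed = []
    for command in commands:
        if not command:
            continue
        name, *args = command.split()
        # Each argument is assumed to have the form key:value.
        parsed.append((name, dict(arg.split(":", 1) for arg in args)))
    return params, base, parsed
```

For example, `split_esquery("* | join source:src_ip target:dst_ip")` yields an empty params dict, the base query `*`, and a single `join` command with its `source` and `target` arguments.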
  54. SPL: source="/data/syslog/current/web/info.log" log_namespace="login" reason="wrong password" response=403 | top 10 remote_host

  55. ESQuery: type:web_info_log log_namespace:login logdata.reason:"wrong password" response:403 | agg:terms field:logdata.remote_host size:10

  56. {
        "query": {
          "filtered": {
            "query": {
              "bool": {
                "minimum_number_should_match": 1,
                "should": [
                  { "query_string": {
                      "query": "type:web_info_log log_namespace:login logdata.reason:\"wrong password\" response:403 ",
                      "default_operator": "AND",
                      "lowercase_expanded_terms": false,
                      "allow_leading_wildcard": false } }
                ]
              }
            },
            "filter": {
              "bool": {
                "must": [
                  { "range": { "event_timestamp": { "from": 1468294422783, "to": 1468295322783 } } }
                ]
              }
            }
          }
        },
        "size": 0,
        "sort": [
          { "event_timestamp": { "order": "desc", "ignore_unmapped": true } },
          { "event_timestamp": { "order": "desc", "ignore_unmapped": true } }
        ],
        "aggs": {
          "terms_bucket": { "terms": { "field": "logdata.remote_host", "size": 10 } }
        }
      }
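The request body above is what a one-line query expands into. As a rough sketch of that expansion, the function below assembles a simplified body of the same shape from a query string, an aggregation field, and a time window. The name `build_terms_agg_request` and its parameters are hypothetical, and the sketch omits the `sort` clause and the extra `bool` wrappers. It keeps the `filtered` query from the slide, which belongs to older Elasticsearch releases; newer releases express the same idea with a `bool` query and a `filter` clause.

```python
import json


def build_terms_agg_request(query, field, size, start_ms, end_ms,
                            ts_field="event_timestamp"):
    """Build a terms-aggregation request body restricted to a time window.

    Simplified illustration of the structure shown on the slide; not the
    code that ESQuery itself uses.
    """
    return {
        "query": {
            "filtered": {
                # Lucene-style query string, terms ANDed together.
                "query": {
                    "query_string": {"query": query, "default_operator": "AND"}
                },
                # Only consider events inside [start_ms, end_ms].
                "filter": {
                    "range": {ts_field: {"from": start_ms, "to": end_ms}}
                },
            }
        },
        # size 0: return only the aggregation buckets, no individual hits.
        "size": 0,
        "aggs": {"terms_bucket": {"terms": {"field": field, "size": size}}},
    }


if __name__ == "__main__":
    body = build_terms_agg_request(
        'type:web_info_log log_namespace:login logdata.reason:"wrong password" response:403',
        field="logdata.remote_host",
        size=10,
        start_ms=1468294422783,
        end_ms=1468295322783,
    )
    print(json.dumps(body, indent=2))
```

The two timestamps taken from the slide are 900000 ms apart, so the query covers a 15-minute window.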
  57. splogTASH

  58. None
  59. 411

  60. Alert Generation & Management: Write queries to be periodically executed

  61. Alert Generation & Management: Write queries to be periodically executed; Receive email alerts with results

  62. Alert Generation & Management: Write queries to be periodically executed; Receive email alerts with results; Manage alerts via the web interface
  63. Dashboard

  64. None
  65. None
  66. None
  67. Managing queries

  68. None
  69. None
  70. None
  71. None
  72. None
  73. None
  74. Configuring a query

  75. None
  76. None
  77. None
  78. None
  79. None
  80. None
  81. None
  82. None
  83. Types of queries

  84. Logstash

  85. None
  86. None
  87. None
  88. HTTP

  89. None
  90. None
  91. Graphite

  92. None
  93. None
  94. Configuring a query (cont)

  95. None
  96. None
  97. None
  98. None
  99. None
  100. None
  101. None
  102. None
  103. Configuring groups

  104. None
  105. None
  106. None
  107. None
  108. None
  109. None
  110. None
  111. Configuring a query (cont)

  112. None
  113. None
  114. None
  115. Scheduling

  116. Under the Hood: Scheduler, Search Jobs, Workers

  117. Under the Hood: Search, Alerts, Targets, Filters, Search Job, Data Source
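The two "Under the Hood" slides name the moving parts without showing how they connect. One plausible reading, sketched below, is that a scheduler turns due searches into jobs, and a worker runs each job by querying the data source, passing the resulting alerts through filters, and handing what remains to targets. This ordering is an assumption, and every name here (`Search`, `schedule`, `work`) is hypothetical rather than taken from the 411 code base.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Search:
    name: str
    interval: int  # seconds between runs
    run_query: Callable[[], List[dict]]  # stands in for the data source
    filters: List[Callable[[List[dict]], List[dict]]] = field(default_factory=list)
    targets: List[Callable[[str, List[dict]], None]] = field(default_factory=list)
    last_run: float = float("-inf")  # never run yet


def schedule(searches, now):
    """Scheduler: return the searches that are due and mark them as run."""
    jobs = []
    for search in searches:
        if now - search.last_run >= search.interval:
            search.last_run = now
            jobs.append(search)
    return jobs


def work(job):
    """Worker: query the data source, apply filters in order, notify targets."""
    alerts = job.run_query()
    for alert_filter in job.filters:
        alerts = alert_filter(alerts)
    if alerts:
        for target in job.targets:
            target(job.name, alerts)
    return alerts
```

In a real deployment the jobs would sit on a queue and be picked up by separate worker processes; the sketch runs them inline to keep the data flow visible.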
  118. Configuring filters

  119. None
  120. None
  121. Filter types: Regex

  122. None
  123. Filter types: Regex, Throttle

  124. None
  125. Filter types: Regex, Throttle, Expression
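The slides list three filter types by name only. The sketch below shows one way each could behave on a list of alert dicts: a regex filter that keeps or drops alerts by matching a field, a throttle that caps how many alerts pass within a time window, and an expression filter that keeps alerts satisfying a predicate. The semantics, names, and signatures are all assumptions made for illustration; 411's own filters may work differently, and the expression filter here takes a Python callable rather than a parsed expression string.

```python
import re


def regex_filter(alerts, key, pattern, include=True):
    """Keep alerts whose `key` field matches `pattern` (or drop them if include=False)."""
    rx = re.compile(pattern)
    return [a for a in alerts
            if bool(rx.search(str(a.get(key, "")))) == include]


class ThrottleFilter:
    """Let through at most `limit` alerts per `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.sent = []  # timestamps of alerts already let through

    def __call__(self, alerts, now):
        # Forget alerts that have aged out of the window.
        self.sent = [t for t in self.sent if now - t < self.window]
        room = max(0, self.limit - len(self.sent))
        passed = alerts[:room]
        self.sent.extend([now] * len(passed))
        return passed


def expression_filter(alerts, predicate):
    """Keep alerts for which `predicate(alert)` is true."""
    return [a for a in alerts if predicate(a)]
```

A throttle of this kind is what keeps a noisy search from flooding an inbox: once the limit is reached, further alerts in the same window are dropped rather than delivered.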

  126. None
  127. Configuring targets

  128. None
  129. None
  130. Target types: Jira

  131. None
  132. Target types: Jira, Webhook

  133. None
  134. Target types: Jira, Webhook, PagerDuty

  135. None
  136. Managing alerts

  137. None
  138. None
  139. None
  140. None
  141. None
  142. None
  143. Alert actions: Assign

  144. Alert actions: Assign, Annotate

  145. Alert actions: Assign, Annotate, Resolve

  146. Reviewing an alert

  147. None
  148. None
  149. None
  150. None
  151. None
  152. None
  153. None
  154. None
  155. None
  156. None
  157. Live alerts feed

  158. None
  159. None
  160. None
  161. Alert Management @ Etsy

  162. Make Alerting Great Again

  163. Sensitivity: For a given event, how often a search modelled on that event will alert

  164. Sensitivity: For a given event, how often a search modelled on that event will alert; True Positive Rate

  165. Sensitivity: For a given event, how often a search modelled on that event will alert; True Positive Rate; Avoid creating searches that are too specific

  166. Sensitivity: For a given event, how often a search modelled on that event will alert; True Positive Rate; Avoid creating searches that are too specific; Minimize False Negatives

  167. Sensitivity: For a given event, how often a search modelled on that event will alert; True Positive Rate; Avoid creating searches that are too specific; Minimize False Negatives; E.g. IP address AND user agent AND user id

  168. Specificity: For a given event, how often a search modelled on that event will correctly not fire

  169. Specificity: For a given event, how often a search modelled on that event will correctly not fire; True Negative Rate

  170. Specificity: For a given event, how often a search modelled on that event will correctly not fire; True Negative Rate; Avoid creating searches that are overly broad

  171. Specificity: For a given event, how often a search modelled on that event will correctly not fire; True Negative Rate; Avoid creating searches that are overly broad; Minimize False Positives

  172. Specificity: For a given event, how often a search modelled on that event will correctly not fire; True Negative Rate; Avoid creating searches that are overly broad; Minimize False Positives; E.g. Numerous POST requests to /login
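Sensitivity and specificity as defined above are the standard true positive rate and true negative rate. A minimal sketch of computing both from a review of past alerts follows; the function names and the example counts are illustrative assumptions, not figures from the talk.

```python
def sensitivity(true_positives, false_negatives):
    """True positive rate: share of real events the search alerted on."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


def specificity(true_negatives, false_positives):
    """True negative rate: share of benign events the search correctly ignored."""
    total = true_negatives + false_positives
    return true_negatives / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical review: 8 of 10 real events alerted, 90 of 100 benign events stayed quiet.
    print(sensitivity(true_positives=8, false_negatives=2))
    print(specificity(true_negatives=90, false_positives=10))
```

A search that requires IP address AND user agent AND user id to all match will tend to score low on sensitivity, while one that fires on any burst of POST requests to /login will tend to score low on specificity.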
  173. None
  174. Incident Response: High specificity alerts

  175. Incident Response: High specificity alerts; Low priority alerts don’t generate notification e-mails

  176. Incident Response: High specificity alerts; Low priority alerts don’t generate notification e-mails; Medium/High priority alerts generate alerts

  177. Incident Response: High specificity alerts; Low priority alerts don’t generate notification e-mails; Medium/High priority alerts generate alerts; Attackers often generate a lot of noise -- can result in numerous alerts firing!
  178. Responding to an Alert: Is this an alert that can wait till morning?

  179. Responding to an Alert: Is this an alert that can wait till morning? How many other related alerts went off during this time period?

  180. Responding to an Alert: Is this an alert that can wait till morning? How many other related alerts went off during this time period? Example: failed logins and bot activity
  181. How We Respond to an Alert

  182. Responding to an Alert: Was there activity our alerts did not catch initially?

  183. Responding to an Alert: Was there activity our alerts did not catch initially? Dashboards, developers, combing through log files

  184. Responding to an Alert: Was there activity our alerts did not catch initially? Dashboards, developers, combing through log files; Incorporate into new alerts, improve sensitivity of old alerts
  185. Alert Maintenance: Sometimes certain queries are no longer useful

  186. Alert Maintenance: Sometimes certain queries are no longer useful; Review noisy alerts

  187. Alert Maintenance: Sometimes certain queries are no longer useful; Review noisy alerts; Add in other useful fields

  188. Alert Maintenance: Sometimes certain queries are no longer useful; Review noisy alerts; Add in other useful fields; Example: Attacker using an off-the-shelf scanner
  189. What Deserves an Alert? Potential error conditions

  190. None
  191. What Deserves an Alert? Potential error conditions; Volume of traffic/Thresholds being hit
  192. None
  193. What Deserves an Alert? Potential error conditions; Volume of traffic/Thresholds being hit; Deprecating old code
  194. None
  195. Multiple 411 Instances: Really easy to set up a new instance

  196. Multiple 411 Instances: Really easy to set up a new instance; Supports multiple hosts out of the box

  197. Multiple 411 Instances: Really easy to set up a new instance; Supports multiple hosts out of the box; Just need to run a script
  198. Instances: Sec411

  199. Instances: Sec411, Netsec411

  200. Instances: Sec411, Netsec411, Dev411

  201. Instances: Sec411, Netsec411, Dev411, Sox411

  202. Demo

  203. Questions? 411 is available at: https://fouroneone.io ; Kai (@sixhundredns, kai@etsy.com); Ken Lee (@kennysan, ken@etsy.com)