Data Driven Web Application Security

The security posture of an application is directly proportional to how much is known about the application. How can we, as web application security practitioners, use application metrics to improve the security posture of our product? This talk explores how application data and metrics can be used to build effective defenses for web applications today. We’ll outline the fundamental classes of web application security mechanisms and, once an understanding of the domain is established, walk through several specific examples of how Etsy’s security team uses metrics, analytics, and big data every day to solve hard, interesting problems and create a safer experience for millions of users all over the world.

Mike Arpaia

August 30, 2013

Transcript

  1. Data Driven Web Application Security Mike Arpaia Kyle Barry

  2. None
  3. None
  4. None
  5. Mike Arpaia Senior Software Engineer @mikearpaia

  6. Mike Arpaia Senior Software Engineer @mikearpaia Kyle Barry Security Engineering Manager @allofmywats
  7. https://www.etsy.com/listing/92868829/the-oh-my-orange-elephant-designer-wall

  8. https://www.etsy.com/listing/116016218/atomic-orbits-chemistry-fat-quarter

  9. https://www.etsy.com/listing/104411356/leather-iphone-44s-case-slipcover-sleeve

  10. Data Infrastructure

  11. Graphite

  12. None
  13. https://github.com/etsy/dashboard

  14. https://github.com/etsy/statsd

  15. if (is_xss($request_params)) { StatsD::increment('security.potential_xss'); }
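The one-liner above boils down to emitting a counter whenever a request looks suspicious. A minimal Python sketch of that idea follows. The `looks_like_xss` heuristic is hypothetical and deliberately naive (the slides do not show how `is_xss` works), and the host and port defaults are assumptions. The `name:value|c` datagram is the plain-text counter format StatsD accepts over UDP.

```python
import socket


def format_counter(name: str, value: int = 1) -> str:
    """Build a StatsD counter datagram such as 'security.potential_xss:1|c'."""
    return f"{name}:{value}|c"


def looks_like_xss(params: dict) -> bool:
    """Hypothetical, naive stand-in for is_xss(); not a real XSS detector."""
    needles = ("<script", "javascript:", "onerror=")
    return any(n in str(v).lower() for v in params.values() for n in needles)


def increment(name: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name).encode("ascii"), (host, port))


request_params = {"q": "<script>alert(1)</script>"}  # made-up request
if looks_like_xss(request_params):
    # In a real handler this would be increment("security.potential_xss").
    print(format_counter("security.potential_xss"))
```

Because the send is UDP and never waits for a reply, instrumenting a hot request path this way adds very little latency, which is what makes counting every suspicious request practical.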

  16. None
  17. None
  18. Splunk

  19. None
  20. None
  21. None
  22. None
  23. MySQL

  24. None
  25. Sharded application data http://www.slideshare.net/jgoulah/the-etsy-shard-architecture-starts-with-s-and-ends-with-hard

  26. Dozens of database servers

  27. Hundreds of tables

  28. Postgres

  29. Legacy

  30. Hadoop

  31. MapReduce

  32. Disk Performance (chart: capacity in GB by year, 1998 to 2008)
  33. Disk Performance (chart: capacity in GB and transfer rate in GB/s by year, 1998 to 2008)
  34. Let’s add disks! (chart, x-axis 1 to 10: seconds it takes to read 1 TB of data at 1 GB/s)
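The arithmetic behind the "add disks" slide can be sketched as follows. This assumes 1 TB = 1000 GB, that the x-axis is the number of disks, that each disk sustains 1 GB/s, and that the data is striped evenly so all disks read in parallel. None of these assumptions is spelled out in the transcript.

```python
def read_seconds(total_gb: float, disks: int, gb_per_second_per_disk: float = 1.0) -> float:
    """Time to read total_gb when the data is split evenly across `disks`."""
    if disks < 1:
        raise ValueError("need at least one disk")
    return total_gb / (disks * gb_per_second_per_disk)


# Reading 1 TB (taken as 1000 GB) with 1 to 10 disks working in parallel.
for disks in range(1, 11):
    print(disks, round(read_seconds(1000, disks), 1))
```

Under these assumptions the read time falls from 1000 seconds on one disk to 100 seconds on ten, which is the intuition behind spreading data across many machines.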
  35. Sounds good.

  36. Good for ad-hoc, whole-dataset analysis. Linearly scalable programming model.

  37. MySQL data

  38. Event logs & Visit logs

  39. Cascading

  40. Complex Workflows

  41. Fewer lines of code

  42. Minimal barrier to entry

  43. Awesome Data Team

  44. 96 cores

  45. 384 GB of RAM

  46. 24 TB of storage

  47. ...per 2U of rack space

  48. 160 nodes, 960 TB storage, 3840 cores, 15 TB of RAM
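As a consistency check, the per-2U figures and the cluster totals on these slides line up. The short calculation below derives the number of 2U units from the core counts. That unit count, and the resulting four nodes per unit, are inferred here rather than stated in the slides.

```python
# Figures quoted on the slides.
PER_2U = {"cores": 96, "ram_gb": 384, "storage_tb": 24}
TOTAL_CORES = 3840
NODES = 160

# Inferred, not stated in the talk: how many 2U units make up the cluster.
units = TOTAL_CORES // PER_2U["cores"]            # 2U units in the cluster
nodes_per_unit = NODES // units                   # nodes sharing each 2U unit

total_storage_tb = units * PER_2U["storage_tb"]   # compare with 960 TB
total_ram_tb = units * PER_2U["ram_gb"] / 1024    # compare with 15 TB

print(units, nodes_per_unit, total_storage_tb, total_ram_tb)
```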
  49. Vertica

  50. Proprietary

  51. Columnar

  52. Postgres-like syntax

  53. MySQL + Postgres

  54. Fast analytics

  55. Security Mechanisms

  56. First, a thesis

  57. The security posture of your application is directly proportional to how much you know about your application.
  58. Reactive Security

  59. Real-time event monitoring and alerting. Events that trigger immediate response. You always query the same data and you do it often.
  60. None
  61. graphite

  62. None
  63. Proactive Security

  64. Things we do now to protect us later. Actions taken to prevent future compromise.
  65. None
  66. None
  67. Incident Response

  68. Ad-hoc analysis of a large dataset. Driven by an event or incident. You’re not going to do it more than once. Needs to be fast.
  69. None
  70. Gather data to create reactive security mechanisms. Gather data to create proactive security mechanisms. Directly create a new proactive security mechanism. Perform incident response.
  71. Gather data to create reactive security mechanisms. Gather data to create proactive security mechanisms. Directly create a new proactive security mechanism. Perform incident response.
  72. Gather data to create reactive security mechanisms. Gather data to create proactive security mechanisms. Directly create new proactive security mechanisms. Perform incident response.
  73. Gather data to create reactive security mechanisms. Gather data to create proactive security mechanisms. Directly create new proactive security mechanisms. Perform incident response.
  74. Case Studies

  75. Reactive Security

  76. None
  77. Alerting

  78. None
  79. None
  80. None
  81. Use analytics to set thresholds
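The transcript does not say how the thresholds are derived. One common approach, shown here purely as an assumed illustration, is to alert when the current value exceeds the historical mean by some multiple of the standard deviation. The function names, the metric, and the sample numbers are all made up.

```python
from statistics import mean, pstdev


def alert_threshold(history: list, k: float = 3.0) -> float:
    """Assumed rule: mean of past observations plus k standard deviations."""
    return mean(history) + k * pstdev(history)


def should_alert(current: float, history: list, k: float = 3.0) -> bool:
    """True when the current value is above the data-derived threshold."""
    return current > alert_threshold(history, k)


# Made-up sample: failed logins per hour over the last six hours.
hourly_failed_logins = [40, 42, 38, 41, 39, 40]
print(round(alert_threshold(hourly_failed_logins), 2))
print(should_alert(45, hourly_failed_logins))
```

The value of deriving the threshold from data rather than picking a constant is that it tracks what "normal" looks like for each metric, which reduces both missed events and noisy alerts.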

  82. Reporting

  83. SuperBIT

  84. None
  85. Putting it together

  86. None
  87. Proactive Security

  88. Goal: Full-site SSL for all Etsy sellers

  89. analytics_cascade do
        analytics_flow do
          analytics_source 'event_logs'
          tap_db_snapshot 'users_index'
          assembly 'event_logs' do
            group_by 'user_id', 'scheme' do
              count 'value'
            end
          end
          assembly 'users_index' do
            project 'user_id', 'is_seller'
          end
          assembly 'ssl_traffic' do
            project 'user_id', 'is_seller', 'scheme', 'value'
            group_by 'is_seller', 'scheme' do
              count 'value'
            end
          end
          analytics_sink 'ssl_traffic'
        end
      end
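To make the shape of this job easier to follow, here is a small in-memory Python sketch of one reading of the flow: count events per user and scheme, attach each user's seller flag, then aggregate by seller flag and scheme. The slide does not make clear whether the final step counts events or users, and this sketch sums events. The sample records are made up, and `ssl_traffic` is a hypothetical helper, not part of Etsy's tooling.

```python
from collections import Counter


def ssl_traffic(event_logs: list, users_index: dict) -> dict:
    """Aggregate event counts by (is_seller, scheme)."""
    # Step 1: events per (user_id, scheme), like the first group_by.
    per_user = Counter((e["user_id"], e["scheme"]) for e in event_logs)

    # Step 2: look up is_seller for each user and roll the counts up.
    totals = Counter()
    for (user_id, scheme), n in per_user.items():
        is_seller = users_index.get(user_id)
        if is_seller is None:
            continue  # event from a user missing in the snapshot
        totals[(is_seller, scheme)] += n
    return dict(totals)


# Made-up sample data for illustration only.
events = [
    {"user_id": 1, "scheme": "https"},
    {"user_id": 1, "scheme": "http"},
    {"user_id": 2, "scheme": "http"},
    {"user_id": 1, "scheme": "https"},
]
users = {1: True, 2: False}  # user_id -> is_seller
print(ssl_traffic(events, users))
```

The output answers the question behind the goal of full-site SSL for sellers: how much seller traffic is still arriving over plain HTTP.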
  90. None
  91. Keeping current

  92. Two Factor Authentication

  93. None
  94. Do Etsy app users use two factor auth?

  95. Splunk & Vertica

  96. Proactively Realtime

  97. Content Security Policy Violations
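Browsers report Content Security Policy violations as a JSON document with a top-level `csp-report` object. A minimal sketch of turning such a report into a metric name that could be counted follows. The `security.csp` prefix and the function name are hypothetical, and the transcript does not show how Etsy processes these reports.

```python
import json


def csp_metric_name(raw_body: str, prefix: str = "security.csp") -> str:
    """Map a CSP violation report to a counter name keyed by directive."""
    report = json.loads(raw_body).get("csp-report", {})
    directive = report.get("violated-directive", "")
    # Keep only the directive name, e.g. "script-src" from "script-src 'self'".
    name = directive.split()[0] if directive.strip() else "unknown"
    return f"{prefix}.{name}"


# Made-up report body in the standard CSP report shape.
body = json.dumps({
    "csp-report": {
        "document-uri": "https://example.com/",
        "violated-directive": "script-src 'self'",
        "blocked-uri": "http://evil.example/x.js",
    }
})
print(csp_metric_name(body))
```

Counting violations per directive gives a time series that can be graphed and alerted on like any other application metric.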

  98. None
  99. Incident Response

  100. Needle in a haystack

  101. Simple Patterns: URL patterns, IP addresses

  102. analytics_cascade do
         analytics_flow do
           analytics_source 'access_logs'
           assembly 'incident_response' do
             query_event 'timestamp', 'request_uri', 'useragent', 'ip'
             where '"/bad_url.php".equals(request_uri:string)'
             group_by 'url' do
               count 'value'
             end
           end
           analytics_sink 'incident_response'
         end
       end
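The same needle-in-a-haystack query can be illustrated on a handful of in-memory records: keep only requests for the suspicious URL, then count them by a field of interest. Unlike the slide, which groups by `url`, this sketch groups by `ip` by default. That is a deliberate change for illustration. The records and the `hits_by_field` helper are made up.

```python
from collections import Counter


def hits_by_field(access_logs: list, bad_uri: str, field: str = "ip") -> Counter:
    """Count requests for bad_uri, grouped by the chosen log field."""
    return Counter(r[field] for r in access_logs if r["request_uri"] == bad_uri)


# Made-up access log records for illustration only.
logs = [
    {"timestamp": 1, "request_uri": "/bad_url.php", "useragent": "curl", "ip": "10.0.0.1"},
    {"timestamp": 2, "request_uri": "/index.php", "useragent": "curl", "ip": "10.0.0.1"},
    {"timestamp": 3, "request_uri": "/bad_url.php", "useragent": "wget", "ip": "10.0.0.2"},
    {"timestamp": 4, "request_uri": "/bad_url.php", "useragent": "curl", "ip": "10.0.0.1"},
]
print(hits_by_field(logs, "/bad_url.php"))
```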
  103. Phishing Attack In Two Parts

  104. Part One

  105. None
  106. None
  107. Part Two

  108. source="access_logs" client_ip=10.163.2.3 | transaction request_uri
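The search above narrows the access logs to one client IP and groups that client's events by `request_uri`. A loose Python approximation of the grouping is sketched below. Splunk's `transaction` command has further options that this sketch ignores, and the event records are made up.

```python
from collections import defaultdict


def transactions(events: list, client_ip: str, key: str = "request_uri") -> dict:
    """Group one client's events by `key`, each group in timestamp order."""
    groups = defaultdict(list)
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["client_ip"] == client_ip:
            groups[event[key]].append(event)
    return dict(groups)


# Made-up events for illustration only.
events = [
    {"timestamp": 3, "client_ip": "10.163.2.3", "request_uri": "/signin"},
    {"timestamp": 1, "client_ip": "10.163.2.3", "request_uri": "/signin"},
    {"timestamp": 2, "client_ip": "10.163.2.3", "request_uri": "/account"},
    {"timestamp": 4, "client_ip": "10.0.0.9", "request_uri": "/signin"},
]
for uri, group in transactions(events, "10.163.2.3").items():
    print(uri, [e["timestamp"] for e in group])
```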

  109. Collusion Fraud

  110. Look for patterns Incident Response

  111. Set up monitoring Be reactive

  112. Stay Aware Get proactive

  113. Conclusions

  114. Instrument your application at length. Understand security mechanisms. Use your data and use it often.
  115. None