Cyber Security Log Analytics at Decision Lab

Elastic Co
October 06, 2015

Building a seamless Cyber Security analytics tool starts by choosing the right stack. The team at DecisionLab shares how they've deployed Kibana, Logstash, Elasticsearch, Angular, and Node to kick Hadoop and Splunk to the curb. Other insights include architecture, automation, and deployment best practices.

Transcript

  1. Cyber Security Log Analytics
     Nathan Necaise and Drew Malone, Oct. 6, 2015
  2. Our Cyber Security Stories
     •  Why should we care
     •  What a solution should provide
     •  How did we get to ELK
        §  Hadoop
        §  Hadoop with Lambda
        §  MEAN Stack
        §  KLEAN Stack
     •  What about other COTS solutions
     •  Next
  3. So What’s the Big Deal?
     •  Data Exhaust
     •  Evolution of disparate data sources
     •  IRS and OPM breaches
     •  Shellshock, Heartbleed, and more
     •  Heterogeneous environment
     •  Reactionary
     •  Decisions based on anecdotal evidence
  4. What Does a Solution Need?
     •  Aggregation
     •  Correlation
     •  Alerting
     •  Compliance verification
     •  Data Retention
     •  Forensic/Historic analysis
     •  Work well with existing and new technologies
  5. How We Started
     •  What we could now do:
        §  Analysis of large volumes of data
        §  Excellent batch and historic insight
        §  Allowed both quick prototyping as well as fine grained control
     •  What was still not present:
        §  Real time analysis
        §  Barrier to entry via need for Java, Pig, etc.
        §  Output was not visual and intuitive
  6. Hadoop Solution

  7. Let’s build on this starting point
     •  What we gained:
        §  Real-time analysis
        §  Ability to perform low-latency queries into vast amounts of data
     •  What was still a challenge:
        §  We had to support a custom web application
        §  Missing flexibility in queries
        §  Lead time required for unforeseen questions
        §  More complexity
  8. Lambda Architecture

  9. Let’s Check out this MEAN Stack
     •  What worked well:
        §  Nice and low barrier to entry (JSON + JavaScript)
        §  Flattened the tech stack
     •  What challenges did we find:
        §  Higher sustainment and maintenance cost
        §  Scaling was more complex and time consuming
        §  Limitations in the aggregation pipeline for complex queries
        §  Still spending more time developing than solving problems
  10. MEAN Stack

  11. K, Let’s try to keep all the good and toss the bad…
     •  We already use Elasticsearch for full text searching. Why don’t we make it structured search and use it as a big data back-end?
     •  How did this work:
        §  Surprisingly easy transition (MEAN to KLEAN in prod ~5 days)
        §  Better query performance
        §  More flexibility in queries
        §  Crazy simple scalability
        §  Simple stack compared to Hadoop
        §  Support for both simple and custom ingest
        §  Kibana!
  12. KLEAN Stack

  13. That’s great, so what…
     •  Because our time is focused on solving problems, and not on maintenance and sustainment of the technology, we can do things like:
        §  Tell where vulnerable versions of software are present in a highly fragmented enterprise
        §  Deliver insight into data based on roles and accesses by integrating with in-house authorization services
        §  Retrospectively query for prior evidence of newly found malicious behavior
        §  Automatically discover the uncommonly common and anomalous trends
        §  Make data-based decisions and not rely on stale data and intuition
  14. Did we try Splunk?
     •  Sister projects used Splunk. We worked closely with large Splunk deployments.
     •  It works for some scenarios: moderate data, nice out of the box UIs
     •  Why didn’t we also use it?
        §  No way to test drive at scale
        §  500MB / day doesn’t allow me to determine if it will fit my needs
        §  High cost for our scale
        §  Our data is not always time series
        §  Poor performance compared to Elasticsearch at scale
        §  HUNK, emphasis on summarizing data before ingest
        §  Splunk licenses not renewed
  15. What have we done since?

  16. What next?
     •  Continue to resolve pain points
     •  Automate everything, everywhere
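
Slide 13 lists several query-style tasks without showing how they are expressed. As a rough illustration only, two of them (finding hosts that run a vulnerable software version, and surfacing the "uncommonly common") might be written as Elasticsearch search bodies like the ones below. This is a minimal sketch under stated assumptions: the function names and the field names (`package.name`, `package.version`, `host.name`, `process.name`) are hypothetical, since the deck does not show the team's mappings or queries. That the team used the `significant_terms` aggregation for the "uncommonly common" bullet is likewise an assumption. The `bool`/`filter` form follows the current Query DSL.

```python
import json


def vulnerable_hosts_query(package, bad_versions, size=100):
    """Search body that buckets hosts reporting a vulnerable package version.

    Field names are hypothetical; adapt them to the real index mapping.
    """
    return {
        "size": 0,  # return aggregation buckets only, no matching documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"package.name": package}},
                    {"terms": {"package.version": list(bad_versions)}},
                ]
            }
        },
        "aggs": {
            # one bucket per host that matched the filter
            "hosts": {"terms": {"field": "host.name", "size": size}}
        },
    }


def uncommonly_common_query(suspect_hosts, field="process.name"):
    """Search body that asks which values of `field` are over-represented
    on the given hosts relative to the rest of the index."""
    return {
        "size": 0,
        "query": {"terms": {"host.name": list(suspect_hosts)}},
        "aggs": {
            "unusual": {"significant_terms": {"field": field}}
        },
    }


if __name__ == "__main__":
    # Print the bodies as JSON; they could be sent to a _search endpoint.
    print(json.dumps(vulnerable_hosts_query("openssl", ["1.0.1f"]), indent=2))
    print(json.dumps(uncommonly_common_query(["web-01", "web-02"]), indent=2))
```

Because the bodies are plain dictionaries, the same functions also cover the retrospective case on slide 13: rerunning a query over older indices once a new indicator is known requires no change to the ingest pipeline.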