years in system management (SysAdmin)
• Founded Linux-HA project – led 1998-2007 – aka “Heartbeat” – now called Pacemaker
• Founded Assimilation Project in 2010
• Founded Assimilation Systems Limited in 2013
• Alumnus of Bell Labs, SuSE, IBM
• Do you think good security staff is easily available?
• Do you think security is going to get better soon?
• Do you think you have enough security staff to keep up with changes at DevOps / Agile rates?
30% of break-ins come through “lost” systems (Verizon)
90% have had failures of unmonitored services (Turnbull)
71% are unable to stay in compliance (Verizon)
30% only start monitoring after a problem (Turnbull)
30% of systems are doing nothing useful (Koomey)
Discovery drives everything
• Continuous extensible discovery (CMDB)
  – systems, switches, services, dependencies
  – zero-network-footprint discovery process
• Extensible exception monitoring
  – more than 100K systems
• Discovery drives Best Practice Analyses
  – initially concentrating on security
• All data goes into a central graph CMDB (Configuration Management Database)
“I see dead servers in O(1) time”
• Adding systems does not increase the monitoring work on any system
• Each server monitors 2 (or 4) neighbors (sketched below)
• Each server monitors and discovers its own services
• Ring repair and alerting is O(n) – but a very small amount of work
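The neighbor-monitoring idea can be pictured in a few lines of Python. This is purely illustrative (the real logic lives in the C nanoprobes and the Python CMA); the function name and ring ordering are invented for the sketch:

    # Illustrative sketch: each server in a sorted ring heartbeats with its two
    # immediate neighbors, so per-server monitoring work stays O(1) as systems
    # are added.
    def ring_neighbors(servers, name):
        """Return the two neighbors that 'name' exchanges heartbeats with."""
        ring = sorted(servers)
        i = ring.index(name)
        return ring[(i - 1) % len(ring)], ring[(i + 1) % len(ring)]

    # Adding a server only changes the neighbor sets of the systems adjacent to
    # the insertion point; everyone else keeps doing exactly the same work.
    print(ring_neighbors(["alpha", "bravo", "charlie", "delta"], "bravo"))  # ('alpha', 'charlie')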
Current Implementation
• Central Collective Management Authority (CMA)
  – written in Python
  – delegates most work to nanoprobes
  – does nothing as much as possible – doing nothing scales really well
  – should scale into the 100K-system range
• Fully distributed “nanoprobe” agents
  – simple, policy-free
  – written in 'C'
  – run scripts for monitoring or discovery
  – send/receive heartbeats (sketched below)
  – listen for ARP, LLDP, CDP packets
• Neo4j graph database
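A rough sketch of the nanoprobe heartbeat idea, written in Python for readability (the real agents are written in C; the port number and dead-time below are assumptions made only for this example):

    # Illustrative heartbeat send/timeout logic; port number and dead-time are
    # invented for this sketch, not taken from the project.
    import socket
    import time

    HEARTBEAT_PORT = 1984     # assumed UDP port, for illustration only
    DEADTIME = 10             # seconds of silence before a neighbor is presumed dead

    def send_heartbeat(sock, peer_addr):
        sock.sendto(b"HBT", (peer_addr, HEARTBEAT_PORT))

    def neighbor_is_dead(last_heard, now=None):
        """'No news is good news': only prolonged silence gets reported upstream."""
        now = time.time() if now is None else now
        return (now - last_heard) > DEADTIME

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_heartbeat(sock, "192.0.2.10")          # neighbor address is an example
    print(neighbor_is_dead(time.time() - 30))   # True: 30 seconds of silence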
Best Practice Analyses
• Triggered by discovery updates (example below)
  – analysis occurs within seconds of a change
  – no change => no analysis
• We can analyze anything discovered
• You can easily discover anything you want
• Alerts and reports available
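As a sketch of how a best-practice rule might react to a discovery update, here is a hypothetical check in Python; the rule, the JSON field names, and the alerting are all invented for illustration:

    # Hypothetical best-practice rule, evaluated when fresh discovery JSON
    # arrives for a system.  The JSON layout here is made up for the example.
    def sshd_permits_root_login(discovery):
        """Flag sshd configurations that still allow root logins."""
        sshd = discovery.get("sshd_config", {})
        return sshd.get("PermitRootLogin", "yes") != "no"

    def on_discovery_update(system, discovery):
        if sshd_permits_root_login(discovery):
            print(f"ALERT: {system}: sshd permits root logins")

    on_discovery_update("server42", {"sshd_config": {"PermitRootLogin": "yes"}})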
Everything will be discovered – nothing will be configured manually
• What needs hardening
• How to Triage your hardening issues
• How to Demonstrate and Track Progress
• How to keep them in compliance (hardened)
• Visualizing Your Attack Surface
• Who has what package+version (example below)
  – Docker package discovery too!
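Package+version discovery can be pictured with a small script like the one below. This is a sketch for a Debian-style host, not the project's actual discovery agent, and the JSON field names are assumptions:

    # Illustrative package+version discovery for a dpkg-based system; the real
    # project ships its own discovery agents and JSON schema.
    import json
    import subprocess

    def discover_packages():
        out = subprocess.run(
            ["dpkg-query", "-W", "-f", "${Package} ${Version}\n"],
            capture_output=True, text=True, check=True).stdout
        packages = dict(line.split(" ", 1) for line in out.splitlines() if line)
        return {"discovertype": "packages",      # assumed field names
                "data": packages}

    print(json.dumps(discover_packages(), indent=2))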
Where to find this Information online?
http://assimilationsystems.com/category/getting-started/
1. 15 Minutes To Better Security
2. An Hour To Better Security
3. A Half-Day To Better Security

Where to See Similar Demos
• http://assimilationsystems.com/category/videos/
• http://assimilationsystems.com/sample-demo-output/
Resistance Is Futile!
These slides: bit.ly/DevOpsDaysRox16
Mailing List: bit.ly/AssimML
@OSSAlanR
#assimilation on irc.freenode.net
Project Web Site: assimproj.org
Company Web Site: assimilationsystems.com
Download: assimilationsystems.com/download
Why a graph database? (Neo4j)
• Humans describe systems as graphs
• Dependency & discovery information is a graph
• Speed of graph traversals depends on subgraph size, not total graph size
• Root cause queries are graph traversals – notoriously slow in relational databases (example query below)
• Visualization is natural
• Schema-less design: good for a constantly changing heterogeneous environment
• Graph Model === Object Model
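For a concrete feel of the kind of root-cause traversal this enables, here is a hedged example using the official neo4j Python driver; the node label, relationship type, and credentials are invented and do not reflect the CMA's actual schema:

    # Hypothetical dependency traversal with the neo4j Python driver.
    # Label ("Service"), relationship type ("dependson"), and credentials
    # are assumptions for this sketch.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

    QUERY = """
    MATCH (svc:Service {name: $name})-[:dependson*1..5]->(dep)
    WHERE dep.status = 'down'
    RETURN dep.name AS possible_root_cause
    """

    with driver.session() as session:
        for record in session.run(QUERY, name="webserver"):
            print(record["possible_root_cause"])
    driver.close()

    # The traversal cost depends only on the subgraph reachable from the
    # starting service, which is why such queries stay fast as the CMDB grows.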
Monitoring Pros and Cons
Pros:
• Simple & scalable
• Uniform work distribution
• No single point of failure
• Distinguishes switch vs host failure
• Easy on LAN, WAN
• Multi-tenant approach
Cons:
• Active agents
• Potential slowness at power-on
Unique Powerful Features
1. Continuous Discovery
2. Discovery: zero network footprint
3. Centralized graph database
4. We know everything that changes
5. Discover and update dependency information
6. Discovery and monitoring tightly integrated
   – discovery drives automation
(more) Features...
7. Discovery and monitoring easily extensible
8. Naturally scalable to > 100K systems
9. Minimal network load
10. Server failures distinguishable from switch failures
11. Best practice and vulnerability alerts
12. Multi-tenant support
Fully distributed work
Two philosophical underpinnings:
1. Monitoring and discovery are fully distributed
2. Reliable “no news is good news”
Only responses to changes are centralized (sketched below)
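The “only responses to changes are centralized” idea can be sketched as an agent that re-runs discovery but reports upstream only when the result differs from what it last sent. This is an illustration in Python, not project code:

    # Illustrative "report only on change" logic: everything else stays local.
    import hashlib
    import json

    _last_digest = None

    def report_if_changed(discovery, send):
        """Send discovery JSON upstream only when it differs from the last report."""
        global _last_digest
        digest = hashlib.sha256(
            json.dumps(discovery, sort_keys=True).encode()).hexdigest()
        if digest != _last_digest:
            _last_digest = digest
            send(discovery)

    report_if_changed({"cpu_count": 8}, print)   # first observation: sent
    report_if_changed({"cpu_count": 8}, print)   # unchanged: nothing sent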
Service Monitoring based on HA Technologies
• Well-proven architecture
  – reliable “no news is good news”
• Implements the Open Cluster Framework (OCF) standard (LSB and others – Nagios coming!) (example below)
• Each system monitors its own services
• Can also start, stop, and migrate services
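To make the OCF connection concrete, the sketch below calls a resource agent's "monitor" action directly; the agent chosen (nginx from the standard resource-agents package) and its install path are examples, not the project's own code:

    # Illustrative direct invocation of an OCF resource agent's "monitor" action.
    import os
    import subprocess

    OCF_ROOT = "/usr/lib/ocf"                       # conventional install path
    agent = os.path.join(OCF_ROOT, "resource.d", "heartbeat", "nginx")

    env = dict(os.environ, OCF_ROOT=OCF_ROOT)       # agents expect OCF_ROOT set
    result = subprocess.run([agent, "monitor"], env=env)

    # Standard OCF exit codes: 0 = running (OCF_SUCCESS),
    # 7 = cleanly stopped (OCF_NOT_RUNNING); anything else is a failure.
    if result.returncode == 0:
        print("service is running")
    elif result.returncode == 7:
        print("service is not running")
    else:
        print("service monitor failed:", result.returncode)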
How does discovery work?
Nanoprobe scripts perform discovery
• Each discovers one kind of information
• Can take arguments from the environment
• Output JSON (example script below)
CMA stores Discovery Information
• JSON stored in Neo4j database
• CMA discovery plugins => graph nodes and relationships
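A minimal discovery script might look like the following: gather one kind of information, honor an environment argument, and print JSON for the CMA to store. The field names are assumptions made for this sketch, not the project's real schema:

    # Illustrative discovery script: one kind of information, arguments from
    # the environment, JSON on stdout.  Field names are invented.
    import json
    import os
    import platform

    def discover_os():
        return {
            "discovertype": "os",                   # assumed field name
            "description": "operating system facts",
            "data": {
                "nodename": platform.node(),
                "system": platform.system(),
                "release": platform.release(),
                "machine": platform.machine(),
            },
        }

    if os.environ.get("ASSIM_DEBUG"):               # example environment argument
        print("running os discovery", flush=True)
    print(json.dumps(discover_os(), indent=2))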
A Few Canned Queries
allipports      get all port/IP/service/host information
allswitchports  get switch connections
crashed         get crashed servers
shutdown        get gracefully shut-down servers
downservices    get non-working services
findip          get the system owning an IP address
findmac         get the system owning a MAC address
unknownips      get unknown IP addresses
unmonitored     get unmonitored services