Discovery and Monitoring Without Limit for COSUG 2013 - Colorado Springs Open Source Users Group

The Assimilation Project provides integrated IT discovery and monitoring aimed at risk management and mitigation. Discovery finds systems, services, and dependencies, including services you aren’t monitoring and systems you’ve forgotten about. About 30% of all outside security breaches come through forgotten systems. Discovery is continuous and has zero network footprint. Monitoring is extremely scalable due to a radically distributed architecture. Discovery informs monitoring, which simplifies configuration and maintenance.

The Assimilation Project software provides extremely scalable, easy-to-configure monitoring and creates a continually up-to-date, detailed configuration management database based on the Neo4j graph database. This talk gives an overview of the Assimilation Project: its capabilities, current status, and future plans.

Alan Robertson

November 21, 2013

Transcript

  1. CO Spgs OSUG
    IT Discovery and Monitoring
    Without Limit
    using
    The Assimilation Project
    #AssimProj @OSSAlanR
    http://assimproj.org/
    Alan Robertson
    Assimilation Systems Limited
    http://assimilationsystems.com

  2. CoSpgsOSUG
    21 November
    2013
    © 2013 Assimilation Systems Limited 2/36
    Project Scope
    Zero-network-footprint continuous Discovery
    integrated with extreme-scale Monitoring

    Continuous extensible discovery
    – systems, switches, services, dependencies
    – zero network footprint

    Extensible exception monitoring
    – more than 100K systems

    All data goes into central graph database

  3. Questions

    How many of you have monitoring?
    – Open or closed source?
    – How many of you are happy with it?

    How many of you have discovery?
    – Open or closed source?
    – Is it continuous?
    – How many of you are happy with it?

  4. Assimilation Project History

    Inspired by 2 million core computer (cyclops64)

    Concerns for extreme scale

    Topology aware monitoring

    Topology discovery w/out security issues
    => Discovery of everything!

  6. An 8-dimensional overview

    Problems Addressed

    Unique Capabilities

    Distribution of Work

    Architectural Components

    Discovery Graph Schema

    Extensible Discovery API

    Current Status

    Project Needs

  7. First Dimension: Problems Addressed
    Risk Management at extreme scale
    1. Maintaining detailed discovery database
    2. Discovering systems you've forgotten about
    3. Discovering what (licensed) software you're running – and where
    4. Monitoring services, systems and switches
    5. Finding services you aren't monitoring

  8. Risk Management/Mitigation

    Intrusions

    Licensed Software

    Audit Risk

    Outages

    System management

  9. Why Discovery? (DevOps)

    Documentation: incomplete, incorrect

    Dependencies: unknown

    Planning: Needs accurate data

    Best Practices: Verification needs data

    ITIL CMDB (Configuration Management Database)
    Our Discovery: continuous, low-profile

  10. Second Dimension: Unique Powerful Features
    1. Continuous Discovery
    2. Zero network discovery footprint
    3. Centralized graph database
    4. We know everything that changes
    5. Discover and update dependency information

  11. (even more) Features...
    6. Discovery and monitoring tightly integrated
    7. Discovery and monitoring easily extensible
    8. Naturally scalable to > 100K systems
    9. Server failures distinguishable from switch failures
    10. Minimal network load
    11. Multi-tenant support

  12. This all sounds unreasonable...

    Huge scalability without complexity?

    Discovery without sending packets?
    Really?

  13. Third Dimension: Uniformly, fully distributed work
    Two philosophical underpinnings
    1. Monitoring and Discovery are fully distributed
    2. Reliable “no news is good news”
    Only responses to changes are centralized

  14. Simple Scalability

    I can explain how we distribute work so your grandmother would understand

  15. Massive Scalability – or “I see dead servers in O(1) time”

    Adding systems does not increase the monitoring work on any system

    Each server monitors 2 (or 4) neighbors

    Each server monitors its own services

    Ring repair and alerting are O(n) – but a very small amount of work

    Ring repair for a million nodes is less than 10K packets per day (approximately 1 packet per 9 seconds)
    Current Implementation
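
The ring arrangement on this slide can be sketched in a few lines of Python. This is an illustration only, not the project's code: `ring_neighbors`, `seconds_between_packets`, and the host names are hypothetical, and the sketch assumes each server watches its immediate predecessor and successor in an ordered ring. It also checks the packet-rate arithmetic quoted above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ring_neighbors(ring, node):
    """Return the two ring neighbors that `node` would exchange heartbeats with.

    `ring` is an ordered list of host names; the last entry wraps to the first.
    """
    i = ring.index(node)
    n = len(ring)
    return ring[(i - 1) % n], ring[(i + 1) % n]


def seconds_between_packets(packets_per_day):
    """Average gap in seconds between packets for a given daily packet count."""
    return SECONDS_PER_DAY / packets_per_day


small_ring = ["host%d" % i for i in range(4)]
large_ring = ["host%d" % i for i in range(1000)]

# Each node has exactly two neighbors no matter how large the ring is.
print(ring_neighbors(small_ring, "host0"))   # ('host3', 'host1')
print(ring_neighbors(large_ring, "host0"))   # ('host999', 'host1')

# 10K packets per day is one packet about every 8.64 seconds.
print(seconds_between_packets(10_000))
```

The number of peers a node watches stays at two whether the ring holds 4 hosts or 1000, which is the sense in which per-node monitoring work does not grow with the installation.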

  16. Minimizing Network Footprint (planned)

    Support diagnosing switch issues

    Minimize network traffic

    Ideal for multi-site arrangements

  17. Fourth Dimension: Architectural Components
    Three Architectural Components
    Collective Management Authority (CMA)
    – One CMA per installation
    Nanoprobes
    – One nanoprobe per system
    Data Storage
    – Central Neo4j graph database

  18. Basic CMA Functions (Python)
    Nanoprobe management
    – Configure & direct
    – Hear alerts & discovery
    – Update rings: join/leave
    Update database
    Issue alerts

  19. Nanoprobe Functions ('C')
    Announce self to CMA
    – Reserved multicast address (can be unicast address or name if no multicast)
    Do what CMA says
    – receive configuration information: CMA addresses, ports, defaults
    – send/expect heartbeats
    – perform discovery actions
    – perform monitoring actions
    No persistent state across reboots
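
The "send/expect heartbeats" duty can be illustrated with a small sketch. It is written in Python for readability even though the slide says nanoprobes are written in C, and it is not the nanoprobe implementation: the `HeartbeatMonitor` class, the 10-second deadtime, and the peer names are all assumptions. Timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Track the last heartbeat time per peer and flag peers that go silent."""

    def __init__(self, deadtime):
        self.deadtime = deadtime      # seconds of silence before a peer is declared dead
        self.last_seen = {}           # peer name -> time of most recent heartbeat
        self.reported_dead = set()    # peers already reported, to avoid repeat alerts

    def expect(self, peer, now):
        """Start expecting heartbeats from `peer`."""
        self.last_seen[peer] = now

    def heartbeat(self, peer, now):
        """Record a heartbeat; a previously dead peer counts as alive again."""
        self.last_seen[peer] = now
        self.reported_dead.discard(peer)

    def check(self, now):
        """Return peers that newly exceeded the deadtime; healthy peers produce nothing."""
        newly_dead = [
            peer for peer, seen in sorted(self.last_seen.items())
            if now - seen > self.deadtime and peer not in self.reported_dead
        ]
        self.reported_dead.update(newly_dead)
        return newly_dead


monitor = HeartbeatMonitor(deadtime=10)
monitor.expect("left-neighbor", now=0)
monitor.expect("right-neighbor", now=0)
monitor.heartbeat("left-neighbor", now=8)

print(monitor.check(now=9))    # []  (no news is good news)
print(monitor.check(now=12))   # ['right-neighbor']
print(monitor.check(now=13))   # []  (already reported)
```

Only the transition to "silent" produces output, so a healthy ring generates no traffic toward the central authority.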

  20. Service Monitoring based on Linux-HA/Pacemaker LRM

    LRM == Local Resource Manager

    Well-proven architecture:
    – “no news is good news” AKA management by exception

    Implements Open Cluster Framework standard (and others)

    Each system monitors own services

    Can also start, stop, migrate services
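
Management by exception, as described on this slide, can be sketched as follows. This is a simplified illustration and not the LRM or its API: `ExceptionReporter`, the check callables, and the report callback are hypothetical names, and the sketch assumes a service is presumed healthy until a check says otherwise.

```python
class ExceptionReporter:
    """Run local service checks and report only when a result changes."""

    def __init__(self, checks, report):
        self.checks = checks    # service name -> zero-argument callable, True if healthy
        self.report = report    # callable(service, healthy), invoked only on a change
        self.last_status = {}   # service name -> last observed result

    def poll(self):
        for service, check in self.checks.items():
            healthy = bool(check())
            # Assumption of this sketch: an unseen service is presumed healthy,
            # so a first healthy result is not reported either.
            previous = self.last_status.get(service, True)
            if healthy != previous:
                self.report(service, healthy)
            self.last_status[service] = healthy


status = {"sshd": True}
events = []
reporter = ExceptionReporter(
    checks={"sshd": lambda: status["sshd"]},
    report=lambda service, healthy: events.append((service, healthy)),
)

reporter.poll()            # healthy: nothing reported
status["sshd"] = False
reporter.poll()            # failure: one report
reporter.poll()            # still failed: nothing new
status["sshd"] = True
reporter.poll()            # recovery: one report

print(events)   # [('sshd', False), ('sshd', True)]
```

Four polls produce two reports, one per state change, which is what keeps the central load proportional to the number of problems instead of the number of systems.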

  21. Monitoring Pros and Cons
    Pros
    – Simple & Scalable
    – Uniform work distribution
    – No single point of failure
    – Distinguishes switch vs host failure
    – Easy on LAN, WAN
    – Multi-tenant approach
    Cons
    – Active agents
    – Potential slowness at power-on

  22. Why a graph database? (Neo4j)

    Humans describe systems as graphs

    Dependency & Discovery information: graph

    Speed of graph traversals depends on size of subgraph, not total graph size

    Root cause queries => graph traversals – notoriously slow in relational databases

    Visualization is natural

    Schema-less design: good for constantly changing heterogeneous environment

    Graph Model === Object Model
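
The point that traversal cost follows the size of the reachable subgraph can be seen in a small sketch. A plain Python dictionary stands in for the graph database here; this is not Neo4j code, and the service names and "depends on" edges are invented for the example.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first traversal: return every node reachable from `start`, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical "depends on" edges.
depends_on = {
    "webapp": ["database", "cache"],
    "database": ["storage-host"],
    "cache": ["storage-host"],
}
# A large unrelated part of the graph that the query below never touches.
for i in range(10_000):
    depends_on["other%d" % i] = ["other%d" % (i + 1)]

print(reachable(depends_on, "webapp"))
# ['webapp', 'database', 'cache', 'storage-host']
```

The traversal visits four nodes even though the graph holds more than ten thousand, because it only follows edges out of nodes it has already reached.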

  23. Fifth Dimension: Discovery API
    Scripts perform discovery
    – output JSON
    Three Discovery Snippets

    OS information

    Service discovery

    Client discovery

  24. How does discovery work?
    Nanoprobe scripts perform discovery
    – Each discovers one kind of information
    – Can take arguments from environment
    – Output JSON
    CMA stores Discovery Information
    – JSON stored in Neo4j database
    – CMA discovery plugins => graph nodes and relationships
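
A discovery script of the kind described above might look roughly like this. It is a sketch under stated assumptions, not one of the project's scripts: the function name `discover_os` and the environment variable `EXAMPLE_DISCOVERY_KEYS` are made up, and only a subset of the OS facts is collected, using Python's standard `platform.uname()`.

```python
import json
import os
import platform


def discover_os(environ=os.environ):
    """Collect basic OS facts and return them as a JSON string."""
    uname = platform.uname()
    facts = {
        "nodename": uname.node,
        "kernel-name": uname.system,
        "kernel-release": uname.release,
        "kernel-version": uname.version,
        "machine": uname.machine,
    }
    # Hypothetical argument passed through the environment: a comma-separated
    # list of keys to keep. The variable name is invented for this sketch.
    wanted = environ.get("EXAMPLE_DISCOVERY_KEYS")
    if wanted:
        keep = {key.strip() for key in wanted.split(",")}
        facts = {key: value for key, value in facts.items() if key in keep}
    return json.dumps(facts, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(discover_os())
```

The script discovers one kind of information, takes its only argument from the environment, and writes JSON, matching the three properties listed for nanoprobe discovery scripts.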

  25. OS discovery JSON Snippet
    {
      "nodename": "alanr-1225B",
      "operating-system": "GNU/Linux",
      "machine": "x86_64",
      "processor": "x86_64",
      "hardware-platform": "x86_64",
      "kernel-name": "Linux",
      "kernel-release": "3.8.0-31-generic",
      "kernel-version": "#46-Ubuntu SMP ...",
      "Distributor ID": "Ubuntu",
      "Description": "Ubuntu 13.04",
      "Release": "13.04",
      "Codename": "raring"
    }

  26. sshd Service JSON Snippet (from netstat and /proc)
    "sshd": {
      "exe": "/usr/sbin/sshd",
      "cmdline": [ "/usr/sbin/sshd", "-D" ],
      "uid": "root",
      "gid": "root",
      "cwd": "/",
      "listenaddrs": {
        "0.0.0.0:22": {
          "proto": "tcp",
          "addr": "0.0.0.0",
          "port": 22
        }, and so on...

  27. ssh Client JSON Snippet (from netstat and /proc)
    "ssh": {
      "exe": "/usr/sbin/ssh",
      "cmdline": [ "ssh", "servidor" ],
      "uid": "alanr",
      "gid": "alanr",
      "cwd": "/home/alanr/monitor/src",
      "clientaddrs": {
        "10.10.10.5:22": {
          "proto": "tcp",
          "addr": "10.10.10.5",
          "port": 22
        }, and so on...

  28. Sixth Dimension: Graph Schema
    Two Schema subgraphs

    Client / server dependency

    Switch interconnect

  29. ssh -> sshd dependency graph
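
One way to derive an ssh -> sshd edge from the client and service snippets shown earlier is to match each client's remote address and port against the listening addresses of known services. The sketch below is illustrative only and is not the CMA's plugin code: `find_dependencies` and the per-host `"ips"` field are hypothetical, the address 10.10.10.2 is invented, and treating "servidor" as the owner of 10.10.10.5 is an assumption. It ignores the protocol field for brevity.

```python
def find_dependencies(hosts):
    """Match client connections to listening services and return dependency edges.

    `hosts` maps a host name to a dict with:
      "ips":      addresses owned by the host (hypothetical field for this sketch)
      "services": process name -> {"listenaddrs": {...}}
      "clients":  process name -> {"clientaddrs": {...}}
    """
    # Index every listening (ip, port) pair; 0.0.0.0 means all of the host's addresses.
    listeners = {}
    for host, info in hosts.items():
        for name, service in info.get("services", {}).items():
            for listen in service["listenaddrs"].values():
                addrs = info["ips"] if listen["addr"] == "0.0.0.0" else [listen["addr"]]
                for addr in addrs:
                    listeners[(addr, listen["port"])] = (host, name)

    edges = []
    for host, info in hosts.items():
        for name, client in info.get("clients", {}).items():
            for target in client["clientaddrs"].values():
                server = listeners.get((target["addr"], target["port"]))
                if server is not None:
                    edges.append(((host, name), "depends on", server))
    return edges


hosts = {
    "alanr-1225B": {
        "ips": ["10.10.10.2"],
        "clients": {
            "ssh": {"clientaddrs": {
                "10.10.10.5:22": {"proto": "tcp", "addr": "10.10.10.5", "port": 22}}},
        },
    },
    "servidor": {
        "ips": ["10.10.10.5"],
        "services": {
            "sshd": {"listenaddrs": {
                "0.0.0.0:22": {"proto": "tcp", "addr": "0.0.0.0", "port": 22}}},
        },
    },
}

print(find_dependencies(hosts))
# [(('alanr-1225B', 'ssh'), 'depends on', ('servidor', 'sshd'))]
```

Each returned tuple corresponds to one relationship that could be stored between a client process node and a service process node in the graph.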

  30. Switch Discovery Data from LLDP (or CDP)
    CRM transforms LLDP (CDP) Data to JSON
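
To give a feel for what turning LLDP data into JSON involves, here is a minimal decoder for a few LLDP TLVs. It is a sketch, not the project's transformation: the function names, the output key names, and the sample switch data are all invented. The TLV layout (a 7-bit type and 9-bit length, followed by the value) and the type numbers come from IEEE 802.1AB.

```python
import json
import struct

# TLV type numbers from IEEE 802.1AB; the key names are this sketch's own choice.
TLV_NAMES = {1: "ChassisId", 2: "PortId", 3: "TTL", 5: "SystemName"}


def lldp_to_json(payload):
    """Decode a few LLDP TLVs from `payload` (the bytes after the Ethernet header)."""
    result = {}
    offset = 0
    while offset + 2 <= len(payload):
        # Each TLV header is 16 bits: a 7-bit type followed by a 9-bit length.
        (header,) = struct.unpack_from("!H", payload, offset)
        tlv_type, length = header >> 9, header & 0x1FF
        value = payload[offset + 2 : offset + 2 + length]
        offset += 2 + length
        if tlv_type == 0:          # End of LLDPDU
            break
        name = TLV_NAMES.get(tlv_type)
        if name == "TTL":
            result[name] = int.from_bytes(value, "big")
        elif name in ("ChassisId", "PortId") and value:
            # The first byte is a subtype; the rest is shown as hex because its
            # meaning depends on that subtype.
            result[name] = {"subtype": value[0], "value": value[1:].hex()}
        elif name == "SystemName":
            result[name] = value.decode("utf-8", errors="replace")
    return json.dumps(result, sort_keys=True)


def tlv(tlv_type, value):
    """Build one TLV (helper for the example only)."""
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


sample = (
    tlv(1, b"\x04" + bytes.fromhex("001122334455"))   # chassis ID, subtype 4 (MAC address)
    + tlv(2, b"\x05" + b"eth0")                        # port ID, subtype 5 (interface name)
    + tlv(3, (120).to_bytes(2, "big"))                 # TTL in seconds
    + tlv(5, b"switch1")                               # system name
    + tlv(0, b"")                                      # end of LLDPDU
)
print(lldp_to_json(sample))
```

Unknown TLV types are skipped, so the decoder tolerates the optional and vendor-specific TLVs that real switches send.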

  31. Seventh Dimension: Current Status

    First release April 2013

    Great unit tests

    Nanoprobe code works well

    Several discovery methods written

    CMA restructuring finishing up

    UI development underway

    Licensed under GPL: commercial options available

  32. Eighth Dimension: Get Involved!
    We need every talent!

    Early adopters

    Testers, Continuous Integration

    Designers

    Developers (C, Python, Shell, PowerShell, JavaScript)

    Porters (esp. Windows)

    Promoters, publicists

    Packagers

    And so on...

  33. Resistance Is Futile!
    Mailing List: bit.ly/AssimML
    #AssimProj @OSSAlanR
    Project Web Site: assimproj.org
    Blog: techthoughts.typepad.com
    assimilationsystems.com