2015 Ohio LinuxFest Assimilation Talk

Overview of the Assimilation Project for the 2015 Ohio LinuxFest.

Alan Robertson

October 03, 2015

Transcript

  1. How to Painlessly Discover What You Don't Know - Before It Bites You
    #AssimProj @OSSAlanR
    http://assimproj.org/
    Alan Robertson
    Assimilation Systems Limited
    http://assimilationsystems.com

  2. Biography
    © 2015 Assimilation Systems Limited

    35+ years in IT/development – 10 years in
    system management (SysAdmin)

    Founded Linux-HA project - led 1998-2007 –
    aka “Heartbeat” - now called Pacemaker

    Founded Assimilation Project in 2010

    Founded Assimilation Systems Limited in 2013

    Alumnus of Bell Labs, SuSE, IBM

  3. Disturbing Trends...

    30% of all break-ins come through “lost” systems (Verizon)

    90% have had failures of unmonitored services (Turnbull)

    80% are unable to keep systems in compliance (Verizon)

    30% start monitoring only after a problem (Turnbull)

    30% of all systems are doing nothing useful (Koomey)

    Many sites have trouble scaling monitoring (Turnbull)

    Larger site admins often don’t know dependencies

    Documentation is incomplete, out of date, expensive

  4. Assimilation Project Evolution

    Inspired by 2 million core computer
    (cyclops64)

    Concerns for extreme scale

    Topology aware monitoring

    Topology discovery w/out security issues
    => Discovery of everything!

  5. A 6-dimensional overview
    1. System Management Suite Overview
    2. Basic Technology
    3. Best Practice Analyses
    4. Demo
    5. Current Status
    6. What You Need To Do!

  6. Why the Assimilation System Management Suite?

    Provides insight and details through a graph-model CMDB

    Helps you understand and automate your environment
    – Reduce Errors
    – Speed up problem resolution

    Reduces Manual Documentation

    CMDB-driven configuration => near-zero configuration

    Automates Monitoring

    Enhances Security

    Designed for Extreme Scale

  7. What's in the Suite?

    Graph CMDB

    Exception Monitoring

    Security Discovery

    Network Connections

  8. Complexity
    “Complexity is the enemy of reliability”

    Complexity is likely your single biggest problem
    – Near-zero configuration reduces complexity
    – Tight service integration reduces complexity
    – Accurate detailed view improves complexity
    management

  9. Highly Scalable Discovery-Driven Automation
    Continuous Discovery drives everything

    Continuous extensible discovery (CMDB)
    – systems, switches, services, dependencies
    – zero network footprint discovery process

    Extensible exception monitoring
    – more than 100K systems

    Discovery Drives Best Practice Analyses
    – Initially concentrating on security

    All data goes into central graph CMDB

  10. This all sounds unreasonable...

    Huge scalability without complexity?

    Discovery without pings or port scans?
    Really?

  11. Simple Scalability
    I can explain how we scale so your
    grandmother would understand...
    istockphoto
    ©bowdenimages

  12. Massive Scalability – or “I see dead servers in O(1) time”

    Adding systems does not increase the monitoring work on any system

    Each server monitors 2 (or 4) neighbors

    Each server monitors and discovers its own services

    Ring repair and alerting is O(n) – but a very small amount of work
    Current Implementation
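
The ring arrangement described on this slide can be sketched in a few lines. This is only an illustration of the idea, not the project's implementation: the function name `ring_neighbors`, the `per_side` parameter, and the host names are all hypothetical.

```python
def ring_neighbors(hosts, index, per_side=1):
    """Return the neighbors the host at `index` would monitor in a ring.

    per_side=1 gives 2 neighbors, per_side=2 gives 4. The work per host
    depends only on per_side, never on len(hosts).
    """
    n = len(hosts)
    neighbors = []
    for offset in range(1, per_side + 1):
        # One neighbor on each side of the ring, wrapping around the ends.
        for position in ((index - offset) % n, (index + offset) % n):
            candidate = hosts[position]
            # Skip self and duplicates, which occur only in very small rings.
            if candidate != hosts[index] and candidate not in neighbors:
                neighbors.append(candidate)
    return neighbors


# Hypothetical host names; the ring size does not change the per-host work.
hosts = ["host%d" % i for i in range(100000)]
print(ring_neighbors(hosts, 0))      # 2 neighbors
print(ring_neighbors(hosts, 0, 2))   # 4 neighbors
```

Growing the list from 100 hosts to 100,000 leaves each host with the same two (or four) neighbors to watch, which is the sense in which per-system monitoring work is O(1).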

  13. Minimizing Network Footprint (in our roadmap)

    Support diagnosing switch issues

    Minimize network traffic

    Ideal for multi-site arrangements

  14. Service Monitoring based on HA Technologies

    Well-proven architecture:
    – reliable “no news is good news”

    Implements Open Cluster Framework
    standard (LSB and others – Nagios coming!)

    Each system monitors own services

    Can also start, stop, migrate services

  15. How does discovery work?
    Nanoprobe scripts perform discovery

    Each discovers one kind of information

    Can take arguments from environment

    Output JSON

    CMA stores Discovery Information

    JSON stored in Neo4j database

    CMA discovery plugins => graph nodes and relationships
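
A discovery script in this style gathers one kind of information and prints it as JSON. The sketch below is a stand-in written for illustration, not one of the project's actual scripts: it uses Python's standard `platform` module, and the function name `discover_os` is hypothetical. The key names are borrowed from the OS discovery snippet shown in the slides.

```python
import json
import platform


def discover_os():
    """Collect one kind of information (OS identity) as a plain dict."""
    uname = platform.uname()
    return {
        "nodename": uname.node,
        "machine": uname.machine,
        "kernel-name": uname.system,
        "kernel-release": uname.release,
        "kernel-version": uname.version,
    }


if __name__ == "__main__":
    # A discovery script's only output is a JSON document on stdout.
    print(json.dumps(discover_os(), indent=2))
```

Because the output is plain JSON, whatever receives it can store the document as-is and derive graph nodes from it later.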

  16. OS discovery JSON Snippet
    { "nodename": "alanr-1225B",
    "operating-system": "GNU/Linux",
    "machine": "x86_64",
    "processor": "x86_64",
    "hardware-platform": "x86_64",
    "kernel-name": "Linux",
    "kernel-release": "3.8.0-31-generic",
    "kernel-version": "#46-Ubuntu SMP ...",
    "Distributor ID": "Ubuntu",
    "Description": "Ubuntu 13.04",
    "Release": "13.04",
    "Codename": "raring" }

  17. sshd Service JSON Snippet (from netstat and /proc)
    "sshd": {
      "exe": "/usr/sbin/sshd",
      "cmdline": [ "/usr/sbin/sshd", "-D" ],
      "uid": "root",
      "gid": "root",
      "cwd": "/",
      "listenaddrs": {
        "0.0.0.0:22": {
          "proto": "tcp",
          "addr": "0.0.0.0",
          "port": 22 },

  18. ssh Client JSON Snippet (from netstat and /proc)
    "ssh": {
      "exe": "/usr/sbin/ssh",
      "cmdline": [ "ssh", "servidor" ],
      "uid": "alanr",
      "gid": "alanr",
      "cwd": "/home/alanr/monitor/src",
      "clientaddrs": {
        "10.10.10.5:22": {
          "proto": "tcp",
          "addr": "10.10.10.5",
          "port": 22 },
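
The two snippets carry enough information to infer a dependency: a client's `clientaddrs` entry points at an address and port that some service's `listenaddrs` entry accepts. The sketch below shows one way such matching could work. The input layout (a per-host dict with `ips` and `processes`), the function name `find_dependencies`, the address 10.10.10.20, and the assumption that the host `servidor` owns 10.10.10.5 are all hypothetical.

```python
def find_dependencies(hosts):
    """Return (client, server) pairs, each a (host, process) tuple."""
    # Index every listening (ip, port) pair by the service that owns it.
    listeners = {}
    for host, info in hosts.items():
        for name, proc in info["processes"].items():
            for listen in proc.get("listenaddrs", {}).values():
                # A wildcard listener accepts connections on all host IPs.
                if listen["addr"] == "0.0.0.0":
                    addrs = info["ips"]
                else:
                    addrs = [listen["addr"]]
                for addr in addrs:
                    listeners[(addr, listen["port"])] = (host, name)

    # Match each client connection against the listener index.
    edges = []
    for host, info in hosts.items():
        for name, proc in info["processes"].items():
            for client in proc.get("clientaddrs", {}).values():
                server = listeners.get((client["addr"], client["port"]))
                if server is not None:
                    edges.append(((host, name), server))
    return edges


hosts = {
    "servidor": {
        "ips": ["10.10.10.5"],  # assumed address of the ssh target
        "processes": {"sshd": {"listenaddrs": {
            "0.0.0.0:22": {"proto": "tcp", "addr": "0.0.0.0", "port": 22}}}},
    },
    "alanr-1225B": {
        "ips": ["10.10.10.20"],  # hypothetical address
        "processes": {"ssh": {"clientaddrs": {
            "10.10.10.5:22": {"proto": "tcp", "addr": "10.10.10.5", "port": 22}}}},
    },
}
print(find_dependencies(hosts))
```

No packets are sent to work this out: both sides of the connection were reported locally by the machines involved.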

  19. Service Dependency Graph

  20. Switch Discovery Graph from LLDP (or CDP)

  21. Why a graph database? (Neo4j)

    Humans describe systems as graphs

    Dependency & Discovery information: graph

    Speed of graph traversals depends on size of
    subgraph, not total graph size

    Root cause queries => graph traversals –
    notoriously slow in relational databases

    Visualization is Natural

    Schema-less design: good for constantly changing
    heterogeneous environment

    Graph Model === Object Model
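
The point about traversal cost can be illustrated with a small in-memory graph. This is a generic breadth-first traversal over a Python dict, not a Neo4j query, and the service names and "depends on" edges are invented for the example.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first traversal; visits only nodes reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)
    return seen


# Hypothetical "depends on" edges: service -> services it needs.
depends_on = {
    "webapp": ["database", "cache"],
    "database": ["storage"],
    "cache": [],
    "storage": [],
    "mail": ["dns"],  # unrelated subgraph, never visited from "webapp"
    "dns": [],
}
print(sorted(reachable(depends_on, "webapp")))
```

The traversal from `webapp` never touches `mail` or `dns`, so adding thousands of unrelated nodes would not change the work done for this query.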

  22. A Few Canned Queries
    allipports       get all port/ip/service/hosts
    allswitchports   get switch connections
    crashed          get crashed servers
    shutdown         get gracefully shutdown servers
    downservices     get nonworking services
    findip           get system owning IP
    findmac          get system owning MAC
    unknownips       get unknown IP addresses
    unmonitored      get unmonitored services

  23. Best Practice Analyses
    Under active development

    Triggered by Discovery Updates
    – Analysis occurs within seconds of change
    – No change => No analysis

    We can analyze anything discovered

    Expect to create alerts and reports

    SIEM integration
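
The "No change => No analysis" behavior can be sketched with a digest comparison. This is an assumed mechanism chosen for illustration; the slides do not say how change detection is done, and the class name `ChangeTriggeredAnalyzer` and its methods are hypothetical.

```python
import hashlib
import json


class ChangeTriggeredAnalyzer:
    """Run an analysis callback only when discovery data actually changes."""

    def __init__(self, analyze):
        self._analyze = analyze
        self._digests = {}

    def submit(self, key, discovery):
        # Canonical JSON so that key order does not look like a change.
        canonical = json.dumps(discovery, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(canonical).hexdigest()
        if self._digests.get(key) == digest:
            return False  # no change => no analysis
        self._digests[key] = digest
        self._analyze(key, discovery)
        return True


analyzer = ChangeTriggeredAnalyzer(lambda key, data: print("analyzing", key))
analyzer.submit(("host1", "os"), {"Release": "13.04"})   # analyzed
analyzer.submit(("host1", "os"), {"Release": "13.04"})   # skipped, unchanged
```

Keying the digests by (host, discovery type) means a change in one kind of data triggers only the analyses that depend on it.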

  24. Sample Security Best Practices

    Inappropriate services (telnet, etc)

    Settings in /proc/sys/

    Security Patch Coverage
    – OS vendor (RedHat, SuSE, Canonical, etc)
    – Application (Oracle, IBM, WordPress, etc)

    Other OS settings

    Common Application Settings

    Looking at best practices
    FYI: Collaborating with Lynis project and Linux Foundation

  25. Other Sample Security Features

    Discovery of “forgotten” IP addresses

    Monitoring of Open Ports and Services

    Collection of network-facing app checksums

    Nmap profiling of new MAC addresses

    Checksum outliers analysis

    Security Best Practice Analyses
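
One plausible reading of "checksum outliers analysis" is a majority comparison: if most hosts report the same checksum for a binary, the hosts that differ deserve a look. The slides do not describe the actual method, so the sketch below is an assumption, and the function name `checksum_outliers` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def checksum_outliers(observations):
    """Map each path to the hosts whose checksum differs from the majority.

    `observations` is an iterable of (host, path, checksum) tuples.
    Ties for the most common checksum are broken arbitrarily.
    """
    by_path = defaultdict(dict)
    for host, path, checksum in observations:
        by_path[path][host] = checksum

    outliers = {}
    for path, per_host in by_path.items():
        majority, _count = Counter(per_host.values()).most_common(1)[0]
        odd = sorted(h for h, c in per_host.items() if c != majority)
        if odd:
            outliers[path] = odd
    return outliers


observations = [
    ("host1", "/usr/sbin/sshd", "aaa111"),
    ("host2", "/usr/sbin/sshd", "aaa111"),
    ("host3", "/usr/sbin/sshd", "fff999"),  # differs from the other two
]
print(checksum_outliers(observations))
```

A differing checksum is not proof of tampering (hosts may simply run different package versions), so a result like this is a prompt for investigation rather than an alarm.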

  26. IT Best Practices Project
    ITBestPractices.info

    IT-Bestpractices GitHub project

    Working on Linux Foundation Sponsorship

    Apache 2 License (or similar)

    Initial Sources
    – DISA STIGs
    – Lynis project
    – Individual contributions

  27. IT Best Practices Goals

    Make Best Practice rules available in JSON
    – Curate mechanically-verifiable practices
    – Human-readable descriptions of issues and
    remedies
    – Multiple language support
    – Not limited to security best practices
    – Web server under development

  28. Sample short description
    The system must limit the ability of processes to
    have simultaneous write and execute access to
    memory.

  29. Sample long description
    ExecShield uses the segmentation feature on all
    x86 systems to prevent execution in memory
    higher than a certain address. It writes an address
    as a limit in the code segment descriptor, to control
    where code can be executed, on a per-process
    basis. When the kernel places a process's memory
    regions such as the stack and heap higher than
    this address, the hardware prevents execution in
    that address range.

  30. Sample Security Rule check
    The status of the "kernel.exec-shield" kernel parameter can
    be queried by running the following command:
    $ sysctl kernel.exec-shield
    $ grep kernel.exec-shield /etc/sysctl.conf
    The output of the command should indicate a value of "1". If
    this value is not the default value, investigate how it could
    have been adjusted at runtime, and verify it is not set
    improperly in "/etc/sysctl.conf".
    If the correct value is not returned, this is a finding.
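
The manual check above can be mirrored in code, because a dotted sysctl name corresponds to a file under /proc/sys with the dots replaced by path separators. The helper names `sysctl_path` and `read_sysctl` are hypothetical, and the sketch ignores the edge case of names whose components themselves contain dots (some network interface names).

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the parameter's value as a string, or None if unreadable."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        # The parameter does not exist on this kernel, or is not readable.
        return None


print(sysctl_path("kernel.exec-shield"))
print(read_sysctl("kernel.exec-shield"))
```

Returning None for a missing parameter lets a caller distinguish "not present on this kernel" from "present with the wrong value".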

  31. Assimilation /proc/sys Rule
    Disallow executing code on writable pages
    "nist_V-38597":
    {"rule": "EQ($kernel.exec-shield, 1)",
    "category": "security"
    }

  32. Assimilation Networking Rule
    Buffer bloat prevention
    "itbp-0001":
    {"rule": "IN($net.core.default_qdisc, fq_codel, codel)",
    "category": "networking"
    }
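
To make the rule syntax concrete, here is a small evaluator for expressions of this shape. The function names (`EQ`, `IN`, and the `OR` and `NE` that appear in the sample /proc/sys rules later in the deck) come from the slides, but their semantics here are assumptions, as are the helper names `tokenize`, `parse`, `evaluate`, and `check`. It is not the project's rule engine. All values are compared as strings, and the queueing-discipline example uses `net.core.default_qdisc`, the name under which Linux exposes that setting.

```python
import re

# A token is punctuation, a double-quoted string, or a bare word.
TOKEN = re.compile(r'\s*(?:([(),])|"([^"]*)"|([^\s(),"]+))')


def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("bad rule text at offset %d" % pos)
        punct, quoted, bare = match.groups()
        if punct:
            tokens.append(("punct", punct))
        elif quoted is not None:
            tokens.append(("value", quoted))
        else:
            tokens.append(("word", bare))
        pos = match.end()
    return tokens


def parse(tokens, pos=0):
    """Parse one expression; return (tree, next position)."""
    kind, text = tokens[pos]
    if kind == "word" and pos + 1 < len(tokens) and tokens[pos + 1] == ("punct", "("):
        args, pos = [], pos + 2
        while tokens[pos] != ("punct", ")"):
            arg, pos = parse(tokens, pos)
            args.append(arg)
            if tokens[pos] == ("punct", ","):
                pos += 1
        return ("call", text, args), pos + 1
    if kind == "punct":
        raise ValueError("unexpected %r" % text)
    if kind == "word" and text.startswith("$"):
        return ("var", text[1:]), pos + 1
    return ("lit", text), pos + 1


def evaluate(node, values):
    if node[0] == "lit":
        return node[1]
    if node[0] == "var":
        return str(values[node[1]])  # KeyError if the value was not discovered
    _, name, args = node
    results = [evaluate(arg, values) for arg in args]
    if name == "EQ":
        return results[0] == results[1]
    if name == "NE":
        return results[0] != results[1]
    if name == "IN":
        return results[0] in results[1:]
    if name == "OR":
        return any(results)
    raise ValueError("unknown function %s" % name)


def check(rule, values):
    tree, _ = parse(tokenize(rule))
    return evaluate(tree, values)


print(check("EQ($kernel.exec-shield, 1)", {"kernel.exec-shield": 1}))
print(check("IN($net.core.default_qdisc, fq_codel, codel)",
            {"net.core.default_qdisc": "pfifo_fast"}))
```

Keeping rules as data in this form is what allows the same rule set to be curated once and checked mechanically against whatever discovery reports.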

  33. Discovery / Monitoring / Best Practices Demo

    Demonstrate basic capabilities
    – Discovery-driven monitoring configuration
    – Discovery-driven 'tripwire-like' checksums
    – Monitoring – failures / successes
    – Host down notification
    – Best Practices

    No configuration was supplied
    – everything comes from discovery
    http://assimilationsystems.com/90_second_demo/

  34. Current Status

    1.0 (Independence Day) release out 4 July 2015

    Security is our next major emphasis

    Great unit and system tests

    Strongly encrypted communication

    Quite a few discovery methods written

    Extensible Automated Discovery Triggers

    Discovery => Automatic Monitoring + Network-Facing Checksums

    Compatible with Nagios remote monitoring agent API

    REST + Command Line Queries

  35. Get Involved!

    Trials! Early Adopters!

    Contributors
    – Testers, Continuous Integration
    – Best practice experts
    – Designers
    – Developers (C, Python, Shell, PowerShell, JavaScript)
    – Porters (esp Windows)
    – Promoters, Publicists, Packagers, etc.

  36. Resistance Is Futile!
    These slides: bit.ly/DOSUG0915
    Mailing List: bit.ly/AssimML
    @OSSAlanR
    #assimilation on irc.freenode.net
    Project Web Site: assimproj.org
    Company Web Site: assimilationsystems.com
    Download: assimilationsystems.com/download

  37. Risk Management/Mitigation

    Intrusions

    Vulnerable Software

    Licensed Software

    Audit Risk

    Outages

    System management

  38. Monitoring Pros and Cons
    Pros
    – Simple & Scalable
    – Uniform work distribution
    – No single point of failure
    – Distinguishes switch vs host failure
    – Easy on LAN, WAN
    – Multi-tenant approach

    Cons
    – Active agents
    – Potential slowness at power-on

  39. Sixth Dimension: Graph Schema
    Two Schema subgraphs

    Client / server dependency

    Switch interconnect

  40. First Dimension: Problems Addressed

    Discovering and maintaining documentation
    (CMDB) using continuous discovery
    – Services, Systems, Dependencies, Switches, Interconnects,
    Configuration

    Monitoring and alerting: services, systems and
    compliance

    Managing compliance

    Mitigating risk

  41. Why Discovery? (DevOps)

    Documentation: incomplete, incorrect

    Dependencies: unknown

    Planning: Needs accurate data

    Best Practices: Verification needs data

    ITIL CMDB (Configuration Management
    Data Base)
    Our Discovery: continuous, low-profile

  42. Second Dimension: Unique Powerful Features
    1. Continuous Discovery
    2. Discovery: Zero network footprint
    3. Centralized graph database
    4. We know everything that changes
    5. Discover and update dependency information
    6. Discovery and monitoring tightly integrated –
    discovery drives automation

  43. (even more) Features...
    7. Discovery and monitoring easily extensible
    8. Naturally scalable to > 100K systems
    9. Minimal network load
    10. Server failures distinguishable from switch failures
    11. Best practice and vulnerability alerts
    12. Multi-tenant support

  44. Third Dimension: Fully distributed work
    Two philosophical underpinnings
    1. Monitoring and Discovery are fully distributed
    2. Reliable “no news is good news”
    Only responses to changes are centralized

  45. Sample /proc/sys Rules
    "BPC-00002-1":
    {"rule": "OR(EQ($kernel.core_uses_pid, 1), NE($kernel.core_pattern, \"\"))",
    "url": "https://trello.com/c/6LOXeyDD" },
    "BPC-00003-1": {"rule": "EQ($kernel.ctrl-alt-del, 0)",
    "url": "https://trello.com/c/aUmn4WFg"},
    "BPC-00006-1": {"rule": "EQ($kernel.sysrq, 0)",
    "url": "https://trello.com/c/QSovxhup" },
