Monitoring with Sensu Go

portertech
November 06, 2018

Applications are complex systems. Their many moving parts, including component and dependency services, may span any number of infrastructure technologies and platforms, from bare metal to serverless. As the number of services increases, the teams responsible for them naturally develop their own preferences, such as how they instrument their code or how and when they receive alerts. Sean will demonstrate how Sensu Go is designed to monitor these ever-changing, heterogeneous environments. Sensu Go is the next release of the open source monitoring framework, rewritten in Go, with new capabilities and reduced operational overhead. Sean will go over various patterns of data collection, including scraping Prometheus metrics, and show how Sensu enables self-service monitoring and alerting for service owners.

Transcript

  1. Monitoring with Sensu Go. Sean Porter, Co-Founder & CTO, Sensu Inc. OSMC 2018
  2. • Sean Porter • Author of Sensu • CTO for Sensu Inc. • @portertech
  3. We solve our problems with technology.

  4. We create new problems with technology.

  5. None
  6. Illustration by Fredrik Skarstedt

  7. HOST APP APP APP APP

  8. HOST VM VM APP APP APP APP

  9. HOST VM VM APP APP APP APP

  10. COMPLEXITY TIME

  11. None
  12. Ephemeral infrastructure is the new normal.

  13. # OF THINGS TIME Containers Servers VMs Functions

  14. None
  15. Multigenerational infrastructure is a thing.

  16. None
  17. 32%

  18. None
  19. None
  20. None
  21. My Story

  22. Sonian (2010)

  23. • Multiple “clouds” • Ephemeral infrastructure • Public networks • Service check investment
  24. July 2011

  25. None
  26. February 2013

  27. January 2017

  28. Sensu Go

  29. None
  30. Design

  31. None
  32. None
  33. None
  34. None
  35. None
  36. None
  37. None
  38. None
  39. None
  40. None
  41. None
  42. { timestamp: 1516663186, entity: { … }, check: { … }, metrics: { ... } }
  43. None
  44. None
  45. None
  46. None
  47. Configuration: • Backend REST API • sensuctl (CLI tool) • WebUI
  48. Configuration: • RBAC • Namespaces

  49. None
  50. None
  51. 3 Methods: the three methods of data collection with Sensu.

  52. 1. Service Checks

  53. Service Checks: • Script • STDOUT (message and data) • Exit code (severity)
  54. Service Checks: check_mysql -H localhost -P 3360
    Uptime: 798 Threads: 1 Questions: 5 Slow queries: 0 Opens: 107 Flush tables: 1 Open tables: 26 Queries per second avg: 0.006|Connections=9c;;; Open_files=6;;; Open_tables=27;;; Qcache_free_memory=16760152;;; Qcache_hits=0c;;; Qcache_inserts=0c;;; Qcache_lowmem_prunes=0c;;; Qcache_not_cached=1c;;; Qcache_queries_in_cache=0;;; Queries=6c;;; Questions=4c;;; Table_locks_waited=0c;;; Threads_connected=1;;; Threads_running=1;;; Uptime=798c;;;
    Exit 0 (OK)
  55. Service Checks: check_mysql -H localhost -P 3360
    Can't connect to MySQL server on 'localhost'
    Exit 2 (CRITICAL)
  56. { timestamp: 1516663186, entity: { … }, check: { command: "check_mysql -H ...", output: "Can't connect ...", status: 2, … }, metrics: { ... } }
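The service check contract shown in the slides above (a message on STDOUT, severity in the exit code) can be sketched in a few lines. This is a minimal illustration, not the `check_mysql` plugin: the script name `check_tcp.py`, the `check_tcp` and `main` helpers, and the choice of a plain TCP connection test are all assumptions made for the example.

```python
import socket

# Conventional service check exit codes: the exit code carries the severity.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host, port, timeout=2.0, connect=socket.create_connection):
    """Return (exit_code, message) for a single TCP connection attempt.

    `connect` is injectable so the logic can be exercised without a network.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError as exc:
        return CRITICAL, f"CRITICAL: can't connect to {host}:{port} ({exc})"
    conn.close()
    return OK, f"OK: connected to {host}:{port}"


def main(argv):
    """Print the check message to STDOUT and return the exit code."""
    if len(argv) != 3:
        print("UNKNOWN: usage: check_tcp.py HOST PORT")
        return UNKNOWN
    status, message = check_tcp(argv[1], int(argv[2]))
    print(message)
    return status
```

To use it as a real check, the script would end with `sys.exit(main(sys.argv))` so that the monitoring agent can read the severity from the process exit code.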
  57. Symptoms

  58. None
  59. Service Checks: check_mysql -H localhost -P 3360
    Uptime: 798 Threads: 1 Questions: 5 Slow queries: 0 Opens: 107 Flush tables: 1 Open tables: 26 Queries per second avg: 0.006|Connections=9c;;; Open_files=6;;; Open_tables=27;;; Qcache_free_memory=16760152;;; Qcache_hits=0c;;; Qcache_inserts=0c;;; Qcache_lowmem_prunes=0c;;; Qcache_not_cached=1c;;; Qcache_queries_in_cache=0;;; Queries=6c;;; Questions=4c;;; Table_locks_waited=0c;;; Threads_connected=1;;; Threads_running=1;;; Uptime=798c;;;
    Exit 0 (OK)
  60. { timestamp: 1516663186, entity: { … }, check: { … }, metrics: { handlers: [influxdb], points: [{ name: mysql.connections, value: 9, tags: [ … ] }] } }
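The step from check output to metric points in the two slides above can be illustrated by parsing the performance data that follows the `|` in the output. The sketch below is an illustration of the idea only, not Sensu's own parser: the `parse_perfdata` function, the `prefix` argument, and the lowercase naming rule are assumptions, and quoted labels containing spaces are not handled.

```python
import re

# label=value[unit], optionally followed by ;warn;crit;min;max thresholds.
PERFDATA = re.compile(r"^(?P<label>[^=]+)=(?P<value>-?\d+(?:\.\d+)?)(?P<unit>[^;]*)")


def parse_perfdata(output, prefix, timestamp):
    """Turn 'message|label=value[unit];;; ...' into a list of metric points."""
    _, _, perfdata = output.partition("|")
    points = []
    for token in perfdata.split():
        match = PERFDATA.match(token)
        if match is None:
            continue  # skip anything that is not a label=value pair
        points.append({
            "name": f"{prefix}.{match['label'].lower()}",
            "value": float(match["value"]),
            "timestamp": timestamp,
            "tags": [],
        })
    return points


output = "Uptime: 798 Threads: 1|Connections=9c;;; Open_files=6;;;"
points = parse_perfdata(output, prefix="mysql", timestamp=1516663186)
```

With the shortened output above, the first point is named `mysql.connections` with value `9.0`, matching the shape of the `points` entry on the slide.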
  61. None
  62. None
  63. Service Checks: • Simple • Accessible • Shareable • Legacy

  64. 2. Events API

  65. Events API: • REST API (Agent & Backend) • Entity management • External checks • Metrics
  66. POST /events { timestamp: 1516663186, entity: { … }, check: { … }, metrics: { ... } }
  67. { timestamp: 1516663186, entity: { name: leviathan, class: application, tags: [ … ], ... }, check: { … }, metrics: { ... } }
  68. { timestamp: 1516663186, entity: { … }, check: { output: "Backup failed ...", status: 2, ttl: 6h, … }, metrics: { ... } }
  69. { timestamp: 1516663186, entity: { … }, check: { … }, metrics: { handlers: [influxdb], points: [{ name: mysql.connections, value: 9, tags: [ … ] }] } }
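Posting an external check result such as the failed backup above comes down to an HTTP POST with a JSON body. The sketch below only builds and (optionally) sends such a request. The URL, the `build_event_request` and `send_event` helpers, the `name` field, and the TTL expressed in seconds are assumptions for illustration; the field names otherwise follow the slides, and the real event schema may require more or differently shaped fields.

```python
import json
import time
import urllib.request

# Assumption: an agent API reachable locally at this address; adjust as needed.
AGENT_EVENTS_URL = "http://127.0.0.1:3031/events"


def build_event_request(url, check_name, output, status, ttl_seconds=None):
    """Build a POST request carrying a check result as a JSON event."""
    check = {"name": check_name, "output": output, "status": status}
    if ttl_seconds is not None:
        # A TTL lets the monitoring side notice when results stop arriving.
        check["ttl"] = ttl_seconds
    body = json.dumps({"timestamp": int(time.time()), "check": check})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_event(request, timeout=5.0):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_event_request(
    AGENT_EVENTS_URL, "backup", "Backup failed", status=2, ttl_seconds=6 * 3600
)
```

A backup job could call `send_event(request)` at the end of each run, so a missing result within the TTL window becomes visible as well as an explicit failure.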
  70. 3. StatsD

  71. StatsD: • Agent listeners (TCP & UDP) • Stats aggregation • Gauges, counters, etc. • Protocol enhancements (tags)
  72. <name>:<value>|c[|@<sample rate>]

  73. { timestamp: 1516663186, entity: { … }, check: { … }, metrics: { handlers: [influxdb], points: [{ name: http.requests, value: 42, tags: [{ name: app, value: store }] }] } }
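The counter format `<name>:<value>|c[|@<sample rate>]` shown above is simple enough to emit by hand. This is a minimal sketch under stated assumptions: the `format_counter` and `send_statsd` helpers are hypothetical, the listener address and port 8125 (the conventional StatsD port) are assumed rather than taken from the slides, and the tag extensions mentioned in the talk are not covered.

```python
import socket


def format_counter(name, value, sample_rate=None):
    """Render a StatsD counter line: <name>:<value>|c[|@<sample rate>]."""
    line = f"{name}:{value}|c"
    if sample_rate is not None:
        line += f"|@{sample_rate}"
    return line


def send_statsd(line, host="127.0.0.1", port=8125):
    """Send one StatsD line over UDP (fire and forget, no reply expected)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


line = format_counter("http.requests", 42)
sampled = format_counter("http.requests", 1, sample_rate=0.1)
```

An application would call `send_statsd(line)` from its request path; because UDP is connectionless, the call does not block on the listener being available.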
  74. 3 Methods Recap: • Service checks • Events API • StatsD
  75. Live demo!

  76. Summary

  77. COMPLEXITY TIME

  78. # OF THINGS TIME Containers Servers VMs Functions

  79. None
  80. None
  81. None
  82. Thank You. Sean Porter (@portertech), Co-Founder & CTO, Sensu Inc. OSMC 2018