Monitoring system at Employment Hero

Luong Vo
November 25, 2018

An overview of the custom, in-house monitoring solution at employmenthero.com


Transcript

  1. Before we start: how do we answer questions like these?
    - Why is the system so slow?
    - Is everything working fine?
    - What is the main bottleneck of our system?
    - What happened at 10:00 AM this morning that made a lot of customers complain?
    - What is the average time a user has to wait before they get a notification?
    - etc.
  2. Observability
    - Programmatically and continuously capture the states of a running system.
    - Analyze the captured data and extract the knowledge the observer is interested in.
    - Detect abnormal behavior, notify the people responsible, and automatically take action to resolve the situation.
    - Archive the data in convenient forms that support future investigation or analysis.
  3. We need a solution that offers
    - Detailed statistics (both real-time and aggregated) about our microservices.
    - Alerting when usage peaks or incidents happen.
    - A method that is easy to implement in our microservices.
    - Support for a variety of metric types (counter, gauge, histogram, ...).
    - Two-way integration with Kubernetes.
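
The three metric types named on slide 3 can be illustrated with a minimal sketch. This is not the EhMonitoring gem or a Prometheus client library; the class names, method names, and bucket bounds below are all assumptions chosen for illustration.

```python
import bisect


class Counter:
    """A value that only goes up, e.g. total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """A value that can go up and down, e.g. jobs currently queued."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Sorts observations into buckets, e.g. request latency in seconds."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One extra slot at the end for values above every bound (+Inf).
        self.bucket_counts = [0] * (len(self.upper_bounds) + 1)
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        # Index of the first bound that is >= value.
        index = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.count += 1
        self.sum += value

    def cumulative(self):
        """Running totals per bucket, as "less than or equal" counts."""
        total, result = 0, []
        for bucket in self.bucket_counts:
            total += bucket
            result.append(total)
        return result
```

A counter suits questions such as "how many requests have we served?", a gauge suits "how many are in flight right now?", and a histogram suits the slide 1 question about how long users wait for a notification.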
  4. Prometheus and Grafana
    - Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud.
    - Grafana is an open-source dashboard tool for data visualization.
    - Together they are our chosen approach for collecting and displaying monitoring data.
  5. Push Model (diagram)
    - Node 1: Application, sends POST /metrics to the metrics collector.
    - Node 2: Application, sends POST /metrics to the metrics collector.
    - Node 3: Metrics collector.
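
The push model on slide 5 can be sketched as follows: the application initiates the request and sends its metrics to the collector with POST /metrics. Only the path and HTTP method come from the slide; the JSON payload shape, the handler, and the function names are assumptions, and the collector here is a stand-in running on a local ephemeral port.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # samples the stand-in collector has accepted


class CollectorHandler(BaseHTTPRequestHandler):
    """Stand-in metrics collector that accepts pushed samples."""

    def do_POST(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def push(collector_url, sample):
    """Application side: send one sample to the collector."""
    request = urllib.request.Request(
        collector_url,
        data=json.dumps(sample).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


def demo():
    collector = HTTPServer(("127.0.0.1", 0), CollectorHandler)
    threading.Thread(target=collector.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/metrics" % collector.server_port
    try:
        return push(url, {"node": "node-1", "requests_total": 42})
    finally:
        collector.shutdown()
        collector.server_close()
```

In this arrangement every application must know the collector's address, and the collector has no direct way to tell a silent application from a dead one.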
  6. Pull Model (diagram)
    - Node 1: Application, answers GET /metrics from the metrics collector.
    - Node 2: Application, answers GET /metrics from the metrics collector.
    - Node 3: Metrics collector.
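
The pull model on slide 6 reverses the direction: the application only exposes GET /metrics and the collector decides when to scrape it. The sketch below is a minimal illustration, assuming the response uses the Prometheus text exposition format; the metric name `app_requests_total`, its value, and the function names are hypothetical.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS_TOTAL = 42  # hypothetical value the application has counted


def render_metrics():
    """Render the current metrics in Prometheus text exposition format."""
    return (
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {REQUESTS_TOTAL}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Application side: answer scrapes on GET /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def scrape(url):
    """Collector side: fetch the current metrics from one target."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def demo():
    app = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=app.serve_forever, daemon=True).start()
    try:
        return scrape("http://127.0.0.1:%d/metrics" % app.server_port)
    finally:
        app.shutdown()
        app.server_close()
```

Because the collector initiates every request, a failed scrape is itself a signal that the target is unhealthy.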
  7. Pull Model and Sidecar Model (diagram)
    - Node 1: Application, /tmp/monitoring, Metric Server; the metric server answers GET /metrics.
    - Node 2: Application, /tmp/monitoring, Metric Server; the metric server answers GET /metrics.
    - Node 3: Metrics collector.
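
One possible reading of the sidecar diagram on slide 7 is that the application writes metric snapshots into a shared directory such as /tmp/monitoring, and a separate metric server process reads them and answers GET /metrics on the application's behalf. The sketch below covers only the file hand-off, under that assumption; the `.prom` suffix and both function names are hypothetical and not taken from the EhMonitoring gem.

```python
import os
import tempfile


def write_metrics(directory, filename, text):
    """Application side: publish a metrics snapshot atomically.

    Writing to a temporary file and renaming it means the sidecar
    never reads a half-written snapshot.
    """
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.replace(tmp_path, os.path.join(directory, filename))


def collect(directory):
    """Sidecar side: concatenate every published snapshot.

    A metric server would return this string as the body of its
    GET /metrics response.
    """
    parts = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".prom"):
            with open(os.path.join(directory, name)) as handle:
                parts.append(handle.read())
    return "".join(parts)
```

The appeal of this split is that the application needs no HTTP server of its own, which helps for workers and consumers that otherwise open no ports.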
  8. EhMonitoring gem
    - This gem helps you monitor your service with ease.
    - It abstracts away much of the infrastructure layer through a set of helpers.
    - Built-in support for gRPC, Kafka, and (soon) Sidekiq.
  9. What’s next?
    - Support other common libraries, such as Sidekiq.
    - Apply EhMonitoring to all services.
    - Drop Instana and build our own tracing system.
  10. References
    - https://github.com/Thinkei/feature-flag-api/pull/81 - Add metrics to feature flag API.
    - https://docs.google.com/document/d/1-wjTM600u5Q68ImhHHA2DTtlh8wX5mc9Xv5EEawFNFI/edit - Employment Hero microservices documents.
    - https://github.com/Thinkei/eh-monitoring - EH monitoring gem.
    - http://monitor.staging.ehrocks.com/ - Our monitoring page.