Monitoring system at Employment Hero

Luong Vo
November 25, 2018

Custom in-house built monitoring solution overview @ employmenthero.com


Transcript

  1. Before we start - How do we answer these questions?

    - Why is the system so slow?
    - Does everything work fine?
    - What’s the main bottleneck of our system?
    - What happened at 10:00 AM this morning that made a lot of customers complain?
    - What’s the average time a user has to wait before they get a notification?
    - etc.
  2. Observability

    - Programmatically and continuously capture the state of a running system.
    - Analyze the captured data and extract the knowledge the observer is interested in.
    - Detect abnormal behavior, notify the people responsible, and automatically take action to resolve the situation.
    - Archive the data in convenient forms that support future investigation and analysis.
  3. We need a solution that offers

    - Detailed statistics (both real-time and aggregated) about our microservices.
    - Alerting when usage peaks or incidents happen.
    - An easy way to instrument our microservices.
    - Support for a variety of ways to keep data (counter, gauge, histogram, ...).
    - Two-way integration with Kubernetes.
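
The three ways to keep data named on this slide can be sketched in a few lines of plain Python. This is only an illustration of the concepts, not the API of Prometheus or of any client library: the class names, the bucket bounds, and the per-bucket (non-cumulative) counting are all assumptions made for the example.

```python
import bisect


class Counter:
    """Value that only goes up, e.g. total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Value that can go up or down, e.g. jobs currently waiting in a queue."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Counts observations per bucket, e.g. notification latency in seconds."""

    def __init__(self, bounds=(0.1, 0.5, 1.0)):  # hypothetical upper bounds
        self.bounds = tuple(bounds)
        # One slot per upper bound plus a final overflow slot.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0.0

    def observe(self, value):
        # bisect_left picks the first bucket whose upper bound is >= value.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value

    def average(self):
        observed = sum(self.counts)
        return self.total / observed if observed else 0.0
```

A histogram of this kind is what would answer a question such as "what is the average time a user waits for a notification": observe each wait time, then read `average()` or the bucket counts.
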
  4. Prometheus and Grafana

    - Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud.
    - Grafana is an open-source dashboard tool for data visualization.
    - They are our selected approach to collect and display monitoring data.
  5. Push Model

    [Diagram: the Applications on Node 1 and Node 2 each send POST /metrics to the Metrics collector on Node 3.]
  6. Pull Model

    [Diagram: the Metrics collector on Node 3 sends GET /metrics to the Applications on Node 1 and Node 2.]
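
The pull model can be sketched with the Python standard library alone: the application exposes its current values on GET /metrics and the collector fetches them whenever it chooses. In the push model the direction would be reversed, with the application sending POST /metrics to the collector. The metric names, the `METRICS` dictionary, and the simple "name value" line format are assumptions for this example; a real Prometheus setup would use a client library and its full exposition format.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory metrics kept by the application.
METRICS = {"http_requests_total": 42, "queue_depth": 3}


def render_metrics(metrics):
    """Render one 'name value' line per metric."""
    return "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))


class MetricsHandler(BaseHTTPRequestHandler):
    """Application side of the pull model: answer GET /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def scrape(host, port):
    """Collector side: fetch /metrics and parse it into a dict of floats."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", "/metrics")
        text = connection.getresponse().read().decode("utf-8")
    finally:
        connection.close()
    return {name: float(value) for name, value in (line.split() for line in text.splitlines())}


def scrape_once():
    """Start the application on a free local port, scrape it once, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return scrape("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `scrape_once()` plays both roles in one process. With pull, the collector decides when and how often to read, and an application that stops answering is itself a signal that something is wrong.
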
  7. Pull Model and Sidecar Model

    [Diagram: Node 1 and Node 2 each run an Application next to a Metric Server, linked through /tmp/monitoring; the Metrics collector on Node 3 sends GET /metrics to each node.]
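
One possible reading of the sidecar diagram is that the application writes its metrics into a shared directory (the slide names /tmp/monitoring) and a separate metric server on the same node reads those files to answer GET /metrics. The sketch below assumes that reading. The JSON file layout, the function names, and the `process` label are hypothetical and are not taken from the EhMonitoring gem.

```python
import json
import os
import tempfile
from pathlib import Path


def write_metrics(directory, process_name, metrics):
    """Application side: dump current metric values into the shared directory."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file, then rename it, so the sidecar
    # never reads a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(metrics, handle)
    os.replace(tmp_path, directory / f"{process_name}.json")


def collect(directory):
    """Sidecar side: merge every process file into the text served on GET /metrics."""
    lines = []
    for path in sorted(Path(directory).glob("*.json")):
        for name, value in sorted(json.loads(path.read_text()).items()):
            lines.append(f'{name}{{process="{path.stem}"}} {value}')
    return "\n".join(lines) + "\n"
```

For example, after `write_metrics("/tmp/monitoring", "web", {"requests_total": 5})`, a call to `collect("/tmp/monitoring")` would return the line `requests_total{process="web"} 5`. Keeping the HTTP endpoint in a sidecar means the application only has to write files, which may help when one service runs as several worker processes.
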
  8. EhMonitoring gem

    - This gem helps you monitor your service with ease.
    - It abstracts away much of the infrastructure layer through a set of helpers.
    - Built-in native support for gRPC, Kafka, and Sidekiq (soon).
  9. What’s next?

    - Support other common libraries, like Sidekiq.
    - Apply EhMonitoring to all services.
    - Drop Instana and create our own tracing system.
  10. Reference

    - https://github.com/Thinkei/feature-flag-api/pull/81 - Add metrics to feature flag API.
    - https://docs.google.com/document/d/1-wjTM600u5Q68ImhHHA2DTtlh8wX5mc9Xv5EEawFNFI/edit - Employment Hero microservices documents.
    - https://github.com/Thinkei/eh-monitoring - EH monitoring gem.
    - http://monitor.staging.ehrocks.com/ - Our monitoring page.