Building scalable logging solutions with ELK stack


Anh Thi Nguyen

December 07, 2020


  1. Building scalable logging solution with Elastic Stack. Presenter: Nguyễn Đăng Anh Thi
  2. Log Logs Log Log Log Log Log Log Log Log

  3. Problems: Different, non-unified log formats

  4. Problems: Different timestamp formats for every service

  5. Scenario: The boss asks: "Could you find me the logs of all services from 1/9 to 10/9?" Tools at hand: sed, grep, awk
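
A minimal sketch of why this scenario is painful without a common timestamp format, and what normalization buys. The three timestamp formats, the sample entries, the year 2020, and the reading of 1/9 as 1 September are all assumptions made for illustration; none of them come from the slides.

```python
from datetime import datetime, timezone

# Assumed timestamp formats, one per hypothetical service.
FORMATS = [
    "%d/%b/%Y:%H:%M:%S %z",  # e.g. 05/Sep/2020:13:45:10 +0000
    "%Y-%m-%dT%H:%M:%S%z",   # e.g. 2020-09-12T08:00:00+0000
    "%Y-%m-%d %H:%M:%S",     # e.g. 2020-09-03 23:59:59 (no zone, assumed UTC)
]


def parse_timestamp(raw):
    """Try each known format and return a timezone-aware UTC datetime."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue
        if parsed.tzinfo is None:
            # Assumption: timestamps without a zone are already in UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    raise ValueError(f"unrecognized timestamp: {raw!r}")


# Hypothetical entries: (service name, raw timestamp).
ENTRIES = [
    ("nginx", "05/Sep/2020:13:45:10 +0000"),
    ("mysql", "2020-09-03 23:59:59"),
    ("app", "2020-09-12T08:00:00+0000"),
]

# 1/9 through 10/9 inclusive, written as a half-open interval.
START = datetime(2020, 9, 1, tzinfo=timezone.utc)
END = datetime(2020, 9, 11, tzinfo=timezone.utc)

in_range = [
    service for service, raw in ENTRIES
    if START <= parse_timestamp(raw) < END
]
print(in_range)
```

Once every entry carries the same normalized timestamp, the date-range question becomes a single comparison instead of one sed/grep/awk expression per service.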
  6. Problems: Different system paths to log files (/var/log/nginx, /var/log/mysql). It is difficult to monitor the logs of a clustered system: we have to SSH into each server to check its logs.
  7. Solution: We need a centralized logging system

  9. Elastic stack

  10. Full-text search engine. Built-in distributed features. Built upon Apache Lucene and written in Java. Uses an HTTP RESTful API to communicate with the database in JSON format.
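
As a rough illustration of the HTTP and JSON interface, the sketch below only builds the requests and never sends them. The base URL, the index name `logs`, and the document fields are assumptions; the `_doc` and `_search` endpoints and the `match` query are part of the Elasticsearch REST API.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def index_request(index, doc_id, document):
    """Build a request that stores one JSON document under the given ID."""
    return Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def search_request(index, field, text):
    """Build a full-text `match` query against one field."""
    query = {"query": {"match": {field: text}}}
    return Request(
        url=f"{BASE_URL}/{index}/_search",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical log document.
doc = {"service": "nginx", "message": "GET /index.html 200"}
put_req = index_request("logs", 1, doc)
find_req = search_request("logs", "message", "index.html")

print(put_req.get_method(), put_req.full_url)
print(find_req.get_method(), find_req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it, provided a cluster is actually reachable at the assumed address.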
  11. Use cases: Uber, Netflix

  12. Terminology. Document: a JSON document stored in Elasticsearch. It is like a row in a table in a relational database. Index: a collection of documents, where each document is a collection of fields, which are the key-value pairs that contain your data.
  13. Comparison to SQL relational databases: SQL Table ↔ Elasticsearch Index; SQL Row ↔ Elasticsearch Document

  14. Storage mechanism - Inverted Index: An inverted index is an index data structure storing a mapping from content, such as words or numbers, to its locations in a document or a set of documents.
      Documents: D1: "This is a dog"; D2: "This is a cat"; D3: "Dog eats cat"
      Inverted index (after tokenizing): "this" => {D1, D2}; "is" => {D1, D2}; "a" => {D1, D2}; "dog" => {D1, D3}; "cat" => {D2, D3}; "eats" => {D3}
      Supposing we need to find "this dog": this {D1, D2} ⋂ dog {D1, D3} = {D1}
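
The inverted index on this slide can be reproduced in a few lines. This is a toy sketch: the tokenizer (lowercase, split on whitespace) is an assumption, and it does not reflect how Lucene stores or analyzes text internally.

```python
from collections import defaultdict

# The three documents from the slide.
DOCUMENTS = {
    "D1": "This is a dog",
    "D2": "This is a cat",
    "D3": "Dog eats cat",
}


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the documents that contain every term of the query."""
    postings = [index.get(term, set()) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(DOCUMENTS)
print(search(index, "this dog"))
```

The search never scans document text: it intersects the posting sets for "this" and "dog", which is what keeps lookups cheap as the number of documents grows.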
  15. Distributed storage mechanism: an index (60GB) is split into Shard 1 (20GB), Shard 2 (20GB), and Shard 3 (20GB)
  16. Distributed storage mechanism: Shard 1, Shard 2, and Shard 3 of the index are distributed across Node 1, Node 2, and Node 3
  17. Distributed storage mechanism: across Node 1, Node 2, and Node 3, each shard of the index has one primary and two replicas (Primary shard 1, Replica shard 1, Replica shard 1; Primary shard 2, Replica shard 2, Replica shard 2; Primary shard 3, Replica shard 3, Replica shard 3)
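
The numbers on the three slides above can be worked through as follows. The settings keys `number_of_shards` and `number_of_replicas` are real Elasticsearch index settings; the rotation-based placement is only an illustrative assumption, not the allocation algorithm Elasticsearch uses.

```python
import json

INDEX_SIZE_GB = 60        # from the slide
NUMBER_OF_SHARDS = 3      # primary shards, from the slide
NUMBER_OF_REPLICAS = 2    # replica copies per primary, from the slide
NODES = ["Node 1", "Node 2", "Node 3"]

# Body that could be sent when creating an index with this layout.
settings_body = {
    "settings": {
        "index": {
            "number_of_shards": NUMBER_OF_SHARDS,
            "number_of_replicas": NUMBER_OF_REPLICAS,
        }
    }
}

per_shard_gb = INDEX_SIZE_GB // NUMBER_OF_SHARDS
total_copies = NUMBER_OF_SHARDS * (1 + NUMBER_OF_REPLICAS)

# Illustrative placement: rotate the copies of each shard over the nodes
# so that no node holds two copies of the same shard.
placement = {node: [] for node in NODES}
for shard in range(NUMBER_OF_SHARDS):
    for copy in range(1 + NUMBER_OF_REPLICAS):
        node = NODES[(shard + copy) % len(NODES)]
        role = "primary" if copy == 0 else "replica"
        placement[node].append((role, shard + 1))

print(json.dumps(settings_body, indent=2))
print(per_shard_gb, "GB per primary shard,", total_copies, "shard copies in total")
for node, copies in placement.items():
    print(node, copies)
```

With two replicas per primary on three nodes, every node ends up holding a full copy of the index, so the data survives the loss of any two nodes.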
  18. Cluster architecture

  19. Definitions

  20. ❑ Manage indices and shards ❑ Add and remove nodes to and from the

  21. ❑ Participates in master election ❑ Can promote itself to master node if the master node fails
  22. ❑Data storage ❑Return query results

  23. ❑Routing, load balancing

  25. Kibana is a web application used for analytics and visualization of data from Elasticsearch. It shows the performance, metrics, and logs of applications and system services.
  27. Sale dashboard

  28. Metric dashboard

  29. Machine learning

  30. App performance management (APM)

  31. Logstash is a log aggregator that collects data from various input sources, applies transformations and enrichments, and then ships the data to supported output destinations such as Elasticsearch, Kafka,…
  32. Processing data requires a pipeline of 3 stages: Input, Filter, Output. In every stage, we can use different plugins.
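
The three stages can be pictured as chained functions. This is a conceptual sketch in Python, not Logstash configuration syntax; the event fields `length` and `level` and the sample lines are made up for illustration.

```python
import json


def input_stage(raw_lines):
    """Input: wrap each raw line in an event dictionary."""
    for line in raw_lines:
        yield {"message": line.rstrip("\n")}


def filter_stage(events):
    """Filter: drop empty events and enrich the rest with derived fields."""
    for event in events:
        if not event["message"]:
            continue
        event["length"] = len(event["message"])
        event["level"] = "error" if "ERROR" in event["message"] else "info"
        yield event


def output_stage(events):
    """Output: serialize each event as one JSON line, ready to ship."""
    return [json.dumps(event, sort_keys=True) for event in events]


def run_pipeline(raw_lines):
    return output_stage(filter_stage(input_stage(raw_lines)))


shipped = run_pipeline(["ERROR disk full\n", "\n", "user logged in\n"])
for line in shipped:
    print(line)
```

Each stage only depends on the shape of the event passing through, which is why a plugin in one stage can be swapped without touching the other two.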
  33. Input Plugins: beats, http, redis, kafka, rabbitmq, amazon cloudwatch

  34. Filter: Allows parsing and transforming data between different formats, as well as data enrichment
  35. Filter plugins: Grok (use regex to parse data), GeoIP (GeoIP location), Date (parse timestamps), Mutate (add, remove fields)
  36. Example: Grok plugin client: method: GET request: /index.html
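
Grok patterns are built on regular expressions, so the idea behind this example can be imitated with named groups. The input line and the client address `192.0.2.10` are hypothetical (the slide does not show them), and the pattern below is plain Python regex rather than Grok syntax.

```python
import re

# Named groups play the role of Grok's field names.
LINE_PATTERN = re.compile(
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3}) "  # IPv4-looking client address
    r"(?P<method>[A-Z]+) "                   # HTTP method
    r"(?P<request>\S+)"                      # requested path
)


def parse_line(line):
    """Return the extracted fields, or None when the line does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


fields = parse_line("192.0.2.10 GET /index.html")
print(fields)
```

A line that does not match yields `None`, which leaves the decision about how to handle unparsed events to the caller.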

  37. Output - Stash

  38. Config sample

  40. Beats are a collection of lightweight (resource-efficient, dependency-free, small), open-source log shippers that act as agents installed on the servers in your infrastructure to collect logs or metrics.
  42. Filebeat: Filebeat is an agent that specializes in monitoring log files and sending log entries to Logstash or Elasticsearch using supported modules.
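
The core idea of a file-watching agent, remembering how far it has read and shipping only what was appended since, might look like the sketch below. It illustrates the concept only and is not how Filebeat is implemented; the file name and contents are made up.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return complete lines appended since `offset`, plus the new offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Ship only complete lines; a partial trailing line waits for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")  # hypothetical log file

    with open(log_path, "wb") as log_file:
        log_file.write(b"first\nsecond\npart")
    batch1, offset = read_new_lines(log_path, 0)

    # The application finishes the partial line later.
    with open(log_path, "ab") as log_file:
        log_file.write(b"ial\n")
    batch2, offset = read_new_lines(log_path, offset)

print(batch1)
print(batch2)
```

Persisting the offset between runs is what would let such an agent resume after a restart without resending or skipping entries.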
  43. Metricbeat

  44. Heartbeat

  45. Filebeat supports modules that help parse logs into JSON format

  46. Beats: Beats can send data directly to Elasticsearch. Usually, however, Beats are used together with Logstash to reduce the load on the Elasticsearch database.
  47. Elastic stack workflow

  48. ELK in production

  49. Use Redis or Kafka to buffer log messages. When there is a huge spike in traffic, using Redis or Kafka as a buffering layer reduces the stress on the system.
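
The effect of a buffering layer can be shown with a small simulation. An in-memory `deque` stands in for Redis or Kafka here, and the arrival pattern and the per-tick capacity are made-up numbers.

```python
from collections import deque


def simulate(arrivals_per_tick, capacity_per_tick):
    """Return (processed per tick, backlog per tick) for a buffered consumer."""
    buffer = deque()
    processed, backlog = [], []
    for tick, arrivals in enumerate(arrivals_per_tick):
        # Producers append to the buffer no matter how busy the consumer is.
        buffer.extend((tick, n) for n in range(arrivals))
        handled = 0
        # The consumer drains at most `capacity_per_tick` messages per tick.
        while buffer and handled < capacity_per_tick:
            buffer.popleft()
            handled += 1
        processed.append(handled)
        backlog.append(len(buffer))
    return processed, backlog


# A spike of 10 messages arrives at tick 1; the consumer handles 4 per tick.
processed, backlog = simulate([2, 10, 1, 0, 0], capacity_per_tick=4)
print("processed:", processed)
print("backlog:  ", backlog)
```

The consumer never sees more than its capacity in any tick, and no message is lost: the spike is absorbed by the backlog and worked off over the following ticks.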
  50. Compared to other search engine technologies

  51. Deployment Elastic cloud

  52. Hardware requirements for production cluster
      Filter | Number of events | CPU utilization | RAM
      Grok | 8k/s | 310% | 327MB
      JSON | 8k/s | 260% | 322MB
      Source: https://www.slideshare.net/sematext/tuning-elasticsearch-indexing-pipeline-for-logs
  53. Hardware requirements for Elasticsearch production cluster. Supposing: data throughput 15GB/day, data storage 10TB/year
  54. Source: https://www.youtube.com/watch?v=nJeCmcUvtmE
      Nodes: Master node (8GB), Master-eligible node (8GB), Master-eligible node (8GB), Data node (32GB), Data node (32GB), Client node (8GB)
      We need at least 5 machines (96GB RAM, 20TB storage). Each machine: 8-16 CPU cores.
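
The totals on this slide can be checked with simple arithmetic. The RAM figures are the ones listed per node; the replica count of 1 is an assumption that happens to reproduce the 20TB figure, since the slides do not say how that number was derived.

```python
# RAM per node in GB, as listed on the slide.
NODE_RAM_GB = {
    "master": 8,
    "master-eligible 1": 8,
    "master-eligible 2": 8,
    "data 1": 32,
    "data 2": 32,
    "client": 8,
}
total_ram_gb = sum(NODE_RAM_GB.values())

STORAGE_PER_YEAR_TB = 10   # from the previous slide
ASSUMED_REPLICAS = 1       # assumption: one replica copy of all data
total_storage_tb = STORAGE_PER_YEAR_TB * (1 + ASSUMED_REPLICAS)

print(total_ram_gb, "GB RAM in total")
print(total_storage_tb, "TB storage in total")
```

Under that assumption, a year of retained data with one replica accounts for the stated storage total, and the listed node sizes add up to the stated RAM total.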
  55. Q & A Thank you!