Building scalable logging solutions with ELK stack

Building scalable logging solutions with Elastic Stack
Presenter: Nguyễn Đăng Anh Thi


Problems: different, non-unified log formats

Problems: different timestamp formats for every service

Scenario: the boss asks, "Could you find me the logs of all services from 1/9 to 10/9?"
sed, grep, awk

Problems
- Different system paths to log files: /var/log/nginx, /var/log/mysql
- Difficult to monitor the logs of a clustered system
- We have to SSH into each server to check logs.

Solution: we need a centralized logging system

Elastic stack

Elasticsearch
- Full-text search engine
- Built-in distributed features
- Built on Apache Lucene and written in Java
- Uses an HTTP RESTful API to communicate with the database in JSON format

Use cases: Uber, Netflix

Terminologies
Document: a JSON document stored in Elasticsearch. It is like a row in a table in a relational database.
Index: a collection of documents. Each document is a collection of fields, which are the key-value pairs that contain your data.
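As a rough illustration of the two terms, the sketch below builds (without sending) an HTTP request that would store one JSON document in an index. The host, the index name "app-logs", and the document fields are assumptions made up for this example; only the PUT /<index>/_doc/<id> endpoint shape comes from the Elasticsearch REST API.

```python
import json
from urllib.request import Request

# Assumed values for illustration only: a local node and a made-up index name.
ES_URL = "http://localhost:9200"
INDEX = "app-logs"


def build_index_request(doc_id: str, document: dict) -> Request:
    """Build (but do not send) a request that stores one document in an index."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        url=f"{ES_URL}/{INDEX}/_doc/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# A document is a set of fields (key-value pairs); these fields are hypothetical.
log_document = {
    "timestamp": "2019-09-01T10:15:00Z",
    "service": "nginx",
    "level": "error",
    "message": "upstream timed out",
}

request = build_index_request("1", log_document)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually send it, which requires a reachable Elasticsearch node.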

Comparison to SQL relational databases
SQL      Elasticsearch
Table    Index
Row      Document

Storage mechanism - Inverted Index
An inverted index is an index data structure storing a mapping from content, such as words or numbers, to its locations in a document or a set of documents.
Documents:
D1: "This is a dog"
D2: "This is a cat"
D3: "Dog eats cat"
Inverted index (after tokenizing):
"this" => {D1, D2}
"is" => {D1, D2}
"a" => {D1, D2}
"dog" => {D1, D3}
"cat" => {D2, D3}
"eats" => {D3}
Supposing we need to find: this dog
this {D1, D2} ⋂ dog {D1, D3} = {D1}
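The example on this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming whitespace tokenization and lowercasing; the real analyzers in Elasticsearch and Lucene do considerably more, and the function names here are made up for illustration.

```python
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Split on whitespace and lowercase, so "Dog" and "dog" are the same term."""
    return text.lower().split()


def build_inverted_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map every term to the set of document IDs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the documents containing every query term (set intersection)."""
    postings = [index.get(term, set()) for term in tokenize(query)]
    if not postings:
        return set()
    return set.intersection(*postings)


documents = {
    "D1": "This is a dog",
    "D2": "This is a cat",
    "D3": "Dog eats cat",
}

index = build_inverted_index(documents)
print(search(index, "this dog"))  # {'D1'}
```

Looking up a term is a dictionary access rather than a scan over every document, which is the reason for building the index up front.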

Distributed storage mechanism
Index (60GB) split into Shard 1 (20GB), Shard 2 (20GB), Shard 3 (20GB)

Distributed storage mechanism
Index: Shard 1, Shard 2, Shard 3, distributed across Node 1, Node 2, Node 3

Distributed storage mechanism
Nodes: Node 1, Node 2, Node 3
Index: Shard 1, Shard 2, Shard 3
Primary shard 1: Replica shard 1, Replica shard 1
Primary shard 2: Replica shard 2, Replica shard 2
Primary shard 3: Replica shard 3, Replica shard 3
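The layout above (three primary shards, two replicas each, three nodes) can be sketched as follows. The settings keys number_of_shards and number_of_replicas are real Elasticsearch index settings; the allocate function is a simplified round-robin stand-in written for this example, not Elasticsearch's actual allocator. It only illustrates the rule that copies of the same shard are kept on different nodes.

```python
# Index settings matching the slide: 3 primary shards, 2 replicas per primary.
index_settings = {
    "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 2,
    }
}


def allocate(nodes: list[str], num_primaries: int, num_replicas: int) -> dict[str, list[str]]:
    """Round-robin placement (illustrative only): no node holds two copies of one shard."""
    copies = num_replicas + 1
    if copies > len(nodes):
        raise ValueError("not enough nodes to keep every copy on a separate node")
    layout: dict[str, list[str]] = {node: [] for node in nodes}
    for shard in range(1, num_primaries + 1):
        for copy in range(copies):
            # Shift the starting node per shard so primaries are spread out.
            node = nodes[(shard - 1 + copy) % len(nodes)]
            role = "Primary" if copy == 0 else "Replica"
            layout[node].append(f"{role} shard {shard}")
    return layout


settings = index_settings["settings"]
layout = allocate(
    ["Node 1", "Node 2", "Node 3"],
    settings["number_of_shards"],
    settings["number_of_replicas"],
)
for node, shards in layout.items():
    print(node, shards)
```

With this layout every node holds a copy of every shard, so the index stays fully readable even if two of the three nodes are lost.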

Cluster architecture

Definitions

Master node
❑Manages indices and shards
❑Adds nodes to and removes nodes from the cluster

Master-eligible node
❑Participates in master election
❑Can take over as master node if the master node fails

Data node
❑Stores data
❑Returns query results

Client node
❑Routing, load balancing

Kibana
- A web application used for analytics and visualization of data from Elasticsearch
- Shows performance, metrics, and logs of applications and system services

Sales dashboard

Metric dashboard

Machine learning

App performance management (APM)

Logstash is a log aggregator that collects data from various input sources, executes different transformations and enhancements, and then ships the data to various supported output destinations such as Elasticsearch and Kafka.

Data processing needs a pipeline of 3 stages: Input, Filter, Output
- In every stage, we can use different plugins
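The three stages can be pictured as chained generators. This is a conceptual Python sketch, not Logstash code: the function names, the sample lines, and the two toy filters are all assumptions made for illustration, standing in for real input, filter, and output plugins.

```python
from typing import Callable, Iterable, Iterator

Event = dict[str, str]
Filter = Callable[[Event], Event]


def input_stage(lines: Iterable[str]) -> Iterator[Event]:
    """Input: wrap each raw line in an event with a "message" field."""
    for line in lines:
        yield {"message": line.strip()}


def filter_stage(events: Iterable[Event], filters: list[Filter]) -> Iterator[Event]:
    """Filter: run every event through each filter in order."""
    for event in events:
        for apply_filter in filters:
            event = apply_filter(event)
        yield event


def output_stage(events: Iterable[Event], sink: list[Event]) -> None:
    """Output: ship events to a destination (here, just a list)."""
    sink.extend(events)


def add_service_field(event: Event) -> Event:
    """Toy enrichment filter: tag the event with a hypothetical service name."""
    return {**event, "service": "nginx"}


def lowercase_message(event: Event) -> Event:
    """Toy transformation filter: normalize the message to lowercase."""
    return {**event, "message": event["message"].lower()}


raw_lines = ["GET /index.html 200\n", "POST /login 500\n"]
stash: list[Event] = []
output_stage(
    filter_stage(input_stage(raw_lines), [add_service_field, lowercase_message]),
    stash,
)
print(stash)
```

Swapping a plugin corresponds to swapping one function in the chain while leaving the other stages untouched.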

Input Plugins: beats, http, redis, kafka, rabbitmq, amazon cloudwatch

Filter
- Allows parsing and transforming data from and to different formats
- Data enrichment

Filter plugins
- Grok: use regex to parse data
- GeoIP: GeoIP location
- Date: parse timestamps
- Mutate: add, remove fields

Example: Grok plugin
client: 55.3.244.1
method: GET
request: /index.html
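Grok patterns are named regular expressions, so the extraction above can be approximated with Python's re module and named groups. This is a sketch of the idea rather than the Grok plugin itself. The slide shows only the extracted fields, so the input line and the simplified sub-patterns are assumptions.

```python
import re
from typing import Optional

# Named groups play the role of Grok's %{PATTERN:field} captures.
# These sub-patterns are simplified stand-ins, not Grok's own definitions.
LINE_PATTERN = re.compile(
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3}) "  # IPv4 address
    r"(?P<method>[A-Z]+) "                   # HTTP method
    r"(?P<request>\S+)"                      # request path
)


def parse_line(line: str) -> Optional[dict[str, str]]:
    """Return the extracted fields, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


# Assumed raw log line that would yield the fields shown on the slide.
fields = parse_line("55.3.244.1 GET /index.html")
print(fields)
# {'client': '55.3.244.1', 'method': 'GET', 'request': '/index.html'}
```

Lines that do not match return None here, which plays a similar role to Grok marking an event as a parse failure.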

Output - Stash

Config sample

Beats are a collection of lightweight (resource-efficient, no dependencies, small), open source log shippers that act as agents installed on the different servers in your infrastructure to collect logs or metrics.

Filebeat
Filebeat is an agent that specializes in monitoring log files and sending log entries to Logstash or Elasticsearch using supported modules.

Metricbeat

Heartbeat

Filebeat supports modules that help parse logs into JSON format

Beats
- Beats can send data directly into Elasticsearch
- But usually, Beats are used with Logstash to help reduce stress on the Elasticsearch database

Elastic stack workflow

ELK in production

Use Redis or Kafka to buffer log messages
When there is a huge spike in traffic, using Redis or Kafka as a buffering layer can reduce the stress on the system
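The effect of a buffering layer can be shown with a toy simulation. Here an in-memory deque stands in for Redis or Kafka; the arrival counts and the drain rate are made-up numbers, and a real deployment would use the brokers' own clients instead.

```python
from collections import deque


def simulate(arrivals_per_tick: list[int], drain_per_tick: int) -> tuple[list[int], list[int]]:
    """Queue incoming messages and drain them at a fixed rate per tick.

    Returns (messages indexed per tick, backlog left in the buffer per tick).
    """
    buffer: deque[tuple[int, int]] = deque()
    indexed_per_tick: list[int] = []
    backlog_per_tick: list[int] = []
    for tick, arrivals in enumerate(arrivals_per_tick):
        # Producers (log shippers) write into the buffer as fast as they like.
        for n in range(arrivals):
            buffer.append((tick, n))
        # The consumer (the indexing side) never takes more than it can handle.
        indexed = 0
        while buffer and indexed < drain_per_tick:
            buffer.popleft()
            indexed += 1
        indexed_per_tick.append(indexed)
        backlog_per_tick.append(len(buffer))
    return indexed_per_tick, backlog_per_tick


# Hypothetical traffic: a spike of 10 messages in the second tick.
indexed, backlog = simulate([2, 10, 2, 0, 0], drain_per_tick=4)
print(indexed)  # [2, 4, 4, 4, 0]
print(backlog)  # [0, 6, 4, 0, 0]
```

The spike of 10 never reaches the indexing side directly: it sees at most 4 messages per tick, and the backlog drains once traffic returns to normal.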

Compared to other search engine technologies

Deployment: Elastic Cloud

Hardware requirements for production cluster
Filter   Number of events   CPU utilization   RAM
Grok     8k/s               310%              327MB
JSON     8k/s               260%              322MB
Source: https://www.slideshare.net/sematext/tuning-elasticsearch-indexing-pipeline-for-logs

Hardware requirements for Elasticsearch production cluster
Supposing:
- Data throughput: 15GB/day
- Data storage: 10TB/year

Source: https://www.youtube.com/watch?v=nJeCmcUvtmE
Master node (8GB), Master-eligible node (8GB), Master-eligible node (8GB), Data node (32GB), Data node (32GB), Client node (8GB)
- We need at least 5 machines (96GB RAM, 20TB storage)
- Each machine: 8 to 16 CPU cores
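A quick arithmetic check of the sizing figures, using only the numbers on the slides. The dictionary keys are labels made up for this sketch. The yearly figure computed here is raw ingest only and does not try to reproduce the 10TB/year storage assumption, which presumably also covers replicas and indexing overhead.

```python
# RAM per node as listed on the slide (GB); the key names are illustrative.
node_ram_gb = {
    "master": 8,
    "master-eligible 1": 8,
    "master-eligible 2": 8,
    "data 1": 32,
    "data 2": 32,
    "client": 8,
}

total_ram_gb = sum(node_ram_gb.values())
print(total_ram_gb)  # 96

# Raw ingest per year at the assumed throughput of 15GB/day,
# before replicas and index overhead (1TB taken as 1000GB).
daily_ingest_gb = 15
raw_ingest_tb_per_year = daily_ingest_gb * 365 / 1000
print(raw_ingest_tb_per_year)  # 5.475
```

The RAM total matches the 96GB quoted on the slide.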

Q & A Thank you!
