Slide 1

Slide 1 text

Getting Started with Vector
Cloud Native Meetup Tokyo #9
This document includes work that is distributed under the Apache License 2.0.

Slide 2

Slide 2 text

profile: name: Wataru Matsui org: [ Z Lab, 3bi.tech ] twitter: @watawuwu

Slide 3

Slide 3 text

Agenda ● What’s Vector? ● Usage ● VS ... ● Roadmap ● Conclusions

Slide 4

Slide 4 text

What’s Vector? https://vector.dev

Slide 5

Slide 5 text

Logs, Metrics & Events Router
Is it like Fluentd?

Slide 6

Slide 6 text

Developed by Timber.io https://timber.io

Slide 7

Slide 7 text

Features ● Logs, Metrics, or Events ● Agent or Service ● Fast ● Correct ● Clear Guarantees ● Vendor Neutral ● Easy to Deploy ● Hot Reload

Slide 8

Slide 8 text

Similar tools ● Fluentd ● Fluent Bit ● Filebeat ● Logstash

Slide 9

Slide 9 text

 Summary ©timber.io

Slide 10

Slide 10 text

©timber.io

Slide 11

Slide 11 text

Topologies: Distributed ©timber.io

Slide 12

Slide 12 text

Topologies: Centralized ©timber.io

Slide 13

Slide 13 text

Topologies: Stream-Based ©timber.io

Slide 14

Slide 14 text

How to use Vector

Slide 15

Slide 15 text

Source types ● file ● statsd ● syslog ● tcp ● vector ● stdin(debug)

Slide 16

Slide 16 text

Source config

[sources.my_file_source_id]
  # REQUIRED - General
  type = "file" # must be: "file"
  include = ["/var/log/nginx/*.log"]
  exclude = [""]

Slide 17

Slide 17 text

Source config

[sources.my_tcp_source_id]
  # REQUIRED - General
  type = "tcp" # must be: "tcp"
  address = "0.0.0.0:9000"

Slide 18

Slide 18 text

Sink types ● aws ○ cloudwatch_logs ○ kinesis_streams ○ s3 ● elasticsearch ● http ● kafka ● prometheus ● splunk_hec ● tcp ● vector ● console ● blackhole(/dev/null)

Slide 19

Slide 19 text

Sink config

[sinks.my_tcp_sink_id]
  # REQUIRED - General
  type = "tcp" # must be: "tcp"
  inputs = ["my_tcp_source_id"]
  address = "92.12.333.224:5000"

  # OPTIONAL - Requests
  encoding = "json" # default, enum: "json", "text"

Slide 20

Slide 20 text

Sink config

[sinks.my_s3_sink_id]
  # REQUIRED - General
  type = "s3" # must be: "s3"
  inputs = ["my_file_source_id"]
  bucket = "my-bucket"
  region = "ap-northeast-1"
  encoding = "ndjson" # enum: "ndjson", "text"

  # OPTIONAL - Requests
  key_prefix = "date=%F/" # default
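The `key_prefix` value uses a strftime-style date specifier. As a rough illustration only, assuming `%F` expands to the ISO date `%Y-%m-%d` as it does in C `strftime`, the prefix would resolve as sketched below. The helper name `expand_key_prefix` is hypothetical and is not part of Vector.

```python
from datetime import datetime, timezone


def expand_key_prefix(prefix: str, when: datetime) -> str:
    """Expand a strftime-style key prefix (hypothetical helper, not Vector code)."""
    # %F is not supported by strftime on every platform, so spell it out.
    return when.strftime(prefix.replace("%F", "%Y-%m-%d"))


# Made-up timestamp for the example.
example = expand_key_prefix("date=%F/", datetime(2019, 7, 1, tzinfo=timezone.utc))
print(example)  # date=2019-07-01/
```

With a prefix like this, objects written on the same day would share one `date=...` folder in the bucket.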

Slide 21

Slide 21 text

Sink config

[sinks.my_prometheus_sink_id]
  # REQUIRED - General
  type = "prometheus" # must be: "prometheus"
  inputs = ["my_log2metrics_source_id"]
  address = "0.0.0.0:9598"

Slide 22

Slide 22 text

Transform types ● Field ○ add_fields ○ remove_fields ○ field_filter ● Parser ○ grok_parser ○ json_parser ○ regex_parser ○ tokenizer ● log_to_metric ● sampler ● lua

Slide 23

Slide 23 text

Transform config

[transforms.my_regex_trans_id]
  # REQUIRED - General
  type = "regex_parser" # must be: "regex_parser"
  inputs = ["my_file_source_id"]
  regex = "^(?P<host>[\\w\\.]+) - (?P<user>[\\w]+) (?P<bytes_in>[\\d]+) \\[(?P<timestamp>.*)\\] \"(?P<method>[\\w]+) (?P<path>.*)\" (?P<status>[\\d]+) (?P<bytes_out>[\\d]+)$"

  # OPTIONAL - Types
  [transforms.my_regex_trans_id.types]
    status = "int"
    method = "string"
    bytes_in = "int"
    bytes_out = "int"
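To make the parse-then-coerce idea concrete, here is a minimal Python sketch of what a regex parser with a types table does. It is not Vector's implementation. The capture-group names (`host`, `user`, `timestamp`, `path`, and so on) and the sample log line are assumptions chosen for illustration; only `status`, `method`, `bytes_in`, and `bytes_out` come from the types table on the slide.

```python
import re

# Named groups follow the shape of the slide's pattern; names are illustrative.
LINE_RE = re.compile(
    r'^(?P<host>[\w\.]+) - (?P<user>[\w]+) (?P<bytes_in>[\d]+) '
    r'\[(?P<timestamp>.*)\] "(?P<method>[\w]+) (?P<path>.*)" '
    r'(?P<status>[\d]+) (?P<bytes_out>[\d]+)$'
)

# Same coercions as the [transforms.my_regex_trans_id.types] table.
TYPES = {"status": int, "method": str, "bytes_in": int, "bytes_out": int}


def parse_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    for name, cast in TYPES.items():
        fields[name] = cast(fields[name])
    return fields


# Hypothetical access-log line, made up for this example.
sample = '5.86.210.12 - alice 5667 [19/06/2019:17:20:49 -0400] "GET /index.html" 201 20574'
event = parse_line(sample)
print(event["status"], event["bytes_out"])
```

Fields without an entry in the types table (for example `path`) stay as strings.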

Slide 24

Slide 24 text

Transform config

[transforms.my_prometheus_trans_id]
  # REQUIRED - General
  type = "log_to_metric" # must be: "log_to_metric"
  inputs = ["my_file_source_id"]

  # REQUIRED - Metrics
  [[transforms.my_prometheus_trans_id.metrics]]
    type = "counter" # enum: "counter", "gauge"
    field = "duration"
    increment_by_value = false
    name = "duration_total"
    labels = {host = "${HOSTNAME}", region = "us-east-1"}
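The counter settings can be read as follows; this is a sketch of the intended semantics, not Vector's code. It assumes that events lacking the configured field are skipped, and that `increment_by_value = false` adds 1 per matching event while `true` adds the field's value. The function name `counter_total` and the sample events are hypothetical.

```python
def counter_total(events, field, increment_by_value=False):
    """Fold log events into one counter value (illustrative only)."""
    total = 0
    for event in events:
        if field not in event:
            continue  # assumption: events without the field leave the counter unchanged
        total += float(event[field]) if increment_by_value else 1
    return total


# Made-up events: two carry a duration, one does not.
events = [{"duration": 1.5}, {"duration": 2.5}, {"message": "no duration here"}]

print(counter_total(events, "duration"))                           # 2
print(counter_total(events, "duration", increment_by_value=True))  # 4.0
```

So with the slide's settings, `duration_total` would count how many events have a `duration` field rather than sum the durations.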

Slide 25

Slide 25 text

Vector config

[sources.logs]
  type = 'file'
  include = ['/var/log/*.log']

[transforms.tokenizer]
  inputs = ['logs']
  type = 'tokenizer'
  field_names = ["timestamp", "level", "message"]

[transforms.sampler]
  inputs = ['tokenizer']
  type = 'sampler'
  hash_field = 'request_id'
  rate = 10

[sinks.search]
  inputs = ['sampler']
  type = 'elasticsearch'
  host = '123.123.123.123:5000'

[sinks.backup]
  inputs = ['tokenizer']
  type = 's3'
  region = 'ap-northeast-1'
  bucket = 'log-backup'
  key_prefix = 'date=%F'
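Each component in this config names its upstream components through `inputs`, which forms a small graph: logs → tokenizer → (sampler → search, backup). A misspelled name in `inputs` is an easy mistake, so the sketch below shows one way to check the wiring. It is a hypothetical helper, not a Vector feature; the wiring is transcribed by hand from the slide rather than parsed from TOML, and it does not distinguish sinks from transforms.

```python
# Component wiring transcribed from the slide: name -> list of inputs.
SOURCES = {"logs"}
WIRING = {
    "tokenizer": ["logs"],     # transform
    "sampler": ["tokenizer"],  # transform
    "search": ["sampler"],     # sink
    "backup": ["tokenizer"],   # sink
}


def dangling_inputs(sources, wiring):
    """Return (component, input) pairs whose input name is not defined anywhere."""
    known = set(sources) | set(wiring)
    return [
        (name, ref)
        for name, refs in wiring.items()
        for ref in refs
        if ref not in known
    ]


print(dangling_inputs(SOURCES, WIRING))  # []
```

A typo such as `inputs = ['smapler']` on the `search` sink would be reported as `('search', 'smapler')`.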

Slide 26

Slide 26 text

VS

Slide 27

Slide 27 text


Performance report by Timber.io

Test            Vector      FluentBit   FluentD
File to TCP     76.7MiB/s   35MiB/s     26.1MiB/s
Regex Parsing   13.2MiB/s   20.5MiB/s   2.6MiB/s
TCP to HTTP     26.7MiB/s   19.6MiB/s   <1MiB/s

Slide 28

Slide 28 text


Performance report by Timber.io

               Vector     FluentBit   FluentD
Memory         188.1MiB   370MiB      890MiB
CPU (1m avg)   1.51       0.56        0.57

Slide 29

Slide 29 text

Don't trust the reports. Measure, Measure, Measure!

Slide 30

Slide 30 text

Measure using GKE ● Kubernetes: v1.13.7 ● Node x4 ○ 4 CPU ○ 3.6 GB Memory ○ 100 GB Storage(Standard) ● Manifests ○ https://github.com/watawuwu/vector-test

Slide 31

Slide 31 text

Memory Usage
Vector's memory usage is low. Why does Fluent Bit use so much memory?

Vector       26 MiB/s
Fluent Bit   1.091 GiB/s
Fluentd      92 MiB/s


Slide 32

Slide 32 text

CPU Usage
Vector's CPU usage is high.

Vector       1.84 core
Fluent Bit   0.26 core
Fluentd      1.25 core


Slide 33

Slide 33 text

IO Throughput
Throughput is low. Is there an error in the test method?

Vector       9.39 MiB/s
Fluent Bit   8.26 MiB/s
Fluentd      13.64 MiB/s


Slide 34

Slide 34 text

Roadmap

Slide 35

Slide 35 text

Roadmap ● v0.4 Schemas (current) ● v0.5 Stream Consumers ● v0.6 Columnar Writing ● v0.7 CLI ● v0.8 Wire Level Tailing ● v1.0 Stable => 2019/12 Release!!

Slide 36

Slide 36 text

Conclusions

Slide 37

Slide 37 text

watawuwu’s TECH RADAR
ADOPT ● TRIAL ● ASSESS ● HOLD

Slide 38

Slide 38 text

Thanks! Kubernetes, Cloud Native zlab.co.jp