LINE Developers
November 04, 2021
Technology
Building a Large-Scale Log Monitoring Platform with Grafana Loki (Grafana Lokiで構築する大規模ログモニタリング基盤) / Grafana Loki Deep Dive
Slides from a talk given at CloudNative Days Tokyo 2021.
https://event.cloudnativedays.jp/cndt2021/talks/1252
Transcript
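1 Building a Large-Scale Log Monitoring Platform with Grafana Loki. CNDT2021. LINE Corporation. Hiroki Sakamoto / @taisho6339
2 Self-introduction - Title: Senior Software Engineer@LINE Corp - Role: SRE in the Private Cloud development organization - Mission: improving reliability across the Private Cloud - Interests: Kubernetes, distributed systems, weight training, OSS activities - Twitter: @taisho6339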
3 What is Loki? "like Prometheus but for logs" • Log storage and search • Log-based alerting • Log-based metrics generation • Multi-tenancy supported by default
4 What is Loki?
5 What is Loki?
6 Large volumes of logs can be stored cheaply
7 Private Cloud "Verda" in LINE is based on OpenStack. Since 2016. FaaS, PaaS, IaaS, NAT, LB, Bare metal
8 Private Cloud "Verda" in LINE: Virtual Machines 74000+, Physical Machines 30000+, Hypervisors 4000+
9 20 TB / day application logs
10 Loki is suitable for us!
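11 Why Loki is hard: Loki is a set of microservices • The mechanism and role of each component and each cache are unclear • Where, in what format, and for how long log data is held is unclear • How it behaves during a storage failure is unclear • What has to be considered to run it in production is unclear
12 Goals of this session: aiming to be the most detailed explanation available in Japan • Understand the roles and mechanisms of Loki's components systematically • Be able to identify the cause quickly when trouble occurs • Be able to do capacity management and performance tuning on your own
13 Configuration assumed in this session: Loki Version: v2.3.0, Cache: Memcached, Chunk Storage: AWS S3, Index Storage: AWS S3 + BoltDB Shipper
14 Roadmap of this session: 1) Understand the log write process 2) Understand the log read process 3) Understand the behavior during failures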
15 1) The log write process
16 Overview
17 Cast of the write process: [Diagram: Clients (Promtail, Fluentd), Loki (Distributor, Ingesters), Storage (Amazon S3, Chunk Cache)]
18 Write process overview: Clients (Promtail, Fluentd) • Log-shipping clients such as Fluentd and Promtail • They read logs and send them to Loki's endpoint
19 Write process overview: Distributor • Validates requests • Routes them to the appropriate Ingesters
20 Write process overview: Ingesters • Actually store the logs in storage • Buffer for a certain period before saving to storage
21 Write process overview: Storage • Logs are persisted in S3 • A cache of the logs is kept in Memcached (Chunk Cache)
22 Sending from Client to Distributor
23 Client -> Distributor: the tenant ID is written in the HTTP request header. X-Scope-OrgID: <Tenant ID>
24 Client -> Distributor: structure of the log data sent to Loki
{service="keystone", hostname="host1"} 00:00:02 keystone log body
{service="keystone", hostname="host1"} 00:00:03 keystone log body
{service="keystone", hostname="host1"} 00:00:04 keystone log body
25 Client -> Distributor: {service="keystone", hostname="host1"} (Stream) 00:00:02 (TS) keystone log body (Log Body)
26 Client -> Distributor: a log carries several labels. Each distinct combination of tenant ID and labels is called a "Stream".
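The request shape described on slides 23 to 26 can be sketched as follows. This is a minimal illustration, assuming Loki's JSON push format (a `streams` list whose items carry a `stream` label map and `values` pairs of nanosecond timestamp string and log line); the helper name `build_push_request` and the tenant name are hypothetical, and nothing is sent over the network.

```python
import json


def build_push_request(tenant_id, labels, entries):
    """Build headers and a JSON body for a Loki push request.

    entries is a list of (unix_seconds, log_line) tuples.
    """
    headers = {
        "Content-Type": "application/json",
        "X-Scope-OrgID": tenant_id,  # the tenant ID travels in the request header
    }
    body = {
        "streams": [
            {
                "stream": labels,  # tenant ID + this label set identify one stream
                # Timestamps are nanoseconds since the epoch, encoded as strings.
                "values": [
                    [str(int(ts * 1_000_000_000)), line] for ts, line in entries
                ],
            }
        ]
    }
    return headers, json.dumps(body)


headers, body = build_push_request(
    "tenant1",  # hypothetical tenant name
    {"service": "keystone", "hostname": "host1"},
    [(2, "keystone log body"), (3, "keystone log body")],
)
```

All entries that share the same tenant ID and label set land in the same stream, which is why the label map sits once per stream rather than once per line.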
27 Validation at the Distributor • Is the label format valid? • Is the rate limit exceeded?
28 Validation at the Distributor: Distributors form a cluster with each other • They monitor each other's status • If ingestion_rate_strategy is global, the ingestion rate is controlled across the whole cluster
29 Validation at the Distributor: the clustering exists so that validation can be applied across all Distributors
30 Sending from Distributor to Ingester
31 Distributor -> Ingesters: for the sake of search (explained later), logs are sent redundantly to multiple Ingesters
32 Distributor -> Ingesters: Ingesters form a cluster with each other • They monitor each other's status • They form a consistent-hash ring
33 Distributor -> Ingesters: a hash is generated with 32-bit FNV-1 from tenantID + {service="keystone", hostname="host1"}, for example a6965cd7
34 Distributor -> Ingesters: based on consistent hashing, the Distributor requests as many Ingesters as the replication factor for the computed hash (a6965cd7)
35 Distributor -> Ingesters: the same log is sent simultaneously to the Ingesters that were returned
36 Distributor -> Ingesters: [Diagram: replies from three Ingesters: OK, OK, Fail]
37 Distributor -> Ingesters: the write succeeds if a majority reply OK (OK, OK, Fail)
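Slides 32 to 37 can be sketched as a small consistent-hash ring. This is an illustration under assumptions, not Loki's ring implementation: the token assignment scheme, the `Ring` class, and the way the tenant ID and label string are concatenated are all hypothetical. Only the use of 32-bit FNV-1, the replication factor, and the majority rule come from the slides.

```python
import bisect

FNV_OFFSET_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1_32(data: bytes) -> int:
    """32-bit FNV-1: multiply by the prime first, then XOR the byte."""
    h = FNV_OFFSET_32
    for byte in data:
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
        h ^= byte
    return h


class Ring:
    def __init__(self, ingesters, tokens_per_ingester=8):
        # Hypothetical token scheme: each ingester owns several ring positions.
        self.tokens = sorted(
            (fnv1_32(f"{name}-{i}".encode()), name)
            for name in ingesters
            for i in range(tokens_per_ingester)
        )

    def replicas(self, tenant_id, labels, replication_factor):
        """Walk clockwise from the stream's hash, collecting distinct ingesters."""
        key = fnv1_32((tenant_id + labels).encode())
        start = bisect.bisect_left(self.tokens, (key, ""))
        picked = []
        for offset in range(len(self.tokens)):
            _, name = self.tokens[(start + offset) % len(self.tokens)]
            if name not in picked:
                picked.append(name)
            if len(picked) == replication_factor:
                break
        return picked


def write_succeeded(results):
    """A write succeeds when more than half of the replicas acknowledged it."""
    return sum(results) > len(results) // 2
```

With three replicas, `write_succeeded([True, True, False])` is true, which matches the OK, OK, Fail case on slide 37.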
38 Request handling in the Ingester
39 Request handling in the Ingester: [Diagram: the Ingester, with Memory (Tenant1, Tenant2) and Disk, receives {service="keystone", hostname="host1"}]
40 Request handling in the Ingester: entries are grouped by stream and appended in a format called a Chunk (Tenant1, Stream1, chunks)
41 Request handling in the Ingester: the entry is recorded in the WAL (WAL Segment on Disk)
42 Request handling in the Ingester: the Ingester replies OK
43 Request handling in the Ingester: if a log arrives whose timestamp is earlier than the last log received for that stream, the request fails with an "Out of order entry" error
44 Chunk buffering
45 Chunk buffering in the Ingester • A certain amount of logs is grouped into a "Chunk" per stream • Chunks are kept in memory • A MemoryChunk holds two areas, Head and Blocks (a list of compressed blocks)
46 Chunk buffering in the Ingester: [Diagram: MemoryChunk for {service="keystone", hostname="host1"} with Head and compressed Blocks]
47 Chunk buffering in the Ingester: received logs are first appended to the Head
48 Chunk buffering in the Ingester: appends repeat until one block size has accumulated
49 Chunk buffering in the Ingester: appends repeat until one block size has accumulated
50 Chunk buffering in the Ingester: once a certain size has accumulated in the Head, it is compressed in the configured format and added to Blocks
51 Chunk buffering in the Ingester: once a certain number of Blocks has accumulated, the chunk becomes read-only and goes to the flush queue (default 10 blocks, set via the target chunk size)
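The Head and Blocks behavior on slides 45 to 51 can be sketched as below. This is a simplified model, not Loki's MemChunk: the class name, the byte counting, the line serialization, and the tiny default sizes are assumptions chosen to keep the example short. gzip stands in for "the configured compression format".

```python
import gzip


class MemChunk:
    def __init__(self, block_size=64, max_blocks=10):
        self.block_size = block_size  # raw bytes per block (hypothetical value)
        self.max_blocks = max_blocks  # blocks before the chunk is sealed
        self.head = []                # uncompressed (timestamp, line) entries
        self.head_bytes = 0
        self.blocks = []              # compressed blocks
        self.read_only = False

    def append(self, ts, line):
        if self.read_only:
            raise RuntimeError("chunk is read-only and waiting to be flushed")
        # New entries always go to the uncompressed head first.
        self.head.append((ts, line))
        self.head_bytes += len(line.encode())
        if self.head_bytes >= self.block_size:
            self._cut_block()

    def _cut_block(self):
        # Compress the head into one block and start a fresh head.
        raw = "\n".join(f"{ts} {line}" for ts, line in self.head).encode()
        self.blocks.append(gzip.compress(raw))
        self.head, self.head_bytes = [], 0
        if len(self.blocks) >= self.max_blocks:
            self.read_only = True  # sealed: next stop is the flush queue
```

The point of the two areas is that only the head holds raw data, so memory use per chunk is roughly one block of raw logs plus the compressed blocks.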
52 Flush from Ingester to Storage
53 Flushing chunks: [Diagram: Ingester memory (Tenant1 and Tenant2, each with Stream1 and Stream2, and their Chunks), goroutines, Disk (inverted index), Amazon S3, Chunk Cache]
54 Flushing chunks: a goroutine detects chunks that meet one of these conditions • The specified size has been reached • chunk idle period has elapsed since the last update • max_chunk_age has elapsed
55 Flushing chunks: matching chunks are enqueued to the flush queue
56 Flushing chunks: enqueued chunks are flushed 1. Save the chunk to S3 2. Save it to the Chunk Cache (write-through) 3. Save the inverted index to the local BoltDB
57 Flushing chunks: because requests to Ingesters are replicated, writes to storage are suppressed for chunks that are already in the cache
58 Cache load balancing: depending on configuration, chunks can be distributed across Chunk Cache instances by consistent hashing on the chunk key
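The three flush conditions from slide 54 can be written as a single check. This is a sketch under assumptions: the `ChunkState` fields and the function signature are hypothetical, and the parameter names only mirror the settings mentioned on the slide (target size, chunk idle period, max_chunk_age).

```python
from dataclasses import dataclass


@dataclass
class ChunkState:
    size_bytes: int
    created_at: float    # time of the first entry, in seconds
    last_updated: float  # time of the latest entry, in seconds


def flush_reason(chunk, now, target_size, chunk_idle_period, max_chunk_age):
    """Return why a chunk should be enqueued for flushing, or None."""
    if chunk.size_bytes >= target_size:
        return "full"
    if now - chunk.last_updated >= chunk_idle_period:
        return "idle"
    if now - chunk.created_at >= max_chunk_age:
        return "max_age"
    return None
```

A periodic goroutine-like loop would call this for every chunk in memory and enqueue those with a non-None reason. The idle and age checks matter for low-volume streams, which might otherwise never reach the target size.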
59 Write Ahead Log
60 Flushing chunks (recap): [Diagram: Ingester memory, goroutines, Disk (inverted index), Amazon S3, Index Write Cache, Chunk Cache]
61 How the Ingester holds chunks: if the process stops before a flush, the chunks in memory are lost
62 Purpose of the WAL: persist data to guard against the volatility of memory
63 Characteristics of Loki's WAL • On receiving a log, it is written to both memory and the WAL • A failed WAL write does not fail the request • On Ingester startup, a recovery step from the WAL runs • Unneeded WAL data is purged periodically
64 How the WAL works: a log entry is sent to the Ingester
65 How the WAL works: the entry is appended to a Segment file (Segment1) as raw data
66 How the WAL works: when a file grows large, a new Segment file is created (Segment2)
67 How the WAL works: periodically, a purge of unneeded Segment files runs
68 How the WAL works: the segment is advanced by one (Segment3 is created)
69 How the WAL works: the Ingester's unflushed chunks are saved as a snapshot called a Checkpoint (Checkpoint1)
70 How the WAL works: because the MemoryChunk structure is saved as is, blocks are stored compressed and logs still in the head are stored uncompressed
71 How the WAL works: everything except the latest Checkpoint and Segment is deleted
72 How the WAL works: what remains contains every log that has not been flushed at that point
73 How the WAL works: on Ingester startup, the Segment and Checkpoint are read from disk
74 How the WAL works: the streams are restored into memory
75 How the WAL works: the process is not started until restoration is complete
76 Data on the Ingester and its encoding • Memory: per chunk, raw data up to one block size plus compressed blocks • Disk, Segments: raw • Disk, Checkpoint: same as the memory chunk
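The segment and checkpoint cycle on slides 64 to 75 can be modeled in memory as below. This is a toy sketch, not Loki's WAL: segments are Python lists rather than files, the entry-count limit replaces a file-size limit, and the `WAL` class and its method names are hypothetical. It also assumes no writes arrive while a checkpoint is being taken.

```python
class WAL:
    def __init__(self, segment_max_entries=3):
        self.segment_max_entries = segment_max_entries
        self.segments = [[]]    # each segment is a list of raw records
        self.checkpoint = None  # snapshot of unflushed in-memory data

    def log(self, stream, ts, line):
        # Roll over to a new segment once the current one is large enough.
        if len(self.segments[-1]) >= self.segment_max_entries:
            self.segments.append([])
        self.segments[-1].append((stream, ts, line))

    def take_checkpoint(self, unflushed):
        # Snapshot the unflushed data, then keep only a fresh segment:
        # everything older is covered by the checkpoint and can be purged.
        self.checkpoint = {s: list(entries) for s, entries in unflushed.items()}
        self.segments = [[]]

    def recover(self):
        # Startup: load the checkpoint, then replay the remaining segments.
        memory = {s: list(e) for s, e in (self.checkpoint or {}).items()}
        for segment in self.segments:
            for stream, ts, line in segment:
                memory.setdefault(stream, []).append((ts, line))
        return memory
```

The checkpoint plus the segments written after it together describe all unflushed data, which is why recovery needs both and why older segments are safe to delete.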
77 2) The log read process
78 Overview
79 Cast of the read process: [Diagram: Loki (Query Frontend, Queriers, Ingesters), Query Result Cache, Storage (Amazon S3, Chunk Cache, Index Read Cache)]
80 Cast of the read process: the Query Frontend receives the request
81 Cast of the read process: the Query Frontend splits the received query by time range and other criteria and queues the pieces
82 Cast of the read process: multiple Queriers take queries from the queue and handle them
83 Cast of the read process: the Querier asks every Ingester for the data in its MemoryChunks that matches the query
84 Cast of the read process: the Querier fetches the inverted index entries for the query, going through the cache transparently
85 Cast of the read process: if they are not in the cache, the matching index entries are read from the local BoltDB and the result is saved to the cache (snappy)
86 Cast of the read process: the target chunks are derived from the inverted index
87 Cast of the read process: the chunks are fetched; those not in the cache are fetched from storage and saved to the cache
88 Cast of the read process: the Query Frontend receives the results from all Queriers and aggregates, sorts, and deduplicates them
89 Cast of the read process: the result is saved to the Query Result Cache
90 Cast of the read process: the response is returned
91 Selecting target chunks from the inverted index
92 Purpose of the inverted index: fetch chunks from S3 with the least effort
93 Narrowing down target chunks with the inverted index 1. Get the stream ID (Series ID) from the combination of label keys and values 2. Get the chunk keys from the Series ID and the time range 3. Derive the path on S3 and Memcached from the chunk key and download the chunk
94 SeriesID: for the log {service="keystone", hostname="host1"} 00:00:02 keystone log body, the SeriesID is the sha256 of the key/value combination: 9ac2adda49e899b312a9abb895656b1ab26c9858fd500f2ae3983d5309b39363/
95 Chunk Key: for the same log, the fingerprint a6965cd7 is a hash of the key/value combination, and the chunk key is Tenant1/a6965cd7:<chunk start time>:<chunk end time>
96 Structure of the inverted index: an index for looking up SeriesIDs from the hash of a label key/value
Hash Value | Range Value | Value
TenantID + LabelName | Hash(Label Value) + SeriesID | Label Value
97 Structure of the inverted index: Hash + Range identify a unique row
98 Structure of the inverted index: the Range Value can be used for range scans and sorting
99 Structure of the inverted index: table image
Hash Value (TenantID + Label Name) | Range Value (Hash(Label Value) + SeriesID) | Value (Label Value)
Tenant1:service | abc680ab:c79abadeff | keystone
Tenant1:host | cfe960ab:bcfe12ea | hostname1
Tenant1:type | can860ab:c79abadeff | api
Tenant1:service | cdc680ab:c79abadeff | nova
Tenant1:host | bee960ab:bcfe12ea | hostname21
Tenant1:type | abd860ab:c79abadeff | scheduler
100 Structure of the inverted index: this index gets one entry per label key/value pattern, so a high-cardinality label produces a large number of entries
101 Structure of the inverted index: an index for looking up chunk keys from the SeriesID
Hash Value | Range Value | Value
TenantID + SeriesID | Chunk start time + Chunk Key | nil
102 Deriving chunk keys from the index: the LogQL query {service="keystone", hostname="host1"} |= "level=ERROR" consists of a label matcher part and a filter part
103 Deriving chunk keys from the index: only the label matcher part is evaluated using the index
104 Deriving chunk keys from the index: {service="keystone", hostname="host1"} is split into {service="keystone"} and {hostname="host1"}
105 Deriving chunk keys from the index: for each matcher, inverted index entries are fetched using its matching condition
106 Deriving chunk keys from the index: for {service="keystone"}, the record that hits in the table from slide 99 is Tenant1:service | abc680ab:c79abadeff | keystone
107 Deriving chunk keys from the index: the target SeriesID in that record is c79abadeff
108 Deriving chunk keys from the index: SeriesIDs are extracted from the index, and only those common to both matchers are kept
109 Deriving chunk keys from the index: chunk keys are extracted from the SeriesIDs and the time range
110 Deriving chunk keys from the index: SeriesID = c79abadeff, range = 2021/10/26 21:52:00 + 5min
Hash Value (TenantID + SeriesID) | Range Value (Chunk start time + Chunk Key) | Value
Tenant1:c79abadeff | 1635252768:chunk1 | nil
Tenant1:c79abadeff | 1635256368:chunk2 | nil
Tenant1:c79abadeff | 1635260768:chunk3 | nil
111 Deriving chunk keys from the index: the same table, with the record of the target chunk highlighted
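The two-step lookup on slides 93 to 111 can be sketched with two dictionaries. This is an in-memory illustration, not the BoltDB schema: the `InvertedIndex` class is hypothetical, and it stores a chunk end time alongside the start time, which the slide's table does not show, so that overlap with the query range can be checked.

```python
from collections import defaultdict


class InvertedIndex:
    def __init__(self):
        # (tenant, label name, label value) -> set of series IDs
        self.series_by_label = defaultdict(set)
        # (tenant, series ID) -> list of (start, end, chunk key)
        self.chunks_by_series = defaultdict(list)

    def add_chunk(self, tenant, labels, series_id, start, end, chunk_key):
        for name, value in labels.items():
            self.series_by_label[(tenant, name, value)].add(series_id)
        self.chunks_by_series[(tenant, series_id)].append((start, end, chunk_key))

    def lookup(self, tenant, matchers, query_start, query_end):
        # Step 1: one lookup per label matcher, then intersect the series IDs.
        sets = [
            self.series_by_label.get((tenant, name, value), set())
            for name, value in matchers.items()
        ]
        series_ids = set.intersection(*sets) if sets else set()
        # Step 2: keep the chunks whose time range overlaps the query range.
        keys = []
        for sid in sorted(series_ids):
            for start, end, key in self.chunks_by_series.get((tenant, sid), []):
                if start <= query_end and end >= query_start:
                    keys.append(key)
        return keys
```

Because step 1 does one lookup per label pair, every distinct label value adds index entries, which is the cardinality concern raised on slide 100. The log line filter (`|= "level=ERROR"`) plays no part here; it is applied later, when chunk contents are read.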
112 Generating and returning the response
113 Generating the response: Lazy Chunks are created from the chunk keys. A Lazy Chunk holds no data initially and fetches the chunk from storage (or cache) only when a read is requested
114 Generating the response: an Iterator is built over the Lazy Chunks together with the chunks from the Ingesters
115 Generating the response: the Iterator is read until the response reaches the specified number of entries. The log filter condition (|= "level=ERROR") is evaluated here
116 Generating the response: when a Lazy Chunk is read, the chunk is requested from the cache and, if it is not there, from storage (Amazon S3)
117 Generating the response: when a chunk was read from storage, it is saved to the cache after retrieval
118 Query Sharding
119 Query splitting strategies 1. Split by time • When searching one hour of logs with a 15-minute split setting, the query is broken into 4 queries and executed 2. Shard the inverted index • A shard number is embedded in the inverted index in advance; the Query Frontend splits the query, inserts a shard number into each piece, and passes it to a Querier
120 Embedding the shard number in the inverted index: Shard Number = SeriesID % shard count
Hash Value | Range Value | Value
TenantID + LabelName | Shard Number + Hash(Label Value) + SeriesID | Label Value
121 Embedding the shard number in the inverted index: [Diagram: the streams {service="keystone", hostname="host1"}, {service="keystone"}, and {service="keystone",hostname="host2"} map to shard numbers 1, 12, and 16]
122 Query splitting by shard number: the LogQL query {service="keystone"} |= "level=ERROR" is split by shard and handed to Querier for shard 1, Querier for shard 12, and Querier for shard 16
123 Query splitting by shard number: the chunk keys each Querier retrieves are partitioned by shard
124 Query splitting by shard number: the |= "level=ERROR" filter processing can therefore be divided across Queriers
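The two splitting strategies from slide 119 can be combined in a short planner. This is a sketch under assumptions: the function names and the sub-query dictionary shape are hypothetical, and the time split here simply starts at the query start without aligning to interval boundaries.

```python
def split_by_interval(start, end, interval):
    """Split [start, end) into consecutive ranges of at most `interval` seconds."""
    ranges = []
    cursor = start
    while cursor < end:
        ranges.append((cursor, min(cursor + interval, end)))
        cursor += interval
    return ranges


def shard_for(series_id, shard_count):
    """Shard Number = SeriesID % shard count, as on slide 120."""
    return series_id % shard_count


def plan_subqueries(query, start, end, interval, shard_count):
    """One sub-query per (time range, shard) pair, ready to be queued."""
    return [
        {"query": query, "start": s, "end": e, "shard": shard}
        for s, e in split_by_interval(start, end, interval)
        for shard in range(shard_count)
    ]
```

For example, a one-hour query with a 15-minute interval yields 4 time ranges, and with a hypothetical shard count of 16 that becomes 64 sub-queries that Queriers can process in parallel.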
125 Index management with BoltDB Shipper
126 BoltDB Shipper: [Diagram: the Ingester and the Querier each have a Shipper and a local Disk holding Index 1 and Index 2; both sync with Amazon S3]
127 BoltDB Shipper - Ingester side: periodically uploads the index files on local disk to S3 and deletes them after upload
128 BoltDB Shipper - Querier side • Downloads the indexes in S3 at startup • Downloads missing indexes from S3 at query time • Periodically deletes indexes for which the cache TTL has elapsed since last use
129 BoltDB Shipper: Ingesters and Queriers work with the index locally, and the Shipper syncs the index with storage asynchronously
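130 Summary of each component's role
131 Name | Role | Type | Data held | Clustering
Distributor | Validation and routing to Ingesters | Stateless | - | Yes
Ingester | Buffering and flushing data | Stateful | Memory: chunks (raw + compressed); Disk: WAL (raw + compressed), inverted index (compressed) | Yes
Query Frontend | Query splitting and queue control | Stateless | - | No
Querier | Query execution | Stateful | Disk: inverted index cache (compressed) | No
Chunk Cache | Cache of chunks | Stateful | Memory: chunks (compressed) | Yes (client side)
Index Read Cache | Cache for index reads | Stateful | Memory: inverted index (snappy) | Yes (client side)
Index Write Cache | Control cache that prevents the same index from being written multiple times (not needed with BoltDB Shipper) | Stateful | Memory: chunk key (raw) | Yes (client side)
Query Result Cache | Cache of query results | Stateful | Memory: query result (raw) | Yes (client side)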
132 3) Understanding the behavior during failures
133 Failure design for writes
134 Failure design for writes: preparing for an S3 outage
135 Failure design for writes: [Diagram: the flush from the Ingester to Amazon S3 fails]
136 Failure design for writes: because flushing is asynchronous, the write itself does not fail
137 Failure design for writes: while the flush keeps failing, data keeps accumulating in memory and on disk
138 Failure design for writes: memory overflows first, resulting in OOM
139 How the WAL works (recap): on Ingester startup, the Segment and Checkpoint are read from disk
140 How the WAL works (recap): the streams are restored into memory
141 Failure design for writes: the WAL is effectively a snapshot of memory. Even if loading succeeds, OOM follows shortly; if loading fails, the process does not start at all
142 Failure design for writes: the bottleneck is the Ingester's memory
143 Data on the Ingester and its encoding (recap) • Memory: per chunk, raw data up to one block size plus compressed blocks • Disk, Segments: raw • Disk, Checkpoint: same as the memory chunk
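144 Failure design for writes: Countermeasure 1. Provision enough memory for the time you want to withstand • log volume per hour / compression ratio * hours * replication factor ※ In our environment, gzip compression gives a compression ratio of 10x to 18x
145 Failure design for writes: withstanding 1 hour in an environment with 10 TB of traffic per day (one of our regions). 1000 / 24 = 41.6 GB / hour (the 1000 GB is presumably the 10 TB daily volume after roughly 10x compression) • Replication Factor = 1 • Ingester * 11 • Memory: 4GiB • Disk: 8GiB (twice the memory, for margin) • Chunk Cache * 14 • Memory: 3GiB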
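The sizing formula from slide 144 and the numbers from slide 145 can be reproduced as a short calculation. This is a sketch under assumptions: it takes the lower bound of the quoted compression ratio (10x), uses decimal units for the 10 TB figure, and treats the 4 GiB per Ingester as roughly 4 GB; the function name is hypothetical.

```python
import math


def buffer_memory_gb(raw_gb_per_hour, compression_ratio, hours, replication_factor):
    """Memory needed to keep buffering while storage is down.

    Formula from the slide:
    log volume per hour / compression ratio * hours * replication factor
    """
    return raw_gb_per_hour / compression_ratio * hours * replication_factor


raw_gb_per_day = 10_000  # 10 TB/day in decimal units (assumption)
needed_gb = buffer_memory_gb(
    raw_gb_per_hour=raw_gb_per_day / 24,
    compression_ratio=10,  # conservative end of the 10x to 18x range
    hours=1,
    replication_factor=1,
)
ingesters = math.ceil(needed_gb / 4)  # 4 GB of memory per Ingester (approximation)
```

Under these assumptions `needed_gb` comes out at about 41.7 and `ingesters` at 11, which lines up with the "Ingester * 11" on the slide. Raising the replication factor to 3 would triple the memory requirement, which is part of the trade-off discussed on the next slide.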
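146 Failure design for writes: is a replication factor of 1 acceptable? • If an Ingester process goes down before flushing, the logs held there for that interval are lost; we accept this to some extent • Because there is a WAL, recovery is quick after a restart • We lean toward raising the chance of staying operational during a failure rather than avoiding small losses
147 Failure design for writes: Countermeasure 2. During a failure, temporarily start with the WAL disabled • Since no WAL load runs, the process can start even when more data has accumulated than fits in memory • When re-enabling it, take care with the replication factor and update strategy so that logs are not lost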
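148 Failure design for reads
149 Failure design for reads: [Diagram: Chunk Keys, Lazy Chunks, chunks from the Ingesters, Iterator, Response; Lazy Chunks are read from the Chunk Cache and Amazon S3]
150 Failure design for reads: conditions for searching logs during a storage outage • At least one Ingester is healthy • The search result can be covered by data in the cache or the Ingesters • The query does not specify a time range that is not in the cache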
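151 Summary
152 Summary: a detailed walk through the write and read processes at the core of Loki • Knowing how it works makes troubleshooting and tuning possible • Knowing where data is held and in which encoding makes capacity planning possible • Knowing the behavior during failures makes it possible to plan appropriate preparations
153 Summary: topics not covered • Metrics generation and alerting from logs • Log retention management • Monitoring Loki itself • Capacity management of each component • Handling out-of-order entries
154 I will write a book on these separately
155 Questions? Twitter: @taisho6339
156 THANK YOU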