CONTENT
01 Introduction to Observability
02 Three Pillars of Observability
03 Case Study: LINE INVOICE
04 Conclusion
Slide 6
Slide 6 text
SECTION 01
Introduction to Observability
Slide 7
Slide 7 text
What actions should be taken when a system error occurs?
Slide 8
Slide 8 text
Purpose of Observability
Slide 9
Slide 9 text
SECTION 02
Three Pillars of Observability
Slide 10
Slide 10 text
Three Pillars of Observability
Slide 11
Slide 11 text
Logs
1. Immutable, timestamped records of discrete events
2. Record the necessary information for each request
Source: https://grafana.com/products/cloud/logs/
Slide 12
Slide 12 text
Logs Format
● Unstructured - PlainText
● Structured - JSON format
● Binary
○ MySQL binlogs
○ systemd journal logs
Source: https://www.percona.com/blog/binlog-encryption-percona-server-mysql/
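To make the difference between the unstructured and structured formats concrete, here is a minimal sketch using Python's standard `logging` module. The `JsonFormatter` class, the `format_both` helper, and the `request_id` field are hypothetical names chosen for illustration; they do not come from the LINE INVOICE system.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (structured log)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # request_id is a hypothetical per-request field for this sketch
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(payload)


def format_both(message, request_id):
    """Return the same event as (plain-text line, JSON line)."""
    record = logging.LogRecord(
        name="demo", level=logging.ERROR, pathname="demo.py", lineno=0,
        msg=message, args=(), exc_info=None,
    )
    record.request_id = request_id
    plain = logging.Formatter(
        "%(asctime)s %(levelname)s %(message)s"
    ).format(record)
    structured = JsonFormatter().format(record)
    return plain, structured


if __name__ == "__main__":
    plain, structured = format_both("invoice fetch failed", "req-123")
    print(plain)       # unstructured: needs text parsing to query
    print(structured)  # structured: fields can be queried by key
```

The plain-text line is easy for a person to read, while the JSON line lets a log backend filter by field (for example, by `request_id`) without regular expressions.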
Slide 15
Slide 15 text
Logs Collection Flow
Slide 16
Slide 16 text
Metrics
Quantitative insight into system performance and resource utilization
Source: https://grafana.com/products/cloud/metrics/
Slide 17
Slide 17 text
Metrics Supported Data Types
• Counter
• Gauge
• Histogram
• Summary
Slide 18
Slide 18 text
Metrics Supported Data Types
Counter
• Only increases, never decreases
• Application: number of HTTP requests
Slide 19
Slide 19 text
Metrics Supported Data Types
Gauge
• Can increase or decrease at any time
• Application: number of concurrent requests
Slide 20
Slide 20 text
Metrics Supported Data Types
Histogram
• Counts observations in configurable buckets; the bucket counts only increase, never decrease
• Application: request durations
Slide 21
Slide 21 text
Metrics Supported Data Types
Summary
• Samples observations and reports precise quantiles computed on the client side
• Application: request durations
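The four data types can be sketched in plain Python to show how their semantics differ. This is a simplified illustration, not the API of any metrics library: the class names mirror the slide titles, `simulate` is a hypothetical helper, and the `Summary` here keeps every observation instead of using a sliding time window.

```python
import bisect
import math


class Counter:
    """Monotonic value: it can only go up."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Value that can go up or down at any time."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


class Histogram:
    """Counts observations in cumulative buckets with inclusive upper bounds."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        # One extra bucket at the end plays the role of +Inf.
        self.bucket_counts = [0] * (len(self.bounds) + 1)

    def observe(self, value):
        first = bisect.bisect_left(self.bounds, value)
        # Cumulative: increment every bucket whose upper bound >= value.
        for i in range(first, len(self.bucket_counts)):
            self.bucket_counts[i] += 1


class Summary:
    """Stores observations and answers quantile queries (nearest rank)."""

    def __init__(self):
        self.observations = []

    def observe(self, value):
        self.observations.append(value)

    def quantile(self, q):
        ordered = sorted(self.observations)
        rank = max(1, math.ceil(q * len(ordered)))
        return ordered[rank - 1]


def simulate(durations):
    """Feed hypothetical request durations (seconds) into all four types."""
    requests, in_flight = Counter(), Gauge()
    latency, summary = Histogram([0.1, 0.5, 1.0]), Summary()
    for seconds in durations:
        requests.inc()        # total requests served so far
        in_flight.inc()       # request starts
        latency.observe(seconds)
        summary.observe(seconds)
        in_flight.dec()       # request ends
    return requests, in_flight, latency, summary


if __name__ == "__main__":
    requests, in_flight, latency, summary = simulate([0.05, 0.3, 0.7, 2.0])
    print(requests.value, in_flight.value)
    print(latency.bucket_counts, summary.quantile(0.5))
```

A histogram stores only bucket counts, so quantiles derived from it are estimates; the summary sketch answers quantile queries directly from the recorded values.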
Slide 22
Slide 22 text
Metrics Collection Flow
Slide 23
Slide 23 text
Traces
● Record and visualize the complete path of a request through the system
● Identify the specific points in that path where time is spent or errors occur
Source: https://grafana.com/docs/grafana/latest/panels-visualizations/visualizations/traces/
https://www.oreilly.com/library/view/distributed-systems-observability/9781492033431/ch04.html
Slide 24
Slide 24 text
Traces Span
Slide 25
Slide 25 text
Traces Span Structure
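As a rough sketch of what a span typically carries, the Python below models a span with a trace ID, its own ID, a parent ID, a name, and start and end times, then rebuilds the request path as a tree. The field names, the `render_tree` helper, and the example trace are assumptions made for illustration; real tracing systems define their own schemas.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one request
    span_id: str              # unique within the trace
    parent_id: Optional[str]  # None marks the root span
    name: str
    start: float              # seconds since the request began
    end: float
    attributes: Dict[str, str] = field(default_factory=dict)

    @property
    def duration(self) -> float:
        return self.end - self.start


def render_tree(spans: List[Span]) -> List[str]:
    """Return one indented line per span, children under their parent."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        ordered = sorted(children.get(parent_id, []), key=lambda s: s.start)
        for span in ordered:
            lines.append(f"{'  ' * depth}{span.name} ({span.duration:.1f}s)")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


def example_trace() -> List[Span]:
    """Hypothetical three-span trace for a single request."""
    return [
        Span("t1", "a", None, "GET /invoices", 0.0, 1.0),
        Span("t1", "b", "a", "auth", 0.0, 0.2),
        Span("t1", "c", "a", "db query", 0.2, 0.9),
    ]


if __name__ == "__main__":
    for line in render_tree(example_trace()):
        print(line)
```

The parent ID is what links individual spans into the full path of a request, which is how a trace viewer can draw the nested timeline.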
Slide 26
Slide 26 text
Traces Collection Flow
Slide 27
Slide 27 text
SECTION 03
Case Study
LINE INVOICE
Slide 28
Slide 28 text
Gary Hu
Education
• M.S. in Computer Science @ NTU
• B.B.A in Information Management @ NTU
Experience
• 2023 - 2024 | TECH FRESH @ LINE Taiwan
• 2022 - 2023 | Software Engineer Intern @ KKCompany
• 2022 | Research Assistant @ Academia Sinica
Slide 29
Slide 29 text
LINE INVOICE (發票管家, "Invoice Manager")
Slide 30
Slide 30 text
LINE Sticker: sticker-sending mission
Slide 32
Slide 32 text
Case 1: Mystery Behind the Blank Screen
Scenario
Thousands of users simultaneously accessing our system
Problem
Users are met with blank screens and error messages.
Challenges
We need to investigate the error and identify its cause.
Case 2: Peak Traffic Monitoring
Scenario
Thousands of users simultaneously accessing our system
Problem
1. Server cannot handle all requests
2. Timeouts and poor user experience
Challenges
Identify bottlenecks and optimize server performance
Slide 35
Slide 35 text
Case 2: Peak Traffic Monitoring
● 3.7s: Users start getting frustrated
● 75%: Speed affects user experience
● 53%: Users abandon after three seconds
Source: https://pixotech.com/blog/what-a-performance-how-site-speed-affects-ux/
https://smallbusinessweb.co/impact-of-website-loading-speed/
Slide 36
Slide 36 text
Case 2: Peak Traffic Monitoring
Steps
1. Collect metrics from all services
2. Visualize metrics to understand system behavior
3. Monitor traffic volume and response time
Slide 37
Slide 37 text
Case 2: Peak Traffic Monitoring
Steps
4. Collect CPU and memory usage
5. Identify issues
6. Address inappropriate configurations
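The "monitor traffic volume and response time" step can be sketched as follows. This is a minimal Python illustration, assuming response times have already been collected for a fixed time window. The `percentile` and `summarize` helpers and the sample values are made up for the example and are not measurements from LINE INVOICE.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile; q is a whole number from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(response_times, window_seconds):
    """Traffic volume and latency for one monitoring window."""
    return {
        "requests_per_second": len(response_times) / window_seconds,
        "p50": percentile(response_times, 50),
        "p95": percentile(response_times, 95),
    }


if __name__ == "__main__":
    # Hypothetical 10-second window: most requests are fast, two are slow.
    samples = [0.2] * 18 + [3.5, 4.0]
    print(summarize(samples, window_seconds=10))
```

Watching a high percentile such as p95 alongside the median matters at peak traffic: the median can look healthy while a minority of users are already waiting several seconds.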
Slide 38
Slide 38 text
About Our System
Slide 39
Slide 39 text
Case 3: Mystery of the 5-Minute Workflow
Scenario
Many workflows are executed daily to fetch user invoices
Problem
We discovered that numerous workflows are taking over 5 minutes to complete.
Challenges
Identify the cause of the delays and optimize the workflow performance.
Slide 40
Slide 40 text
Case 3: Mystery of the 5-Minute Workflow
Steps
1. Collect traces
2. Visualize traces
3. Analyze each span
Slide 41
Slide 41 text
Case 3: Mystery of the 5-Minute Workflow
Findings
● Fetching invoices from the government takes 27 seconds.
● Storing the invoices in the database, however, takes 54 seconds.
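The span analysis behind these findings can be sketched in Python as a simple ranking of spans by duration. The two durations are the ones reported above; the span names and the `slowest_steps` helper are hypothetical, and the percentages are shares of these two spans only, not of the whole workflow.

```python
def slowest_steps(durations):
    """Rank spans by duration; return (name, seconds, percent of the total)."""
    total = sum(durations.values())
    ranked = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, seconds, round(100 * seconds / total, 1))
        for name, seconds in ranked
    ]


if __name__ == "__main__":
    # Durations from the findings; span names are assumptions.
    spans = {
        "fetch_invoices_from_government": 27.0,
        "store_invoices_in_database": 54.0,
    }
    for name, seconds, share in slowest_steps(spans):
        print(f"{name}: {seconds:.0f}s ({share}%)")
```

Ranking spans this way points at the database write, which takes twice as long as the external fetch, as the first place to optimize.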