2019 DevDay
gRPC Service Development in
Private Kubernetes Cluster
> Keiichiro Ui
> LINE Development Team H Server Side Engineer
Slide 2
Kubernetes User without Cloud Platforms?
Slide 3
About Me
> Started with LINE LIVE projects
> Developing new live commerce service
> Leveraging new technology
• With LINE development environments
Slide 4
Live Commerce Service with
> LINE LIVE technologies
> New Technologies
Slide 5
Why Use Kubernetes?
Slide 6
Traffic Spike in Live Cast Service
> Requiring large scale out
• For popular artists, performers, etc.
Slide 7
Scale Out Operations in LINE LIVE
> Create VM instances
> Run some setup scripts/ansible
> Register with:
• Deployment system
• Metrics observer
• etc.
> Deploy
> Register with load balancers
Slide 8
Why Use Kubernetes?
With Kubernetes
> Pod/Cluster autoscaler
> Open
• Ecosystem for OSS
Existing Way
> Manual scaling
> Developed for LINE devs
• Ecosystem for LINE services
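To make concrete what the pod autoscaler takes over from manual scaling: the Kubernetes Horizontal Pod Autoscaler documentation gives its core rule as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below applies only that rule. The function name, the replica bounds, and the sample numbers are made up for illustration, and the real controller's tolerance and stabilization behavior is left out.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=100):
    """Apply the documented HPA ratio rule, then clamp to the allowed range."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))

# Hypothetical spike: 4 pods running at 90% CPU against a 50% target.
print(desired_replicas(4, 90, 50))
```

The cluster autoscaler then adds nodes when the resulting pods no longer fit on the existing ones.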
Slide 9
Agenda
> Kubernetes Service for LINE Services
> Our Service Architecture Using gRPC
> Service Metrics/Observations
> Log Aggregation
Slide 10
IaaS/PaaS in LINE: Verda
> Kubernetes
> OpenStack
> MySQL
> Elasticsearch
> Redis
> Object Storage
> LoadBalancer
> etc.
Slide 11
Kubernetes for LINE Dev
> Resources (Pod, Deployments, etc...)
> Namespace
> Kubernetes Cluster
> OpenStack
> Infrastructure
+ Log
+ Observation
Service Dev
Kubernetes Team
Slide 12
Agenda
> Kubernetes Service for LINE Services
> Our Service Architecture Using gRPC
> Service Metrics/Observations
> Log Aggregation
Slide 13
Overview
> Microservices with gRPC
> Spring Boot
> Using Envoy to convert various protocols to gRPC
Slide 14
App Interactions
> gRPC over Internet
> L7 load balancer
> NodePort as L4 load balancer
> Envoy for L7 load balancer
> Server application as headless service
Slide 15
Why Envoy?
> Load balancing does not work properly without it
• External LB cannot identify pod locations
> NodePort works as L4 LB
• TCP connections are long-lived in gRPC/HTTP/2
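The problem on this slide can be shown with a toy simulation. An L4 balancer picks a pod once per TCP connection, and because gRPC keeps that connection open, every later request lands on the same pod; pods added by a scale-out receive nothing. An L7 proxy such as Envoy can pick a pod per request instead. The pod names, request counts, and round-robin policy below are assumptions for illustration only.

```python
from collections import Counter
from itertools import cycle

def l4_balance(connections, pods_at_connect, requests_per_connection):
    """Connection-level balancing: a pod is chosen once per TCP connection."""
    picker = cycle(pods_at_connect)
    hits = Counter()
    for _ in range(connections):
        pod = next(picker)                     # chosen when the connection opens
        hits[pod] += requests_per_connection   # every request reuses that pod
    return hits

def l7_balance(connections, pods, requests_per_connection):
    """Request-level balancing: a pod is chosen for every single request."""
    picker = cycle(pods)
    hits = Counter()
    for _ in range(connections * requests_per_connection):
        hits[next(picker)] += 1
    return hits

# Two pods exist when the clients connect; two more appear after scale-out.
old_pods = ["pod-a", "pod-b"]
all_pods = old_pods + ["pod-c", "pod-d"]

l4 = l4_balance(connections=2, pods_at_connect=old_pods, requests_per_connection=100)
l7 = l7_balance(connections=2, pods=all_pods, requests_per_connection=100)
print("L4:", dict(l4))
print("L7:", dict(l7))
```

In the L4 case the two new pods stay idle until clients reconnect, which defeats the purpose of autoscaling during a traffic spike.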
Slide 16
Why Envoy?
Slide 17
Internal Interactions
> Spring Boot
> grpc-java
• With headless service
• Client-side load balancing
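A headless service has no cluster IP, so a DNS lookup of the service name returns the pod addresses directly and the client can balance across them itself. The sketch below shows the idea in Python rather than grpc-java; `RoundRobinChannel`, `fake_resolver`, the hostname, and the IP addresses are all hypothetical stand-ins, and real clients also handle re-resolution, health, and connection state.

```python
import itertools

class RoundRobinChannel:
    """Toy client-side balancer: resolve a name, then rotate over the addresses."""

    def __init__(self, hostname, resolver):
        self._hostname = hostname
        self._resolver = resolver
        self.refresh()

    def refresh(self):
        """Re-resolve the name, e.g. after pods are added or removed."""
        addresses = sorted(self._resolver(self._hostname))
        if not addresses:
            raise RuntimeError(f"no addresses for {self._hostname}")
        self._addresses = addresses
        self._cycle = itertools.cycle(addresses)

    def pick(self):
        """Return the address the next call should be sent to."""
        return next(self._cycle)

# Hypothetical resolver standing in for a DNS lookup on a headless service.
def fake_resolver(hostname):
    return ["10.0.0.12", "10.0.0.11", "10.0.0.13"]

channel = RoundRobinChannel("shop.default.svc.cluster.local", fake_resolver)
print([channel.pick() for _ in range(4)])
```

Because the client knows every pod address, no intermediate proxy is needed for service-to-service calls inside the cluster.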
Slide 18
gRPC-Web
> For JavaScript on web browsers
• Admin system
• Service landing pages
> gRPC-Web works over any HTTP version
• No dependency on HTTP/2
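One reason gRPC-Web can avoid HTTP/2 is that it does not rely on HTTP trailers. Per the gRPC-Web protocol description, each message keeps the usual gRPC prefix (a 1-byte flag and a 4-byte big-endian length), and trailers travel in the response body as a frame whose flag has the most significant bit set. The sketch below encodes and decodes such frames. The payload bytes are arbitrary sample data, and base64 text mode and compression are left out.

```python
import struct

DATA_FLAG = 0x00
TRAILER_FLAG = 0x80  # most significant bit marks a trailers frame in gRPC-Web

def encode_frame(payload: bytes, trailers: bool = False) -> bytes:
    """Prefix a payload with the 1-byte flag and 4-byte big-endian length."""
    flag = TRAILER_FLAG if trailers else DATA_FLAG
    return struct.pack(">BI", flag, len(payload)) + payload

def decode_frames(body: bytes):
    """Split a response body into (is_trailers, payload) tuples."""
    frames, offset = [], 0
    while offset < len(body):
        if len(body) - offset < 5:
            raise ValueError("truncated frame header")
        flag, length = struct.unpack_from(">BI", body, offset)
        offset += 5
        payload = body[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated frame payload")
        frames.append((bool(flag & TRAILER_FLAG), payload))
        offset += length
    return frames

# Sample body: one data frame followed by one trailers frame.
body = encode_frame(b"\x0a\x03abc") + encode_frame(b"grpc-status:0\r\n", trailers=True)
print(decode_frames(body))
```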
Slide 19
REST API
> For other LINE services
• Envoy gRPC-JSON transcoding
> Protocol Buffers can be serialized into JSON
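Two rules of the proto3 JSON mapping are worth knowing when other services consume the transcoded API: field names are rendered in lowerCamelCase by default, and 64-bit integers are rendered as JSON strings. The sketch below imitates just those two rules on a plain dict standing in for a message. The field names and the `int64_fields` parameter are hypothetical, and a real service would rely on Envoy or a protobuf library rather than code like this.

```python
import json

def to_lower_camel(name: str) -> str:
    """Convert a snake_case proto field name to lowerCamelCase."""
    head, *rest = name.split("_")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)

def to_json_dict(message: dict, int64_fields=frozenset()) -> dict:
    """Rename fields and stringify the fields declared as 64-bit integers."""
    out = {}
    for name, value in message.items():
        if isinstance(value, dict):
            value = to_json_dict(value, int64_fields)   # nested message
        elif name in int64_fields:
            value = str(value)                          # int64 -> JSON string
        out[to_lower_camel(name)] = value
    return out

# Hypothetical message from a live commerce API.
item = {"item_id": 7, "price_micros": 1200000000000, "shop_info": {"shop_name": "demo"}}
print(json.dumps(to_json_dict(item, int64_fields={"price_micros"})))
```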
Slide 20
Why Avoid Istio?
https://istio.io/
Slide 21
Why Avoid Istio?
> Spring has the same functionalities
• Traffic metrics
• Distributed tracing
• Fault injection, Circuit breaker, Retry
> Performance
Slide 22
Why Not Use Istio
https://istio.io/docs/concepts/performance-and-scalability/
> Latency figures highlighted on the slide: 13.7 ms and 1.89 ms
Slide 23
Database
> LINE has managed MySQL
• with ACL
Slide 24
ACL Is Incompatible with Scaler
Slide 25
ACL Manager
> API for managing DB ACL
> Hooks into the autoscaler to update the ACL when a node is added or deleted
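The slides do not describe how the ACL manager works internally. One plausible shape, sketched below, is a reconcile step that compares the IPs currently allowed by the database ACL with the cluster's current node IPs and issues the difference as grant and revoke calls. `FakeAclApi`, `plan_acl_changes`, `on_nodes_changed`, and the addresses are all hypothetical.

```python
def plan_acl_changes(allowed_ips, node_ips):
    """Return (ips_to_grant, ips_to_revoke) so the ACL matches the node set."""
    allowed, nodes = set(allowed_ips), set(node_ips)
    return sorted(nodes - allowed), sorted(allowed - nodes)

class FakeAclApi:
    """Stand-in for the DB ACL API; the real interface is not shown in the talk."""

    def __init__(self, allowed):
        self.allowed = set(allowed)

    def grant(self, ip):
        self.allowed.add(ip)

    def revoke(self, ip):
        self.allowed.discard(ip)

def on_nodes_changed(api, node_ips):
    """Hook to call whenever the autoscaler adds or deletes a node."""
    to_grant, to_revoke = plan_acl_changes(api.allowed, node_ips)
    for ip in to_grant:
        api.grant(ip)
    for ip in to_revoke:
        api.revoke(ip)
    return to_grant, to_revoke

# Hypothetical scale-out: node .3 joined, node .1 was removed.
api = FakeAclApi(["10.1.0.1", "10.1.0.2"])
print(on_nodes_changed(api, ["10.1.0.2", "10.1.0.3"]))
```

Computing the full difference each time, rather than reacting to single events, keeps the ACL correct even if a hook invocation is missed.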
Slide 26
Agenda
> Kubernetes Service for LINE Services
> Our Service Architecture Using gRPC
> Service Metrics/Observations
> Log Aggregation
Slide 27
Prometheus
> Placed in our Kubernetes cluster
> Works as StatefulSet
Slide 28
Prometheus
> Persistent Volume is hard to use
> We are trying remote storage instead
• Using a TSDB developed in-house
Slide 29
Too Many Metrics
> Too many metrics in Prometheus with Kubernetes
• How to visualize?
• Which metrics should we watch?
> Head time series: 240,638
Prometheus With Kubernetes
Slide 30
Too Many Metrics
> kubernetes-mixin adds many settings
• Grafana dashboards
• Pod crash loop detection
• Node hanging up
• Volume usage prediction for PersistentVolumes
• etc.
Prometheus with Kubernetes
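Volume usage prediction of this kind is typically built on linear extrapolation, in the spirit of PromQL's `predict_linear`, which fits a simple linear regression to recent samples. The sketch below does the same on a list of (timestamp, free bytes) pairs and flags a volume expected to run out within a horizon. The sample data, the function names, and the one-hour horizon are assumptions for illustration, not the settings kubernetes-mixin ships.

```python
def predict_linear(samples, seconds_ahead):
    """Fit a least-squares line to (timestamp, value) pairs and extrapolate."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    return slope * (last_t + seconds_ahead) + intercept

def will_fill_up(free_bytes_samples, horizon_seconds):
    """True if the predicted free space drops below zero within the horizon."""
    return predict_linear(free_bytes_samples, horizon_seconds) < 0

# Hypothetical volume losing 1 byte of free space per second.
samples = [(0, 1000), (60, 940), (120, 880), (180, 820)]
print(will_fill_up(samples, horizon_seconds=3600))
```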
Slide 31
Spring Application Metrics
> Spring Boot Actuator
> OpenCensus
• Metrics
• Distributed traces
• Various languages
Slide 32
OpenCensus Built into gRPC
OpenCensus
> Metrics
> Distributed tracing
Slide 33
Agenda
> Kubernetes Service for LINE Services
> Our Service Architecture Using gRPC
> Service Metrics/Observations
> Log Aggregation
Slide 34
Log Aggregation
> Fluentd + Elasticsearch + Kibana
• For generic logs
> IMON
• For error log alerts
Slide 35
Log Aggregation in Kubernetes Cluster
> Old pod logs may be removed
> Logs should be stored in external storage
Slide 36
EFK (Elasticsearch + Fluentd + Kibana)
Slide 37
Fluentd
> References for our setup
• Knative requirements
• Kubernetes addon
• not for production
• Manifests in the Helm chart
> Plugins
• fluent-plugin-elasticsearch
• fluent-plugin-kubernetes_metadata_filter
• fluent-plugin-detect-exceptions
• fluent-plugin-multi-format-parser
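The reason an exception-detection plugin matters is that container logs arrive line by line, so a Java stack trace would otherwise be indexed as many unrelated records. The sketch below shows the basic idea of folding continuation lines into the preceding record. It is a deliberately simplified stand-in, not the logic of fluent-plugin-detect-exceptions; the regular expression covers only common Java trace lines, and the sample log lines are made up.

```python
import re

# Lines that continue a Java stack trace rather than start a new log record.
CONTINUATION = re.compile(r"^(\s+at |\s+\.\.\. \d+ more|Caused by: )")

def group_stack_traces(lines):
    """Merge stack-trace continuation lines into the record before them."""
    records = []
    for line in lines:
        if records and CONTINUATION.match(line):
            records[-1] += "\n" + line
        else:
            records.append(line)
    return records

# Hypothetical pod output.
log_lines = [
    "INFO  request started",
    "java.lang.IllegalStateException: boom",
    "\tat com.example.Shop.order(Shop.java:42)",
    "Caused by: java.io.IOException: db down",
    "\tat com.example.Db.query(Db.java:7)",
    "\t... 3 more",
    "INFO  request finished",
]
for record in group_stack_traces(log_lines):
    print(repr(record))
```

With the trace kept as one record, a single Elasticsearch document holds the whole exception and alerts can fire once per error instead of once per line.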
Slide 38
IMON
> Log monitoring system
> Developed in-house
> Sends alerts for warning/error logs
Slide 39
Agenda
> Kubernetes Service for LINE Services
> Our Service Architecture Using gRPC
> Service Metrics/Observations
> Log Aggregation
Slide 40
Our Booth
> Kotlin coroutine
> CI/CD with Drone CI + Kubernetes + GitHub
> Development flow that shares proto files between server and client