Slide 1

Slide 1 text

2019 DevDay gRPC Service Development in Private Kubernetes Cluster > Keiichiro Ui > LINE Development Team H Server Side Engineer

Slide 2

Slide 2 text

Kubernetes User without Cloud Platforms?

Slide 3

Slide 3 text

About Me > Started with LINE LIVE projects > Developing new live commerce service > Leveraging new technology • With LINE development environments

Slide 4

Slide 4 text

Live Commerce Service with > LINE LIVE technologies > New Technologies

Slide 5

Slide 5 text

Why Use Kubernetes?

Slide 6

Slide 6 text

Traffic Spike in Live Cast Service > Requiring large scale out • For popular artists, performers, etc.

Slide 7

Slide 7 text

Scale Out Operations in LINE LIVE > Create VM instances > Run setup scripts/Ansible > Register with: • Deployment system • Metrics observer • etc. > Deploy > Register with load balancers

Slide 8

Slide 8 text

Why Use Kubernetes?
With Kubernetes > Pod/Cluster autoscaler > Open • Ecosystem for OSS
Existing Way > Manual scaling > Developed for LINE devs • Ecosystem for LINE services

Slide 9

Slide 9 text

Agenda > Kubernetes Service for LINE Services > Our Service Architecture Using gRPC > Service Metrics/Observations > Log Aggregation

Slide 10

Slide 10 text

IaaS/PaaS in LINE: Verda > Kubernetes > OpenStack > MySQL > Elasticsearch > Redis > Object Storage > LoadBalancer > etc.

Slide 11

Slide 11 text

Kubernetes for LINE Dev > Resources (Pods, Deployments, etc.) > Namespace > Kubernetes Cluster > OpenStack > Infrastructure + Log + Observation (diagram labels: Service Dev, Kubernetes Team)

Slide 12

Slide 12 text

Agenda > Kubernetes Service for LINE Services > Our Service Architecture Using gRPC > Service Metrics/Observations > Log Aggregation

Slide 13

Slide 13 text

Overview > Microservices with gRPC > Spring Boot > Using Envoy to convert various protocols to gRPC

Slide 14

Slide 14 text

App Interactions > gRPC over Internet > L7 load balancer > NodePort as L4 load balancer > Envoy for L7 load balancer > Server application as headless service

Slide 15

Slide 15 text

Why Envoy? > Load balancing does not work properly • External LB cannot identify pod locations > NodePort works as L4 LB • TCP connections are long-lived in gRPC/HTTP/2
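The effect described on this slide can be illustrated with a small simulation. This is only a sketch: the pod names, the connection counts, and both balancer functions are hypothetical and do not model NodePort or Envoy internals. It assumes each client keeps one long-lived connection and sends all of its requests over it.

```python
from collections import Counter
from itertools import cycle

BACKENDS = ["pod-a", "pod-b", "pod-c", "pod-d"]  # hypothetical pod names


def l4_balance(num_connections, requests_per_connection):
    """Pick a backend once per TCP connection (connection-level balancing)."""
    picker = cycle(BACKENDS)
    hits = Counter()
    for _ in range(num_connections):
        backend = next(picker)  # chosen when the connection opens
        # Every request on this connection goes to the same backend.
        hits[backend] += requests_per_connection
    return hits


def l7_balance(num_connections, requests_per_connection):
    """Pick a backend for every request (request-level balancing)."""
    picker = cycle(BACKENDS)
    hits = Counter()
    for _ in range(num_connections * requests_per_connection):
        hits[next(picker)] += 1
    return hits


if __name__ == "__main__":
    print("L4:", dict(l4_balance(2, 1000)))
    print("L7:", dict(l7_balance(2, 1000)))
```

With two client connections and four pods, the connection-level balancer sends all traffic to two pods, while the request-level balancer spreads it across all four.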

Slide 16

Slide 16 text

Why Envoy?

Slide 17

Slide 17 text

Internal Interactions > Spring Boot > grpc-java • With headless service • Client-side load balancing
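Client-side load balancing over a headless service can be sketched as follows. A headless Service answers DNS queries with the addresses of the individual pods instead of a single virtual IP, so the client can pick among them itself. The sketch is in Python rather than grpc-java, and `resolve_all`, `RoundRobinPicker`, and the addresses in the usage note are hypothetical names chosen for illustration; a real client would also re-resolve DNS as pods come and go.

```python
import socket
from itertools import count


def resolve_all(hostname, port):
    """Return every IPv4 address DNS reports for hostname.

    For a headless Service this is expected to be one address per ready pod.
    """
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


class RoundRobinPicker:
    """Cycle through a list of backend addresses, one pick per call."""

    def __init__(self, addresses):
        if not addresses:
            raise ValueError("no backend addresses")
        self._addresses = list(addresses)
        self._counter = count()

    def pick(self):
        """Return the next address in rotation."""
        return self._addresses[next(self._counter) % len(self._addresses)]

    def update(self, addresses):
        """Replace the backend list, e.g. after re-resolving DNS."""
        if not addresses:
            raise ValueError("no backend addresses")
        self._addresses = list(addresses)
```

For example, `RoundRobinPicker(["10.0.0.1", "10.0.0.2", "10.0.0.3"])` returns the three addresses in order and then wraps around to the first. In grpc-java, a comparable setup is usually expressed with a `dns:///` target and the `round_robin` load-balancing policy; whether this service configures it that way is not stated on the slide.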

Slide 18

Slide 18 text

gRPC-Web > For JavaScript in web browsers • Admin system • Service landing pages > gRPC-Web works over any HTTP version • No dependency on HTTP/2

Slide 19

Slide 19 text

REST API > For other LINE services • Envoy gRPC-JSON transcoding > Protocol Buffers can be serialized into JSON
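As a rough illustration of how a Protocol Buffers message maps to JSON, the sketch below applies two rules of the proto3 JSON mapping: field names become lowerCamelCase, and 64-bit integers are written as strings. It is not Envoy's transcoder; the message is a plain dict, and the field names (`item_id`, `display_name`, `price_micros`) are hypothetical.

```python
import json


def to_json_name(field_name):
    """Convert a snake_case proto field name to lowerCamelCase."""
    head, *rest = field_name.split("_")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


def to_proto3_json(message, int64_fields=()):
    """Serialize a dict standing in for a proto message into JSON text.

    int64_fields lists the fields declared as 64-bit integers; they are
    emitted as JSON strings so that precision is not lost in JavaScript.
    """
    out = {}
    for name, value in message.items():
        if name in int64_fields:
            value = str(value)
        out[to_json_name(name)] = value
    return json.dumps(out, sort_keys=True)


if __name__ == "__main__":
    item = {"item_id": 42, "display_name": "Live cast", "price_micros": 1500000000}
    print(to_proto3_json(item, int64_fields=("price_micros",)))
```

A REST client calling through the transcoder would therefore see `itemId` and `displayName` rather than the snake_case names used in the proto file.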

Slide 20

Slide 20 text

Why Avoid Istio? https://istio.io/

Slide 21

Slide 21 text

Why Avoid Istio? > Spring provides the same functionality • Traffic metrics • Distributed tracing • Fault injection, circuit breaker, retry > Performance

Slide 22

Slide 22 text

Why Not Use Istio https://istio.io/docs/concepts/performance-and-scalability/ 13.7ms 1.89ms

Slide 23

Slide 23 text

Database > LINE has managed MySQL • with ACL

Slide 24

Slide 24 text

ACL Is Incompatible with Scaler

Slide 25

Slide 25 text

ACL Manager > API for managing DB ACL > Hooks autoscaler to add or delete a node
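The idea of hooking the autoscaler into the database ACL can be sketched as below. The slide does not describe the real ACL Manager API, so everything here is an assumption for illustration: the class and method names are hypothetical, the ACL is assumed to be an allow-list of node IP addresses, and it is kept in memory instead of calling a real database service.

```python
class AclManager:
    """In-memory stand-in for an API that manages a database allow-list."""

    def __init__(self):
        self._allowed = set()

    def allow(self, ip):
        self._allowed.add(ip)

    def revoke(self, ip):
        # discard() keeps revocation idempotent if the hook fires twice.
        self._allowed.discard(ip)

    def is_allowed(self, ip):
        return ip in self._allowed


class AutoscalerHook:
    """Callbacks a cluster autoscaler integration could invoke (hypothetical)."""

    def __init__(self, acl):
        self._acl = acl

    def on_node_added(self, node_ip):
        # Grant DB access before pods on the new node start connecting.
        self._acl.allow(node_ip)

    def on_node_removed(self, node_ip):
        # Remove access so a recycled IP cannot reach the database.
        self._acl.revoke(node_ip)


if __name__ == "__main__":
    acl = AclManager()
    hook = AutoscalerHook(acl)
    hook.on_node_added("10.1.2.3")
    print(acl.is_allowed("10.1.2.3"))
    hook.on_node_removed("10.1.2.3")
    print(acl.is_allowed("10.1.2.3"))
```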

Slide 26

Slide 26 text

Agenda > Kubernetes Service for LINE Services > Our Service Architecture Using gRPC > Service Metrics/Observations > Log Aggregation

Slide 27

Slide 27 text

Prometheus > Placed in our Kubernetes cluster > Works as StatefulSet

Slide 28

Slide 28 text

Prometheus > Persistent Volume is hard to use > We are trying remote storage instead • TSDB developed in-house

Slide 29

Slide 29 text

Too Many Metrics > Too many metrics in Prometheus with Kubernetes • How to visualize? • Which metrics should we watch? > Head time series: 240,638

Slide 30

Slide 30 text

Too Many Metrics > kubernetes-mixin adds many settings • Grafana dashboards • Pod crash loop detection • Node hang detection • Volume usage prediction for PersistentVolumes • etc.

Slide 31

Slide 31 text

Spring Application Metrics > Spring Boot Actuator > OpenCensus • Metrics • Distributed traces • Multiple languages

Slide 32

Slide 32 text

OpenCensus Built into gRPC > Metrics > Distributed tracing

Slide 33

Slide 33 text

Agenda > Kubernetes Service for LINE Services > Our Service Architecture Using gRPC > Service Metrics/Observations > Log Aggregation

Slide 34

Slide 34 text

Log Aggregation > Fluentd + Elasticsearch + Kibana • For generic logs > IMON • For error log alerts

Slide 35

Slide 35 text

Log Aggregation in Kubernetes Cluster > Old pod logs may be removed > Logs should be stored in external storage

Slide 36

Slide 36 text

EFK (Elasticsearch + Fluentd + Kibana)

Slide 37

Slide 37 text

Fluentd > Configuration references • Knative requirements • Kubernetes addon (not for production) • Manifests in Helm chart > Plugins • fluent-plugin-elasticsearch • fluent-plugin-kubernetes_metadata_filter • fluent-plugin-detect-exceptions • fluent-plugin-multi-format-parser

Slide 38

Slide 38 text

IMON > Log monitoring system > Developed in-house > Sending alerts for warning/error logs

Slide 39

Slide 39 text

Agenda > Kubernetes Service for LINE Services > Our Service Architecture Using gRPC > Service Metrics/Observations > Log Aggregation

Slide 40

Slide 40 text

Our Booth > Kotlin coroutines > CI/CD with Drone CI + Kubernetes + GitHub > Development flow for sharing proto files between server and client

Slide 41

Slide 41 text

Related Sessions Kubernetes Client TSDB