LINE TODAY - A Video Content Platform with a Microservice Architecture Supporting Tens of Millions of Active Users

LINE Developers Taiwan

October 18, 2022
Transcript

  1. LINE TODAY - A Video Content Platform with a Microservice Architecture Supporting Tens of Millions of Active Users
    Libra Huang - LINE TODAY
    10/18/2022
  2. Agenda
    • LINE TODAY and its architecture
    • How mini services and K8S change our development and operation
    • Lessons learnt
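  3. LINE TODAY Stays With You Today
    Timeline labels, in slide order: personalized messages, 8:00, finance and weather, 12:30, international / domestic / entertainment, lifestyle content, 15:00, topics / polls / movies, official account, 18:00, TODAY看世界 (TODAY Sees the World), 20:00, baseball / NBA / concerts, news content, live streaming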
  4. Taiwan, Thailand, Hong Kong

  5. LINE TODAY Taiwan 18M MAU

  6. LINE TODAY Architecture (diagram)
    Components: CDN, Object Storage, In-Memory Cache, Backend Server, Frontend Server, Cache Server, Mini Service, Feeding Service, Internal CMS, External CMS, Vue.js, Third Party API, Content Provider, Internal Editor, External Editor, Report Service, Data Warehouse, ML, Data Analysis
  7. LINE TODAY Architecture (same diagram as slide 6, annotated with 96% at the CDN and 4%)
  8-12. LINE TODAY Architecture (same diagram as slide 6, repeated on each slide)
  13. Agenda
    • LINE TODAY and its architecture
    • How mini services and K8S change our development and operation
      • Refactor to mini services
    • Lessons learnt
  14. How to improve development and deployment efficiency?
    Problem
    - A small change requires rebuilding and deploying the entire system
    Solutions
    - Break down coarse-grained deployments into functionally cohesive mini services
    - Move to Kubernetes
    Diagram labels: Module (repeated), deployment (e.g. war file), OCI image (three)
  15. Migrate to mini services and Kubernetes (diagram)
    Components: Frontend Server, Cache Server, Web Server (VM), API Server (VM), Ingress Controller, Kubernetes, Mini Services (Article Service, Subscription Service, Interaction Service), Observability (logs, metrics, tracing), CD
  16. Agenda
    • LINE TODAY and its architecture
    • How mini services and K8S change our development and operation
      • Refactor to mini services
      • Refine CI/CD
    • Lessons learnt
  17. Build and test what was changed
    Diagram: service1, service2, libA, libB
    Changing service1 => rebuild service1
    Changing libA => rebuild libA, service1, service2
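
The rebuild rule on slide 17 amounts to walking the dependency graph in reverse. The sketch below is one minimal way to compute it, not the build tooling used by the team. The `DEPENDS_ON` map and the function name are hypothetical, the edges (both services depend on libA) are inferred from the slide's two examples, and libB is left out because the slide does not show what depends on it.

```python
from collections import defaultdict, deque

# Hypothetical dependency map: target -> targets it depends on.
DEPENDS_ON = {
    "service1": ["libA"],
    "service2": ["libA"],
    "libA": [],
}


def affected_targets(changed, depends_on):
    """Return the changed targets plus everything that depends on them,
    directly or transitively, in sorted order."""
    # Invert the map so we can look up "who depends on X".
    dependents = defaultdict(set)
    for target, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(target)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return sorted(affected)


print(affected_targets(["service1"], DEPENDS_ON))  # ['service1']
print(affected_targets(["libA"], DEPENDS_ON))      # ['libA', 'service1', 'service2']
```

A CI pipeline could feed the list of changed directories into such a function and build and test only the returned targets.
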
  18. Manage deployment via GitOps
    Deploy process (diagram): update the manifest in the git repo with the service version(s); ArgoCD + Kustomize
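
The "update manifest" step on slide 18 can be pictured as a commit that bumps an image tag in a `kustomization.yaml`. The sketch below is an illustration under assumptions, not the team's deploy script: the image names, tags, and the `set_image_tag` helper are hypothetical, and the line-based rewrite assumes each `images:` entry lists `name` before `newTag`. In practice the `kustomize edit set image` command covers the same job.

```python
import re

# Hypothetical kustomization file; image names and tags are made up.
EXAMPLE = """\
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
images:
  - name: article-service
    newTag: "1.0.0"
  - name: subscription-service
    newTag: "2.3.1"
"""


def set_image_tag(kustomization_text, image_name, new_tag):
    """Return the text with newTag replaced for the entry named image_name.

    Assumes `name:` appears before `newTag:` inside each images entry.
    """
    lines = kustomization_text.splitlines()
    in_target = False
    for i, line in enumerate(lines):
        name_match = re.match(r"^\s*-\s*name:\s*(\S+)\s*$", line)
        if name_match:
            # A new list entry starts; remember whether it is the one we want.
            in_target = name_match.group(1) == image_name
            continue
        tag_match = re.match(r"^(\s*newTag:\s*)\S+\s*$", line)
        if in_target and tag_match:
            # Quote the tag so YAML never reads it as a number.
            lines[i] = tag_match.group(1) + '"' + new_tag + '"'
            in_target = False
    return "\n".join(lines) + "\n"


updated = set_image_tag(EXAMPLE, "article-service", "1.0.1")
print(updated)
```

Committing the updated file to the git repo is what a GitOps tool such as ArgoCD would then pick up and apply to the cluster.
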
  19. Agenda
    • LINE TODAY and its architecture
    • How mini services and K8S change our development and operation
      • Refactor to mini services
      • Refine CI/CD
      • Integrate with observability
    • Lessons learnt
  20. Observability improves operation efficiencies (diagram)
    Components: Frontend Server, Cache Server, Web Server (VM), API Server (VM), Ingress Controller, Kubernetes, Mini Services (Article Mini Service, Subscription Mini Service, Interaction Mini Service), Observability (logs, metrics, tracing), CD
  21. Observability - service RPS / latency metrics

  22. Observability - metrics week-over-week

  23. Observability - abnormal spikes
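
Slides 22 and 23 show dashboards only. As a rough illustration of how a week-over-week comparison can surface an abnormal spike, the sketch below flags samples that exceed the same sample from one week earlier by a chosen ratio. The function name, the sample values, and the 1.5x threshold are all hypothetical and are not taken from the talk.

```python
def week_over_week_spikes(current, previous_week, threshold=1.5):
    """Return the indexes where the current value is more than
    `threshold` times the value recorded one week earlier."""
    spikes = []
    for i, (now, before) in enumerate(zip(current, previous_week)):
        # Skip samples with no baseline to avoid dividing by zero.
        if before > 0 and now / before > threshold:
            spikes.append(i)
    return spikes


# Hypothetical requests-per-second samples at the same times of day.
this_week = [100, 110, 400, 105]
last_week = [95, 100, 120, 100]
print(week_over_week_spikes(this_week, last_week))  # [2]
```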

  24. Troubleshooting - 1. receive alert from slack

  25. Troubleshooting - 2. link to alert panel and access logs

  26. Troubleshooting - 3. open trace viewer

  27. Troubleshooting via observability
    Signals: Alerts, Metrics, Logs, Traces, Logs
    Steps: received alerts from Slack; check error source and time period; inspect access logs; open trace viewer; jump to service logs of the trace
    Links between Metrics, Traces and Logs (diagram labels): Exemplars, Split view with labels, Metric queries, Span metrics processor, Trace to logs, Followed Trace ID
  28. Agenda
    • LINE TODAY and its architecture
    • How mini services and K8S change our development and operation
      • Refactor to mini services
      • Refine CI/CD
      • Integrate with observability
      • Leverage K8S CronJob
    • Lessons learnt
  29. Run periodic simple tasks (Java) on Kubernetes
    • Requirements
      • run at a specific time / interval
      • simple
      • concurrency control
      • running history and logs
      • monitor
      • easy to run in local and test env
    • Options
      • Spring @Scheduled
      • Quartz
      • Spring Cloud Data Flow
      • Airflow
      • K8S CronJob
  30. Use K8S CronJob to run periodic tasks
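
To show how the requirements on slide 29 line up with CronJob fields, the sketch below builds a `batch/v1` CronJob manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. It is an assumption-laden example, not the manifest from the talk: the job name, image, schedule, and history limits are hypothetical.

```python
import json


def cronjob_manifest(name, image, schedule):
    """Build a minimal Kubernetes CronJob manifest as a dictionary."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            # Requirement: run at a specific time / interval.
            "schedule": schedule,
            # Requirement: concurrency control. "Forbid" skips a run
            # while the previous one is still in progress.
            "concurrencyPolicy": "Forbid",
            # Requirement: running history. Finished Jobs are kept so
            # their status and pod logs stay available.
            "successfulJobsHistoryLimit": 3,
            "failedJobsHistoryLimit": 3,
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [{"name": name, "image": image}],
                        }
                    }
                }
            },
        },
    }


# Hypothetical task: run every day at 02:30.
manifest = cronjob_manifest(
    "report-task", "registry.example.com/report-task:1.0.0", "30 2 * * *"
)
print(json.dumps(manifest, indent=2))
```

Because the task is an ordinary container, the same image can also be run once locally or in a test environment, which matches the "easy to run in local and test env" requirement.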

  31. K8S CronJob metrics

  32. Agenda
    • LINE TODAY and its architecture
    • How mini services and K8S change our development and operation
    • Lessons learnt
  33. Issue - intermittent errors during rolling update
    Diagram: K8S API Server; worker node (kube-proxy, kubelet, Pod); worker node (kube-proxy)
    1a. delete pod
    1b. remove pod from service endpoint
  34. Solution - graceful shutdown
    Container shutdown (diagram labels): Pre-stop hook, Main container process, SIGTERM, SIGKILL, Container killed (if running)
    Configured in: deployment manifest, spring boot application.yaml
    • Existing services allowed to complete
    • No new requests permitted
    Diagram: K8S API Server; worker node (kube-proxy, kubelet, Pod); worker node (kube-proxy); delete pod; remove pod from service endpoint
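
The two bullets on slide 34 describe the behaviour a process needs after SIGTERM: stop taking new requests and let the ones in flight finish before SIGKILL arrives. The slide does not show the actual Spring Boot or manifest settings, so the sketch below only illustrates the idea in a language-neutral way. The `GracefulServer` class and its method names are hypothetical.

```python
import signal
import threading


class GracefulServer:
    """Tracks in-flight requests and refuses new ones after SIGTERM."""

    def __init__(self):
        self._shutting_down = threading.Event()
        self._in_flight = 0
        self._cond = threading.Condition()

    def handle_sigterm(self, signum=None, frame=None):
        # Only set a flag here; signal handlers should stay minimal.
        self._shutting_down.set()

    def try_begin_request(self):
        """Return True if the request may start, False once shutting down."""
        with self._cond:
            if self._shutting_down.is_set():
                return False
            self._in_flight += 1
            return True

    def end_request(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def wait_until_drained(self, timeout):
        """Block until no request is in flight; False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout=timeout)


def install_handler(server):
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, server.handle_sigterm)


# Demo: the signal is simulated by calling the handler directly.
server = GracefulServer()
print(server.try_begin_request())       # True: accepted before shutdown
server.handle_sigterm(signal.SIGTERM, None)
print(server.try_begin_request())       # False: no new requests permitted
server.end_request()                    # the in-flight request completes
print(server.wait_until_drained(1.0))   # True: safe to exit
```

The drain timeout has to be shorter than the pod's termination grace period, otherwise the container is killed before the wait returns.
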
  35. Issue - unpredictable request spikes
    • pod removed from endpoint at 30ish seconds
    • fewer pods available to serve requests
    • requests queue up
    • pod restarted at 60ish seconds
    • downward spiral
  36. Options to handle request spikes - it depends
    • Overprovision
      • $$$$
    • Auto scaling
      • pod - 20+ seconds
      • node - ~5 minutes
      • serverless (lambda) - seconds
    • Protection via ingress controller / api gateway
      • circuit breaker - 503
      • rate-limit - 429
    • Improve design
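
Of the protections listed on slide 36, rate limiting is the easiest to sketch. The example below uses a token bucket, which is one common technique and not necessarily what the team's ingress controller or API gateway does. The class name, rate, and capacity are hypothetical, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now):
        # Refill according to the time elapsed since the last call.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def status_for(bucket, now):
    """Return 200 when the request is admitted, 429 when it is rate-limited."""
    return 200 if bucket.allow(now) else 429


bucket = TokenBucket(rate_per_sec=2, capacity=2)
print([status_for(bucket, 0.0) for _ in range(3)])  # [200, 200, 429]
print(status_for(bucket, 0.5))                      # 200 (one token refilled)
```

Rejecting early with 429 keeps excess requests from queuing on the pods, which is the queue build-up described on slide 35.
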
  37. Issue - job killed without error log

  38. Root cause: Linux kernel memory leak; reboot

  39. Summary
    • LINE TODAY and its architecture
    • Mini services and K8S help dev / ops efficiency for large systems
      • Refactor to mini services
      • Refine CI/CD
      • Integrate with observability
      • Leverage K8S CronJob
    • Build in-depth DevOps and K8S skills
  40. Thank you