LINE MediaStorage at scale

LINE DevDay 2020

November 25, 2020

Transcript

  1. None
  2. Agenda › Introduction to the LINE MediaPlatform › MediaStorage Forward Placement › Needs and Challenges › Benefits › Future works
  3. MediaPlatform OBS LIVE Vision-AI Delivery Processing Storage Live Streaming VOD

    OCR Adult Filter Object, QR Detection Functions Chat (OBject Storage)
  5. MediaPlatform OBS(OBject Storage) in numbers: Traffic, Storage Volume Peak Traffic

    500Gbps+ Storage Volume 60PB+ Requests / Day 7B+
  6. MediaPlatform OBS(OBject Storage) in numbers: Traffic, Storage Volume Traffic /

    Day 500Gbps+ Storage Volume 60PB+ Requests / Day 7B+ 2020. 1Q
  7. MediaPlatform OBS(OBject Storage) in numbers: Traffic of NewYear! Japan NewYear!

    (GMT +9) Taiwan NewYear! (GMT +8) 2x than usual 2.7x than usual
  8. MediaPlatform OBS(OBject Storage) in numbers: Traffic LINE Total Traffic OBS

    Traffic
  9. MediaPlatform OBS(OBject Storage) in numbers: Traffic LINE Total Traffic OBS

    Traffic 55%
  10. MediaPlatform OBS(OBject Storage) in numbers: Servers OBS Delivery Processing Storage

    2013 2020 (270EA+) 3PB 60PB (2900EA+) (x20) (x10.7) (+7) (200EA+) 10M+ requests/day 7B+ requests/day (750EA+) (x70) (x3.75) 340EA+ 250EA+ (x0.73)
  11. MediaPlatform OBS(OBject Storage) Internals ServiceBO Cache Processor Users Auth Storage

  14. GERMANY KOREA SINGAPORE THE UNITED STATES MediaPlatform OBS(OBject Storage) PoP(Point

    of Presence) JAPAN
  19. Storage Forward Placement Needs

  20. Storage Forward Placement OBS(OBject Storage) PoP(Point of Presence) ServiceBO Cache

    Auth Storage Processor
  21. Storage Forward Placement OBS(OBject Storage) PoP(Point of Presence) ServiceBO Cache

    Auth Storage Processor ServiceBO Cache ServiceBO Cache ServiceBO Cache ServiceBO Cache Japan Korea Singapore US Germany Japan
  22. Private Network Storage Forward Placement Needs #1: Reduce response speed

    differences between users ServiceBO Users Storage Cache
  23. Private Network Storage Forward Placement Needs #1: Reduce response speed

    differences between users ServiceBO Users Storage Cache (1) Download
  24. Private Network Storage Forward Placement Needs #1: Reduce response speed

    differences between users ServiceBO Users Storage Cache (1) Download (2) Is the object cached?
  25. Private Network Storage Forward Placement Needs #1: Reduce response speed

    differences between users ServiceBO Users Storage Cache (1) Download (2) Is the object cached? (3-1) If Yes: The cache server responds with the cached object.
  26. Storage Forward Placement Needs #1: Reduce response speed differences between

    users ServiceBO Users Storage Cache (1) Download (2) Is the object cached? Private Network
  27. Storage Forward Placement Needs #1: Reduce response speed differences between

    users ServiceBO Users Storage Cache (1) Download (2) Is the object cached? (3-2) If No: The cache server responds after pulling the object from the storage. Cached! Private Network
  28. Storage Forward Placement Needs #1: Reduce response speed differences between

    users ServiceBO Users Storage Cache (1) Download (2) Is the object cached? Good (3-2) If No: The cache server responds after pulling the object from the storage. Cached! Private Network
  29. Storage Forward Placement Needs #1: Reduce response speed differences between

    users Cache
  30. Storage Forward Placement Needs #1: Reduce response speed differences between

    users Cache 1:1 1:N Chatroom <
  31. Storage Forward Placement Needs #1: Reduce response speed differences between

    users Cache 1:N Chat Users Cache Hit 1:1 1:N Chatroom <
  32. Storage Forward Placement Needs #1: Reduce response speed differences between

    users Cache Storage 1:1 Chat Users Good 1:N Chat Users Cache Hit 1:1 1:N Chatroom <
  33. Storage Forward Placement Needs #1: Reduce response speed differences between

    users Cache 1:N Chat Users B, C, D … Cache Hit 1:N Chat User A Storage Good Cache Miss
  34. Storage Forward Placement Needs #1: Reduce response speed differences between

    users Cache Storage Singapore Users Singapore Users Singapore Japan ? ?
  35. Storage Forward Placement Needs #1: Reduce response speed differences between

    users Cache Storage Singapore Users Singapore Users Singapore Singapore
  36. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic Leased Line

  37. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic Leased Line

    Cache Storage
  38. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic Leased Line

    Cache Storage HitRatio: 70%
  39. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic Leased Line

    Cache Storage HitRatio: 70% TH Users
  40. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic Leased Line

    Cache Storage HitRatio: 70% Cache Miss: 30% TH Users
  42. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic - by

    Example Cache Storage TH Users HitRatio: 70% Cache Miss(30%) 10G
  43. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic - by

    Example Cache Storage TH Users HitRatio: 70% Cache Miss(30%) 10G 50G 10G 20G 30G 30G 30G 10G 15G 5G 10G 20G
  44. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic - by

    Example Cache Storage TH Users HitRatio: 70% Cache Miss(30%) 30G 50G 10G 20G 30G 30G 30G 10G 15G 5G 10G 20G
  45. Storage Forward Placement Needs #2: Saving Inter-IDC Traffic Storage TH

    Users Cache HitRatio: 70% Cache Miss: 30% Storage Leased Line
  46. Storage Forward Placement Needs #3: Minimize Cache and Leased Line

    Dependencies Leased Line TH Users Cache Storage
  49. Storage Forward Placement Needs #3: Minimize Cache and Leased Line

    Dependencies Leased Line TH Users Cache Storage TW Users JP Users
  50. Storage Forward Placement Needs #3: Minimize Cache and Leased Line

    Dependencies Cache Hit Ratio Leased Line Traffic(SG-JP)
  51. Storage Forward Placement Needs #3: Minimize Cache and Leased Line

    Dependencies Leased Line TH Users Cache Storage
  53. Storage Forward Placement Needs #3: Minimize Cache and Leased Line

    Dependencies Leased Line TH Users Cache Storage TW Users JP Users Storage
  56. Storage Forward Placement Needs #4: Active-Active Clustered Storage TH Users

    Cache Storage Japan#1 Japan#2
  57. Storage Forward Placement Needs #4: Active-Active Clustered Storage TH Users

    Cache Storage Japan#1 Active DNS Update Japan#2
  58. Storage Forward Placement Needs #4: Active-Active Clustered Storage TH Users

    Cache Storage Japan#1 ObjectId Region A Japan#1 B Japan#2 C Singapore Region Mapping Table Active DNS Update Japan#2
  59. Storage Forward Placement Needs #4: Active-Active Clustered Storage TH Users

    Cache Storage Japan#1 ObjectId Region A Japan#1 B Japan#2 C Singapore Region Mapping Table 35% 50% 15% Active DNS Update Japan#2
  60. Storage Forward Placement Technical Decisions & Challenges

  61. MediaStorage MediaStorage Requirements › Resiliency › Scalability › Consistency ›

    Fault-Tolerant OBS Delivery Processing Storage
  62. MediaStorage MediaStorage Requirements › Resiliency › Scalability › Consistency ›

    Fault-Tolerant OBS Delivery Processing Storage › Cost-Efficiency
  63. MediaStorage Ceph, The Next Generation of the OBS Storage

  64. MediaStorage Ceph, Which one to use? Object Storage (Rados Gateway)

    Object: Key-Value mappings. Amazon S3 RESTful API supported. Graceful shutdown. File Storage (CephFS) File: FileSystem. FUSE API supported. MDS required. Block Storage Block: Sequence of bytes. It provides a block device image.
  65. MediaStorage Ceph Internals OBS FUSE REST API Ceph Cluster Key-Value

    DB Data MDS MetaPool DataPool RGW IndexPool DataPool POSIX Amazon S3 OSD Object Storage File Storage
  67. MediaStorage Why Ceph Object Storage #1: S3 Compatibility RESTful API

    S3 API OBS OSD ServiceBO Rados Gateway Ceph RADOS
  68. MediaStorage Why Ceph Object Storage #1: S3 Compatibility RESTful API

    S3 API OBS S3 API OSD ServiceBO Rados Gateway Ceph RADOS
  69. MediaStorage Why Ceph Object Storage #2: Reusing common functions between

    OBS and Ceph Object Storage OSD Ceph RADOS Storage Dedicated Protocol Bucket-id Resolving Algorithm Storage I/O Thread Management Storage Class Transition Object Expiration … RPC OBS ServiceBO Commercial Storage Rados Gateway
  70. MediaStorage Why Ceph Object Storage #3: Storage Class Transition High

    Performance Storage Standard Storage High-Density Storage IOPS, Cost Capacity MediaStorage Users After 7 Days After 14 Days
  72. MediaStorage Why Ceph Object Storage #4: CephFS MDS Active Promotion

    Active Standby MDS(Meta Data Server) Active Sync the active MDS states
  73. MediaStorage Why Ceph Object Storage #4: CephFS MDS Active Promotion

    Active Standby MDS(Meta Data Server) Active Sync the active MDS states (journal replaying: takes 3~4 seconds) Service BO read() write() read() write() read() write() Blocked!
  74. MediaStorage Architectural concerns on the OBS side › Object Immutability

    › Which services will be moved to Ceph? › Storage I/O Adapter Library OBS Delivery Processing Storage
  75. MediaStorage Enhanced Storage I/O Layer Design ServiceBO Commercial Storage Ceph

    Storage Library AWS S3 SDK Business Functions Common Functions Storage / Traffic Cost Optimization Storage I/O
  76. MediaStorage Enhanced Storage I/O Layer Design ServiceBO Commercial Storage Ceph

    Storage Library AWS S3 SDK RPC HTTP Business Functions Common Functions Storage / Traffic Cost Optimization Storage I/O
  77. MediaStorage Enhanced Storage I/O Layer Design ServiceBO Commercial Storage Ceph

    Storage Library AWS S3 SDK RPC HTTP Blocking I/O NonBlocking I/O Business Functions Common Functions Storage / Traffic Cost Optimization Storage I/O
  78. MediaStorage Enhanced Storage I/O Layer Design ServiceBO Commercial Storage Ceph

    Storage Library AWS S3 SDK RPC HTTP Blocking I/O NonBlocking I/O PUSH PULL Business Functions Common Functions Storage / Traffic Cost Optimization Storage I/O
  80. MediaStorage Enhanced Storage I/O Layer Design ServiceBO Commercial Storage Ceph

    Storage Library AWS S3 SDK RPC HTTP Blocking I/O NonBlocking I/O PUSH PULL Business Functions Common Functions Storage / Traffic Cost Optimization Storage I/O Video Processing Server Image Processing Server
  81. MediaStorage Enhanced Storage I/O Layer Design ServiceBO Business Functions Common

    Functions Storage / Traffic Cost Optimization Storage Abstraction Asynchronous Interface Storage I/O Adapter
  82. MediaStorage Enhanced Storage I/O Layer Design ServiceBO Business Functions Common

    Functions Storage / Traffic Cost Optimization Storage Abstraction Asynchronous Interface Storage I/O Adapter Commercial Storage Ceph RPC HTTP Blocking I/O NonBlocking I/O PUSH PULL Storage I/O Adapter Publisher Queue Subscriber request(n) Library AWS S3 Subscription onNext(data)
  83. MediaStorage Enhanced Storage I/O Layer Design ServiceBO Business Functions Common

    Functions Storage / Traffic Cost Optimization Storage Abstraction Asynchronous Interface Storage I/O Adapter Commercial Storage Ceph RPC HTTP Blocking I/O NonBlocking I/O PUSH PULL Storage I/O Adapter Publisher Queue Subscriber request(n) Library AWS S3 Subscription onNext(data) Image Processing Server Video Processing Server Storage I/O Adapter
  84. MediaStorage Enhanced Storage I/O Layer Design copy() OBS ServiceBO Commercial

    Storage
  85. MediaStorage Enhanced Storage I/O Layer Design copy() OBS ServiceBO Commercial

    Storage Friend A Copy
  86. MediaStorage Enhanced Storage I/O Layer Design copy() OBS ServiceBO Commercial

    Storage Friend A Copy Friend B Copy
  87. MediaStorage Enhanced Storage I/O Layer Design copy() OBS ServiceBO Commercial

    Storage Friend A Copy Friend B Copy Friend C Copy
  88. MediaStorage Enhanced Storage I/O Layer Design copy() OBS ServiceBO Commercial

    Storage Friend A Copy Friend B Copy Friend C Copy Ceph
  89. MediaStorage Enhanced Storage I/O Layer Design Commercial Storage Ceph Object

    Storage copy? copy? ServiceBO
  90. MediaStorage Enhanced Storage I/O Layer Design Commercial Storage Ceph Object

    Storage request(n) ServiceBO Storage Abstraction Asynchronous Interface Storage I/O Adapter CommercialPublisher CephSubscriber onNext(data) read() write()
  91. MediaStorage Enhanced Storage I/O Layer Design Commercial Storage Ceph Object

    Storage request(n) ServiceBO Storage Abstraction Asynchronous Interface Storage I/O Adapter CommercialPublisher CephSubscriber onNext(data) read() write() 100Mbps 50Mbps
  92. MediaStorage Resumable Upload Upload

  93. MediaStorage Resumable Upload Upload Network Failure

  94. MediaStorage Resumable Upload Upload Network Failure Click the Resend Button

  95. MediaStorage Resumable Upload Upload Network Failure Click the Resend Button

    Resumable Upload
  96. MediaStorage Resumable Upload Commercial File Storage Ceph Object Storage Upload

  98. MediaStorage Resumable Upload Commercial File Storage Ceph Object Storage Upload

    Network Failure Data Loss
  99. MediaStorage Resumable Upload - Multipart PUT User Packets OBS(Service BO)

    Ceph Object Storage Part #1 Part #2 Part #3 Part #4 offset 0 64K 128K 192K 256K Aggregation & Make each part Object #1 Object #2 Object #3 Object #4
  100. MediaStorage Resumable Upload - Multipart PUT User Packets OBS(Service BO)

    Ceph Object Storage Part #1 Part #2 Part #3 Part #4 offset 0 64K 128K 192K 256K Aggregation & Make each part Object #1 Object #2 Object #3 Object #4 Complete
  101. MediaStorage Resumable Upload - Multipart PUT RTT OBS(Service BO) Ceph

    Object Storage Initiate Multipart Upload Request Accepts the request and returns an UploadID Upload Parts Complete Multipart Upload Request Make Object RTT = Parts + 2 ?
  102. MediaStorage Resumable Upload - Improved Multipart PUT RTT OBS(Service BO)

    Ceph Object Storage Create UploadID and Upload with the 1st part Upload Parts(2nd~last) Complete Multipart Upload Request Make Object RTT = Parts + 1
  103. MediaStorage Which services will be moved to Ceph? › Essential services even in a disaster situation › With a limited storage period › Should have high locality
  104. Message Services MediaStorage Which services will be moved to Ceph?

  105. Storage Forward Placement Future works

  106. Storage Forward Placement Future works › API / Protocol Enhancement › Find more Regions & Services › Change the Message MediaStorage
  107. Thank you
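
As a rough illustration of the round-trip counts on slides 101 and 102, the sketch below counts the requests needed for a multipart upload. The 64K part size is read off the offsets on slide 99. The function names, and the assumption that each part costs exactly one round trip, are illustrative only and are not taken from the OBS implementation.

```python
import math

# 64K parts, as suggested by the offsets (0, 64K, 128K, ...) on slide 99.
PART_SIZE = 64 * 1024


def part_count(object_size: int, part_size: int = PART_SIZE) -> int:
    """Number of parts needed to upload object_size bytes."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    return math.ceil(object_size / part_size)


def standard_round_trips(parts: int) -> int:
    """Initiate request + one request per part + Complete request."""
    return parts + 2


def improved_round_trips(parts: int) -> int:
    """The first request creates the UploadID and carries part 1,
    then parts 2..last are uploaded, then one Complete request."""
    return parts + 1


if __name__ == "__main__":
    parts = part_count(256 * 1024)
    print(parts, standard_round_trips(parts), improved_round_trips(parts))
```

For the four 64K parts drawn on slide 99, this gives 6 round trips for the standard flow and 5 for the improved one. The single saved round trip matters most for small objects, where the part count is low.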