Antman: Saving storage from a multimedia perspective

LINE DevDay 2020

November 25, 2020

Transcript

  1. [Bar chart: file size in bytes, axis 0 to 1200, JPEG vs. HEIF; labels: File Size, x2 ▲ Compression rate]
  2. How LINE Handles Media Files: LINE Family Services | OBS (OBject Storage) | In-house Server Platform: Storage, Delivery, Processing
  3. How LINE Handles Media Files: LINE Family Services | OBS (OBject Storage) | In-house Server Platform: Storage (60 PB ▲), Delivery, Processing. Note: 1 PB = 1,024 TB; 1 TB = 1,024 GB
  4. What Happens If Storage Gets Bigger? Operation cost ▲: $1.4 million/month 1). Space in data center ▼. 1) Cost estimated using Amazon S3 storage pricing.
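The unit conversions on slide 3 and the cost estimate on slide 4 can be checked with a few lines of arithmetic. The sketch below is not from the talk: the flat rate `PRICE_PER_GB_MONTH = 0.023` is an assumption made for illustration, since the slides only say that Amazon S3 storage pricing was used, and the function name is hypothetical.

```python
# Assumed flat rate in USD per GB per month; the talk does not state the rate it used.
PRICE_PER_GB_MONTH = 0.023

TB_PER_PB = 1024  # from the slide: 1 PB = 1,024 TB
GB_PER_TB = 1024  # from the slide: 1 TB = 1,024 GB


def monthly_storage_cost(petabytes: float, price_per_gb: float = PRICE_PER_GB_MONTH) -> float:
    """Return the estimated monthly cost in USD for the given storage volume."""
    gigabytes = petabytes * TB_PER_PB * GB_PER_TB
    return gigabytes * price_per_gb


if __name__ == "__main__":
    cost = monthly_storage_cost(60)
    print(f"60 PB: about ${cost / 1e6:.2f} million per month")
```

With this assumed rate, 60 PB comes out at about $1.45 million per month, close to the $1.4 million shown on the slide.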
  5. LINE Services That Use Storage Intensively: LINE Album feature, LINE Timeline, …, LINE Messenger. Numerous JPEG files stored in OBS Storage.
  6. Goals: operation cost ▼, space in data center ▲ [Diagram: storage holding text files and JPEG files, then the same storage with the JPEG files replaced by HEIF files]
  7. Results: the usage of Media Storage for LINE Album feature [Line chart: y-axis 15 PB to 45 PB; x-axis Feb '19 to Sep '20; marker: Antman started]
  8. Results: the usage of Media Storage for LINE Album feature [Line chart: y-axis 15 PB to 45 PB; x-axis Feb '19 to Sep '20; series: with Antman, without Antman; marker: Antman started]
  9. Results: the usage of Media Storage for LINE Album feature [Line chart: y-axis 15 PB to 45 PB; x-axis Feb '19 to Sep '20; series: with Antman, without Antman; marker: Antman started; annotation: 21 PB saved]
  10. Agenda › Struggling for Image Format Compatibility › Obsession with Image Quality › Antman Operation Cost › Results
  11. Reducing Storage Usage: Original JPEG (deleted = reduced storage usage) to HEIF 1). 1) https://en.wikipedia.org/wiki/High_Efficiency_Image_File_Format
  12. Reducing Storage Usage: Original JPEG to HEIF. HEIF was introduced in 2015, making it a newer image format than JPEG (1992), and can reduce file size by about 50% compared with JPEG while maintaining equivalent image quality. Sources: 1) https://en.wikipedia.org/wiki/High_Efficiency_Image_File_Format 2) https://en.wikipedia.org/wiki/High_Efficiency_Image_File_Format#JPEG_and_HEIF
  13. Reducing Storage Usage: HEIF to Device A (supports HEIF): OK. HEIF to Device B (does not support HEIF): ??
  14. Reducing Storage Usage: HEIF to Device A (supports HEIF): OK. HEIF converted to restored JPEG for Device B (does not support HEIF): OK.
  15. Reducing Storage Usage: User A (can read HEIF): OK. User B (can’t read HEIF): OK, via HEIF converted to restored JPEG. Need to find the optimal Quality Factor.
  16. [Diagram: disk usage along a quality factor axis from low to high; labels: HEIF, JPEG, JPEG, JPEG]
  17. [Diagram: disk usage along a quality factor axis from low to high; labels: HEIF, JPEG, JPEG, JPEG; marker: Optimal Quality Factor]
  18. Transcoding: HEIF (DQT backup in ‘free’ box) to restored JPEG. * LINE Engineering Blog (EN) https://engineering.linecorp.com/en/blog/developing-the-antman-project
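A JPEG file carries its quantization tables in DQT segments (marker `0xFFDB`). As a rough illustration of what a DQT backup has to capture, the sketch below collects the raw DQT payloads from a JPEG byte stream. It is not Antman's code: the function name `extract_dqt_segments` is hypothetical, and writing the payload into an HEIF ‘free’ box is not shown.

```python
import struct
from typing import List

SOI = 0xD8  # start of image
SOS = 0xDA  # start of scan: entropy-coded data follows
DQT = 0xDB  # define quantization table(s)


def extract_dqt_segments(jpeg: bytes) -> List[bytes]:
    """Return the payload of every DQT segment found before the first scan.

    Simplification: assumes every marker before SOS carries a length field,
    which holds for typical files but not for rare standalone markers.
    """
    if len(jpeg) < 2 or jpeg[0] != 0xFF or jpeg[1] != SOI:
        raise ValueError("not a JPEG stream: missing SOI marker")
    tables: List[bytes] = []
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker == SOS:
            break
        # Segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if length < 2 or pos + 2 + length > len(jpeg):
            raise ValueError("truncated or malformed segment")
        if marker == DQT:
            tables.append(jpeg[pos + 4:pos + 2 + length])
        pos += 2 + length
    return tables
```

Keeping the original tables is useful because a decoder that re-encodes to JPEG can then quantize with the same step sizes the original file used; whether Antman applies them exactly this way is not stated on the slide.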
  19. [Diagram: disk usage along a quality factor axis from low to high; labels: JPEG, HEIF, HEIF, HEIF]
  20. [Diagram: disk usage along a quality factor axis from low to high; labels: JPEG, HEIF, HEIF, HEIF]
  21. Find Best Quality Factor | Attempt #2: compare original JPEG and HEIF using PSNR 1). 1) PSNR: Peak Signal-to-Noise Ratio
  22. Find Best Quality Factor | Attempt #2: compare original JPEG, HEIF, and recovered JPEG using PSNR 1). 1) PSNR: Peak Signal-to-Noise Ratio
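For reference, PSNR is defined as 10 · log10(MAX² / MSE), where MSE is the mean squared error between the two images and MAX is the largest possible sample value (255 for 8-bit data). A minimal sketch over flat sequences of samples, with a hypothetical function name, might look like this:

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], candidate: Sequence[int], max_value: int = 255) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized sample sequences."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the candidate is closer to the reference; identical inputs give infinity.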
  23. Attempt #4 | Sliding Window: scan the image with a window, rate the PSNR of each window as good or bad, and summarize the PSNR results. * LINE Engineering Blog (EN) https://engineering.linecorp.com/en/blog/developing-the-antman-project
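A whole-image PSNR averages the error over every pixel, so a small but visible defect can be hidden by a large clean area. Scoring windows separately avoids that. The sketch below is one way to express the idea, not the implementation from the talk: the window size, step, 40 dB threshold, and the "worst window decides" summary are all assumptions, and the function names are hypothetical.

```python
import math
from typing import List

Image = List[List[int]]  # rows of 8-bit grayscale samples


def window_psnr(a: Image, b: Image, top: int, left: int, size: int) -> float:
    """PSNR in dB over one size x size window of two equally sized images."""
    sq_err = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            sq_err += (a[y][x] - b[y][x]) ** 2
    if sq_err == 0:
        return math.inf
    mse = sq_err / (size * size)
    return 10.0 * math.log10(255 ** 2 / mse)


def worst_window_psnr(a: Image, b: Image, size: int, step: int) -> float:
    """Slide a window over both images and return the lowest PSNR seen.

    Raises ValueError if the images are smaller than the window.
    """
    height, width = len(a), len(a[0])
    scores = [
        window_psnr(a, b, top, left, size)
        for top in range(0, height - size + 1, step)
        for left in range(0, width - size + 1, step)
    ]
    return min(scores)


def is_good(a: Image, b: Image, size: int = 8, step: int = 4, threshold_db: float = 40.0) -> bool:
    """Summarize: the image is good only if every window clears the threshold."""
    return worst_window_psnr(a, b, size, step) >= threshold_db
```

For example, a single fully wrong pixel in a 16 x 16 image scores about 24 dB over the whole image but about 18 dB in the 8 x 8 window that contains it, so the local check is the stricter one.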
  24. Process for Most Optimal Quality Factor [Diagram: quality factor axis from low to high with regions labeled "No quality loss" and "Quality loss"; attempt 1 at the initial quality factor]
  25. Process for Most Optimal Quality Factor [Diagram: same axis; attempts 1 and 2; adjust quality factor]
  26. Process for Most Optimal Quality Factor [Diagram: same axis; attempts 1 to 3; adjust quality factor]
  27. Process for Most Optimal Quality Factor [Diagram: same axis; attempts 1 to 4; adjust quality factor]
  28. Process for Most Optimal Quality Factor [Diagram: same axis; attempts 1 to 5; adjust quality factor]
  29. Process for Most Optimal Quality Factor [Diagram: same axis; attempts 1 to 5; best quality factor found]
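The slides show the quality factor being adjusted over several attempts until the boundary between "quality loss" and "no quality loss" is located, but they do not state the search strategy. One plausible reading is a bisection, sketched below under two assumptions that are not from the talk: quality loss is monotonic in the quality factor, and the best factor is the lowest one that shows no loss. The function name, the 1 to 100 range, and the `has_quality_loss` callback are hypothetical.

```python
from typing import Callable


def find_lowest_lossless_quality(
    has_quality_loss: Callable[[int], bool],
    low: int = 1,
    high: int = 100,
) -> int:
    """Return the smallest quality factor in [low, high] with no quality loss.

    Assumes loss is monotonic: if a factor shows no loss, every higher one
    shows none either. Raises ValueError if even `high` shows loss.
    """
    if has_quality_loss(high):
        raise ValueError("no quality factor in range avoids quality loss")
    while low < high:
        mid = (low + high) // 2
        if has_quality_loss(mid):
            low = mid + 1  # visible loss: move toward higher quality
        else:
            high = mid  # acceptable: try a lower factor for a smaller file
    return low
```

In practice `has_quality_loss` would encode the image at the given factor and compare it with the original, for example with a PSNR-based check. Each call is an encode, so halving the range per attempt keeps the number of encodes logarithmic in the range size.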
  30. HEIF File Size Distribution [Histogram: x-axis HEIF bytes / JPEG bytes (percent), 10% to 80%; y-axis number of samples]
  31. HEIF File Size Distribution [Histogram: x-axis HEIF bytes / JPEG bytes (percent), 10% to 80%; y-axis number of samples; annotation: generally 30% ~ 40%]
  32. 50%

  33. Antman Operation Cost | LINE Album feature: Years: 7+; max image resolution: 4K; images/day: 60 million
  34. Antman Operation Cost | LINE Album feature: Years: 7+; max image resolution: 4K; images/day: 60 million [Diagram: JPEG to HEIF conversion drawn with many CPUs]
  35. Share GPU Equipment [Diagram: Video Transcoder Server Group and Antman Server Group, each made up of GPU servers; state: busy]
  36. Share GPU Equipment [Diagram: Video Transcoder Server Group and Antman Server Group; state: idle]
  37. Share GPU Equipment [Diagram: Video Transcoder Server Group and Antman Server Group; action: Rent!]
  38. Share GPU Equipment [Diagram: Video Transcoder Server Group and Antman Server Group; event: Need more GPU!]
  39. Share GPU Equipment [Diagram: Video Transcoder Server Group and Antman Server Group; action: Return!]
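The rent and return steps on these slides can be modeled as moving servers between two pools. The sketch below is purely illustrative: the class and function names are hypothetical, the slides do not say how the hand-over is orchestrated, and the code does not assume which group lends to which.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GpuPool:
    """A hypothetical group of GPU servers that can lend idle members."""
    name: str
    servers: List[str] = field(default_factory=list)


def rent(lender: GpuPool, borrower: GpuPool, count: int) -> List[str]:
    """Move up to `count` servers from lender to borrower; return the ones moved."""
    if count < 0:
        raise ValueError("count must be non-negative")
    moved = lender.servers[:count]
    del lender.servers[:count]
    borrower.servers.extend(moved)
    return moved


def give_back(borrower: GpuPool, lender: GpuPool, rented: List[str]) -> None:
    """Return previously rented servers to the pool they came from."""
    for server in rented:
        borrower.servers.remove(server)
        lender.servers.append(server)
```

A caller would invoke `rent` while the lending group is idle and `give_back` as soon as that group needs more GPU capacity again.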
  40. Zero Infrastructure Cost [Chart: y-axis 0% to 100%; x-axis 24 hours, 0 to 23; series: Transcoder, Antman] * LINE Engineering Blog (EN) https://engineering.linecorp.com/en/blog/developing-the-antman-project
  41. Results: the usage of Media Storage for LINE Album feature [Line chart: y-axis 15 PB to 45 PB; x-axis Feb '19 to Sep '20; series: with Antman, without Antman; markers: Antman started, completed processing JPEG files stored before Antman]
  42. Results: the usage of Media Storage for LINE Album feature [Line chart: y-axis 15 PB to 45 PB; x-axis Feb '19 to Sep '20; series: with Antman, without Antman; marker: Antman started; annotation: 21 PB saved]
  43. Results: the usage of Media Storage for LINE Album feature [Line chart: y-axis 15 PB to 45 PB; x-axis Feb '19 to Sep '20; series: with Antman, without Antman; marker: Antman started; annotation: storage operation cost $470K/month 1)]. 1) Cost estimated using Amazon S3 storage pricing.
  44. Results: the usage of Media Storage for LINE Album feature [Line chart: y-axis 15 PB to 45 PB; x-axis Feb '19 to Sep '20; series: with Antman, without Antman; annotation: $2 million ▲ saved 1)]. 1) Cost estimated using Amazon S3 storage pricing.