ZGC for Future LINE HBase

LINE Developers
September 05, 2019

Slides from a talk given at the TWJUG 201909 meetup in Taipei on 2019/09/05.


Transcript

  1. ZGC for Future LINE HBase
    LINE Corporation, Shinya Yoshida
    2019/09/05, TWJUG 201909 meetup
  2. About me
    • Shinya Yoshida
    • HBase Unit, Z Part, Dept1, Dev1, LINE
    (Diagram labels: JP server, Z Part, other services)
  3. Agenda
    • Garbage collection and garbage collectors
    • How to choose a GC
    • Evaluating HBase with ZGC
    • Applying ZGC to production HBase
  4. Garbage collection and garbage collectors
    • How to collect garbage
    • How to use heap spaces
      – Generational, region-based
      – Skipped in this talk for lack of time
    • Stop The World (STW)
      – World == Application
      – The application cannot do anything during STW
  5. HBase and GC
    • HBase = NoSQL running on the JVM
      – Worse JVM pause time (STW) from GC ⇒ worse response time ⇒ worse service
    • We use HBase as the persistent store for LINE Messaging
      – So the response time of HBase is important
  6. JVM pause time vs. HBase slow responses

  7. Garbage Collection

  8. Garbage Collection
    • Find garbage
    • Collect garbage and defragment the heap space
  9. Basic GC algorithm ideas
    • Finding unreferenced objects (garbage)
      – Reference counting
      – Mark
    • Collecting
      – Sweep/Compaction
        • Remove garbage and defragment
      – Copy (Relocate)
        • Move live objects to another space
        • Done either during or after marking
  10. Sweep/Compaction vs. Copy (diagram)
    • Sweep: fragmentation remains
    • Compaction: fragmentation is resolved, but object addresses change
    • Copy: needs another space
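As a rough illustration of the mark and sweep ideas on slides 9 and 10 (not how HotSpot implements any of its collectors), the toy Java sketch below models a heap as a plain list of objects. The class `ToyMarkSweep` and its `Obj` type are hypothetical names made up for this example. The unreachable pair `c` and `d` reference each other, which is the classic case that naive reference counting would never reclaim but marking from the roots does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ToyMarkSweep {
    /** A toy heap object: a name plus outgoing references (hypothetical type). */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /** Mark phase: collect every object reachable from the roots. */
    static Set<Obj> mark(List<Obj> roots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (marked.add(o)) {          // first visit only
                pending.addAll(o.refs);
            }
        }
        return marked;
    }

    /** Sweep phase: drop unmarked objects from the heap and return their names. */
    static List<String> sweep(List<Obj> heap, Set<Obj> marked) {
        List<String> freed = new ArrayList<>();
        heap.removeIf(o -> {
            if (marked.contains(o)) return false;
            freed.add(o.name);
            return true;
        });
        return freed;
    }

    /** Builds root -> a -> b plus an unreachable cycle c <-> d, then collects. */
    static List<String> demo() {
        Obj root = new Obj("root"), a = new Obj("a"), b = new Obj("b");
        Obj c = new Obj("c"), d = new Obj("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Obj> heap = new ArrayList<>(List.of(root, a, b, c, d));
        return sweep(heap, mark(List.of(root)));
    }

    public static void main(String[] args) {
        System.out.println("freed: " + demo()); // prints "freed: [c, d]"
    }
}
```

The sketch only sweeps: the surviving objects stay where they are, so a real heap would be left with holes. Compaction or copying would additionally move the live objects, which is why their addresses change.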
  11. GC in Java (GC algorithm: how it collects)
    • Young GC
      – Sequential/Parallel: Copy
    • Old GC
      – Sequential/Parallel: Sweep & Compact
      – Concurrent Mark & Sweep (to be removed): Sweep only (no compaction)
    • G1GC: Copy
    • ZGC (Java 11, experimental): Copy
    • Shenandoah (Java 12, experimental): Copy
  12. GC with a running application
    • Yes, we can
      – Known as concurrent GC; however,
    • The application still needs to be stopped (STW) during GC for
      – Soundness (must): never collect a live object
      – Completeness (optional): collect all garbage
    • STW duration depends on the algorithm
      – How long, how often
      – Scaling factor (heap size? # of live objects? # of threads?)
  13. STW and scaling factors of each GC (diagram: mark and collect phases of GC cycles per GC algorithm, shown as either STW or running with the application, e.g. concurrent)
    • Old GC: Sequential, Parallel
    • CMS (to be removed): a full GC happens at some point
    • G1GC: scales with heap size, live objects, etc., but manageable; STW because addresses can change
    • ZGC: scales with # of threads; STW <= 10ms?
    • Shenandoah: STW <= 10ms?
  14. ZGC and Shenandoah
    • The core idea is mostly the same in both
      – Change object addresses/pointers while the GC and the application are both running
    • Check the official slides for more details
      – https://wiki.openjdk.java.net/display/zgc/Main
  15. Choosing GC

  16. Choosing GC
    • Know each GC's properties
      – Pros/Cons
    • Consider the properties of your application and hardware
    • Choose the better one
  17. What are a GC's properties?
    • Response time (STW duration) vs. throughput
    • How many CPU cores are required for best performance
    • Heap size
    • Live object ratio for best performance
    • Object sizes
    • And so on
  18. Properties of each Java GC (chart comparing Sequential/Parallel, G1GC, and Shenandoah/ZGC on four axes: STW, throughput, preferred # of CPU cores, preferred heap size)
  19. What about our HBase?
    • Response time is more important than throughput
      – We can scale out (though it costs money)
    • We want a large heap for caching data
    • HBase itself doesn't use much CPU
      – We don't use Phoenix (SQL for HBase)
      – Some compactions fully use a CPU core, but the # of threads is small
      – Mostly I/O and network
    • Server specs
      – 40 CPU cores
      – 256GB RAM
  20. Our HBase properties vs. GC properties (the same chart of Sequential/Parallel, G1GC, and Shenandoah/ZGC on the axes STW, throughput, preferred # of CPU cores, and preferred heap size, compared with our HBase)
  21. HBase fits ZGC! (Shenandoah fits too, but let's use ZGC because we're Z Part)
  22. Our internship student evaluated HBase with ZGC last year

  23. What we did
    • Set up HBase 1.2.5+α on a test cluster
    • Ran it on Java 11-ea
      – We used HBase built with JDK 8
    • Enabled ZGC
    • Evaluated performance using YCSB, a benchmark tool
      – STW duration and slow responses of HBase
      – Throughput
      – G1GC as the comparison
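The talk doesn't show how ZGC was switched on for the test cluster. On JDK 11, where ZGC is experimental, the JVM flags are `-XX:+UnlockExperimentalVMOptions -XX:+UseZGC`. A minimal sketch for checking which collector a JVM actually ended up with uses the standard `java.lang.management` API; the class name `GcReport` is an assumption made for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Example launch on JDK 11 (single-file source mode):
//   java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC GcReport.java
public class GcReport {
    /** Returns one line per collector bean that the running JVM exposes. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() is the approximate accumulated collection time in ms.
            lines.add(String.format("%s: count=%d, time=%dms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

For a concurrent collector the accumulated time reported here is not necessarily pure STW time, so GC logs (for example with `-Xlog:gc*`) are likely a better source when pause durations are what is being measured.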
  24. STW Duration

  25. HBase slow responses: G1GC vs. ZGC (chart; the slow-response threshold is 35ms)
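The slow-response metric on slide 25 can be sketched as a simple count of latencies above the 35ms threshold. This is an illustration only: the class `SlowResponses`, the strict "greater than" comparison, and the sample latencies are assumptions, not the measurement code or data from the talk.

```java
import java.util.Arrays;

public class SlowResponses {
    /** Threshold from the slide; treating exactly 35 ms as "not slow" is an assumption. */
    static final double SLOW_THRESHOLD_MS = 35.0;

    /** Number of responses slower than the threshold. */
    static long countSlow(double[] latenciesMs) {
        return Arrays.stream(latenciesMs)
                .filter(ms -> ms > SLOW_THRESHOLD_MS)
                .count();
    }

    /** Fraction of responses slower than the threshold (0.0 for an empty sample). */
    static double slowRatio(double[] latenciesMs) {
        if (latenciesMs.length == 0) {
            return 0.0;
        }
        return (double) countSlow(latenciesMs) / latenciesMs.length;
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds, for illustration only.
        double[] sample = {1.2, 40.0, 35.0, 120.5};
        System.out.println("slow responses: " + countSlow(sample)); // prints "slow responses: 2"
        System.out.println("slow ratio: " + slowRatio(sample));     // prints "slow ratio: 0.5"
    }
}
```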

  26. Interesting result in throughput (table in ops/sec; higher, shown in bold, is better)

  27. Apply ZGC to production now?

  28. Not yet
    • Running HBase on Java 11 or later
    • ZGC is still experimental

  29. Difficult to do for LINE Messenger, which everybody uses

  30. We want to evaluate it with production requests

  31. (Diagram labels) RPC, Response; JDK8 + G1GC server

  32. (Diagram labels) RPC, Response; JDK8 + G1GC server; Consumer, RPC Info; JDK11 + ZGC server, RPC, Response
  33. (Same diagram as slide 32) WORK IN PROGRESS
  34. Conclusions
    • New GCs are coming to Java
      – ZGC, Shenandoah: still experimental
      – Choose one by understanding the properties of the GCs, your app, and your HW
    • We like ZGC
      – It shows better performance with our HBase
    • We will evaluate it further using production requests
      – Before applying ZGC to our production HBase