Slide 1

Slide 1 text

ZGC for Future LINE HBase. LINE Corporation, Shinya Yoshida. 2019/09/05, TWJUG 201909 meetup

Slide 2

Slide 2 text

About me ● Shinya Yoshida ● HBase Unit, Z Part, Dept1, Dev1, LINE [Diagram labels: JP server, Z Part, Other services]

Slide 3

Slide 3 text

Agenda ● Garbage Collection/Collector ● How to choose a GC ● Evaluating HBase with ZGC ● Applying ZGC to production HBase

Slide 4

Slide 4 text

Garbage Collection/Collector ● How to collect garbage ● How to use heap spaces – Generational, region-based – Skipping this topic for lack of time ● Stop The World (STW) – World == Application – The application cannot do anything during STW

Slide 5

Slide 5 text

HBase and GC ● HBase = NoSQL running on the JVM – JVM pause time (STW) by GC ⇒ Worst response time – Worst response time ⇒ Worst service ● We use HBase as the persistent store in LINE Messaging – So the response time of HBase is important

Slide 6

Slide 6 text

JVM pause time vs. slow HBase responses

Slide 7

Slide 7 text

Garbage Collection

Slide 8

Slide 8 text

Garbage Collection ● Finding garbage ● Collecting garbage and defragmenting heap space

Slide 9

Slide 9 text

Basic GC algorithm idea ● Finding unreferenced objects (garbage) – Reference counting – Mark ● Collecting – Sweep/Compaction ● Remove and defrag – Copy (Relocate) ● Move live objects to another space ● Done either during or after marking
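
The "Mark" and "Sweep" steps on this slide can be sketched on a toy object graph. This is only an illustration, not how HotSpot implements them: the heap is modelled as a map from a hypothetical object id to the ids it references, and the names ToyMarkSweep, mark and sweep are invented for this example. Objects 3 and 4 reference each other, so plain reference counting would never free them, while marking from the roots does.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: a real JVM traces raw heap memory, not a Map.
final class ToyMarkSweep {

    // Mark: collect every object id reachable from the roots.
    static Set<Integer> mark(Map<Integer, List<Integer>> heap, Collection<Integer> roots) {
        Set<Integer> live = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            int id = stack.pop();
            if (live.add(id)) { // first visit: follow its outgoing references
                for (int ref : heap.getOrDefault(id, List.of())) {
                    stack.push(ref);
                }
            }
        }
        return live;
    }

    // Sweep: remove every object that was not marked and return the removed ids.
    static Set<Integer> sweep(Map<Integer, List<Integer>> heap, Set<Integer> live) {
        Set<Integer> garbage = new TreeSet<>(heap.keySet());
        garbage.removeAll(live);
        heap.keySet().removeAll(garbage);
        return garbage;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> heap = new HashMap<>();
        heap.put(1, List.of(2)); // 1 is a root and references 2
        heap.put(2, List.of());
        heap.put(3, List.of(4)); // 3 and 4 form an unreachable cycle
        heap.put(4, List.of(3));
        Set<Integer> live = mark(heap, List.of(1));
        System.out.println("garbage = " + sweep(heap, live)); // prints "garbage = [3, 4]"
    }
}
```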

Slide 10

Slide 10 text

[Figure: heap layout after Sweep/Compaction and after Copy] Sweep: fragmentation remains. Compaction: fragmentation is solved, but object addresses change. Copy: needs another space.
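
The three outcomes can be reproduced on a toy heap. In this sketch the heap is a String array in which each index stands for an address and null means a free slot; ToyHeapLayout and its methods are hypothetical names for illustration, not JVM code.

```java
import java.util.Arrays;
import java.util.Set;

// Toy model only: array index = address, null = free slot.
final class ToyHeapLayout {

    // Sweep: free dead slots in place. Live objects keep their addresses, holes remain.
    static String[] sweep(String[] heap, Set<String> live) {
        String[] out = heap.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] != null && !live.contains(out[i])) {
                out[i] = null;
            }
        }
        return out;
    }

    // Compaction: slide live objects toward address 0 inside the same space.
    // Holes disappear, but surviving objects may end up at new addresses.
    static String[] compact(String[] heap, Set<String> live) {
        String[] out = heap.clone();
        int next = 0;
        for (int i = 0; i < out.length; i++) {
            String obj = out[i];
            out[i] = null;
            if (obj != null && live.contains(obj)) {
                out[next++] = obj; // next <= i, so no live object is overwritten
            }
        }
        return out;
    }

    // Copy: evacuate live objects into a second space (here: same size as the first).
    static String[] copy(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "x", "B", "y", "C"};
        Set<String> live = Set.of("A", "B", "C");
        System.out.println(Arrays.toString(sweep(heap, live)));   // [A, null, B, null, C]
        System.out.println(Arrays.toString(compact(heap, live))); // [A, B, C, null, null]
        System.out.println(Arrays.toString(copy(heap, live)));    // [A, B, C, null, null]
    }
}
```

After sweep, B is still at address 2 but the free space is split in two. After compact or copy, B sits at address 1, so every pointer to it has to be updated; copy additionally needs the extra to-space while it runs.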

Slide 11

Slide 11 text

GC in Java (GC algorithm: how it collects)
Young GC, Sequential/Parallel: Copy
Old GC, Sequential/Parallel: Sweep & Compact
Old GC, Concurrent Mark & Sweep (to be removed): Sweep only (no compaction)
G1GC: Copy
ZGC (Java 11, experimental): Copy
Shenandoah (Java 12, experimental): Copy

Slide 12

Slide 12 text

GC with a running application ● Yes, we can – Known as Concurrent GC, however ● Still need to stop the application (STW) during GC for – Soundness (must): never collect a live object – Completeness (optional): collect all garbage ● STW duration depends on the algorithm – How long, how often – Scaling factor (heap size? # of live objects? # of threads?)

Slide 13

Slide 13 text

STW and scaling factor of each GC
[Figure: per-GC timeline of the Mark and Collect phases. Legend: running application (e.g. concurrent), STW, GC cycles]
Old GC, Sequential / Parallel
CMS (to be removed): Full GC at some point
G1GC: scales with heap size, live objects, etc., but manageable; STW because addresses can change
ZGC: scales with # of threads; STW: <=10ms?
Shenandoah: STW: <=10ms?

Slide 14

Slide 14 text

ZGC and Shenandoah ● The core idea is mostly the same in both – Changing object addresses/pointers while both the GC and the application are running ● Check the official slides for more details – https://wiki.openjdk.java.net/display/zgc/Main

Slide 15

Slide 15 text

Choosing GC

Slide 16

Slide 16 text

Choosing GC ● Know each GC’s properties – Pros/Cons ● Consider your application’s properties and your hardware ● Choose the one that fits better

Slide 17

Slide 17 text

What are a GC’s properties? ● Response time (STW duration) or throughput? ● How many CPU cores are required for best performance ● Heap size ● Live object ratio for best performance ● Object sizes ● And so on

Slide 18

Slide 18 text

Properties of each Java GC [Figure: Sequential/Parallel, G1GC, and Shenandoah/ZGC placed on four axes. STW: 0ms, 100ms, 1s, ∞. Throughput: 85% to 100%. Preferred # of CPU cores: 1, 2, 4, 8, 16, 32+. Preferred heap size: 1GB, 8GB, 31GB, 128GB, 512GB, 2TB, 8TB〜]

Slide 19

Slide 19 text

What about our HBase? ● Response time is more important than throughput – We can scale out (costs money, though) ● Want a large heap for caching data ● HBase itself doesn’t use much CPU – We don’t use Phoenix (SQL for HBase) – Some compactions fully use a CPU core, but the # of threads is small – Mostly I/O and network ● Server specs – 40 CPU cores – 256GB RAM

Slide 20

Slide 20 text

Our HBase properties vs GC properties [Figure: our HBase requirements overlaid on the same four axes as Sequential/Parallel, G1GC, and Shenandoah/ZGC. STW: 0ms, 100ms, 1s, ∞. Throughput: 85% to 100%. Preferred # of CPU cores: 1, 2, 4, 8, 16, 32+. Preferred heap size: 1GB, 8GB, 31GB, 128GB, 512GB, 2TB, 8TB〜]

Slide 21

Slide 21 text

HBase fits ZGC! (Shenandoah too, but let’s use ZGC because we’re Z Part)

Slide 22

Slide 22 text

Our intern evaluated HBase with ZGC last year

Slide 23

Slide 23 text

What we did ● Set up HBase 1.2.5+α on a test cluster ● Ran it on Java 11-ea – We used HBase built with JDK 8 ● Enabled ZGC ● Evaluated performance using YCSB, a benchmark tool – STW duration and slow responses of HBase – Throughput – G1GC as the comparison baseline
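
On JDK 11, ZGC is switched on with the JVM flags -XX:+UnlockExperimentalVMOptions -XX:+UseZGC. The slides do not show how the flags were passed to HBase; adding them to HBASE_REGIONSERVER_OPTS in conf/hbase-env.sh is one common way and is only an assumption here. A small check like the sketch below, using the standard java.lang.management API, can confirm which collector a JVM actually runs with. The class name GcInfo and its methods are hypothetical. Note that getCollectionTime() is an approximate accumulated figure and is not the same as STW duration for a concurrent collector; GC logs (for example -Xlog:gc*) are the more usual source for pause times.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Sketch: report which collectors the running JVM exposes through JMX.
final class GcInfo {

    // Names of the collectors registered in this JVM (the exact strings depend on the JVM).
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // True if the JVM was started with -XX:+UseZGC on its command line.
    static boolean zgcFlagPresent() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().contains("-XX:+UseZGC");
    }

    // Sum of approximate accumulated collection time in ms; -1 ("undefined") is skipped.
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("collectors: " + collectorNames());
        System.out.println("ZGC flag present: " + zgcFlagPresent());
        System.out.println("accumulated GC time (ms): " + totalCollectionTimeMillis());
    }
}
```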

Slide 24

Slide 24 text

STW Duration

Slide 25

Slide 25 text

Slow HBase responses: G1GC vs. ZGC (slow response threshold is 35ms)
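
The 35ms threshold turns a list of response latencies into a count of slow responses. The sketch below shows that arithmetic only. The latency values are made up for illustration and are not the measurements from the slide, and treating "slow" as strictly greater than the threshold is an assumption.

```java
// Sketch: count responses slower than the 35ms threshold named on the slide.
final class SlowResponses {

    static final long THRESHOLD_MS = 35;

    // Number of latencies strictly above the threshold (strictness is an assumption).
    static long countSlow(long[] latenciesMs) {
        long slow = 0;
        for (long latency : latenciesMs) {
            if (latency > THRESHOLD_MS) {
                slow++;
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        long[] sample = {3, 12, 35, 36, 120, 8}; // made-up latencies in ms
        System.out.println("slow responses: " + countSlow(sample)); // prints "slow responses: 2"
    }
}
```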

Slide 26

Slide 26 text

Interesting result in throughput (ops/sec; higher is better, best shown in bold)

Slide 27

Slide 27 text

Apply ZGC to production now?

Slide 28

Slide 28 text

Not yet ● Need to run HBase on Java 11 or later ● ZGC is still experimental

Slide 29

Slide 29 text

Difficult to do that for LINE Messenger, which everybody uses

Slide 30

Slide 30 text

We want to evaluate it with production requests

Slide 31

Slide 31 text

[Diagram: a server on JDK8 + G1GC receives an RPC and returns a Response]

Slide 32

Slide 32 text

[Diagram: a server on JDK8 + G1GC receives an RPC and returns a Response; RPC Info goes to a Consumer, which sends the RPC to a server on JDK11 + ZGC and gets a Response]

Slide 33

Slide 33 text

[Diagram: a server on JDK8 + G1GC receives an RPC and returns a Response; RPC Info goes to a Consumer, which sends the RPC to a server on JDK11 + ZGC and gets a Response] WORK IN PROGRESS
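
One possible reading of this diagram is a shadow-replay setup: RPC info recorded on the production side is consumed and replayed against a second server running JDK11 + ZGC, and only the latency of the replayed call is kept. The slides mark this as work in progress and give no implementation details, so everything in the sketch below is hypothetical: the RpcInfo fields, the in-memory queue standing in for whatever transport the Consumer reads from, and the Function standing in for the ZGC server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical sketch of replaying recorded production RPCs against a shadow server.
final class ShadowReplay {

    // Hypothetical record of one production RPC; the real "RPC Info" format is not in the slides.
    static final class RpcInfo {
        final String method;
        final String row;

        RpcInfo(String method, String row) {
            this.method = method;
            this.row = row;
        }
    }

    // Drains the recorded RPCs, replays each one, and returns the observed latencies in ns.
    static List<Long> replay(BlockingQueue<RpcInfo> recorded, Function<RpcInfo, String> shadowServer) {
        List<Long> latenciesNanos = new ArrayList<>();
        RpcInfo rpc;
        while ((rpc = recorded.poll()) != null) {
            long start = System.nanoTime();
            shadowServer.apply(rpc); // assumption: the shadow response is discarded
            latenciesNanos.add(System.nanoTime() - start);
        }
        return latenciesNanos;
    }

    public static void main(String[] args) {
        BlockingQueue<RpcInfo> recorded = new LinkedBlockingQueue<>();
        recorded.add(new RpcInfo("get", "row-1"));
        recorded.add(new RpcInfo("put", "row-2"));
        // Stand-in for the JDK11 + ZGC server: it just echoes the request.
        List<Long> latencies = replay(recorded, rpc -> rpc.method + ":" + rpc.row);
        System.out.println("replayed " + latencies.size() + " RPCs");
    }
}
```

The point of such a design would be that real traffic reaches the ZGC server without any client depending on its answers.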

Slide 34

Slide 34 text

Conclusions ● New GCs are coming to Java – ZGC, Shenandoah: still experimental – Choose one by understanding the properties of the GC, your app, and your hardware ● We like ZGC – It shows better performance with our HBase ● We will evaluate it further using production requests – Before applying ZGC to our production HBase