Slide 1

Slide 1 text

LINE B2B
Ryosuke Hasebe 2022.06.19

Slide 2

Slide 2 text

Speaker
Ryosuke Hasebe
LINE (2013 9) 4 OA Dev 2 / OA SRE
LINE Login (OAuth2/OIDC) / LINE Profile+ / LINE Notify
Java/Kotlin (Reactive Streams / Kotlin Coroutines), K8s
GitHub: be-hase
Twitter: be_hasee

Slide 3

Slide 3 text

Agenda
1. About LINE's B2B Platform
2. Case 1: Slow latency issue after updating to Lettuce v6
3. Case 2: Direct buffer OOME issue due to bad usage of Spring WebClient

Slide 4

Slide 4 text

About LINE's B2B Platform

Slide 5

Slide 5 text

: LINE B2B LINE LINE /CRM API BOT LINE LINE LINE LINE Talk Head View

Slide 6

Slide 6 text

: LINE
CPU 4,500 core
(request/sec) 10
Memory 14 TB
※ As of September 2021

Slide 7

Slide 7 text

: (B2B)

Slide 8

Slide 8 text

Kotlin/Java, Spring Boot, Armeria, gRPC/Thrift
MySQL, HBase, Redis, Kafka, Elasticsearch, Centraldogma, nginx, fluentd
Verda (OpenStack based Private Cloud) VM/PM, Kubernetes
Prometheus, Grafana, IU, Kibana, IMON
GHE, Jenkins, Drone, Circle CI, Ansible, ArgoCD

Slide 9

Slide 9 text

Case 1: Slow latency issue after updating to Lettuce v6

Slide 10

Slide 10 text

After updating Lettuce from v4.5.0 to v6.0.0, the 99.9 percentile latency got worse (about 1 sec).
Lettuce = Redis client library for Java
spring-data-redis
Kafka Consumer
Redis Cluster: 96 nodes
3K commands/sec, HGETALL

Slide 11

Slide 11 text

Workaround: use v5 (v5.3.5) for now.
v4 -> v6: narrowed down to v5.3.5 -> v6.0.0, so the cause is in v6.

Slide 12

Slide 12 text


 (1 )

Slide 13

Slide 13 text

Lettuce 5.3 is EOL 😨
> 5.3.x is EOL (end-of-life) as of June 2021.
https://github.com/lettuce-io/lettuce-core/wiki/Lettuce-Versions
EOL, Spring4Shell
Lettuce v6.1.6

Slide 14

Slide 14 text

client-side (Lettuce) / server-side (Redis)
(Lettuce version: client-side)
Redis server-side latency, SLOWLOG
client-side (= Java application = Lettuce)

Slide 15

Slide 15 text

Stop The World (STW)
GC, STW, 99.9 percentile latency
Lettuce, GC (STW)
GC time: Micrometer
GC: HeapDump, Eclipse Memory Analyzer
STW: Unified JVM Logging, safepoint log
[2022-03-14T17:30:16.483+0900][192775.478][info ][safepoint] Total time for which application threads were stopped: 0.xxx seconds, Stopping threads took: 0.xxx seconds
safepoint log: https://krzysztofslusarski.github.io/2020/11/13/stw.html
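To check STW pauses against a latency threshold, the stopped time can be pulled out of safepoint log lines like the one above. The following is a minimal sketch that assumes exactly that line format (it differs between JDK versions); the class name, method name, and the numeric values in the sample line are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SafepointLogParser {
    // Matches the "Total time for which application threads were stopped" part of a safepoint line.
    private static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads were stopped: ([0-9]+\\.[0-9]+) seconds");

    /** Returns the stop-the-world duration in seconds, or empty if the line has no such entry. */
    static Optional<Double> stoppedSeconds(String logLine) {
        Matcher m = STOPPED.matcher(logLine);
        return m.find() ? Optional.of(Double.parseDouble(m.group(1))) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical sample: the timings are made up for illustration.
        String line = "[2022-03-14T17:30:16.483+0900][192775.478][info ][safepoint] "
            + "Total time for which application threads were stopped: 0.0123456 seconds, "
            + "Stopping threads took: 0.0001234 seconds";
        stoppedSeconds(line).ifPresent(s -> System.out.println("STW seconds: " + s));
    }
}
```

Filtering for values close to the observed latency spike (about 1 sec here) is a quick way to confirm or rule out STW as the cause.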

Slide 16

Slide 16 text

JVM, Redis
v5.3.5, v6.0.0
Try & Error
Local

Slide 17

Slide 17 text

Kafka Consumer, Consumer Group
Lettuce 6.1.16
[Figure: Lettuce v5.3.15 and Lettuce v6.1.16]

Slide 18

Slide 18 text

RESP3
Lettuce v6 supports RESP3, the protocol introduced with Redis v6.
https://github.com/antirez/RESP3/blob/master/spec.md
Lettuce v6 checks whether Redis supports RESP3 and falls back to RESP2 otherwise.

ClusterClientOptions
    .builder()
    // Use RESP2 only
    .protocolVersion(ProtocolVersion.RESP2)
    .build()

Slide 19

Slide 19 text

Big Keys
Lettuce 6, Big Keys, HGETALL
Redis Big Keys: https://www.alibabacloud.com/blog/a-detailed-explanation-of-the-detection-and-processing-of-bigkey-and-hotkey-in-redis_598143
Log: Hash, Latency
[Figure: Big Keys, v5.3.15 and v6.1.16]

Slide 20

Slide 20 text

CPU
async-profiler, flame graph
[Figure: async-profiler flame graphs, v5.3.15 and v6.1.16]

Slide 21

Slide 21 text

Lettuce event-loop (non-blocking)
ClusterTopologyRefresh.getNodeSpecificViews in the flame graph
Cluster Topology Refresh
[Figure: async-profiler flame graphs, v5.3.15 and v6.1.16]

Slide 22

Slide 22 text

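Lettuce Cluster Topology Refresh (CTR)
In Redis Cluster, the client has to know which node holds which key (slot), so this mapping is kept client-side.
Lettuce updates the mapping with Cluster Topology Refresh (CTR):
CLUSTER NODES
once every 60 seconds
MOVED redirection
https://github.com/lettuce-io/lettuce-core/issues/339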

Slide 23

Slide 23 text

Lettuce Cluster Topology Refresh (CTR): CLUSTER NODES output
e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca 127.0.0.1:30001@31001 master - 0 0 1 connected 0-5460
67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1 127.0.0.1:30002@31002 master - 0 1426238316232 2 connected 5461-10922
292f8b365bb7edb5e285caf0b7e6ddc7265d2f4f 127.0.0.1:30003@31003 master - 0 1426238318243 3 connected 10923-16383
07c37dfeb235213a872192d90877d0cd55635b91 127.0.0.1:30004@31004 slave e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca 0 1426238317239 4 connected
6ec23923021cf3ffec47632106199cb7f496ce01 127.0.0.1:30005@31005 slave 67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1 0 1426238316232 5 connected
824fe116063bc5fcf9f4ffd895bc17aee7731ac3 127.0.0.1:30006@31006 slave 292f8b365bb7edb5e285caf0b7e6ddc7265d2f4f 0 1426238317741 6 connected
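To make the client-side slot mapping concrete, here is a minimal sketch of how one line of the CLUSTER NODES output above could be turned into a node with slot ranges. It is not Lettuce's actual parser; the class and method names are hypothetical, and it only handles the fields shown in the sample output.

```java
import java.util.ArrayList;
import java.util.List;

class ClusterNodesParser {
    /** One CLUSTER NODES line, reduced to the fields needed for slot routing. */
    static final class Node {
        final String id;
        final String address;   // host:port, without the @cluster-bus-port suffix
        final boolean master;
        final List<int[]> slotRanges = new ArrayList<>(); // inclusive {from, to}

        Node(String id, String address, boolean master) {
            this.id = id;
            this.address = address;
            this.master = master;
        }

        boolean ownsSlot(int slot) {
            for (int[] r : slotRanges) {
                if (slot >= r[0] && slot <= r[1]) return true;
            }
            return false;
        }
    }

    static Node parseLine(String line) {
        // Fields: id, ip:port@cport, flags, master-id, ping-sent, pong-recv,
        // config-epoch, link-state, then zero or more slot entries.
        String[] f = line.trim().split("\\s+");
        Node node = new Node(f[0], f[1].split("@")[0], f[2].contains("master"));
        for (int i = 8; i < f.length; i++) {
            if (f[i].startsWith("[")) continue; // skip migrating/importing markers
            String[] range = f[i].split("-");
            int from = Integer.parseInt(range[0]);
            int to = range.length > 1 ? Integer.parseInt(range[1]) : from;
            node.slotRanges.add(new int[] {from, to});
        }
        return node;
    }

    public static void main(String[] args) {
        Node n = parseLine("e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca "
            + "127.0.0.1:30001@31001 master - 0 0 1 connected 0-5460");
        System.out.println(n.address + " owns slot 100: " + n.ownsSlot(100));
    }
}
```

A client that refreshes its topology has to do this for the view returned by each node it asks, which is why the cost grows with cluster size.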

Slide 25

Slide 25 text

Why does it appear in the flame graph?
CTR runs once every 60 seconds.
Flame graph: with n nodes, the cost is O(n^2). With 96 nodes it takes about 1 sec.
Lettuce v6.0.0: event loop
← EpollEventLoop.run

Slide 26

Slide 26 text

Event-loop: CPU x 2
One event-loop thread is occupied for 1sec by the heavy processing (CTR)
Redis Command I/O on that thread is delayed
(latency 99.9 percentile: 1sec)
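The effect described here can be reproduced without Lettuce or netty. The sketch below uses a plain single-threaded executor as a stand-in for one event-loop thread: a long task (standing in for CTR) is queued first, and a trivial task (standing in for a Redis command) queued right after it has to wait. All names are hypothetical, and the sleep is only a placeholder for CPU-heavy work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class EventLoopStallDemo {
    /**
     * Queues one long-running task and then a trivial one on a single thread.
     * Returns how long the trivial task took from submission to completion, in ms.
     */
    static long latencyBehindBlockingTaskMillis(long blockingMillis) throws Exception {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for the heavy topology computation occupying the thread.
            eventLoop.submit(() -> {
                try {
                    Thread.sleep(blockingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            long submittedAt = System.nanoTime();
            // Stand-in for a cheap command handled by the same thread.
            Future<?> command = eventLoop.submit(() -> { });
            command.get(blockingMillis + 5_000, TimeUnit.MILLISECONDS);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - submittedAt);
        } finally {
            eventLoop.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("latency behind a 1000 ms task: "
            + latencyBehindBlockingTaskMillis(1000) + " ms");
    }
}
```

Because the stall happens only once per refresh period, it shows up in the high percentiles rather than in the average latency.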

Slide 27

Slide 27 text

[Figure: CTR and latency]

Slide 28

Slide 28 text

[Figure: event-loop, CTR, and latency]

Slide 29

Slide 29 text

Lettuce Issue & PR
Issue: https://github.com/lettuce-io/lettuce-core/issues/2045
PR: https://github.com/lettuce-io/lettuce-core/pull/2048
6.1.8: https://github.com/lettuce-io/lettuce-core/releases/tag/6.1.8.RELEASE

Slide 30

Slide 30 text

Other Solution
Disable the CTR dynamic refresh source.
With dynamic refresh sources disabled, CLUSTER NODES is sent only to the Initial Seed Nodes.
Note: if the Initial Seed Nodes are down, CTR cannot run.

import java.util.Arrays;
import io.lettuce.core.RedisURI;
import io.lettuce.core.cluster.RedisClusterClient;

// Topology sources can be limited to the nodes specified here (Initial Seed Nodes)
RedisURI node1 = RedisURI.create("node1", 6379);
RedisURI node2 = RedisURI.create("node2", 6379);
RedisClusterClient clusterClient = RedisClusterClient.create(Arrays.asList(node1, node2));

Slide 31

Slide 31 text

Redis Cluster
Lettuce version 6.1.8
Local

Slide 32

Slide 32 text

Case 2: Direct buffer OOME issue due to bad usage of Spring WebClient

Slide 33

Slide 33 text

Out of Memory Error (OOME)
CSV
spec (20 core, 64 GB Mem, -Xmx24g)

Slide 34

Slide 34 text

Workaround
2,3 ( )
-XX:+ExitOnOutOfMemoryError: JVM, OOME
JVM restart by supervisord

Slide 35

Slide 35 text

(-Xmx24g) CSV?
Eclipse Memory Analyzer

Slide 36

Slide 36 text

OutOfMemoryError: Direct buffer memory
Direct buffers are allocated outside the Java heap and are used for native I/O.

Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
    at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
    at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:645)
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:621)

※ From Java 13 onward, the error message is clearer and friendlier.
https://bugs.openjdk.java.net/browse/JDK-8048192

Slide 37

Slide 37 text

(Micrometer)
https://github.com/micrometer-metrics/.../JvmMemoryMetrics.java

Slide 38

Slide 38 text

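JDK: https://github.com/openjdk/jdk/blob/jdk-11+28/src/java.base/share/classes/java/nio/Bits.java#L175
The direct memory limit defaults to Runtime.getRuntime().maxMemory(), i.e. about -Xmx (so the JVM can use about -Xmx * 2 in total).
It can be set explicitly with -XX:MaxDirectMemorySize.
Exceeding the limit results in OOME.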

Slide 39

Slide 39 text

netty
Two components use netty:
Spring WebClient (WebFlux)
Lettuce (Redis Client)
Spring WebClient or Lettuce?

Slide 40

Slide 40 text

Spring WebClient or Lettuce? -> Spring WebClient
The CSV is received as a Mono, not as a Reactor Flux.
Spring WebClient limits in-memory buffering to 256 KB by default (Spring Boot 2.1), but it had been set to unlimited.

webClient.get()
    .uri(uri)
    .retrieve()
    .bodyToMono(byte[].class) // suspicious
    .block();

WebClient.builder()
    .codecs(configurer -> configurer.defaultCodecs()
        .maxInMemorySize(-1)) // had been set to unlimited
    .build();

Slide 41

Slide 41 text

2; 300 MB text file

Slide 42

Slide 42 text

OOME
-XX:MaxDirectMemorySize=1g
seq 4 | xargs -P 4 -I{} curl localhost:8080/mono
4 * 300MB -> OOME

Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
    at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
    at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:648)
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:623)
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:202)
    at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:186)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:136)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:126)
    at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:394)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)

Slide 43

Slide 43 text

After the OOME (jcmd PID GC.run)
watch curl localhost:8080/mem
name=direct, count=76, memoryUsed=1008MB, totalCapacity=1008MB
name=direct, count=76, memoryUsed=1008MB, totalCapacity=1008MB
…
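The /mem output above has the same fields as the JDK's BufferPoolMXBean, so a similar report can be produced with the standard library alone. This is a sketch under that assumption, not the endpoint used in the talk; the class and method names are hypothetical.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectBufferStats {
    /** Bytes currently used by the JVM's "direct" buffer pool, or -1 if the pool is missing. */
    static long directMemoryUsedBytes() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) return pool.getMemoryUsed();
        }
        return -1;
    }

    /** Formats the "direct" pool in the same shape as the /mem output above. */
    static String describeDirectPool() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return String.format("name=%s, count=%d, memoryUsed=%dMB, totalCapacity=%dMB",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed() >> 20, pool.getTotalCapacity() >> 20);
            }
        }
        return "direct pool not found";
    }

    public static void main(String[] args) {
        // Without -XX:MaxDirectMemorySize, the limit follows Runtime.maxMemory().
        System.out.println("maxMemory=" + (Runtime.getRuntime().maxMemory() >> 20) + "MB");
        ByteBuffer buffer = ByteBuffer.allocateDirect(64 << 20); // 64 MB off-heap
        System.out.println(describeDirectPool());
        System.out.println("allocated capacity=" + buffer.capacity());
    }
}
```

Watching this value while replaying the suspicious request pattern shows whether direct memory grows toward the limit.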

Slide 44

Slide 44 text

OOME and -XX:+ExitOnOutOfMemoryError
report_java_out_of_memory: https://github.com/openjdk/jdk11u/.../src/hotspot/share/utilities/debug.cpp#L319
Systemd
Micrometer
-XX:+ExitOnOutOfMemoryError !!

Slide 45

Slide 45 text

Flux
seq 4 | xargs -P 4 -I{} curl localhost:8080/flux
curl localhost:8080/mem
name=direct, count=17, memoryUsed=80MB, totalCapacity=80MB
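The difference between the two endpoints comes down to buffering the whole body versus handling it chunk by chunk. The sketch below shows the chunked idea with plain java.io instead of WebClient and Flux, so it is an analogy rather than the code from the talk; the class name, method name, and chunk size are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ChunkedCopy {
    /** Copies the stream through one fixed-size buffer and returns the number of bytes copied. */
    static long copyInChunks(InputStream in, OutputStream out, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize]; // the only buffer, reused for every chunk
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] body = new byte[10 * 1024 * 1024]; // stand-in for a large CSV response
        // In a real download the sink would be a file or a parser, not an in-memory stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyInChunks(new ByteArrayInputStream(body), sink, 8192);
        System.out.println("copied " + copied + " bytes using an 8192-byte buffer");
    }
}
```

With this shape, the working memory per request is bounded by the chunk size instead of the response size.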

Slide 46

Slide 46 text

[Figure: OOME, Before / After]
Now stays within about 570MB!!
Looks like this year's year-end and New Year holidays will be peaceful 🙌

Slide 47

Slide 47 text

netty PooledByteBufAllocator
jemalloc
https://people.freebsd.org/~jasone/jemalloc/bsdcan2006/jemalloc.pdf
https://www.facebook.com/notes/10158791475077200/

Slide 48

Slide 48 text

WebClient (reactor-netty): the number of I/O threads follows the number of CPU cores
CPU 20, 2GB CSV
20 * 2GB = 40GB > 24GB = -Xmx or -XX:MaxDirectMemorySize
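The arithmetic above can be written down as a small check. This is only a back-of-the-envelope sketch: it assumes every I/O thread buffers one full response at the same time, uses GiB where the slide says GB, and all names are hypothetical.

```java
class DirectMemoryBudget {
    static final long GIB = 1L << 30;

    /** Worst case when every I/O thread buffers one full response body at the same time. */
    static long worstCaseBytes(int ioThreads, long bodyBytes) {
        return Math.multiplyExact((long) ioThreads, bodyBytes);
    }

    /** True if the worst case does not fit into the direct memory limit. */
    static boolean exceedsLimit(int ioThreads, long bodyBytes, long directLimitBytes) {
        return worstCaseBytes(ioThreads, bodyBytes) > directLimitBytes;
    }

    public static void main(String[] args) {
        long worst = worstCaseBytes(20, 2 * GIB);
        System.out.println(worst / GIB);                      // prints 40
        System.out.println(exceedsLimit(20, 2 * GIB, 24 * GIB)); // prints true
    }
}
```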

Slide 49

Slide 49 text

JVM, OOME, -XX:+ExitOnOutOfMemoryError
Reactive Streams, netty

Slide 50

Slide 50 text

!!

Slide 51

Slide 51 text

We're Hiring!!
LINE B2B
SRE (https://linecorp.com/ja/career/position/3112) / (https://linecorp.com/ja/career/position/2316)
