
Lessons We Learned Through Hell When Scaling Blibli.com

Various lessons we learned while scaling the site

Alex Xandra Albert Sim

October 29, 2019

Transcript

  1. Disclaimer Presentations are intended for educational purposes only and not

    to replace independent professional judgment. The views and opinions expressed in this presentation do not necessarily reflect the official policy or position of blibli.com. Audience discretion is advised.
  2. Who am I? • Alex Xandra Albert Sim • Lead

    Principal R&D Engineer at blibli.com • [email protected] • bertzzie(.sim)
  3. The “Usual” Backend Architecture Gateway (Reverse Proxy) Server / Servlet

    (Tomcat, Netty, etc.) App (Spring, JavaEE, etc.) Database Cache Other Services (Internal / External)
  4. Threads • Basic computational concept to handle concurrency Service Queue

    Task 1 Task 2 … Thread Pool Thread 1 Thread 2 Thread 3 … The Java Threading Model
  5. Optimizing Thread Utilization • For applications that you don’t write

    (i.e. Tomcat), there’s usually a config for it • Too many threads will result in performance degradation due to thread switching cost • The common formula [0] for max thread pool size is: [0] Goetz, Brian. Java Concurrency in Practice Num of Threads = Num of Cores * (1 + Wait Time / Service Time) Wait time: time spent waiting for IO bound task to complete Service time: time spent processing • Remember, this is oversimplified. We usually have multiple thread pools (HTTP, JDBC, etc.) with different workload requirements
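The Goetz formula from the slide can be sketched as a small helper. The class and method names here are illustrative, not any real API; wait time and service time are averages you measure by profiling your own workload.

```java
// Sketch of the thread-sizing formula from Java Concurrency in Practice:
// threads = cores * (1 + waitTime / serviceTime)
public class ThreadSizing {
    static int optimalThreads(int cores, double waitTimeMs, double serviceTimeMs) {
        // IO-heavy work (high wait/service ratio) justifies a larger pool,
        // because threads spend most of their time parked, not computing
        return (int) (cores * (1 + waitTimeMs / serviceTimeMs));
    }

    public static void main(String[] args) {
        // e.g. 8 cores, 50ms waiting on IO for every 10ms of CPU work
        System.out.println(optimalThreads(8, 50, 10)); // → 48
        // pure CPU work: pool size collapses to the core count
        System.out.println(optimalThreads(8, 0, 10));  // → 8
    }
}
```

As the slide warns, treat the result as a starting point for load testing, not a final setting, since each pool (HTTP, JDBC, …) has its own wait/service profile.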
  6. Know Your Limit • Check your backend capacity with Little’s

    Law: Concurrency (in-flight requests) = Average Arrival Rate × Average Latency • Arrival rate is measured in requests per second • This formula tells you how many requests you can serve with a stable response time • Repeatedly perf test your backend with these rough calculations to see its real capacity
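Little's Law proper reads L = λ × W: the average number of in-flight requests equals the arrival rate times the average latency. A minimal sketch of the two directions of the calculation (names are illustrative):

```java
public class LittlesLaw {
    // L = lambda * W: in-flight requests = arrival rate (req/s) * avg latency (s)
    static double concurrentRequests(double arrivalRatePerSec, double avgLatencySec) {
        return arrivalRatePerSec * avgLatencySec;
    }

    // Inverted: the max sustainable arrival rate for a given thread budget
    static double maxArrivalRate(int threads, double avgLatencySec) {
        return threads / avgLatencySec;
    }

    public static void main(String[] args) {
        // 200 req/s at 250ms average latency keeps 50 requests in flight
        System.out.println(concurrentRequests(200, 0.25)); // → 50.0
        // so a pool of 50 threads caps out at roughly 200 req/s
        System.out.println(maxArrivalRate(50, 0.25));      // → 200.0
    }
}
```

If measured concurrency exceeds your thread budget, requests start to queue, which is exactly the failure mode the later slides describe.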
  7. Handling Concurrency On Your Own • For your own code,

    it’s usually better and easier to use a higher-level abstraction than threads • Battle-tested, with lots of examples, and easier to learn • Examples: RxJava, Reactor, Akka • We chose RxJava (old projects) and Reactor (new projects)
  8. RxJava • Reactive Extensions, help us in processing and composing

    events • Single-threaded by default, could easily be made concurrent • Concurrency is achieved via Schedulers • Changing Scheduler can have a major performance impact, depending on your use case • Remember: test, test, test!
  9. Common Reactive Schedulers • Immediate Scheduler – Blocks current task

    and run task immediately on the same thread • Single Scheduler – Run task on another thread, but only one thread is provided • Computation Scheduler – RxJava only: run task on other threads. Thread count == CPU core count • IO Scheduler – RxJava only: run task on other threads. Threads are unbounded • Bounded Elastic – Reactor only: like IO Scheduler, but with cap on max thread count
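The scheduler types above map roughly onto `java.util.concurrent` executor shapes. This stdlib sketch is only a mental model of how the reactive libraries size their pools (the `10 * cores` cap mirrors Reactor's default for `boundedElastic`); the libraries manage these pools themselves, so none of this is their actual API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SchedulerAnalogy {
    static final int CORES = Runtime.getRuntime().availableProcessors();

    // Single Scheduler: one dedicated worker thread
    static ExecutorService single() {
        return Executors.newSingleThreadExecutor();
    }

    // Computation Scheduler: thread count == CPU core count
    static ExecutorService computation() {
        return Executors.newFixedThreadPool(CORES);
    }

    // IO Scheduler: unbounded, reuses idle threads
    static ExecutorService io() {
        return Executors.newCachedThreadPool();
    }

    // Bounded Elastic: like IO but capped; 10 * cores mirrors Reactor's default cap
    static ExecutorService boundedElastic() {
        return Executors.newFixedThreadPool(10 * CORES);
    }
}
```

The analogy also explains the slide-8 warning: switching, say, from a computation-style pool to an unbounded IO-style pool changes both throughput and thread-switching cost, so always re-test after changing schedulers.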
  10. Lesson 1 • Understand your basics, know your libraries •

    No shortcut in performance tuning, it’s typically: test, test, test!
  11. Background Story • Incoming traffic so huge it’s indistinguishable from a

    DDoS • At the gateway level: port exhaustion keeps happening • At the server level: never-ending thread exhaustion • At the application level: timeouts, timeouts, timeouts
  12. Circuit Breaker • B-but we have circuit breaker? • First

    things first: what’s a circuit breaker? Normal Request Flow Service A Service B HTTP Request HTTP Response HTTP Library Service Layer With Circuit Breaker Service A Service B HTTP Request HTTP Response Circuit Breaker Service Layer HTTP Library
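A minimal sketch of the breaker's core behaviour, assuming a consecutive-failure threshold. Real libraries (Hystrix, Resilience4j) add half-open probing, time windows, and metrics; this only shows the fail-fast idea that protects Service A's threads from a slow Service B.

```java
import java.util.function.Supplier;

// Minimal circuit-breaker sketch: trip OPEN after N consecutive failures,
// then reject calls immediately instead of tying up a thread.
public class CircuitBreaker {
    private final int failureThreshold;
    private int consecutiveFailures = 0;

    CircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    boolean isOpen() {
        return consecutiveFailures >= failureThreshold;
    }

    <T> T call(Supplier<T> remoteCall, T fallback) {
        if (isOpen()) {
            return fallback;            // fail fast: no thread spent waiting
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;    // success resets the breaker
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;      // count the failure toward the threshold
            return fallback;
        }
    }
}
```

The crucial point for the next slides: the breaker only helps once it has tripped; while it is still closed, slow calls can still pile up in queues.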
  13. Queue? Service A Service B (Slow Response) HTTP Request Request

    1 Come Request 2 Come Request 3 Come (Thread Exhaustion)
  14. Queue? Service A Service B (Slow Response) HTTP Request Request

    1 Come Request 2 Come Request 3 Come (Thread Exhaustion) Request 4 Queued
  15. Queue? Service A Service B (Slow Response) HTTP Request Request

    1 Come Request 2 Come Request 3 Come (Thread Exhaustion) Request 4 Queued Request 5 Queued
  16. Queue? Service A Service B (Slow Response) HTTP Request Request

    1 Come Request 2 Come Request 3 Come (Thread Exhaustion) Request 4 Queued Request 5 Queued Request 6 Queued (Max Queue Reached)
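The exhaustion sequence in slides 13-16 can be reproduced with a deliberately tiny `ThreadPoolExecutor`: one worker thread plus a queue of two, so the fourth concurrent task is rejected outright. The class and method names are illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueExhaustion {
    // One worker + queue of 2: returns true because the 4th task is rejected.
    static boolean fourthTaskRejected() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2));
        CountDownLatch block = new CountDownLatch(1);
        Runnable slow = () -> {
            try { block.await(); } catch (InterruptedException ignored) { }
        };

        pool.execute(slow); // taken by the single worker (thread exhaustion)
        pool.execute(slow); // request 2: queued
        pool.execute(slow); // request 3: queued (max queue reached)
        boolean rejected;
        try {
            pool.execute(slow); // request 4
            rejected = false;
        } catch (RejectedExecutionException e) {
            rejected = true;    // dropped fast instead of waiting forever
        }

        block.countDown();
        pool.shutdownNow();
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(fourthTaskRejected()); // → true
    }
}
```

An immediate rejection like this is visible and cheap; an over-large queue hides the same overload until timeouts cascade, which is the trap the next slide describes.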
  17. Cascading • This Queue Problem is cascading! Tomcat Hystrix Threading

    in General <Executor maxQueueSize= <Connector acceptCount= hystrix.threadpool.HystrixThreadPoolKey.maxQueueSize • There are caches involved in every layer – which made us late to notice!
  18. Lesson 2 • Timeouts, managed incorrectly, can eat resources VERY

    fast • Be careful with your queues and caches! • Drop the requests you can’t handle ASAP • Plan for and anticipate cascading failures
  19. Monitoring Applications • In a high-scale application, monitoring is

    a very crucial tool for operations and optimization • Usually we can peek into performance details without much impact • Example trace:
  20. Enter Distributed Tracing Distributed tracing, also called distributed request tracing,

    is a method used to profile and monitor applications, especially those built using a microservices architecture. Distributed tracing helps pinpoint where failures occur and what causes poor performance. https://opentracing.io/docs/overview/what-is-tracing/
  21. Service B Common Transactions Flow in Distributed System Service A

    Executor Thread 1 Thread 2 Thread 3 … Request Executor Thread 1 Thread 2 Thread 3 … Request
  22. Tracing Request Flow Service A Executor Thread 1 (Span 1)

    Thread 2 (Span 3) Thread 3 (Span 2) Thread 4 (Span 4) Request 1 Trace ID
  23. Traces and Spans • Trace ID represents the lifetime of

    a request • Trace ID is the same across services, covering the whole request-response flow • Span ID is an individual unit of work • Span ID must contain a Trace ID to create a relationship between spans and trace • Span ID can be continued to link between spans • 1 Trace ID can have multiple Span IDs • To trace a request we: – mark every request with a Trace ID – create Spans for each thread or unit of work – mark every external request with the trace and span id created earlier
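The trace/span rules above can be sketched as a tiny propagation helper. The header names follow the B3 convention used by Zipkin and Spring Cloud Sleuth; everything else (class names, ID format) is illustrative, not the deck's actual tracing stack.

```java
import java.util.Map;
import java.util.UUID;

public class Tracing {
    // One Trace ID spans the whole request; each unit of work gets its own Span ID.
    static final class SpanContext {
        final String traceId;
        final String spanId;
        final String parentSpanId; // null for the root span

        SpanContext(String traceId, String spanId, String parentSpanId) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentSpanId = parentSpanId;
        }
    }

    // Mark an incoming request with a fresh Trace ID and root Span ID
    static SpanContext startTrace() {
        return new SpanContext(newId(), newId(), null);
    }

    // New unit of work: same Trace ID, fresh Span ID, linked to the parent span
    static SpanContext childSpan(SpanContext parent) {
        return new SpanContext(parent.traceId, newId(), parent.spanId);
    }

    // Attach the IDs to an outgoing request (B3-style header names)
    static Map<String, String> toHeaders(SpanContext ctx) {
        return Map.of("X-B3-TraceId", ctx.traceId, "X-B3-SpanId", ctx.spanId);
    }

    private static String newId() {
        return UUID.randomUUID().toString().replace("-", "");
    }
}
```

The downstream service reads those headers and continues the trace with its own child spans, which is how one Trace ID ends up covering the whole cross-service request-response flow.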
  24. Lesson 3 • To save yourselves from headaches, make sure

    your monitoring tools work well • Use a standard architecture and tools whenever possible • Future discussion: baggage items, logging, tagging
  25. Closing • The only way to know your real performance is

    by testing (on production) • Manage your threads, caches, and queues VERY carefully • Failures could cascade if you are not careful • Your Metrics and Instrumentation tools should be prepared for your architecture