Overview of the Technologies
Overview of Technologies used internally by the OCSG Runtime
• Servlet 3.0 Asynchronous Support
– Available since Java EE 6 (2009) to solve blocking operations
– It uses AsyncContext to ensure efficient usage of server threads
• OCSG DAF (Dynamic API Framework)
– Customizable SDK to implement stage-based pipelines
– It is the underlying framework behind the APIP Policy SDK
• Apache HTTP Async Client Framework
– Open-source implementation of the Reactor pattern, as popularized by Doug Lea's "Scalable IO in Java"
– It uses an eventing model to minimize I/O-bound thread usage
[Diagram: Servlet 3.0 Async · OCSG DAF · Apache HTTP Async Client]
Gateway HTTP Implementation
API Endpoint is Implemented using Servlet 3.0 with Async Support
• Operation complete() is invoked when a response is ready:
– The response action chain has been completed ("happy path")
– The gateway decides to abruptly interrupt the flow (e.g.: no path info)
– Certain errors occur during the processing of the transaction

@WebServlet(name = "Dynamic API Framework", urlPatterns = {"/"}, asyncSupported = true)
public final class DafHttpServlet extends HttpServlet {

    protected void service(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {

        AsyncContext ctx = req.startAsync();
        // Transaction is handled completely asynchronously...
        ctx.complete(); // Response is sent back to caller...
    }
}
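The handoff above can be sketched without a servlet container. In this illustrative stand-in (class and method names are hypothetical, not OCSG code), the "container" thread returns immediately while a worker thread completes the response later, mirroring the startAsync()/complete() contract:

```java
import java.util.concurrent.*;

public class AsyncHandoffSketch {
    // Simulates startAsync()/complete(): the "container" thread returns
    // immediately; a worker thread finishes the response later.
    static CompletableFuture<String> handle(String request, ExecutorService worker) {
        CompletableFuture<String> response = new CompletableFuture<>();
        worker.submit(() -> response.complete("processed:" + request)); // like ctx.complete()
        return response; // the container thread is now free to serve other requests
    }

    public static void main(String[] args) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        System.out.println(handle("GET /", worker).get(5, TimeUnit.SECONDS));
        worker.shutdown();
    }
}
```

The point of the sketch is that no thread blocks waiting for the response: the caller gets a handle, and completion is signaled from another thread, exactly the property the async servlet model gives the gateway.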
Gateway HTTP Implementation
Overview of the OCSG DAF Layer
• DAF, the "Dynamic API Framework", is the core of OCSG
– Customizable SDK to implement request/response actions
– Responsible for the execution of actions (e.g.: API Policies)
– It binds the request/response execution using a single SDK
– It dynamically loads the API configuration from the cache
– Use "oracle.sdp.daf" for debugging in the log4jconfig.xml
– Encapsulated in the following WAR file:
• DAF ensures that built-in EDRs are always generated
– Built-in EDRs are helpful for troubleshooting
– They are the foundation of the analytics engine
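A sketch of the corresponding log4jconfig.xml entry for enabling DAF debug output. Only the category name "oracle.sdp.daf" comes from the slide; the surrounding syntax assumes the log4j 1.x format and should be verified against your installation:

```xml
<!-- Assumed log4j 1.x syntax; the category name is the one quoted above -->
<category name="oracle.sdp.daf">
  <priority value="DEBUG"/>
</category>
```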
Gateway HTTP Implementation
Sequence Diagram of the Runtime Interaction (Response)
[Sequence diagram with participants: AsyncContext, ActionChainManager, CalloutCallback, HttpClientDelegator, InternalIODispatch, Action; calls shown: completed(response), continueActionChain(), process(httpContext), complete()]
• If there are no other actions to execute, then finish the API call
• If the response is from a callout request, then the pipeline must continue
Understanding Thread Pools
Overview of Thread Pools used in Traffic Requests
• Multiple thread pools are created in a single JVM
– Understanding each one of them may help to troubleshoot threading problems
– Most of them cannot be adjusted with a knob, but this is not necessarily an issue
– Gateways are very efficient CPU-wise, but scalability cannot always be ensured
• The most important thread pools for the gateway are:
– WebLogic Single Thread Pool
– I/O Dispatchers Thread Pool
– CachedThreadPool from Executors
– Batch Processors Thread Pool
– Coherence Distributed Cache
Understanding Thread Pools
Overview of the WebLogic Single Thread Pool
• Threads from this pool are used to handle inbound requests
– A thread is taken from the pool when DafHttpServlet accepts a request
– The same thread is used to execute the request action chain, thus it may use CPU
– Then it hands over the execution and goes back to the thread pool
– The size of this pool is controlled by WLS using multiple algorithms*
[Diagram: API Endpoint → Request Action Chain → Request Transmission → Backend Service → Callback Handler → Response Action Chain; layers: Servlet 3.0 Async, OCSG DAF Layer, HTTP Async Client; the current thread pool is highlighted]
* WLS has a maximum pool size set to 400 by default. This number represents the maximum number of concurrent/active threads doing some work. More here.
Understanding Thread Pools
Overview of the I/O Dispatchers Thread Pool
• Threads from this pool are used to handle I/O-related work
– These threads are used to send/receive messages to/from the backend service
– It uses eventing to initiate the sending/receiving within an active HTTP connection
– They are always active and each request/response costs very little CPU
– The size of this pool is the number of CPU cores multiplied by 2 or 4*
[Diagram: API Endpoint → Request Action Chain → Request Transmission → Backend Service → Callback Handler → Response Action Chain; layers: Servlet 3.0 Async, OCSG DAF Layer, HTTP Async Client; the current thread pool is highlighted]
* The actual number depends on which version of OCSG is being used. Starting from OCSG 6.1 build 486, the multiplying factor is four. Prior releases use two.
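The sizing rule from the footnote can be expressed as a small sketch. The helper below is illustrative (not OCSG code); in the Apache HTTP Async Client this value would correspond to the I/O reactor's thread count setting:

```java
public class IoDispatcherSizing {
    // Mirrors the sizing rule from the slide: cores * 2 for releases
    // prior to OCSG 6.1 build 486, cores * 4 from that build onward.
    static int ioThreadCount(int cores, boolean build486OrLater) {
        return cores * (build486OrLater ? 4 : 2);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("thisHost=" + ioThreadCount(cores, true));
        // Example: an 8-core gateway on a recent build gets 32 dispatchers
        System.out.println(ioThreadCount(8, true));
    }
}
```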
Understanding Thread Pools
Overview of the CachedThreadPool from Executors
• Threads from this pool are used to handle outbound responses
– The callback handler hands over the execution and a thread is created/reused
– The same thread is used to execute the response action chain, thus it may use CPU
– The size of this pool is unbounded and it increases/decreases on demand*
– During request callouts, threads from this pool might continue the pipeline
[Diagram: API Endpoint → Request Action Chain → Request Transmission → Backend Service → Callback Handler → Response Action Chain; layers: Servlet 3.0 Async, OCSG DAF Layer, HTTP Async Client; the current thread pool is highlighted]
* Threads from this pool expire in 60 seconds if not being used. Therefore, a pool that remains idle for long enough will not consume any resources. More here.
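The "unbounded, shrinks when idle" behavior matches java.util.concurrent.Executors.newCachedThreadPool(), and both properties can be verified directly from the returned executor:

```java
import java.util.concurrent.*;

public class ResponsePoolSketch {
    public static void main(String[] args) {
        // newCachedThreadPool(): effectively unbounded pool that reuses idle
        // threads and retires any thread idle for 60 seconds.
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        System.out.println("max=" + pool.getMaximumPoolSize());
        System.out.println("keepAliveSec=" + pool.getKeepAliveTime(TimeUnit.SECONDS));
        pool.shutdown();
    }
}
```

This prints a maximum size of Integer.MAX_VALUE and a 60-second keep-alive, which is exactly why an idle gateway response pool eventually consumes no resources.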
Understanding Thread Pools
Overview of the Batch Processors Thread Pool
• Threads from this pool are used to process generated EDRs
– These threads are responsible for publishing EDRs to JMS listeners
– Rest assured that their work doesn't block the API traffic flow
– Each thread pulls data from an internal queue and builds a batch
– The size of this pool is set to 10 by default. There is a knob for this.
[Diagram: API Client → Request Action Chain / Response Action Chain at the API Endpoint; EDRs flow to batch processors BP 1 … BP N, which publish batches to a JMS Topic]
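A minimal, hypothetical sketch of the pull-and-batch loop described above. The queue, batch size, and the printed "publish" line are stand-ins for OCSG's internal queue and the real JMS publishing:

```java
import java.util.*;
import java.util.concurrent.*;

public class BatchProcessorSketch {
    public static void main(String[] args) {
        // Stand-in for the gateway's internal EDR queue
        BlockingQueue<String> edrQueue = new LinkedBlockingQueue<>();
        for (int i = 1; i <= 7; i++) edrQueue.add("edr-" + i);

        int batchSize = 3; // illustrative; the real batch size is internal
        List<String> batch = new ArrayList<>();
        // Each batch-processor thread drains up to batchSize records,
        // publishes them as one batch, and repeats.
        while (edrQueue.drainTo(batch, batchSize) > 0) {
            System.out.println("publish batch: " + batch); // stand-in for the JMS publish
            batch.clear();
        }
    }
}
```

Because the traffic threads only enqueue EDRs and never wait on the publish, this design keeps EDR generation off the API critical path, as the slide notes.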
Understanding Thread Pools
Overview of the Coherence Distributed Cache
• Threads from this pool are used by the Distributed Cache Service
– Coherence caches configured as distributed (a.k.a. partitioned) use threads from this pool to maximize the throughput of message transmission across the cluster members.
– This infrastructure makes sense when the gateway is configured for multi-tier. But as of today, the API Gateway only supports the single-tier configuration and therefore the usage of Coherence is unnecessary.
– The current release of the API Gateway creates 100 threads for this service. This is a waste of resources and the A-Team filed a bug to remove/decrease this configuration.*
* More information about this bug can be found here.
HTTP Connection Management
• "You have said that the gateway uses a finite/small thread pool to handle all I/O-bound work. This seems to be a bottleneck to me, since the thread pool may not be large enough to handle too many concurrent requests. Right?"
"OK… let's back up a little bit…"
[Diagram: Request Action Chain → Request Transmission → Backend Service → Callback Handler → Response Action Chain; multiple messages coming in, limited messages coming out (possible bottleneck?)]
HTTP Connection Management
Reactor Pattern and Event-Driven to the Rescue
• The gateway does not use the traditional threading pattern where each request is handled by its own thread. Instead, it uses the same reactive model found in popular technologies such as Nginx, Node.js, Vert.x, etc.
Traditional Threading Pattern: The pizza shop has several operators to take orders over the phone. Each operator has a phone line. After taking the order, the customer is kept on the line until the pizza is done baking. Then the operator tells the customer that it is ready for pick-up.
Reactive Threading Pattern: The pizza shop only hires a few operators, but trains them to hang up after taking the order and to call the customer back when the pizza becomes ready for pick-up. The operator simply dispatches events as they happen!
* This metaphor has been taken from a blog post called "Explaining Event-Driven Web Servers to your Grandma". More information here.
HTTP Connection Management
Reactor Pattern and Event-Driven to the Rescue
• Each route (i.e.: endpoint) creates a pool of persistent HTTP connections that is accessed by the I/O Dispatchers thread pool in a round-robin fashion. Each route has up to 200 connections by default, totaling 4000 per gateway.
[Diagram: Request Action Chain → Request Transmission → Backend Service → Callback Handler → Response Action Chain; the same I/O Dispatchers thread pool serves both layers, leasing from HTTP Conn 01-05 on each side]
* Although it may look like it, no sequencing is guaranteed by the gateway. Responses may arrive in a different order.
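The per-route pool with round-robin leasing can be modeled with a toy sketch. All class and field names below are illustrative; in the real gateway the pool lives inside the Apache HTTP Async Client connection manager, which enforces the per-route maximum:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RoutePoolSketch {
    // 200 per route is the default quoted in the slide
    static final int MAX_PER_ROUTE = 200;

    final String[] connections = new String[MAX_PER_ROUTE];
    final AtomicInteger next = new AtomicInteger();

    RoutePoolSketch(String route) {
        // Persistent connections are created per route (i.e. per endpoint)
        for (int i = 0; i < MAX_PER_ROUTE; i++)
            connections[i] = route + "#conn-" + i;
    }

    // Round-robin selection, as described for the I/O dispatchers
    String lease() {
        return connections[Math.floorMod(next.getAndIncrement(), MAX_PER_ROUTE)];
    }

    public static void main(String[] args) {
        RoutePoolSketch pool = new RoutePoolSketch("http://backend:8080");
        System.out.println(pool.lease());
        System.out.println(pool.lease());
    }
}
```

Two consecutive leases return different connections, illustrating why a handful of dispatcher threads can keep hundreds of backend connections busy without blocking on any one of them.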
Analyzing Thread Dumps
Investigating Issues on the Gateway related to Threads
• At this point you may have figured out that the gateway is very efficient in terms of CPU usage, although scalability-related problems can always happen.
• If there is a scalability problem going on, most likely there will be some sort of contention in the gateway. In order to investigate this properly, you may want to start by analyzing thread dumps to identify the possible source of contention.
• This section will show how to take thread dumps from the gateways, as well as how to identify the main thread pools discussed earlier. With this knowledge, you will be able to better assess whether a possible stuck thread belongs to the gateway or is something that the API developer did wrong.
Analyzing Thread Dumps
Techniques to take Thread Dumps from the Gateway JVM
• There are several tools available to take thread dumps from the gateway JVM. Choose the one that is best suited for the environment where your gateways are deployed.
– jstack -l <pid> > <fileName>
– jcmd <pid> Thread.print > <fileName>
– kill -3 <pid> (on most Unix O.S.; the dump goes to the JVM's stdout)
– JVisualVM from the Java JDK
– JMC (Java Mission Control)
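Besides the external tools, the same information can be captured in-process through the standard JMX ThreadMXBean, which is handy when shell access to the gateway host is restricted:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ProgrammaticThreadDump {
    public static void main(String[] args) {
        // dumpAllThreads(lockedMonitors, lockedSynchronizers) returns the same
        // per-thread data that jstack/jcmd print, including lock information.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            System.out.println(info.getThreadName() + " state=" + info.getThreadState());
        }
    }
}
```

Running this inside the gateway JVM (for example from a diagnostic servlet) lists every live thread, so the pool names discussed in the previous section can be matched against what is actually running.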