Slide 1

Slide 1 text

Caching for business applications
Michael Plöd - innoQ
Twitter: @bitboss
Kraków, 17-19 May 2017

Slide 2

Slide 2 text

I will talk about:
Caching types / topologies
Best practices for caching in enterprise applications

I will NOT talk about:
Latency / synchronization discussion
What is the best caching product on the market
HTTP / database caching
Caching in JPA, Hibernate or other ORMs

Slide 3

Slide 3 text

Cache / kæʃ / 
 
 In computing, a cache is a component that transparently stores data so that future requests for that data can be served faster. The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. If requested data is contained in the cache (cache hit), this request can be served by simply reading the cache, which is comparatively faster. Otherwise (cache miss), the data has to be recomputed or fetched from its original storage location, which is comparatively slower. Hence, the greater the number of requests that can be served from the cache, the faster the overall system performance becomes. Source: http://en.wikipedia.org/wiki/Cache_(computing)
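The last sentence of the definition can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a simplified model in which a hit costs a fixed cache lookup time and a miss costs a fixed origin fetch time. The class name, method name and all latency numbers are assumptions chosen for illustration, not measurements from this talk.

```java
// Simplified model: a request is either a hit (cacheMs) or a miss (originMs).
// All numbers are hypothetical and only illustrate the trend.
final class CacheLatencyModel {

    /** Expected latency per request for a given hit ratio between 0.0 and 1.0. */
    static double expectedLatencyMs(double hitRatio, double cacheMs, double originMs) {
        if (hitRatio < 0.0 || hitRatio > 1.0) {
            throw new IllegalArgumentException("hitRatio must be between 0 and 1");
        }
        return hitRatio * cacheMs + (1.0 - hitRatio) * originMs;
    }

    public static void main(String[] args) {
        double cacheMs = 1.0;    // assumed cost of a cache hit
        double originMs = 100.0; // assumed cost of fetching from the original store
        for (double hitRatio : new double[] {0.0, 0.5, 0.9, 0.99}) {
            System.out.printf("hit ratio %.2f -> %.2f ms%n",
                    hitRatio, expectedLatencyMs(hitRatio, cacheMs, originMs));
        }
    }
}
```

Under these assumed numbers, the average latency is dominated by the misses until the hit ratio gets close to 1, which is why the hit ratio matters more than the raw speed of the cache.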

Slide 4

Slide 4 text

That’s awesome. Let’s cache everything and everywhere and distribute it all in a cluster in a transactional manner. Ohhh, by the way: Twitter has been doing that for ages. Are you crazy?

Slide 5

Slide 5 text

Business-Applications !=
 Twitter / Facebook & co.

Slide 6

Slide 6 text

Many enterprise-grade projects adopt caching either too defensively or too aggressively, and run into consistency or performance issues as a result.

Slide 7

Slide 7 text

But with a well-adjusted caching strategy you will make your application more scalable, faster and cheaper to operate.

Slide 8

Slide 8 text

Types of caches: Local Cache, Data Grid, Document Store, JPA First Level Cache, JPA Second Level Cache, Hybrid Cache
Places for caches: Database, Heap, HTTP Proxy, Browser, Processor, Disk, Off Heap, Persistence Framework, Application

Slide 9

Slide 9 text

We will focus on local and distributed caching at the application level

Slide 10

Slide 10 text

Which data shall I cache? Where shall I cache? Which cache shall I use? What impact does it have on my infrastructure? How about data consistency? How do I introduce caching? How do I abstract my cache implementation?

Slide 11

Slide 11 text

1 Identify suitable layers for caching

Slide 12

Slide 12 text

Suitable Layers for Caching
Layers in the diagram: ComplaintManagementRestController, ComplaintManagementBusinessService, DataAggregationManager, Host Commands, SAP Commands, Spring Data Repository
Caching labels in the diagram: HTTP Caching, Read Operations (four times), Read and Write Operations

Slide 13

Slide 13 text

2 Stay local as long as possible

Slide 14

Slide 14 text

Local In-Memory JVM Cache

Slide 15

Slide 15 text

Clustered (diagram: three JVMs, each with its own cache)

Slide 16

Slide 16 text

Which data shall I cache? Where shall I cache? Which cache shall I use? What impact does it have on my infrastructure? How about data consistency? How do I introduce caching? How about caching in Spring?

Slide 17

Slide 17 text

Clustered - with sync (diagram: four JVMs, each with its own cache, kept in sync by invalidation or replication)

Slide 18

Slide 18 text

3 Avoid real replication where possible

Slide 19

Slide 19 text

Invalidation - Option 1 (diagram: four caches; entry #1 arrives via PUT (Insert))

Slide 20

Slide 20 text

Invalidation - Option 1 (diagram: four caches holding entry #1; a PUT (Update) of #1 triggers an invalidation message "inv #1")

Slide 21

Slide 21 text

Invalidation - Option 2 (diagram: four caches; entry #1 arrives via PUT (Insert))

Slide 22

Slide 22 text

Replication (diagram: four caches holding entry #1; PUT (Insert) and PUT (Update) of #1)
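The difference between the two sync styles can be sketched in a few lines. This is a minimal in-process illustration, assuming that invalidation means "tell peers to drop their copy" and replication means "ship the new value to every peer". The class PeerCache and its methods are hypothetical names. Real clustered caches do this over the network and with far more care.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: peers are plain objects in one JVM, not cluster members.
final class PeerCache {
    enum SyncMode { INVALIDATION, REPLICATION }

    private final Map<String, String> entries = new HashMap<>();
    private final List<PeerCache> peers = new ArrayList<>();
    private final SyncMode mode;

    PeerCache(SyncMode mode) { this.mode = mode; }

    /** Registers two caches as peers of each other. */
    static void connect(PeerCache a, PeerCache b) {
        a.peers.add(b);
        b.peers.add(a);
    }

    /** Writes locally, then notifies every peer according to the sync mode. */
    void put(String key, String value) {
        entries.put(key, value);
        for (PeerCache peer : peers) {
            if (mode == SyncMode.REPLICATION) {
                peer.entries.put(key, value); // the full value travels to each peer
            } else {
                peer.entries.remove(key);     // only the key travels; peers reload on demand
            }
        }
    }

    String get(String key) { return entries.get(key); }

    public static void main(String[] args) {
        PeerCache a = new PeerCache(SyncMode.INVALIDATION);
        PeerCache b = new PeerCache(SyncMode.INVALIDATION);
        connect(a, b);
        b.put("#1", "old");
        a.put("#1", "new");
        System.out.println("peer after invalidation: " + b.get("#1"));
    }
}
```

In this sketch an invalidating peer only receives the key and reloads the value on its next miss, whereas a replicating peer receives the full payload on every write.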

Slide 23

Slide 23 text

So far, every cache could potentially hold all of the data, which consumes heap memory.

Slide 24

Slide 24 text

Big Heap ?

Slide 25

Slide 25 text

Which data shall I cache? Where shall I cache? Which cache shall I use? What impact does it have on my infrastructure? How about data consistency? How do I introduce caching? How about caching in Spring?

Slide 26

Slide 26 text

4 Avoid big heaps just for caching

Slide 27

Slide 27 text

A big heap leads to long major GCs. (Diagram: a 32 GB heap holding application data and cache.)

Slide 28

Slide 28 text

Long GCs can destabilize your cluster. (Diagram: four JVMs with caches, two of them in GC.)

Slide 29

Slide 29 text

Small caches are a bad idea! Many evictions, fewer hits, no "hot data".

This is especially critical for replicating caches.
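The effect of an undersized cache can be reproduced with a toy experiment. The sketch below uses the JDK's LinkedHashMap in access order as a stand-in LRU store, purely to count hits and evictions. It is not meant as a cache to deploy, and the class names, key counts and capacities are assumptions. The cyclic access pattern is deliberately the worst case for LRU.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU store built on the JDK, used here only to count evictions.
final class BoundedLruMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;
    private int evictions;

    BoundedLruMap(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true keeps least recently used first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean evict = size() > maxEntries;
        if (evict) {
            evictions++;
        }
        return evict;
    }

    int evictions() { return evictions; }
}

final class SmallCacheDemo {
    /** Replays a cyclic access pattern over distinctKeys keys and returns the number of hits. */
    static int countHits(int capacity, int distinctKeys, int requests) {
        BoundedLruMap<Integer, String> cache = new BoundedLruMap<>(capacity);
        int hits = 0;
        for (int i = 0; i < requests; i++) {
            int key = i % distinctKeys;
            if (cache.get(key) != null) {
                hits++;
            } else {
                cache.put(key, "value-" + key); // miss: load and store
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("capacity 10:  " + countHits(10, 100, 1000) + " hits");
        System.out.println("capacity 100: " + countHits(100, 100, 1000) + " hits");
    }
}
```

With a capacity below the working set, every entry is evicted before it is requested again, so the cache does work without ever producing a hit. In a replicating setup, each of those pointless inserts would also be shipped to the peers.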

Slide 30

Slide 30 text

5 Use a distributed cache for large amounts of data

Slide 31

Slide 31 text

Distributed Caches (diagram: four application JVMs using cache nodes 1, 2 and 3)

Slide 32

Slide 32 text

(Diagram: cache nodes 1 and 2 holding Customer #23, Customer #30, Customer #27 and Customer #32.)

Slide 33

Slide 33 text

Data is being distributed and backed up. (Diagram: cache nodes 1 and 2 holding Customer #23, #30, #27 and #32, plus BACKUP #27, #32, #23 and #30.)

Slide 34

Slide 34 text

(Diagram: cache node 3 joins nodes 1 and 2, which hold Customer #23, #30, #27 and #32 and BACKUP #27, #32, #23 and #30.)

Slide 35

Slide 35 text

(Diagram: cache node 4 joins nodes 1, 2 and 3; the customers and backups are spread across the nodes.)

Slide 36

Slide 36 text

(Diagram: cache nodes 1 to 4 share Customer #23, #30, #27 and #32 and BACKUP #27, #32, #23 and #30.)
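How a key ends up on one node with a backup on another can be sketched with simple modulo hashing. This is an assumption made for illustration, not the scheme of any particular product. The class and method names are hypothetical, and the customer numbers are the ones from the diagrams.

```java
// Naive placement: primary = hash mod node count, backup = the next node.
final class PartitionSketch {

    /** Index of the node that owns the primary copy of the key. */
    static int primaryNode(Object key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    /** Index of the node holding the backup copy; differs from the primary when nodeCount > 1. */
    static int backupNode(Object key, int nodeCount) {
        return (primaryNode(key, nodeCount) + 1) % nodeCount;
    }

    public static void main(String[] args) {
        int[] customers = {23, 30, 27, 32};
        for (int nodeCount = 2; nodeCount <= 4; nodeCount++) {
            for (int customer : customers) {
                System.out.println(nodeCount + " nodes: Customer #" + customer
                        + " primary=" + primaryNode(customer, nodeCount)
                        + " backup=" + backupNode(customer, nodeCount));
            }
        }
    }
}
```

Plain modulo placement moves many keys whenever the node count changes, so treat this only as a picture of the primary/backup idea rather than a rebalancing strategy.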

Slide 37

Slide 37 text

A distributed cache leads to smaller heaps and more capacity, and is easy to scale. (Diagram: application data and cache in a 2 - 4 GB heap … Cache)

Slide 38

Slide 38 text

6 The operations specialist is your new best friend

Slide 39

Slide 39 text

Clustered caches are complex. Please make sure that operations and networking are involved as early as possible.

Slide 40

Slide 40 text

Which data shall I cache? Where shall I cache? Which cache shall I use? What impact does it have on my infrastructure? How about data consistency? How do I introduce caching? How about caching in Spring?

Slide 41

Slide 41 text

7 Make sure that only suitable data gets cached

Slide 42

Slide 42 text

The best cache candidates are read-mostly data that is expensive to obtain.

Slide 43

Slide 43 text

If you absolutely must cache write-intensive data, make sure to use a distributed cache and not a replicated or invalidating one.

Slide 44

Slide 44 text

Which data shall I cache? Where shall I cache? Which cache shall I use? What impact does it have on my infrastructure? How about data consistency? How do I introduce caching? How about caching in Spring?

Slide 45

Slide 45 text

8 Only use existing cache implementations

Slide 46

Slide 46 text

NEVER write your own cache implementation EVER

Slide 47

Slide 47 text

Cache implementations: Infinispan, EHCache, Hazelcast, Couchbase, Memcache, OSCache, SwarmCache, Xtreme Cache, Apache DirectMemory, Terracotta, Coherence, Gemfire, Cacheonix, WebSphere eXtreme Scale, Oracle 12c In Memory Database

Slide 48

Slide 48 text

Which data shall I cache? Where shall I cache? Which cache shall I use? What impact does it have on my infrastructure? How about data consistency? How do I introduce caching? How about caching in Spring?

Slide 49

Slide 49 text

9 Introduce Caching in three steps

Slide 50

Slide 50 text

1. Optimize your application
2. Local Cache
3. Distributed Cache
(Diagram labels: Performance Boost, Performance Loss)

Slide 51

Slide 51 text

10 Optimize Serialization

Slide 52

Slide 52 text

Example: Hazelcast, putting and getting 10,000 objects locally

                             GET Time   PUT Time   Payload Size
Serializable                 ?          ?          ?
DataSerializable             ?          ?          ?
IdentifiedDataSerializable   ?          ?          ?

Slide 53

Slide 53 text

Example: Hazelcast, putting and getting 10,000 objects locally

                             GET Time   PUT Time   Payload Size
Serializable                 1287 ms    1220 ms    1164 bytes
DataSerializable             443 ms     408 ms     916 bytes
IdentifiedDataSerializable   264 ms     207 ms     882 bytes
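Part of the gap in the table comes from how much metadata each format writes. The following sketch uses only JDK classes, not the Hazelcast API, to compare default Java serialization with a hand-written field-by-field encoding of the same object. The Customer class and its fields are hypothetical, and the resulting sizes are not comparable to the numbers above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationSizeDemo {

    // Hypothetical value class, not taken from the talk.
    static final class Customer implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String name;

        Customer(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Default Java serialization: writes class metadata along with the field values. */
    static byte[] javaSerialize(Customer customer) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(customer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Hand-written encoding: writes only the field values. */
    static byte[] manualSerialize(Customer customer) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(customer.id);
            out.writeUTF(customer.name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Customer customer = new Customer(23L, "Jane Doe");
        System.out.println("Java serialization: " + javaSerialize(customer).length + " bytes");
        System.out.println("Manual encoding:    " + manualSerialize(customer).length + " bytes");
    }
}
```

The hand-written form is smaller because it skips the class descriptor, at the price of having to maintain the read and write code by hand. Serialization hooks offered by cache products make a similar trade.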

Slide 54

Slide 54 text

JAVA SERIALIZATION
 SUCKS for Caching if alternatives are present

Slide 55

Slide 55 text

11 Use Off-Heap Storage for Cache instances with more than 4 GB Heap Size

Slide 56

Slide 56 text

(Diagram: one JVM with a 32 GB heap holding both the cache runtime and the cache data.)

Slide 57

Slide 57 text

(Diagram: a JVM with a 2 GB heap for the cache runtime, with very short garbage collections; the cache data lives in 30 GB of off-heap RAM, with no garbage collection.)
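What "off heap" means in JVM terms can be shown with a direct ByteBuffer, whose contents may live outside the garbage-collected heap. This is a minimal sketch of the storage idea only, with a hypothetical OffHeapSlot class that holds a single string. Cache products with off-heap support manage many entries, free space and concurrency on top of this.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// One value stored in native memory; the Java heap only holds the small buffer handle.
final class OffHeapSlot {
    private final ByteBuffer buffer;

    OffHeapSlot(int capacityBytes) {
        // allocateDirect reserves memory outside the normal garbage-collected heap
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Serializes the value to bytes and copies them into the off-heap buffer. */
    void write(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        buffer.clear();
        buffer.putInt(bytes.length); // length prefix
        buffer.put(bytes);
    }

    /** Copies the bytes back onto the heap and rebuilds the value. */
    String read() {
        ByteBuffer view = buffer.duplicate();
        view.position(0);
        int length = view.getInt();
        byte[] bytes = new byte[length];
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        OffHeapSlot slot = new OffHeapSlot(1024);
        slot.write("Customer #23");
        System.out.println(slot.read());
    }
}
```

Every read and write crosses a serialization boundary, which is one reason the serialization format from the previous rule matters so much for off-heap caches.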

Slide 58

Slide 58 text

12 Mind the security gap

Slide 59

Slide 59 text

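Mind security when reading data from the cache. (Diagram: the application reaches "CRM", "Host" and the DB through a security layer each; the cache holds CRM data, SAP data and DB data, with a question mark where its security layer would be.)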
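One way to close that gap is to run the authorization check in front of the cache, so that a hit is subject to the same rules as a call to the backend. The sketch below assumes a very simple permission model (a set of readable keys per user). The class, its methods and the key format are hypothetical and only illustrate where the check sits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class SecuredCacheReader {
    private final Map<String, String> cache = new HashMap<>();
    // Hypothetical permission model: which users may read which cache keys.
    private final Map<String, Set<String>> readableKeysByUser = new HashMap<>();

    void put(String key, String value) { cache.put(key, value); }

    void grant(String user, Set<String> keys) { readableKeysByUser.put(user, keys); }

    /** Checks authorization before the cache is consulted, whether it would hit or miss. */
    Optional<String> read(String user, String key) {
        Set<String> allowed = readableKeysByUser.get(user);
        if (allowed == null || !allowed.contains(key)) {
            throw new SecurityException(user + " may not read " + key);
        }
        return Optional.ofNullable(cache.get(key));
    }
}
```

A caller might use it as `reader.read("alice", "crm:23")`. The point of the design is that cached CRM data is never handed out on the strength of a cache hit alone.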

Slide 60

Slide 60 text

13 Abstract your cache provider

Slide 61

Slide 61 text

public Account retrieveAccount(String accountNumber) {
    Cache cache = ehCacheMgr.getCache("accounts");
    Account account = null;
    Element element = cache.get(accountNumber);
    if (element == null) {
        // execute some business logic for retrieval
        // account = result of logic above
        cache.put(new Element(accountNumber, account));
    } else {
        account = (Account) element.getObjectValue();
    }
    return account;
}

Tying your code to a cache provider is bad practice

Slide 62

Slide 62 text

Try switching from EHCache to Hazelcast: you will have to adjust the cache-related lines of retrieveAccount (the code from Slide 61) to the Hazelcast API.

Slide 63

Slide 63 text

You can’t switch cache providers between environments: in retrieveAccount (the code from Slide 61), EHCache is tightly coupled to your code.

Slide 64

Slide 64 text

You mess up your business logic with infrastructure: the cache lookup, the Element handling and the cache.put call in retrieveAccount (the code from Slide 61) are all caching-related code without any business relevance.

Slide 65

Slide 65 text

Introducing Spring’s cache abstraction

@Cacheable("Customers")
public Customer getCustomer(String customerNumber) {
    …
}
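The separation that an annotation like this buys can also be pictured in plain Java. The sketch below is not how Spring implements its abstraction. It only shows the underlying idea: the business class talks to a small provider-neutral interface, and each cache product gets its own adapter. KeyValueCache, InMemoryCache and AccountService are hypothetical names, and the map-backed adapter is a stand-in for a real provider.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Provider-neutral port; one adapter per cache product would implement it. */
interface KeyValueCache<K, V> {
    V getOrLoad(K key, Function<K, V> loader);
}

/** Stand-in adapter backed by a plain map, for illustration and tests only. */
final class InMemoryCache<K, V> implements KeyValueCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    @Override
    public V getOrLoad(K key, Function<K, V> loader) {
        return entries.computeIfAbsent(key, loader);
    }
}

/** Business code depends on the port only, never on a specific cache product. */
final class AccountService {
    private final KeyValueCache<String, String> cache;
    int loads; // counts how often the expensive retrieval actually ran

    AccountService(KeyValueCache<String, String> cache) {
        this.cache = cache;
    }

    String retrieveAccount(String accountNumber) {
        return cache.getOrLoad(accountNumber, this::loadAccount);
    }

    private String loadAccount(String accountNumber) {
        loads++;
        return "Account " + accountNumber; // placeholder for the real retrieval logic
    }
}
```

Switching providers then means writing another KeyValueCache adapter and changing the wiring, for example `new AccountService(new InMemoryCache<>())` in a test environment, while retrieveAccount stays untouched.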

Slide 66

Slide 66 text

Michael Plöd - @bitboss Kraków, 17-19 May 2017 THANK YOU!
 
 Follow me on Twitter for slides
 @bitboss