In-Memory Data Grids Essentials. Oracle Coherence

IMDG talk given at the Grid Dynamics meetup: http://www.meetup.com/Grid-Dynamics-Meetup/events/151407992/

Max Myslyvtsev

November 30, 2013

Transcript

  1. Scalable eCommerce Platform Solutions. Preconditions of Using IMDG

    • Large amounts of data (10-100 GB) • Low latency • High availability • Distributed calculations
  2. Showcase Environment

    • Data: StackOverflow.com dump (Jun 2013), 2.1M Users, 14.6M Posts • Cluster: 5 nodes x 7 GB = 35 GB total • Preload time: 20 minutes
  3. Communication

    • Custom protocols over TCP or UDP • Senior node • Heartbeats – Cluster heartbeats – Node heartbeats [Diagram: cluster with the Senior node labelled]
  6. Distributed Cache

    [Diagram: a Cluster containing a Client Node (Cache Interface) and a Storage Node (Cache Interface, Data Storage)]
  7. Cache Topology

    • Replicated: Storage 1, Storage 2 and Storage 3 each hold A, B, C, D, E, F • Partitioned: Storage 1 holds A, D, B, E; Storage 2 holds C, A, D, F; Storage 3 holds E, B, F, C
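The two topologies on this slide can be sketched in a few lines of Python. This is only an illustration, not Coherence's placement algorithm: the function names, the CRC32-based owner choice and the "backup on the next node" rule are all assumptions made for the example.

```python
import zlib

def replicated_layout(keys, nodes):
    """Replicated topology: every node stores a full copy of every entry."""
    return {node: sorted(keys) for node in nodes}

def partitioned_layout(keys, nodes, backups=1):
    """Partitioned topology: one primary owner per entry plus `backups` copies.

    Owner selection (CRC32 modulo node count) and backup placement (the
    following nodes) are hypothetical simplifications.
    """
    layout = {node: [] for node in nodes}
    for key in sorted(keys):
        primary = zlib.crc32(key.encode()) % len(nodes)
        for offset in range(backups + 1):
            layout[nodes[(primary + offset) % len(nodes)]].append(key)
    return layout
```

With six keys and three nodes, the replicated layout stores 18 copies in total, while the partitioned layout with one backup stores 12, which is why partitioned caches scale capacity as nodes are added.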
  8. Cache Operations

    • Get/Put/Remove • Query • Event handling • Invocation • MapReduce • Indexes
  9. Cache Operations: Put

    • Client knows responsible storage – Key hash is used to find it • Automatic backups [Diagram: the Client applies key hashing over keys A-F; Storage 1 and Storage 2 hold Primary A, Backup A, Primary F and Backup F]
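A minimal sketch of the Put path described above, assuming a client that hashes the key to pick the primary storage and writes a backup copy to the next node. `Node`, `Client` and the CRC32-based hash are hypothetical names chosen for illustration; they are not part of the Coherence API.

```python
import zlib

class Node:
    """A storage node with separate primary and backup maps (illustrative)."""
    def __init__(self, name):
        self.name = name
        self.primary = {}
        self.backup = {}

class Client:
    def __init__(self, nodes):
        self.nodes = nodes

    def owner_index(self, key):
        # The client computes the responsible storage itself from the key hash.
        return zlib.crc32(str(key).encode()) % len(self.nodes)

    def put(self, key, value):
        i = self.owner_index(key)
        primary = self.nodes[i]
        backup = self.nodes[(i + 1) % len(self.nodes)]  # assumed backup rule
        primary.primary[key] = value
        backup.backup[key] = value  # automatic backup on a different node
        return primary.name, backup.name

    def get(self, key):
        return self.nodes[self.owner_index(key)].primary.get(key)
```

Because the owner is derived from the key alone, a Get or Put needs a single network hop to the primary rather than a lookup round-trip.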
  10. Cache Operations: Query

    • Broadcast request – Unless Query is Key-Associated • All entries are evaluated – Unless Indexes are used [Diagram: Storage 1 and Storage 2 hold entries (key: field1, field2): A: 10, 20; B: 30, 40; C: 50, 60; D: 70, 80]
  11. Cache Operations: Query

    • Broadcast request – Unless Query is Key-Associated • All entries are evaluated – Unless Indexes are used [Diagram: query "field1=10 or field2>70" over Storage 1 and Storage 2, which hold entries (key: field1, field2): A: 10, 20; B: 30, 40; C: 50, 60; D: 70, 80]
  12. Cache Operations: Query

    • Broadcast request – Unless Query is Key-Associated • All entries are evaluated – Unless Indexes are used [Diagram: query "field1=10 and key=A" over Storage 1 and Storage 2, which hold entries (key: field1, field2): A: 10, 20; B: 30, 40; C: 50, 60; D: 70, 80]
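The difference between a broadcast query and a key-associated query can be sketched with the four entries from the slides. The split of entries between the two storages and the function names are assumptions for illustration, and the owner lookup is a stand-in for real key hashing.

```python
# Assumed placement: A and B on one storage, C and D on the other.
STORAGES = {
    "storage-1": {"A": {"field1": 10, "field2": 20},
                  "B": {"field1": 30, "field2": 40}},
    "storage-2": {"C": {"field1": 50, "field2": 60},
                  "D": {"field1": 70, "field2": 80}},
}

def broadcast_query(storages, predicate):
    """Send the query to every storage; each evaluates all its local entries."""
    matches = []
    for entries in storages.values():
        matches += [k for k, v in entries.items() if predicate(k, v)]
    return sorted(matches), len(storages)  # (matching keys, nodes contacted)

def key_associated_query(storages, key, predicate):
    """Contact only the storage that owns `key` and evaluate that one entry."""
    for entries in storages.values():  # stand-in for locating the owner by hash
        if key in entries:
            return ([key] if predicate(key, entries[key]) else []), 1
    return [], 1  # the owner is still the only node contacted
```

For example, `broadcast_query(STORAGES, lambda k, v: v["field1"] == 10 or v["field2"] > 70)` touches both storages, while `key_associated_query(STORAGES, "A", lambda k, v: v["field1"] == 10)` touches one.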
  13. Cache Operations: Event handling

    • Subscription – Configuration-time – Runtime • Attachment – Key – Query • Flexible event data – Key, Value, Fields – Old, New [Diagram: Storage, Client 1, Client 2; a Put of (key: A, field1: 30, field2: 40) replaces the stored entry (key: A, field1: 10, field2: 20)]
  14. Cache Operations: Event handling

    • Subscription – Configuration-time – Runtime • Attachment – Key – Query • Flexible event data – Key, Value, Fields – Old, New [Diagram: Storage, Client 1, Client 2; a Notify event (key: A, oldField1: 10, newField1: 30) is delivered; the stored entry is now (key: A, field1: 30, field2: 40)]
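As a rough sketch of runtime subscription, the class below lets a listener attach either to a key or to a query-like predicate and delivers both the old and the new value. `ObservableCache` and its method names are hypothetical and only mirror the concepts on the slide.

```python
class ObservableCache:
    """In-process stand-in for a cache that notifies listeners on Put."""
    def __init__(self):
        self._data = {}
        self._listeners = []  # list of (matcher, callback) pairs

    def add_key_listener(self, key, callback):
        # Attachment by key: fire only for changes to this key.
        self._listeners.append((lambda k, old, new: k == key, callback))

    def add_filter_listener(self, predicate, callback):
        # Attachment by query: fire when the new value satisfies the predicate.
        self._listeners.append((lambda k, old, new: predicate(new), callback))

    def put(self, key, value):
        old = self._data.get(key)
        self._data[key] = value
        for matches, callback in self._listeners:
            if matches(key, old, value):
                callback(key, old, value)  # event carries key, old and new
```

A listener that only needs one field can extract it from the old and new values, which corresponds to the "oldField1 / newField1" event shown in the diagram.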
  15. Cache Operations: Invocation

    • Task is serialized and distributed – May hold any data • Variable execution scope – Specific nodes – All nodes – Nodes bound to data [Diagram: the Client holds a Task; Node 1 and Node 2]
  16. Cache Operations: Invocation

    • Task is serialized and distributed – May hold any data • Variable execution scope – Specific nodes – All nodes – Nodes bound to data [Diagram: the Client's Task is copied to Node 1 and Node 2]
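The invocation model can be approximated as follows: the task is serialized on the client, shipped to a chosen set of nodes, and executed against each node's local data. JSON serialization, the `OPERATIONS` registry and all names here are assumptions made to keep the sketch self-contained; a real grid would typically ship serialized task objects instead.

```python
import json

# Operations a node knows how to run; the task names one and carries its data.
OPERATIONS = {
    "count": lambda local, arg: len(local),
    "count_above": lambda local, arg: sum(1 for v in local.values() if v > arg),
}

class Node:
    def __init__(self, name, local):
        self.name = name
        self.local = local  # entries stored on this node

    def execute(self, payload):
        task = json.loads(payload)  # deserialize on the receiving node
        return OPERATIONS[task["op"]](self.local, task.get("arg"))

def invoke(nodes, task, targets=None):
    """Run `task` on all nodes, or only on the nodes named in `targets`."""
    payload = json.dumps(task).encode()  # serialize once on the client
    selected = [n for n in nodes if targets is None or n.name in targets]
    return {n.name: n.execute(payload) for n in selected}

def invoke_for_keys(nodes, task, keys):
    """Run `task` only on the nodes that hold at least one of `keys`."""
    owners = {n.name for n in nodes if any(k in n.local for k in keys)}
    return invoke(nodes, task, targets=owners)
```

The three entry points correspond to the three execution scopes on the slide: all nodes, specific nodes, and nodes bound to data.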
  18. Cache Operations: MapReduce

    • Data-bound mapping – By keys – By Query • Parallel execution [Diagram: sum(field2) where field1>20 over Storage 1 and Storage 2, which hold entries (key: field1, field2): A: 10, 20; B: 30, 40; C: 50, 60; D: 70, 80]
  20. Cache Operations: MapReduce

    • Data-bound mapping – By keys – By Query • Parallel execution [Diagram: sum(field2) where field1>20 over entries A: 10, 20; B: 30, 40; C: 50, 60; D: 70, 80 on Storage 1 and Storage 2; mapped values: 40, 60, 80]
  21. Cache Operations: MapReduce

    • Data-bound mapping – By keys – By Query • Parallel execution [Diagram: sum(field2) where field1>20 over entries A: 10, 20; B: 30, 40; C: 50, 60; D: 70, 80 on Storage 1 and Storage 2; partial sums: 40, 140]
  22. Cache Operations: MapReduce

    • Data-bound mapping – By keys – By Query • Parallel execution [Diagram: sum(field2) where field1>20 over entries A: 10, 20; B: 30, 40; C: 50, 60; D: 70, 80 on Storage 1 and Storage 2; final result: 180]
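The MapReduce build-up on these slides (40, 60, 80, then 40 and 140, then 180) can be reproduced with a small sketch in which each storage sums its own matching entries in parallel and the caller combines the partial sums. The assignment of entries to storages is an assumption chosen to match the partial sums on the slides, and the thread pool merely stands in for parallel execution on separate nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed placement, consistent with the partial sums 40 and 140.
STORAGES = {
    "storage-1": {"A": {"field1": 10, "field2": 20},
                  "B": {"field1": 30, "field2": 40}},
    "storage-2": {"C": {"field1": 50, "field2": 60},
                  "D": {"field1": 70, "field2": 80}},
}

def local_sum(entries):
    """Map and local reduce: sum(field2) where field1 > 20 on one storage."""
    return sum(v["field2"] for v in entries.values() if v["field1"] > 20)

def grid_sum(storages):
    """Run local_sum on every storage in parallel, then combine the partials."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_sum, storages.values()))
    return partials, sum(partials)
```

Only the small partial results cross the network; the entries themselves never leave the storage that holds them.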
  23. Cache Operations: Indexes

    • Forward and Reverse maps • Runtime addition and removal • Local to Storage nodes • Used for all applicable operations • Efficient for high-cardinality data [Diagram: Forward map (Key → Value) and Reverse map (Value → Key)]
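A forward/reverse index pair can be sketched as two dictionaries kept in sync on every update. `FieldIndex` and its methods are hypothetical names; the sketch only shows why an equality lookup through the reverse map avoids evaluating every entry.

```python
from collections import defaultdict

class FieldIndex:
    """Index over one extracted field, local to a single storage node."""
    def __init__(self, extractor):
        self.extractor = extractor
        self.forward = {}                # key -> indexed value
        self.reverse = defaultdict(set)  # indexed value -> set of keys

    def update(self, key, entry):
        self.remove(key)  # drop any stale mapping first
        value = self.extractor(entry)
        self.forward[key] = value
        self.reverse[value].add(key)

    def remove(self, key):
        if key in self.forward:
            old = self.forward.pop(key)
            self.reverse[old].discard(key)
            if not self.reverse[old]:
                del self.reverse[old]

    def keys_equal(self, value):
        # Equality lookup without scanning the entries themselves.
        return set(self.reverse.get(value, ()))
```

The forward map is what makes updates cheap: it tells the index which reverse bucket the key currently sits in without re-reading the old entry.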
  24. Persistence Integration

    • Read-through • Write-through [Diagrams: Client, Data Grid (Storage), Durable Storage]
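Read-through and write-through can be illustrated with a cache that loads missing entries from a backing store on Get and writes synchronously to that store on Put. The class name is hypothetical, and a plain dictionary stands in for the durable storage.

```python
class ReadWriteThroughCache:
    def __init__(self, store):
        self.store = store  # dict standing in for durable storage
        self.cache = {}
        self.loads = 0      # number of reads that reached durable storage

    def get(self, key):
        if key not in self.cache:
            # Read-through: on a miss, the grid itself loads from storage.
            self.loads += 1
            value = self.store.get(key)
            if value is None:
                return None
            self.cache[key] = value
        return self.cache[key]

    def put(self, key, value):
        # Write-through: durable storage is updated before Put returns.
        self.store[key] = value
        self.cache[key] = value
```

The client talks only to the cache in both directions; the trade-off is that every Put pays the latency of the durable write.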
  25. Asynchronous Persistence Integration

    • Refresh-ahead • Write-behind [Diagrams: Client, Data Grid (Storage, Queue), Durable Storage]
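Write-behind differs from write-through in that the Put returns immediately and a queue is flushed to durable storage later. The sketch below is an assumption-laden simplification: `flush` is called explicitly here, whereas a real grid would trigger it from a background thread after a configured delay, and repeated writes to one key are coalesced so that only the latest value is persisted.

```python
class WriteBehindCache:
    def __init__(self, store):
        self.store = store  # dict standing in for durable storage
        self.cache = {}
        self.pending = {}   # queued writes: key -> latest value

    def put(self, key, value):
        self.cache[key] = value
        self.pending[key] = value  # enqueue; durable storage not touched yet

    def flush(self):
        """Persist all queued writes in one batch and return the batch size."""
        batch, self.pending = self.pending, {}
        self.store.update(batch)
        return len(batch)
```

Until the flush happens, a queued write exists only in memory, which is one reason the deck later lists "No durability guarantees" among the cons.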
  26. Near Caching

    • Local access by primary key • Lazy population • Invalidation – On entry change – By timeout [Diagram: a Node with a Near Cache in front of a Distributed Cache spread over two Storage nodes; entries A, A, B]
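A near cache can be sketched as a small local map in front of the distributed cache: entries are fetched lazily on first access and dropped either when a change notification arrives or when a timeout expires. `NearCache`, the injectable `clock` and the `remote_reads` counter are hypothetical, and a dictionary stands in for the distributed cache.

```python
import time

class NearCache:
    def __init__(self, back, ttl_seconds, clock=time.monotonic):
        self.back = back        # dict standing in for the distributed cache
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be tested
        self.front = {}         # key -> (value, expires_at)
        self.remote_reads = 0

    def get(self, key):
        hit = self.front.get(key)
        if hit is not None and hit[1] > self.clock():
            return hit[0]       # served locally
        # Lazy population: fetch from the distributed cache on a miss.
        self.remote_reads += 1
        value = self.back.get(key)
        self.front[key] = (value, self.clock() + self.ttl)
        return value

    def on_entry_changed(self, key):
        # Invalidation on entry change: forget the local copy.
        self.front.pop(key, None)
```

Only keys that this node has actually read occupy local memory, which is the main contrast with the eager population of a continuous cache.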
  27. Continuous Caching

    • Local access by any Query • Eager population • Near real-time data [Diagram: a Node with a Continuous Cache in front of a Distributed Cache spread over two Storage nodes; entries A, B, A, B]
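A continuous cache can be thought of as a locally materialized query result that is filled eagerly and then kept current by change events. In this sketch `ContinuousView` is a hypothetical name, and `on_put` / `on_remove` are assumed to be called by whatever event mechanism the grid provides.

```python
class ContinuousView:
    def __init__(self, source, predicate):
        self.predicate = predicate
        # Eager population: copy every matching entry up front.
        self.local = {k: v for k, v in source.items() if predicate(v)}

    def on_put(self, key, value):
        # Keep the view in step with the grid as entries change.
        if self.predicate(value):
            self.local[key] = value
        else:
            self.local.pop(key, None)  # entry no longer matches the query

    def on_remove(self, key):
        self.local.pop(key, None)

    def query(self, predicate):
        # Any further query is answered from local memory.
        return sorted(k for k, v in self.local.items() if predicate(v))
```

The view is only "near" real-time because it lags the grid by the time it takes an event to reach the node.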
  28. Showcase Environment: Memory Usage

    Cache Scope            Number   Storage   Indexes   Total*
    Users per Node         420K     0.1 GB    35 MB     0.25 GB
    Users across Cluster   2.1M     0.5 GB    175 MB    1.25 GB
    Posts per Node         2.92M    1.85 GB   0.55 GB   4.4 GB
    Posts across Cluster   14.6M    9.25 GB   2.75 GB   22 GB
    (Storage, Indexes and Total are sizes in memory.)
    * Total includes primary storage, backup storage, indexes and operational costs
  29. Showcase Environment: Memory Optimization

    Mode             Storage across Cluster   Nodes   Heap used
    No compression   24.6 GB                  9       6.4/7 GB
    GZIP             9.75 GB                  5       4.8/7 GB
  30. Pros and Cons

    Pros: • Robust caching and processing solution • Horizontal scalability Cons: • Data should fit into memory • No durability guarantees • GC influences stability
  31. References

    • Workshop – https://github.com/mmyslyvtsev/imdg-workshop • Coherence Knowledge Base – http://coherence.oracle.com • Developers Guide – http://docs.oracle.com/cd/E18686_01/coh.37/e18677/toc.htm • Book – http://www.amazon.com/Oracle-Coherence-3-5-Aleksandar-Seovic/dp/1847196128