Coherence - Architecture key notes

Oracle Coherence
Architecture key notes
Alexey Ragozin [email protected]
May 2012

Presentation overview
Topics
• Network model
• Threading model
• Cache operations scalability
• A few well-known pitfalls
* Usage of a partitioned cache is assumed unless stated otherwise

Network overview
Inter-member transport in a Coherence cluster
• Message-based protocol: TCMP
• Guaranteed in-order delivery between members
• NACKs for low-latency communication
• Can work over UDP, TCP, SSL (+ Oracle Exabus)
• Limited use of multicast (deprecated)
Cluster housekeeping protocol
• TCP Ring – a fast way to detect killed processes
• Witness protocol for disconnecting members
• Communication pausing

Network overview
Coherence*Extend
• TCP or SSL transport
• Each remote service has one connection at a time
  - A single TCP connection is shared between all users of the service
  - The number of connections can be increased by using multiple services
• Request pass-through
  - if the Extend connection and the cluster use the same serialization

Network pipeline
[Diagram: path of a simple cache GET request between two members. Stages shown: client thread, API, serialization, cache service, service thread, worker thread, packet publisher, packet speaker, OS, packet listener, packet receiver, deserialization.]

Network pipeline
• Heavy thread usage
  - More cores are better
  - Starving for CPU means more context switches
• Network IO is effectively single-threaded
  - Multiple nodes per server may be required to utilize the network
• Each service has a single control thread

Data distribution
[Diagram: partitions #0 to #5 of Cache A and Cache B distributed across Member 1 and Member 2, with a separate backing map for each cache on each member.]

Data distribution
• The same partition distribution is used for all caches in the same service
  - can be exploited for collocating data across caches
• Balancing is by partition count
• Single backing map per cache
  - per node (default)
  - per partition (can be configured)
• Partition backups are stored in a separate backing map
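A minimal sketch of why a shared partition distribution allows collocation, in plain Java rather than the Coherence API. It assumes a simplified mapping (hash of an "associated key" modulo the partition count); the real product hashes the serialized key, and the names PartitionSketch, OrderKey and the partition count used here are hypothetical.

```java
class PartitionSketch {
    // Illustrative value only; the real partition count is a service setting.
    static final int PARTITION_COUNT = 257;

    /** Simplified mapping: the partition depends only on the associated key. */
    static int partitionOf(Object associatedKey) {
        return Math.floorMod(associatedKey.hashCode(), PARTITION_COUNT);
    }

    /** Hypothetical key of an "orders" cache, associated with its customer. */
    static final class OrderKey {
        final String customerId;
        final long orderId;

        OrderKey(String customerId, long orderId) {
            this.customerId = customerId;
            this.orderId = orderId;
        }

        /** The value used for partitioning instead of the whole key. */
        Object associatedKey() {
            return customerId;
        }
    }

    public static void main(String[] args) {
        String customerKey = "customer-42";                 // key in a "customers" cache
        OrderKey orderKey = new OrderKey("customer-42", 1001L); // key in an "orders" cache

        // Both caches share one distribution, so equal associated keys
        // land in the same partition and therefore on the same member.
        System.out.println(partitionOf(customerKey));
        System.out.println(partitionOf(orderKey.associatedKey()));
    }
}
```

Because the partition is derived from the customer id in both cases, an order and its customer always end up in the same partition, whichever member currently owns it.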

Threading overview
• Control thread – one per service
  - receives network messages
  - performs cache operations if no thread pool is configured
• Thread pool – optional, size is configurable
  - deserializes request data (if needed)
  - performs the operation (aggregators, entry processors, backing map access)
  - serializes result data (if needed)
• Event thread – one per service
  - calls map listeners
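The division of labour between the control thread and the optional thread pool can be modelled with standard JDK executors. This is a toy model under stated assumptions, not Coherence code: ServiceThreadingSketch and its thread names are hypothetical, and a pool size of 0 stands for "no thread pool configured".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

class ServiceThreadingSketch {
    private final ExecutorService control =
            Executors.newSingleThreadExecutor(daemonFactory("control"));
    private final ExecutorService workers; // null models "no thread pool configured"

    ServiceThreadingSketch(int workerCount) {
        this.workers = workerCount > 0
                ? Executors.newFixedThreadPool(workerCount, daemonFactory("worker"))
                : null;
    }

    private static ThreadFactory daemonFactory(String name) {
        return runnable -> {
            Thread thread = new Thread(runnable, name);
            thread.setDaemon(true); // do not keep the JVM alive
            return thread;
        };
    }

    /** Submits one operation and returns the name of the thread that ran it. */
    String executingThread() {
        CompletableFuture<String> result = new CompletableFuture<>();
        // Every request first arrives on the single control thread.
        control.execute(() -> {
            Runnable operation = () -> result.complete(Thread.currentThread().getName());
            if (workers == null) {
                operation.run();            // executed inline, blocking the control thread
            } else {
                workers.execute(operation); // handed off, control thread is free again
            }
        });
        return result.join();
    }

    public static void main(String[] args) {
        System.out.println(new ServiceThreadingSketch(0).executingThread());
        System.out.println(new ServiceThreadingSketch(2).executingThread());
    }
}
```

In the model, an operation that runs inline delays every later message for that service, which is the reason a thread pool is usually worth configuring for non-trivial operations.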

Locking and job distribution
• Update operations require partition locks
• Reads, including aggregators, are lock free
• “Read dirty” – changes are visible across operations
• Updates are atomic per job
• Jobs (only if a thread pool is enabled)
  - Key set based requests are split: one job per partition
  - Filter based requests: one multi-partition job
  - Calculate key set, lock partitions, execute job
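Splitting a key set based request into one job per partition can be sketched as follows. This is an illustration in plain Java, assuming a simplified key-to-partition mapping (hashCode modulo partition count); JobSplitSketch and splitByPartition are hypothetical names, not Coherence API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class JobSplitSketch {
    /** Groups keys by partition; each map entry models one job. */
    static <K> Map<Integer, List<K>> splitByPartition(Collection<K> keys, int partitionCount) {
        Map<Integer, List<K>> jobs = new TreeMap<>();
        for (K key : keys) {
            // Assumed simplified mapping from key to partition.
            int partition = Math.floorMod(key.hashCode(), partitionCount);
            jobs.computeIfAbsent(partition, ignored -> new ArrayList<>()).add(key);
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        Map<Integer, List<Integer>> jobs = splitByPartition(keys, 4);
        // One job per touched partition; each job would lock only its own partition.
        jobs.forEach((partition, jobKeys) ->
                System.out.println("partition " + partition + " -> " + jobKeys));
    }
}
```

The number of jobs grows with the number of distinct partitions touched by the key set, whereas a filter based request stays a single job that covers many partitions.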

Problems of threading model
• Event delivery is single-threaded
• Dispatching a large request may block the control thread, making the service unresponsive
• No discrimination between tasks
  - A few long-running tasks may saturate the thread pool, making the cache unresponsive
• Limited scheduling priorities
• Key based requests produce more jobs, occupying more threads
  - A single large getAll() request against a DB-backed cache may saturate the thread pools on all nodes for a considerable time

Interaction with backing map
• The backing map notifies of content changes via events
  - events received on the thread executing a write operation are added to the transaction change set (change sets are replicated atomically)
  - events received out of band are replicated asynchronously
  - the backup partition copy is passive

Operation scalability
• Key based operations – linearly scalable
  - growing the cluster linearly increases operation throughput
• Indexed queries / aggregation
  - growing the cluster contributes only marginally to throughput
  - more data marginally decreases throughput
• Non-indexed queries / aggregation
  - throughput depends on the ratio of data volume to cluster core count

Well known pitfalls
• Relying on reference walking
  - Problem: network latency accumulation
  - A hierarchical organization is a typical example
• Solutions
  - Denormalization
  - Data affinity
  - Indexes
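The latency accumulation caused by reference walking can be made visible by counting round trips. The sketch below uses a hypothetical CountingCache as a stand-in for a remote cache (every get() counts as one network round trip) and compares walking parent references with reading a denormalized path stored in a single entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReferenceWalkSketch {
    /** Stand-in for a remote cache: each get() models one network round trip. */
    static final class CountingCache<V> {
        private final Map<String, V> data = new HashMap<>();
        int roundTrips;

        void put(String key, V value) { data.put(key, value); } // setup only, not counted

        V get(String key) {
            roundTrips++;
            return data.get(key);
        }
    }

    /** Follows child -> parent references: one round trip per hierarchy level. */
    static List<String> walkToRoot(CountingCache<String> parents, String start) {
        List<String> path = new ArrayList<>();
        String node = start;
        while (node != null) {
            path.add(node);
            node = parents.get(node);
        }
        return path;
    }

    /** Denormalized alternative: the whole path is stored with the entry. */
    static List<String> readDenormalized(CountingCache<List<String>> paths, String start) {
        return paths.get(start);
    }

    public static void main(String[] args) {
        CountingCache<String> parents = new CountingCache<>();
        parents.put("c", "b");
        parents.put("b", "a");
        System.out.println(walkToRoot(parents, "c") + " round trips: " + parents.roundTrips);

        CountingCache<List<String>> paths = new CountingCache<>();
        paths.put("c", Arrays.asList("c", "b", "a"));
        System.out.println(readDenormalized(paths, "c") + " round trips: " + paths.roundTrips);
    }
}
```

The walk costs one round trip per level, so latency grows with the depth of the hierarchy, while the denormalized read stays at a single round trip at the price of duplicated data that must be kept up to date.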

Well known pitfalls
• Too fine-grained operations
  - accumulating network latency
• Too bulky operations
  - blocking the control thread for a long time
  - saturating thread pools
• Solutions
  - Grouping operations in limited-size batches
  - Grouping operations per member
  - Grouping operations per partition
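Grouping operations into limited-size batches might look like the sketch below. It is plain Java under stated assumptions: BatchingSketch is a hypothetical helper, and the bulkGet function stands in for a remote bulk read such as a getAll() call; the right batch size depends on the workload and is not prescribed by the slides.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class BatchingSketch {
    /** Splits keys into consecutive batches of at most batchSize elements. */
    static <K> List<List<K>> batches(List<K> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<K>> result = new ArrayList<>();
        for (int from = 0; from < keys.size(); from += batchSize) {
            int to = Math.min(from + batchSize, keys.size());
            result.add(new ArrayList<>(keys.subList(from, to)));
        }
        return result;
    }

    /** Issues one bulk call per batch and merges the partial results. */
    static <K, V> Map<K, V> getAllInBatches(List<K> keys, int batchSize,
                                            Function<Collection<K>, Map<K, V>> bulkGet) {
        Map<K, V> merged = new HashMap<>();
        for (List<K> batch : batches(keys, batchSize)) {
            merged.putAll(bulkGet.apply(batch)); // bounded amount of work per request
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add(i);
        }
        System.out.println(batches(keys, 4));
    }
}
```

Batches amortize network latency over many keys while keeping each request small enough that it cannot occupy the control thread or the thread pool for long.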

Well known pitfalls
• Abusing grid-side (in-place) processing
  - CPU on storage nodes is a limited resource
  - grid-side processing may require more total serialization effort
• Solution: account for all factors when choosing
  - As-is data retrieval requires no marshaling on the grid side
  - Network bandwidth is rarely a limitation
  - Grid node CPUs are a shared and limited resource

Thank you
Alexey Ragozin [email protected]
http://blog.ragozin.info - my articles
http://code.google.com/p/gridkit - my open source code
