Slide 1

Slide 1 text

DAX 101
2020/2/15 Kenju Wagatsuma

Slide 2

Slide 2 text

DAX Wrap Up

Slide 3

Slide 3 text

DAX is… a DynamoDB-specific in-memory cache store

Slide 4

Slide 4 text

[DAX] Glossary of Terms
Term | Description
Nodes | A node is the minimum component of a DAX cluster. The “primary node” handles read and write requests, and “replica nodes” only handle read requests.
Clusters | A cluster is composed of […] nodes.
Parameter Groups | You can change the TTL by setting custom Parameter Groups.

Slide 5

Slide 5 text

[DAX] Comparison to other cache stores
Feature | DAX | ElastiCache (memcached) | in-memory
Managed | ✅ | ✅ | […]
Shared cache | ✅ | ✅ | […]
Backend data source | DynamoDB only | anything (MySQL, Disk, RPC, etc.) | […]
Metrics | ✅ | ✅ | […]
TTL | ⚠ global TTL | ✅ | ⚠ hard work required
Performance | […] ms | […] ms | […]

Slide 6

Slide 6 text

[DAX] Single-leader Replication
3+ nodes in different Availability Zones (Multi-AZ) for production
Background replication (eventually consistent model)
A single “primary node” handles all write requests

Slide 7

Slide 7 text

[DAX] Single-leader Replication - Write
(Diagram: a DAX Cluster spanning Subnet A, Subnet B, and Subnet C, with one Primary node and two Replica nodes.)

Slide 8

Slide 8 text

[DAX] Single-leader Replication - Write
(Diagram labels: Application; DAX Cluster with Primary, Replica, Replica; DynamoDB Table; arrows labeled Write and Read.)

Slide 9

Slide 9 text

[DAX] Single-leader Replication - Replicate
(Diagram labels: Application; DAX Cluster with Primary, Replica, Replica; DynamoDB Table; arrows labeled Write and Read.)

Slide 10

Slide 10 text

[DAX] Single-leader Replication - Read
(Diagram labels: Application; DAX Cluster with Primary, Replica, Replica; DynamoDB Table; arrows labeled Write and Read.)
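The write, replicate, and read steps of single-leader replication can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyCluster` and its methods are hypothetical names, nodes are plain dictionaries, and replication is triggered by hand rather than in the background. It is not how DAX is implemented.

```python
import itertools


class ToyCluster:
    """Toy model of single-leader replication (hypothetical, not DAX itself)."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._pending = []  # writes not yet copied to the replicas
        # Reads rotate over every node, primary included.
        self._readers = itertools.cycle([self.primary] + self.replicas)

    def write(self, key, value):
        # All writes go to the single primary node.
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Stand-in for background replication: copy pending writes to replicas.
        for key, value in self._pending:
            for replica in self.replicas:
                replica[key] = value
        self._pending.clear()

    def read(self, key):
        # Any node may serve a read, so a replica can return stale data.
        return next(self._readers).get(key)


cluster = ToyCluster()
cluster.write("user#1", "v1")
before = [cluster.read("user#1") for _ in range(3)]  # primary, replica, replica
cluster.replicate()
after = [cluster.read("user#1") for _ in range(3)]
print(before)
print(after)
```

Reading before replication returns the new value only from the primary, which is the eventually consistent behavior the slides describe.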

Slide 11

Slide 11 text

[DAX] Cache Algorithm
Least Recently Used (LRU) cache
Negative cache is also implemented
Eviction happens when new items are cached and the cache is full
Global Time-To-Live (TTL)
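The combination of LRU eviction, negative caching, and one global TTL can be sketched in a few lines. This is an illustrative model only: `LruTtlCache`, `load`, and `backend_calls` are hypothetical names, and the real DAX internals are not described in the slides.

```python
import time
from collections import OrderedDict

_MISSING = object()  # sentinel stored for keys the backend does not have


class LruTtlCache:
    """Hypothetical LRU cache with a global TTL and negative caching."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds  # one global TTL applied to every entry
        self.clock = clock
        self._entries = OrderedDict()  # key -> (value, expires_at)
        self.backend_calls = 0

    def get(self, key, load):
        """Return the cached value, calling load(key) only on a miss."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self._entries.move_to_end(key)  # mark as most recently used
            value = entry[0]
        else:
            self.backend_calls += 1
            value = load(key)
            if value is None:
                value = _MISSING  # negative cache: remember the miss
            self._entries[key] = (value, now + self.ttl)
            self._entries.move_to_end(key)
            if len(self._entries) > self.capacity:
                self._entries.popitem(last=False)  # evict least recently used
        return None if value is _MISSING else value


table = {"a": 1, "b": 2, "c": 3}
cache = LruTtlCache(capacity=2, ttl_seconds=300)
cache.get("a", table.get)
cache.get("b", table.get)
cache.get("a", table.get)    # hit: "a" becomes most recently used
cache.get("c", table.get)    # cache is full, so "b" is evicted
cache.get("zzz", table.get)  # miss in the backend, cached as a negative entry
cache.get("zzz", table.get)  # served from the negative cache
print(cache.backend_calls)
```

Passing a fake `clock` makes TTL expiry easy to exercise without sleeping.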

Slide 12

Slide 12 text

[DAX] Item Cache / Query Cache
Cache | Item Cache | Query Cache
Operation | GetItem / BatchGetItem | Query / Scan
Metrics | ItemCacheHits / ItemCacheMisses | QueryCacheHits / QueryCacheMisses, ScanCacheHits / ScanCacheMisses

Slide 13

Slide 13 text

[DAX] Write Strategy - Write-Through
#write: Application → DAX → DynamoDB
#read: Application → DAX → DynamoDB

Slide 14

Slide 14 text

[DAX] Write Strategy - Write-Around
#write: Application → DynamoDB
#read: Application → DAX → DynamoDB

Slide 15

Slide 15 text

[DAX] Write Strategy
Strategy | Write-Through | Write-Around
Pros | Synchronized items between DAX and DynamoDB | Writes can scale (e.g. bulk update)
Cons | ☹ Write operation has overhead; ☹ Items can be inconsistent if another app writes to DynamoDB directly | ☹ Eventually consistent: items in DAX nodes can be outdated
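The difference between the two write strategies can be shown with a small in-memory model. This is a sketch, not DAX client code: `CachedTable`, `write_through`, and `write_around` are hypothetical names, and plain dictionaries stand in for both the cache and the DynamoDB table.

```python
class CachedTable:
    """Hypothetical cache layer in front of a dict standing in for a table."""

    def __init__(self, table):
        self.table = table
        self.cache = {}

    def read(self, key):
        # The read path is the same for both strategies: cache first, then table.
        if key not in self.cache:
            self.cache[key] = self.table.get(key)
        return self.cache[key]

    def write_through(self, key, value):
        # Write via the cache layer: table and cache are updated together.
        self.table[key] = value
        self.cache[key] = value

    def write_around(self, key, value):
        # Write straight to the table; the cached copy is left untouched.
        self.table[key] = value


store = CachedTable({"price": 100})
first = store.read("price")        # populates the cache
store.write_through("price", 120)
after_through = store.read("price")
store.write_around("price", 150)
after_around = store.read("price")  # still the cached value
print(first, after_through, after_around, store.table["price"])
```

After the write-around call the cache keeps serving the old value until the entry expires or is evicted, which is the staleness trade-off listed under Cons.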

Slide 16

Slide 16 text

[DAX] Within VPC + Multi-AZ

Slide 17

Slide 17 text

[DAX] Limitations
Maximum nodes per cluster is 10 (1 primary node + up to 9 replica nodes).
Maximum nodes per Region is 50.
Maximum Parameter Groups per Region is 20.
Maximum DAX Subnet Groups per Region is 50.
Maximum subnets per Subnet Group is 20.

Slide 18

Slide 18 text

Monitoring

Slide 19

Slide 19 text

[Monitoring] CloudWatch Metrics
Metrics | Description
CPUUtilization | Percentage of CPU utilization (t[…].small uses […] vCPU). ⚠ You need to scale up if this metric is saturated.
ItemCacheHits, ItemCacheMisses | Calculate CacheHitRatio from these metrics. ⚠ ItemCache […] QueryCache

Slide 20

Slide 20 text

[Monitoring] CloudWatch Metrics
Metrics | Description
ThrottledRequestCount | Number of requests throttled by the node or cluster. ⚠ Throttled at DAX […] Throttled at DynamoDB
FailedRequestCount | Number of requests that resulted in an error being reported. ⚠ ErrorReqCount […] Throttled […] Failed

Slide 21

Slide 21 text

[Monitoring] CloudWatch Metrics
Metrics | Description
EvictedSize | ⚠ If CacheHitRatio […] EvictedSize is growing, the working set might be increasing. Consider scaling up.
ClientConnections | Number of simultaneous connections from clients. ⚠ ClientConCount ≈ number of Running Tasks

Slide 22

Slide 22 text

[Monitoring] Grafana

Slide 23

Slide 23 text

[Monitoring] SLI/SLO

Slide 24

Slide 24 text

Alerting

Slide 25

Slide 25 text

[Alerting] CW Alarm - Infrastructure
CW Alarm → SNS Topic → Lambda → Slack
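The Lambda step in this pipeline could look roughly like the following. It is a minimal sketch, assuming the alarm arrives as a standard SNS-triggered Lambda event and that Slack is reached through an incoming webhook. `build_slack_payload` and the `SLACK_WEBHOOK_URL` environment variable are hypothetical names, and the message format is only an illustration.

```python
import json
import os
import urllib.request


def build_slack_payload(event):
    """Turn an SNS-delivered CloudWatch alarm into a Slack message body."""
    # SNS wraps the alarm as a JSON string inside Records[0].Sns.Message.
    alarm = json.loads(event["Records"][0]["Sns"]["Message"])
    text = "[{state}] {name}: {reason}".format(
        state=alarm["NewStateValue"],
        name=alarm["AlarmName"],
        reason=alarm["NewStateReason"],
    )
    return {"text": text}


def handler(event, context):
    # SLACK_WEBHOOK_URL is a hypothetical environment variable name.
    request = urllib.request.Request(
        os.environ["SLACK_WEBHOOK_URL"],
        data=json.dumps(build_slack_payload(event)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the payload construction in its own function makes it testable without network access.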

Slide 26

Slide 26 text

[Alerting] CW Alarm - CPUUtilization

Slide 27

Slide 27 text

[Alerting] CW Alarm - Cache Hit Ratio
ItemCacheHitRatio = ItemCacheHits / ( ItemCacheHits + ItemCacheMisses )
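The hit-ratio formula translates directly into code. The sketch below assumes the two inputs are the summed ItemCacheHits and ItemCacheMisses for one period. The function name and the choice to return `None` when there is no traffic are illustrative assumptions, not something the slides specify.

```python
def item_cache_hit_ratio(hits, misses):
    """Return ItemCacheHits / (ItemCacheHits + ItemCacheMisses)."""
    total = hits + misses
    if total == 0:
        # No requests in the period: the ratio is undefined, so avoid 0/0.
        return None
    return hits / total


print(item_cache_hit_ratio(90, 10))
```

Treating the zero-traffic case explicitly avoids a division error and keeps an idle cluster from being reported as a 0% hit ratio.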

Slide 28

Slide 28 text

[Alerting] CW Alarm - Cache Hit Ratio

Slide 29

Slide 29 text

[Alerting] DAX Events - Infrastructure
DAX Event → SNS Topic → Lambda → Slack

Slide 30

Slide 30 text

[Alerting] DAX Events - Infrastructure
Failure/success in adding a node
Changes to the security groups