Slide 1

Slide 1 text

Amazon DynamoDB Essentials
Sungmin Kim, AWS Solutions Architect (2023-05-10)

Slide 2

Slide 2 text

Agenda
• Hash Table vs. DynamoDB
• Partition Key + Sort Key
• Local Secondary Index (LSI)
• Global Secondary Index (GSI)
• Scaling DynamoDB
• DynamoDB Key Design Guide - Composite Key

Slide 3

Slide 3 text

Amazon DynamoDB
• Document or Key-Value
• Scales to Any Workload
• Fully Managed NoSQL
• Access Control
• Event Driven Programming
• Fast and Consistent

Slide 4

Slide 4 text

Why NoSQL?

SQL                      NoSQL
Optimized for storage    Optimized for compute
Normalized/relational    Denormalized/hierarchical
Ad hoc queries           Instantiated views
Scale vertically         Scale horizontally
Good for OLAP            Built for OLTP at scale

Slide 5

Slide 5 text

Table
• A table contains items; each item is made up of attributes

Partition Key
• Mandatory
• Key-value access pattern
• Determines data distribution

Sort Key
• Optional
• Model 1:N relationships
• Enables rich query capabilities

Query capabilities
• All items for key
• ==, <, >, >=, <=
• “begins with”
• “between”
• “contains”, “in” (available in filter expressions rather than key conditions)
• sorted results
• counts
• top/bottom N values

Slide 6

Slide 6 text

Hash vs. DynamoDB

Hash Table: Key → Hash Function HF(k), ex) h(k) = k % TableSize → V1, V2, ..., Vn
DynamoDB: Key = Partition Key (Required) + Sort Key (Optional); HF(pk) places PK1, PK7, PK11, ..., PKn in a Key Space split into ranges (1~5, 6~10, 11~20, ...)

Hash Table           DynamoDB
In Memory            Disk
Collision            Partition + Sort Key
Range Query          Secondary Index - Local (LSI), Global (GSI)
(Full) Scan          Parallel Scan
Resizing (Rehash)    Split partitions & Add more storage
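
The hash-table side of this comparison can be sketched in a few lines. This is an illustrative toy, assuming integer keys and chaining for collisions; the class and method names are hypothetical. It uses the slide's h(k) = k % TableSize and shows why a range query degenerates into a full scan and why growth forces a rehash.

```python
class ChainedHashTable:
    """Toy in-memory hash table using h(k) = k % table_size with chaining."""

    def __init__(self, table_size=4):
        self.table_size = table_size
        self.buckets = [[] for _ in range(table_size)]
        self.count = 0

    def _h(self, key):
        return key % self.table_size

    def put(self, key, value):
        bucket = self.buckets[self._h(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # colliding keys share a bucket
        self.count += 1
        if self.count > 2 * self.table_size:
            self._resize(2 * self.table_size)

    def get(self, key):
        for existing_key, value in self.buckets[self._h(key)]:
            if existing_key == key:
                return value
        return None

    def range_query(self, low, high):
        # Hashing destroys key order, so every bucket has to be scanned.
        return sorted(
            (k, v) for bucket in self.buckets for k, v in bucket if low <= k <= high
        )

    def _resize(self, new_size):
        # Resizing changes h(k), so every entry is rehashed.
        entries = [entry for bucket in self.buckets for entry in bucket]
        self.table_size = new_size
        self.buckets = [[] for _ in range(new_size)]
        for key, value in entries:
            self.buckets[self._h(key)].append((key, value))
```

With table_size=4, keys 1, 5 and 9 all land in bucket 1, which is the collision case listed in the comparison.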

Slide 7

Slide 7 text

Partition Keys
• Partition Key uniquely identifies an item
• Partition Key is used for building an unordered hash index
• Allows table to be partitioned for scale

Key Space: 00 to FF, split into the ranges 00-54, 55-A9, AA-FF
• Id = 1, Name = Jim: Hash (1) = 7B
• Id = 2, Name = Andy, Dept = Eng: Hash (2) = 48
• Id = 3, Name = Kim, Dept = Ops: Hash (3) = CD
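
The mapping from a hashed partition key to a key-space range can be sketched as follows. The range boundaries and hash values are copied from the slide; the actual DynamoDB hash function is internal to the service, so a lookup table stands in for it here, and the function name is hypothetical.

```python
import bisect

# Inclusive upper bounds of the three key-space ranges on the slide:
# 00-54, 55-A9, AA-FF.
PARTITION_UPPER_BOUNDS = [0x54, 0xA9, 0xFF]

# Hash values as shown on the slide (an assumption standing in for the
# service's internal hash function).
SLIDE_HASHES = {1: 0x7B, 2: 0x48, 3: 0xCD}


def partition_for(hash_value):
    """Return the 1-based number of the range that contains hash_value."""
    if not 0x00 <= hash_value <= 0xFF:
        raise ValueError("hash value outside the 00-FF key space")
    return bisect.bisect_left(PARTITION_UPPER_BOUNDS, hash_value) + 1


placement = {item_id: partition_for(h) for item_id, h in SLIDE_HASHES.items()}
```

Under these assumptions Id = 2 (hash 48) falls in the first range, Id = 1 (hash 7B) in the second, and Id = 3 (hash CD) in the third.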

Slide 8

Slide 8 text

Partition:Sort Key
• Partition:Sort Key uses two attributes together to uniquely identify an Item
• Within unordered hash index, data is arranged by the sort key
• No limit on the number of items (∞) per partition key
  - Except if you have local secondary indexes

Partition 1 (00:0 to 54:∞), Hash (2) = 48
  Customer# = 2, Order# = 10, Item = Pen
  Customer# = 2, Order# = 11, Item = Shoes
Partition 2 (55 to A9:∞), Hash (1) = 7B
  Customer# = 1, Order# = 10, Item = Toy
  Customer# = 1, Order# = 11, Item = Boots
Partition 3 (AA to FF:∞), Hash (3) = CD
  Customer# = 3, Order# = 10, Item = Book
  Customer# = 3, Order# = 11, Item = Paper
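
To illustrate why keeping items ordered by sort key within a partition key makes range queries cheap, here is a minimal in-memory sketch. It is not how DynamoDB stores data internally; the class and method names are hypothetical, and the sample data is the Customer#/Order# example from the slide.

```python
import bisect
from collections import defaultdict


class ItemCollections:
    """Items grouped by partition key and kept ordered by sort key."""

    def __init__(self):
        self._sort_keys = defaultdict(list)  # partition key -> sorted sort keys
        self._items = {}                     # (partition key, sort key) -> item

    def put(self, pk, sk, item):
        # The (pk, sk) pair identifies the item; a repeated pair overwrites it.
        if (pk, sk) not in self._items:
            bisect.insort(self._sort_keys[pk], sk)
        self._items[(pk, sk)] = item

    def query_between(self, pk, low, high):
        """Return items for pk whose sort key is in [low, high], in sort-key order."""
        keys = self._sort_keys.get(pk, [])
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return [self._items[(pk, sk)] for sk in keys[start:end]]


orders = ItemCollections()
orders.put(2, 11, "Shoes")
orders.put(2, 10, "Pen")
orders.put(1, 10, "Toy")
orders.put(1, 11, "Boots")
```

Here `orders.query_between(2, 10, 11)` touches only the entries for Customer# = 2 and returns them in Order# order, regardless of insertion order.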

Slide 9

Slide 9 text

Partitions are three-way replicated

Replica 1, Replica 2, and Replica 3 each hold the same items:
• Id = 1, Name = Jim
• Id = 2, Name = Andy, Dept = Eng
• Id = 3, Name = Kim, Dept = Ops

Slide 10

Slide 10 text

Local Secondary Index (LSI)
• Alternate sort key attribute
• Index is local to a partition key
• 10 GB max per partition key (table items plus their LSI entries), i.e. LSIs limit the number of sort (range) key values a single partition key can hold!

Table: A1 (partition), A2 (sort), A3, A4, A5

LSIs
• KEYS_ONLY: A1 (partition), A3 (sort), A2 (item key)
• INCLUDE A3: A1 (partition), A4 (sort), A2 (item key), A3 (projected)
• ALL: A1 (partition), A5 (sort), A2 (item key), A3 (projected), A4 (projected)
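
The three projection types can be sketched as a small function that decides which attributes of a table item are copied into an index. This is an illustrative model of the A1..A5 diagram, not service code; the function name and parameters are hypothetical.

```python
def project(item, table_keys, index_keys, projection_type, non_key_attributes=()):
    """Return the attributes of `item` that an index would hold."""
    # Index key attributes and table key attributes are always present.
    key_attrs = list(dict.fromkeys(list(index_keys) + list(table_keys)))
    if projection_type == "ALL":
        return dict(item)
    if projection_type == "KEYS_ONLY":
        wanted = key_attrs
    elif projection_type == "INCLUDE":
        extra = [a for a in non_key_attributes if a not in key_attrs]
        wanted = key_attrs + extra
    else:
        raise ValueError(f"unknown projection type: {projection_type}")
    return {name: item[name] for name in wanted if name in item}


item = {"A1": "p", "A2": "s", "A3": 3, "A4": 4, "A5": 5}
table_keys = ["A1", "A2"]

keys_only = project(item, table_keys, ["A1", "A3"], "KEYS_ONLY")
include_a3 = project(item, table_keys, ["A1", "A4"], "INCLUDE", ["A3"])
everything = project(item, table_keys, ["A1", "A5"], "ALL")
```

The three results correspond to the KEYS_ONLY, INCLUDE A3, and ALL rows of the diagram.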

Slide 11

Slide 11 text

Global Secondary Index (GSI)
• Alternate partition and/or sort key
• Index is across all partition keys
• Use composite sort keys for compound indexes
• RCUs/WCUs provisioned separately for GSIs
• Online indexing

Table: A1 (partition), A2, A3, A4, A5

GSIs
• KEYS_ONLY: A2 (partition), A1 (item key)
• INCLUDE A3: A5 (partition), A4 (sort), A1 (item key), A3 (projected)
• ALL: A4 (partition), A5 (sort), A1 (item key), A2 (projected), A3 (projected)
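
As a sketch of how such an index is declared, the helper below builds a dictionary in the shape of one entry of the `GlobalSecondaryIndexes` parameter of the DynamoDB CreateTable API. Nothing is sent to AWS; the helper name, index names, and capacity numbers are hypothetical and chosen only to mirror the diagram.

```python
def gsi_definition(index_name, partition_key, sort_key=None,
                   projection_type="KEYS_ONLY", non_key_attributes=None,
                   read_capacity=5, write_capacity=5):
    """Build one GlobalSecondaryIndexes entry for a CreateTable request."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})

    projection = {"ProjectionType": projection_type}
    if projection_type == "INCLUDE":
        projection["NonKeyAttributes"] = list(non_key_attributes or [])

    return {
        "IndexName": index_name,
        "KeySchema": key_schema,
        "Projection": projection,
        # Capacity is set per index, separately from the base table.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_capacity,
            "WriteCapacityUnits": write_capacity,
        },
    }


# The three GSIs from the diagram (index names are made up for illustration).
gsis = [
    gsi_definition("ByA2", "A2"),
    gsi_definition("ByA5A4", "A5", "A4", "INCLUDE", ["A3"]),
    gsi_definition("ByA4A5", "A4", "A5", "ALL"),
]
```

In a real CreateTable request, every attribute used in a key schema also has to be declared in `AttributeDefinitions`.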

Slide 12

Slide 12 text

How Do GSI Updates Work?
1. Update request (Client → Table)
2. Asynchronous update (in progress) (Primary table → Global Secondary Index)
2. Update response (Table → Client)

If GSIs don’t have enough write capacity, table writes will be throttled!
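
The flow above can be modelled with a small in-memory simulation. This is only a sketch: the class, the queue, and the `max_backlog` limit are hypothetical, with the backlog limit serving as a crude stand-in for running out of GSI write capacity. It also ignores removing stale index entries when an item's indexed attribute changes.

```python
from collections import deque


class ThrottledError(Exception):
    """Raised when the simulated index cannot keep up with table writes."""


class TableWithGsi:
    def __init__(self, index_attribute, max_backlog=2):
        self.index_attribute = index_attribute
        self.max_backlog = max_backlog
        self.table = {}          # table key -> item
        self.index = {}          # indexed attribute value -> set of table keys
        self._pending = deque()  # index updates not yet applied

    def put(self, key, item):
        if len(self._pending) >= self.max_backlog:
            raise ThrottledError("index backlog full; table write rejected")
        self.table[key] = item              # 1. update request hits the table
        self._pending.append((key, item))   # 2. index update is queued
        return "OK"                         # 2. response does not wait for the index

    def propagate(self):
        """Apply all queued index updates (the asynchronous step)."""
        while self._pending:
            key, item = self._pending.popleft()
            self.index.setdefault(item[self.index_attribute], set()).add(key)

    def query_index(self, value):
        return set(self.index.get(value, set()))


games = TableWithGsi(index_attribute="Opponent", max_backlog=2)
first_response = games.put("72f49", {"Opponent": "Bob"})
seen_before = games.query_index("Bob")   # index has not caught up yet
games.propagate()
seen_after = games.query_index("Bob")
```

Reading the index right after the write returns nothing in this model, which mirrors the eventually consistent nature of GSI reads.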

Slide 13

Slide 13 text

Scaling NoSQL
“We are stuck with technology when what we really want is just stuff that works.”

Slide 14

Slide 14 text

What bad NoSQL looks like …
[Heat map: Heat by Partition over Time]

Slide 15

Slide 15 text

Getting the most out of Amazon DynamoDB throughput
• “To get the most out of DynamoDB throughput, create tables where the partition key element has a large number of distinct values, and values are requested fairly uniformly, as randomly as possible.” (DynamoDB Developer Guide)
• Space: access is evenly spread over the key-space
• Time: requests arrive evenly spaced in time
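
One way to get a feel for the "space" guideline is to measure how much of the traffic lands on the busiest partition. The sketch below is illustrative only: it assumes MD5 as a stand-in for the service's internal hash function, a fixed number of equally sized partitions, and a hypothetical function name.

```python
import hashlib
from collections import Counter


def hottest_partition_share(partition_keys, num_partitions=4):
    """Fraction of requests that hit the single busiest partition."""
    keys = list(partition_keys)
    if not keys:
        raise ValueError("need at least one request")
    counts = Counter(
        # First byte of the digest (0-255) scaled to a partition number.
        hashlib.md5(str(pk).encode("utf-8")).digest()[0] * num_partitions // 256
        for pk in keys
    )
    return max(counts.values()) / len(keys)


# Many distinct keys requested evenly vs. one key requested repeatedly.
uniform_share = hottest_partition_share(f"user-{i}" for i in range(10_000))
skewed_share = hottest_partition_share(["user-1"] * 10_000)
```

With four partitions, an even workload keeps the busiest partition near one quarter of the traffic, while the single-key workload sends everything to one partition.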

Slide 16

Slide 16 text

Much better picture …

Slide 17

Slide 17 text

Auto Scaling
• Throughput automatically adapts to your actual traffic
[Charts: With Auto Scaling vs. Without Auto Scaling]
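
The idea of capacity following traffic can be sketched as simple target-utilization arithmetic. This is rough arithmetic only, not the service's algorithm: DynamoDB auto scaling is driven by Application Auto Scaling policies and reacts with some delay. The function name, the 70 percent target, and the capacity limits are assumptions.

```python
def target_capacity(consumed_units, target_utilization_percent=70,
                    min_capacity=5, max_capacity=1000):
    """Provisioned capacity at which consumed_units equals the target utilization."""
    if not 0 < target_utilization_percent <= 100:
        raise ValueError("target utilization must be in (0, 100]")
    # Integer ceiling of consumed_units / (target_utilization_percent / 100).
    wanted = (consumed_units * 100 + target_utilization_percent - 1) // target_utilization_percent
    return max(min_capacity, min(max_capacity, wanted))


# Consumed capacity over time (made-up numbers) and the capacity that would follow it.
traffic = [0, 70, 350, 1400, 140]
capacity = [target_capacity(units) for units in traffic]
```

Without such an adjustment, provisioned capacity stays flat while traffic moves, which is the contrast the two charts draw.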

Slide 18

Slide 18 text

Composite Keys
“Hierarchies are celestial. In hell all are equal.”

Slide 19

Slide 19 text

Multi-value Sorts and Filters

Secondary index (Partition key: Opponent, Sort key: Date)

Opponent  Date        GameId  Status       Host
Alice     2014-10-02  d9bl3   DONE         David
Carol     2014-10-08  o2pnb   IN_PROGRESS  Bob
Bob       2014-09-30  72f49   PENDING      Alice
Bob       2014-10-03  b932s   PENDING      Carol
Bob       2014-10-03  ef9ca   IN_PROGRESS  David

(The diagram highlights the rows for Opponent = Bob.)

Slide 20

Slide 20 text

Approach 1: Query Filter

SELECT * FROM Game WHERE Opponent='Bob' ORDER BY Date DESC FILTER ON Status='PENDING'

Secondary Index

Opponent  Date        GameId  Status       Host
Alice     2014-10-02  d9bl3   DONE         David
Carol     2014-10-08  o2pnb   IN_PROGRESS  Bob
Bob       2014-09-30  72f49   PENDING      Alice
Bob       2014-10-03  b932s   PENDING      Carol
Bob       2014-10-03  ef9ca   IN_PROGRESS  David   (filtered out)
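
The pseudo-SQL above could be expressed as a DynamoDB Query with a filter expression. The sketch below builds the request parameters as a plain dictionary (nothing is sent to AWS) and simulates the behaviour on the slide's rows. The index name is hypothetical, and the `#` placeholders are used because attribute names such as Status can clash with DynamoDB reserved words.

```python
GAMES = [
    {"Opponent": "Alice", "Date": "2014-10-02", "GameId": "d9bl3", "Status": "DONE", "Host": "David"},
    {"Opponent": "Carol", "Date": "2014-10-08", "GameId": "o2pnb", "Status": "IN_PROGRESS", "Host": "Bob"},
    {"Opponent": "Bob", "Date": "2014-09-30", "GameId": "72f49", "Status": "PENDING", "Host": "Alice"},
    {"Opponent": "Bob", "Date": "2014-10-03", "GameId": "b932s", "Status": "PENDING", "Host": "Carol"},
    {"Opponent": "Bob", "Date": "2014-10-03", "GameId": "ef9ca", "Status": "IN_PROGRESS", "Host": "David"},
]


def query_filter_request(opponent, status):
    """Parameters shaped like a low-level DynamoDB Query request."""
    return {
        "TableName": "Game",
        "IndexName": "OpponentDateIndex",  # hypothetical index name
        "KeyConditionExpression": "#opp = :opp",
        "FilterExpression": "#st = :st",
        "ExpressionAttributeNames": {"#opp": "Opponent", "#st": "Status"},
        "ExpressionAttributeValues": {":opp": {"S": opponent}, ":st": {"S": status}},
        "ScanIndexForward": False,  # descending sort key order (ORDER BY Date DESC)
    }


def simulate_query_filter(items, opponent, status):
    """Return (number of items read, items returned after filtering)."""
    # The key condition selects the item collection, newest date first.
    matched = sorted(
        (i for i in items if i["Opponent"] == opponent),
        key=lambda i: i["Date"],
        reverse=True,
    )
    # The filter is applied afterwards, to items that were already read.
    returned = [i for i in matched if i["Status"] == status]
    return len(matched), returned


items_read, pending_games = simulate_query_filter(GAMES, "Bob", "PENDING")
```

Three items are read but only two are returned: a filter expression is applied after the items are read, so the filtered-out item still consumes read capacity.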

Slide 21

Slide 21 text

Approach 2: Composite Key

Status       +  Date        =  StatusDate
DONE            2014-10-02     DONE_2014-10-02
IN_PROGRESS     2014-10-08     IN_PROGRESS_2014-10-08
IN_PROGRESS     2014-10-03     IN_PROGRESS_2014-10-03
PENDING         2014-09-30     PENDING_2014-09-30
PENDING         2014-10-03     PENDING_2014-10-03
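
Building the composite attribute is plain string concatenation. A minimal sketch, assuming the underscore separator shown on the slide and ISO-formatted dates (which contain no underscores); the function names are hypothetical.

```python
def make_status_date(status, date):
    """Combine Status and Date into the composite StatusDate value."""
    return f"{status}_{date}"


def split_status_date(status_date):
    """Recover (status, date); splits on the last underscore because
    statuses such as IN_PROGRESS contain underscores themselves."""
    status, _, date = status_date.rpartition("_")
    return status, date


rows = [
    ("DONE", "2014-10-02"),
    ("IN_PROGRESS", "2014-10-08"),
    ("IN_PROGRESS", "2014-10-03"),
    ("PENDING", "2014-09-30"),
    ("PENDING", "2014-10-03"),
]
status_dates = [make_status_date(status, date) for status, date in rows]
```

Because the status comes first, sorting these strings groups items by status and orders them by date within each status, which is what makes a prefix query on the sort key possible.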

Slide 22

Slide 22 text

Approach 2: Composite Key

Secondary Index (Partition key: Opponent, Sort key: StatusDate)

Opponent  StatusDate              GameId  Host
Alice     DONE_2014-10-02         d9bl3   David
Carol     IN_PROGRESS_2014-10-08  o2pnb   Bob
Bob       IN_PROGRESS_2014-10-03  ef9ca   David
Bob       PENDING_2014-09-30      72f49   Alice
Bob       PENDING_2014-10-03      b932s   Carol

Slide 23

Slide 23 text

Approach 2: Composite Key

SELECT * FROM Game WHERE Opponent='Bob' AND StatusDate BEGINS_WITH 'PENDING'

Secondary index

Opponent  StatusDate              GameId  Host
Alice     DONE_2014-10-02         d9bl3   David
Carol     IN_PROGRESS_2014-10-08  o2pnb   Bob
Bob       IN_PROGRESS_2014-10-03  ef9ca   David
Bob       PENDING_2014-09-30      72f49   Alice
Bob       PENDING_2014-10-03      b932s   Carol
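
With the composite sort key, the status test moves into the key condition. The sketch below builds Query parameters as a plain dictionary (nothing is sent to AWS) and simulates the prefix match on the slide's rows. The index name is hypothetical, and appending "_" to the prefix is an added precaution, not something shown on the slide.

```python
INDEX_ROWS = [
    {"Opponent": "Alice", "StatusDate": "DONE_2014-10-02", "GameId": "d9bl3", "Host": "David"},
    {"Opponent": "Carol", "StatusDate": "IN_PROGRESS_2014-10-08", "GameId": "o2pnb", "Host": "Bob"},
    {"Opponent": "Bob", "StatusDate": "IN_PROGRESS_2014-10-03", "GameId": "ef9ca", "Host": "David"},
    {"Opponent": "Bob", "StatusDate": "PENDING_2014-09-30", "GameId": "72f49", "Host": "Alice"},
    {"Opponent": "Bob", "StatusDate": "PENDING_2014-10-03", "GameId": "b932s", "Host": "Carol"},
]


def begins_with_request(opponent, status):
    """Parameters shaped like a low-level DynamoDB Query request."""
    return {
        "TableName": "Game",
        "IndexName": "OpponentStatusDateIndex",  # hypothetical index name
        "KeyConditionExpression": "#opp = :opp AND begins_with(#sd, :prefix)",
        "ExpressionAttributeNames": {"#opp": "Opponent", "#sd": "StatusDate"},
        "ExpressionAttributeValues": {
            ":opp": {"S": opponent},
            # Trailing underscore avoids matching a longer status with the same prefix.
            ":prefix": {"S": status + "_"},
        },
    }


def simulate_begins_with(rows, opponent, status):
    """Return the rows the key condition selects, in sort-key order."""
    prefix = status + "_"
    return sorted(
        (r for r in rows if r["Opponent"] == opponent and r["StatusDate"].startswith(prefix)),
        key=lambda r: r["StatusDate"],
    )


pending_for_bob = simulate_begins_with(INDEX_ROWS, "Bob", "PENDING")
```

Unlike Approach 1, only the two matching items are read, so no read capacity is spent on items that would be filtered out afterwards.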

Slide 24

Slide 24 text

References
• DynamoDB Deep Dive: Advanced Design Patterns
  https://aws.amazon.com/ko/dynamodb/resources/reinvent-2019-advanced-design-patterns/
• Implementing advanced design patterns for Amazon DynamoDB
  https://www.slideshare.net/AmazonWebServices/implementing-advanced-design-patterns-for-amazon-dynamodb-adb401-chicago-aws-summit
• Amazon DynamoDB Labs (workshop)
  https://catalog.us-east-1.prod.workshops.aws/workshops/3319b690-3a41-4921-9af8-f31c7bef4cdb/en-US