Slide 1

Slide 1 text

Amazon DynamoDB
A serverless database for everyone
simalexan

Slide 2

Slide 2 text

Aleksandar Simovic
• Senior Software Engineer @ ScienceExchange
• AWS Serverless Hero
• Coauthor of the book “Serverless Applications with Node.js”
• AWS SAM & Lambda Builders contributor
• Co-organizer of the JS Belgrade, Serverless Belgrade, and Wardley Maps Belgrade meetups

Slide 3

Slide 3 text

Which database should I choose?
• “NoSQL”
• “Only RDB!”

Slide 4

Slide 4 text

If relational databases solve all of our needs, why do we need NoSQL?

Slide 5

Slide 5 text

Cost of Resources

          1967-71              2019
Storage   $1 million per 1 MB  $0.02 per 1 GB
CPU       $200 per 92k IPS     $500 per 300 billion IPS

Slide 6

Slide 6 text

Storage cost <<< CPU time cost

Slide 7

Slide 7 text

SQL               NoSQL
Relational        Hierarchical (denormalized)
Vertical scale    Horizontal scale
Queries           Instantiated views
Maximize storage  Maximize compute (CPU)

Slide 8

Slide 8 text

Amazon DynamoDB
Amazon’s NoSQL

Slide 9

Slide 9 text

Amazon DynamoDB
• Serverless DB (no patches / updates)
• Consistent and fast (4M transactions / sec)
• Document or key-value
• Scales to any load
• Access control (fine-grained access to tables, items, attributes, values)
• Event-driven model (connected to AWS Lambda)

Slide 10

Slide 10 text

Built by AWS / Used by AWS
• 2007, published paper
• One of the authors is Werner Vogels (CTO of Amazon)
• Used by Amazon from day 1: Amazon e-commerce shopping cart
• 2012, GA

Slide 11

Slide 11 text

Managed for you
• Hardware provisioning
• Cross-availability zone replication (replicas whenever needed)
• Monitoring and handling of failures
• Patches / updates / fixes

Slide 12

Slide 12 text

Pay-per-use database!
• First 25 GB stored per month is free
• $0.25 per GB-month after that
• Writes: $1.25 per million write request units
• Reads: $0.25 per million read request units
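The pay-per-use arithmetic can be sketched as a small calculator. This is only a rough estimate that plugs in the four prices from this slide; the function name and the example workload are made up for illustration, and real bills depend on region, item sizes, and current pricing.

```python
from decimal import Decimal

# Prices as quoted on the slide (2019); treat them as assumptions, not current pricing.
FREE_STORAGE_GB = Decimal("25")
STORAGE_PER_GB_MONTH = Decimal("0.25")
WRITE_PER_MILLION_UNITS = Decimal("1.25")
READ_PER_MILLION_UNITS = Decimal("0.25")
MILLION = Decimal(1_000_000)


def estimate_monthly_cost(stored_gb, write_units, read_units):
    """Return a rough monthly cost in USD for the given storage and request units."""
    # Only storage above the free 25 GB is billed.
    billable_gb = max(Decimal(stored_gb) - FREE_STORAGE_GB, Decimal(0))
    storage = billable_gb * STORAGE_PER_GB_MONTH
    writes = Decimal(write_units) / MILLION * WRITE_PER_MILLION_UNITS
    reads = Decimal(read_units) / MILLION * READ_PER_MILLION_UNITS
    return storage + writes + reads


# Hypothetical workload: 100 GB stored, 10 million writes, 50 million reads.
# 75 GB * 0.25 + 10 * 1.25 + 50 * 0.25 = 18.75 + 12.50 + 12.50
print(estimate_monthly_cost(100, 10_000_000, 50_000_000))
```

A table that stays under 25 GB and receives no requests costs nothing under this model, which is the point of the "pay-per-use" label.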

Slide 13

Slide 13 text

Scaling
• Automatically partitioned
• Partitions completely independent
• No limits on workloads

Slide 14

Slide 14 text

Autoscaling

Slide 15

Slide 15 text

Performance
• Single-digit (1-9) millisecond Put / Get
• Custom SSD-based platform
 - performance independent of table size
 - no need for the working set to fit in memory

Slide 16

Slide 16 text

Global Tables
• “Replication? We don’t need stinking replication”
• “Automated”
• “Multi-region”
• “Multiple replica” tables (one per region that you choose)
• DynamoDB treats them as a single unit

Slide 17

Slide 17 text

Structure

Slide 18

Slide 18 text

Data Types Available
• Scalar data types:
 - string
 - number
 - binary
• Multi-value data types (string set, number set, binary set)
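As an illustration of these types, the sketch below maps plain Python values to the type-tagged attribute-value shape used by DynamoDB's low-level API ("S", "N", "B", "SS", "NS", "BS"). The helper name is hypothetical, and it deliberately covers only the types listed on this slide.

```python
def to_attribute_value(value):
    """Convert a Python value to a DynamoDB-style attribute value (slide types only)."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; it is not one of the slide's types.
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, (bytes, bytearray)):
        return {"B": bytes(value)}
    if isinstance(value, (set, frozenset)):
        if not value:
            raise ValueError("sets must not be empty")
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}  # sorted only to make the output stable
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        if all(isinstance(v, (bytes, bytearray)) for v in value):
            return {"BS": sorted(bytes(v) for v in value)}
        raise TypeError("a set must contain a single type: str, number, or bytes")
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(to_attribute_value("Marie Curie"))
print(to_attribute_value(1903))
print(to_attribute_value({"Physics", "Chemistry"}))
```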

Slide 19

Slide 19 text

Indexing
• Data is indexed by a primary key
• Types of primary keys:
 - partitionKey (hash)
 - partitionKey + range

hash + range example

Scientist       field      year
Marie Curie     Chemistry  1911
Marie Curie     Physics    1903
John Bardeen    Physics    1956
Leonid Hurwicz  Economics  2007
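To make the two key types concrete, here is a minimal sketch that builds the arguments for a CreateTable call. The table name "NobelLaureates" is hypothetical, and choosing "scientist" as the hash key with "year" as the range key is an assumption about the example table; the slide does not say which column plays which role.

```python
def table_definition(table_name, partition_key, sort_key=None):
    """Build keyword arguments in the shape accepted by DynamoDB's CreateTable.

    partition_key and sort_key are (attribute_name, attribute_type) pairs,
    where the type is "S" (string), "N" (number) or "B" (binary).
    """
    keys = [(partition_key, "HASH")]
    if sort_key is not None:
        keys.append((sort_key, "RANGE"))
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": name, "AttributeType": attr_type}
            for (name, attr_type), _ in keys
        ],
        "KeySchema": [
            {"AttributeName": name, "KeyType": key_type}
            for (name, _), key_type in keys
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


# Hash + range: both "Marie Curie" rows can coexist because their years differ.
params = table_definition("NobelLaureates", ("scientist", "S"), ("year", "N"))
print(params["KeySchema"])

# With boto3 installed and credentials configured, the call would look like:
# boto3.client("dynamodb").create_table(**params)
```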

Slide 20

Slide 20 text

DynamoDB Table

Slide 21

Slide 21 text

DynamoDB Table

Slide 22

Slide 22 text

Partition Key
• Uniquely identifies a single item
• Unordered hash index
• Allows table partitioning for scale (chop the table up and throw the pieces onto storage nodes; service requests are routed automatically)
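The routing idea can be sketched in a few lines: hash the partition key and use the hash to pick a storage node. DynamoDB's real hash function and partition layout are internal, so SHA-256 and the fixed partition count here are stand-ins chosen only for illustration.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition number (illustrative hash routing)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % partition_count


# The same key always lands on the same partition, so a request for that key
# can be routed without consulting any other partition.
for key in ["Marie Curie", "John Bardeen", "Leonid Hurwicz"]:
    print(key, "->", partition_for(key, 4))
```

Because the hash scatters keys, neighbouring key values end up on unrelated partitions, which is why the slide calls this an unordered index.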

Slide 23

Slide 23 text

Partition + Sort Key
• Two attributes to uniquely identify an item
• Items arranged by the sort key
• No limits on the number of items per partition key
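A toy in-memory model shows why keeping items arranged by sort key matters: a range query inside one partition becomes a binary search plus a slice. The class and method names are hypothetical, and this is not how DynamoDB stores data internally.

```python
import bisect


class ToyTable:
    """Items grouped by partition key and kept ordered by sort key."""

    def __init__(self):
        # partition key -> (sorted list of sort keys, items in the same order)
        self._partitions = {}

    def put(self, partition_key, sort_key, item):
        keys, items = self._partitions.setdefault(partition_key, ([], []))
        i = bisect.bisect_left(keys, sort_key)
        if i < len(keys) and keys[i] == sort_key:
            items[i] = item  # same primary key: overwrite the existing item
        else:
            keys.insert(i, sort_key)
            items.insert(i, item)

    def query_between(self, partition_key, low, high):
        """Return items with low <= sort key <= high, in sort-key order."""
        keys, items = self._partitions.get(partition_key, ([], []))
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return items[start:end]


table = ToyTable()
table.put("Marie Curie", 1911, {"field": "Chemistry"})
table.put("Marie Curie", 1903, {"field": "Physics"})
print(table.query_between("Marie Curie", 1900, 1905))
```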

Slide 24

Slide 24 text

Local Secondary Index
• Local index to a single partition key
• Another sort key attribute (alternate range key)

Slide 25

Slide 25 text

NO Local Secondary Index

Country  Scientist     field      year  topic
France   Marie Curie   Physics    1903  Radioactivity
France   Esther Duflo  Economics  2019  Alleviating Global Poverty

Get items
 country = “France”
 scientist begins_with “Marie”
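The query on this slide could be expressed roughly as follows. This sketch only builds the request parameters for the low-level Query operation; the table name "Scientists" is hypothetical, and it assumes "country" is the partition key and "scientist" the sort key, as the query implies.

```python
def query_by_prefix(table_name, country, scientist_prefix):
    """Build Query parameters: exact match on the partition key, prefix on the sort key."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#c = :country AND begins_with(#s, :prefix)",
        # Placeholders avoid clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#c": "country", "#s": "scientist"},
        "ExpressionAttributeValues": {
            ":country": {"S": country},
            ":prefix": {"S": scientist_prefix},
        },
    }


params = query_by_prefix("Scientists", "France", "Marie")
print(params["KeyConditionExpression"])

# With boto3 installed and credentials configured, the call would look like:
# boto3.client("dynamodb").query(**params)
```

No secondary index is needed here because both conditions use the table's own primary key.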

Slide 26

Slide 26 text

Local Secondary Index

Country  Scientist     field      year  topic
France   Marie Curie   Physics    1903  Radioactivity
France   Esther Duflo  Economics  2019  Alleviating Global Poverty

Get items
 country = “France”
 year > 2000

LSI
 Partition Key: country (same as the table)
 range: year
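A sketch of the same lookup through a local secondary index: the partition key condition stays on country, while the range condition moves to the index's alternate sort key. The index name "country-year-index" and table name "Scientists" are hypothetical, and the small in-memory function only mimics the expected result on the two rows from the slide.

```python
ROWS = [
    {"country": "France", "scientist": "Marie Curie", "year": 1903},
    {"country": "France", "scientist": "Esther Duflo", "year": 2019},
]


def lsi_query_params(table_name, index_name, country, after_year):
    """Build Query parameters that read through a local secondary index."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "#c = :country AND #y > :year",
        "ExpressionAttributeNames": {"#c": "country", "#y": "year"},
        "ExpressionAttributeValues": {
            ":country": {"S": country},
            ":year": {"N": str(after_year)},
        },
    }


def simulate(rows, country, after_year):
    """In-memory stand-in: same partition key, ordered by the alternate sort key."""
    matches = [r for r in rows if r["country"] == country and r["year"] > after_year]
    return [r["scientist"] for r in sorted(matches, key=lambda r: r["year"])]


params = lsi_query_params("Scientists", "country-year-index", "France", 2000)
print(simulate(ROWS, "France", 2000))
```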


Slide 27

Slide 27 text

Global Secondary Index (GSI)
• Index across all partitions
• Alternate partition (+ sort) key
• Use composite sort keys for compound indexes

Slide 28

Slide 28 text

NO Global Secondary Index

Artist                     Song                            Year of Release  Album
AC/DC                      Shoot to Thrill                 1980             Back in Black
Florian Pellieser Quintet  Coup de foudre a Thessalonique  2018             Coup de foudre a Thessalonique

Get items
 artist = “AC/DC”
 order by song ASC

Slide 29

Slide 29 text

Global Secondary Index

Artist                     Song                            Year of Release  Album
AC/DC                      Shoot to Thrill                 1980             Back in Black
Florian Pellieser Quintet  Coup de foudre a Thessalonique  2018             Coup de foudre a Thessalonique

Get items
 album = “Back in Black”
 order by song DESC

GSI
 Partition Key: album
 range: song
 * this is why the GSI costs extra *
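The GSI lookup above might be written like this. The sketch builds Query parameters only; the table name "Songs" and index name "album-song-index" are hypothetical. Descending order is requested with ScanIndexForward set to False, which reverses the sort-key order.

```python
def gsi_query_params(table_name, index_name, album):
    """Build Query parameters that read an album's songs in descending song order."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "#a = :album",
        "ExpressionAttributeNames": {"#a": "album"},
        "ExpressionAttributeValues": {":album": {"S": album}},
        # False = descending by the index sort key (song); the default is ascending.
        "ScanIndexForward": False,
    }


params = gsi_query_params("Songs", "album-song-index", "Back in Black")
print(params["IndexName"], params["ScanIndexForward"])

# With boto3 installed and credentials configured, the call would look like:
# boto3.client("dynamodb").query(**params)
```

String comparison on key values is exact, so the album value has to match the stored spelling and capitalization.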

Slide 30

Slide 30 text

Updating the Global Secondary Index (GSI)
 Client -> (update request) -> Table -> (async updates) -> GSIs
 Client <- (response) <- Table

Slide 31

Slide 31 text

Composite Key
1. Join two columns
2. Use “begins with”

is_verified  date        is_verified_date
TRUE         11-05-2019  TRUE_11-05-2019
TRUE         11-05-2019  TRUE_11-05-2019
FALSE                    FALSE

Useful for multivalue filter (and sort sometimes)
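The two steps can be sketched in plain Python: build the joined attribute when writing an item, then filter on a prefix when reading. The function name is hypothetical, and startswith stands in for the begins_with key condition.

```python
from typing import Optional


def composite_key(is_verified: bool, date: Optional[str] = None) -> str:
    """Join is_verified and date into one attribute, as in the slide's table."""
    flag = "TRUE" if is_verified else "FALSE"
    return f"{flag}_{date}" if date else flag


keys = [
    composite_key(True, "11-05-2019"),
    composite_key(True, "11-05-2019"),
    composite_key(False),
]

# Step 2: one prefix condition filters on both original columns at once.
verified_on_date = [k for k in keys if k.startswith("TRUE_11-05-2019")]
all_verified = [k for k in keys if k.startswith("TRUE_")]
print(verified_on_date, all_verified)
```

The "sort sometimes" caveat is worth keeping in mind: a date written as 11-05-2019 does not sort chronologically as a string, whereas a year-first format such as 2019-05-11 would.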

Slide 32

Slide 32 text

Limitations
• Max item size: 400 KB
• Max 5 LSIs per table
• Initial limit of 20 GSIs per table
• Nested attributes: max 32 levels deep
• Max 40,000 read and 40,000 write request units per table
• 256 tables per account per AWS Region

Slide 33

Slide 33 text

Modeling / Queries?

Slide 34

Slide 34 text

SQL vs NoSQL

SQL:
 SELECT * FROM aircrafts
 INNER JOIN bookings
 WHERE…

NoSQL:
 SELECT * FROM aircrafts

Slide 35

Slide 35 text

Modeling steps (Rick Houlihan)

Use case context
• Nature of the app (OLTP, OLAP, DSS)
• ER model
• Data lifecycle (TTL, backup)

Access patterns
• Read / write workloads
• Query dimensions and aggregations
• Data sources
• Query aggregations
• Document all workflows

Data modeling
• Avoid relational database patterns, use one table
• 1 application service = 1 table
• Identify primary keys
• Define indexes for secondary access patterns

Review -> Repeat -> Review

Slide 36

Slide 36 text

Example Modeling by Rick Houlihan

Slide 37

Slide 37 text

Example Modeling by Rick Houlihan

Slide 38

Slide 38 text

Example Modeling by Rick Houlihan

Slide 39

Slide 39 text

Example Modeling by Rick Houlihan: Relational Approach

Slide 40

Slide 40 text

Example Modeling by Rick Houlihan: NoSQL Approach

Slide 41

Slide 41 text

Example Modeling by Rick Houlihan: NoSQL Approach (Orders and Drivers)

Slide 42

Slide 42 text

Example Modeling by Rick Houlihan: NoSQL Approach (Vendors and Deliveries)

Slide 43

Slide 43 text

Serverless Event Model

Slide 44

Slide 44 text

Event Driven Model
• As data enters a database, it can also leave.
• AWS DynamoDB Streams
• API Gateway -> DynamoDB
• DynamoDB -> Lambda
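The DynamoDB -> Lambda path can be sketched as a handler that receives a batch of stream records. The record fields used here (Records, eventName, dynamodb.Keys) follow the DynamoDB Streams event shape; the handler name, the sample event, and the choice to simply count changes are illustrative assumptions.

```python
def handler(event, context):
    """Count the inserts, updates, and deletes delivered in one stream batch."""
    summary = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        event_name = record.get("eventName")
        if event_name in summary:
            summary[event_name] += 1
        # Keys identify the changed item; values use the typed attribute format.
        keys = record.get("dynamodb", {}).get("Keys", {})
        print(event_name, keys)
    return summary


# A hand-written sample event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {"Keys": {"scientist": {"S": "Esther Duflo"}, "year": {"N": "2019"}}},
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"scientist": {"S": "John Bardeen"}, "year": {"N": "1956"}}},
        },
    ]
}

print(handler(sample_event, None))
```

In a real deployment the function would be attached to the table's stream through an event source mapping, and the body of the loop would do the actual work, such as updating an aggregate or notifying another service.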

Slide 45

Slide 45 text

Some Use Cases
• Duolingo
• Amazon Shopping Cart
• CapitalOne
• Nordstrom
• Snapchat
• Nike
• GE Aviation
• Lyft
• Netflix
• Samsung

Slide 46

Slide 46 text

When (not) to use DynamoDB

Good use cases
• Very high read/write rates
• Simple key-value queries
• Consistently low latency
• Autosharding / multiple node scaling
• No tuning
• No size or throughput limits
• OLTP

Bad use cases
• Real-time analytics and queries
• Complex queries and joins
• OLAP

Slide 47

Slide 47 text

“Serverless Applications with Node.js”
serverless.pub/book
claudia40 (40% off)
Thank you!