Introduction to Amazon DynamoDB

Presentation given at the Percona 2013 conference in Santa Clara, California, by Simone Brunozzi.
You can follow him on Twitter: @simon

Simone Brunozzi

April 23, 2013

Transcript

  1. Amazon DynamoDB
    Simone Brunozzi ( @simon)
    Senior Technology Evangelist
    Amazon Web Services
    v 4.0 - Apr 20th, 2013
    http://bit.ly/dynamodb2013

    View Slide

  2. No-S-Q-What?

    View Slide

  3. Who invented “NoSQL”?

    View Slide

  4. Who invented “NoSQL”?
    “NoSQL” was conceived in 1998 by Carlo Strozzi (Italy).

    View Slide

  5. NoSQL
    4
    Scaling
    Structured
    Storage

    View Slide

  6. NoSQL
    4
    Scaling
    Feature first
    •Financial, CRM,
    Human resources
    •Dominated by RDBMS
    Structured
    Storage

    View Slide

  7. NoSQL
    4
    Scaling
    Scale first
    •Facebook, Gmail,
    Amazon.com, Twitter
    •RDBMS + Key-Value
    Feature first
    •Financial, CRM,
    Human resources
    •Dominated by RDBMS
    Structured
    Storage

    View Slide

  8. NoSQL
    4
    Scaling
    Simple structured
    •BLOB-store not enough
    •Need query/index
    •BerkeleyDB/SimpleDB
    Scale first
    •Facebook, Gmail,
    Amazon.com, Twitter
    •RDBMS + Key-Value
    Feature first
    •Financial, CRM,
    Human resources
    •Dominated by RDBMS
    Structured
    Storage

    View Slide

  9. NoSQL
    4
    Scaling
    Purpose-optimized
    •StreamBase, Vertica,
    VoltDB, Aster Data,
    Netezza, Greenplum
    Simple structured
    •BLOB-store not enough
    •Need query/index
    •BerkeleyDB/SimpleDB
    Scale first
    •Facebook, Gmail,
    Amazon.com, Twitter
    •RDBMS + Key-Value
    Feature first
    •Financial, CRM,
    Human resources
    •Dominated by RDBMS
    Structured
    Storage

    View Slide

  10. NoSQL
    Scaling
    Structured Storage
    Purpose-optimized
    • StreamBase, Vertica, VoltDB, Aster Data, Netezza, Greenplum
    Simple structured
    • BLOB-store not enough
    • Need query/index
    • BerkeleyDB/SimpleDB
    Scale first
    • Facebook, Gmail, Amazon.com, Twitter
    • RDBMS + Key-Value
    Feature first
    • Financial, CRM, Human resources
    • Dominated by RDBMS
    Good for NoSQL

    View Slide

  11. Structured
    Storage
    5
    Scaling
    NoSQL

    View Slide

  12. Structured
    Storage
    5
    Scaling
    Document store
    •Popular: MongoDB,
    CouchDB
    •Semi-structured data
    (XML, JSON, etc)
    NoSQL

    View Slide

  13. Structured
    Storage
    5
    Scaling
    Document store
    •Popular: MongoDB,
    CouchDB
    •Semi-structured data
    (XML, JSON, etc)
    Key-Value store
    •Popular: Cassandra,
    Redis
    •Schema-less
    NoSQL

    View Slide

  14. Structured
    Storage
    5
    Scaling
    Document store
    •Popular: MongoDB,
    CouchDB
    •Semi-structured data
    (XML, JSON, etc)
    Key-Value store
    •Popular: Cassandra,
    Redis
    •Schema-less
    Graph Database
    •Popular: Neo4J,
    FlockDB
    •Stores the relationship
    of data as a graph
    NoSQL

    View Slide

  15. Structured
    Storage
    5
    Scaling
    Document store
    •Popular: MongoDB,
    CouchDB
    •Semi-structured data
    (XML, JSON, etc)
    Key-Value store
    •Popular: Cassandra,
    Redis
    •Schema-less
    Graph Database
    •Popular: Neo4J,
    FlockDB
    •Stores the relationship
    of data as a graph
    Etc.
    •Many others
    NoSQL

    View Slide

  16. Structured Storage
    Scaling
    Document store
    • Popular: MongoDB, CouchDB
    • Semi-structured data (XML, JSON, etc.)
    Key-Value store
    • Popular: Cassandra, Redis
    • Schema-less
    Graph Database
    • Popular: Neo4J, FlockDB
    • Stores the relationship of data as a graph
    Etc.
    • Many others
    NoSQL
    DynamoDB

    View Slide

  17. NoSQL
    6
    Structured
    Storage
    Scaling

    View Slide

  18. NoSQL
    6
    Structured
    Storage
    Easier than
    “YesSQL” ?
    •NoSQL is better for
    simple queries, Primary
    Key lookups
    •No maintenance
    windows
    Scaling

    View Slide

  19. NoSQL
    6
    Structured
    Storage
    Durability
    •Synchronous
    replication
    •Built-in durability
    Easier than
    “YesSQL” ?
    •NoSQL is better for
    simple queries, Primary
    Key lookups
    •No maintenance
    windows
    Scaling

    View Slide

  20. NoSQL
    Structured Storage
    Scaling
    Evolution
    • MySQL: HandlerSocket
    • PostgreSQL 9.2: index-only scan
    • SE PostgreSQL
    Durability
    • Synchronous replication
    • Built-in durability
    Easier than “YesSQL”?
    • NoSQL is better for simple queries, Primary Key lookups
    • No maintenance windows
    “view leakage”, etc.

    View Slide

  21. Why scalability is important
    ]
    [
    7
    traditional
    IT capacity
    Your IT needs
    Time
    Capacity

    View Slide

  22. 8
    Usage patterns: traditional IT
    ]
    [

    View Slide

  23. 8
    On and Off Fast Growth
    Variable peaks Predictable peaks
    Usage patterns: traditional IT
    ]
    [

    View Slide

  24. 9
    Usage patterns: traditional IT
    ]
    [
    Variable peaks
    Fast Growth
    Predictable peaks
    On and Off

    View Slide

  25. 9
    Poor
    Service
    WASTE
    Usage patterns: traditional IT
    ]
    [
    Variable peaks
    Fast Growth
    Predictable peaks
    On and Off

    View Slide

  26. 10
    Elastic
    CLOUD capacity
    traditional
    IT capacity
    Your IT needs
    Usage patterns: Cloud Computing
    ]
    [
    Time
    Capacity

    View Slide

  27. 11
    Usage patterns: Cloud Computing
    ]
    [
    Variable peaks
    Fast Growth
    Predictable peaks
    On and Off

    View Slide

  28. 11
    Usage patterns: Cloud Computing
    ]
    [
    Variable peaks
    Fast Growth
    Predictable peaks
    On and Off

    View Slide

  29. A closer look at
    DynamoDB

    View Slide

  30. 13
    DynamoDB: Speeeed
    ]
    [
    (image)

    View Slide

  31. 13
    DynamoDB: Speeeed
    ]
    [
    Scale to 100,000+
    Writes/second
    (image)

    View Slide

  32. Eventually consistent
    Key-value store
    Unstructured
    NoSQL
    Horizontally scalable
    Non-Relational
    Schema-free
    Distributed
    DynamoDB keywords

    View Slide

  33. DynamoDB
    ]
    [
    15
    DynamoDB

    View Slide

  34. DynamoDB
    ]
    [
    15
    NoSQL
    • No schema (only Key)
    • Hash / Hash + Range
    • Local Secondary Index
    DynamoDB

    View Slide

  35. DynamoDB
    ]
    [
    15
    NoSQL
    • No schema (only Key)
    • Hash / Hash + Range
    • Local Secondary Index
    Speeeed
    • Provisioned throughput
    • Auto storage scaling
    • “Shared nothing”
    • Low latency (<10ms Wr)
    • Solid State Drives (SSD)
    •IOPS per Table
    DynamoDB

    View Slide

  36. DynamoDB
    NoSQL
    • No schema (only Key)
    • Hash / Hash + Range
    • Local Secondary Index
    Speeeed
    • Provisioned throughput
    • Auto storage scaling
    • “Shared nothing”
    • Low latency (<10 ms writes)
    • Solid State Drives (SSD)
    • IOPS per Table
    Robust
    • Built-in fault tolerance
    • Strong consistency
    • Atomic counters
    • Disk-only writes

    View Slide

  37. How do I... Create a table?

    View Slide

  38. Creating a table with the Java low-level API
    // `credentials` is assumed to be created elsewhere.
    AmazonDynamoDBClient client = new AmazonDynamoDBClient(credentials);
    String tableName = "ProductCatalog";
    KeySchemaElement hashKey = new KeySchemaElement()
        .withAttributeName("Id")
        .withAttributeType("N");
    KeySchema ks = new KeySchema().withHashKeyElement(hashKey);
    ProvisionedThroughput provisionedThroughput = new ProvisionedThroughput()
        .withReadCapacityUnits(10L)
        .withWriteCapacityUnits(10L);
    CreateTableRequest request = new CreateTableRequest()
        .withTableName(tableName)
        .withKeySchema(ks)
        .withProvisionedThroughput(provisionedThroughput);
    CreateTableResult result = client.createTable(request);

    View Slide

  39. Creating a table with the Java low-level API
    // `credentials` is assumed to be created elsewhere.
    AmazonDynamoDBClient client = new AmazonDynamoDBClient(credentials);
    String tableName = "ProductCatalog";
    // Schema
    KeySchemaElement hashKey = new KeySchemaElement()
        .withAttributeName("Id")
        .withAttributeType("N");
    KeySchema ks = new KeySchema().withHashKeyElement(hashKey);
    // Throughput
    ProvisionedThroughput provisionedThroughput = new ProvisionedThroughput()
        .withReadCapacityUnits(10L)
        .withWriteCapacityUnits(10L);
    // Table
    CreateTableRequest request = new CreateTableRequest()
        .withTableName(tableName)
        .withKeySchema(ks)
        .withProvisionedThroughput(provisionedThroughput);
    CreateTableResult result = client.createTable(request);

    View Slide

  40. Creating a table with BOTO library (Python)
    >>> message_table_schema = conn.create_schema(
    ...     hash_key_name='forum',
    ...     hash_key_proto_value='S',
    ...     range_key_name='subject',
    ...     range_key_proto_value='S'
    ... )
    >>> table = conn.create_table(
    ...     name='messages',
    ...     schema=message_table_schema,
    ...     read_units=5,
    ...     write_units=5
    ... )
    >>>

    View Slide

  41. Creating a table with BOTO library (Python)
    >>> # Schema
    >>> message_table_schema = conn.create_schema(
    ...     hash_key_name='forum',
    ...     hash_key_proto_value='S',
    ...     range_key_name='subject',
    ...     range_key_proto_value='S'
    ... )
    >>> # Table and Throughput
    >>> table = conn.create_table(
    ...     name='messages',
    ...     schema=message_table_schema,
    ...     read_units=5,
    ...     write_units=5
    ... )
    >>>

    View Slide

  42. Deleting a table? Careful...
    ]
    [
    19
    >>> conn.delete_table(table)

    View Slide

  43. Deleting a table? Careful...
    >>> conn.delete_table(table)
    Permanently deletes the Table!
    I suggest using THREE different users:
    1. Dev/Test
    2. Production
    3. Read-only
    You can manage permissions with IAM.

    View Slide

  44. Managing DynamoDB permissions with IAM
    {
      "Statement": [
        {
          "Action": [
            "dynamodb:GetItem",
            "dynamodb:BatchGetItem",
            "dynamodb:Query",
            "dynamodb:Scan",
            "dynamodb:DescribeTable",
            "dynamodb:ListTables"
          ],
          "Effect": "Allow",
          "Resource": "*"
        }
      ]
    }

    View Slide

  45. Managing DynamoDB permissions with IAM
    Read-only user:
    {
      "Statement": [
        {
          "Action": [
            "dynamodb:GetItem",
            "dynamodb:BatchGetItem",
            "dynamodb:Query",
            "dynamodb:Scan",
            "dynamodb:DescribeTable",
            "dynamodb:ListTables"
          ],
          "Effect": "Allow",
          "Resource": "*"
        }
      ]
    }

    View Slide

  46. Managing DynamoDB permissions with IAM
    {
      "Statement": [
        {
          "Action": [
            "dynamodb:*"
          ],
          "Effect": "Allow",
          "Resource": "*"
        }
      ]
    }

    View Slide

  47. Managing DynamoDB permissions with IAM
    Full-access user:
    {
      "Statement": [
        {
          "Action": [
            "dynamodb:*"
          ],
          "Effect": "Allow",
          "Resource": "*"
        }
      ]
    }

    View Slide
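
The two policies above can be written as plain Python dictionaries and checked with a small helper. This is only a sketch: `is_allowed` is a hypothetical toy function, not IAM's evaluation logic. It ignores `Deny` statements, resources and conditions, and it matches action names case-sensitively.

```python
import json
from fnmatch import fnmatchcase

# The read-only policy from the slides.
READ_ONLY_POLICY = {
    "Statement": [
        {
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:BatchGetItem",
                "dynamodb:Query",
                "dynamodb:Scan",
                "dynamodb:DescribeTable",
                "dynamodb:ListTables",
            ],
            "Effect": "Allow",
            "Resource": "*",
        }
    ]
}

# The full-access policy from the slides.
FULL_ACCESS_POLICY = {
    "Statement": [
        {"Action": ["dynamodb:*"], "Effect": "Allow", "Resource": "*"}
    ]
}


def is_allowed(policy, action):
    """Toy check (assumption): True if an Allow statement lists a pattern
    that matches `action`. Real IAM evaluation is considerably richer."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        if any(fnmatchcase(action, pattern) for pattern in statement["Action"]):
            return True
    return False


def to_policy_document(policy):
    """Serialize the policy to the JSON text shown on the slides."""
    return json.dumps(policy, indent=2)
```

With these definitions, `is_allowed(READ_ONLY_POLICY, "dynamodb:DeleteTable")` is false while the same call against `FULL_ACCESS_POLICY` is true, which is the reason for keeping production and read-only users separate.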

  48. 22
    AWS Management Console: IAM (Identity and Access Mgmt)
    (image)

    View Slide

  49. Data Model

    View Slide

  50. DynamoDB Data Model
    ]
    [

    View Slide

  51. DynamoDB Data Model
    ]
    [
    Table(s)
    Item(s)
    Attribute(s)

    View Slide

  52. Example: Table, Items, Attributes
    ]
    [

    View Slide

  53. Products
    Example: Table, Items, Attributes
    ]
    [
    Table
    You must specify what type of
    Primary Key to use:
    “Hash”, or “Hash + Range”

    View Slide

  54. Products
    Example: Table, Items, Attributes
    ]
    [

    View Slide

  55. Products
    Example: Table, Items, Attributes
    ]
    [
    Item
    Item
    Item

    View Slide

  56. Products
    Example: Table, Items, Attributes
    ]
    [

    View Slide

  57. Products
    Example: Table, Items, Attributes
    ]
    [
    id=”301”
    id=”201”
    Author=”Simone”, “John”, “Erin”
    Title=”PHP basics”
    Title=”Learn C++”
    ISBN=”122938”
    id=”101”
    Price=”15.50”
    Cat=”Bicycle”

    View Slide

  58. Products
    Example: Table, Items, Attributes
    ]
    [
    id=”301”
    id=”201”
    Author=”Simone”, “John”, “Erin”
    Title=”PHP basics”
    Title=”Learn C++”
    ISBN=”122938”
    id=”101”
    Price=”15.50”
    Cat=”Bicycle”

    View Slide

  59. Products
    Example: Table, Items, Attributes
    ]
    [
    id=”301”
    id=”201”
    Author=”Simone”, “John”, “Erin”
    Title=”PHP basics”
    Title=”Learn C++”
    ISBN=”122938”
    id=”101”
    Price=”15.50”
    Cat=”Bicycle”
    Multi-valued
    data type
    Scalar
    data type

    View Slide

  60. DynamoDB Data Types
    ]
    [

    View Slide

  61. DynamoDB Data Types
    ]
    [
    Scalar
    Number (+/-, 38 digits)
    String (UTF-8)
    Binary
    Multi-valued
    Number Set
    String Set
    Binary Set
    • Values in a set must
    be unique.
    • Values not ordered.

    View Slide
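
As a rough illustration of the scalar and multi-valued types, the sketch below maps Python values to the typed attribute values used by the DynamoDB JSON protocol (`S`, `N`, `B` and their set forms `SS`, `NS`, `BS`). The helper name `encode_attribute` is hypothetical, and sorting the set members is only done here to make the output deterministic, since set values are unordered.

```python
import base64
from decimal import Decimal


def encode_attribute(value):
    """Encode a Python value as a typed attribute value (a sketch)."""
    if isinstance(value, (set, frozenset)):
        if not value:
            raise ValueError("sets must not be empty")
        encoded = [encode_attribute(member) for member in value]
        tags = {next(iter(item)) for item in encoded}
        if len(tags) != 1 or not tags <= {"S", "N", "B"}:
            raise TypeError("a set must hold scalars of a single type")
        tag = tags.pop()
        # A Python set already guarantees unique members.
        return {tag + "S": sorted(item[tag] for item in encoded)}
    if isinstance(value, bool):
        raise TypeError("booleans are not part of this data model")
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, Decimal)):
        # Numbers travel as strings to preserve precision.
        return {"N": str(value)}
    if isinstance(value, bytes):
        return {"B": base64.b64encode(value).decode("ascii")}
    raise TypeError("unsupported type: %r" % type(value))


# Hypothetical item using attribute names from the Products example.
item = {
    "id": encode_attribute(201),
    "Title": encode_attribute("Learn C++"),
    "Author": encode_attribute({"Simone", "John", "Erin"}),
    "Price": encode_attribute(Decimal("15.50")),
}
```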

  62. Primary Key: Hash / Hash + Range
    ]
    [
    30

    View Slide

  63. Primary Key: Hash / Hash + Range
    ]
    [
    30
    Hash
    The key is hashed over
    the different partitions
    to optimize workload
    distribution

    View Slide

  64. Primary Key: Hash / Hash + Range
    Hash
    The key is hashed over the different partitions to optimize
    workload distribution.
    Hash + Range
    When querying, the hash attribute must be matched exactly, but a
    range operation can be specified for the range attribute
    (e.g. all orders in the last 60 minutes).

    View Slide
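
The idea of a "Hash + Range" key can be sketched with a toy in-memory table. This is an assumption-laden model and not how DynamoDB is implemented: the class `ToyHashRangeTable`, the SHA-256 partitioning and the partition count are all made up for illustration. It shows the hash key selecting a partition by exact match and the range key kept sorted so that a range condition becomes a binary search.

```python
import bisect
import hashlib
from collections import defaultdict


class ToyHashRangeTable:
    """Toy model: hash key picks a partition, range key is kept sorted."""

    def __init__(self, partitions=4):
        self.partitions = [defaultdict(list) for _ in range(partitions)]

    def _partition_for(self, hash_key):
        digest = hashlib.sha256(str(hash_key).encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.partitions)
        return self.partitions[index]

    def put(self, hash_key, range_key, attributes):
        rows = self._partition_for(hash_key)[hash_key]
        keys = [row[0] for row in rows]
        i = bisect.bisect_left(keys, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, attributes)  # same primary key: overwrite
        else:
            rows.insert(i, (range_key, attributes))

    def query(self, hash_key, range_greater_than=None):
        rows = self._partition_for(hash_key).get(hash_key, [])
        if range_greater_than is None:
            return list(rows)
        keys = [row[0] for row in rows]
        return rows[bisect.bisect_right(keys, range_greater_than):]


table = ToyHashRangeTable()
table.put(100, "2012-09-18", {"paid": 10.66})
table.put(100, "2012-09-16", {"paid": 71.0})
table.put(103, "2012-09-10", {"paid": 23.6})
recent = table.query(100, range_greater_than="2012-09-16")
```

Here `recent` holds only the 2012-09-18 payment for id 100; items stored under id 103 are never touched by the query.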

  65. Primary Key: Hash / Hash + Range
    ]
    [
    31

    View Slide

  66. Primary Key: Hash / Hash + Range
    ]
    [
    31
    id=100 paid=100.3
    id=103 paid=87.0
    id=201 paid=33.5
    Hash

    View Slide

  67. Primary Key: Hash / Hash + Range
    ]
    [
    31
    id=100 paid=100.3
    id=103 paid=87.0
    id=201 paid=33.5
    id=100 date=2012-09-18 paid=10.66
    id=100 date=2012-09-16 paid=71.0
    id=103 date=2012-09-10 paid=23.6
    Hash
    Hash + Range

    View Slide

  68. Table with “Hash + Range” Primary Key (Ruby)
    dynamo.create_table("Activity", {
      HashKeyElement: {AttributeName: "user", AttributeType: "S"},
      RangeKeyElement: {AttributeName: "created", AttributeType: "N"}},
      {ReadCapacityUnits: 5, WriteCapacityUnits: 5})
    dynamo.put_item("Activity", {
      user: {S: "roidrage"},
      created: {N: Time.now.tv_sec.to_s},
      activity: {S: "Checked in"}})
    # `activities` is assumed to reference the Activity table.
    # 86400 seconds = the last 24 hours.
    items = activities.items.query(
      hash_key: "roidrage",
      range_greater_than: (Time.now - 86400).tv_sec)

    View Slide

  69. Table with “Hash + Range” Primary Key (Ruby)
    # Create table
    dynamo.create_table("Activity", {
      HashKeyElement: {AttributeName: "user", AttributeType: "S"},
      RangeKeyElement: {AttributeName: "created", AttributeType: "N"}},
      {ReadCapacityUnits: 5, WriteCapacityUnits: 5})
    # Put item
    dynamo.put_item("Activity", {
      user: {S: "roidrage"},
      created: {N: Time.now.tv_sec.to_s},
      activity: {S: "Checked in"}})
    # Query (`activities` is assumed to reference the Activity table;
    # 86400 seconds = the last 24 hours)
    items = activities.items.query(
      hash_key: "roidrage",
      range_greater_than: (Time.now - 86400).tv_sec)

    View Slide

  70. Throughput

    View Slide

  71. 34
    Scaling DynamoDB throughput
    (video)

    View Slide

  72. Query / Scan

    View Slide

  73. Query / Scan
    ]
    [
    36
    http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/QueryAndScan.html

    View Slide

  74. Query / Scan
    ]
    [
    36
    Query
    • Search only on
    primary key (hash or
    composite)
    • Supports a subset of
    comparison operators
    on key attribute values.
    • Returns 1 MB per
    Query operation.
    • More efficient than
    Scan.
    http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/QueryAndScan.html

    View Slide

  75. Query / Scan
    Scan
    • Scans the entire table.
    • Supports a specific set of comparison operators (e.g. <=, >, ==).
    • Returns up to 1 MB per Scan operation.
    • Slower for bigger tables.
    Query
    • Searches only on the primary key (hash or composite).
    • Supports a subset of comparison operators on key attribute values.
    • Returns up to 1 MB per Query operation.
    • More efficient than Scan.
    http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/QueryAndScan.html

    View Slide
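
The difference in work done by the two operations can be sketched with a toy in-memory table (a dictionary keyed by hash key). The functions `query` and `scan` below are hypothetical stand-ins that only count how many items each approach has to examine; they do not model the 1 MB page limit or any real API.

```python
def query(table, hash_key):
    """Toy Query: look only at the items stored under one hash key."""
    matches = table.get(hash_key, [])
    return list(matches), len(matches)


def scan(table, predicate):
    """Toy Scan: examine every item in the table, then filter."""
    examined = 0
    results = []
    for items in table.values():
        for item in items:
            examined += 1
            if predicate(item):
                results.append(item)
    return results, examined


# Hypothetical data keyed by the "user" hash key.
activity = {
    "roidrage": [
        {"user": "roidrage", "activity": "Checked in"},
        {"user": "roidrage", "activity": "Checked out"},
    ],
    "simone": [
        {"user": "simone", "activity": "Gave a talk"},
    ],
}

query_results, query_examined = query(activity, "roidrage")
scan_results, scan_examined = scan(activity, lambda item: item["user"] == "roidrage")
```

Both calls return the same two items, but the toy Scan examines all three items in the table, and that gap grows with table size.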

  76. 37
    Limiting the capabilities of Query's comparison operators was a deliberate
    decision, to ensure that this API's performance will always remain
    predictable, no matter the scale of the table (size or throughput).
    ...
    Stefano @ AWS (on discussion forums)
    Query vs. Scan?
    ]
    [

    View Slide

  77. 38
    Limiting the capabilities of Query's comparison operators was a deliberate
    decision, to ensure that this API's performance will always remain
    predictable, no matter the scale of the table (size or throughput).
    This limitation forces the developer to perform more work upfront, but it
    will yield a scalable workload no matter how much it grows.
    ...
    Stefano @ AWS (on discussion forums)
    Query vs. Scan?
    ]
    [

    View Slide

  78. Query vs. Scan?
    Limiting the capabilities of Query's comparison operators was a deliberate
    decision, to ensure that this API's performance will always remain
    predictable, no matter the scale of the table (size or throughput).
    This limitation forces the developer to perform more work upfront, but it
    will yield a scalable workload no matter how much it grows.
    An operation like CONTAINS could seem appealing on paper, but its
    performance would start slowing progressively as the dataset size grows,
    eventually requiring a painful rearchitecture down the road.
    Stefano @ AWS (on discussion forums)

    View Slide

  79. Lost Update

    View Slide

  80. Writing to DynamoDB
    ]
    [
    41
    http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/QueryAndScan.html

    View Slide

  81. Writing to DynamoDB
    ]
    [
    41
    http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/QueryAndScan.html
    How to solve the “Lost update”:
    Optimistic Concurrency Control
    (A.K.A. Conditional Writes)

    View Slide

  82. Writing to DynamoDB
    http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/QueryAndScan.html
    How to solve the “Lost update”:
    Optimistic Concurrency Control
    (a.k.a. Conditional Writes)
    Put/Update/Delete are always ACID
    (Atomicity, Consistency, Isolation, Durability);
    “Isolation” applies only at the Item level.

    View Slide

  83. DynamoDB
    The “Lost update” concurrency issue
    ]
    [
    Client 1 Client 2
    Id=1
    Price=10
    Time

    View Slide

  84. DynamoDB
    The “Lost update” concurrency issue
    ]
    [
    Client 1 Client 2
    Id=1
    Price=10
    Id=1
    Price=10
    Id=1
    Price=10
    GetItem GetItem
    Time

    View Slide

  85. DynamoDB
    The “Lost update” concurrency issue
    ]
    [
    Client 1 Client 2
    Id=1
    Price=10
    Id=1
    Price=10
    Id=1
    Price=10
    GetItem GetItem
    PutItem
    PutItem
    Time
    Id=1
    Price=10
    Id=1
    Price=12
    Id=1
    Price=10
    Id=1
    Price=12
    Id=1
    Price=12
    Id=1
    Price=8
    Id=1
    Price=10
    Id=1
    Price=8

    View Slide

  86. DynamoDB
    How to fix it with Conditional Writes
    ]
    [
    Client 1 Client 2
    Id=1
    Price=10
    Id=1
    Price=10
    Id=1
    Price=10
    GetItem GetItem
    Time

    View Slide

  87. DynamoDB
    How to fix it with Conditional Writes
    ]
    [
    Client 1 Client 2
    Id=1
    Price=10
    Id=1
    Price=10
    Id=1
    Price=10
    GetItem GetItem
    Time
    Id=1
    Price=10
    Id=1
    Price=12
    Id=1
    Price=10
    Id=1
    Price=8

    View Slide

  88. DynamoDB
    How to fix it with Conditional Writes
    ]
    [
    Client 1 Client 2
    Id=1
    Price=10
    Id=1
    Price=10
    Id=1
    Price=10
    GetItem GetItem
    if (Price=10)
    PutItem
    Time
    Id=1
    Price=10
    Id=1
    Price=12
    Id=1
    Price=10
    Id=1
    Price=12
    Id=1
    Price=10
    Id=1
    Price=8

    View Slide
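
The lost update and its fix can be replayed with a toy in-memory store. This is a sketch only: `ToyItemStore`, its `expected` parameter and `ConditionFailed` are hypothetical names that mimic the idea of a conditional write, not a client library.

```python
class ConditionFailed(Exception):
    """Raised when the stored item does not match the expected values."""


class ToyItemStore:
    def __init__(self):
        self._items = {}

    def get_item(self, key):
        return dict(self._items[key])

    def put_item(self, key, item, expected=None):
        """Write `item`; if `expected` is given, write only when every
        expected attribute still has the expected value."""
        current = self._items.get(key)
        if expected is not None:
            for name, value in expected.items():
                if current is None or current.get(name) != value:
                    raise ConditionFailed(name)
        self._items[key] = dict(item)


# Lost update: both clients read Price=10, then write unconditionally.
store = ToyItemStore()
store.put_item(1, {"Id": 1, "Price": 10})
seen_by_client_1 = store.get_item(1)
seen_by_client_2 = store.get_item(1)
store.put_item(1, {"Id": 1, "Price": 12})  # client 1
store.put_item(1, {"Id": 1, "Price": 8})   # client 2 silently overwrites
lost_update_price = store.get_item(1)["Price"]

# Conditional writes: each client writes only if Price is still what it read.
store.put_item(1, {"Id": 1, "Price": 10})
store.put_item(1, {"Id": 1, "Price": 12},
               expected={"Price": seen_by_client_1["Price"]})
try:
    store.put_item(1, {"Id": 1, "Price": 8},
                   expected={"Price": seen_by_client_2["Price"]})
    client_2_succeeded = True
except ConditionFailed:
    client_2_succeeded = False  # client 2 must re-read and decide again
safe_price = store.get_item(1)["Price"]
```

In the second run the write from client 2 is rejected, so the value written by client 1 survives.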

  89. Conditional Writes or Atomic Counters?
    ]
    [

    View Slide

  90. Conditional Writes or Atomic Counters?
    ]
    [
    Conditional Writes
    •Idempotent operation
    •Small overhead

    View Slide

  91. Conditional Writes or Atomic Counters?
    ]
    [
    Conditional Writes
    •Idempotent operation
    •Small overhead
    Atomic Counters
    •Increment/Decrement
    •Allow simultaneous
    write requests
    •NOT Idempotent

    View Slide
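
The idempotency difference matters when a client retries a request whose response was lost. The sketch below uses two hypothetical helpers on a plain dictionary, `conditional_set` and `atomic_add`. It is single-threaded and only illustrates the retry behaviour, not the server-side atomicity.

```python
def conditional_set(store, key, expected, new_value):
    """Set store[key] only if it currently equals `expected`."""
    if store.get(key) != expected:
        return False
    store[key] = new_value
    return True


def atomic_add(store, key, delta):
    """Increment store[key] by `delta` without reading it first."""
    store[key] = store.get(key, 0) + delta
    return store[key]


# Conditional write, sent twice because the first response was lost.
views = {"page": 10}
first_try = conditional_set(views, "page", expected=10, new_value=11)
retry = conditional_set(views, "page", expected=10, new_value=11)
after_conditional = views["page"]  # the retry changed nothing

# Atomic counter, sent twice for the same reason.
hits = {"page": 10}
atomic_add(hits, "page", 1)
atomic_add(hits, "page", 1)
after_counter = hits["page"]  # counted twice
```

The retried conditional write leaves the value at 11, while the retried increment ends at 12, which is why counters suit statistics that tolerate small errors.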

  92. (eventually)
    consistent read

    View Slide

  93. (Eventually) Consistent Reads
    ]
    [

    View Slide

  94. (Eventually) Consistent Reads
    ]
    [
    Consistent Read
    •Consumes more “read
    capacity” units (2x)
    •Consistency reached
    within 1,000 ms after
    last write

    View Slide

  95. (Eventually) Consistent Reads
    ]
    [
    Consistent Read
    •Consumes more “read
    capacity” units (2x)
    •Consistency reached
    within 1,000 ms after
    last write
    Eventually
    Consistent Read
    •Can read immediately
    after a write (2 copies)
    •Read old or new value

    View Slide

  96. (Eventually) Consistent Reads
    ]
    [
    Consistent Read
    •Consumes more “read
    capacity” units (2x)
    •Consistency reached
    within 1,000 ms after
    last write
    Eventually
    Consistent Read
    •Can read immediately
    after a write (2 copies)
    •Read old or new value
    (2 copies)

    View Slide

  97. (Eventually) Consistent Reads
    ]
    [
    Consistent Read
    •Consumes more “read
    capacity” units (2x)
    •Consistency reached
    within 1,000 ms after
    last write
    Eventually
    Consistent Read
    •Can read immediately
    after a write (2 copies)
    •Read old or new value
    (2 copies)
    Let me explain...

    View Slide

  98. DynamoDB
    Durability in DynamoDB
    ]
    [
    Client
    Time

    View Slide

  99. DynamoDB
    Durability in DynamoDB
    ]
    [
    Client
    Id=1
    Price=10
    Id=1
    Price=10
    PutItem
    Time
    Confirmation

    View Slide

  100. Example: Consistent Read (HTTP request)
    // This header is abbreviated.
    POST / HTTP/1.1
    x-amz-target: DynamoDB_20111205.GetItem
    content-type: application/x-amz-json-1.0
    {"TableName":"comptable",
     "Key":
       {"HashKeyElement":{"S":"Julie"},
        "RangeKeyElement":{"N":"1307654345"}},
     "AttributesToGet":["status","friends"],
     "ConsistentRead":true
    }

    View Slide

  101. Example: Consistent Read (HTTP request)
    // This header is abbreviated.
    POST / HTTP/1.1
    x-amz-target: DynamoDB_20111205.GetItem
    content-type: application/x-amz-json-1.0
    {"TableName":"comptable",
     "Key":
       {"HashKeyElement":{"S":"Julie"},
        "RangeKeyElement":{"N":"1307654345"}},
     "AttributesToGet":["status","friends"],
     "ConsistentRead":true
    }
    Callout: "ConsistentRead":true requests a Consistent Read.

    View Slide
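
The request body on the slide can be assembled in Python with the standard `json` module. This is a minimal sketch: `build_get_item_request` is a hypothetical helper, and the request signing and the other headers that the slide abbreviates are left out, so the result is not a complete, sendable request.

```python
import json


def build_get_item_request(table, hash_value, range_value, attributes,
                           consistent_read=False):
    """Return (headers, body) mirroring the GetItem example on the slide."""
    headers = {
        "x-amz-target": "DynamoDB_20111205.GetItem",
        "content-type": "application/x-amz-json-1.0",
    }
    body = {
        "TableName": table,
        "Key": {
            "HashKeyElement": {"S": hash_value},
            "RangeKeyElement": {"N": str(range_value)},
        },
        "AttributesToGet": list(attributes),
        # True asks for a consistent read instead of an eventually
        # consistent one.
        "ConsistentRead": consistent_read,
    }
    return headers, json.dumps(body)


headers, body = build_get_item_request(
    "comptable", "Julie", 1307654345, ["status", "friends"],
    consistent_read=True,
)
```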

  102. Consistency
    Availability
    Partition Tolerance
    CAP Theorem
    ]
    [

    View Slide

  103. APIs

    View Slide

  104. DynamoDB APIs
    ]
    [

    View Slide

  105. DynamoDB APIs
    ]
    [
    Table
    •CreateTable
    •UpdateTable
    •DeleteTable
    •DescribeTable
    •ListTables
    Item
    •PutItem
    •GetItem
    •UpdateItem
    •DeleteItem
    •BatchGetItem
    •BatchWriteItem
    Query/Scan
    •Query
    •Scan

    View Slide

  106. 53 (image)
    Monitoring DynamoDB with CloudWatch
    ]
    [

    View Slide

  107. Monitoring DynamoDB with CloudWatch
    (image)
    • Successful Request Latency
    • Consumed Read Capacity Units
    • Consumed Write Capacity Units
    • Throttled Requests
    • User Errors
    • System Errors
    • Returned Item Count

    View Slide

  108. A simple example
    ]
    [

    View Slide

  109. A simple example
    ]
    [
    Let’s take a look.
    How to do things with Python
    and the BOTO library?

    View Slide

  110. 55
    Download and install the BOTO Python library (Mac OS)
    (video)

    View Slide

  111. 56
    Enter AWS credentials and connect to DynamoDB
    (video)

    View Slide

  112. Table
    We are going to use this schema...
    ]
    [

    View Slide

  113. We are going to use this schema...
    Table
    hash key: forum
    range key: subject
    read_units=5
    write_units=5

    View Slide

  114. Messages
    ... to create a table, and add items.
    ]
    [

    View Slide

  115. ... to create a table, and add items.
    Table: Messages
    Item 1: forum="AWS forum", subject="Hello!", Body="http://127.0.0.1/hello.gif", SentBy="Simone"
    Item 2: forum="AWS forum", subject="Goodbye!", Body="Nice meeting with you!", SentBy="Simone"

    View Slide

  116. 59
    Define Schema and Table, then create the Table
    (video)

    View Slide

  117. 60
    Put an Item, retrieve the Item from the Table
    (video)

    View Slide

  118. 61
    Adding another Item with the AWS Management Console
    (video)

    View Slide

  119. Ok, I get BOTO.
    ]
    [

    View Slide

  120. Ok, I get BOTO.
    ]
    [
    Do you want
    more choice?

    View Slide

  121. 63 (image)
    Amazon DynamoDB libraries, mappers, etc.
    ]
    [

    View Slide

  122. 63 (image)
    Perl
    Javascript
    Erlang
    Node.js
    Java
    Django
    PHP
    Ruby
    Python
    .NET
    Groovy /
    Grails
    Cold
    Fusion
    Amazon DynamoDB libraries, mappers, etc.
    ]
    [

    View Slide

  123. 64
    Hive: Importing/Exporting/Querying Data in DynamoDB

    View Slide

  124. 65
    DynamoDB costs:
    0.0065 $/h per 10 writes/second
    0.0065 $/h per 50 “strong” reads/second
    1.00 $/month per GB

    View Slide
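
As a worked example of the prices above, the sketch below estimates a monthly bill. Several things are assumptions and not stated on the slide: the 730-hour month, rounding throughput up to whole blocks of 10 writes or 50 strong reads, and ignoring any free tier. The function name `monthly_cost` is hypothetical.

```python
import math

PRICE_PER_BLOCK_HOUR = 0.0065  # $/h per 10 writes/s or per 50 strong reads/s
WRITES_PER_BLOCK = 10
STRONG_READS_PER_BLOCK = 50
STORAGE_PRICE_PER_GB = 1.00    # $/month per GB
HOURS_PER_MONTH = 730          # assumption: average month length


def monthly_cost(writes_per_second, strong_reads_per_second, storage_gb,
                 hours=HOURS_PER_MONTH):
    """Estimate the monthly cost in dollars from the slide's 2013 prices."""
    write_blocks = math.ceil(writes_per_second / WRITES_PER_BLOCK)
    read_blocks = math.ceil(strong_reads_per_second / STRONG_READS_PER_BLOCK)
    throughput = (write_blocks + read_blocks) * PRICE_PER_BLOCK_HOUR * hours
    storage = storage_gb * STORAGE_PRICE_PER_GB
    return round(throughput + storage, 2)


# 100 writes/s, 500 strong reads/s and 20 GB of data:
# (10 + 10) blocks * $0.0065 * 730 h = $94.90, plus $20.00 storage.
estimate = monthly_cost(100, 500, 20)
```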

  125. 65
    DynamoDB costs:
    0.0065 $/h per 10 writes/second
    0.0065 $/h per 50 “strong” reads/second
    1.00 $/month per GB
    Unlike Scan, Query only operates on
    matching records, not all records.
    You only pay for the throughput of the
    items that match, not for everything
    scanned.

    View Slide

  126. 65
    DynamoDB costs:
    0.0065 $/h per 10 writes/second
    0.0065 $/h per 50 “strong” reads/second
    1.00 $/month per GB
    For large BLOBs or infrequently
    accessed data, use Amazon S3
    (DynamoDB item limit: 64 KB)
    You can store smaller data elements
    or file pointers in DynamoDB

    View Slide

  127. 65
    DynamoDB costs:
    0.0065 $/h per 10 writes/second
    0.0065 $/h per 50 “strong” reads/second
    1.00 $/month per GB
    DynamoDB Free tier:
    5 writes/second
    10 consistent reads/second
    100 MB storage

    View Slide

  128. 66
    http://www.allthingsdistributed.com/2012/01/amazon-dynamodb.html

    View Slide

  129. Local Secondary Index
    (LSI)

    View Slide

  130. (video)
    Create a table with a Local Secondary Index (LSI), query it

    View Slide

  131. Thoughts...

    View Slide

  132. 70
    But what if the server / storage / datacenter fails? On most NoSQL systems
    you would lose your most recent changes, or the data might be saved but
    could be offline and unavailable.
    ...
    James Hamilton, VP and Distinguished Engineer, Amazon Web Services

    View Slide

  133. 71
    But what if the server / storage / datacenter fails? On most NoSQL systems
    you would lose your most recent changes, or the data might be saved but
    could be offline and unavailable.
    With DynamoDB, if data is committed just as one entire datacenter burns to
    the ground, the data is safe, and the application can continue to run
    without negative impact at exactly the same provisioned throughput rate.
    The loss of an entire datacenter isn’t even inconvenient, and has no impact
    on your running application performance.
    ...
    James Hamilton, VP and Distinguished Engineer, Amazon Web Services

    View Slide

  134. 72
    But what if the server / storage / datacenter fails? On most NoSQL systems
    you would lose your most recent changes, or the data might be saved but
    could be offline and unavailable.
    With DynamoDB, if data is committed just as one entire datacenter burns to
    the ground, the data is safe, and the application can continue to run
    without negative impact at exactly the same provisioned throughput rate.
    The loss of an entire datacenter isn’t even inconvenient, and has no impact
    on your running application performance.
    Combining rock solid synchronous, multi-datacenter redundancy with
    average latency in the single digits, and throughput scaling to the millions
    of requests per second is both an excellent engineering challenge and one
    often not achieved.
    James Hamilton, VP and Distinguished Engineer, Amazon Web Services

    View Slide

  135. 73 (image)
    Live repartitioning, no downtime
    ]
    [

    View Slide

  136. Why DynamoDB
    ]
    [
    • Sorted range keys
    • Conditional updates
    • Atomic counters
    • Structured data and multi-valued data types
    • Fetching and updating single attributes
    • Strong consistency
    • No table size limits
    • Live repartitioning
    • Disk-only writes
    • IOPS per table
    No explicit way to handle conflicts other than conditions

    View Slide

  137. View Slide

  138. Amazon DynamoDB
    Simone Brunozzi ( @simon)
    Senior Technology Evangelist
    Amazon Web Services
    v 4.0 - Apr 20th, 2013
    http://bit.ly/dynamodb2013

    View Slide