
Big Data Redis Mongodb Dynamodb Sharding


Building Cloud-Native App Series - Part 4 of 12
Microservices Architecture Series
NoSQL vs SQL
Redis, MongoDB, AWS DynamoDB
Big Data Design Patterns
Sharding, Partitions

Araf Karsh Hamid

June 01, 2022


Transcript

  1. @arafkarsh arafkarsh
    Microservice Architecture Series
    Building Cloud Native Apps
    Part 4 of 12
    NoSQL Vs. SQL
    Redis / MongoDB / DynamoDB
    Scalability: Shards and Partitions
    Data Lake / Data Warehouse
    ARAF KARSH HAMID
    Co-Founder / CTO
    MetaMagic Global Inc., NJ, USA
    Architecting & Building Apps
    8 Years: Network & Security
    6+ Years: Microservices, Blockchain
    8 Years: Cloud Computing
    8 Years: Distributed Computing
    A tech presentorial: a combination of presentation & tutorial


  2. @arafkarsh arafkarsh 2
    Slides are color coded based on the topic colors.
    1. NoSQL Vs. SQL
    2. Redis, MongoDB, Dynamo DB
    3. Scalability: Sharding & Partitions
    4. Data Lake, Data Warehouse


  3. @arafkarsh arafkarsh
    Developer Journey
    Monolithic
    • Waterfall / Agile Scrum (4-6 Weeks)
    • Domain Driven Design, Event Sourcing and CQRS (Optional Design Patterns)
    • Continuous Integration (CI)
    • 6/12 Months
    • Enterprise Service Bus
    • Relational Database [SQL] / NoSQL
    • Development, QA / QC, Ops
    Microservices
    • Scrum / Kanban (1-5 Days)
    • Domain Driven Design, Event Sourcing and CQRS (Mandatory Design Patterns)
    • Infrastructure Design Patterns
    • CI, CD, DevOps
    • Event Streaming / Replicated Logs
    • SQL, NoSQL
    • Container Orchestrator, Service Mesh


  4. @arafkarsh arafkarsh
    Application Modernization – 3 Transformations
    Modernization
    1. Architecture: Monolithic -> SOA -> Microservice
    2. Infrastructure: Physical Server -> Virtual Machine -> Cloud
    3. Delivery: Waterfall -> Agile -> DevOps
    Source: IBM: Application Modernization > https://www.youtube.com/watch?v=RJ3UQSxwGFY


  5. @arafkarsh arafkarsh
    NoSQL vs. SQL
    • Tables and Rows Vs Documents (MongoDB)
    • Multi Table Acid Transactions
    5
    1


  6. @arafkarsh arafkarsh
    NoSQL Databases
    Database | Type | License | ACID | Query | Use Case
    Couchbase | Doc Based, Key Value | Open Source | Yes | N1QL | Financial Services, Inventory, IoT
    Cassandra | Wide Column | Open Source | No | CQL | Social Analytics, Retail, Messaging
    Neo4J | Graph | Open Source, Commercial | Yes | Cypher | AI, Master Data Mgmt, Fraud Protection
    Redis | Key Value | Open Source | Yes | Many languages | Caching, Queuing
    Mongo DB | Doc Based | Open Source, Commercial | Yes | JS | IoT, Real Time Analytics, Inventory
    Amazon Dynamo DB | Key Value, Doc Based | Vendor | Yes | DQL | Gaming, Retail, Financial Services
    Source: https://searchdatamanagement.techtarget.com/infographic/NoSQL-database-comparison-to-help-you-choose-the-right-store.


  7. @arafkarsh arafkarsh
    SQL Vs NoSQL
    Criteria | SQL | NoSQL
    Database Type | Relational | Non-Relational
    Schema | Pre-Defined | Dynamic Schema
    Database Category | Table Based | 1. Documents 2. Key Value Stores 3. Graph Stores 4. Wide Column Stores
    Queries | Complex Queries (Standard SQL for all Relational Databases) | Need to apply a special query language for each type of NoSQL DB.
    Hierarchical Storage | Not a Good Fit | Perfect
    Scalability | Scales well for traditional applications | Scales well for modern, heavy data-oriented applications
    Query Language | SQL – Standard Language across all the Databases | Non-standard query language, as each NoSQL DB is different.
    ACID Support | Yes | For some of the databases (Ex. MongoDB)
    Data Size | Good for traditional applications | Handles massive amounts of data for modern App requirements.


  8. @arafkarsh arafkarsh
    SQL Vs NoSQL (MongoDB)
    8
    1. In MongoDB, transactional properties are scoped at the document level.
    2. One or more fields can be written atomically in a single operation.
    3. This includes updates to multiple sub-documents, including nested arrays.
    4. Any error causes the entire operation to roll back.
    5. This is on par with the data integrity guarantees provided by traditional databases.
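
The all-or-nothing behaviour described above can be sketched in plain Python. This is only an in-memory model of the idea, not MongoDB's implementation or driver API; the `atomic_update` helper, the path tuples and the sample `order` document are hypothetical names chosen for illustration.

```python
import copy


def atomic_update(document, changes):
    """Apply every change to a copy and return it only if all changes succeed.

    `changes` is a list of (path, value) pairs, where a path is a tuple of
    keys / list indexes leading to a field inside the document.
    """
    working = copy.deepcopy(document)      # the original stays untouched
    for path, value in changes:
        *parents, leaf = path
        target = working
        for key in parents:
            target = target[key]           # raises KeyError / IndexError on a bad path
        target[leaf] = value
    return working


order = {"_id": 1, "status": "NEW", "items": [{"sku": "book1", "qty": 1}]}

# Two fields (one inside a nested array) change together.
updated = atomic_update(order, [(("status",), "PAID"), (("items", 0, "qty"), 2)])

# The second change fails, so the first one is never made visible.
try:
    atomic_update(order, [(("status",), "SHIPPED"), (("items", 5, "qty"), 9)])
    failed = False
except (KeyError, IndexError):
    failed = True
```

After the failed call, `order` still has `status` set to `NEW`, which mirrors the rollback described in point 4.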


  9. @arafkarsh arafkarsh
    Multi Table / Doc ACID Transactions
    9
    Examples – Systems of Record or Line of Business (LoB) Applications
    1. Finance
    1. Moving funds between Bank Accounts,
    2. Payment Processing Systems
    3. Trading Platforms
    2. Supply Chain
    • Transferring ownership of Goods & Services through Supply
    Chains and Booking Systems – Ex. Adding Order and Reducing
    inventory.
    3. Billing System
    1. Adding a Call Detail Record and then updating Monthly Plan.
    Source: ACID Transactions in MongoDB


  10. @arafkarsh arafkarsh
    NoSQL Databases
    10
    2


  11. @arafkarsh arafkarsh
    Redis
    • Data Structures
    • Design Patterns
    In-Memory Databases
    2020 | 2019 | NoSQL Database | Model
    1 | 1 | Redis | Key-Value, Multi Model
    2 | 2 | Amazon DynamoDB | Multi Model
    3 | 3 | Microsoft Cosmos | Multi Model
    4 | 4 | Memcached | Key-Value


  12. @arafkarsh arafkarsh
    Why do you need In-Memory Databases
    12
    1 Users 1 Million +
    2 Data Volume Terabytes to Petabytes
    3 Locality Global
    4 Performance Microsecond Latency
    5 Request Rate Millions Per Second
    6 Access Mobile, IoT, Devices
    7 Economics Pay as you go
    8 Developer Access Open API
    Source: AWS re:Invent 2020: https://www.youtube.com/watch?v=2WkJeofqIJg


  13. @arafkarsh arafkarsh
    Tables / Docs (JSON) – Why Redis is different?
    13
    • Redis is a Multi data model Key Store
    • Commands operate on Keys
    • Data types of Keys can change over time
    Source: https://www.youtube.com/watch?v=ELk_W9BBTDU


  14. @arafkarsh arafkarsh
    Keys, Values & Data Types
    14
    Key Name: movie:StarWars
    Value: “Sold Out”
    Basic Data Types
    • String
    • Hash
    • List
    • Set
    • Sorted Set
    Key Properties
    • Unique
    • Binary Safe (Case Sensitive)
    • Max Size = 512 MB
    Expiration / TTL
    • By default, keys are retained
    • Time in seconds, milliseconds, or Unix epoch
    • Can be added to or removed from a key
    ➢ SET movie:StarWars "Sold Out" EX 5000 (expires in 5000 seconds)
    ➢ PEXPIRE movie:StarWars 5 (expires in 5 milliseconds)
    https://redis.io/commands/set


  15. @arafkarsh arafkarsh
    Redis – Remote Dictionary Server
    15
    Distributed In-Memory Data Store
    String Standard String data
    Hash { A: “John Doe”, B: “New York”, C: “USA” }
    List [ A -> B -> C -> D -> E ]
    Set { A, B, C, D, E }
    Sorted Set { A:10, B:12, C:14, D:20, E:32 }
    Stream … msg1, msg2, msg3
    Pub / Sub … msg1, msg2, msg3
    https://redis.io/topics/data-types


  16. @arafkarsh arafkarsh
    Data Type: Hash
    16
    Key Name: movie:The-Force-Awakens
    Field | Value
    Director | J. J. Abrams
    Writer | L. Kasdan, J. J. Abrams, M. Arndt
    Cinematography | Dan Mindel
    ➢ HGET movie:The-Force-Awakens Director
    “J. J. Abrams”
    • Field & Value Pairs
    • Single Level
    • Add and Remove Fields
    https://redis.io/topics/data-types
    https://redis.io/commands#hash
    Use Cases
    • Session Cache
    • Rate Limiting


  17. @arafkarsh arafkarsh
    Data Type: List
    17
    movies
    Key Name
    “Force Awakens, The” “Last Jedi, The” “Rise of Skywalker, The”
    ➢ LPOP movies
    “Force Awakens, The”
    ➢ LPOP movies
    “Last Jedi, The”
    ➢ RPOP movies
    “Rise of Skywalker, The”
    ➢ RPOP movies
    “Last Jedi, The”
    • Ordered List (FIFO or LIFO)
    • Duplicates Allowed
    • Elements added from Left or Right or By Position
    • Max 4 Billion elements per List
    Type of Lists
    • Queues
    • Stacks
    • Capped List
    https://redis.io/topics/data-types
    https://redis.io/commands#list
    Use Cases
    • Communication
    • Activity List
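
The queue and stack behaviour of a list can be imitated with Python's `collections.deque`. This is a sketch, assuming each pair of commands on the slide (the two LPOPs and the two RPOPs) starts from the full three-element list; the `new_movies` helper is a hypothetical name.

```python
from collections import deque


def new_movies():
    """Return a fresh copy of the three-element list from the slide."""
    return deque(["Force Awakens, The", "Last Jedi, The", "Rise of Skywalker, The"])


# Popping from the left twice, like two LPOP calls.
from_left = new_movies()
first_left = from_left.popleft()
second_left = from_left.popleft()

# Popping from the right twice, like two RPOP calls on a fresh list.
from_right = new_movies()
first_right = from_right.pop()
second_right = from_right.pop()
```

Adding on one side and removing from the other gives a queue (FIFO); adding and removing on the same side gives a stack (LIFO).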


  18. @arafkarsh arafkarsh
    Data Type: Set
    18
    movies
    Member / Element
    “Force Awakens, The”
    “Last Jedi, The”
    “Rise of Skywalker, The”
    ➢ SMEMBERS movies
    “Force Awakens, The”
    “Last Jedi, The”
    “Rise of Skywalker, The”
    • Unordered collection of Unique Elements
    • Set Operations
    • Difference
    • Intersect
    • Union
    https://redis.io/topics/data-types
    https://redis.io/commands#set
    Key Name
    Use Cases
    • Unique Visitors


  19. @arafkarsh arafkarsh
    Data Type: Sorted Set
    19
    Key Name: movies
    Value | Score
    “Force Awakens, The” | 3
    “Last Jedi, The” | 1
    “Rise of Skywalker, The” | 2
    ➢ ZRANGE movies 0 1
    “Last Jedi, The”
    “Rise of Skywalker, The”
    • Ordered List of Unique Elements
    • Set Operations
    • Intersect
    • Union
    https://redis.io/topics/data-types
    https://redis.io/commands#set
    Use Cases
    • Leaderboard
    • Priority Queues


  20. @arafkarsh arafkarsh
    Redis: Transactions
    20
    • Transactions are
    • Atomic
    • Isolated
    • Redis commands are queued
    • All the queued commands are executed sequentially as an atomic unit
    ➢ MULTI
    ➢ SET movie:The-Force-Awakens:Review Good
    ➢ INCR movie:The-Force-Awakens:Rating
    ➢ EXEC
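
The queue-then-execute flow of MULTI / EXEC can be modelled in a few lines of Python. This is an in-memory sketch of the idea only, not a Redis client; the `MiniTransaction` class and its methods are hypothetical names.

```python
class MiniTransaction:
    """Queues commands and applies them to a dict only when execute() is called."""

    def __init__(self, store):
        self.store = store
        self.queued = []

    def set(self, key, value):
        self.queued.append(("SET", key, value))
        return "QUEUED"

    def incr(self, key):
        self.queued.append(("INCR", key))
        return "QUEUED"

    def execute(self):
        """Run every queued command in order and return the list of replies."""
        replies = []
        for command in self.queued:
            if command[0] == "SET":
                _, key, value = command
                self.store[key] = value
                replies.append("OK")
            else:
                _, key = command
                self.store[key] = int(self.store.get(key, 0)) + 1
                replies.append(self.store[key])
        self.queued = []
        return replies


store = {}
tx = MiniTransaction(store)                       # MULTI
tx.set("movie:The-Force-Awakens:Review", "Good")  # queued, not applied yet
tx.incr("movie:The-Force-Awakens:Rating")         # queued, not applied yet
snapshot_before_exec = dict(store)
replies = tx.execute()                            # EXEC
```

Nothing is visible in `store` until `execute()` runs, at which point both commands are applied one after the other.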


  21. @arafkarsh arafkarsh
    Redis In-Memory Data Store Use cases
    21
    Machine
    Learning
    Message
    Queues
    Gaming
    Leaderboards
    Geospatial
    Session
    Store
    Media
    Streaming
    Real-time
    Analytics
    Caching


  22. @arafkarsh arafkarsh
    Use Case: Sorted Set – Leader Board
    22
    • Collection of Sorted Distinct
    Entities
    • Set Operations and Range
    Queries based on Score
    value: John
    score: 610
    value : Jane
    score: 987
    value : Sarah
    score: 1597
    value : Maya
    score: 144
    value : Fred
    score: 233
    value : Ann
    score: 377
    Game Scores
    ➢ ZADD game:1 987 Jane 1597 Sarah 377 Ann 610 John 144 Maya 233 Fred
    ➢ ZREVRANGE game:1 0 3 WITHSCORES (Get top 4 Scores)
    • Sarah 1597
    • Jane 987
    • John 610
    • Ann 377
    Source: AWS re:Invent 2020: https://www.youtube.com/watch?v=2WkJeofqIJg
    https://redis.io/commands/zadd
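
The top-4 query can be reproduced in plain Python using the names and scores from the Game Scores diagram. This is a sketch of what a descending range over a sorted set returns, not a Redis client call; `top_n` is a hypothetical helper name.

```python
# Member -> score, as shown in the Game Scores diagram.
scores = {
    "John": 610,
    "Jane": 987,
    "Sarah": 1597,
    "Maya": 144,
    "Fred": 233,
    "Ann": 377,
}


def top_n(board, n):
    """Return the n highest-scoring (member, score) pairs, best first."""
    return sorted(board.items(), key=lambda item: item[1], reverse=True)[:n]


top_four = top_n(scores, 4)
```

The range `0 3` in the command is inclusive on both ends, which is why it yields four entries.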


  23. @arafkarsh arafkarsh
    Use Case: Geospatial
    23
    • Compute distance between
    members
    • Find all members within a
    radius
    Source: AWS re:Invent 2020: https://www.youtube.com/watch?v=2WkJeofqIJg
    ➢ GEOADD cities -87.6298 41.8781 Chicago
    ➢ GEOADD cities -122.3321 47.6062 Seattle
    ➢ ZRANGE cities 0 -1
    • “Chicago”
    • “Seattle”
    ➢ GEODIST cities Chicago Seattle mi
    • “1733.4089”
    ➢ GEORADIUS cities -122.4194 37.7749 1000 mi WITHDIST
    • “Seattle”
    • “679.4848”
    Units
    o m for meters
    o km for kilometres
    o mi for miles
    o ft for feet
    https://redis.io/commands/geodist
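
The distances on this slide can be approximated with the haversine (great-circle) formula. The sketch below is not Redis code; the Earth radius constant is an assumption chosen to be close to the value Redis is understood to use, and the coordinates are the slide's cities with western longitudes written as negative numbers.

```python
import math

EARTH_RADIUS_METERS = 6372797.560856   # assumed radius of a spherical Earth
METERS_PER_MILE = 1609.344


def haversine_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two (longitude, latitude) points."""
    lat1_rad, lat2_rad = math.radians(lat1), math.radians(lat2)
    delta_lat = lat2_rad - lat1_rad
    delta_lon = math.radians(lon2 - lon1)
    a = (math.sin(delta_lat / 2) ** 2
         + math.cos(lat1_rad) * math.cos(lat2_rad) * math.sin(delta_lon / 2) ** 2)
    return 2 * EARTH_RADIUS_METERS * math.asin(math.sqrt(a)) / METERS_PER_MILE


chicago = (-87.6298, 41.8781)
seattle = (-122.3321, 47.6062)
query_point = (-122.4194, 37.7749)      # centre of the radius search

chicago_to_seattle = haversine_miles(*chicago, *seattle)
query_to_seattle = haversine_miles(*query_point, *seattle)
query_to_chicago = haversine_miles(*query_point, *chicago)
```

Under these assumptions, Seattle falls inside the 1000-mile radius of the query point and Chicago falls outside it, which matches the single result shown for the radius search.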


  24. @arafkarsh arafkarsh
    Use Case: Streams
    24
    • Ordered collection of Data
    • Efficient for consuming from the tail
    • Supports multiple consumers, similar to Kafka
    [Diagram: a stream of entries running from START to END, each holding an order ID (“xy2123adbcd”) with an item (“book1”) and a qty (1). The stream is read by Consumer 1, Consumer 2 ... Consumer n, and by Consumer G1 and Consumer G2 in Consumer Group G.]
    Producer
    ➢ XADD orderStream * orderId1:item1:qty1
    ➢ XADD orderStream * orderId2:item1:qty2
    * Autogenerates the unique ID
    https://redis.io/commands/xadd
    Consumer
    ➢ XREAD BLOCK 20 STREAMS orderStream $
    • orderId2
    • Item1
    • qty2


  25. @arafkarsh arafkarsh
    MongoDB: Design Patterns
    1. Prefer Embedding
    2. Embrace Duplication
    3. Know when Not to Embed
    4. Relationships and Join
    25


  26. @arafkarsh arafkarsh
    MongoDB Docs – Prefer Embedding
    26
    Use
    Structure
    to use
    Data
    within a
    Document
    Include
    Bounded
    Arrays to
    have
    multiple
    records


  27. @arafkarsh arafkarsh
    MongoDB Docs – Embrace Duplication
    27
    Field Info
    Duplicated
    from
    Customer
    Profile
    Address
    Duplicated
    from
    Customer
    Profile


  28. @arafkarsh arafkarsh
    Know When Not to Embed
    28
    As Item is used outside
    of Order, You don’t
    need to embed the
    whole Object here.
    Instead give the Item
    Reference ID.
    (Not to Embed)
    Name is given to
    decouple it from Item
    (Product) Service.
    (Embrace Duplication)


  29. @arafkarsh arafkarsh
    Relationships and Joins
    29
    Reviews are joined to Product
    Collection using Item UUID
    Bi-Directional Joins are also
    supported


  30. @arafkarsh arafkarsh
    MongoDB – Tips & Best Practices
    30
    1. By default, MongoDB will abort any multi-document transaction that runs for more than 60 seconds.
    2. No more than 1000 documents should be modified within a transaction.
    3. Developers need to add logic to retry the transaction in case it is aborted due to a network error.
    4. Transactions that affect multiple shards incur a greater performance cost, as operations are coordinated across multiple participating nodes over the network.
    5. Performance will be impacted if a transaction runs against a collection that is subject to rebalancing.
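
The retry logic mentioned in tip 3 can be sketched as a small wrapper. This is a generic illustration, not MongoDB driver code: `TransientTransactionError`, `run_with_retry` and `flaky_transfer` are hypothetical names, and a real application would catch whichever retryable error its driver raises.

```python
import time


class TransientTransactionError(Exception):
    """Hypothetical stand-in for a retryable error such as a network failure."""


def run_with_retry(transaction, max_attempts=3, delay_seconds=0.0):
    """Call transaction() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TransientTransactionError:
            if attempt == max_attempts:
                raise                    # give up and surface the last error
            time.sleep(delay_seconds)    # optional pause before the next try


attempts = []


def flaky_transfer():
    """Fails twice with a transient error, then commits."""
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientTransactionError("network error")
    return "committed"


result = run_with_retry(flaky_transfer)
```

The wrapped function should be safe to run more than once, since a retried transaction re-executes all of its operations.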


  31. @arafkarsh arafkarsh
    Amazon DynamoDB
    DynamoDB Concepts
    DynamoDB Design Patterns
    Performance


  32. @arafkarsh arafkarsh
    Amazon DynamoDB Concept
    Customer ID Name Category State
    Order
    Order
    Customer
    Cart
    Payments
    Order
    Cart
    Catalogue
    Catalogue
    Table
    Product ID Name Value Description Image
    Item ID Quantity Value Currency
    User ID + Item ID
    Attributes
    1. A single Table holds multiple Entities (Customer, Catalogue, Cart, Order etc.) aka Items.
    2. An Item contains a collection of Attributes.
    3. The Primary Key plays a key role in Performance, Scalability and avoiding Joins (in a typical RDBMS way).
    4. The Primary Key contains a Partition Key and an optional Sort Key.
    5. The Item Data Model is JSON, and an Attribute can be a field or a Custom Object.
    Items
    Primary Key


  33. @arafkarsh arafkarsh
    DynamoDB – Under the Hood
    One single table holds multiple Entities with multiple documents (Records in RDBMS style)
    1 Org Record
    2 Employee Records
    1. DynamoDB Structure is JSON (Document Model). However, it has no resemblance to MongoDB in terms of DB implementation or Schema Design Patterns.
    2. Multiple Entities are part of the Single Table and this helps to avoid expensive joins. For Ex. PK = ORG#Magna will retrieve all the 3 records: 1 Record from the Org Entity and 2 Records from the Employee Entity.
    3. Partition Key helps in Sharding and Horizontal Scalability.
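
The single-table idea can be illustrated with an in-memory list standing in for the table. This is a sketch, not the DynamoDB API: only the partition key value `ORG#Magna` comes from the slide, while the sort key values, the attribute names and the `query` helper are assumptions made for illustration.

```python
# One table holding two entity types; PK is the partition key, SK the sort key.
table = [
    {"PK": "ORG#Magna", "SK": "ORG#Magna", "type": "Org", "name": "Magna"},
    {"PK": "ORG#Magna", "SK": "EMP#001", "type": "Employee", "name": "Employee One"},
    {"PK": "ORG#Magna", "SK": "EMP#002", "type": "Employee", "name": "Employee Two"},
    {"PK": "ORG#Other", "SK": "ORG#Other", "type": "Org", "name": "Other"},
]


def query(items, partition_key, sort_key_prefix=""):
    """Return items in one partition, optionally narrowed by a sort key prefix."""
    matches = [
        item for item in items
        if item["PK"] == partition_key and item["SK"].startswith(sort_key_prefix)
    ]
    return sorted(matches, key=lambda item: item["SK"])


everything_for_magna = query(table, "ORG#Magna")          # org + employees, no join
employees_only = query(table, "ORG#Magna", "EMP#")
```

Because the organisation and its employees share a partition key, one lookup returns all three records without a join.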


  34. @arafkarsh arafkarsh
    Scalability: Sharding / Partitions
    • Scale Cube
    • eBay Case Study
    • Sharding and Partitions
    34
    3


  35. @arafkarsh arafkarsh
    Scalability
    • Scale Cube
    • eBay Case Study
    35


  36. @arafkarsh arafkarsh
    App Scalability based on microservices architecture
    Source: The NewStack. Based on The Art of Scalability by Martin Abbott & Michael Fisher


  37. @arafkarsh arafkarsh
    Scale Cube and Micro Services
    37
    1. Functional
    Decomposition
    2. Avoid locks by
    Database Sharding
    3. Cloning Services


  38. @arafkarsh arafkarsh
    Scalability Best Practices: Lessons from eBay
    #1 Partition By Function
    • Decouple the unrelated functionalities.
    • Selling functionality is served by one set of applications, bidding by another, search by yet another.
    • 16,000 App Servers in 220 different pools
    • 1000 logical databases, 400 physical hosts
    #2 Split Horizontally
    • Break the workload into manageable units.
    • eBay’s interactions are stateless by design
    • All App Servers are treated equally and none retains any transactional state
    • Data Partitioning based on specific requirements
    #3 Avoid Distributed Transactions
    • 2 Phase Commit is a pessimistic approach that comes with a big COST
    • CAP Theorem (Consistency, Availability, Partition Tolerance). Apply any two at any point in time.
    • @ eBay: No Distributed Transactions of any kind and NO 2 Phase Commit.
    #4 Decouple Functions Asynchronously
    • If Component A calls component B synchronously, then they are tightly coupled. For such systems, to scale A you need to scale B also.
    • If the call is asynchronous, A can move forward irrespective of the state of B
    • SEDA (Staged Event Driven Architecture)
    #5 Move Processing to Asynchronous Flow
    • Move as much processing as possible towards the asynchronous side
    • Anything that can wait should wait
    #6 Virtualize at All Levels
    • Virtualize everything. eBay created their own O/R layer for abstraction
    #7 Cache Appropriately
    • Cache slow-changing, read-mostly data, metadata, configuration and static data.
    Source: http://www.infoq.com/articles/ebay-scalability-best-practices


  39. @arafkarsh arafkarsh
    Database Shards / Partitions
    • Cap Theorem
    • Sharding / Partitioning
    • Geo Partitioning
    • Oracle Sharding and Geo Partitioning
    39


  40. @arafkarsh arafkarsh
    CAP Theorem by Eric Allen Brewer
    40
    Pick Any 2!! Say NO to 2 Phase Commit ☺
    Source: https://en.wikipedia.org/wiki/CAP_theorem | http://en.wikipedia.org/wiki/Eric_Brewer_(scientist)
    CAP 12 years later: How the “Rules have changed”
    “In a network subject to communication failures, it is
    impossible for any web service to implement an atomic
    read / write shared memory that guarantees a response
    to every request.”
    Partition Tolerance
    The system continues to operate despite an arbitrary
    number of messages being dropped (or delayed) by
    the network between nodes.
    Consistency
    Every read receives the
    most recent write or an
    error.
    Availability
    Every request receives a (non-error) response – without
    guarantee that it contains the most recent write.


  41. @arafkarsh arafkarsh
    Sharding / Partitioning
    41
    1. Optimize the Database
    2. Separate Rows or Columns into multiple smaller tables
    3. Each table has either the Same Schema with Unique Rows
    4. Or has a Schema that is a subset of the Original
    Method | Scalability | Table | Schema
    Sharding | Horizontal | Rows | Same Schema with Unique Rows
    Sharding | Vertical | Columns | Different Schema
    Partition | Vertical | Rows | Same Schema with Unique Rows
    Original Table
    Customer ID | Customer Name | DOB | City
    1 | ABC | | Bengaluru
    2 | DEF | | Tokyo
    3 | JHI | | Kochi
    4 | KLM | | Pune
    Horizontal Sharding - 1
    Customer ID | Customer Name | DOB | City
    1 | ABC | | Bengaluru
    2 | DEF | | Tokyo
    Horizontal Sharding - 2
    Customer ID | Customer Name | DOB | City
    3 | JHI | | Kochi
    4 | KLM | | Pune
    Vertical Sharding - 1
    Customer ID | Customer Name | DOB
    1 | ABC |
    2 | DEF |
    3 | JHI |
    4 | KLM |
    Vertical Sharding - 2
    Customer ID | City
    1 | Bengaluru
    2 | Tokyo
    3 | Kochi
    4 | Pune
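
The two ways of splitting the customer table can be shown with a few list operations. This is a minimal sketch using the slide's rows; the DOB column is left out because it is empty on the slide, and the function names are hypothetical.

```python
customers = [
    {"customer_id": 1, "name": "ABC", "city": "Bengaluru"},
    {"customer_id": 2, "name": "DEF", "city": "Tokyo"},
    {"customer_id": 3, "name": "JHI", "city": "Kochi"},
    {"customer_id": 4, "name": "KLM", "city": "Pune"},
]


def split_horizontally(rows, last_id_in_first_shard):
    """Same schema in both shards; each row lives in exactly one of them."""
    first = [row for row in rows if row["customer_id"] <= last_id_in_first_shard]
    second = [row for row in rows if row["customer_id"] > last_id_in_first_shard]
    return first, second


def split_vertically(rows, key, columns):
    """Every row appears in the result, but only with the key and chosen columns."""
    return [{key: row[key], **{column: row[column] for column in columns}} for row in rows]


horizontal_1, horizontal_2 = split_horizontally(customers, 2)
vertical_1 = split_vertically(customers, "customer_id", ["name"])
vertical_2 = split_vertically(customers, "customer_id", ["city"])
```

The key column is repeated in both vertical shards so the original row can be reassembled by matching on it.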


  42. @arafkarsh arafkarsh
    Sharding Scenarios
    42
    1. Horizontal Scaling: Single Server is unable to handle the load
    even after partitioning the data sets.
    2. Data can be partitioned in such a way that specific server(s)
    can serve the search query based on the partition. For Ex. In
    an e-Commerce Application Searching the data based on
    1. Product Type
    2. Product Brand
    3. Sellers Region (for Local Shipping)
    4. Orders based on Year / Months
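
Routing a request to the server that owns its partition can be sketched as a two-step lookup: pick the group of shards by seller region, then pick one shard inside the group by hashing an ID. The region names, shard names and function names below are hypothetical. A SHA-256 digest is used instead of Python's built-in `hash()`, which is randomized per process for strings and so would not give stable routing.

```python
import hashlib

# Hypothetical layout: each seller region owns its own group of shards.
REGION_SHARDS = {
    "ASIA": ["asia-0", "asia-1"],
    "AMERICA": ["america-0", "america-1"],
}


def stable_hash(value):
    """Hash a string to an integer that is the same on every run and machine."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(seller_region, order_id):
    """Pick the shard group by region, then one shard in it by hashing the ID."""
    shards = REGION_SHARDS[seller_region]
    return shards[stable_hash(order_id) % len(shards)]


first_lookup = shard_for("ASIA", "order-1001")
second_lookup = shard_for("ASIA", "order-1001")
american_shard = shard_for("AMERICA", "order-1001")
```

A search that filters on seller region only needs to touch that region's shards, which is the benefit described in point 2.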


  43. @arafkarsh arafkarsh
    Geo Partitioning
    43
    • Geo-partitioning is the ability to control the location of
    data at the row level.
    • CockroachDB lets you control which tables are replicated
    to which nodes. But with geo-partitioning, you can control
    which nodes house data with row-level granularity.
    • This allows you to keep customer data close to the user,
    which reduces the distance it needs to travel, thereby
    reducing latency and improving user experience.
    Source: https://www.cockroachlabs.com/blog/geo-partition-data-reduce-latency/


  44. @arafkarsh arafkarsh
    Oracle Database – Geo Partitioning
    44
    Source: https://www.oracle.com/a/tech/docs/sharding-wp-12c.pdf


  45. @arafkarsh arafkarsh
    Oracle Sharding and Geo
    45
    CREATE SHARDED TABLE customers (
    cust_id NUMBER NOT NULL ,
    name VARCHAR2(50) ,
    address VARCHAR2(250) ,
    geo VARCHAR2(20) ,
    class VARCHAR2(3) ,
    signup_date DATE ,
    CONSTRAINT cust_pk PRIMARY KEY(geo, cust_id) )
    PARTITIONSET BY LIST (geo)
    PARTITION BY CONSISTENT HASH (cust_id)
    PARTITIONS AUTO (
    PARTITIONSET AMERICA VALUES ('AMERICA') TABLESPACE SET tbs1,
    PARTITIONSET ASIA VALUES ('ASIA') TABLESPACE SET tbs2
    );
    Linear Scalability
    Primary Shard | Standby Shards | Read / Write Tx / Second | Read Only Tx / Second
    25 | 25 | 1.18 Million | 1.62 Million
    50 | 50 | 2.11 Million | 3.26 Million
    75 | 75 | 3.57 Million | 5.05 Million
    100 | 100 | 4.38 Million | 6.82 Million
    Source: https://www.oracle.com/a/tech/docs/sharding-wp-12c.pdf


  46. @arafkarsh arafkarsh
    Oracle
    Sharding
    Compared with
    Cassandra and
    MongoDB
    46


  47. @arafkarsh arafkarsh
    MongoDB: Cluster
    1. Replication
    2. Automatic Failover
    3. Sharding
    47


  48. @arafkarsh arafkarsh
    MongoDB Replication
    48
    [Diagram: the Application (Client App Driver) connects to a replica set with one Primary Server and two Secondary Servers. The primary replicates to each secondary, and the members exchange heartbeats.]
    Source: MongoDB Replication https://docs.mongodb.com/manual/replication/
    ✓ Provides redundancy and High Availability.
    ✓ Provides Fault Tolerance: multiple copies of data on different database servers ensure that the loss of a single database server will not affect the Application.
    What does the Primary do?
    1. Receives all write operations
    What does a Secondary do?
    1. Replicates the primary's oplog and
    2. Applies the operations to its data set such that the secondaries' data sets reflect the primary's data set.
    3. Secondaries apply the operations to their data sets asynchronously
    Replica Set Connection Configuration
    mongodb://mongodb0.example.com:27017,mongodb1.example.com:27017,mongodb2.example.com:27017/?replicaSet=myRepl
    Use Secure Connection
    mongodb://myDBReader:D1fficultP%40ssw0rd@mongodb0.example.com:27017
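
The `%40` in the password above is a percent-encoded `@`. A small helper can build such connection strings and encode the credentials; this sketch uses only the Python standard library, and `replica_set_uri` is a hypothetical helper name rather than a driver function.

```python
from urllib.parse import quote_plus


def replica_set_uri(hosts, replica_set, username=None, password=None):
    """Build a mongodb:// URI, percent-encoding the credentials if given."""
    credentials = ""
    if username is not None and password is not None:
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    return f"mongodb://{credentials}{','.join(hosts)}/?replicaSet={replica_set}"


hosts = [
    "mongodb0.example.com:27017",
    "mongodb1.example.com:27017",
    "mongodb2.example.com:27017",
]

plain_uri = replica_set_uri(hosts, "myRepl")
secure_uri = replica_set_uri(hosts, "myRepl", "myDBReader", "D1fficultP@ssw0rd")
```

Without the encoding, the `@` inside the password would be read as the separator between the credentials and the host list.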


  49. @arafkarsh arafkarsh
    MongoDB Replication: Automatic Failover
    49
    Source: MongoDB Replication https://docs.mongodb.com/manual/replication/
    ✓ If the Primary is NOT reachable for longer than the configured electionTimeoutMillis (default 10 seconds), one of the Secondaries becomes the Primary after an election process.
    ✓ The most up-to-date Secondary will become the next Primary.
    ✓ Election should not take more than 12 seconds to elect a Primary.
    [Diagram: the Primary Server stops responding to heartbeats; the two Secondary Servers hold an election for a new Primary; one Secondary becomes the Primary and replicates to the remaining Secondary.]
    ✓ Write operations are blocked until the new Primary is selected.
    ✓ Secondary members of the replica set can serve read operations while the election is in progress, provided they are configured for that.
    ✓ MongoDB 4.2+ compatible drivers enable retryable writes by default.
    ✓ MongoDB 4.0 and 3.6-compatible drivers must explicitly enable retryable writes by including retryWrites=true in the connection string.


  50. @arafkarsh arafkarsh
    MongoDB Replication: Arbiter
    50
    [Diagram: the Application (Client App Driver) connects to a replica set made up of a Primary Server, a Secondary Server and an Arbiter; the primary replicates to the secondary.]
    ✓ An Arbiter can be used to save the
    cost of adding an additional
    Secondary Server.
    ✓ Arbiter will handle only the election
    process to select a Primary.
    Source: MongoDB Replication https://docs.mongodb.com/manual/replication/


  51. @arafkarsh arafkarsh
    MongoDB Replication: Secondary Reads
    51
    [Diagram: the Application (Client App Driver) writes to the Primary Server and reads from a Secondary Server; the primary replicates to two Secondary Servers, and the members exchange heartbeats.]
    Source: MongoDB Replication https://docs.mongodb.com/manual/core/read-preference/
    ✓ Asynchronous replication to secondaries means that reads from secondaries may return data that does not reflect the state of the data on the primary.
    ✓ Multi-document transactions that contain read operations must use read preference primary. All operations in a given transaction must route to the same member.
    Write to Primary and Read from Secondary
    $ > mongo 'mongodb://mongodb0,mongodb1,mongodb2/?replicaSet=rsOmega&readPreference=secondary'


  52. @arafkarsh arafkarsh
    MongoDB – Deploy Replica Set
    52
    1 Use Command Line to set Replica details to each Mongo Instance
    $ > mongod --replSet "rsOmega" --bind_ip localhost,
    Use Config file to set the Replica Config to each Mongo Instance
    Config File
    replication:
      replSetName: "rsOmega"
    net:
      bindIp: localhost,
    $ > mongod --config
    Source: MongoDB Replication https://docs.mongodb.com/manual/tutorial/deploy-replica-set/


  53. @arafkarsh arafkarsh
    MongoDB – Deploy Replica Set
    53
    2 Connect to Mongo DB
    $ > mongo
    3 Initiate the Replica Set
    > rs.initiate( {
      _id : "rsOmega",
      members: [
        { _id: 0, host: "mongodb0.host.com:27017" },
        { _id: 1, host: "mongodb1.host.com:27017" },
        { _id: 2, host: "mongodb2.host.com:27017" }
      ]
    })
    Run rs.initiate() on one and only one mongod instance for the replica set.
    Source: MongoDB Replication https://docs.mongodb.com/manual/tutorial/deploy-replica-set/


  54. @arafkarsh arafkarsh
    MongoDB – Deploy Replica Set
    54
    4 Show the Replica Config
    > rs.conf()
    5 Ensure that the replica set has a primary
    > rs.status()
    6 Connect to the Replica Set
    $ > mongo 'mongodb://mongodb0,mongodb1,mongodb2/?replicaSet=rsOmega'
    Source: MongoDB Replication https://docs.mongodb.com/manual/tutorial/deploy-replica-set/


  55. @arafkarsh arafkarsh
    MongoDB Sharding
    55
    [Diagram: the Application (Client App Driver) connects through Routers (mongos) to Shard 1, Shard 2 and Shard 3. Each shard is a replica set with one Primary Server and two Secondary Servers. The Config Servers also form a replica set with one Primary Server and two Secondary Servers.]


  56. @arafkarsh arafkarsh
    Distributed Transactions
    • Saga Design Pattern
    • Features
    • Handling Invariants
    • Forward recovery
    • Local Saga Feature
    • Distributed Saga
    • Use Case: Distributed Saga
    56
    4


  57. @arafkarsh arafkarsh
    Distributed Transactions : 2 Phase Commit
    2 PC or not 2 PC, Wherefore Art Thou XA?
    57
    How does 2PC impact scalability?
    • Transactions are committed in two phases.
    • This involves communicating with every database (XA
    Resources) involved to determine if the transaction will commit
    in the first phase.
    • During the second phase each database is asked to complete
    the commit.
    • While all of this coordination is going on, locks in all of the data
    sources are being held.
    • The longer duration locks create the risk of higher contention.
    • Additionally, the two phases require more database
    processing time than a single phase commit.
    • The result is lower overall TPS in the system.
    Transaction
    Manager
    XA Resources
    Request to Prepare
    Commit
    Prepared
    Prepare
    Phase
    Commit
    Phase
    Done
    Source : Pat Helland (Amazon) : Life Beyond Distributed Transactions Distributed Computing : http://dancres.github.io/Pages/
    Solution : Resilient System
    • Event Based
    • Design for failure
    • Asynchronous Recovery
    • Make all operations idempotent.
    • Each DB operation is a 1 PC


  58. @arafkarsh arafkarsh
    Distributed Tx: SAGA Design Pattern instead of 2PC
    58
    Long Lived Transactions (LLTs) hold on to DB resources for relatively long periods of time, significantly delaying
    the termination of shorter and more common transactions.
    Source: SAGAS (1987) Hector Garcia Molina / Kenneth Salem,
    Dept. of Computer Science, Princeton University, NJ, USA
    Divide long-lived, distributed transactions into quick local ones with compensating actions for recovery.
    Local Transactions: T1, T2 ... Tn
    Compensating Transactions: C1, C2 ... Cn-1
    BASE (Basic Availability, Soft State, Eventual Consistency)
    Travel: Flight Ticket & Hotel Booking Example
    T1 Room Reserved -> T2 Room Payment -> T3 Seat Reserved -> T4 Ticket Payment
    C1 Cancelled Room Reservation
    C2 Cancelled Room Payment
    C3 Cancelled Ticket Reservation
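
The travel booking example can be sketched as a list of local steps, each paired with a compensating action that is run in reverse order when a later step fails. This is a minimal in-process illustration, not a distributed saga framework; `run_saga`, `succeed` and `decline` are hypothetical names, and the compensations here do nothing beyond being recorded in the log.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []
    log = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception:
            # Backward recovery: compensate in reverse order of completion.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensate:{done_name}")
            return False, log
        log.append(name)
        completed.append((name, compensation))
    return True, log


def succeed():
    """A local transaction (or compensation) that completes normally."""


def decline():
    """A local transaction that fails, for example a rejected payment."""
    raise RuntimeError("payment declined")


booking = [
    ("Room Reserved", succeed, succeed),     # T1, compensated by C1
    ("Room Payment", succeed, succeed),      # T2, compensated by C2
    ("Seat Reserved", succeed, succeed),     # T3, compensated by C3
    ("Ticket Payment", decline, succeed),    # T4 fails in this run
]

ok, log = run_saga(booking)
```

There is one compensation fewer than there are transactions in use here, because the last step either succeeds or has nothing of its own to undo.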


  59. @arafkarsh arafkarsh
    SAGA Design Pattern Features
    Type 1. Backward Recovery (Rollback)
    T1 -> T2 -> T3 -> T4, then C3 -> C2 -> C1
    Examples: Order Processing, Banking Transactions, Ticket Booking
    Type 2. Forward Recovery with Save Points
    T1 (sp) -> T2 (sp) -> T3 (sp)
    Examples: Updating individual scores in a Team Game.
    • To recover from Hardware Failures, SAGA needs to be persistent.
    • Save Points are available for both Forward and Backward Recovery.
    Source: SAGAS (1987) Hector Garcia Molina / Kenneth Salem, Dept. of Computer Science, Princeton University, NJ, USA
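Forward recovery with save points can be sketched as a saga that remembers the index of the next step to run. `SavePointSaga` is a hypothetical class, and the save point lives in memory here; as the slide notes, a real saga would persist it to survive hardware failures.

```java
import java.util.ArrayList;
import java.util.List;

/** Saga that records a save point after every completed step. */
class SavePointSaga {
    private final List<String> steps;
    private int savePoint = 0;                        // index of the next step to run
    final List<String> executed = new ArrayList<>();  // steps that actually ran

    SavePointSaga(List<String> steps) {
        this.steps = steps;
    }

    /**
     * Runs from the last save point. If failAt names a step, the run stops just before
     * that step (simulating a crash) and returns false. The save point is left untouched,
     * so a later call resumes at the same step instead of starting over.
     */
    boolean runFromSavePoint(String failAt) {
        while (savePoint < steps.size()) {
            String step = steps.get(savePoint);
            if (step.equals(failAt)) {
                return false;
            }
            executed.add(step);
            savePoint++; // would be written durably in a real system
        }
        return true;
    }
}
```

After a simulated crash at T2, a second call continues with T2 and T3; T1 is not executed again.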

  60. @arafkarsh arafkarsh
    Handling Invariants – Monolithic to Micro Services
    In a typical Monolithic App, Customer Credit Limit info and order processing are part of
    the same App. The following is typical pseudo code.
    Monolithic 2 Phase Commit:
        Begin Transaction
            If Order Value <= Available Credit
                Process Order
                Process Payments
        End Transaction
    In the Micro Services world with Event Sourcing, it's a distributed environment. The order is
    cancelled if the Credit is NOT available. If Payment Processing fails, the Credit Reserved is
    cancelled.
    T1 Order Created (Order Microservice)
    T2 Credit Reserved (Customer Microservice)
    T3 Payment Processed (Payment Microservice)
    C1 Order Cancelled
    C2 Credit Cancelled due to payment failure
    https://en.wikipedia.org/wiki/Invariant_(computer_science)
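A rough sketch of how the credit-limit invariant survives the move to a saga: the Customer service checks it inside its own local transaction, and a compensation releases the credit when payment fails. `CustomerCredit` and `OrderSaga` are hypothetical names, the three services are collapsed into plain method calls, and cancelling the order after a payment failure is an assumption made for this example.

```java
/** Hypothetical Customer microservice state: enforces the credit-limit invariant locally. */
class CustomerCredit {
    private long available;

    CustomerCredit(long available) {
        this.available = available;
    }

    /** T2: reserve credit only if the invariant orderValue <= available holds. */
    boolean reserve(long orderValue) {
        if (orderValue > available) {
            return false;
        }
        available -= orderValue;
        return true;
    }

    /** C2: release a reservation after a payment failure. */
    void release(long orderValue) {
        available += orderValue;
    }

    long available() {
        return available;
    }
}

class OrderSaga {
    /** Returns the final order state: CONFIRMED or CANCELLED. */
    static String place(long orderValue, CustomerCredit credit, boolean paymentSucceeds) {
        // T1: the order is created in the Order microservice.
        if (!credit.reserve(orderValue)) {
            return "CANCELLED";          // C1: credit is not available
        }
        if (!paymentSucceeds) {          // T3 failed in the Payment microservice
            credit.release(orderValue);  // C2: cancel the reserved credit
            return "CANCELLED";          // C1: cancel the order (assumed here)
        }
        return "CONFIRMED";
    }
}
```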

  61. @arafkarsh arafkarsh 61
    Use Case : Restaurant – Forward Recovery
    Domain
    The example focuses on the concept of a Restaurant which tracks the visit of an individual or
    group to the Restaurant. When people arrive at the Restaurant and take a table, a table is
    opened. They may then order drinks and food. Drinks are served immediately by the table staff,
    however food must be cooked by a chef. Once the chef has prepared the food it can then be served.
    Source: http://cqrs.nu/tutorial/cs/01-design
    Diagram labels: Dining, Billing, Payment
    Aggregate Root: Dining Order
    Event Stream: Table Opened, Juice Ordered, Soda Ordered, Soda Cancelled, Appetizer Ordered,
    Soup Ordered, Food Ordered, Juice Served, Food Prepared, Food Served, Appetizer Served, Table Closed
    Aggregate Root: Food Bill
    T1 Billed Order (sp) -> T2 Payment CC (sp) -> T3 Payment Cash (sp)
    [Diagram labels: sp, Network Error, C1, sp]
    The transaction doesn't roll back if one payment method fails. It moves forward to the
    next one.
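The "move forward to the next payment method" behaviour can be sketched as a loop over payment attempts. `BillPayment` is a hypothetical class, and each payment method is reduced to a supplier that reports success or failure.

```java
import java.util.Map;
import java.util.function.BooleanSupplier;

class BillPayment {
    /**
     * Tries each payment method in iteration order and returns the name of the first
     * one that succeeds, or null if all of them fail. A failed method is skipped,
     * not rolled back, so the bill itself stays intact.
     */
    static String settle(Map<String, BooleanSupplier> methodsInOrder) {
        for (Map.Entry<String, BooleanSupplier> method : methodsInOrder.entrySet()) {
            if (method.getValue().getAsBoolean()) {
                return method.getKey();
            }
        }
        return null;
    }
}
```

Callers should pass a map with a defined iteration order, such as a `LinkedHashMap`, so that the card is tried before cash.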

  62. @arafkarsh arafkarsh
    Local SAGA Features
    1. Part of the Micro Services
    2. Local Transactions and Compensating
    Transactions
    3. SAGA State is persisted
    4. All the Local transactions are based on
    Single Phase Commit (1 PC)
    5. Developers need to ensure that
    appropriate compensating
    transactions are raised in the event of
    a failure.
    API Examples
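    // Marks the method whose local transaction starts the saga.
    @StartSaga(name="HotelBooking")
    public void reserveRoom(…) {
    }
    // Marks the method whose local transaction ends the saga.
    @EndSaga(name="HotelBooking")
    public void payForTickets(…) {
    }
    // Marks the method that aborts the saga.
    @AbortSaga(name="HotelBooking")
    public void cancelBooking(…) {
    }
    // Marks a compensating transaction.
    @CompensationTx()
    public void cancelReservation(…) {
    }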
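The slide does not say which library provides these annotations, so the sketch below declares two of them itself to show how such an API could work. Everything here is an assumption made for illustration: the annotation declarations, the `HotelBookingService` class and the `SagaScanner` helper that finds annotated methods by reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical declarations; runtime retention makes them visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface StartSaga {
    String name();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface CompensationTx {
}

class HotelBookingService {
    @StartSaga(name = "HotelBooking")
    public void reserveRoom() {
    }

    @CompensationTx
    public void cancelReservation() {
    }
}

class SagaScanner {
    /** Returns the name of the method that starts the given saga, or null if none is annotated. */
    static String startMethodOf(Class<?> type, String sagaName) {
        for (Method m : type.getDeclaredMethods()) {
            StartSaga start = m.getAnnotation(StartSaga.class);
            if (start != null && start.name().equals(sagaName)) {
                return m.getName();
            }
        }
        return null;
    }

    /** Returns true if the type declares at least one compensating transaction. */
    static boolean hasCompensation(Class<?> type) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(CompensationTx.class)) {
                return true;
            }
        }
        return false;
    }
}
```

A check like `hasCompensation` is one way a framework could warn developers who start a saga without providing a compensating transaction.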

  63. @arafkarsh arafkarsh
    SAGA Execution Container
    1. SEC is a separate Process
    2. Stateless in nature; Saga state is stored in a
    messaging system (Kafka is a good choice).
    3. An SEC process failure MUST NOT affect Saga execution:
    on restart, the SEC must resume from where the Saga
    left off.
    4. SEC – No Single Point of Failure (Master Slave Model).
    5. Distributed SAGA Rules are defined using a DSL.
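Because the SEC keeps no state of its own, a restarted instance can work out where to resume purely from the saga log. The sketch below assumes log entries of the form "Start X" and "End X" (the convention the travel booking slides use); `SagaLogReplay` is a hypothetical helper, and the log is a plain list rather than a Kafka topic.

```java
import java.util.List;

class SagaLogReplay {
    /**
     * Given the ordered step names of a saga and the log written so far, returns the
     * next log entry the SEC should drive, or null when the saga has already ended.
     * A step that was started but never ended is re-issued, which is only safe
     * when the called services are idempotent.
     */
    static String nextAction(List<String> steps, List<String> log) {
        if (log.isEmpty()) {
            return "Start Saga";
        }
        if (log.contains("End Saga")) {
            return null;
        }
        for (String step : steps) {
            if (!log.contains("End " + step)) {
                return "Start " + step;
            }
        }
        return "End Saga";
    }
}
```

Any replica that can read the log can take over, which is what makes a standby SEC instance practical.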

  64. @arafkarsh arafkarsh
    Use Case : Travel Booking – Distributed Saga (SEC)
    [Diagram: the Saga Execution Container receives Start Saga {Booking Request}, calls Hotel Booking, Car Booking, Flight Booking and Payment, and returns End Saga {Booking Confirmed}.]
    Saga Log: Start Saga, Start Hotel, End Hotel, Start Car, End Car, Start Flight, End Flight, Start Payment, End Payment, End Saga
    The SEC knows the structure of the distributed Saga: for each request, which service
    needs to be called and which recovery mechanism needs to be followed.
    The SEC can parallelize the calls to multiple services to improve performance.
    Whether to roll back or roll forward depends on the business case.
    Source: Distributed Sagas by Caitie McCaffrey, June 6, 2017
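Parallel service calls of this kind can be sketched with `CompletableFuture` from the Java standard library. `ParallelBooking` is a hypothetical class, and each booking service is reduced to a supplier that reports success or failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

class ParallelBooking {
    /** Calls every service concurrently and returns true only if all of them succeed. */
    static boolean bookAll(Map<String, Supplier<Boolean>> services) {
        List<CompletableFuture<Boolean>> calls = new ArrayList<>();
        for (Supplier<Boolean> call : services.values()) {
            calls.add(CompletableFuture.supplyAsync(call)); // runs on the common pool
        }
        boolean allOk = true;
        for (CompletableFuture<Boolean> call : calls) {
            allOk &= call.join(); // wait for every call, even after one has failed
        }
        return allOk;
    }
}
```

Waiting for every call before deciding matters for sagas: the SEC needs to know which bookings actually completed so it can compensate exactly those.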

  65. @arafkarsh arafkarsh
    Use Case : Travel Booking – Rollback
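    [Diagram: the Saga Execution Container receives Start Saga {Booking Request}; Car Booking aborts, so a compensating saga (Start Comp Saga ... End Comp Saga) runs and the result is End Saga {Booking Cancelled}. Services shown: Hotel Booking, Car Booking, Flight Booking, Payment.]
    Saga Log: Start Saga, Start Hotel, End Hotel, Start Car, Abort Car, Cancel Hotel, Cancel Flight, End Saga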
    Kafka is a good choice to implement the SEC log.
    The SEC is completely STATELESS in nature. A Master Slave model can be implemented to avoid
    a Single Point of Failure.
    Source: Distributed Sagas by Caitie McCaffrey, June 6, 2017

  66. @arafkarsh arafkarsh
    Summary: Databases
    1. DB Sharding / Partition
    2. 2 Phase Commit
       Doesn't scale well in a cloud environment.
    3. SAGA Design Pattern
       Raise compensating events when the local transaction fails.
    4. SAGA Supports Rollbacks & Roll Forwards
       Critical pattern to address distributed transactions.

  67. @arafkarsh arafkarsh
    Scalability Best Practices: Lessons from eBay
    Best Practices and Highlights
    #1 Partition By Function
    • Decouple the Unrelated Functionalities.
    • Selling functionality is served by one set of applications, bidding by another, search by yet another.
    • 16,000 App Servers in 220 different pools
    • 1000 logical databases, 400 physical hosts
    #2 Split Horizontally
    • Break the workload into manageable units.
    • eBay’s interactions are stateless by design
    • All App Servers are treated equal and none retains any transactional state
    • Data Partitioning based on specific requirements
    #3 Avoid Distributed Transactions
    • 2 Phase Commit is a pessimistic approach that comes with a big COST.
    • CAP Theorem (Consistency, Availability, Partition Tolerance): choose any two at any point in time.
    • At eBay: no Distributed Transactions of any kind and NO 2 Phase Commit.
    #4 Decouple Functions Asynchronously
    • If Component A calls component B synchronously, then they are tightly coupled. To scale A in
    such a system, you need to scale B also.
    • If the call is asynchronous, A can move forward irrespective of the state of B.
    • SEDA (Staged Event Driven Architecture)
    #5 Move Processing to Asynchronous Flow
    • Move as much processing as possible to the asynchronous side.
    • Anything that can wait should wait.
    #6 Virtualize at All Levels
    • Virtualize everything. eBay created their own O/R layer for abstraction.
    #7 Cache Appropriately
    • Cache slow-changing, read-mostly data, metadata, configuration and static data.
    Source: http://www.infoq.com/articles/ebay-scalability-best-practices
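Splitting data horizontally needs a rule that maps each key to a partition. The sketch below uses simple hash-modulo routing as one possible rule; `ShardRouter` is a hypothetical class and is not taken from the eBay article.

```java
class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Maps a key to a shard index in the range [0, shardCount). */
    int shardFor(String key) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), shardCount);
    }
}
```

The same key always lands on the same shard, so any stateless app server can route a request. The trade-off is that changing the shard count remaps most keys, which is why schemes such as consistent hashing are often preferred when shards are added frequently.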

  68. @arafkarsh arafkarsh 68
    100s Microservices
    1,000s Releases / Day
    10,000s Virtual Machines
    100K+ User actions / Second
    81 M Customers Globally
    1 B Time series Metrics
    10 B Hours of video streaming
    every quarter
    Source: NetFlix: https://www.youtube.com/watch?v=UTKIT6STSVM
    10s Ops Engineers
    0 NOC
    0 Data Centers
    So what does NetFlix think about DevOps?
    No DevOps
    Don’t do a lot of Process / Procedures
    Freedom for Developers & be Accountable
    Trust people you Hire
    No Controls / Silos / Walls / Fences
    Ownership – You Build it, You Run it.

  69. @arafkarsh arafkarsh 69
    50M Paid Subscribers
    100M Active Users
    60 Countries
    Cross Functional Team
    Full, End to End ownership of features
    Autonomous
    1000+ Microservices
    Source: https://microcph.dk/media/1024/conference-microcph-2017.pdf
    1000+ Tech Employees
    120+ Teams

  70. @arafkarsh arafkarsh 70
    Design Patterns
    Design Patterns are solutions to general problems that software developers
    face during software development.

  71. @arafkarsh arafkarsh 71
    DREAM | AUTOMATE | EMPOWER
    Araf Karsh Hamid :
    India: +91.999.545.8627
    http://www.slideshare.net/arafkarsh
    https://www.linkedin.com/in/arafkarsh/
    https://www.youtube.com/user/arafkarsh/playlists
    http://www.arafkarsh.com/
    @arafkarsh
    arafkarsh

  72. @arafkarsh arafkarsh 72
    Source Code: https://github.com/MetaArivu Web Site: https://metarivu.com/ https://pyxida.cloud/

  73. @arafkarsh arafkarsh 73
    http://www.slideshare.net/arafkarsh

  74. @arafkarsh arafkarsh
    References
    1. July 15, 2015 – Agile is Dead : GoTo 2015 By Dave Thomas
    2. Apr 7, 2016 - Agile Project Management with Kanban | Eric Brechner | Talks at Google
    3. Sep 27, 2017 - Scrum vs Kanban - Two Agile Teams Go Head-to-Head
    4. Feb 17, 2019 - Lean vs Agile vs Design Thinking
    5. Dec 17, 2020 - Scrum vs Kanban | Differences & Similarities Between Scrum & Kanban
    6. Feb 24, 2021 - Agile Methodology Tutorial for Beginners | Jira Tutorial | Agile Methodology Explained.
    Agile Methodologies

  75. @arafkarsh arafkarsh
    References
    1. Vmware: What is Cloud Architecture?
    2. Redhat: What is Cloud Architecture?
    3. Cloud Computing Architecture
    4. Cloud Adoption Essentials:
    5. Google: Hybrid and Multi Cloud
    6. IBM: Hybrid Cloud Architecture Intro
    7. IBM: Hybrid Cloud Architecture: Part 1
    8. IBM: Hybrid Cloud Architecture: Part 2
    9. Cloud Computing Basics: IaaS, PaaS, SaaS
    1. IBM: IaaS Explained
    2. IBM: PaaS Explained
    3. IBM: SaaS Explained
    4. IBM: FaaS Explained
    5. IBM: What is Hypervisor?
    Cloud Architecture

  76. @arafkarsh arafkarsh
    References
    Microservices
    1. Microservices Definition by Martin Fowler
    2. When to use Microservices By Martin Fowler
    3. GoTo: Sep 3, 2020: When to use Microservices By Martin Fowler
    4. GoTo: Feb 26, 2020: Monolith Decomposition Pattern
    5. Thought Works: Microservices in a Nutshell
    6. Microservices Prerequisites
    7. What do you mean by Event Driven?
    8. Understanding Event Driven Design Patterns for Microservices

  77. @arafkarsh arafkarsh
    References – Microservices – Videos
    1. Martin Fowler – Micro Services : https://www.youtube.com/watch?v=2yko4TbC8cI&feature=youtu.be&t=15m53s
    2. GOTO 2016 – Microservices at NetFlix Scale: Principles, Tradeoffs & Lessons Learned. By R Meshenberg
    3. Mastering Chaos – A NetFlix Guide to Microservices. By Josh Evans
    4. GOTO 2015 – Challenges Implementing Micro Services By Fred George
    5. GOTO 2016 – From Monolith to Microservices at Zalando. By Rodrigue Schaefer
    6. GOTO 2015 – Microservices @ Spotify. By Kevin Goldsmith
    7. Modelling Microservices @ Spotify : https://www.youtube.com/watch?v=7XDA044tl8k
    8. GOTO 2015 – DDD & Microservices: At last, Some Boundaries By Eric Evans
    9. GOTO 2016 – What I wish I had known before Scaling Uber to 1000 Services. By Matt Ranney
    10. DDD Europe – Tackling Complexity in the Heart of Software By Eric Evans, April 11, 2016
    11. AWS re:Invent 2016 – From Monolithic to Microservices: Evolving Architecture Patterns. By Emerson L, Gilt D. Chiles
    12. AWS 2017 – An overview of designing Microservices based Applications on AWS. By Peter Dalbhanjan
    13. GOTO Jun, 2017 – Effective Microservices in a Data Centric World. By Randy Shoup.
    14. GOTO July, 2017 – The Seven (more) Deadly Sins of Microservices. By Daniel Bryant
    15. Sept, 2017 – Airbnb, From Monolith to Microservices: How to scale your Architecture. By Melanie Cebula
    16. GOTO Sept, 2017 – Rethinking Microservices with Stateful Streams. By Ben Stopford.
    17. GOTO 2017 – Microservices without Servers. By Glynn Bird.

  78. @arafkarsh arafkarsh
    References
    Domain Driven Design
    1. Oct 27, 2012 What I have learned about DDD Since the book. By Eric Evans
    2. Mar 19, 2013 Domain Driven Design By Eric Evans
    3. Jun 02, 2015 Applied DDD in Java EE 7 and Open Source World
    4. Aug 23, 2016 Domain Driven Design the Good Parts By Jimmy Bogard
    5. Sep 22, 2016 GOTO 2015 – DDD & REST Domain Driven API’s for the Web. By Oliver Gierke
    6. Jan 24, 2017 Spring Developer – Developing Micro Services with Aggregates. By Chris Richardson
    7. May 17, 2017 DEVOXX – The Art of Discovering Bounded Contexts. By Nick Tune
    8. Dec 21, 2019 What is DDD - Eric Evans - DDD Europe 2019. By Eric Evans
    9. Oct 2, 2020 - Bounded Contexts - Eric Evans - DDD Europe 2020. By Eric Evans
    10. Oct 2, 2020 - DDD By Example - Paul Rayner - DDD Europe 2020. By Paul Rayner

  79. @arafkarsh arafkarsh
    References
    Event Sourcing and CQRS
    1. IBM: Event Driven Architecture – Mar 21, 2021
    2. Martin Fowler: Event Driven Architecture – GOTO 2017
    3. Greg Young: A Decade of DDD, Event Sourcing & CQRS – April 11, 2016
    4. Nov 13, 2014 GOTO 2014 – Event Sourcing. By Greg Young
    5. Mar 22, 2016 Building Micro Services with Event Sourcing and CQRS
    6. Apr 15, 2016 YOW! Nights – Event Sourcing. By Martin Fowler
    7. May 08, 2017 When Micro Services Meet Event Sourcing. By Vinicius Gomes

  80. @arafkarsh arafkarsh
    References
    Kafka
    1. Understanding Kafka
    2. Understanding RabbitMQ
    3. IBM: Apache Kafka – Sept 18, 2020
    4. Confluent: Apache Kafka Fundamentals – April 25, 2020
    5. Confluent: How Kafka Works – Aug 25, 2020
    6. Confluent: How to integrate Kafka into your environment – Aug 25, 2020
    7. Kafka Streams – Sept 4, 2021
    8. Kafka: Processing Streaming Data with KSQL – Jul 16, 2018
    9. Kafka: Processing Streaming Data with KSQL – Nov 28, 2019

  81. @arafkarsh arafkarsh
    References
    Databases: Big Data / Cloud Databases
    1. Google: How to Choose the right database?
    2. AWS: Choosing the right Database
    3. IBM: NoSQL Vs. SQL
    4. A Guide to NoSQL Databases
    5. How does NoSQL Databases Work?
    6. What is Better? SQL or NoSQL?
    7. What is DBaaS?
    8. NoSQL Concepts
    9. Key Value Databases
    10. Document Databases
    11. Jun 29, 2012 – Google I/O 2012 - SQL vs NoSQL: Battle of the Backends
    12. Feb 19, 2013 - Introduction to NoSQL • Martin Fowler • GOTO 2012
    13. Jul 25, 2018 - SQL vs NoSQL or MySQL vs MongoDB
    14. Oct 30, 2020 - Column vs Row Oriented Databases Explained
    15. Dec 9, 2020 - How do NoSQL databases work? Simply Explained!
    1. Graph Databases
    2. Column Databases
    3. Row Vs. Column Oriented Databases
    4. Database Indexing Explained
    5. MongoDB Indexing
    6. AWS: DynamoDB Global Indexing
    7. AWS: DynamoDB Local Indexing
    8. Google Cloud Spanner
    9. AWS: DynamoDB Design Patterns
    10. Cloud Provider Database Comparisons
    11. CockroachDB: When to use a Cloud DB?

  82. @arafkarsh arafkarsh
    References
    Docker / Kubernetes / Istio
    1. IBM: Virtual Machines and Containers
    2. IBM: What is a Hypervisor?
    3. IBM: Docker Vs. Kubernetes
    4. IBM: Containerization Explained
    5. IBM: Kubernetes Explained
    6. IBM: Kubernetes Ingress in 5 Minutes
    7. Microsoft: How Service Mesh works in Kubernetes
    8. IBM: Istio Service Mesh Explained
    9. IBM: Kubernetes and OpenShift
    10. IBM: Kubernetes Operators
    11. 10 Consideration for Kubernetes Deployments
    Istio – Metrics
    1. Istio – Metrics
    2. Monitoring Istio Mesh with Grafana
    3. Visualize your Istio Service Mesh
    4. Security and Monitoring with Istio
    5. Observing Services using Prometheus, Grafana, Kiali
    6. Istio Cookbook: Kiali Recipe
    7. Kubernetes: Open Telemetry
    8. Open Telemetry
    9. How Prometheus works
    10. IBM: Observability vs. Monitoring

  83. @arafkarsh arafkarsh
    References
    1. Feb 6, 2020 – An introduction to TDD
    2. Aug 14, 2019 – Component Software Testing
    3. May 30, 2020 – What is Component Testing?
    4. Apr 23, 2013 – Component Test By Martin Fowler
    5. Jan 12, 2011 – Contract Testing By Martin Fowler
    6. Jan 16, 2018 – Integration Testing By Martin Fowler
    7. Testing Strategies in Microservices Architecture
    8. Practical Test Pyramid By Ham Vocke
    Testing – TDD / BDD

  84. @arafkarsh arafkarsh 84
    1. Simoorg : LinkedIn’s own failure inducer framework. It was designed to be easy to extend and
    most of the important components are pluggable.
    2. Pumba : A chaos testing and network emulation tool for Docker.
    3. Chaos Lemur : Self-hostable application to randomly destroy virtual machines in a BOSH-
    managed environment, as an aid to resilience testing of high-availability systems.
    4. Chaos Lambda : Randomly terminate AWS ASG instances during business hours.
    5. Blockade : Docker-based utility for testing network failures and partitions in distributed
    applications.
    6. Chaos-http-proxy : Introduces failures into HTTP requests via a proxy server.
    7. Monkey-ops : Monkey-Ops is a simple service implemented in Go, which is deployed into an
    OpenShift V3.X and generates some chaos within it. Monkey-Ops seeks some OpenShift
    components like Pods or Deployment Configs and randomly terminates them.
    8. Chaos Dingo : Chaos Dingo currently supports performing operations on Azure VMs and VMSS
    deployed to an Azure Resource Manager-based resource group.
    9. Tugbot : Testing in Production (TiP) framework for Docker.
    Testing tools

  85. @arafkarsh arafkarsh
    References
    CI / CD
    1. What is Continuous Integration?
    2. What is Continuous Delivery?
    3. CI / CD Pipeline
    4. What is CI / CD Pipeline?
    5. CI / CD Explained
    6. CI / CD Pipeline using Java Example Part 1
    7. CI / CD Pipeline using Ansible Part 2
    8. Declarative Pipeline vs Scripted Pipeline
    9. Complete Jenkins Pipeline Tutorial
    10. Common Pipeline Mistakes
    11. CI / CD for a Docker Application

  86. @arafkarsh arafkarsh
    References
    DevOps
    1. IBM: What is DevOps?
    2. IBM: Cloud Native DevOps Explained
    3. IBM: Application Transformation
    4. IBM: Virtualization Explained
    5. What is DevOps? Easy Way
    6. DevOps?! How to become a DevOps Engineer???
    7. Amazon: https://www.youtube.com/watch?v=mBU3AJ3j1rg
    8. NetFlix: https://www.youtube.com/watch?v=UTKIT6STSVM
    9. DevOps and SRE: https://www.youtube.com/watch?v=uTEL8Ff1Zvk
    10. SLI, SLO, SLA : https://www.youtube.com/watch?v=tEylFyxbDLE
    11. DevOps and SRE : Risks and Budgets : https://www.youtube.com/watch?v=y2ILKr8kCJU
    12. SRE @ Google: https://www.youtube.com/watch?v=d2wn_E1jxn4

  87. @arafkarsh arafkarsh
    References
    1. Lewis, James, and Martin Fowler. “Microservices: A Definition of This New Architectural Term”, March 25, 2014.
    2. Miller, Matt. “Innovate or Die: The Rise of Microservices”. The Wall Street Journal, October 5, 2015.
    3. Newman, Sam. Building Microservices. O’Reilly Media, 2015.
    4. Alagarasan, Vijay. “Seven Microservices Anti-patterns”, August 24, 2015.
    5. Cockcroft, Adrian. “State of the Art in Microservices”, December 4, 2014.
    6. Fowler, Martin. “Microservice Prerequisites”, August 28, 2014.
    7. Fowler, Martin. “Microservice Tradeoffs”, July 1, 2015.
    8. Humble, Jez. “Four Principles of Low-Risk Software Release”, February 16, 2012.
    9. Zuul Edge Server, Ketan Gote, May 22, 2017
    10. Ribbon, Hystrix using Spring Feign, Ketan Gote, May 22, 2017
    11. Eureka Server with Spring Cloud, Ketan Gote, May 22, 2017
    12. Apache Kafka, A Distributed Streaming Platform, Ketan Gote, May 20, 2017
    13. Functional Reactive Programming, Araf Karsh Hamid, August 7, 2016
    14. Enterprise Software Architectures, Araf Karsh Hamid, July 30, 2016
    15. Docker and Linux Containers, Araf Karsh Hamid, April 28, 2015

  88. @arafkarsh arafkarsh
    References
    16. MSDN – Microsoft https://msdn.microsoft.com/en-us/library/dn568103.aspx
    17. Martin Fowler : CQRS – http://martinfowler.com/bliki/CQRS.html
    18. Udi Dahan : CQRS – http://www.udidahan.com/2009/12/09/clarified-cqrs/
    19. Greg Young : CQRS - https://www.youtube.com/watch?v=JHGkaShoyNs
    20. Bertrand Meyer – CQS - http://en.wikipedia.org/wiki/Bertrand_Meyer
    21. CQS : http://en.wikipedia.org/wiki/Command–query_separation
    22. CAP Theorem : http://en.wikipedia.org/wiki/CAP_theorem
    23. CAP Theorem : http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
    24. CAP Twelve Years Later: How the "Rules" Have Changed
    25. EBay Scalability Best Practices : http://www.infoq.com/articles/ebay-scalability-best-practices
    26. Pat Helland (Amazon) : Life beyond distributed transactions
    27. Stanford University: Rx https://www.youtube.com/watch?v=y9xudo3C1Cw
    28. Princeton University: SAGAS (1987) Hector Garcia-Molina / Kenneth Salem
    29. Rx Observable : https://dzone.com/articles/using-rx-java-observable
