Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Patterns & Best Practices For Moving From RDBMS...

Patterns & Best Practices For Moving From RDBMS to Azure Storage

Moving from storing your data in an RDBMS like SQL Server to a NoSQL data store like Azure Table requires a programming paradigm shift. This talk will share the knowledge the Halo Services Team gained while making the jump from SQL to Azure Storage while working on Halo 4. I’ll start with an overview of the core differences between the two data stores. In addition I’ll discuss the importance of partitions in Azure Table, and how to perform transactions. Finally I’ll cover how to incorporate other data best practices, and explore how to achieve transaction like behavior across partitions using the Saga Pattern.

Caitie McCaffrey

April 26, 2013
Tweet

More Decks by Caitie McCaffrey

Other Decks in Technology

Transcript

  1. Patterns & Best Practices For Moving From RDBMS to Azure

    Storage DotNetConf 2013 1 Friday, April 26, 13
  2. Caitie McCa!rey S w r D v E r 343

    I r @CaitieM20 CaitieM.com 2 Friday, April 26, 13
  3. Azure Storage Blob: Unstructured Binary Table: Structured Non Relational SQL

    Database: Relational Storage 4 Friday, April 26, 13
  4. “W w Az r b r f r r r

    r , non-relational .” 5 Friday, April 26, 13
  5. Azure Table Structure Tables: Collection of Entities Entities: Set of

    Pro erties Properties: Name Value Pair Table Entity Entity Entity Property Property Property Property Property Property Property 7 Friday, April 26, 13
  6. Entities Partition Key: identifies which artition an entity belongs to

    Row Key: uniquely identifies an entity in a artition Timestamp: last time entity was modified Entity PartitionKey RowKey TimeStamp Property <Name, Value> Property <Name, Value> Property <Name, Value> Property <Name, Value> Property <Name, Value> 8 Friday, April 26, 13
  7. • Only One Partition Key + Row Key = Unique

    Identifier • Azure Table supports • Key per Entity 9 Friday, April 26, 13
  8. Type Numerics Date & Time Binary Character Strings Other Azure

    Table Int32 DateTime byte[] Strings (UTF 16 encoded Value) Guid Azure Table Int64 DateTime byte[] Strings (UTF 16 encoded Value) Bit Azure Table Double DateTime byte[] Strings (UTF 16 encoded Value) SQL BigInt Bit SQL Int Bit SQL SmallInt Date Char Bit SQL TinyInt DateTime Text Bit SQL Money DateTime2 Binary Varchar Bit SQL SmallMoney DateTimeOffset Image Nchar Bit SQL Numeric SmallDateTime VarBinary Ntext Bit SQL Float Time NVarChar Bit SQL Decimal Bit SQL Real Bit Properties are Name Value Pairs 10 Friday, April 26, 13
  9. Table Partitions •Azure Table supports Batch Transactions on entities that

    are in the same Table & Partition E w S P r K 11 Friday, April 26, 13
  10. “This Partition Size is Too Small. Not Enough Transaction Support!”

    Goldilocks & Azure Storage 14 Friday, April 26, 13
  11. “This Partition Size is Just Right! Scale & Just Enough

    Transaction Support” Goldilocks & Azure Storage 15 Friday, April 26, 13
  12. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 17 17 Friday, April 26, 13
  13. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 18 Partition Key Space 1 18 Friday, April 26, 13
  14. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 19 Partition Key Space 1 2 19 Friday, April 26, 13
  15. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 20 Partition Key Space 1 2 3 20 Friday, April 26, 13
  16. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 21 Partition Key Space 1 2 3 4 21 Friday, April 26, 13
  17. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 22 Partition Key Space 1 2 3 4 22 Friday, April 26, 13
  18. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 23 Partition Key Space 23 Friday, April 26, 13
  19. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 24 Partition Key Space 12 24 Friday, April 26, 13
  20. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 25 Partition Key Space 12 28 25 Friday, April 26, 13
  21. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 26 Partition Key Space 12 28 1 26 Friday, April 26, 13
  22. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 27 Partition Key Space 12 28 1 6 27 Friday, April 26, 13
  23. Partition Key Space • Inserts into Partition Key • Space

    Should be “Random” 28 Partition Key Space 12 28 1 6 28 Friday, April 26, 13
  24. 34 Simple Stats: Azure Storage What Queries do I need?

     Total player statistics  All games a player has played  Show stats for single game 34 Friday, April 26, 13
  25. Simple Stats: Azure Storage What Transactions do I want? 

    Player Data updates  Write Game Stats 35 Friday, April 26, 13
  26. 36 How Do I Partition The Data  Game Id

    or Player Id? Simple Stats: Azure Storage Table = Games PK=GameId(1) RK=PlayerId(2) PK=GameId(1) RK=PlayerId(1) PK=GameId(2) RK=PlayerId(3) PK=GameId(2) RK=PlayerId(2) PK=GameId(3) RK=PlayerId(1) PK=GameId(3) RK=PlayerId(3) Table = Players PK=PlayerId(1) RK=GameId(3) PK=PlayerId(1) RK=GameId(1) PK=PlayerId(2) RK=GameId(1) PK=PlayerId(2) RK=GameId(2) PK=PlayerId(3) RK=GameId(2) PK=PlayerId(3) RK=GameId(3) OR 36 Friday, April 26, 13
  27. Code Break The Worst Error Is Data Loss or Data

    Corruption 42 Friday, April 26, 13
  28. Immutability in the Real World “Event” Entities: Source of Truth

    is Immutable Aggregate “View” Entities: use all of CRUD. Player Game Player Game Player F( Player Game Player Game Player Game ) , , , , 45 Friday, April 26, 13
  29. More Immutability Benefits •Flexibility to add new views later •Data

    Audit Log •Idempotent Writes to a Partition 46 Friday, April 26, 13
  30. Google Spanner “Spanner is Google’s scalable, multi-version, globally distributed, and

    synchronously- replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions.” ~James C. Corbett et al 51 Friday, April 26, 13
  31. Google Spanner “The key enabler of these properties is a

    new TrueTime API and its implementation…using multiple modern clock references (GPS and atomic clocks).” ~James C. Corbett et al 52 Friday, April 26, 13
  32. Store Stats Player One Store Stats Player Two Store Stats

    Player Three Store Stats Player Four Game Stats 56 Friday, April 26, 13
  33. Store Stats Player One Store Stats Player Two Store Stats

    Player Three Store Stats Player Four Game Stats 57 Friday, April 26, 13
  34. Store Stats Player One Store Stats Player Two Store Stats

    Player Three Store Stats Player Four Compensate Player One Compensate Player Two Compensate Player Three Compensate Player Four 58 Friday, April 26, 13
  35. Store Stats Player One Store Stats Player Two Store Stats

    Player Three Store Stats Player Four Game Stats 59 Friday, April 26, 13