Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Database Design for Beginners

Database Design for Beginners

Rails’ migrations were a revelation when Rails came out, as it provided a way to manage your database schema, but also included many wonderful defaults that made all Rails developers pretty good at database schema design! But these defaults are just the beginning. Properly modeling your database can bring many benefits, such as fewer bugs, more reliable insights about your users or business, and developer understanding. We’ll talk about what makes a good database design by understanding the concept of normalization and why a properly normalized database yields benefits.

David Copeland

May 01, 2019
Tweet

More Decks by David Copeland

Other Decks in Programming

Transcript

  1. PROFESSIONAL WRESTLING A DOMAIN EVERYONE CAN LOVE • A wrestler

    may have a finishing move e.g. The Rock’s finishing move is the Rock Bottom • A wrestler might wrestle on a show e.g. The Rock wrestles on Smackdown • A wrestler can be cleared to compete on one or more matches of a show e.g. The Rock wrestlers on the main event of Smackdown, but Zack Ryder wrestlers on the opener. • A show may have a title belt that is defended on that show e.g. On Smackdown, the WWE Championship is defended.
  2. PROFESSIONAL WRESTLING A DOMAIN EVERYONE CAN LOVE • A wrestler

    may have a finishing move e.g. The Rock’s finishing move is the Rock Bottom • A wrestler might wrestle on a show e.g. The Rock wrestles on Smackdown • A wrestler can be cleared to compete on one or more matches of a show e.g. The Rock wrestlers on the main event of Smackdown, but Zack Ryder wrestlers on the opener. • A show may have a title belt that is defended on that show e.g. On Smackdown, the WWE Championship is defended.
  3. WRESTLERS NAME FINISHING MOVE SHOWS NAME TITLE SHOW MATCHES SHOW

    MATCH WRESTLER CLEARANCES WRESTLER SHOW MATCH MATCHES NAME
  4. ANOMALIES WHEN THE DATA MODEL PREVENTS THE STORAGE OF CERTAIN

    FACTS, ALLOWS AMBIGUITY TO EXIST, OR REQUIRES DELETING ONE FACT TO DELETE ANOTHER
  5. NORMAL FORMS ARE MATHEMATICAL TRUTHS ABOUT A DATA MODEL THAT

    PROVE WHAT SORTS OF ANOMALIES HAVE BEEN REMOVED
  6. PROFESSIONAL WRESTLING A DOMAIN EVERYONE CAN LOVE • A wrestler

    may have a finishing move. • A wrestler might wrestle on a show. • A wrestler can be cleared to compete on one or more matches of a show. • A show may have a title belt that is defended on that show.
  7. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW, Smackdown Universal Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event, Opener Kevin Owens Stunner RAW Universal Title Main Event
  8. WHAT IS A RELATION? • A relation requires: • No

    duplicate rows • Each field has a single value • Fields in the same column are of the same type
  9. A NOTE ABOUT TYPES • Think about what are the

    allowed values - likely not strings or numbers • NULL is not a value, therefore NULL is not an allowed value, therefore no field can allow NULL • Remember, this is a design, not an implementation.
  10. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW, Smackdown Universal Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event, Opener Kevin Owens Stunner RAW Universal Title Main Event
  11. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW, Smackdown Universal Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event, Opener Kevin Owens Stunner RAW Universal Title Main Event
  12. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW, Smackdown Universal Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event, Opener Kevin Owens Stunner RAW Universal Title Main Event
  13. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW, Smackdown Universal Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event, Opener Kevin Owens Stunner RAW Universal Title Main Event
  14. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event The Rock Rock Bottom Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener Kevin Owens Stunner RAW Universal Championship Main Event
  15. FIRST NORMAL FORM NO DUPLICATE ROWS, EACH VALUE IS A

    SINGLE VALUE OF THE RIGHT TYPE (E.G. NO MULTIPLE VALUES)
  16. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event The Rock Rock Bottom Smackdown WWE Championship Main Event Stone Cold Steve Austin Million Dollar Dream Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener Kevin Owens Stunner RAW Universal Championship Main Event Kenny Omega One-Winged Angel ??? ??? ???
  17. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event The Rock Rock Bottom Smackdown WWE Championship Main Event Stone Cold Steve Austin Million Dollar Dream Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener Kevin Owens Stunner RAW Universal Championship Main Event Kenny Omega One-Winged Angel ??? ??? ???
  18. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event The Rock Rock Bottom Smackdown WWE Championship Main Event Stone Cold Steve Austin Million Dollar Dream Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener Kevin Owens Stunner RAW Universal Championship Main Event Kenny Omega One-Winged Angel ??? ??? ???
  19. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event The Rock Rock Bottom Smackdown WWE Championship Main Event Stone Cold Steve Austin Million Dollar Dream Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener Kevin Owens Stunner RAW Universal Championship Main Event Kenny Omega One-Winged Angel ??? ??? ???
  20. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event The Rock Rock Bottom Smackdown WWE Championship Main Event Stone Cold Steve Austin Million Dollar Dream Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener Kevin Owens Stunner RAW Universal Championship Main Event Kenny Omega One-Winged Angel ??? ??? ???
  21. BECAUSE WE ARE IN FIRST NORMAL FORM, I.E. A RELATION,

    WE CAN SHOW WHY THIS IS FLAWED AND FIX IT WITHOUT A LOT OF THINKING!
  22. FUNCTIONAL DEPENDENCIES • A FUNCTIONAL DEPENDENCY is when the value

    of a column unambiguously implies the value of another column, based on business rules.
  23. FUNCTIONAL DEPENDENCIES • A FUNCTIONAL DEPENDENCY is when the value

    of a column unambiguously implies the value of another column, based on business rules. • A wrestler may only have a single finishing move, therefore: • WRESTLER implies FINISHING MOVE • A show may have a single title defended on it, therefore: • SHOW implies TITLE
  24. NOT FUNCTIONAL DEPENDENCIES • Note that because a SHOW can

    have more than one MATCH, SHOW does not imply MATCH, so no functional dependency there. • FINISHING MOVE does not imply WRESTLER, because it’s allowed for many wrestlers to have the same finishing move.
  25. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener The Rock Rock Bottom Smackdown WWE Championship Main Event Kevin Owens Stunner RAW Universal Championship Main Event ALTHOUGH OUR FUNCTIONAL DEPENDENCIES ARE SATISFIED BY THE DATA, THE DATA MODEL DOESN’T REQUIRE IT
  26. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener The Rock Rock Bottom Smackdown WWE Championship Main Event Kevin Owens Stunner RAW Universal Championship Main Event ALTHOUGH OUR FUNCTIONAL DEPENDENCIES ARE SATISFIED BY THE DATA, THE DATA MODEL DOESN’T REQUIRE IT PEOPLE’S ELBOW
  27. OUR DESIGN ALLOWS A QUERY BASED ON WRESTLER TO RETURN

    AN AMBIGUOUS RESULT FOR FINISHING MOVE THUS, IT IS OPEN TO ANOMALIES, AND THEREFORE FLAWED
  28. KEYS • A key is a subset of columns that

    is uniquely identifies a row • i.e. no more than one row with the same set of values for the key
  29. KEYS • A key is a subset of columns that

    is uniquely identifies a row • i.e. no more than one row with the same set of values for the key • A relation can have many keys, but always has at least one
  30. KEYS • A key is a subset of columns that

    is uniquely identifies a row • i.e. no more than one row with the same set of values for the key • A relation can have many keys, but always has at least one • all columns form a key since there can be no duplicate rows
  31. KEYS • A key is a subset of columns that

    is uniquely identifies a row • i.e. no more than one row with the same set of values for the key • A relation can have many keys, but always has at least one • all columns form a key since there can be no duplicate rows • business rules reveal other keys
  32. KEYS BASED ON BUSINESS RULES • According to our business

    rules, a wrestler cannot wrestle the same match on the same show more than once, therefore: • { WRESTLER, SHOW, MATCH } is a key
  33. KEYS BASED ON BUSINESS RULES • According to our business

    rules, a wrestler cannot wrestle the same match on the same show more than once, therefore: • { WRESTLER, SHOW, MATCH } is a key • Meaning, if we have a WRESTLER, SHOW, and MATCH, we can use that to identify at most one row in our table, i.e. this produces an unambiguous answer.
  34. KEYS BASED ON BUSINESS RULES • According to our business

    rules, a wrestler cannot wrestle the same match on the same show more than once, therefore: • { WRESTLER, SHOW, MATCH } is a key • Meaning, if we have a WRESTLER, SHOW, and MATCH, we can use that to identify at most one row in our table, i.e. this produces an unambiguous answer. • i.e. there can be no more than one row with the same values for WRESTLER, SHOW, and MATCH
  35. WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom

    RAW Universal Championship Main Event The Rock Rock Bottom Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener Kevin Owens Stunner RAW Universal Championship Main Event OUR DATA SATISFY THE KEY. THE DATA MODEL SIMPLY NEEDS TO STATE WHAT THE KEYS ARE FOR IT TO SATISFY THE KEY (WE’LL SEE HOW TO ENFORCE THIS IN A REAL DATABASE LATER ON)
  36. IDENTIFY FUNCTIONAL DEPENDENCIES AND KEYS STEP 3 STEP 1
 WHAT

    FACTS DO WE WANT TO RECORD? STEP 2
 CREATE RELATIONS
  37. REMINDER: KEYS AND FUNCTIONAL DEPENDENCIES ARE THE BUSINESS RULES. THEY

    CANNOT BE MATHEMATICALLY DERIVED BASED ON THE DATA
  38. IMPLICATIONS OF KEYS AND FUNCTIONAL DEPENDENCIES To get unambiguous information:

    we want a query that returns 0 or 1 rows → thus we must use a key to get that.
  39. IMPLICATIONS OF KEYS AND FUNCTIONAL DEPENDENCIES To get unambiguous information:

    we want a query that returns 0 or 1 rows → thus we must use a key to get that. Our Functional Dependency { WRESTLER } → { FINISHING MOVE } says we should get an unambiguous FINISHING MOVE from just WRESTLER → however WRESTLER is not a key → meaning a query based on WRESTLER is might return ambiguous results
  40. IMPLICATIONS OF KEYS AND FUNCTIONAL DEPENDENCIES To get unambiguous information:

    we want a query that returns 0 or 1 rows → thus we must use a key to get that. Our Functional Dependency { WRESTLER } → { FINISHING MOVE } says we should get an unambiguous FINISHING MOVE from just WRESTLER → however WRESTLER is not a key → meaning a query based on WRESTLER is might return ambiguous results
  41. IMPLICATIONS OF KEYS AND FUNCTIONAL DEPENDENCIES To get unambiguous information:

    we want a query that returns 0 or 1 rows → thus we must use a key to get that. Our Functional Dependency { WRESTLER } → { FINISHING MOVE } says we should get an unambiguous FINISHING MOVE from just WRESTLER → however WRESTLER is not a key → meaning a query based on WRESTLER is might return ambiguous results ∴ Our database allows storing data that violates our business rules
  42. IF OUR DESIGN WAS SUCH THAT WRESTLER WAS A KEY

    OF SOME TABLE, FUNCTIONAL DEPENDENCIES WOULD ALWAYS TO BE SATISFIED
  43. EXTRACT NEW TABLES STEP 4 STEP 1
 WHAT FACTS DO

    WE WANT TO RECORD? STEP 2
 WHAT ARE THE TYPES? STEP 3
 IDENTIFY FUNCTIONAL DEPENDENCIES AND KEYS
  44. MAKE A NEW TABLE WHERE THE KEY IS THE VALUE

    ON THE LEFT SIDE OF THE FUNCTIONAL DEPENDENCY WRESTLER
  45. HEATH’S THEOREM ALSO SAYS WE HAVE NOT LOST DATA WRESTLER

    FINISHING MOVE The Rock Rock Bottom Stone Cold Steve Austin Stunner Kevin Owens Stunner WRESTLER SHOW TITLE MATCH The Rock RAW Universal Championship Main Event The Rock Smackdown WWE Championship Main Event Stone Cold Steve Austin Smackdown WWE Championship Main Event Stone Cold Steve Austin Smackdown WWE Championship Opener Kevin Owens RAW Universal Championship Main Event
  46. HEATH’S THEOREM ALSO SAYS WE HAVE NOT LOST DATA WRESTLER

    FINISHING MOVE The Rock Rock Bottom Stone Cold Steve Austin Stunner Kevin Owens Stunner WRESTLER SHOW TITLE MATCH The Rock RAW Universal Championship Main Event The Rock Smackdown WWE Championship Main Event Stone Cold Steve Austin Smackdown WWE Championship Main Event Stone Cold Steve Austin Smackdown WWE Championship Opener Kevin Owens RAW Universal Championship Main Event
  47. HEATH’S THEOREM ALSO SAYS WE HAVE NOT LOST DATA WRESTLER

    FINISHING MOVE The Rock Rock Bottom Stone Cold Steve Austin Stunner Kevin Owens Stunner WRESTLER SHOW TITLE MATCH The Rock RAW Universal Championship Main Event The Rock Smackdown WWE Championship Main Event Stone Cold Steve Austin Smackdown WWE Championship Main Event Stone Cold Steve Austin Smackdown WWE Championship Opener Kevin Owens RAW Universal Championship Main Event WRESTLER FINISHING MOVE SHOW TITLE MATCH The Rock Rock Bottom RAW Universal Championship Main Event The Rock Rock Bottom Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Main Event Stone Cold Steve Austin Stunner Smackdown WWE Championship Opener Kevin Owens Stunner RAW Universal Championship Main Event
  48. WRESTLER MATCHES WRESTLER SHOW MATCH The Rock RAW Main Event

    The Rock Smackdown Main Event Stone Cold Steve Austin Smackdown Main Event Stone Cold Steve Austin Smackdown Opener Kevin Owens RAW Main Event FINISHING MOVES WRESTLER FINISHING MOVE The Rock Rock Bottom Stone Cold Steve Austin Stunner Kevin Owens Stunner TITLES SHOW TITLE RAW WWE Championship Smackdown Universal Championship
  49. WRESTLER MATCHES WRESTLER SHOW MATCH The Rock RAW Main Event

    The Rock Smackdown Main Event Stone Cold Steve Austin Smackdown Main Event Stone Cold Steve Austin Smackdown Opener Kevin Owens RAW Main Event FINISHING MOVES WRESTLER FINISHING MOVE The Rock Rock Bottom Stone Cold Steve Austin Stunner Kevin Owens Stunner TITLES SHOW TITLE RAW WWE Championship Smackdown Universal Championship
  50. THESE TABLES ARE IN BOYCE-CODD NORMAL FORM, MEANING THEY HAVE

    NO ANOMALIES BASED ON OUR FUNCTIONAL DEPENDENCIES
  51. MANY ANOMALIES ADDRESSED WRESTLER MATCHES WRESTLER SHOW MATCH The Rock

    RAW Main Event The Rock Smackdown Main Event Stone Cold Steve Austin Smackdown Main Event Stone Cold Steve Austin Smackdown Opener Kevin Owens RAW Main Event Zack Ryder Superstars Main Event Ricochet RAW Main Event FINISHING MOVES WRESTLER FINISHING MOVE The Rock People’s Elbow Stone Cold Steve Austin Stunner Kevin Owens Stunner Kenny Omega One-Winged Angel Zack Ryder Rough Ryder TITLES SHOW TITLE RAW Heavyweight Championship Smackdown Universal Championship
  52. HOW DID WE DO THIS? 1. What facts do we

    want to record? 2. Create Relations
  53. HOW DID WE DO THIS? 1. What facts do we

    want to record? 2. Create Relations 3. Identify Functional Dependencies and Keys
  54. HOW DID WE DO THIS? 1. What facts do we

    want to record? 2. Create Relations 3. Identify Functional Dependencies and Keys 4. Extract New Tables/Apply Heath’s Theorem: 1. Anything functional dependent on only part of a key, make a new table 2. That table’s key is the the left side of the dependency 3. The table’s columns are the right size 4. Remove the right side columns from the original table
  55. PRIMARY KEYS • The keys we saw like { WRESTLER,

    SHOW, MATCH } are primary keys - they uniquely identify the row.
  56. PRIMARY KEYS • The keys we saw like { WRESTLER,

    SHOW, MATCH } are primary keys - they uniquely identify the row. • They are called natural keys or sometimes business keys. • They are handy because they represent facts and are easily understood.
  57. PRIMARY KEYS • The keys we saw like { WRESTLER,

    SHOW, MATCH } are primary keys - they uniquely identify the row. • They are called natural keys or sometimes business keys. • They are handy because they represent facts and are easily understood. • But…what if we rename a show or a wrestler? We have to change the key values everywhere. It’s messy.
  58. PRIMARY KEYS • The keys we saw like { WRESTLER,

    SHOW, MATCH } are primary keys - they uniquely identify the row. • They are called natural keys or sometimes business keys. • They are handy because they represent facts and are easily understood. • But…what if we rename a show or a wrestler? We have to change the key values everywhere. It’s messy. • The ID field Rails makes is called a synthetic key or surrogate key.
  59. WRESTLER SHOWS WRESTLER ID SHOW ID WRESTLER SHOW MATCH FINISHING

    MOVES WRESTLER ID FINISHING MOVE TITLES SHOW ID TITLE
  60. WRESTLER SHOWS WRESTLER ID SHOW ID WRESTLER SHOW MATCH FINISHING

    MOVES WRESTLER ID FINISHING MOVE TITLES SHOW ID TITLE KEY
  61. WRESTLER SHOWS WRESTLER ID SHOW ID WRESTLER SHOW MATCH FINISHING

    MOVES WRESTLER ID FINISHING MOVE TITLES SHOW ID TITLE ALSO A KEY
  62. WRESTLER SHOWS WRESTLER ID SHOW ID WRESTLER SHOW MATCH WRESTLER

    WRESTLER ID SHOW SHOW ID MATCH The Rock 1 RAW 1 Main Event The Rock 2 Smackdown 2 Main Event Stone Cold Steve Austin 3 Smackdown 3 Main Event Stone Cold Steve Austin 4 Smackdown 4 Opener Kevin Owens 5 RAW 5 Main Event
  63. WRESTLER SHOWS WRESTLER ID SHOW ID WRESTLER SHOW MATCH FINISHING

    MOVES WRESTLER ID FINISHING MOVE TITLES SHOW ID TITLE
  64. WRESTLER SHOWS WRESTLER ID SHOW ID WRESTLER SHOW MATCH SHOWS

    SHOW ID SHOW WRESTLERS WRESTLER ID WRESTLER
  65. WRESTLER SHOWS WRESTLER ID SHOW ID MATCH SHOWS SHOW ID

    SHOW WRESTLERS WRESTLER ID WRESTLER BOYCE-CODD NORMAL FORM I.E. FREE OF ANOMALIES RELATED TO FUNCTIONAL DEPENDENCIES! KEY KEY KEY
  66. WRESTLER SHOWS WRESTLER_ID SHOW_ID MATCH SHOWS SHOW ID SHOW WRESTLERS

    WRESTLER ID WRESTLER FINISHING MOVES WRESTLER ID FINISHING MOVE TITLES SHOW ID TITLE
  67. WRESTLERS NAME FINISHING MOVE SHOWS NAME TITLE SHOW MATCHES SHOW

    MATCH WRESTLER CLEARANCES WRESTLER SHOW MATCH MATCHES NAME
  68. LOGIC TO PHYSICAL • The tables, i.e. migrations are the

    implementation of the design • Primary concerns: • Enforcing keys (remember, there are business keys still in effect even if using synthetic keys) • Enforcing associations • Enforcing types
  69. create_table :wrestler_shows do |t| t.references :wrestler, null: false, foreign_key: true

    t.references :show, null: false, foreign_key: true t.string :match, null: false end add_index :wrestler_shows, [ :wrestler_id, :show_id, :match ], unique: true
  70. create_table :shows do |t| t.string :name, null: false end add_index

    :shows, :name, unique: true create_table :show_titles do |t| t.string :name, null: false t.references :show, null: false, foreign_key: true end
  71. ON TYPES • These are the same wrestler: The Rock,

    THE ROCK, The Rock • This is in no way the name of a wrestler: “ “
  72. ON TYPES • These are the same wrestler: The Rock,

    THE ROCK, The Rock • This is in no way the name of a wrestler: “ “ • Trade-off time: • How likely is bad data to be input? • How bad is it if it happens?
  73. create_table :finishing_moves do |t| t.string :name, null: false t.references :wrestler,

    null: false, foreign_key: true end # In our code wrestler.finishing_move.name # undefined method `name’ # for NilClass
  74. create_table :finishing_moves do |t| t.string :name, null: false t.references :wrestler,

    null: false, foreign_key: true end # In our code wrestler.finishing_move.name # undefined method `name’ # for NilClass wrestler.finishing_move&.name
  75. create_table :finishing_moves do |t| t.string :name, null: false t.references :wrestler,

    null: false, foreign_key: true end # In our code wrestler.finishing_move.name # undefined method `name’ # for NilClass wrestler.finishing_move&.name def finishing_move_name self.finishing_move&.name end
  76. create_table :wrestlers do |t| t.string :name, null: false t.string :finishing_move,

    null: true, comment: "null if the wrestler has no finishing move" end
  77. GENERAL GUIDANCE • Create unique indexes for all business keys

    • Use foreign key constraints • Default to not-null • Comment nullable fields • Use database constraints if the downside of bad data is really bad, otherwise validations
  78. CLOSING THOUGHTS • Boyce-Codd Normal form does not eliminate all

    anomalies, but it eliminates many! • When editing your model, edit your design, then flow through to the physical. • Start with business rules, i.e. functional dependencies and keys.