Slide 1

Slide 1 text

EUGENE T. COOK PHPCON JAPAN 2015 DATABASE THEORY, MODELS AND ABSTRACTIONS

Slide 2

Slide 2 text

A BIT ABOUT ME Transactional and Start Up Attorney MBA - IT and 
 Operations Management Polymathic Technologist JD - Rule of Law in
 Developing Nations Shameless Plugs

Slide 3

Slide 3 text

TWO PRIMARY TRENDS IN DB DESIGN SCHEMA NOSQL (SCHEMALESS)* MySQL MongoDB MariaDB RethinkDB PostgreSQL Redis Google Big Table * THERE ARE EXCEPTIONS TO MANY OF THE CHARACTERISTICS OF TYPICAL SCHEMALESS STORES

Slide 4

Slide 4 text

QUICK PRIMER DATA MODELING

Slide 5

Slide 5 text

DATA MODELING • Identifying the distinct entities that comprise your data’s architecture • Conceptualizing your data objects • Identifying the relationships between the data objects DATABASE ARCHITECTURE DATABASE DESIGN • Conversion of the data objects into an actual schema • Naming convention of the data and relationships • Deciding what constraints you should have

Slide 6

Slide 6 text

DATA MODELING - STRUCTURE PROJECT SCOPE YOU’VE BEEN ASKED BY HEAD MISTRESS GERTRUDE TO CREATE A SYSTEM FOR HER NEW BOARDING SCHOOL. SHE WANTS TO BE ABLE TO TRACK STUDENT SCHEDULES AND ALSO THEIR FINANCIAL ACCOUNT STATUS.

Slide 7

Slide 7 text

WHAT ARE THE LOGICAL OBJECTS COURSES STUDENTS ENROLLMENT GRADES HOW WOULD THESE LOOK? ACCOUNT TEACHERS

Slide 8

Slide 8 text

MONOLITHIC STUDENT CLASS 1 GRADE 1 CLASS 2 GRADE 2 DUE LAST PAY JOE LAW A SOCIAL B 500 1 MALCOLM SCIENCE B MATH A 250 1 MINDY GEO C SOCIAL D 350 2 JACK COMP A LAW A 600 12

Slide 9

Slide 9 text

THAT WAS EASY! ANY QUESTIONS?

Slide 10

Slide 10 text

GERTRUDE DOES NOT APPROVE WHAT ABOUT NEXT SEMESTER? WHAT IF THEY TAKE MORE THAN TWO CLASSES? HOW DO WE SAVE MORE INFORMATION ABOUT THE CLASSES?

Slide 11

Slide 11 text

I THOUGHT I WAS DONE

Slide 12

Slide 12 text

QUICK PRIMER DATABASE DESIGN

Slide 13

Slide 13 text

MONOLITHIC STUDENT CLASS 1 GRADE 1 CLASS 2 GRADE 2 DUE LAST PAY JOE LAW A SOCIAL B 500 1 MALCOLM SCIENCE B MATH A 250 1 MINDY GEO C SOCIAL D 350 2 JACK COMP A LAW A 600 12 NORMALIZATION ORGANIZATION TABLES AND COLUMNS TO MINIMIZE REDUNDANCY CARDINALITY THE FUNDAMENTAL RELATION OF ONE TABLE TO ANOTHER

Slide 14

Slide 14 text

NORMALIZATION CLASS 1 GRADE 1 CLASS 2 GRADE 2 DUE LAST PAY JOE LAW A SOCIAL B 500 1 MALCOLM SCIENCE B MATH A 250 1 MINDY GEO C SOCIAL D 350 2 JACK COMP A LAW A 600 12 WHAT ABOUT CLASS META INFORMATION? WHAT IF WE CHANGED THE CLASS NAME?

Slide 15

Slide 15 text

NORMALIZATION STUDENT CLASS 1 GRADE 1 CLASS 2 GRADE 2 DUE LAST PAY JOE LAW A SOCIAL B 500 1 MALCOLM SCIENCE B MATH A 250 1 MINDY GEO C SOCIAL D 350 2 JACK COMP A LAW A 600 12 CLASS DESC TEACHER LAW … SUSAN SCIENCE … RANDALL GEO … CRAIG COMP … SUSAN WE’VE ADDED META, BUT NOW WHAT?

Slide 16

Slide 16 text

ONE TO ONE CARDINALITIES (RELATIONSHIPS) BELONGS TO (FK) / HAS ONE ONE TO MANY BELONGS TO (FK) / HAS MANY MANY TO MANY HAS MANY THROUGH / HAS MANY THROUGH

Slide 17

Slide 17 text

TEACHERS CLASSES IDENTIFY THE RELATIONSHIPS CAN A CLASS HAVE MULTIPLE TEACHERS? CAN A TEACHER TEACH MULTIPLE CLASSES? M - 1

Slide 18

Slide 18 text

ONE TO MANY STUDENT CLASS 1 GRADE 1 CLASS 2 GRADE 2 DUE LAST PAY JOE 1 A 5 B 500 1 MALCOLM 2 A 6 A 250 1 MINDY 3 C 5 D 350 2 JACK 4 A 1 A 600 12 CLASS NAME DESC TEACHER 1 LAW … 1 2 SCIENCE … 2 3 GEO … 3 4 COMP … 1 TEACHER NAME HIRE DATE 1 SUSAN … 2 RANDALL … 3 CRAIG …

Slide 19

Slide 19 text

CLASSES STUDENTS IDENTIFY THE RELATIONSHIPS CAN A STUDENT BE IN MULTIPLE CLASSES? CAN A CLASS HAVE MULTIPLE STUDENTS? M - M

Slide 20

Slide 20 text

MANY TO MANY STUDENT NAME 1 JOE 2 MALCOLM 3 MINDY 4 JACK CLASS NAME 1 MATH 2 SCIENCE 3 GYM 4 LANGUAGE ENROLLED_IN STUDENT_ID CLASS_ID 1 1 2 2 1 3 3 4 1 4 2 4 5 3 1 6 2 1

Slide 21

Slide 21 text

VISUALIZING THE RELATIONSHIPS

Slide 22

Slide 22 text

QUICK PRIMER OPTIMIZATION

Slide 23

Slide 23 text

INDEXES DATA OPTIMIZATIONS HIGHLY EFFICIENT COPY OF SELECT COLUMNS WHICH TYPICALLY INCLUDE DIRECT ADDRESSING STORED PROCEDURES ROUTINES RUN WITHIN THE DATABASE THAT CAN SOMETIMES MORE EFFICIENTLY MANIPULATE DATA OR REDUCE DB TRAFFIC VIEWS STORED QUERIES OF DATA WHICH CAN BE INDEXED SEPARATELY AND IN SOME DATABASES, THE RESULTING DATASET IS INDEXED AS A CLUSTER, DRAMATICALLY INCREASING PERFORMANCE

Slide 24

Slide 24 text

INDEXES CREATE AN INDEX FOR ATTRIBUTES THAT YOU QUERY AGAINST THE MOST. INDEXES ARE UPDATED DYNAMICALLY, AND WILL RESULT IN DRAMATICALLY IMPROVED QUERIES.

Slide 25

Slide 25 text

DATA MODELING • Identifying the distinct entities that comprise your data’s architecture • Conceptualizing your data objects • Identifying the relationships between the data objects DATABASE ARCHITECTURE DATABASE DESIGN • Conversion of the data objects into an actual schema • Naming convention of the data and relationships • Deciding what constraints you should have

Slide 26

Slide 26 text

STORED PROCEDURES CREATE STORED PROCEDURES FOR STATICALLY CALLED SQL THAT MANIPULATES OR QUERIES LARGE DATASETS FOR SUBSETS VIEWS USE VIEWS FOR COMMON JOINS, WHICH MANY RDBS’ KEEP IN THE A SEPARATE TABLE CACHE. THIS WILL ALLOW DIRECT QUERIES TO THE JOINED TABLE.

Slide 27

Slide 27 text

QUICK PRIMER NOSQL

Slide 28

Slide 28 text

SCHEMA-LESS? • NoSQL databases don’t have an internal schema • NoSQL databases are dynamically typed, meaning attributes can even have different types • Uses documents instead of records, which can be thought of as objects or arrays • Documents have no inherent schema • Schemas, constraints and types are all “soft.” They rely on programming or abstractions rather than “hard” in the DB. WHAT’S NOSQL?

Slide 29

Slide 29 text

OK - BUT WHY?

Slide 30

Slide 30 text

• Flexible documents • Fast iteration • More agile • No “migrations” required • “Web Read” (JSON/BSON) • “Web Scale” (Performance) • Data fidelity • Soft constraints are programmatic • Most NoSQL not built for joins • Requires optimization of writes for reads

Slide 31

Slide 31 text

WEB SCALE11!!1!1!ELEVENTYONE!

Slide 32

Slide 32 text

BRO, DO YOU EVEN WEBSCALE? • “Web Scale” - delivering scalable, consistent performance, especially through distributed reads. • Achieved through denormalization. • Relational data is kept in the document itself rather than separated out into separate tables.

Slide 33

Slide 33 text

DENORMALIZATION? YOU JUST WASTED THE LAST 20 MINS.

Slide 34

Slide 34 text

RDBS • Sophisticated joins • Fidelity of data is paramount • Complex relationships • Static data model • Multi-row transactions • Less sophisticated DBA • Can boost performance with followers (slaves) WHEN TO USE NOSQL • Complex data (E.g. geo-spatial) • Changing data model • Horizontal inconsistency not necessarily an issue • Some great features specifically for webapps (E.g. oplog in MongoDB) • Distributed read performance is critical • More sophisticated DBA willing to maintain, cleanse data and normalize types.

Slide 35

Slide 35 text

WHAT’S THE POINT? • No right or wrong answer. • Not mutually exclusive - can be used together with great effect. • E.g. Comments system might be on NoSQL but your HR system might be on RDBS.

Slide 36

Slide 36 text

QUICK PRIMER ABSTRACTIONS

Slide 37

Slide 37 text

ABSTRACTIONS • API or library which obfuscates some of the raw querying or manipulation of the database. • Can allow for easy drop in of other database types (E.g. MySQL to Postgres) • Some add validation or schema support to databases that might not have them by default. • Some abstractions also offer data migrations, handling the table structure for you and can even build your original data model from an existing schema. ABSTRACTIONS EXAMPLES • Propel • Doctrine • PDO • Mongoose • Monk

Slide 38

Slide 38 text

LOOK FORWARD DON’T BE AFRAID TO BROADEN YOUR HORIZONS YOUR APP WILL THANK YOU

Slide 39

Slide 39 text

KEEP IN TOUCH @ETC ECOOK [email protected]

Slide 40

Slide 40 text

QUESTIONS?

Slide 41

Slide 41 text

THANK YOU