Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Intro to (Relational) Databases 2015

Intro to (Relational) Databases 2015

Preetam Jinka

February 25, 2015
Tweet

More Decks by Preetam Jinka

Other Decks in Programming

Transcript

  1. Intro to Databases About me • Third year math major

    at UVA • Engineer at VividCortex Details on preet.am/ Contact info: • [email protected] • @PreetamJinka on Twitter 2
  2. Intro to Databases Materials • Python (2.7 and 3.4 worked

    for me) • SQLite3 module for Python ◦ It should come with Python by default • Either a terminal or an IDE 4
  3. Intro to Databases Examples and exercises • GitHub link: https://github.com/PreetamJinka/intro-to-databases

    ◦ Also posted on the FB event page and my Twitter • You don’t need git. Downloading the ZIP is fine. 5
  4. Intro to Databases Tests and fill in the blanks •

    Everything is organized into test cases • There are clearly marked “fill in the blank” exercises • Type in the answer and run the test • Immediate feedback! 6
  5. Intro to Databases What is a database? 8 It’s a

    structured set of data. DBMS: Usually people refer to database management systems (DBMS) as databases. Examples include MySQL, PostgreSQL, MongoDB, etc. (This talk focuses on Relational DBMSes)
  6. Intro to Databases It all boils down to two things

    • Structure ◦ How the database looks at data ◦ How you organize data • Questions (called queries) ◦ What you can ask ◦ What types of answers you get 9
  7. Intro to Databases Relational databases 12 • Collections of tables

    • Querying happens across a single table, or across tables • You can query across databases too
  8. Intro to Databases Tables 13 • Sets of rows •

    Prespecified columns id username first_name last_name 1 mreynolds0 Michelle Reynolds 2 rbishop1 Ralph Bishop users
  9. Intro to Databases Basic Queries Verbs: • SELECT - get

    rows • INSERT - insert rows • UPDATE - update fields in rows • DELETE - delete rows • REPLACE - replace a row 14
  10. Intro to Databases SQL in a nutshell • Selects ◦

    SELECT * FROM my_table WHERE some_col > 0 ◦ SELECT some_col FROM my_table • Inserts ◦ INSERT INTO my_table VALUES (1), (2), (3) ◦ INSERT INTO my_table (colA, colB) VALUES (1, 2) 15
  11. Intro to Databases Exercise 1 17 Select every column from

    every row in the “users” table. $ python -m unittest exercises.E.test_exercise_1 . ---------------------------------------------------------------------- Ran 1 test in 0.170s OK
  12. Intro to Databases Aggregate functions Instead of selecting columns, you

    can select aggregates. • SELECT AVG(some_col) FROM my_table WHERE some_col > 0 • Others: ◦ COUNT ◦ FIRST/LAST ◦ MIN/MAX ◦ SUM 18
  13. Intro to Databases Exercise 2 19 Select the number of

    rows in the “users” table. $ python -m unittest exercises.E.test_exercise_2 . ---------------------------------------------------------------------- Ran 1 test in 0.155s OK
  14. Intro to Databases Creating tables Example: CREATE TABLE users (

    id INTEGER, username VARCHAR ); INTEGER and VARCHAR are the types. 20
  15. Intro to Databases Exercise 3 Create this table. Don’t forget

    semicolons! 21 name red orange yellow green blue colors
  16. Intro to Databases How do I structure tables? 22 •

    Each object for a class should have its own row ◦ That means you should have a table for each class ◦ Examples: users, products, widgets, … • It’s a good idea to have an “ID” column for classes ◦ Easier to reference a row
  17. Intro to Databases What about relationships? 23 Let’s think about

    people and cars. We’ll have two tables. people and cars • If a person can have a single car, add a column in the people table for a car ID • If a car can have a single person, add a column in the cars table for a person ID • Otherwise, use a separate table storing (car, person) rows.
  18. Intro to Databases 26 Set intersection Table A Table B

    Intersection! This is an inner join.
  19. Intro to Databases A simple JOIN between two columns will

    give you all possible combinations of the rows 27 Joins State ID Abbrev 1 VA 2 MD User ID State 1 1 2 1 JOIN => State ID Abbrev User ID State 1 VA 1 1 1 VA 2 1 2 MD 1 1 2 MD 2 1
  20. Intro to Databases A simple JOIN between two columns will

    give you all possible combinations of the rows 28 Joins State ID Abbrev 1 VA 2 MD User ID State 1 1 2 1 JOIN => State ID Abbrev User ID State 1 VA 1 1 1 VA 2 1 2 MD 1 1 2 MD 2 1
  21. Intro to Databases We know that the “State ID” and

    “State” columns refer to the same thing. 29 Joins Users JOIN => State ID Abbrev User ID State 1 VA 1 1 1 VA 2 1 2 MD 1 1 2 MD 2 1 States ON states.id = users.state
  22. Intro to Databases JOIN syntax SELECT tableA.colA, tableB.colB FROM tableA

    JOIN tableB ON tableA.colA = tableB.reference You can join more than two tables. 30
  23. Intro to Databases Exercise 4 Select (username, state) rows using

    the following tables. How many JOINs do you need? 32 id code 1 AL 2 AK id username first_name last_name 1 mreynolds0 Michelle Reynolds 2 rbishop1 Ralph Bishop state user 32 1 48 2 states users locations
  24. Intro to Databases Class enrollment example 34 Tables: class {

    id INTEGER enrolled INTEGER capacity INTEGER } enrollment { student VARCHAR class INTEGER } id enrolled capacity 1 10 20 2 20 20 student class A 2 B
  25. Intro to Databases Class enrollment example 35 Tables: class {

    id INTEGER enrolled INTEGER capacity INTEGER } enrollment { student VARCHAR class INTEGER } id enrolled capacity 1 10 20 2 20 20 student class A 2 B This class is full.
  26. Intro to Databases Class enrollment example 36 • A wants

    to switch from class 2 to class 1. ◦ We need change the “class” column in the enrollment table, and then update the “enrolled” columns in the class table. • B wants to enter class 2. ◦ We need to check if there is space using the class table, and then update the enrollment table. id enrolled capacity 1 10 20 2 20 20 student class A 2 B
  27. Intro to Databases Class enrollment example 37 • A wants

    to switch from class 2 to class 1. ◦ We need change the “class” column in the enrollment table, and then update the “enrolled” columns in the class table. Something bad happens. The rest doesn’t complete. • B wants to enter class 2. ◦ We need to check if there is space using the class table, and then update the enrollment table. B can’t enroll in class 2, because it looks filled! id enrolled capacity 1 10 20 2 20 20 class 1 student class A 2 B
  28. Intro to Databases Questions? It’s okay if you don’t understand

    it completely. We’ll dig into it next. 38
  29. Intro to Databases Transactions A transaction is a set of

    operations that have the following properties: • Atomicity: all of them happen or none of them do • Consistency: constraints hold true after the transaction is committed or rolled back • Isolation: transactions don’t overlap • Durability: committed transactions last 39
  30. Intro to Databases What happens without ACID? Here’s some simple

    state. We’ll make some changes, each without some property of ACID and see what could happen. 40 “A” “B” “A + B” 1 ? 3
  31. Intro to Databases What happens without atomicity? What’s B? (It’s

    2.) 41 “A” “B” “A + B” 1 ? 3 Transaction 1 Transaction 2 (I want to set A = 2, but keep B the same.) SET “A” = 2 SET “A + B” = 4 (Something went wrong!) ROLLBACK (I want to get the value of B.) GET “A + B” => 3 GET “A” => 2 => “B” = 1
  32. Intro to Databases What happens without consistency? 42 “A” “B”

    “A + B” 1 ? 3 Transaction 1 Transaction 2 (I want to add 1 to “B”.) GET “A + B” => 3 SET “A + B” = 4 COMMIT (I want to get the value of B.) GET “A + B” => 3 GET “A” => 1 => “B” = 2 (This is wrong!)
  33. Intro to Databases What happens without isolation? 43 “A” “B”

    “A + B” 1 ? 3 Transaction 1 Transaction 2 (I want to add 1 to “B”.) GET “A + B” => 3 SET “A + B” = 4 ROLLBACK (We decide to undo.) (I want to get the value of B.) GET “A + B” => 4 GET “A” => 1 => “B” = 3 (This is wrong!)
  34. Intro to Databases What happens without durability? 44 “A” “B”

    “A + B” 1 ? 3 Transaction 1 Transaction 2 (I want to add 1 to “B”.) GET “A + B” => 3 SET “A + B” = 4 COMMIT (The databases crashes!) (I want to get the value of B.) GET “A + B” => 3 GET “A” => 1 => “B” = 2 (This is wrong!)
  35. Intro to Databases What happens when you drop ACID? Weird

    stuff. It’s difficult to reason about your data without those guarantees. 45
  36. Intro to Databases Transaction syntax Really simple, but powerful. •

    BEGIN - starts a transaction • COMMIT - commits a transaction • ROLLBACK - undos the transaction These are individual statements. 47
  37. Intro to Databases Exercise 5 Insert a new row, count

    total number of rows in the table, and then undo your changes. Here’s the INSERT example as a reminder: INSERT INTO my_table (colA, colB) VALUES (1, 2) 49 id username first_name last_name 1 mreynolds0 Michelle Reynolds users
  38. Intro to Databases Searching without order 51 What’s the smallest

    number in the following set? (Note that you don’t see the entire set!) 7 1 5 2 18 4 21 -10 85 6 14 19 20 8 25 64 71 1 ... ...
  39. Intro to Databases Searching without order 52 What’s the smallest

    number in the following set? (Note that you don’t see the entire set!) You can’t answer that question! 7 1 5 2 18 4 21 -10 85 6 14 19 20 8 25 64 71 1 ... ...
  40. Intro to Databases Searching with order 53 What’s the smallest

    number in the following set? (Note that you don’t see the entire set!) -91 -14 -4 1 5 7 8 9 10 11 12 13 18 19 24 27 36 39 ... ...
  41. Intro to Databases Searching with order 54 What’s the smallest

    number in the following set? (Note that you don’t see the entire set!) We don’t have to look at every value. -91 -14 -4 1 5 7 8 9 10 11 12 13 18 19 24 27 36 39 ... ...
  42. Intro to Databases What do we mean by “efficient?” Minimize

    the number of rows you look at. We make searching efficient using indexes, just like those found in books. 55
  43. Intro to Databases Creating indexes Suppose we had the following

    schema. How many indexes are there? CREATE TABLE "users" ( "id" INTEGER PRIMARY KEY, "username" VARCHAR, "first_name" VARCHAR, "last_name" VARCHAR ); 56
  44. Intro to Databases Creating indexes Suppose we had the following

    schema. How many indexes are there? CREATE TABLE "users" ( "id" INTEGER PRIMARY KEY, "username" VARCHAR, "first_name" VARCHAR, "last_name" VARCHAR ); There is one index. 57
  45. Intro to Databases Creating indexes Let’s add another index on

    “username” so searching with a username is efficient. CREATE TABLE "users" ( "id" INTEGER PRIMARY KEY, "username" VARCHAR, "first_name" VARCHAR, "last_name" VARCHAR ); 58
  46. Intro to Databases Creating indexes Let’s add another index on

    “username” so searching with a username is efficient. CREATE TABLE "users" ( "id" INTEGER PRIMARY KEY, "username" VARCHAR UNIQUE, "first_name" VARCHAR, "last_name" VARCHAR ); (Unique columns get indexed in SQLite.) 59
  47. Intro to Databases Exercise 6 We’re going to see how

    an index improves a search. Change the following schema to include an index on column "a". CREATE TABLE "numbers" (a INTEGER); 60
  48. Intro to Databases Back to class enrollment We want to

    answer the following questions: 1. Which classes did a specific student sign up for? 2. Which students are signed up for a specific class? 61 student class C 3 A 2 B 1 C 1 B 2
  49. Intro to Databases Back to class enrollment We want to

    answer the following questions: 1. Which classes did a specific student sign up for? 2. Which students are signed up for a specific class? We need two indexes to efficiently answer both of these questions. 62
  50. Intro to Databases Back to class enrollment We want to

    answer the following questions: 1. Which classes did a specific student sign up for? 2. Which students are signed up for a specific class? 63 student class A 2 B 1 B 2 C 1 C 3 class student 1 B 1 C 2 A 2 B 3 C
  51. Intro to Databases Back to class enrollment We want to

    answer the following questions: 1. Which classes did a specific student sign up for? 2. Which students are signed up for a specific class? 64 student class A 2 B 1 B 2 C 1 C 3 class student 1 B 1 C 2 A 2 B 3 C
  52. Intro to Databases The end. You’re almost experts! Questions or

    comments? Contact info: • [email protected] • @PreetamJinka on Twitter 65