Slide 1

Slide 1 text

Intro to Databases Intro to Databases Workshop Preetam Jinka February 2015

Slide 2

Slide 2 text

Intro to Databases About me ● Third year math major at UVA ● Engineer at VividCortex Details on preet.am/ Contact info: ● [email protected] ● @PreetamJinka on Twitter 2

Slide 3

Slide 3 text

Intro to Databases Setup 3 Getting and using workshop materials

Slide 4

Slide 4 text

Intro to Databases Materials ● Python (2.7 and 3.4 worked for me) ● SQLite3 module for Python ○ It should come with Python by default ● Either a terminal or an IDE 4

Slide 5

Slide 5 text

Intro to Databases Examples and exercises ● GitHub link: https://github.com/PreetamJinka/intro-to-databases ○ Also posted on the FB event page and my Twitter ● You don’t need git. Downloading the ZIP is fine. 5

Slide 6

Slide 6 text

Intro to Databases Tests and fill in the blanks ● Everything is organized into test cases ● There are clearly marked “fill in the blank” exercises ● Type in the answer and run the test ● Immediate feedback! 6

Slide 7

Slide 7 text

Intro to Databases Fundamentals Databases? Queries? 7

Slide 8

Slide 8 text

Intro to Databases What is a database? 8 It’s a structured set of data. DBMS: Usually people refer to database management systems (DBMS) as databases. Examples include MySQL, PostgreSQL, MongoDB, etc. (This talk focuses on Relational DBMSes)

Slide 9

Slide 9 text

Intro to Databases It all boils down to two things ● Structure ○ How the database looks at data ○ How you organize data ● Questions (called queries) ○ What you can ask ○ What types of answers you get 9

Slide 10

Slide 10 text

Intro to Databases File systems are databases too! 10

Slide 11

Slide 11 text

Intro to Databases Structure Tables everywhere! 11

Slide 12

Slide 12 text

Intro to Databases Relational databases 12 ● Collections of tables ● Querying happens across a single table, or across tables ● You can query across databases too

Slide 13

Slide 13 text

Intro to Databases Tables 13 ● Sets of rows ● Prespecified columns id username first_name last_name 1 mreynolds0 Michelle Reynolds 2 rbishop1 Ralph Bishop users

Slide 14

Slide 14 text

Intro to Databases Basic Queries Verbs: ● SELECT - get rows ● INSERT - insert rows ● UPDATE - update fields in rows ● DELETE - delete rows ● REPLACE - replace a row 14

Slide 15

Slide 15 text

Intro to Databases SQL in a nutshell ● Selects ○ SELECT * FROM my_table WHERE some_col > 0 ○ SELECT some_col FROM my_table ● Inserts ○ INSERT INTO my_table VALUES (1), (2), (3) ○ INSERT INTO my_table (colA, colB) VALUES (1, 2) 15

Slide 16

Slide 16 text

Intro to Databases Questions? 16

Slide 17

Slide 17 text

Intro to Databases Exercise 1 17 Select every column from every row in the “users” table. $ python -m unittest exercises.E.test_exercise_1 . ---------------------------------------------------------------------- Ran 1 test in 0.170s OK

Slide 18

Slide 18 text

Intro to Databases Aggregate functions Instead of selecting columns, you can select aggregates. ● SELECT AVG(some_col) FROM my_table WHERE some_col > 0 ● Others: ○ COUNT ○ FIRST/LAST ○ MIN/MAX ○ SUM 18

Slide 19

Slide 19 text

Intro to Databases Exercise 2 19 Select the number of rows in the “users” table. $ python -m unittest exercises.E.test_exercise_2 . ---------------------------------------------------------------------- Ran 1 test in 0.155s OK

Slide 20

Slide 20 text

Intro to Databases Creating tables Example: CREATE TABLE users ( id INTEGER, username VARCHAR ); INTEGER and VARCHAR are the types. 20

Slide 21

Slide 21 text

Intro to Databases Exercise 3 Create this table. Don’t forget semicolons! 21 name red orange yellow green blue colors

Slide 22

Slide 22 text

Intro to Databases How do I structure tables? 22 ● Each object for a class should have its own row ○ That means you should have a table for each class ○ Examples: users, products, widgets, … ● It’s a good idea to have an “ID” column for classes ○ Easier to reference a row

Slide 23

Slide 23 text

Intro to Databases What about relationships? 23 Let’s think about people and cars. We’ll have two tables. people and cars ● If a person can have a single car, add a column in the people table for a car ID ● If a car can have a single person, add a column in the cars table for a person ID ● Otherwise, use a separate table storing (car, person) rows.

Slide 24

Slide 24 text

Intro to Databases Joins 24 Selecting from multiple tables

Slide 25

Slide 25 text

Intro to Databases 25 Set intersection A B Intersection!

Slide 26

Slide 26 text

Intro to Databases 26 Set intersection Table A Table B Intersection! This is an inner join.

Slide 27

Slide 27 text

Intro to Databases A simple JOIN between two columns will give you all possible combinations of the rows 27 Joins State ID Abbrev 1 VA 2 MD User ID State 1 1 2 1 JOIN => State ID Abbrev User ID State 1 VA 1 1 1 VA 2 1 2 MD 1 1 2 MD 2 1

Slide 28

Slide 28 text

Intro to Databases A simple JOIN between two columns will give you all possible combinations of the rows 28 Joins State ID Abbrev 1 VA 2 MD User ID State 1 1 2 1 JOIN => State ID Abbrev User ID State 1 VA 1 1 1 VA 2 1 2 MD 1 1 2 MD 2 1

Slide 29

Slide 29 text

Intro to Databases We know that the “State ID” and “State” columns refer to the same thing. 29 Joins Users JOIN => State ID Abbrev User ID State 1 VA 1 1 1 VA 2 1 2 MD 1 1 2 MD 2 1 States ON states.id = users.state

Slide 30

Slide 30 text

Intro to Databases JOIN syntax SELECT tableA.colA, tableB.colB FROM tableA JOIN tableB ON tableA.colA = tableB.reference You can join more than two tables. 30

Slide 31

Slide 31 text

Intro to Databases Questions? 31

Slide 32

Slide 32 text

Intro to Databases Exercise 4 Select (username, state) rows using the following tables. How many JOINs do you need? 32 id code 1 AL 2 AK id username first_name last_name 1 mreynolds0 Michelle Reynolds 2 rbishop1 Ralph Bishop state user 32 1 48 2 states users locations

Slide 33

Slide 33 text

Intro to Databases Transactions 33 and ACID

Slide 34

Slide 34 text

Intro to Databases Class enrollment example 34 Tables: class { id INTEGER enrolled INTEGER capacity INTEGER } enrollment { student VARCHAR class INTEGER } id enrolled capacity 1 10 20 2 20 20 student class A 2 B

Slide 35

Slide 35 text

Intro to Databases Class enrollment example 35 Tables: class { id INTEGER enrolled INTEGER capacity INTEGER } enrollment { student VARCHAR class INTEGER } id enrolled capacity 1 10 20 2 20 20 student class A 2 B This class is full.

Slide 36

Slide 36 text

Intro to Databases Class enrollment example 36 ● A wants to switch from class 2 to class 1. ○ We need change the “class” column in the enrollment table, and then update the “enrolled” columns in the class table. ● B wants to enter class 2. ○ We need to check if there is space using the class table, and then update the enrollment table. id enrolled capacity 1 10 20 2 20 20 student class A 2 B

Slide 37

Slide 37 text

Intro to Databases Class enrollment example 37 ● A wants to switch from class 2 to class 1. ○ We need change the “class” column in the enrollment table, and then update the “enrolled” columns in the class table. Something bad happens. The rest doesn’t complete. ● B wants to enter class 2. ○ We need to check if there is space using the class table, and then update the enrollment table. B can’t enroll in class 2, because it looks filled! id enrolled capacity 1 10 20 2 20 20 class 1 student class A 2 B

Slide 38

Slide 38 text

Intro to Databases Questions? It’s okay if you don’t understand it completely. We’ll dig into it next. 38

Slide 39

Slide 39 text

Intro to Databases Transactions A transaction is a set of operations that have the following properties: ● Atomicity: all of them happen or none of them do ● Consistency: constraints hold true after the transaction is committed or rolled back ● Isolation: transactions don’t overlap ● Durability: committed transactions last 39

Slide 40

Slide 40 text

Intro to Databases What happens without ACID? Here’s some simple state. We’ll make some changes, each without some property of ACID and see what could happen. 40 “A” “B” “A + B” 1 ? 3

Slide 41

Slide 41 text

Intro to Databases What happens without atomicity? What’s B? (It’s 2.) 41 “A” “B” “A + B” 1 ? 3 Transaction 1 Transaction 2 (I want to set A = 2, but keep B the same.) SET “A” = 2 SET “A + B” = 4 (Something went wrong!) ROLLBACK (I want to get the value of B.) GET “A + B” => 3 GET “A” => 2 => “B” = 1

Slide 42

Slide 42 text

Intro to Databases What happens without consistency? 42 “A” “B” “A + B” 1 ? 3 Transaction 1 Transaction 2 (I want to add 1 to “B”.) GET “A + B” => 3 SET “A + B” = 4 COMMIT (I want to get the value of B.) GET “A + B” => 3 GET “A” => 1 => “B” = 2 (This is wrong!)

Slide 43

Slide 43 text

Intro to Databases What happens without isolation? 43 “A” “B” “A + B” 1 ? 3 Transaction 1 Transaction 2 (I want to add 1 to “B”.) GET “A + B” => 3 SET “A + B” = 4 ROLLBACK (We decide to undo.) (I want to get the value of B.) GET “A + B” => 4 GET “A” => 1 => “B” = 3 (This is wrong!)

Slide 44

Slide 44 text

Intro to Databases What happens without durability? 44 “A” “B” “A + B” 1 ? 3 Transaction 1 Transaction 2 (I want to add 1 to “B”.) GET “A + B” => 3 SET “A + B” = 4 COMMIT (The databases crashes!) (I want to get the value of B.) GET “A + B” => 3 GET “A” => 1 => “B” = 2 (This is wrong!)

Slide 45

Slide 45 text

Intro to Databases What happens when you drop ACID? Weird stuff. It’s difficult to reason about your data without those guarantees. 45

Slide 46

Slide 46 text

Intro to Databases Questions? 46

Slide 47

Slide 47 text

Intro to Databases Transaction syntax Really simple, but powerful. ● BEGIN - starts a transaction ● COMMIT - commits a transaction ● ROLLBACK - undos the transaction These are individual statements. 47

Slide 48

Slide 48 text

Intro to Databases Transaction syntax BEGIN SELECT …; INSERT …; COMMIT 48

Slide 49

Slide 49 text

Intro to Databases Exercise 5 Insert a new row, count total number of rows in the table, and then undo your changes. Here’s the INSERT example as a reminder: INSERT INTO my_table (colA, colB) VALUES (1, 2) 49 id username first_name last_name 1 mreynolds0 Michelle Reynolds users

Slide 50

Slide 50 text

Intro to Databases Indexes Efficient searching 50

Slide 51

Slide 51 text

Intro to Databases Searching without order 51 What’s the smallest number in the following set? (Note that you don’t see the entire set!) 7 1 5 2 18 4 21 -10 85 6 14 19 20 8 25 64 71 1 ... ...

Slide 52

Slide 52 text

Intro to Databases Searching without order 52 What’s the smallest number in the following set? (Note that you don’t see the entire set!) You can’t answer that question! 7 1 5 2 18 4 21 -10 85 6 14 19 20 8 25 64 71 1 ... ...

Slide 53

Slide 53 text

Intro to Databases Searching with order 53 What’s the smallest number in the following set? (Note that you don’t see the entire set!) -91 -14 -4 1 5 7 8 9 10 11 12 13 18 19 24 27 36 39 ... ...

Slide 54

Slide 54 text

Intro to Databases Searching with order 54 What’s the smallest number in the following set? (Note that you don’t see the entire set!) We don’t have to look at every value. -91 -14 -4 1 5 7 8 9 10 11 12 13 18 19 24 27 36 39 ... ...

Slide 55

Slide 55 text

Intro to Databases What do we mean by “efficient?” Minimize the number of rows you look at. We make searching efficient using indexes, just like those found in books. 55

Slide 56

Slide 56 text

Intro to Databases Creating indexes Suppose we had the following schema. How many indexes are there? CREATE TABLE "users" ( "id" INTEGER PRIMARY KEY, "username" VARCHAR, "first_name" VARCHAR, "last_name" VARCHAR ); 56

Slide 57

Slide 57 text

Intro to Databases Creating indexes Suppose we had the following schema. How many indexes are there? CREATE TABLE "users" ( "id" INTEGER PRIMARY KEY, "username" VARCHAR, "first_name" VARCHAR, "last_name" VARCHAR ); There is one index. 57

Slide 58

Slide 58 text

Intro to Databases Creating indexes Let’s add another index on “username” so searching with a username is efficient. CREATE TABLE "users" ( "id" INTEGER PRIMARY KEY, "username" VARCHAR, "first_name" VARCHAR, "last_name" VARCHAR ); 58

Slide 59

Slide 59 text

Intro to Databases Creating indexes Let’s add another index on “username” so searching with a username is efficient. CREATE TABLE "users" ( "id" INTEGER PRIMARY KEY, "username" VARCHAR UNIQUE, "first_name" VARCHAR, "last_name" VARCHAR ); (Unique columns get indexed in SQLite.) 59

Slide 60

Slide 60 text

Intro to Databases Exercise 6 We’re going to see how an index improves a search. Change the following schema to include an index on column "a". CREATE TABLE "numbers" (a INTEGER); 60

Slide 61

Slide 61 text

Intro to Databases Back to class enrollment We want to answer the following questions: 1. Which classes did a specific student sign up for? 2. Which students are signed up for a specific class? 61 student class C 3 A 2 B 1 C 1 B 2

Slide 62

Slide 62 text

Intro to Databases Back to class enrollment We want to answer the following questions: 1. Which classes did a specific student sign up for? 2. Which students are signed up for a specific class? We need two indexes to efficiently answer both of these questions. 62

Slide 63

Slide 63 text

Intro to Databases Back to class enrollment We want to answer the following questions: 1. Which classes did a specific student sign up for? 2. Which students are signed up for a specific class? 63 student class A 2 B 1 B 2 C 1 C 3 class student 1 B 1 C 2 A 2 B 3 C

Slide 64

Slide 64 text

Intro to Databases Back to class enrollment We want to answer the following questions: 1. Which classes did a specific student sign up for? 2. Which students are signed up for a specific class? 64 student class A 2 B 1 B 2 C 1 C 3 class student 1 B 1 C 2 A 2 B 3 C

Slide 65

Slide 65 text

Intro to Databases The end. You’re almost experts! Questions or comments? Contact info: ● [email protected] ● @PreetamJinka on Twitter 65

Slide 66

Slide 66 text

Intro to Databases Photo credits https://www.flickr.com/photos/minnellium/3588877363 https://www.flickr.com/photos/tormodspictures/10357367406 https://www.flickr.com/photos/universalpops/6399704823 https://www.flickr.com/photos/wv/3477462384 https://www.flickr.com/photos/redspotted/272104 https://www.flickr.com/photos/heipei/3570609850 66