Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Lesson 4 - Databases

Lesson 4 - Databases

Dana Spiegel

October 21, 2012
Tweet

More Decks by Dana Spiegel

Other Decks in Technology

Transcript

  1. Quick Review • Good coding techniques • Development tools •

    Unix/Linux (*nix) • Shell • SSH • Text editors (emacs and vi) • Cloud file storage systems (S3) • DNS • Source Control (git) • Good places to find help online 2
  2. What are Databases? • Servers that store structured information •

    Highly optimized for storing data • Can store billions of data points (usually in rows) • Many work with some form of relational structures (RDBMS) • Most provide data aggregation functions • Most have a way to keep multiple servers in synced • Master-master replication • Master-slave replication • Some are optimized for data distribution and scaling (sharding) • Some are optimized for fast access • Some offer advanced features, like map-reduce • All have a database query language • SQL invented at IBM by Chamberlin and Boyce in 1970s 1 • Many offer transaction support and data locking 3 1. http://en.wikipedia.org/wiki/SQL
  3. Popular Databases • SQL • MySQL, PostgreSQL, MSSQL, Oracle, SQLite

    • Key/Value Stores • Redis, Memcache, Membase • http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key- value-stores • NoSQL • MongoDB, Riak, CouchDB, Cassandra • http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis • Graph Databases • Neo4j • http://www.cse.unsw.edu.au/~iwgdm/2012/Slides/Renzo.pdf • Constant Databases • http://cr.yp.to/cdb.html 4
  4. Data Management • We are going to learn SQLite, an

    entry-level, disk-based SQL RDBMS • Store data in tables, made of rows and columns, similar to Excel • Tables are created using SQL DDL: • Each table has a set of columns, usually with a primary key • Primary key identify the identity column • Collation defines how columns are sorted • Important on text columns, since it will determine how data is returned • End each statement with a semicolon (;) • Use single quote characters for text values 5 CREATE TABLE users ( id INTEGER PRIMARY KEY AUTOINCREMENT, first_name VARCHAR(50), last_name VARCHAR(50), username VARCHAR(50) NOT NULL, email VARCHAR(100) );
  5. SQL Data Types • Numeric • usually int • Text

    • usually varchar • Date • date, time, timestamp*, interval • Some databases don’t store timezone • Always use UTC! • Blob • Raw data • Don’t store images, files, etc. unless you know what you are doing! 6
  6. Accessing Data • SQLite is started interactively using the sqlite3

    command • Data is retrieved using SELECT • Specify column names in the order you want to retrieve them • Specify ordering of data using order by • Specify constraints on data selection using where • Select only a fixed number of rows using limit 7 SELECT * FROM users; SELECT username, id FROM users; SELECT username, id FROM users order by username asc; SELECT username, id FROM users WHERE id < 100; SELECT username, id FROM users WHERE username LIKE 'a%'; SELECT username, id FROM users WHERE id BETWEEN 5 and 10; SELECT username, id FROM users WHERE id IN (1,2,3,4); SELECT username, id FROM users WHERE first_name = 'John'; SELECT username, id FROM users WHERE first_name IS NOT NULL; SELECT username, id FROM users LIMIT 5;
  7. Inserting Data • Add data to a table using insert

    • Specify columns first • Specify data second, in the order in which they should apply to columns specified • NOT NULL columns and AUTOINCREMENT columns don’t need to be specified 8 INSERT INTO users ( first_name, last_name, username, email) VALUES ( 'Allison', 'Payne', 'thaps1928', '[email protected]' );
  8. Updating Data • Update data in rows by specifying which

    rows get updated, then setting values for columns • Rows can be updated more than 1 at a time • Functional paradigm instead of Imperative/Procedural • Can use column values to update other column values in same row 9 UPDATE users SET first_name = 'John' WHERE id < 5; UPDATE users SET first_name = first_name || 'John' WHERE id < 5; UPDATE users SET username = LOWER(username);
  9. Deleting Data • Deleting data works like update, specifying rows

    to apply deletion function • Truncate a table to clear it of data • Remove a table from the database using drop table • Be very careful with these methods. They destroy data! 10 DELETE FROM users WHERE id > 49999; TRUNCATE TABLE users; DELETE FROM users; DROP TABLE users;
  10. Indexes • Enable a database to optimize data retrieval •

    Primary keys automatically have unique indexes • Create index on specific columns • Create unique index to ensure columns (or collections of columns) have a uniqueness constraint for values • Use explain queries to understand which column(s) need indexing • Indexing speeds up retrieval, but slows down insertion, updating, and deleting since the table and the indices must be updated 11 CREATE INDEX name_idx ON users (first_name, last_name); CREATE UNIQUE INDEX username_uq ON users (username);
  11. CRUD • Create • Read • Update • Delete •

    Refers to all of the major functions that are implemented in RDBMes • Used to model RESTful web APIs 12
  12. Data Relationships • Relational databases are structured to relate tables

    to each other • Similar to object relations, but not exactly • Foreign Keys: each row matches exactly 0 or 1 rows in another table • Enable different types of relationships in web framework ORMs: • One-to-One: each row matches exactly 1 row in another table • Many-to-Many: each row matches 0 or more rows in another table • Foreign Key: each row matches 0 or 1 rows in another table • In SQLite, you must enable foreign key support 13 CREATE TABLE user_profiles ( id INTEGER PRIMARY KEY AUTOINCREMENT, user_id INTEGER, profile_value VARCHAR(100), FOREIGN KEY(user_id) REFERENCES users(id) ); PRAGMA foreign_keys = ON;
  13. Using Data Relationships • When a foreign key is enabled,

    inserting a row into a table requires that the id in related tables exist in a row • Deleting data requires constraints be met at all times • Queries can be made joining related tables, but data is returned only if referential integrity is maintained 14 sqlite> insert into user_profiles (user_id, profile_value) values (2, 'foo'); sqlite> insert into user_profiles (user_id, profile_value) values (12, 'foo'); Error: foreign key constraint failed sqlite> delete from users where id=2; Error: foreign key constraint failed SELECT user.id, user_profile.profile_value FROM users user JOIN user_profiles user_profile ON user.id = user_profile.user_id;
  14. In Class: Data Models • Create a SQL DDL data

    model for a parking lot: • Lots • Spaces • Cars 15
  15. Aggregation Functions • Aggregation enables computation • Simple aggregation returns

    single value, such as count and sum • Multi-valued aggregation enabled by group by • Think of group by as collapsing rows that have the same values • Can cause some unexpected effects... be sure to remember functional nature of query evaluation! 16 SELECT COUNT(*) FROM users; SELECT first_name, COUNT(first_name) as c FROM users GROUP BY first_name ORDER BY c DESC LIMIT 10;
  16. In Class: Start Working with SQLite • Use SQLite3! •

    Homework: • Write a SQL DDL for users and addresses where each user can have 0 or more addresses • Email me the DDL 17