Who Am I?
… Junky
• Lead projects in .NET, Ruby, and PHP
• Book on PHP & CouchDB development out early next year from Packt Publishing
October 4th, 2011 - Tim Juravich

What we’ll talk about today
• … level discussion on NoSQL databases from the PHP developer POV
• We’ll focus on Data Models
• What we won’t be able to talk about:
  - How each database scales or stores its data
  - CAP Theorem (http://tinyurl.com/nosql-cap)
• We’ll touch on tools for us PHP developers

Data Models
A data model defines how your application stores data and how it makes associations.
If you force an incorrect data model into your application... you’re probably going to have a hard time.

Traditional relational Model (MySQL)

users
id  first_name  last_name
1   John        Doe
2   Jane        Doe

addresses
id  address          city     state  zip    user_id
1   123 Main Street  Seattle  WA     98101  1
2   120 Pike Street  Seattle  WA     98101  1
3   321 2nd Ave      Seattle  WA     98101  2

phone numbers
id  number        primary  user_id
1   555-867-5309  1        1
2   555-867-5309  0        1
3   555-867-5309  1        2
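
The user_id column is what ties the tables together: to show a user with their addresses, the application has to look up matching rows in a second table. A minimal sketch of that lookup in plain PHP, using the users and addresses rows from this slide (only some columns are included). The helper name rows_for_user is hypothetical and stands in for a SQL query with a WHERE clause:

```php
<?php
// Rows from the users and addresses tables, held as plain PHP arrays.
$users = [
    ['id' => 1, 'first_name' => 'John', 'last_name' => 'Doe'],
    ['id' => 2, 'first_name' => 'Jane', 'last_name' => 'Doe'],
];
$addresses = [
    ['id' => 1, 'address' => '123 Main Street', 'user_id' => 1],
    ['id' => 2, 'address' => '120 Pike Street', 'user_id' => 1],
    ['id' => 3, 'address' => '321 2nd Ave',     'user_id' => 2],
];

// Hypothetical helper that plays the role of
// "SELECT * FROM addresses WHERE user_id = ?".
function rows_for_user(array $rows, int $userId): array
{
    return array_values(array_filter($rows, function ($row) use ($userId) {
        return $row['user_id'] === $userId;
    }));
}

foreach ($users as $user) {
    $streets = array_column(rows_for_user($addresses, $user['id']), 'address');
    echo $user['first_name'], ': ', implode('; ', $streets), PHP_EOL;
}
// John: 123 Main Street; 120 Pike Street
// Jane: 321 2nd Ave
```

Every extra table (phone numbers, and so on) adds another lookup of this kind.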

What’s wrong with this?
• … are just limitations
  - Due to normalization, as your data grows so does the complexity of your database
  - As your database grows, writing becomes a bottleneck
  - Difficult to scale horizontally
  - Sometimes tables are just too limiting

NoSQL databases are born
• Some NoSQL databases have been around forever
  - In 2004 & 2005 they explode
• NoSQL really means “Not Only SQL”

NoSQL
• The Good News
  - Different tools for different situations
  - Flexible (schema-less)
  - Focused on scalability out of the box
  - New data models
• The Bad News
  - No common standards
  - Relatively immature
  - New data models

Size vs. Complexity
[Figure: key/value stores, column stores, document databases, and graph databases plotted against a typical RDBMS and the RDBMS performance line. Axes: Scale To Size, Scale To Complexity.]

Key Value Stores
• Definition
  - Access to a value based on a unique key
  - Think of this in terms of a hash table or, in PHP, an associative array.

    user_1  “John Doe”
    user_2  { name: “John Doe”, email: “[email protected]”, phone: “8675309” }
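
To make the associative-array analogy concrete, here is a minimal sketch that treats a plain PHP array as a key-value store. The function names kv_set and kv_get are hypothetical helpers for illustration, not part of any library; a real store such as Redis adds persistence and network access on top of the same idea.

```php
<?php
// A toy in-memory key-value store backed by a PHP associative array.
// kv_set/kv_get are hypothetical helper names used only for this sketch.

function kv_set(array &$store, string $key, $value): void
{
    $store[$key] = $value;
}

function kv_get(array $store, string $key)
{
    // Return null when the key is missing, much like a cache miss.
    return $store[$key] ?? null;
}

$store = [];
kv_set($store, 'user_1', 'John Doe');
kv_set($store, 'user_2', [
    'name'  => 'John Doe',
    'phone' => '8675309',
]);

echo kv_get($store, 'user_1'), PHP_EOL;          // John Doe
echo kv_get($store, 'user_2')['phone'], PHP_EOL; // 8675309
```

The store never looks inside a value: it can be a string or a whole structure, and the only way in is the key.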

Redis
• Strengths
  - Incredibly fast
  - Pub/Sub support
  - Simple CLI
• Weaknesses
  - It can feel limiting with a complex use case
• Use it for
  - Rapidly changing data
  - Stocks, analytics, real-time collection/communication

    redis> set im.a.key "im.a.value"
    OK
    redis> get im.a.key
    "im.a.value"

Redis & PHP
• There are a variety of Redis & PHP toolkits (http://tinyurl.com/redis-php)
• My favorite is Predis (https://github.com/nrk/predis)
• Twitter clone to play with: (http://redis.io/topics/twitter-clone)

    <?php
    // Assumes Predis is installed via Composer; adjust the autoloader path to your setup.
    require 'vendor/autoload.php';

    $redis = new Predis\Client();  // defaults to 127.0.0.1:6379
    $redis->set('foo', 'bar');
    $value = $redis->get('foo');   // "bar"

Column Stores
• Definition
  - Similar to a relational database, but flipped around: instead of storing records, a column store keeps all of the values for a column together in a stream. From there, you can use an index to get the column values for a particular record.
  - Can handle 4 or 5 dimensions
    4: Keyspace, Column Family, Column Family Row, Column
    5: Keyspace, Column Family, Column Family Row, Super Column, Column

Column Stores
example-db (Keyspace)
  users (Column Family)
    johndoe (key)
      name (column): John Doe
      email (column): [email protected]
      phone (column): 5558675309
    janedoe (key)
      name (column): Jane Doe
      email (column): [email protected]
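
The keyspace, column family, row key, column nesting can be mimicked with nested PHP arrays. This is only a sketch of the shape of the data, not of how Cassandra stores or indexes it; the helper get_column is a hypothetical name, and the email columns are left out for brevity.

```php
<?php
// Sketch of the keyspace -> column family -> row key -> column layout
// using nested PHP arrays. This mimics the shape of the data only.

$keyspace = [                 // example-db (keyspace)
    'users' => [              // column family
        'johndoe' => [        // row key
            'name'  => 'John Doe',
            'phone' => '5558675309',
        ],
        'janedoe' => [        // row key; rows need not share the same columns
            'name' => 'Jane Doe',
        ],
    ],
];

// Hypothetical helper: read one column for one row, or null if absent.
function get_column(array $keyspace, string $family, string $rowKey, string $column)
{
    return $keyspace[$family][$rowKey][$column] ?? null;
}

echo get_column($keyspace, 'users', 'johndoe', 'phone'), PHP_EOL; // 5558675309
var_dump(get_column($keyspace, 'users', 'janedoe', 'phone'));     // NULL
```

Note that janedoe simply has no phone column; nothing forces every row in a column family to carry the same set of columns.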

Cassandra
• Strengths
  - Can handle some serious data
  - Writes are much faster than reads
  - Hadoop integration
• Weaknesses
  - Complex for what it offers, and bloated
• Use it for
  - Apps with a lot of writing. Serious applications.

Cassandra & PHP
• Thrift
• There are a few PHP libraries (http://tinyurl.com/cassandra-php)
• My favorite is phpcassa (https://github.com/thobbs/phpcassa)

Document Databases
• Definition
  - Documents provide access to structured data, without a schema.
  - Buckets of key-value pairs inside of a self-contained object.
  - Friendliest NoSQL databases for PHP developers
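
As a rough illustration of a self-contained document, the sketch below builds one user as a nested PHP array and round-trips it through JSON. The field names are assumptions chosen to echo the relational example earlier in the talk; no particular database is involved.

```php
<?php
// One self-contained user document: the addresses and phone numbers that
// a relational schema would keep in separate tables are embedded here.
$user = [
    'first_name' => 'John',
    'last_name'  => 'Doe',
    'addresses'  => [
        ['address' => '123 Main Street', 'city' => 'Seattle', 'state' => 'WA', 'zip' => '98101'],
        ['address' => '120 Pike Street', 'city' => 'Seattle', 'state' => 'WA', 'zip' => '98101'],
    ],
    'phone_numbers' => [
        ['number' => '555-867-5309', 'primary' => true],
        ['number' => '555-867-5309', 'primary' => false],
    ],
];

// Document databases typically exchange documents as JSON or a
// JSON-like format, so encoding is a single call.
$json = json_encode($user, JSON_PRETTY_PRINT);
echo $json, PHP_EOL;

// Decoding gives the whole record back without any joins.
$decoded = json_decode($json, true);
echo count($decoded['addresses']), PHP_EOL; // 2
```

Because there is no schema, a second user document could omit phone_numbers or add new fields without touching this one.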

MongoDB
• Strengths
  - Familiar query language
  - A lot of developer toolkits
• Weaknesses
  - Sharding can be a pain
• Use it for
  - Things you might do with MySQL, but schemas are getting in the way.

    $post->where('id', $this->getID());

MongoDB & PHP
• A TON of libraries (http://tinyurl.com/mongo-php)
  - Doctrine
  - Cake
  - CodeIgniter
  - Symfony
  - Zend
  - etc.
• My favorite is ActiveMongo (https://github.com/crodas/ActiveMongo)

CouchDB
• Strengths
  - Bi-directional replication (master-master)
  - Slick RESTful JSON API, easy to use
• Weaknesses
  - Need some JavaScript chops
  - Slower writes
• Use it for
  - Mobile, CRM, CMS systems, multi-site deployments

CouchDB & PHP
• Not as many libraries as MongoDB (http://tinyurl.com/couch-php)
  - Doctrine
  - Variety of standalone libraries
• My favorite is Sag (https://github.com/sbisbee/sag)

CouchDB & PHP (Sag)

    <?php
    require_once 'Sag.php'; // Sag's main class file; adjust the path to your install

    class User
    {
        public $name;
        public $email;
    }

    $user = new User();
    $user->name = 'John Doe';
    $user->email = '[email protected]';

    $sag = new Sag('127.0.0.1', '5984');
    $sag->setDatabase('example-db');
    $sag->post($user); // sends the user to CouchDB as a new document

Graph Databases
• Definition
  - Instead of tables, rows, and columns, it's a flexible graph model that contains nodes. Nodes have properties and relationships to other nodes.
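
A toy version of the node, property, and relationship model can be sketched with PHP arrays. Everything here is made up for illustration (the node data, the KNOWS relationship type, and the neighbors helper); a graph database such as Neo4j provides its own API and traversal engine for this.

```php
<?php
// Toy graph: nodes carry properties, relationships are typed edges
// between node ids. All names and data here are illustrative only.
$nodes = [
    1 => ['name' => 'John'],
    2 => ['name' => 'Jane'],
    3 => ['name' => 'Sam'],
];

$relationships = [
    ['from' => 1, 'type' => 'KNOWS', 'to' => 2],
    ['from' => 2, 'type' => 'KNOWS', 'to' => 3],
];

// Hypothetical helper: ids reachable from $start by following one
// outgoing relationship of the given type.
function neighbors(array $relationships, int $start, string $type): array
{
    $ids = [];
    foreach ($relationships as $rel) {
        if ($rel['from'] === $start && $rel['type'] === $type) {
            $ids[] = $rel['to'];
        }
    }
    return $ids;
}

// Friends of friends of John: follow KNOWS twice.
$friendsOfFriends = [];
foreach (neighbors($relationships, 1, 'KNOWS') as $friend) {
    foreach (neighbors($relationships, $friend, 'KNOWS') as $fof) {
        $friendsOfFriends[] = $nodes[$fof]['name'];
    }
}

echo implode(', ', $friendsOfFriends), PHP_EOL; // Sam
```

The query is expressed as hops along relationships rather than joins between tables, which is the shift in thinking a graph model asks for.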

Neo4j
• Strengths
  - Fully transactional
  - Flexible API
• Weaknesses
  - A completely different way of thinking
  - Complex
• Use it for
  - Social relations, road maps, network topologies

So which DB should you use?
• Don’t rule out relational databases
• Do your homework
• Every project is different
• NoSQL probably should be avoided in these areas
  - Transactions, orders, anything where money changes hands
  - Business-critical data or line-of-business applications