MongoDB 101
Ryan Fischer
@ryanfischer20
Thursday, May 17, 12
What you will hear
What is NoSQL
Available NoSQL Databases
Intro to MongoDB
What is NoSQL
Fancy Answer
NoSQL is a class of database management system identified by
its non-adherence to the widely used relational database
management system (RDBMS) model
My Answer
It’s not SQL
Does not use SQL as its query language
May not give full ACID guarantees
Distributed architecture
Typically optimized for read and write operations
Advantages
Traditional Scaling
Bigger is better! (or so they thought)
Increase the size and power of the server
Scaling with NoSQL
Scale horizontally!
Distribute across multiple servers
More economical using lower-cost servers
Goodbye Schemas
Flexible data models
Easy to add/change data structures
Disadvantages
Goodbye Schemas
Flexible data structures
Data integrity depends on the application
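Without an enforced schema, checks such as required fields move into application code. A minimal sketch of that idea, assuming a hypothetical "user" document; the field names and the validate_user helper are illustrative and not from the talk:

```python
# Hypothetical required fields and types for a "user" document (illustrative only).
REQUIRED_FIELDS = {"name": str, "email": str}


def validate_user(doc):
    """Return a list of problems; an empty list means the document looks valid."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            problems.append("missing field: " + field)
        elif not isinstance(doc[field], expected_type):
            problems.append("wrong type for field: " + field)
    return problems


# Extra fields such as "tags" are allowed; only the required ones are checked.
good = {"name": "Ada", "email": "ada@example.com", "tags": ["admin"]}
bad = {"name": "Bob", "email": 42}
```

The application would call validate_user before every insert, since the database itself will accept either document.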
NoSQL is Still Young
Does not reduce administration (at least not yet)
Lack of expertise
Lack of projects expanding on NoSQL
NoSQL out in the Wild
Analytics - takes advantage of read/write optimizations
Logging
Large Scale Projects
MongoDB
What is MongoDB
Document Oriented Storage
Replication & Auto-Sharding
Document-based queries similar to SQL
Atomic Updates
Map/Reduce
Document Oriented
No schemas!!
No joins for high performance and scalability
Embed documents
JSON-Style storage
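To make "embed documents" concrete, here is a rough sketch of one JSON-style document written as a Python dict. The blog-post shape and its field names are assumptions chosen for illustration:

```python
import json

# A hypothetical blog post; the comments are embedded in the post itself,
# so one read returns everything and no join is needed.
post = {
    "title": "MongoDB 101",
    "author": {"name": "Ryan", "twitter": "@ryanfischer20"},
    "tags": ["nosql", "mongodb"],
    "comments": [
        {"user": "alice", "text": "Nice intro"},
        {"user": "bob", "text": "What about joins?"},
    ],
}

# JSON-style representation of the document.
as_json = json.dumps(post, indent=2)

# Nested values are reached by walking the structure.
first_commenter = post["comments"][0]["user"]
```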
High Performance
Stores a lot of data in memory
Embedding documents speeds up reads and writes
Allows indexing
Availability and Scalability
Replicated servers with automatic master failover
Auto-sharding across servers
Eventually consistent reads can be distributed over replicated servers
Atomic Modifiers
Update documents in place
Does not replace entire document
Ideally suited for write heavy applications
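Modifiers such as $inc, $set, and $push change only the named fields instead of rewriting the whole document. A minimal sketch that mimics those semantics on a plain Python dict; apply_update is a hypothetical helper, not a driver API, and it provides none of the server's atomicity:

```python
def apply_update(doc, update):
    """Apply a small subset of MongoDB-style modifiers to a dict, in place."""
    # $inc adds to a numeric field, treating a missing field as 0.
    for field, amount in update.get("$inc", {}).items():
        doc[field] = doc.get(field, 0) + amount
    # $set overwrites (or creates) a single field.
    for field, value in update.get("$set", {}).items():
        doc[field] = value
    # $push appends to an array field, creating it if needed.
    for field, value in update.get("$push", {}).items():
        doc.setdefault(field, []).append(value)
    return doc


page = {"_id": 1, "url": "/home", "views": 10}
apply_update(page, {"$inc": {"views": 1}, "$push": {"referrers": "twitter"}})
```

Only "views" and "referrers" are touched; "url" and "_id" are left as they were.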
Storing data
Data is grouped by collections
Collection contains documents of key-value pairs
Values can be rich including arrays and documents
Stored as BSON (Binary JSON) - a binary serialization of JSON-like documents
Querying
JavaScript console allows for functions
Returns a cursor - lazy load of results
Queries expressed as JSON
Documents are auto-assigned an ObjectId
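The idea of a JSON-style query returning a lazy cursor can be sketched in plain Python. This is an illustration only: find here is a hypothetical function that supports exact-match fields, and a generator stands in for the cursor:

```python
def find(collection, query):
    """Lazily yield documents whose fields equal every key/value in query."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            yield doc


# Hypothetical collection; the integer _id values stand in for ObjectIds.
people = [
    {"_id": 1, "name": "Ann", "city": "Omaha"},
    {"_id": 2, "name": "Raj", "city": "Lincoln"},
    {"_id": 3, "name": "Lee", "city": "Omaha"},
]

cursor = find(people, {"city": "Omaha"})  # nothing is scanned yet
first = next(cursor)                      # scans only until the first match
rest = list(cursor)                       # drains the remaining matches
```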
Examples
Interactive time!
Go to https://gist.github.com/2719591 for examples
Embed vs Referenced
Relationships for models
Object Models - Think differently
When in doubt, store in a separate collection
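The two modeling choices can be compared side by side. A rough sketch using plain Python lists as stand-ins for collections; the user, address, and order shapes are assumptions for illustration:

```python
# Option 1: embedded. The addresses live inside the user document.
user_embedded = {
    "_id": 1,
    "name": "Ann",
    "addresses": [{"city": "Omaha"}, {"city": "Lincoln"}],
}

# Option 2: referenced. Orders live in their own collection and point back
# to the user by _id, so reading them takes a second lookup.
users = [{"_id": 1, "name": "Ann"}]
orders = [
    {"_id": 100, "user_id": 1, "total": 25},
    {"_id": 101, "user_id": 1, "total": 40},
    {"_id": 102, "user_id": 2, "total": 15},
]


def orders_for(user_id):
    """Application-side 'join': filter the orders collection by user_id."""
    return [order for order in orders if order["user_id"] == user_id]


cities = [address["city"] for address in user_embedded["addresses"]]
ann_totals = [order["total"] for order in orders_for(1)]
```

Embedding suits data that is always read with its parent; a separate collection suits data that grows without bound or is queried on its own.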
Geospatial Queries
SQL
SELECT * FROM Places
WHERE
acos(sin(1.3963) * sin(Lat) +
cos(1.3963) * cos(Lat) * cos(Lon -
(-0.6981))) * 6371 <= 1000;
Ways to search
Exact Queries
Search by closest points
Bound Queries
Query within a rectangle
Circle with a center point and radius
Search within a polygon (MongoDB 1.9 and later)
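The geometry behind the rectangle and circle searches can be sketched on a flat plane. This is pure Python for illustration, not MongoDB's query syntax; the place names and coordinates are made up:

```python
import math


def within_box(point, bottom_left, top_right):
    """True if point lies inside the axis-aligned rectangle (edges included)."""
    x, y = point
    return (bottom_left[0] <= x <= top_right[0]
            and bottom_left[1] <= y <= top_right[1])


def within_circle(point, center, radius):
    """True if point lies within radius of center on a flat plane."""
    return math.hypot(point[0] - center[0], point[1] - center[1]) <= radius


# Hypothetical places on an arbitrary x/y grid.
places = {"cafe": (2, 3), "park": (8, 8), "shop": (4, 1)}

in_box = sorted(n for n, p in places.items() if within_box(p, (0, 0), (5, 5)))
in_circle = sorted(n for n, p in places.items() if within_circle(p, (2, 2), 1.5))
```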
Spherical Model
Use decimal degrees - 42.53
Use [longitude, latitude] as ordering
Use radians for distance
$nearSphere and $centerSphere
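The unit conventions above can be checked with a small calculation. A sketch, assuming the same 6371 km Earth radius as the SQL slide; central_angle and km_to_radians are hypothetical helpers, and the second point is invented for the example:

```python
import math

EARTH_RADIUS_KM = 6371.0  # same Earth radius used in the SQL slide


def km_to_radians(distance_km):
    """Convert a surface distance to the angle it spans at Earth's center."""
    return distance_km / EARTH_RADIUS_KM


def central_angle(a, b):
    """Angle in radians between two [longitude, latitude] points in degrees."""
    lon1, lat1 = math.radians(a[0]), math.radians(a[1])
    lon2, lat2 = math.radians(b[0]), math.radians(b[1])
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))


center = [-40.0, 80.0]   # [longitude, latitude], roughly the SQL slide's point
place = [-40.0, 75.0]    # hypothetical place 5 degrees of latitude south

within_1000_km = central_angle(center, place) <= km_to_radians(1000)
```

Coordinates stay in decimal degrees with longitude first; only the search radius is expressed in radians.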
GridFS
Store large files in MongoDB
Stores each file in chunks
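Chunked storage can be sketched in a few lines. This only illustrates the splitting idea: the 4-byte chunk size is deliberately tiny and hypothetical, and the helpers are not the GridFS API:

```python
CHUNK_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split bytes into numbered chunk documents of at most chunk_size bytes."""
    return [
        {"n": index, "data": data[start:start + chunk_size]}
        for index, start in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


payload = b"hello gridfs"
chunks = split_into_chunks(payload)
```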
What is Sharding
Allows MongoDB to scale horizontally
Evenly distributes chunks of data
Performed per collection
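One way to picture chunk distribution is range-based routing on a shard key. A simplified sketch, assuming made-up split points and shard names; it ignores chunk splitting and balancing:

```python
import bisect

# Hypothetical split points on the shard key: keys below "h" fall in chunk 0,
# "h" up to (not including) "p" in chunk 1, and "p" onward in chunk 2.
SPLIT_POINTS = ["h", "p"]

# Hypothetical chunk-to-shard assignment.
CHUNK_TO_SHARD = {0: "shard-a", 1: "shard-b", 2: "shard-a"}


def shard_for(key):
    """Find the chunk whose key range contains key, then return its shard."""
    chunk = bisect.bisect_right(SPLIT_POINTS, key)
    return CHUNK_TO_SHARD[chunk]


placement = {name: shard_for(name) for name in ["alice", "henry", "zoe"]}
```

Moving a chunk to another server would then only require changing its entry in CHUNK_TO_SHARD.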
Disadvantages
No inherent transaction support
Scaling sometimes isn’t simple
Multiple servers recommended
Object modeling can be complex
Advantages
Active community, including 10gen
Driver support for most languages
Many new features to come
The big data loss debate
Internet flame war history
Mongo performs one write at a time - global lock
Stored in memory
Replication - fail over
The End
Follow me - @ryanfischer20