Slide 1

Slide 1 text

Brendan McAdams
10gen, Inc.
[email protected] @rit
MongoDB + the JVM
Integrating NoSQL with Java & Scala
Thursday, September 13, 12

Slide 2

Slide 2 text

Let’s Face It ... SQL Sucks. For some problems at least.

Slide 3

Slide 3 text

Stuffing an object graph into a relational model is like fitting a square peg into a round hole. Databases should simplify application development - they should present a model that fits naturally with our code

Slide 4

Slide 4 text

MongoDB as a Database
• MongoDB is designed to fit your application
• Flexible
• Fast
• Sane

Slide 5

Slide 5 text

MongoDB as a Database
• Two crucial core concepts
• Document Oriented Data
• Scalability

Slide 6

Slide 6 text

MongoDB as a Database
• Document Oriented Data
• Instead of flat row structures, “document oriented” data (similar to JSON)
• Rich: Embed other documents and arrays as values

Slide 7

Slide 7 text

Rich Documents Represent MongoDB Data
    {
      "title": "Programming Erlang: Software for a Concurrent World",
      "author": "Joe Armstrong",
      "publicationYear": 2007,
      "publisher": "The Pragmatic Programmers, LLC"
    }
Like JSON, MongoDB documents are made up of key and value pairs.

Slide 8

Slide 8 text

Rich Documents Represent MongoDB Data
    {
      "title": "Programming Erlang: Software for a Concurrent World",
      "author": "Joe Armstrong",
      "publicationYear": 2007,
      "price": { "currency": "USD", "discount": 24.14, "msrp": 36.95 },
      "publisher": "The Pragmatic Programmers, LLC"
    }
Values can be complex, such as embedded subdocuments...

Slide 9

Slide 9 text

Rich Documents Represent MongoDB Data
    {
      "title": "Programming Erlang: Software for a Concurrent World",
      "author": "Joe Armstrong",
      "publicationYear": 2007,
      "price": { "currency": "USD", "discount": 24.14, "msrp": 36.95 },
      "publisher": "The Pragmatic Programmers, LLC",
      "tags": [ "erlang", "concurrent programming", "multicore", "programming" ]
    }
Or even arrays!

Slide 10

Slide 10 text

Querying with MongoDB
• Querying in MongoDB is similar to a ‘query by example’ interface
• Finding items by exact match via “key” = “value”
• Built-in Query Expressions for more advanced statements

Slide 11

Slide 11 text

Querying With MongoDB
    > db.books.find({"author": "Joe Armstrong"})
    {
      "title": "Programming Erlang: Software for a Concurrent World",
      "author": "Joe Armstrong",
      "publicationYear": 2007,
      "price": { "currency": "USD", "discount": 24.14, "msrp": 36.95 },
      "publisher": "The Pragmatic Programmers, LLC",
      "tags": [ "erlang", "concurrent programming", "multicore", "programming" ]
    }
Basic queries consist of “key” = “value”

Slide 12

Slide 12 text

Querying With MongoDB
    > db.books.find({"price.currency": "USD"})
    {
      "title": "Programming Erlang: Software for a Concurrent World",
      "author": "Joe Armstrong",
      "publicationYear": 2007,
      "price": { "currency": "USD", "discount": 24.14, "msrp": 36.95 },
      "publisher": "The Pragmatic Programmers, LLC",
      "tags": [ "erlang", "concurrent programming", "multicore", "programming" ]
    }
Embedded docs can be accessed by “key.subkey” = “value”

Slide 13

Slide 13 text

Querying With MongoDB
    > db.books.find({"tags": "multicore"})
    {
      "title": "Programming Erlang: Software for a Concurrent World",
      "author": "Joe Armstrong",
      "publicationYear": 2007,
      "price": { "currency": "USD", "discount": 24.14, "msrp": 36.95 },
      "publisher": "The Pragmatic Programmers, LLC",
      "tags": [ "erlang", "concurrent programming", "multicore", "programming" ]
    }
Embedded arrays can be accessed by matching just a single value from the array

Slide 14

Slide 14 text

Querying With MongoDB
    > db.books.find({"price.discount": {$lt: 25.00}})
    {
      "title": "Programming Erlang: Software for a Concurrent World",
      "author": "Joe Armstrong",
      "publicationYear": 2007,
      "price": { "currency": "USD", "discount": 24.14, "msrp": 36.95 },
      "publisher": "The Pragmatic Programmers, LLC",
      "tags": [ "erlang", "concurrent programming", "multicore", "programming" ]
    }
Finally, MongoDB provides a set of query expressions for concepts such as greater than, less than, etc.
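The four query forms above (exact match, dotted sub-key, array membership, and `$lt`) can be mimicked in a few lines of plain Java. This is only a sketch of the matching rules as the slides describe them, using nested `java.util.Map` and `List` values as stand-ins for documents. The class `QuerySketch` and its methods are hypothetical names invented for illustration, and this is not how MongoDB itself evaluates queries.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: an in-memory matcher for the query forms on the slides.
// QuerySketch is a hypothetical name; nothing here is part of any MongoDB driver.
class QuerySketch {

    // Resolve a dotted path such as "price.currency" against nested maps.
    static Object resolve(Map<String, Object> doc, String path) {
        Object current = doc;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null; // path runs past a non-document value
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // True when every key in the query matches the document.
    static boolean matches(Map<String, Object> doc, Map<String, Object> query) {
        for (Map.Entry<String, Object> e : query.entrySet()) {
            Object actual = resolve(doc, e.getKey());
            Object expected = e.getValue();
            if (expected instanceof Map && ((Map<?, ?>) expected).containsKey("$lt")) {
                // Query expression of the form {"$lt": bound}
                Object bound = ((Map<?, ?>) expected).get("$lt");
                if (!(actual instanceof Number) || !(bound instanceof Number)
                        || ((Number) actual).doubleValue() >= ((Number) bound).doubleValue()) {
                    return false;
                }
            } else if (actual instanceof List) {
                // Arrays match when any single element equals the value.
                if (!((List<?>) actual).contains(expected)) {
                    return false;
                }
            } else if (actual == null || !actual.equals(expected)) {
                return false;
            }
        }
        return true;
    }

    // Build a one-key query such as {"author": "Joe Armstrong"}.
    static Map<String, Object> query(String key, Object value) {
        Map<String, Object> q = new LinkedHashMap<>();
        q.put(key, value);
        return q;
    }

    // The book document from the slides, as nested maps and a list.
    static Map<String, Object> sampleBook() {
        Map<String, Object> price = new LinkedHashMap<>();
        price.put("currency", "USD");
        price.put("discount", 24.14);
        price.put("msrp", 36.95);

        Map<String, Object> book = new LinkedHashMap<>();
        book.put("title", "Programming Erlang: Software for a Concurrent World");
        book.put("author", "Joe Armstrong");
        book.put("publicationYear", 2007);
        book.put("price", price);
        book.put("publisher", "The Pragmatic Programmers, LLC");
        book.put("tags", Arrays.asList("erlang", "concurrent programming", "multicore", "programming"));
        return book;
    }
}
```

For example, `QuerySketch.matches(QuerySketch.sampleBook(), QuerySketch.query("tags", "multicore"))` follows the array rule, while wrapping a bound as `QuerySketch.query("$lt", 25.00)` exercises the query-expression rule.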

Slide 15

Slide 15 text

MongoDB as a Database
• Scalability
• Database should grow and scale with our application
• Replica Sets: Robust, modernized Replication Model with automatic failover
• Sharding: n-scalable horizontal partitioning with automatic management
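The slide does not say how documents are assigned to shards. As a rough illustration of horizontal partitioning in general, the sketch below assumes range partitioning on a string shard key: each shard owns a contiguous key range, and a router picks the range whose lower bound is the greatest one not exceeding the key. `ShardRouterSketch`, the shard names, and the range bounds are all hypothetical, and the automatic management mentioned on the slide is not modeled.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: range-based routing of a shard key to a shard name.
// All names here are hypothetical and not part of MongoDB.
class ShardRouterSketch {
    // Inclusive lower bound of each key range -> shard that owns the range.
    private final TreeMap<String, String> lowerBoundToShard = new TreeMap<>();

    void addRange(String lowerBound, String shard) {
        lowerBoundToShard.put(lowerBound, shard);
    }

    // Pick the range with the greatest lower bound that is <= key.
    String shardFor(String key) {
        Map.Entry<String, String> entry = lowerBoundToShard.floorEntry(key);
        if (entry == null) {
            throw new IllegalArgumentException("no range covers key: " + key);
        }
        return entry.getValue();
    }
}
```

With ranges starting at `""` and `"m"`, a key such as `"erlang"` routes to the first shard and `"scala"` to the second.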

Slide 16

Slide 16 text

MongoDB on the JVM
• MongoDB has strong, wide support on the JVM
• Java
• Scala
• Hadoop
• Also, fantastic work occurring in Clojure community (see Monger - clojuremongodb.info)

Slide 17

Slide 17 text

MongoDB + Java
• Java + MongoDB
• “Core” MongoDB Driver (mongo-java-driver)
• Manipulate MongoDB Docs as Map-like structures
• “Object Document Mapping” (like Hibernate, but less painful)
• Morphia
• Map domain objects to MongoDB with JPA-like annotations
• Spring Data
• Spring ODM for many NoSQL databases, supports MongoDB

Slide 18

Slide 18 text

Core MongoDB + Java
    com.mongodb.Mongo m = new Mongo();
The Mongo class represents a connection pool
    com.mongodb.DB db = m.getDB( "bookstore" );
The DB class represents a Database context
    com.mongodb.DBCollection coll = db.getCollection( "books" );
The DBCollection class represents a Collection (Mongo’s version of a table) handle

Slide 19

Slide 19 text

Core MongoDB + Java
    com.mongodb.DBObject q = new BasicDBObject();
    q.put( "tag", "scala" );
    q.put( "price", new BasicDBObject( "$lt", 40.00 ) );
Documents are represented by DBObject
    for ( DBObject doc : coll.find( q ) ) {
        // ...
    }
Queries return a DBCursor, which is both Iterator and Iterable

Slide 20

Slide 20 text

Core MongoDB + Java
    DBObject q = QueryBuilder.start( "price" ).lessThan( 40.00 ).
                     and( "tag" ).is( "scala" ).get();
There is also a QueryBuilder helper class for querying
If you don’t fancy doing everything by hand, you can use tools like Morphia to map domain objects automatically...
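To make the fluent style concrete, here is a small stand-in that assembles the same nested query shape using only `java.util` maps. `MiniQueryBuilder` is a hypothetical class written for this sketch. It borrows the method names used on the slide (`start`, `lessThan`, `and`, `is`, `get`) but is not the driver's `QueryBuilder` and makes no claim about how that class is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a fluent query builder, for illustration only.
class MiniQueryBuilder {
    private final Map<String, Object> query = new LinkedHashMap<>();
    private String currentKey;

    private MiniQueryBuilder(String key) {
        this.currentKey = key;
    }

    // Begin a query on the given key.
    static MiniQueryBuilder start(String key) {
        return new MiniQueryBuilder(key);
    }

    // Switch to another key; later conditions apply to it.
    MiniQueryBuilder and(String key) {
        currentKey = key;
        return this;
    }

    // Exact match: {key: value}
    MiniQueryBuilder is(Object value) {
        query.put(currentKey, value);
        return this;
    }

    // Query expression: {key: {"$lt": bound}}
    MiniQueryBuilder lessThan(Object bound) {
        Map<String, Object> op = new LinkedHashMap<>();
        op.put("$lt", bound);
        query.put(currentKey, op);
        return this;
    }

    // Return the assembled query as a nested map.
    Map<String, Object> get() {
        return query;
    }
}
```

Calling `MiniQueryBuilder.start("price").lessThan(40.00).and("tag").is("scala").get()` yields a map with a `price` entry holding `{"$lt": 40.0}` and a `tag` entry holding `"scala"`, the same structure the hand-built query on the previous slide produces.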

Slide 21

Slide 21 text

Object Mapping Java via Morphia
    @Entity("books") // Book classes persist to / from "books"
    class Book {}
Morphia uses JPA-like Annotations to mark up domain objects for MongoDB persistence
    @Id
    private ObjectId id;
Any field can be tagged as the primary key via the @Id annotation
    private List<String> tags = new ArrayList<String>();
List fields are automatically persisted as MongoDB arrays

Slide 22

Slide 22 text

Object Mapping Java via Morphia
    /**
     * Could also use "reference"; referenced objects are stored in
     * their own collection and loaded automatically.
     *
     * Morphia uses the field name as the key under which the value is stored.
     */
    @Embedded
    private Price price;
Complex sub-objects can be marked to either “embed” or “reference” automatically
    /**
     * Can rename a field to change how it is stored in MongoDB
     */
    @Property("publicationYear")
    private int year;
It’s trivial to name a field one thing in MongoDB and another in our Morphia model
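The general idea behind annotation-driven mapping can be sketched with plain reflection. The example below defines its own hypothetical `@StoredAs` annotation (playing the role that `@Property` plays on the slide) and copies an object's fields into a map, renaming annotated fields along the way. `StoredAs`, `MapperSketch`, and `BookSketch` are invented for illustration. This is not Morphia's code, and it ignores embedding, references, and ids.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical annotation: store the field under a different key.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface StoredAs {
    String value();
}

// Illustrative only: reflect over an object's fields to build a document-like map.
class MapperSketch {
    static Map<String, Object> toDocument(Object entity) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // only instance state is mapped
            }
            field.setAccessible(true);
            StoredAs alias = field.getAnnotation(StoredAs.class);
            String key = (alias != null) ? alias.value() : field.getName();
            try {
                doc.put(key, field.get(entity));
            } catch (IllegalAccessException e) {
                throw new RuntimeException("cannot read field " + field.getName(), e);
            }
        }
        return doc;
    }
}

// Hypothetical domain object: "year" in Java, "publicationYear" in the document.
class BookSketch {
    private String title = "Programming Erlang";

    @StoredAs("publicationYear")
    private int year = 2007;
}
```

`MapperSketch.toDocument(new BookSketch())` produces a map keyed by `title` and `publicationYear`. The Java field name `year` never appears in the stored form.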

Slide 23

Slide 23 text

MongoDB + Scala
• Scala + MongoDB
• “Core” MongoDB Driver (casbah)
• Wraps the Java driver, provides strong Scala API
• Documents manipulated in a 2.8+ collections Map structure including Builder, Factory and CanBuildFrom
• ODMs
• Salat - Case class mapping with some optional annotations, very fast and lightweight
• Lift - Popular Scala web framework includes a MongoDB ODM layer based on the ActiveRecord pattern
• Next-Generation Drivers (async focus) [Pure rewrites of driver]
• Hammersmith - my pet project, Netty + Akka.IO interfaces, strongly functional and callback based
• ReactiveMongo - from the amazing team @ Zenexity who brought us the Play! Framework - new, but very promising

Slide 24

Slide 24 text

Basic MongoDB + Scala via Casbah
Casbah’s version of a Document is MongoDBObject, which is a full Scala MapLike collection
    case class Book(id: ObjectId, author: Seq[Author], isbn: String,
                    price: Price, publicationYear: Int, tags: Seq[String],
                    title: String, publisher: String, edition: Option[String]) {
      def toDBObject = MongoDBObject(
        "author" -> author.map { a => a.name },
        "_id" -> id,
        "isbn" -> isbn,
        "price" -> price.toDBObject,
        "publicationYear" -> publicationYear,
        "tags" -> tags,
        "title" -> title,
        "publisher" -> publisher,
        "edition" -> edition
      )
    }

Slide 25

Slide 25 text

Basic MongoDB + Scala via Casbah
Because it is a full Collection implementation, Builder-style construction is easy as well
    val b = MongoDBObject.newBuilder
    b += "foo" -> "bar"
    b += "x" -> 5
    b += "map" -> Map("spam" -> 8.2, "eggs" -> "bacon")
    val dbObj = b.result
It’s even easy to start with a blank DBObject
    val dbObj = MongoDBObject.empty
    dbObj must beDBObject
    dbObj must have size (0)

Slide 26

Slide 26 text

Basic MongoDB + Scala via Casbah
While Casbah wraps the Java Driver for the Mongo protocol, its API aims to be as Scala pure as possible
    val mongo: MongoCollection = MongoConnection()("bookstore")("books")
Casbah’s Cursors can easily be iterated in standard Scala style
    def findAll() = for ( book <- mongo.find() ) yield newBook(book)
    findAll().foreach(b => println(" " + b))

Slide 27

Slide 27 text

Basic MongoDB + Scala via Casbah
Leveraging Scala’s great support for composable DSLs, Casbah provides a Query DSL that feels natural to a MongoDB user...
    val q: DBObject = ("price" $lt 40.00) ++ ("tag" -> "scala")
As with Java, the Scala MongoDB community has created a few ways of mapping objects as well...

Slide 28

Slide 28 text

Object Mapping in Scala via Lift
    class Book private() extends BsonRecord[Book] with ObjectIdPk[Book] {
    }
Lift uses an ActiveRecord style API for mapping MongoDB + Scala

Slide 29

Slide 29 text

Object Mapping in Scala via Lift
Fields in Lift-Mongo are declared as objects implementing a special typed trait.
    object isbn extends StringField(this, 64)
And Lists can be automatically handled via a MongoListField
    object author extends MongoListField[Book, String](this)
Fields can be declared optional by overriding a trait attribute
    object edition extends StringField(this, 32) {
      override def optional_? = true
    }

Slide 30

Slide 30 text

Object Mapping in Scala via Lift
Embedding objects represented by another entity is easy as well
    object price extends BsonRecordField(this, Price)

    class Price private() extends BsonRecord[Price] {
      // ...
      object currency extends StringField(this, 3)
      object discount extends DoubleField(this)
      object msrp extends DoubleField(this)
    }
To make working with Lift + MongoDB as easy as possible, Foursquare has created a fantastic Query DSL called Rogue

Slide 31

Slide 31 text

Object Mapping in Scala via Lift The downside to Lift is the need to use specially structured objects. For those who want a less formal API, Salat makes it easy...

Slide 32

Slide 32 text

Object Mapping in Scala via Salat
    case class Book(id: ObjectId, author: Seq[Author], isbn: String,
                    price: Price, publicationYear: Int, tags: Seq[String],
                    title: String, publisher: String, edition: Option[String])
    case class Author(name: String)
    case class Price(currency: String, discount: Double, msrp: Double)

    val authors = Seq( Author("Timothy Perrett") )
    val price = Price("USD", 39.99, 39.99)
    val tags = Seq("functional programming", "scala", "web development", "lift", "#legendofklang")
    val liftInAction = Book(new ObjectId, authors, "9781935182801", price, 2011, tags,
                            "Lift in Action", "Manning Publications Co.", Some("First"))
Salat uses Scala’s case classes as the core of its model (immutability is good!)

Slide 33

Slide 33 text

Object Mapping in Scala via Salat
    /**
     * The Salat Grater uses runtime Scala type reflection to
     * generate a MongoDB Object.
     */
    val dbo = grater[Book].asDBObject(liftInAction)
    mongo.save(dbo)
Persistence is made simple (through Scala reflection) via the Grater object
Some annotations are available to mark indexed fields, etc., but the core ideas in Salat are elegantly simple

Slide 34

Slide 34 text

MongoDB + Hadoop
• Hadoop + MongoDB
• Hadoop integration with MongoDB is one of my key projects at 10gen
• Feed MongoDB data directly (“live”) into MapReduce jobs & save MapReduce results directly to MongoDB
• Coming Soon: Read/Write “archived BSON” (basically, MongoDB Backup Files)
• Support for “Core MapReduce” as well as the wider Hadoop ecosystem
• Pig
• Streaming
• Hive & Scoobi (coming soon)
• See http://api.mongodb.org/hadoop to learn more

Slide 35

Slide 35 text

[Want to Know More About MongoDB?]
Stop by our booth on the conference floor!
[Docs] http://mongodb.org http://api.mongodb/
*Contact Me* [email protected] (twitter: @rit)