Riviera Scala and Clojure - MongoDB talk

rozza
January 16, 2013

My MongoDB talk from the recent Riviera Scala and Clojure user group.

http://www.meetup.com/riviera-scala-clojure/events/89448102/


Transcript

  1. 10gen Overview
    Offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona and Sydney
  2. Relational Database Challenges
    Data Types
      • Unstructured data
      • Semi-structured data
      • Polymorphic data
    Volume of Data
      • Petabytes of data
      • Trillions of records
      • Tens of millions of queries per second
    Agile Development
      • Iterative
      • Short development cycles
      • New workloads
    New Architectures
      • Horizontal scaling
      • Commodity servers
      • Cloud computing
  3. MongoDB is a ___________ database
    • Document-oriented
    • Open-source
    • High performance and horizontally scalable
    • Full featured
  4. Document-Oriented Database
    • Not .PDF & .DOC files
    • A document is essentially an associative array – JSON object, PHP Array, Python Dictionary, etc.
    • BSON – www.bsonspec.org
  5. Open-Source
    • MongoDB is an open-source project
    • On GitHub
    • Database licensed under the AGPL
    • Drivers licensed under Apache
  6. High Performance and Horizontally Scalable
    • High performance
      – Written in C++
      – Data serialised as BSON (fast parsing)
      – Full support for primary & secondary indexes
    • Horizontally scalable
      – Auto-sharding
      – Scale across
        • Commodity hardware
        • Cloud compute
        • Hybrid
  7. Full Featured
    • Rich ad hoc queries
    • Real-time aggregation
    • Geospatial features
    • Native bindings for most programming languages
  8. MongoDB is a Single-Master System
    • All writes are to a primary (master)
    • Failure of the primary is detected, and a new one is elected
    • Application writes get an error if there is no quorum to elect a new master
      – Reads can continue
  9. MongoDB Storage Management
    • Data is kept in memory-mapped files
    • Files are allocated as needed
    • Indexes (B*-trees) point to documents using geographical addresses
  10. Relational Schema
    User: Name, Email address
    Category: Name, URL
    Comment: Comment, Date, Author
    Article: Name, Slug, Publish date, Text
    Tag: Name, URL
  11. MongoDB Schema
    User: Name, Email address
    Article: Name, Slug, Publish date, Text, Author
      Comment[]: Comment, Date, Author
      Tag[]: Value
      Category[]: Value
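
As a rough illustration of the MongoDB schema above, an article with its comments, tags and categories embedded as arrays might look like the plain JavaScript object below. The talk does not show an actual document, so every field name and value here is a hypothetical choice that simply mirrors the slide's labels.

```javascript
// Hypothetical article document mirroring the slide's labels.
// Field names and values are illustrative assumptions, not taken from the talk.
const article = {
  name: "Intro to MongoDB",
  slug: "intro-to-mongodb",
  publishDate: new Date("2013-01-16T00:00:00Z"),
  text: "Article body goes here",
  author: "ross",
  // Comment[]: each entry carries its own comment text, date and author
  comments: [
    { comment: "Nice talk", date: new Date("2013-01-17T00:00:00Z"), author: "alice" },
  ],
  // Tag[] and Category[]: plain arrays of values
  tags: ["mongodb", "scala"],
  categories: ["databases"],
};

// With embedding, related data is read from the document itself
// rather than joined in from separate Comment, Tag and Category tables.
function commentAuthors(doc) {
  return doc.comments.map((c) => c.author);
}

console.log(commentAuthors(article));
```

The User entity stays separate in the slide, which is why `author` is stored here as a reference-like value rather than an embedded user.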
  12. Scalability – Auto-sharding
    [Diagram labels: Node 1, Secondary, Config Server (×3); Shard (×3); Mongos, App Server (×3)]
  13. Terminology
    RDBMS            Mongo
    Table, View      Collection
    Row(s)           Document
    Index            Index
    Join             Linking/Embedding
    Partition        Shard
    Partition Key    Shard Key
  14. _id
    • _id is the primary key in MongoDB
    • Automatically indexed
    • Automatically created as an ObjectId if not provided
    • Any unique immutable value could be used
  15. ObjectId
    • ObjectId is a special 12 byte value
    • Guaranteed to be unique across your cluster

        ObjectId("50ed3c5cab4ef39dc735664b")
                  |------||----||--||----|
                     ts    mac  pid  inc
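
The four-part layout on the ObjectId slide can be checked by hand. The sketch below splits the slide's example value into its ts, mac, pid and inc fields in plain JavaScript; `parseObjectId` is a hypothetical helper written for this illustration and is not part of the mongo shell or any driver.

```javascript
// Split a 24-character ObjectId hex string into the four fields named on the slide.
// parseObjectId is a hypothetical helper, not a MongoDB API.
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  return {
    // 4 bytes: creation time in seconds since the Unix epoch
    ts: new Date(parseInt(hex.slice(0, 8), 16) * 1000),
    // 3 bytes: machine identifier (labelled "mac" on the slide)
    mac: hex.slice(8, 14),
    // 2 bytes: process id
    pid: hex.slice(14, 18),
    // 3 bytes: incrementing counter
    inc: hex.slice(18, 24),
  };
}

const parts = parseObjectId("50ed3c5cab4ef39dc735664b");
console.log(parts.ts.toISOString()); // 2013-01-09T09:46:04.000Z
console.log(parts.mac, parts.pid, parts.inc); // ab4ef3 9dc7 35664b
```

Because the leading four bytes are a timestamp, ObjectIds created later sort after earlier ones at one-second granularity.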
  16. Query Operators
        // find users with any tags
        > db.users.find( {tags: {$exists: true }} )
        // find users matching a regular expression
        > db.users.find( {username: /^ro*/i } )
        // count posts by author
        > db.users.find( {username: "Ross"} ).count()
    • Conditional Operators
      – $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
      – $lt, $lte, $gt, $gte
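
One detail of the regular-expression query is worth spelling out: `o*` means zero or more `o` characters, so `/^ro*/i` matches any username that begins with `r` or `R`, not only those beginning with `ro`. The sketch below checks this in plain JavaScript against a few made-up usernames, with no database involved; the stricter `/^ro/i` is shown only for comparison and is not from the slides.

```javascript
// Made-up usernames for illustration only.
const usernames = ["ross", "Ross", "rick", "bob", "Rooney"];

const loose = /^ro*/i; // the pattern from the slide: "r" followed by zero or more "o"
const strict = /^ro/i; // comparison pattern (an assumption): must start with "ro"

const matchesLoose = usernames.filter((u) => loose.test(u));
const matchesStrict = usernames.filter((u) => strict.test(u));

console.log(matchesLoose);  // [ 'ross', 'Ross', 'rick', 'Rooney' ]
console.log(matchesStrict); // [ 'ross', 'Ross', 'Rooney' ]
```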
  17. Update
        > tags = ["superuser", "db_admin"]
        > address = { street: "Scrutton Street", city: "London" }
        > db.users.update({}, {"$pushAll": {"tags": tags},
                               "$set": {"address": address},
                               "$inc": {"tag_count": 2}})
  18. Read (Query)
        > db.users.findOne()
        {
          "_id" : ObjectId("50ed3c5cab4ef39dc735664b"),
          "address" : { "street" : "Zetland House", "city" : "London" },
          "first_name" : "Ross",
          "last_name" : "Lawley",
          "tag_count" : 2,
          "tags" : [ "superuser", "db_admin" ],
          "username" : "ross"
        }
  19. Atomic operators
    • Scalar
      – $set, $unset, $inc
    • Array
      – $push, $pushAll, $pull, $pullAll, $addToSet
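
To make three of these operators concrete, here is a plain JavaScript sketch of what `$pushAll`, `$set` and `$inc` do to a single document. `applyUpdate` is a hypothetical helper written for illustration: it is not the server's implementation, handles only top-level fields, and says nothing about atomicity.

```javascript
// Hypothetical, simplified model of three update operators on one document.
// Handles top-level fields only; not how the MongoDB server is implemented.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // work on a deep copy

  // $pushAll: append every listed value to an array field, creating it if absent
  for (const [field, values] of Object.entries(update.$pushAll || {})) {
    result[field] = (result[field] || []).concat(values);
  }
  // $set: overwrite (or create) the field with the given value
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  // $inc: add the amount to a numeric field, treating a missing field as 0
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount;
  }
  return result;
}

const before = { username: "ross" };
const after = applyUpdate(before, {
  $pushAll: { tags: ["superuser", "db_admin"] },
  $set: { address: { street: "Scrutton Street", city: "London" } },
  $inc: { tag_count: 2 },
});

console.log(after);
```

The input values are the ones used in the talk's Update slide; the `before` document is made up.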
  20. Secondary Indexes
        // 1 means ascending, -1 means descending
        > db.users.ensureIndex({username: 1})
        > db.users.find({username: "ross"}).explain()
        // Multi-key indexes
        > db.users.ensureIndex({tags: 1})
        // index nested field
        > db.users.ensureIndex({"address.city": 1})
        // Compound indexes
        > db.users.ensureIndex({ "username": 1, "address.city": 1 })
  21. MongoDB drivers
    • Official support for 13 languages
    • Community drivers for tons more
      – Clojure, R, Lua etc.
    • Drivers connect to mongo servers
    • Drivers translate BSON into native types
    • mongo shell is not a driver, but works like one in some ways
    • Installed using typical means (npm, pecl, gem, pip)
  22. Scala
        case class Book(id: ObjectId, author: String, isbn: String,
                        price: Price, year: Int, tags: Seq[String],
                        title: String, publisher: String,
                        edition: Option[String]) {
          def toDBObject = MongoDBObject(
            "_id" -> id,
            "author" -> author,
            "isbn" -> isbn,
            "price" -> price.toDBObject,
            "publicationYear" -> year,
            "tags" -> tags,
            "title" -> title,
            "publisher" -> publisher,
            "edition" -> edition
          )
        }
  23. Scala
        // Connect to default - localhost, 27017
        scala> val mongoClient = MongoClient()
        mongoClient: com.mongodb.casbah.MongoClient ...
        // Chainable api
        scala> val mongoColl = mongoClient("bookstore")("books")
        mongoColl: com.mongodb.casbah.MongoCollection ...
  24. Scala
        scala> val builder = MongoDBObject.newBuilder
        scala> builder += "foo" -> "bar"
        scala> builder += "x" -> "y"
        scala> builder += ("pie" -> 3.14)
        scala> builder += ("spam" -> "eggs", "mmm" -> "bacon")
        builder.type = com.mongodb.casbah.commons.MongoDBObjectBuilder
        // Return a DBObject
        scala> val newObj = builder.result
        newObj: com.mongodb.casbah.commons.Imports.DBObject = { "foo" : "bar" , "x" : "y" , "pie" : 3.14 , "spam" : "eggs" , "mmm" : "bacon"}
  25. Scala
        scala> val newObj = MongoDBObject("foo" -> "bar",
             |                            "x" -> "y",
             |                            "pie" -> 3.14,
             |                            "spam" -> "eggs")
        newObj: com.mongodb.casbah.commons.Imports.DBObject = { "foo" : "bar" , "x" : "y" , "pie" : 3.14 , "spam" : "eggs"}
  26. Scala
        // MongoClient()(db)(collection) yields a MongoCollection, as on slide 23
        val mongo: MongoCollection = MongoClient()("bookstore")("books")
        def findAll() = for ( book <- mongo.find() ) yield new Book(book)
        findAll().foreach(b => println("<Book> " + b))
  27. Scala
        // MongoClient()(db)(collection) yields a MongoCollection, as on slide 23
        val mongo: MongoCollection = MongoClient()("bookstore")("books")
        val query: DBObject = ("price" $lt 40.00) ++ ("tag" -> "scala")
        mongo.find( query )
  28. Casbah
    • Wraps the Java driver
    • Scala-esque way of interacting with mongo
    • Casbah 2.5.0 coming soon!
  29. Salat
    • Bi-directional Scala case class serialization library
    • Leverages MongoDB's DBObject (target format)
    • Depends on the latest releases of:
      - scalap
      - casbah-core
      - mongo-java-driver
    • https://github.com/novus/salat
  30. Salat - there and back
        case class Alpha(x: String)

        scala> val a = Alpha(x = "Hello world")
        a: com.novus.salat.test.model.Alpha = Alpha(Hello world)
        scala> val dbo = grater[Alpha].asDBObject(a)
        dbo: com.mongodb.casbah.Imports.DBObject = { "_typeHint" : "com.novus.salat.test.model.Alpha" , "x" : "Hello world"}
        scala> val a_* = grater[Alpha].asObject(dbo)
        a_*: com.novus.salat.test.model.Alpha = Alpha(Hello world)
        scala> a == a_*
        res0: Boolean = true
  31. How does that work?
    • A case class instance extends Scala's Product trait, which provides a product iterator over its elements.
    • Salat uses pickled Scala signatures to turn case classes into indexed fields with associated type information.
    • These fields are then serialized or deserialized using the memoized indexed fields with type information.
  32. Salat - Supports
    • Case classes
    • Case classes typed to a trait or an abstract superclass (requires @Salat)
    • Inside a case class constructor
    • Immutable collections: lists, seqs, maps whose key is a String
    • Options
    • Any type handled by BSON encoding hooks
  33. Salat - Limitations
    • classes
    • case classes nested inside an enclosing class or trait (Cake pattern - coming soon)
    • collection support needs to be improved - no Set, Array, mutable Map, etc.
  34. SalatDAO
    Simple way to get started and create your own DAO
    • insert and get back an Option with the id
    • findOne - get back an Option typed to your case class
    • find and get back a Mongo cursor typed to your class
    • iterate, limit, skip and sort
    • update with a query and a case class
    • save and remove case classes
  35. SalatDAO - example
        import com.novus.salat._
        import com.novus.salat.global._

        case class Omega(_id: ObjectId = new ObjectId, z: String, y: Boolean)

        object OmegaDAO extends SalatDAO[Omega, ObjectId](
          collection = MongoConnection()("salat-example")("omega"))
  36. SalatDAO - Insert and find
        scala> val o = Omega(z = "something", y = false)
        o: Omega = Omega(4dac7b3e75e1b63949139c91, something, false)
        scala> val _id = OmegaDAO.insert(o)
        _id: Option[ObjectId] = Some(4dac7b3e75e1b63949139c91)
        scala> val o_* = OmegaDAO.findOne( MongoDBObject("z" -> "something"))
        o_*: Option[Omega] = Some( Omega(4dac7b3e75e1b63949139c91, something, false))
  37. Salat
    • Simplifies serialization to and from mongo
    • SalatDAO provides a quick CRUD interface
    • New version coming soon!
  38. Scala - community drivers
    • ReactiveMongo - asynchronous / non-blocking, by the guys at Zenexity! Stephane Godbillon
    • Hammersmith - Netty + Akka.IO interfaces, strongly functional and callback based
  39. ReactiveMongo
    • ReplicaSet support
    • Authentication support
    • GridFS support (streaming capable)
    • Cursors (providing a stream of documents)
    • Bulk inserts
    • Database commands support
    • Indexing operations
  40. ReactiveMongo
        // select only my documents
        val query = BSONDocument("firstName" -> BSONString("Ross"))
        // get a Cursor[BSONDocument]
        val cursor = collection.find(query)
        // Enumerate this cursor and print a readable version
        cursor.enumerate.apply(Iteratee.foreach { doc =>
          println("found document: " + DefaultBSONIterator.pretty(doc.bsonIterator))
        })
  41. ReactiveMongo
        // select only my documents
        val query = BSONDocument("firstName" -> BSONString("Ross"))
        // get a Cursor[BSONDocument]
        val cursor = collection.find(query)
        // Create a list using the toList helper
        val futurelist = cursor.toList
        futurelist.onSuccess {
          case list =>
            val names = list.map( _.getAs[BSONString]("lastName").get.value)
            println("got names: " + names)
        }
  42. Monger
        (ns my.service.server
          (:require [monger.core :as mg])
          (:import [com.mongodb MongoOptions ServerAddress]))

        ;; localhost, default port
        (mg/connect!)
        ;; set default database using set-db
        (mg/set-db! (mg/get-db "monger-test"))
  43. Monger - Insert
        (connect!)
        (set-db! (monger.core/get-db "monger-test"))

        ;; with explicit document id (recommended)
        (insert "documents" { :_id (ObjectId.) :first_name "John" :last_name "Lennon" })
        ;; with a different write concern
        (insert "documents" { :_id (ObjectId.) :first_name "John" :last_name "Lennon" } WriteConcern/JOURNAL_SAFE)
  44. Monger - Querying
        (connect!)
        (set-db! (monger.core/get-db "monger-test"))

        ;; Query returns com.mongodb.DBObject object
        (mc/find "documents" {:first_name "Ringo"})
  45. Monger - Querying
        (connect!)
        (set-db! (monger.core/get-db "monger-test"))

        ;; Query returns com.mongodb.DBObject object
        (mc/find "documents" {:first_name "Ringo"})
        ;; returns documents as Clojure maps
        (mc/find-maps "documents" {:first_name "Ringo"})
  46. In Summary
    • MongoDB is an agile, scalable NoSQL database
    • If you're coding on the JVM you have many options for interacting with MongoDB
    • Multiple native drivers
    • Some great community drivers
  47. More Information
    Resource                 Location
    MongoDB Downloads        www.mongodb.org/downloads
    Free Online Training     education.10gen.com
    Webinars and Events      www.10gen.com/events
    White Papers             www.10gen.com/white-papers
    Customer Case Studies    www.10gen.com/customers
    Presentations            www.10gen.com/presentations
    Documentation            docs.mongodb.org
    Additional Info          [email protected]