Slide 1

Slide 1 text

Django and mongoDB

Ross Lawley - [email protected]

Slide 2

Slide 2 text

Quick introduction to mongoDB

MongoDB is a scalable, high-performance, open source NoSQL database.

• Document-oriented storage
• Full Index Support
• Replication & High Availability
• Auto-Sharding
• Querying
• Fast In-Place Updates
• Map/Reduce
• GridFS

Slide 3

Slide 3 text

Database Landscape

[Figure: chart plotting "scalability & performance" against "depth of functionality"; labelled items: memcached, key/value, RDBMS]

Slide 4

Slide 4 text

Documents

Blog Post Document

> p = { author: "Ross",
        date: new Date(),
        body: "About MongoDB...",
        tags: ["tech", "databases"]}
> db.posts.save(p)

Slide 5

Slide 5 text

Querying

> db.posts.find()
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Ross",
  date : ISODate("2012-02-02T11:52:27.442Z"),
  body : "About MongoDB...",
  tags : [ "tech", "databases" ] }

Slide 6

Slide 6 text

Secondary Indexes

Create index on any Field in Document

// 1 means ascending, -1 means descending
> db.posts.ensureIndex({author: 1})
> db.posts.find({author: 'Ross'})
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author: "Ross", ... }

Slide 7

Slide 7 text

Query Operators

Conditional Operators
- $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
- $lt, $lte, $gt, $gte

// find posts with any tags
> db.posts.find({tags: {$exists: true }})

// find posts matching a regular expression
> db.posts.find({author: /^ro*/i })

// count posts by author
> db.posts.find({author: 'Ross'}).count()
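
The conditional operators above can be read as predicates applied to each document. The following is a minimal in-memory sketch in Python, not how MongoDB evaluates queries: `matches` and `COMPARISONS` are hypothetical names, only a handful of operators are covered, and MongoDB's array-matching rules are ignored.

```python
# Illustrative only: evaluates a small subset of MongoDB-style query
# documents against plain dicts. `matches` and `COMPARISONS` are made-up
# names, not part of MongoDB or pymongo.
_MISSING = object()  # sentinel for "field not present in the document"

COMPARISONS = {
    "$in": lambda value, arg: value in arg,
    "$gt": lambda value, arg: value > arg,
    "$gte": lambda value, arg: value >= arg,
    "$lt": lambda value, arg: value < arg,
    "$lte": lambda value, arg: value <= arg,
    "$size": lambda value, arg: isinstance(value, list) and len(value) == arg,
}


def matches(document, query):
    """Return True if every condition in `query` holds for `document`."""
    for field, condition in query.items():
        value = document.get(field, _MISSING)
        if not isinstance(condition, dict):
            # Plain equality, e.g. {"author": "Ross"}
            if value != condition:
                return False
            continue
        for operator, arg in condition.items():
            if operator == "$exists":
                if (value is not _MISSING) != bool(arg):
                    return False
            elif value is _MISSING or not COMPARISONS[operator](value, arg):
                return False
    return True


posts = [
    {"author": "Ross", "tags": ["tech", "databases"], "comment_count": 1},
    {"author": "Fred", "comment_count": 0},
]

# Equivalent in spirit to db.posts.find({tags: {$exists: true}})
with_tags = [p["author"] for p in posts if matches(p, {"tags": {"$exists": True}})]
commented = [p["author"] for p in posts if matches(p, {"comment_count": {"$gte": 1}})]
```

In a real deployment the server evaluates the query, ideally using an index, rather than scanning documents in application code like this.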

Slide 8

Slide 8 text

Examine the query plan

> db.posts.find({"author": 'Ross'}).explain()
{
    "cursor" : "BtreeCursor author_1",
    "nscanned" : 1,
    "nscannedObjects" : 1,
    "n" : 1,
    "millis" : 0,
    "indexBounds" : {
        "author" : [
            [ "Ross", "Ross" ]
        ]
    }
}

Slide 9

Slide 9 text

Atomic Operations

$set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit

// Create a comment
> new_comment = { author: "Fred",
                  date: new Date(),
                  body: "Best Post Ever!"}

// Add to post
> db.posts.update({ _id: "..." },
                  {"$push": {comments: new_comment},
                   "$inc": {comment_count: 1}
                  });
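
To make the effect of the update document concrete, here is a small Python sketch of what `$set`, `$inc` and `$push` do to a single document. It is an in-memory illustration under stated assumptions, not MongoDB's implementation: `apply_update` is a hypothetical helper, it returns a modified copy, and the real operation happens atomically on the server.

```python
import copy


def apply_update(document, update):
    """Return a copy of `document` with $set, $inc and $push applied.

    Hypothetical helper for illustration; only these three operators
    are handled.
    """
    result = copy.deepcopy(document)
    for field, value in update.get("$set", {}).items():
        result[field] = value
    for field, amount in update.get("$inc", {}).items():
        # A missing field is treated as 0 before incrementing
        result[field] = result.get(field, 0) + amount
    for field, item in update.get("$push", {}).items():
        # A missing field becomes a new one-element list
        result.setdefault(field, []).append(item)
    return result


post = {"author": "Ross", "body": "About MongoDB..."}
new_comment = {"author": "Fred", "body": "Best Post Ever!"}

updated = apply_update(
    post,
    {"$push": {"comments": new_comment}, "$inc": {"comment_count": 1}},
)
```

Both modifications travel in one update document, so the comment list and its counter change together in a single operation on that post.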

Slide 10

Slide 10 text

Nested Documents

{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Ross",
  date : "Thu Feb 02 2012 11:50:01",
  body : "About MongoDB...",
  tags : [ "tech", "databases" ],
  comments : [{
      author : "Fred",
      date : "Fri Feb 03 2012 13:23:11",
      body : "Best Post Ever!"
  }],
  comment_count : 1
}

Slide 12

Slide 12 text

+

Slide 13

Slide 13 text

We already model data as objects

from django.db import models

class Post(models.Model):
    author = models.CharField(max_length=250)
    title = models.CharField(max_length=250)
    body = models.TextField()
    date = models.DateTimeField('date')
    tags = models.ManyToManyField('Tag')
    comments = models.ManyToManyField('Comment')

class Tag(models.Model):
    text = models.CharField(max_length=250)

class Comment(models.Model):
    author = models.CharField(max_length=250)
    body = models.TextField()
    date = models.DateTimeField('date')

Slide 14

Slide 14 text

In a relational database

post:           id, author, title, body, date
post_tags:      id, post_id, tag_id
tag:            id, text
post_comments:  id, post_id, comment_id
comment:        id, author, body, date

Relationships: post 0..* tag (via post_tags), post 0..* comment (via post_comments)

Slide 15

Slide 15 text

In mongoDB

class Post(models.Model):
    author = models.CharField(max_length=250)
    title = models.CharField(max_length=250)
    body = models.TextField()
    date = models.DateTimeField('date')
    tags = models.ManyToManyField('Tag')
    comments = models.ManyToManyField('Comment')

Slide 16

Slide 16 text

In mongoDB

{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Ross",
  title : "mongoDB and Django can play nice",
  body : "About MongoDB...",
  date : "Thu Feb 02 2012 11:50:01",
  tags : [ "tech", "databases" ],
  comments : [{
      author : "Fred",
      date : "Fri Feb 03 2012 13:23:11",
      body : "Best Post Ever!"
  }]
}
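
The difference between the relational layout and the single document can be sketched in plain Python. This is an illustration only: the row values and the `embed` helper are hypothetical, and the table names follow the relational diagram from the earlier slide. It shows the join work that is needed to rebuild one post from five tables, and that the result is the same shape as the embedded document.

```python
# Sample rows as they might sit in the relational tables (values are
# illustrative, table names follow the relational slide).
post_rows = [{"id": 1, "author": "Ross",
              "title": "mongoDB and Django can play nice",
              "body": "About MongoDB..."}]
tag_rows = [{"id": 1, "text": "tech"}, {"id": 2, "text": "databases"}]
post_tags = [{"post_id": 1, "tag_id": 1}, {"post_id": 1, "tag_id": 2}]
comment_rows = [{"id": 1, "author": "Fred", "body": "Best Post Ever!"}]
post_comments = [{"post_id": 1, "comment_id": 1}]


def embed(post):
    """Join the link tables by hand and return one nested document."""
    tags_by_id = {t["id"]: t["text"] for t in tag_rows}
    comments_by_id = {c["id"]: c for c in comment_rows}

    document = {k: v for k, v in post.items() if k != "id"}
    document["tags"] = [
        tags_by_id[link["tag_id"]]
        for link in post_tags if link["post_id"] == post["id"]
    ]
    document["comments"] = [
        {k: v for k, v in comments_by_id[link["comment_id"]].items() if k != "id"}
        for link in post_comments if link["post_id"] == post["id"]
    ]
    return document


document = embed(post_rows[0])
```

With the embedded layout the post is stored already assembled, so reading it back is a single lookup rather than a set of joins.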

Slide 18

Slide 18 text

Integration Choices

• pymongo
• Pick an ODM
  • MongoEngine
  • MongoKit
  • MongoAlchemy
  • Minimongo
  • DictShield - roll your own
• Django Nonrel

Slide 19

Slide 19 text

pymongo

# Connect to mongodb
import datetime
from pymongo import Connection

connection = Connection()
db = connection.blog

# In the view
post = {"author": "Ross",
        "body": "mongoDB and Django ....",
        "tags": ["mongodb", "django", "pymongo"],
        "date": datetime.datetime.utcnow()}
db.posts.save(post)
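
The slide uses the pymongo API of its time. `Connection` was removed in pymongo 3.0 in favour of `MongoClient`, and `Collection.save` was removed in pymongo 4.0 in favour of calls such as `insert_one`. A hedged sketch of the same view logic against the newer API follows; `build_post`, `save_post` and `get_posts_collection` are hypothetical helper names, and a MongoDB server on the default localhost port is assumed.

```python
import datetime


def build_post(author, body, tags):
    """Build the post document as a plain dict (hypothetical helper)."""
    return {
        "author": author,
        "body": body,
        "tags": list(tags),
        "date": datetime.datetime.now(datetime.timezone.utc),
    }


def save_post(collection, post):
    """Insert `post` into `collection` and return the generated _id."""
    return collection.insert_one(post).inserted_id


def get_posts_collection():
    """Return the `posts` collection of the `blog` database."""
    # Imported lazily so the helpers above can be used without pymongo.
    from pymongo import MongoClient

    client = MongoClient()  # assumes a server on localhost:27017
    return client.blog.posts
```

In a view this might be used as `save_post(get_posts_collection(), build_post("Ross", "mongoDB and Django ....", ["mongodb", "django", "pymongo"]))`. Passing the collection in as an argument keeps the view logic testable without a running server.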

Slide 20

Slide 20 text

pymongo

+ Go native, it's fast and efficient
+ You have to understand how mongoDB works
- You do all the work
- No document validation
- No auto forms generation
- No admin

Slide 21

Slide 21 text

ODM Mappers - in general

+ Familiar to what you're used to
+ Defined schemas
- Too familiar? MongoDB nuances and features hidden
- Slower
- Which library to choose?
- Varying levels of integration with Django

Slide 22

Slide 22 text

ODM - MongoEngine

# settings.py
import mongoengine
mongoengine.connect('blog')

# models.py
import datetime
from mongoengine import (Document, EmbeddedDocument, EmbeddedDocumentField,
                         DateTimeField, ListField, ReferenceField, StringField)

# User is the user document class; its import is not shown on the slide.
class Post(Document):
    title = StringField(max_length=120, required=True)
    body = StringField()
    author = ReferenceField(User)
    date = DateTimeField(default=datetime.datetime.utcnow)
    tags = ListField(StringField(max_length=30))
    comments = ListField(EmbeddedDocumentField("Comment"))

class Comment(EmbeddedDocument):
    body = StringField()
    author = StringField(max_length=120)

Slide 23

Slide 23 text

ODM - MongoEngine

# Usage examples
posts = Post.objects.filter(comments__author="Ross")

# Creating
post, created = Post.objects.get_or_create(**form.cleaned_data)

# Top tags - map / reduce in the background
from operator import itemgetter
freqs = Post.objects.item_frequencies('tags', normalize=True)
tags = sorted(freqs.items(), key=itemgetter(1), reverse=True)[:10]
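
For readers without a MongoEngine setup, the "top tags" calculation can be approximated in plain Python. This is a sketch under assumptions, not MongoEngine's implementation (which runs on the server): this local `item_frequencies` function is a stand-in written for illustration, and it assumes that normalising means dividing each count by the total number of items.

```python
from collections import Counter
from operator import itemgetter


def item_frequencies(documents, field, normalize=False):
    """Count how often each item appears in `field` across `documents`.

    Stand-in for illustration only; list values are counted per element.
    """
    counts = Counter()
    for doc in documents:
        value = doc.get(field, [])
        items = value if isinstance(value, list) else [value]
        counts.update(items)
    if not normalize:
        return dict(counts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {item: count / total for item, count in counts.items()}


posts = [
    {"tags": ["tech", "databases"]},
    {"tags": ["tech", "django"]},
    {"tags": ["tech"]},
    {},  # a post without tags contributes nothing
]

freqs = item_frequencies(posts, "tags", normalize=True)
top_tags = sorted(freqs.items(), key=itemgetter(1), reverse=True)[:10]
```

The final `sorted(...)[:10]` line is the same post-processing step the slide applies to the frequencies returned by MongoEngine.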

Slide 24

Slide 24 text

MongoEngine

+ Familiar
+ Tries to follow Django API where sane
+ Can exist alongside relational databases
+ Special mongoDB field types - listfield, dictfield etc.
+ Some django integration
  * authentication backend
  * session backend
+ Not Django specific
- It's not the Django ORM
- Connect in settings.py
- Monolithic compared to pymongo
- No inbuilt Django forms / admin
  * django-mongotools - views / forms
  * django-mongonaut - for admin

Slide 25

Slide 25 text

Django Nonrel - Django MongoDB Engine

# settings.py
DATABASES = {
    "default": {
        "ENGINE": "django_mongodb_engine",
        "NAME": "blog",
    }
}

# models.py
from django.db import models
from djangotoolbox.fields import EmbeddedModelField, ListField

class Post(models.Model):
    author = models.CharField(max_length=250)
    title = models.CharField(max_length=250)
    body = models.TextField()
    date = models.DateTimeField('date')
    tags = ListField()
    comments = ListField(EmbeddedModelField('Comment'))

Slide 26

Slide 26 text

Django Nonrel - Django MongoDB Engine

+ Full django integration
+ Special mongoDB field types - listfield, dictfield etc.
+ Model forms
+ Admin integration
- Fork of Django 1.3
- Can't fully support Django ORM API
  * joins, transactions, aggregates
- Admin limitations - EmbeddedFields / ListFields
- Can't pick up existing Django Apps and reuse them
- Sometimes confusing as to why you broke it

Slide 27

Slide 27 text

"I *really* want to see this work merged so that django-nonrel can become a plugin and not a fork. But without tests or docs it's just not gonna happen." Jacob Kaplan-Moss django-developers 08/12/2011 Django Nonrel

Slide 28

Slide 28 text

Playing nice

• Growing desire for formal django integration
• You can choose your level of integration
• Consider helping / contributing:
  http://api.mongodb.org/python/current/tools.html
  http://mongoengine.org
  https://github.com/django-nonrel
  http://django-mongodb.org

Slide 29

Slide 29 text

@mongodb

conferences, appearances, and meetups
http://www.10gen.com/events

Facebook | Twitter | LinkedIn
http://bit.ly/mongofb
http://linkd.in/joinmongo

download at mongodb.org

support, training, and this talk brought to you by