Slide 1

Slide 1 text

Lessons learnt building MongoEngine
@RossC0
http://github.com/rozza

Slide 2

Slide 2 text

WHAT IS MONGODB?

A document database
Highly scalable
Developer friendly
http://mongodb.org

In BSON:

    {
      _id : ObjectId("..."),
      author : "Ross",
      date : ISODate("2012-07-05..."),
      text : "About MongoDB...",
      tags : [ "tech", "databases" ],
      comments : [{
        author : "Tim",
        date : ISODate("2012-07-05..."),
        text : "Best Post Ever!"
      }],
      comment_count : 1
    }

Slide 3

Slide 3 text

WHAT IS MONGODB?

In Python:

    {
      "_id" : ObjectId("..."),
      "author" : "Ross",
      "date" : datetime(2012,7,5,10,0),
      "text" : "About MongoDB...",
      "tags" : ["tech", "databases"],
      "comments" : [{
        "author" : "Tim",
        "date" : datetime(2012,7,5,11,35),
        "text" : "Best Post Ever!"
      }],
      "comment_count" : 1
    }

In BSON:

    {
      _id : ObjectId("..."),
      author : "Ross",
      date : ISODate("2012-07-05..."),
      text : "About MongoDB...",
      tags : [ "tech", "databases" ],
      comments : [{
        author : "Tim",
        date : ISODate("2012-07-05..."),
        text : "Best Post Ever!"
      }],
      comment_count : 1
    }

Slide 4

Slide 4 text

Want to know more?
http://education.10gen.com

Slide 5

Slide 5 text

WHY DO YOU EVEN NEED AN ODM?
http://www.flickr.com/photos/51838104@N02/5841690990

Slide 6

Slide 6 text

SCHEMA LESS != CHAOS

MongoDB a good fit
Documents schema in code
Enforces schema
Data validation
Speeds up development
Build tooling off it
Can DRY up code...

Slide 7

Slide 7 text

Inspired by Django's ORM
Supports Python 2.5 - Python 3.3
Originally authored by Harry Marr 2010
I took over development in May 2011
Current release 0.7.5
http://github.com/MongoEngine/mongoengine

Slide 8

Slide 8 text

INTRODUCING MONGOENGINE

    class Post(Document):
        title = StringField(max_length=120, required=True)
        author = ReferenceField('User')
        tags = ListField(StringField(max_length=30))
        comments = ListField(EmbeddedDocumentField('Comment'))

    class Comment(EmbeddedDocument):
        content = StringField()
        name = StringField(max_length=120)

    class User(Document):
        email = StringField(required=True)
        first_name = StringField(max_length=50)
        last_name = StringField(max_length=50)

Slide 9

Slide 9 text

CREATING A MODEL

    class Post(Document):
        title = StringField(max_length=120, required=True)
        author = ReferenceField('User')
        tags = ListField(StringField(max_length=30))
        comments = ListField(EmbeddedDocumentField('Comment'))

Define a class inheriting from Document
Map each field to a defined data type
    strings, ints, binary, files, lists etc.
By default, declared fields are not required
Pass keyword arguments to apply constraints
    e.g. unique, max_length, default values
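
To make the constraint idea concrete, here is a minimal pure-Python sketch of how keyword arguments such as required and max_length could be checked. It is not MongoEngine's implementation: this StringField is a hypothetical stand-in, and ValidationError and validate_document are names invented for the illustration.

```python
class ValidationError(Exception):
    """Raised when a value breaks a field constraint (hypothetical name)."""


class StringField(object):
    """Hypothetical stand-in for a string field with two constraints."""

    def __init__(self, required=False, max_length=None):
        # Fields are optional unless required=True is passed.
        self.required = required
        self.max_length = max_length

    def validate(self, name, value):
        if value is None:
            if self.required:
                raise ValidationError("%s is required" % name)
            return
        if not isinstance(value, str):
            raise ValidationError("%s must be a string" % name)
        if self.max_length is not None and len(value) > self.max_length:
            raise ValidationError(
                "%s is longer than %d characters" % (name, self.max_length))


def validate_document(fields, data):
    """Check every declared field against the supplied data."""
    for name, field in fields.items():
        field.validate(name, data.get(name))


post_fields = {
    "title": StringField(required=True, max_length=120),
    "summary": StringField(max_length=30),  # optional by default
}
validate_document(post_fields, {"title": "mongoengine"})  # passes silently
```

A missing title, or one longer than 120 characters, raises ValidationError in this sketch; a missing summary does not.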

Slide 10

Slide 10 text

INSERTING DATA

    # Pass data into the constructor
    user = User(email="[email protected]", first_name="Ross").save()

    # Create instance and edit / update in place
    post = Post()
    post.title = "mongoengine"
    post.author = user
    post.tags = ['odm', 'mongodb', 'python']
    post.save()

Create an instance of the object
Update its attributes
Call save, insert or update to persist the data

Slide 11

Slide 11 text

QUERYING DATA

    # An `objects` manager is added to every `Document` class
    users = User.objects(email='[email protected]')

    # Querysets are lazy and can be extended as needed
    users.filter(auth=True)

    # Iterating evaluates the queryset
    print([u for u in users])

Documents have a queryset manager (objects) for querying
You can continually extend it
Queryset is evaluated on iteration
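
The lazy behaviour can be sketched in plain Python. LazyQuerySet below is a hypothetical class that filters an in-memory list instead of talking to MongoDB. It also returns a new object from filter() rather than extending the queryset in place, which is a choice made for this sketch and not a claim about how MongoEngine does it.

```python
class LazyQuerySet(object):
    """Collects filters and only applies them when iterated (a sketch)."""

    def __init__(self, rows, filters=None):
        self._rows = rows
        self._filters = dict(filters or {})

    def filter(self, **kwargs):
        # No work is done here: we only remember the extra conditions.
        merged = dict(self._filters)
        merged.update(kwargs)
        return LazyQuerySet(self._rows, merged)

    def __iter__(self):
        # Evaluation happens here, on iteration.
        for row in self._rows:
            if all(row.get(key) == value
                   for key, value in self._filters.items()):
                yield row


rows = [
    {"email": "a@example.com", "auth": True},
    {"email": "b@example.com", "auth": False},
]
users = LazyQuerySet(rows).filter(auth=True)
matches = [user["email"] for user in users]  # iteration evaluates the query
```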

Slide 12

Slide 12 text

6 LESSONS LEARNT

Slide 13

Slide 13 text

LESSON 1: DIVE IN!
http://www.flickr.com/photos/jackace/565857899/

Slide 14

Slide 14 text

PROJECT STALLED

In May 2011
    >200 forks
    >100 issues
    ~50 pull requests
I needed it
Volunteered to help
Started reviewing issues
Supported Harry and community

Slide 15

Slide 15 text

LESSON 2: METACLASSES
http://www.flickr.com/photos/ubique/135848053

Slide 16

Slide 16 text

WHAT'S NEEDED TO MAKE AN ORM?

Instance methods
    validate data
    manipulate data
    convert data to and from MongoDB
Queryset methods
    Finding data
    Bulk changes

Slide 17

Slide 17 text

METACLASSES

    class Document(object):
        __metaclass__ = TopLevelDocumentMetaclass
        ...

    class EmbeddedDocument(object):
        __metaclass__ = DocumentMetaclass
        ...

Needed to:
1. inspect the object inheritance
2. inject functionality into the class

It's surprisingly simple - all we need is: __new__

Slide 18

Slide 18 text

METACLASSES 101

IN: cls, name, bases, attrs
TopLevelDocument -> Document -> python's type
OUT: new class
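
The IN / OUT flow can be sketched as two chained metaclasses. This is an illustration written with Python 3 syntax (metaclass=...) rather than the __metaclass__ attribute used on the slides, and the _class_name and _meta values it sets are assumptions made for the example, not MongoEngine's actual behaviour.

```python
class DocumentMetaclass(type):
    def __new__(cls, name, bases, attrs):
        # IN: attrs can be changed before the class exists.
        attrs["_class_name"] = name
        new_class = super().__new__(cls, name, bases, attrs)
        # OUT: the finished class can be inspected or registered here.
        return new_class


class TopLevelDocumentMetaclass(DocumentMetaclass):
    def __new__(cls, name, bases, attrs):
        # IN: add defaults, then hand over to the next metaclass in the chain.
        attrs.setdefault("_meta", {"collection": name.lower()})
        return super().__new__(cls, name, bases, attrs)


class Document(metaclass=TopLevelDocumentMetaclass):
    pass


class Post(Document):
    # Subclasses go through the same metaclass chain automatically.
    pass
```

Each __new__ receives the same four arguments, adjusts attrs on the way in, and gets the new class back from the layer below on the way out.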

Slide 19

Slide 19 text

METACLASSES

TopLevelDocument -> Document -> python's type

IN:
Creates default meta data
    inheritance rules, id_field, index information, default ordering
Merges in parents' meta
Validation
    abstract flag on an inherited class
    collection set on a subclass
Manipulates the attrs going in

Slide 20

Slide 20 text

METACLASSES

TopLevelDocument -> Document -> python's type

IN:
Merges all fields from parents
Adds in own field definitions
Creates lookups
    _db_field_map
    _reverse_db_field_map
Determines superclasses (for model inheritance)
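
A rough sketch of the field-merging step, in Python 3 syntax. The lookup names _db_field_map and _reverse_db_field_map come from the slide; the Field class, its db_field argument and the _fields attribute are simplified assumptions made for this illustration.

```python
class Field(object):
    """Hypothetical minimal field: only knows its name in the database."""

    def __init__(self, db_field=None):
        self.db_field = db_field


class DocumentMetaclass(type):
    def __new__(cls, name, bases, attrs):
        fields = {}
        # Merge all fields from the parents first.
        for base in bases:
            fields.update(getattr(base, "_fields", {}))
        # Add the fields defined on this class.
        for attr_name, value in attrs.items():
            if isinstance(value, Field):
                if value.db_field is None:
                    value.db_field = attr_name  # default: same as attribute
                fields[attr_name] = value
        # Create the lookups in both directions.
        attrs["_fields"] = fields
        attrs["_db_field_map"] = {
            attr: field.db_field for attr, field in fields.items()}
        attrs["_reverse_db_field_map"] = {
            field.db_field: attr for attr, field in fields.items()}
        return super().__new__(cls, name, bases, attrs)


class Base(metaclass=DocumentMetaclass):
    id = Field(db_field="_id")


class Post(Base):
    title = Field()
```

Here Post ends up with both its own title field and the inherited id field, mapped to the database names title and _id.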

Slide 21

Slide 21 text

METACLASSES

TopLevelDocument -> Document -> python's type

OUT:
Adds in handling for delete rules
    So we can handle deleted References
Adds class to the registry
    So we can load the data into the correct class

Slide 22

Slide 22 text

METACLASSES

TopLevelDocument -> Document -> python's type

OUT:
Builds index specifications
Injects queryset manager
Sets primary key (if needed)

Slide 23

Slide 23 text

LESSONS LEARNT

Spend time learning what is being done and why
Don't continually patch:
    Rewrote the metaclasses in 0.7

Slide 24

Slide 24 text

LESSON 3: STRAYING FROM THE PATH
http://www.flickr.com/photos/51838104@N02/5841690990

Slide 25

Slide 25 text

REWRITING THE QUERY LANGUAGE

    # In pymongo you pass dictionaries to query
    uk_pages = db.page.find({"published": True})

    # In mongoengine
    uk_pages = Page.objects(published=True)

    # pymongo dot syntax to query sub documents
    uk_pages = db.page.find({"author.country": "uk"})

    # In mongoengine we use dunder instead
    uk_pages = Page.objects(author__country='uk')
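
The dunder-to-dot translation can be sketched in a few lines. to_mongo_query is a hypothetical helper, not MongoEngine's API, and it deliberately ignores query operators and field name mapping.

```python
def to_mongo_query(**kwargs):
    """Turn dunder-separated keyword arguments into a dotted query dict."""
    return {key.replace("__", "."): value for key, value in kwargs.items()}


query = to_mongo_query(published=True, author__country="uk")
```

The resulting dictionary has the same shape as the one passed to find() in the pymongo lines above.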

Slide 26

Slide 26 text

REWRITING THE QUERY LANGUAGE

    # Some things are nicer - regular expression search
    db.post.find({'title': re.compile('MongoDB', re.IGNORECASE)})
    Post.objects(title__icontains='MongoDB')  # In mongoengine

    # Chaining is nicer
    db.post.update({"published": False},
                   {"$set": {"published": True}}, multi=True)
    Post.objects(published=False).update(set__published=True)
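
How keyword operators might map onto MongoDB's syntax can be sketched like this. translate_filter and translate_update are hypothetical helpers that handle only the two operators shown on the slide (icontains and set); they are an illustration, not MongoEngine's query transformer.

```python
import re


def translate_filter(**kwargs):
    """Map field__icontains='x' to a case-insensitive regular expression."""
    query = {}
    for key, value in kwargs.items():
        parts = key.split("__")
        if parts[-1] == "icontains":
            field = ".".join(parts[:-1])
            query[field] = re.compile(re.escape(value), re.IGNORECASE)
        else:
            query[".".join(parts)] = value
    return query


def translate_update(**kwargs):
    """Map set__field=value to {'$set': {'field': value}}."""
    update = {}
    for key, value in kwargs.items():
        operator, _, field = key.partition("__")
        update.setdefault("$" + operator, {})[field.replace("__", ".")] = value
    return update


title_query = translate_filter(title__icontains="MongoDB")
publish_update = translate_update(set__published=True)
```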

Slide 27

Slide 27 text

LESSON 4: NOT ALL IDEAS ARE GOOD
http://www.flickr.com/photos/abiding_silence/6951229015

Slide 28

Slide 28 text

CHANGING SAVE

    # In pymongo save replaces the whole document
    db.post.save({'_id': 'my_id', 'title': 'MongoDB', 'published': True})

    # In mongoengine we track changes
    post = Post.objects(_id='my_id').first()
    post.published = True
    post.save()

    # Results in:
    db.post.update({'_id': 'my_id'}, {'$set': {'published': True}})

Slide 29

Slide 29 text

CHANGING SAVE

Any field changes are noted
How to monitor lists and dicts?
    Custom List and Dict classes
    Observe changes and mark as dirty
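
Noting simple field changes can be sketched with __setattr__. TrackedDocument and pending_update are names invented for this illustration; it builds the update document instead of sending it to MongoDB, and it does not cover lists or dicts.

```python
class TrackedDocument(object):
    """Remembers which top-level fields were assigned after loading."""

    def __init__(self, **data):
        # Bypass our own __setattr__ so the initial data is not marked dirty.
        object.__setattr__(self, "_data", dict(data))
        object.__setattr__(self, "_changed", set())

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._data[name] = value
        self._changed.add(name)  # note the change

    def pending_update(self):
        """Return a $set document covering only the changed fields."""
        return {"$set": {name: self._data[name]
                         for name in sorted(self._changed)}}


post = TrackedDocument(title="MongoDB", published=False)
post.published = True
update = post.pending_update()
```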

Slide 30

Slide 30 text

HOW IT WORKS

    class Post(Document):
        title = StringField(max_length=120, required=True)
        author = ReferenceField('User')
        tags = ListField(StringField(max_length=30))
        comments = ListField(EmbeddedDocumentField('Comment'))

    class User(Document):
        email = StringField(required=True)
        first_name = StringField(max_length=50)
        last_name = StringField(max_length=50)

    class Comment(EmbeddedDocument):
        content = StringField()
        name = StringField(max_length=120)

Slide 31

Slide 31 text

HOW IT WORKS

Post - comments: comment, comment, comment

    post = Post.objects.first()
    post.comments[1].name = 'Fred'
    post.save()

Slide 32

Slide 32 text

HOW IT WORKS

Post - comments: comment, comment, comment

    post.comments[1].name = 'Fred'

1. Convert the comments data to a BaseList
   BaseList stores the instance and name / location

Slide 33

Slide 33 text

HOW IT WORKS

Post - comments: comment, comment, comment

    post.comments[1].name = 'Fred'

2. Convert the comment data to a BaseDict
   sets name as: 'comments.1'

Slide 34

Slide 34 text

HOW IT WORKS

Post - comments: comment, comment, comment

    post.comments[1].name = 'Fred'

3. Change name to "Fred"
4. Tell Post 'comments.1.name' has changed

Slide 35

Slide 35 text

HOW IT WORKS

Post - comments: comment, comment, comment

    post.save()

5. On save()
   Iterate all the changes on post and run $set / $unset queries

    db.post.update(
        {'_id': 'my_id'},
        {'$set': {'comments.1.name': 'Fred'}}
    )
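
Steps 1 to 5 can be sketched with two small wrapper classes. TrackedList and TrackedItem are hypothetical stand-ins for the custom list and dict classes, mark_changed is an invented method name, and save() here returns the update document instead of sending it to MongoDB. The sketch only handles non-negative integer indexes and $set.

```python
class TrackedItem(object):
    """Wraps one comment dict and knows its dotted location in the post."""

    def __init__(self, data, owner, path):
        object.__setattr__(self, "_data", data)
        object.__setattr__(self, "_owner", owner)
        object.__setattr__(self, "_path", path)

    def __getattr__(self, name):
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._data[name] = value  # step 3: make the change
        # Step 4: tell the owning post which path has changed.
        self._owner.mark_changed("%s.%s" % (self._path, name), value)


class TrackedList(list):
    """Step 1: a list that stores the owning instance and its field name."""

    def __init__(self, items, owner, name):
        super().__init__(items)
        self._owner = owner
        self._name = name

    def __getitem__(self, index):
        item = super().__getitem__(index)
        # Step 2: wrap the item and give it a location such as 'comments.1'.
        return TrackedItem(item, self._owner, "%s.%d" % (self._name, index))


class Post(object):
    def __init__(self, comments):
        self._changes = {}
        self.comments = TrackedList(comments, self, "comments")

    def mark_changed(self, path, value):
        self._changes[path] = value

    def save(self):
        # Step 5: turn the recorded changes into a single $set update.
        update = {"$set": dict(self._changes)}
        self._changes.clear()
        return update


post = Post([{"name": "Tim"}, {"name": "Tom"}, {"name": "Tam"}])
post.comments[1].name = "Fred"
update = post.save()
```

Every access creates a new wrapper object in this sketch, which hints at why this kind of tracking has a cost.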

Slide 36

Slide 36 text

A GOOD IDEA?

+ Makes it easier to use
+ save acts how people think it should
- It's expensive
- Doesn't help people understand MongoDB

Slide 37

Slide 37 text

LESSON 5: MANAGING A COMMUNITY
http://kingscross.the-hub.net/

Slide 38

Slide 38 text

CODERS JUST WANT TO CODE *

Github effect
    >10 django mongoengine projects
    None active on pypi
Little cross project communication

* Side effect of being stalled?

Slide 39

Slide 39 text

REACH OUT

Flask-mongoengine on pypi
    There were 2 different projects
    Now has extra maintainers from the flask-mongorest
Django-mongoengine*
    Spoke to authors of 7 projects and merged their work together to form a single library

* Coming soon!

Slide 40

Slide 40 text

THE COMMUNITY

Not all ideas are good!
Vocal people don't always have great ideas
Travis is great*
    * but you still have to read the pull request
Communities have to be managed
    I've only just started to learn how to herd cats

Slide 41

Slide 41 text

LESSON 6: DON'T BE AFRAID TO ASK
http://www.flickr.com/photos/kandyjaxx/2012468692

Slide 42

Slide 42 text

I NEED HELP ;)

Website
Documentation
Tutorials
Framework support
Core mongoengine

http://twitter.com/RossC0
http://github.com/MongoEngine

Slide 43

Slide 43 text

QUESTIONS?
http://www.flickr.com/photos/9550033@N04/5020799468