Rapid and Scalable Development with MongoDB, PyMongo, and Ming - Rick Copeland, SourceForge

mongodb
November 28, 2011

MongoDallas 2011

This intermediate-level talk will teach you techniques using the popular NoSQL database MongoDB and the Python library Ming to write maintainable, high-performance, and scalable applications. We will cover everything you need to become an effective Ming/MongoDB developer from basic PyMongo queries to high-level object-document mapping setups in Ming.

Transcript

  1. PyMongo: Getting Started (© 2011 Geeknet Inc)

    >>> import pymongo
    >>> conn = pymongo.Connection()
    >>> conn
    Connection('localhost', 27017)
    >>> conn.test
    Database(Connection('localhost', 27017), u'test')
    >>> conn.test.foo
    Collection(Database(Connection('localhost', 27017), u'test'), u'foo')
    >>> conn['test-db']
    Database(Connection('localhost', 27017), u'test-db')
    >>> conn['test-db']['foo-collection']
    Collection(Database(Connection('localhost', 27017), u'test-db'), u'foo-collection')
    >>> conn.test.foo.bar.baz
    Collection(Database(Connection('localhost', 27017), u'test'), u'foo.bar.baz')
  2. PyMongo: Insert / Update / Delete

    >>> db = conn.test
    >>> id = db.foo.insert({'bar': 1, 'baz': [1, 2, {'k': 5}]})
    >>> id
    ObjectId('4e712e21eb033009fa000000')
    >>> db.foo.find()
    <pymongo.cursor.Cursor object at 0x29c7d50>
    >>> list(db.foo.find())
    [{u'bar': 1, u'_id': ObjectId('4e712e21eb033009fa000000'), u'baz': [1, 2, {u'k': 5}]}]
    >>> db.foo.update({'_id': id}, {'$set': {'bar': 2}})
    >>> db.foo.find().next()
    {u'bar': 2, u'_id': ObjectId('4e712e21eb033009fa000000'), u'baz': [1, 2, {u'k': 5}]}
    >>> db.foo.remove({'_id': id})
    >>> list(db.foo.find())
    []
  3. PyMongo: Queries, Indexes

    >>> db.foo.insert([dict(x=x) for x in range(10)])
    [ObjectId('4e71313aeb033009fa00000b'), … ]
    >>> list(db.foo.find({'x': {'$gt': 3}}))
    [{u'x': 4, u'_id': ObjectId('4e71313aeb033009fa00000f')},
     {u'x': 5, u'_id': ObjectId('4e71313aeb033009fa000010')},
     {u'x': 6, u'_id': ObjectId('4e71313aeb033009fa000011')}, …]
    >>> list(db.foo.find({'x': {'$gt': 3}}, {'_id': 0}))
    [{u'x': 4}, {u'x': 5}, {u'x': 6}, {u'x': 7}, {u'x': 8},
     {u'x': 9}]
    >>> list(db.foo.find({'x': {'$gt': 3}}, {'_id': 0})
    ...      .skip(1).limit(2))
    [{u'x': 5}, {u'x': 6}]
    >>> db.foo.ensure_index([
    ...     ('x', pymongo.ASCENDING), ('y', pymongo.DESCENDING)])
    u'x_1_y_-1'
  4. PyMongo and Locking

    One Rule (for now): Avoid JavaScript

    Photo: http://www.flickr.com/photos/lizjones/295567490/
  5. PyMongo: Aggregation et al.

    - You gotta write JavaScript :-( (for now)
    - It's pretty slow (single-threaded JS engine) :-(
    - JavaScript is used by
      - $where in a query
      - .group(key, condition, initial, reduce, finalize=None)
      - .map_reduce(map, reduce, out, finalize=None, …)
    - If you shard, you can get some parallelism across multiple mongod instances with .map_reduce() (and possibly '$where'). Otherwise you're single-threaded.
  6. PyMongo: GridFS

    >>> import gridfs
    >>> fs = gridfs.GridFS(db)
    >>> with fs.new_file() as fp:
    ...     fp.write('The file')
    ...
    >>> fp
    <gridfs.grid_file.GridIn object at 0x2cae910>
    >>> fp._id
    ObjectId('4e727f64eb03300c0b000003')
    >>> fs.get(fp._id).read()
    'The file'

    - Arbitrary data can be stored in the 'fp' object; it's just a Document (but please put it in 'fp.metadata')
      - Mime type
      - Filename
  7. PyMongo: GridFS Versioning

    >>> file_id = fs.put('Moar data!', filename='foo.txt')
    >>> fs.get_last_version('foo.txt').read()
    'Moar data!'
    >>> file_id = fs.put('Even moar data!', filename='foo.txt')
    >>> fs.get_last_version('foo.txt').read()
    'Even moar data!'
    >>> fs.get_version('foo.txt', -2).read()
    'Moar data!'
    >>> fs.list()
    [u'foo.txt']
    >>> fs.delete(fs.get_last_version('foo.txt')._id)
    >>> fs.list()
    [u'foo.txt']
    >>> fs.delete(fs.get_last_version('foo.txt')._id)
    >>> fs.list()
    []
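
The version numbering in the session above (-1 is the newest version, -2 the one before it, and a filename stays listed until its last version is deleted) can be modelled with a small in-memory class. This is a toy sketch, not GridFS: `VersionedStore` and its `delete_last` method are hypothetical and exist only to mirror the behaviour shown on the slide.

```python
class VersionedStore:
    """Toy in-memory model of filename-based versioning (not GridFS itself)."""

    def __init__(self):
        self._versions = {}  # filename -> list of payloads, oldest first

    def put(self, data, filename):
        # Storing under an existing filename adds a new version.
        self._versions.setdefault(filename, []).append(data)

    def get_version(self, filename, version=-1):
        # Non-negative versions count from the oldest, negative ones from the newest.
        return self._versions[filename][version]

    def get_last_version(self, filename):
        return self.get_version(filename, -1)

    def delete_last(self, filename):
        # Drop the newest version; forget the filename once no versions remain.
        versions = self._versions[filename]
        versions.pop()
        if not versions:
            del self._versions[filename]

    def list(self):
        return sorted(self._versions)


store = VersionedStore()
store.put("Moar data!", filename="foo.txt")
store.put("Even moar data!", filename="foo.txt")
latest = store.get_last_version("foo.txt")
previous = store.get_version("foo.txt", -2)
store.delete_last("foo.txt")
names_after_one_delete = store.list()
store.delete_last("foo.txt")
names_after_two_deletes = store.list()
```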
  8. Agenda

    - Get started with PyMongo
    - Sprinkle in some Ming schemas
    - ORM: When a dict just won't do
  9. Why Ming?

    - Your data has a schema
      - Your database can define and enforce it
      - It can live in your application (as with MongoDB)
      - Nice to have the schema defined in one place in the code
    - Sometimes you need a "migration"
      - Changing the structure/meaning of fields
      - Adding indexes, particularly unique indexes
      - Sometimes lazy, sometimes eager
    - "Unit of work": Queuing up all your updates can be handy
    - Python dicts are nice; objects are nicer
  10. Ming: Engines & Sessions

    >>> import ming.datastore
    >>> ds = ming.datastore.DataStore('mongodb://localhost:27017', database='test')
    >>> ds.db
    Database(Connection('localhost', 27017), u'test')
    >>> session = ming.Session(ds)
    >>> session.db
    Database(Connection('localhost', 27017), u'test')
    >>> ming.configure(**{'ming.main.master': 'mongodb://localhost:27017',
    ...                   'ming.main.database': 'test'})
    >>> ming.Session.by_name('main').db
    Database(Connection(u'localhost', 27017), u'test')
  11. Ming: Define Your Schema

    from ming import collection, schema, Field

    WikiDoc = collection('wiki_page', session,
        Field('_id', schema.ObjectId()),
        Field('title', str, index=True),
        Field('text', str))

    CommentDoc = collection('comment', session,
        Field('_id', schema.ObjectId()),
        Field('page_id', schema.ObjectId(), index=True),
        Field('text', str))
  12. Ming: Define Your Schema… Once more, with feeling

    from ming import Document, Session, Field

    class WikiDoc(Document):
        class __mongometa__:
            session = Session.by_name('main')
            name = 'wiki_page'
            indexes = [('title')]
        title = Field(str)
        text = Field(str)
        …

    - Old declarative syntax continues to exist and be supported, but it's not being actively improved
    - Sometimes nice when you want additional methods/attrs on your document class
  13. Ming: Use Your Schema

    >>> doc = WikiDoc(dict(title='Cats', text='I can haz cheezburger?'))
    >>> doc.m.save()
    >>> WikiDoc.m.find()
    <ming.base.Cursor object at 0x2c2cd90>
    >>> WikiDoc.m.find().all()
    [{'text': u'I can haz cheezburger?', '_id': ObjectId('4e727163eb03300c0b000001'), 'title': u'Cats'}]
    >>> WikiDoc.m.find().one().text
    u'I can haz cheezburger?'
    >>> doc = WikiDoc(dict(tietul='LOL', text='Invisible bicycle'))
    >>> doc.m.save()
    Traceback (most recent call last):
      File "<stdin>", line 1, …
    ming.schema.Invalid: <class 'ming.metadata.Document<wiki_page>'>: Extra keys: set(['tietul'])
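
The "Extra keys" failure above is what application-side schema enforcement buys you: a misspelled field is rejected before it reaches the database. A minimal pure-Python sketch of that idea follows. It is not Ming's implementation; `validate`, `WIKI_SCHEMA`, and this `Invalid` class are hypothetical, and filling missing fields with `None` is a choice made only for this sketch.

```python
class Invalid(Exception):
    """Raised when a document does not match the schema."""


def validate(schema, doc):
    """Check doc against a {field: type} schema and return a validated copy."""
    extra = set(doc) - set(schema)
    if extra:
        raise Invalid("Extra keys: %s" % sorted(extra))
    validated = {}
    for field, expected_type in schema.items():
        value = doc.get(field)  # missing fields become None in this sketch
        if value is not None and not isinstance(value, expected_type):
            raise Invalid("%s: expected %s, got %s" % (
                field, expected_type.__name__, type(value).__name__))
        validated[field] = value
    return validated


WIKI_SCHEMA = {"title": str, "text": str}

ok = validate(WIKI_SCHEMA, {"title": "Cats", "text": "I can haz cheezburger?"})

try:
    validate(WIKI_SCHEMA, {"tietul": "LOL", "text": "Invisible bicycle"})
    error = None
except Invalid as exc:
    error = str(exc)
```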
  14. Ming Bonus: Mongo-in-Memory

    >>> ming.datastore.DataStore('mim://', database='test').db
    mim.Database(test)

    - MongoDB is (generally) fast
      - … except when creating databases
      - … particularly when you preallocate
    - Unit tests like things to be isolated
    - MIM gives you isolation at the expense of speed & scaling
  15. Agenda

    - Get started with PyMongo
    - Sprinkle in some Ming schemas
    - ORM: When a dict just won't do
  16. Ming ORM: Classes and Collections

    from ming import collection, schema, Field
    from ming.orm import (mapper, Mapper, RelationProperty,
                          ForeignIdProperty)

    WikiDoc = collection('wiki_page', session,
        Field('_id', schema.ObjectId()),
        Field('title', str, index=True),
        Field('text', str))

    CommentDoc = collection('comment', session,
        Field('_id', schema.ObjectId()),
        Field('page_id', schema.ObjectId(), index=True),
        Field('text', str))

    class WikiPage(object): pass
    class Comment(object): pass

    # The relation targets are named after the mapped classes defined above.
    ormsession.mapper(WikiPage, WikiDoc, properties=dict(
        comments=RelationProperty('Comment')))
    ormsession.mapper(Comment, CommentDoc, properties=dict(
        page_id=ForeignIdProperty('WikiPage'),
        page=RelationProperty('WikiPage')))
  17. Ming ORM: Classes and Collections (declarative)

    class WikiPage(MappedClass):
        class __mongometa__:
            session = main_orm_session
            name = 'wiki_page'
            indexes = ['title']

        _id = FieldProperty(S.ObjectId)
        title = FieldProperty(str)
        text = FieldProperty(str)
        comments = RelationProperty('Comment')

    class Comment(MappedClass):
        class __mongometa__:
            session = main_orm_session
            name = 'comment'
            indexes = ['page_id']

        _id = FieldProperty(S.ObjectId)
        page_id = ForeignIdProperty(WikiPage)
        page = RelationProperty(WikiPage)
        text = FieldProperty(str)
  18. Ming ORM: Sessions and Queries

    - Session → ORMSession
    - My_collection.m… → My_mapped_class.query…
    - ORMSession actually does stuff
      - Track object identity
      - Track object modifications
      - Unit of work flushing all changes at once

    >>> pg = WikiPage(title='MyPage', text='is here')
    >>> session.db.wiki_page.count()
    0
    >>> main_orm_session.flush()
    >>> session.db.wiki_page.count()
    1
  19. Ming ORM: Extending the Session

    - Various plug points in the session
      - before_flush
      - after_flush
    - Some uses
      - Logging changes to sensitive data or for analytics purposes
      - Full-text search indexing
      - "Last modified" fields
      - Performance instrumentation
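
One of the listed uses, a "last modified" field, shows how the before_flush and after_flush plug points fit together. The sketch below is a toy under stated assumptions: `HookedSession`, `LastModified`, the hook signatures, and the injected `clock` are all hypothetical and do not reproduce Ming's SessionExtension API.

```python
class HookedSession:
    """Toy session that calls extension hooks around each flush."""

    def __init__(self, collection, extensions=()):
        self.collection = collection  # a list standing in for a MongoDB collection
        self.extensions = list(extensions)
        self._new = []

    def add(self, obj):
        self._new.append(obj)

    def flush(self):
        pending = self._new
        for ext in self.extensions:
            ext.before_flush(pending)  # may modify objects before they are written
        self.collection.extend(pending)
        self._new = []
        for ext in self.extensions:
            ext.after_flush(pending)   # sees what was actually written


class LastModified:
    """Hypothetical extension: stamp pending objects, then log the batch size."""

    def __init__(self, clock):
        self.clock = clock
        self.log = []

    def before_flush(self, pending):
        for obj in pending:
            obj["last_modified"] = self.clock()

    def after_flush(self, flushed):
        self.log.append(len(flushed))


ext = LastModified(clock=lambda: "2011-11-28")  # fixed clock keeps the example deterministic
pages = []
session = HookedSession(pages, extensions=[ext])
session.add({"title": "MyPage"})
session.flush()
```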
  20. Ming ORM: Extending the Mapper

    - Various plug points in the mapper
      - before_/after_:
        - Insert
        - Update
        - Delete
        - Remove
    - Some uses
      - Collection/model-specific logging (user creation, etc.)
      - Anything you might want a SessionExtension for but would rather do per-model
  21. Related Projects

    - Ming: http://sf.net/projects/merciless/ (MIT License)
    - Zarkov: http://sf.net/p/zarkov/ (Apache License)
    - Allura: http://sf.net/p/allura/ (Apache License)
    - PyMongo: http://api.mongodb.org/python (Apache License)