rick446
April 20, 2012

Rapid and Scalable Development with MongoDB, PyMongo, and Ming

This intermediate-level talk will teach you techniques using the popular NoSQL database MongoDB and the Python library Ming to write maintainable, high-performance, and scalable applications. We will cover everything you need to become an effective Ming/MongoDB developer, from basic PyMongo queries to high-level object-document mapping setups in Ming.

Transcript

  1. - Get started with PyMongo
     - Sprinkle in some Ming schemas
     - ODM: When a dict just won’t do
  2. >>> import pymongo
     >>> conn = pymongo.Connection()
     >>> conn
     Connection('localhost', 27017)
     >>> conn.test
     Database(Connection('localhost', 27017), u'test')
     >>> conn.test.foo
     Collection(Database(Connection('localhost', 27017), u'test'), u'foo')
     >>> conn['test-db']
     Database(Connection('localhost', 27017), u'test-db')
     >>> conn['test-db']['foo-collection']
     Collection(Database(Connection('localhost', 27017), u'test-db'), u'foo-collection')
     >>> conn.test.foo.bar.baz
     Collection(Database(Connection('localhost', 27017), u'test'), u'foo.bar.baz')
  3. >>> db = conn.test
     >>> id = db.foo.insert({'bar':1, 'baz':[ 1, 2, {'k':5} ] })
     >>> id
     ObjectId('4e712e21eb033009fa000000')
     >>> db.foo.find()
     <pymongo.cursor.Cursor object at 0x29c7d50>
     >>> list(db.foo.find())
     [{u'bar': 1, u'_id': ObjectId('4e712e21eb033009fa000000'), u'baz': [1, 2, {'k': 5}]}]
     >>> db.foo.update({'_id':id}, {'$set': { 'bar':2}})
     >>> db.foo.find().next()
     {u'bar': 2, u'_id': ObjectId('4e712e21eb033009fa000000'), u'baz': [1, 2, {'k': 5}]}
     >>> db.foo.remove({'_id':id})
     >>> list(db.foo.find())
     []

     - Auto-generated _id
     - Cursors are Python generators
     - remove() uses the same query language as find()
  4. >>> db.foo.insert([ dict(x=x) for x in range(10) ])
     [ObjectId('4e71313aeb033009fa00000b'), … ]
     >>> list(db.foo.find({ 'x': {'$gt': 3} }))
     [{u'x': 4, u'_id': ObjectId('4e71313aeb033009fa00000f')},
      {u'x': 5, u'_id': ObjectId('4e71313aeb033009fa000010')},
      {u'x': 6, u'_id': ObjectId('4e71313aeb033009fa000011')}, …]
     >>> list(db.foo.find({ 'x': {'$gt': 3} }, { '_id':0 } ))
     [{u'x': 4}, {u'x': 5}, {u'x': 6}, {u'x': 7}, {u'x': 8}, {u'x': 9}]
     >>> list(db.foo.find({ 'x': {'$gt': 3} }, { '_id':0 } )
     ...     .skip(1).limit(2))
     [{u'x': 5}, {u'x': 6}]
     >>> db.foo.ensure_index([
     ...     ('x', pymongo.ASCENDING), ('y', pymongo.DESCENDING)])
     u'x_1_y_-1'

     - Range query
     - Partial retrieval
     - Compound indexes
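For readers without a MongoDB server at hand, the range query, partial retrieval, and skip/limit steps on this slide can be mirrored in plain Python. This is only a sketch of the semantics, assuming documents are ordinary dicts held in a list; it does not call PyMongo, and the sample `docs` list is made up for illustration.

```python
from itertools import islice

# Stand-in for the ten documents inserted on the slide (the _id values are made up).
docs = [{'_id': i, 'x': i} for i in range(10)]

# Equivalent of the query {'x': {'$gt': 3}}: keep documents whose x exceeds 3.
matched = (d for d in docs if d['x'] > 3)

# Equivalent of the projection {'_id': 0}: return every field except _id.
projected = ({k: v for k, v in d.items() if k != '_id'} for d in matched)

# Equivalent of .skip(1).limit(2): drop one result, then take two.
page = list(islice(projected, 1, 1 + 2))
```

Like a cursor, the two generator expressions do no work until `list()` consumes them.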
  5. - You gotta write JavaScript (for now)
     - It’s pretty slow (single-threaded JS engine)
     - JavaScript is used by:
       - $where in a query
       - .group(key, condition, initial, reduce, finalize=None)
       - .map_reduce(map, reduce, out, finalize=None, …)
     - Sharding gives some parallelism with .map_reduce() (and possibly ‘$where’). Otherwise you’re single-threaded.
     - MongoDB 2.2 with New Aggregation Framework Coming Real Soon Now ™
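The shape of a `.map_reduce(map, reduce, out)` job can be sketched in pure Python as an emit, group, reduce pass. Nothing here talks to MongoDB: `run_map_reduce`, `map_tag_counts`, `reduce_sum`, and the sample documents are illustrative assumptions, and in a real job the map and reduce functions would be written in JavaScript.

```python
from collections import defaultdict

def run_map_reduce(docs, map_fn, reduce_fn):
    """Hypothetical in-memory emulation of a map/reduce pass."""
    groups = defaultdict(list)
    for doc in docs:
        # The map step emits zero or more (key, value) pairs per document.
        for key, value in map_fn(doc):
            groups[key].append(value)
    # The reduce step folds all values emitted for a key into one result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}

def map_tag_counts(doc):
    """Emit (tag, 1) for every tag on a document."""
    for tag in doc.get('tags', []):
        yield tag, 1

def reduce_sum(key, values):
    """Sum the emitted counts for one key."""
    return sum(values)

docs = [
    {'title': 'Cats', 'tags': ['pets', 'lol']},
    {'title': 'Dogs', 'tags': ['pets']},
]
tag_counts = run_map_reduce(docs, map_tag_counts, reduce_sum)
```

The sketch runs every step in one thread, which is the same limitation the slide describes for the server-side JavaScript engine.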
  6. >>> import gridfs
     >>> fs = gridfs.GridFS(db)
     >>> with fs.new_file() as fp:
     ...     fp.write('The file')
     ...
     >>> fp
     <gridfs.grid_file.GridIn object at 0x2cae910>
     >>> fp._id
     ObjectId('4e727f64eb03300c0b000003')
     >>> fs.get(fp._id).read()
     'The file'

     - Python context manager
     - Retrieve file by _id
     - Arbitrary data can be stored in the ‘fp’ object; it’s just a Document (but please put it in ‘fp.metadata’): MIME type, links to other docs, etc.
  7. >>> file_id = fs.put('Moar data!', filename='foo.txt')
     >>> fs.get_last_version('foo.txt').read()
     'Moar data!'
     >>> file_id = fs.put('Even moar data!', filename='foo.txt')
     >>> fs.get_last_version('foo.txt').read()
     'Even moar data!'
     >>> fs.get_version('foo.txt', -2).read()
     'Moar data!'
     >>> fs.list()
     [u'foo.txt']
     >>> fs.delete(fs.get_last_version('foo.txt')._id)
     >>> fs.list()
     [u'foo.txt']
     >>> fs.delete(fs.get_last_version('foo.txt')._id)
     >>> fs.list()
     []

     - Create file by filename
     - “2nd from the last” (version -2)
  8. - Get started with PyMongo
     - Sprinkle in some Ming schemas
     - ODM: When a dict just won’t do
  9. - Your data has a schema
       - Your database can define and enforce it
       - It can live in your application (as with MongoDB)
     - Nice to have the schema defined in one place in the code
     - Sometimes you need a “migration”
       - Changing the structure/meaning of fields
       - Adding indexes, particularly unique indexes
       - Sometimes lazy, sometimes eager
     - “Unit of work”: Queuing up all your updates can be handy
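A lazy migration can be sketched as a function that upgrades each document at the moment it is loaded, rather than rewriting the whole collection up front. The sketch below is pure Python under stated assumptions: the `version` field, the rename of `body` to `text`, and the names `MIGRATIONS` and `load` are all hypothetical, not part of Ming or PyMongo.

```python
def migrate_v1_to_v2(doc):
    """Hypothetical migration: rename the 'body' field to 'text'."""
    doc = dict(doc)                      # never mutate the caller's document
    doc['text'] = doc.pop('body', '')
    doc['version'] = 2
    return doc

# Maps a document version to the function that upgrades it by one step.
MIGRATIONS = {1: migrate_v1_to_v2}
CURRENT_VERSION = 2

def load(doc):
    """Lazily upgrade a document, one version at a time, until it is current."""
    doc = dict(doc)
    # Documents written before versioning existed are treated as version 1.
    while doc.get('version', 1) < CURRENT_VERSION:
        doc = MIGRATIONS[doc.get('version', 1)](doc)
    return doc

old = {'title': 'Cats', 'body': 'I can haz cheezburger?'}
new = load(old)
```

An eager migration would run the same `load` step over every stored document once, then write the results back.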
  10. >>> import ming.datastore
      >>> ds = ming.datastore.DataStore('mongodb://localhost:27017', database='test')
      >>> ds.db
      Database(Connection('localhost', 27017), u'test')
      >>> session = ming.Session(ds)
      >>> session.db
      Database(Connection('localhost', 27017), u'test')
      >>> ming.configure(**{
      ...     'ming.main.master':'mongodb://localhost:27017',
      ...     'ming.main.database':'test'})
      >>> ming.Session.by_name('main').db
      Database(Connection(u'localhost', 27017), u'test')

      - DataStore: Connection + Database
      - ming.configure(): Optimized for config files
  11. from ming import collection, schema, Field
      WikiDoc = collection('wiki_page', session,
          Field('_id', schema.ObjectId()),
          Field('title', str, index=True),
          Field('text', str))
      CommentDoc = collection('comment', session,
          Field('_id', schema.ObjectId()),
          Field('page_id', schema.ObjectId(), index=True),
          Field('text', str))

      - index=True: Index created on import
      - str: Shorthand for schema.String
  12. from ming import Document, Session, Field
      class WikiDoc(Document):
          class __mongometa__:
              session=Session.by_name('main')
              name='wiki_page'
              indexes=[ ('title') ]
          title = Field(str)
          text = Field(str)

      - Old declarative syntax continues to exist and be supported, but it’s not being actively improved
      - Sometimes nice when you want additional methods/attrs on your document class
  13. >>> doc = WikiDoc(dict(title='Cats', text='I can haz cheezburger?'))
      >>> doc.m.save()
      >>> WikiDoc.m.find()
      <ming.base.Cursor object at 0x2c2cd90>
      >>> WikiDoc.m.find().all()
      [{'text': u'I can haz cheezburger?', '_id': ObjectId('4e727163eb03300c0b000001'), 'title': u'Cats'}]
      >>> WikiDoc.m.find().one().text
      u'I can haz cheezburger?'
      >>> doc = WikiDoc(dict(tietul='LOL', text='Invisible bicycle'))
      >>> doc.m.save()
      Traceback (most recent call last):
        File "<stdin>", line 1,
        …
      ming.schema.Invalid: <class 'ming.metadata.Document<wiki_page>'>:
          Extra keys: set(['tietul'])

      - Documents are dict subclasses
      - Exception pinpoints problem
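The "extra keys" failure on this slide comes from checking the document against its declared fields before saving. A minimal pure-Python sketch of that idea follows; it is not Ming's implementation, and the `Invalid` class, `validate` function, and `WIKI_SCHEMA` mapping are hypothetical stand-ins that cover far less than Ming's real schema validation.

```python
class Invalid(Exception):
    """Hypothetical stand-in for a schema validation error."""

def validate(schema, doc):
    """Reject undeclared keys and wrongly typed values; return a clean copy."""
    extra = set(doc) - set(schema)
    if extra:
        raise Invalid('Extra keys: %s' % sorted(extra))
    validated = {}
    for name, expected_type in schema.items():
        value = doc.get(name)            # missing fields default to None
        if value is not None and not isinstance(value, expected_type):
            raise Invalid('%s: expected %s' % (name, expected_type.__name__))
        validated[name] = value
    return validated

WIKI_SCHEMA = {'title': str, 'text': str}

ok = validate(WIKI_SCHEMA, {'title': 'Cats', 'text': 'I can haz cheezburger?'})

try:
    validate(WIKI_SCHEMA, {'tietul': 'LOL', 'text': 'Invisible bicycle'})
    error = None
except Invalid as exc:
    error = str(exc)
```

Naming the offending key in the message is what lets the exception pinpoint a misspelled field such as `tietul`.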
  14. >>> ming.datastore.DataStore('mim://', database='test').db
      mim.Database(test)

      - MongoDB is (generally) fast
        - … except when creating databases
        - … particularly when you preallocate
      - Unit tests like things to be isolated
      - MIM gives you isolation at the expense of speed & scaling
  15. - Get started with PyMongo
      - Sprinkle in some Ming schemas
      - ODM: When a dict just won’t do
  16. from ming import collection, schema, Field
      from ming.odm import (mapper, Mapper,
          RelationProperty,
          ForeignIdProperty)
      WikiDoc = collection('wiki_page', session, … )
      CommentDoc = collection('comment', session, … )
      class WikiPage(object): pass
      class Comment(object): pass
      odmsession.mapper(WikiPage, WikiDoc, properties=dict(
          comments=RelationProperty('Comment')))
      odmsession.mapper(Comment, CommentDoc, properties=dict(
          page_id=ForeignIdProperty('WikiPage'),
          page=RelationProperty('WikiPage')))

      - Plain Old Python Classes
      - Map classes to collection + session
      - “Relations”
  17. class WikiPage(MappedClass):
          class __mongometa__:
              session = main_odm_session
              name='wiki_page'
              indexes = [ 'title' ]
          _id = FieldProperty(S.ObjectId)
          title = FieldProperty(str)
          text = FieldProperty(str)
          comments = RelationProperty('Comment')
  18. - Session → ODMSession
      - My_collection.m… → My_mapped_class.query…
      - ODMSession actually does stuff
        - Track object identity
        - Track object modifications
        - Unit of work flushing all changes at once

      >>> pg = WikiPage(title='MyPage', text='is here')
      >>> session.db.wiki_page.count()
      0
      >>> main_odm_session.flush()
      >>> session.db.wiki_page.count()
      1
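The unit-of-work behaviour shown on this slide (nothing is written until `flush()`) can be sketched without a database. The class below is a hypothetical, much-reduced stand-in for an ODM session: the name `UnitOfWorkSession`, the dict used as a "collection", and the explicit `add()` call are assumptions made for illustration, and it does not track identity or modifications the way the slide says ODMSession does.

```python
class UnitOfWorkSession:
    """Hypothetical sketch: queue new documents and write them all on flush()."""

    def __init__(self, store):
        self.store = store        # a dict standing in for a MongoDB collection
        self._pending = []        # documents created but not yet written

    def add(self, doc):
        """Register a document to be written at the next flush."""
        self._pending.append(doc)

    def flush(self):
        """Write every pending document in one pass and return how many."""
        pending, self._pending = self._pending, []
        for doc in pending:
            self.store[doc['_id']] = dict(doc)
        return len(pending)

store = {}
uow = UnitOfWorkSession(store)
uow.add({'_id': 1, 'title': 'MyPage', 'text': 'is here'})
count_before = len(store)   # nothing has been written yet
flushed = uow.flush()
count_after = len(store)    # the queued document is now stored
```

Batching writes this way gives one obvious place to hook in logging or validation before anything reaches the store.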
  19. - Various plug points in the session
        - before_flush
        - after_flush
      - Some uses
        - Logging changes to sensitive data or for analytics
        - Full-text search indexing
        - “last modified” fields
        - Performance instrumentation
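Two of the uses listed above, "last modified" fields and change logging, can be sketched with a pair of flush hooks. This is a pure-Python illustration under stated assumptions: `HookedSession`, `LastModifiedExtension`, `AuditLogExtension`, and the hook signatures are hypothetical and are not claimed to match Ming's extension API.

```python
class LastModifiedExtension:
    """Hypothetical before_flush hook that stamps every pending document."""
    def __init__(self, clock):
        self.clock = clock               # injected so the stamp is testable

    def before_flush(self, pending):
        stamp = self.clock()
        for doc in pending:
            doc['last_modified'] = stamp

    def after_flush(self, flushed):
        pass

class AuditLogExtension:
    """Hypothetical after_flush hook that records which documents were written."""
    def __init__(self):
        self.log = []

    def before_flush(self, pending):
        pass

    def after_flush(self, flushed):
        self.log.extend(doc['_id'] for doc in flushed)

class HookedSession:
    """Hypothetical session that calls its extensions around each flush."""
    def __init__(self, store, extensions):
        self.store = store               # a dict standing in for a collection
        self.extensions = list(extensions)
        self._pending = []

    def add(self, doc):
        self._pending.append(doc)

    def flush(self):
        for ext in self.extensions:
            ext.before_flush(self._pending)
        flushed, self._pending = self._pending, []
        for doc in flushed:
            self.store[doc['_id']] = dict(doc)
        for ext in self.extensions:
            ext.after_flush(flushed)

store = {}
audit = AuditLogExtension()
stamper = LastModifiedExtension(clock=lambda: 1334880000)  # fixed fake timestamp
hooked = HookedSession(store, [stamper, audit])
hooked.add({'_id': 'page-1', 'title': 'MyPage'})
hooked.flush()
```

Passing the clock in as an argument keeps the timestamp deterministic in tests; production code would pass something like `time.time`.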
  20. - Various plug points in the mapper
        - before_/after_:
          - Insert
          - Update
          - Delete
          - Remove
      - Some uses
        - Collection/model-specific logging (user creation, etc.)
        - Anything you might want a SessionExtension for but would rather do per-model