Slide 1

Slide 1 text

Asynchronous MongoDB with Python & Tornado
A. Jesse Jiryu Davis
[email protected]
emptysquare.net
Tuesday, July 24, 12

Slide 2

Slide 2 text

Agenda
• Async is good
• Why is async hard?
• What is Tornado and how does it work?
• Using Tornado with PyMongo
• Motor, my experimental driver

Slide 3

Slide 3 text

CPU-bound web service
[Diagram: Clients connected to Server over sockets]
• No need for async
• Just spawn one process per core

Slide 4

Slide 4 text

Normal web service
[Diagram: Clients connected to Server over sockets; Server connected to Backend (DB, web service, SAN, …) over a socket]
• Assume backend is slow but high-throughput
• Service is bound by memory

Slide 5

Slide 5 text

What’s async for?
• Minimize resources per connection
• I.e., wait for backend as cheaply as possible
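
A minimal sketch of "waiting cheaply". It uses Python's standard asyncio rather than Tornado, an assumption made only so the example is self-contained. The names slow_backend and serve_many are hypothetical.

```python
import asyncio
import time


async def slow_backend(request_id, delay=0.05):
    # Stand-in for a slow backend call. While it waits, the request
    # holds only a small coroutine object, not a thread or a process.
    await asyncio.sleep(delay)
    return request_id


async def serve_many(n):
    # Start n requests at once and wait for all of them on one thread.
    return await asyncio.gather(*(slow_backend(i) for i in range(n)))


start = time.monotonic()
results = asyncio.run(serve_many(1000))
elapsed = time.monotonic() - start
print(len(results), "requests finished in", round(elapsed, 2), "seconds")
```

Handled one after another, the same 1000 waits would take about 50 seconds; here they overlap on a single thread.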

Slide 6

Slide 6 text

CPU- vs. Memory-bound
[Diagram: a spectrum between CPU-bound and Memory-bound, with the labels Crypto, Chat, and "Most web services?"]

Slide 7

Slide 7 text

HTTP long-polling (“COMET”)
• E.g., chat server, Twitter client
• Async’s killer app
• Short-polling is CPU-bound: tradeoff between latency and load
• Long-polling is memory-bound
• “C10K problem”: kegel.com/c10k.html
• Tornado was invented for this
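
A rough sketch of why long-polling is memory-bound: each waiting client is parked on a pending Future until something happens. It uses standard asyncio instead of Tornado, and the Channel class is hypothetical, not part of any library.

```python
import asyncio


class Channel:
    """Hypothetical in-memory channel; each long-poll parks on a Future."""

    def __init__(self):
        self._waiters = []

    def wait_for_message(self):
        # Called once per long-polling client: returns a Future that
        # stays pending (costing only memory) until publish() is called.
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append(fut)
        return fut

    def publish(self, message):
        # Wake every parked client with the new message.
        waiters, self._waiters = self._waiters, []
        for fut in waiters:
            if not fut.done():
                fut.set_result(message)
        return len(waiters)


async def demo(num_clients=100):
    channel = Channel()
    polls = [channel.wait_for_message() for _ in range(num_clients)]
    await asyncio.sleep(0.01)          # clients sit idle, using no CPU
    woken = channel.publish("hello")   # one event answers all of them
    replies = await asyncio.gather(*polls)
    return woken, replies


woken, replies = asyncio.run(demo())
```

No client polls repeatedly, so CPU load stays flat; the cost is one small object per open connection.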

Slide 8

Slide 8 text

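Why is async hard to code?
[Sequence diagram over time: Client sends a request to Server; Server sends a request to Backend and must store state while it waits; Backend sends a response to Server; Server sends a response to Client]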

Slide 9

Slide 9 text

Ways to store state
[Chart: coding difficulty vs. memory per connection, with two points: Multithreading; Tornado, Node.js]

Slide 10

Slide 10 text

Threads:
    # pseudo-Python
    sock = listen()
    request = parse_http(sock.recv())
    mongo_data = db.collection.find()
    response = format_response(mongo_data)
    sock.sendall(response)
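
A runnable sketch of the same thread-per-connection shape, using only the standard library. The function fake_db_find is a hypothetical stand-in for a blocking database query, and a socketpair stands in for a real client connection.

```python
import socket
import threading


def fake_db_find():
    # Hypothetical stand-in for a blocking database query.
    return ["doc1", "doc2"]


def handle_connection(sock):
    # Each step blocks this thread. The local variables on the thread's
    # stack are where the per-request state is stored.
    request = sock.recv(1024).decode()
    data = fake_db_find()
    response = "%s -> %s" % (request, ",".join(data))
    sock.sendall(response.encode())
    sock.close()


client, server = socket.socketpair()
worker = threading.Thread(target=handle_connection, args=(server,))
worker.start()
client.sendall(b"GET /")

# Read until the handler closes its end of the connection.
chunks = []
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
reply = b"".join(chunks).decode()

worker.join()
client.close()
```

The code reads top to bottom, which is the appeal of threads; the price is a full thread stack held for every connection that is waiting.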

Slide 11

Slide 11 text

Tornado:
    class MainHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        def get(self):
            AsyncHTTPClient().fetch(
                "http://example.com",
                callback=self.on_response)

        def on_response(self, response):
            formatted = format_response(response)
            self.write(formatted)
            self.finish()

Slide 12

Slide 12 text

Tornado IOStream
    class IOStream(object):
        def read_bytes(self, num_bytes, callback):
            # Leading underscore so the attribute does not shadow
            # the read_bytes method itself
            self._read_bytes = num_bytes
            self.read_callback = callback
            io_loop.add_handler(
                self.socket.fileno(),
                self.handle_events,
                events=READ)

        def handle_events(self, fd, events):
            data = self.socket.recv(self._read_bytes)
            self.read_callback(data)

Slide 13

Slide 13 text

Tornado IOLoop
    class IOLoop(object):
        def add_handler(self, fd, handler, events):
            self._handlers[fd] = handler
            # _impl is epoll or kqueue or ...
            self._impl.register(fd, events)

        def start(self):
            while True:
                event_pairs = self._impl.poll()
                for fd, events in event_pairs:
                    self._handlers[fd](fd, events)
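
The IOLoop and IOStream ideas can be tried out with a toy loop built on the standard selectors module. This is a sketch under that assumption, not Tornado's implementation; MiniLoop and read_bytes are hypothetical names.

```python
import selectors
import socket


class MiniLoop:
    """Toy event loop: maps sockets to handlers and dispatches events."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()  # epoll, kqueue, ...
        self._running = False

    def add_handler(self, sock, handler):
        self._selector.register(sock, selectors.EVENT_READ, handler)

    def remove_handler(self, sock):
        self._selector.unregister(sock)

    def stop(self):
        self._running = False

    def start(self):
        self._running = True
        while self._running:
            for key, events in self._selector.select(timeout=1):
                key.data(key.fileobj, events)  # call the stored handler


def read_bytes(loop, sock, num_bytes, callback):
    # Callback-style read: return immediately, and run `callback`
    # once the loop reports that the socket is readable.
    def handle_events(s, events):
        loop.remove_handler(s)
        callback(s.recv(num_bytes))

    loop.add_handler(sock, handle_events)


loop = MiniLoop()
reader, writer = socket.socketpair()
reader.setblocking(False)
received = []


def on_data(data):
    received.append(data)
    loop.stop()


read_bytes(loop, reader, 5, on_data)
writer.sendall(b"hello")
loop.start()
reader.close()
writer.close()
```

The state of the pending read lives in the closure and in the selector's registration, not on a thread stack.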

Slide 14

Slide 14 text

Python, MongoDB, & concurrency
• Threads work great with PyMongo
  – to the extent they work in Python at all
• Tornado works so-so
  – AsyncMongo
    • No replica sets, only first batch, no SON manipulators, no document classes, …
  – PyMongo
    • OK if all your queries are fast
    • Use extra Tornado processes
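
A sketch of why a blocking driver is only "OK if all your queries are fast" inside a single-threaded event loop. It assumes standard asyncio in place of Tornado, with time.sleep standing in for a slow blocking query; all handler names are hypothetical.

```python
import asyncio
import time


async def fast_handler(log):
    log.append("fast done")


async def slow_handler_blocking(log):
    time.sleep(0.1)  # stand-in for a blocking driver call
    log.append("slow done")


async def slow_handler_async(log):
    await asyncio.sleep(0.1)  # stand-in for a non-blocking driver call
    log.append("slow done")


async def serve(slow_handler):
    # Two requests arrive together: the slow one first, then the fast one.
    log = []
    slow = asyncio.create_task(slow_handler(log))
    fast = asyncio.create_task(fast_handler(log))
    await asyncio.gather(slow, fast)
    return log


blocking_log = asyncio.run(serve(slow_handler_blocking))
async_log = asyncio.run(serve(slow_handler_async))
print(blocking_log)
print(async_log)
```

With the blocking call, the fast request cannot finish until the slow one does, because nothing else runs while the loop's only thread is stuck.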

Slide 15

Slide 15 text

Introducing: “Motor”
• Mongo + Tornado
• Experimental
• Might be official in a few months
• Uses Tornado IOLoop and IOStream
• Presents standard Tornado callback API
• Stores state internally with greenlets
http://emptysquare.net/motor

Slide 16

Slide 16 text

Motor
    db = motor.MotorConnection().open_sync().my_database

    class MainHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        def get(self):
            db.collection.insert(
                {'x':1},
                callback=self.inserted)

        def inserted(self, result, error):
            if error:
                raise error
            self.write('OK')
            self.finish()
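
A small sketch of the callback(result, error) convention seen in the handler above. It does not use Motor; fake_insert is a hypothetical stand-in driven by a standard asyncio loop, and in this sketch exactly one of the two callback arguments is not None.

```python
import asyncio


def fake_insert(loop, store, document, callback):
    # Hypothetical stand-in for an asynchronous driver call. It returns
    # immediately and reports later through callback(result, error).
    def complete():
        if document.get("_id") in store:
            callback(None, ValueError("duplicate _id"))
        else:
            store[document["_id"]] = document
            callback(document["_id"], None)

    loop.call_soon(complete)


outcomes = []


def inserted(result, error):
    # The error arrives as an argument: the original call has already
    # returned, so no caller is left on the stack to catch an exception.
    outcomes.append("error: %s" % error if error else "ok: %s" % result)


async def demo():
    loop = asyncio.get_running_loop()
    store = {}
    fake_insert(loop, store, {"_id": 1, "x": 1}, inserted)
    fake_insert(loop, store, {"_id": 1, "x": 2}, inserted)
    await asyncio.sleep(0.01)  # let the loop run the queued callbacks


asyncio.run(demo())
print(outcomes)
```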

Slide 17

Slide 17 text

Motor (with Tornado Tasks!)
    db = motor.MotorConnection().open_sync().mydb

    class MainHandler(tornado.web.RequestHandler):
        @tornado.web.asynchronous
        @gen.engine
        def get(self):
            yield motor.Op(
                db.collection.insert, {'x':1})
            self.write('OK')
            self.finish()
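
A toy illustration of how a generator can replace explicit callbacks. This is not the actual implementation of gen.engine or motor.Op; the names engine, Op, and fake_insert are hypothetical, and fake_insert calls its callback immediately for brevity.

```python
class Op:
    """Hypothetical wrapper: a callback-style function plus its arguments."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def engine(gen_func):
    # Toy decorator: run a generator, resuming it each time the
    # operation it yielded invokes its callback(result, error).
    def wrapper(*args):
        gen = gen_func(*args)

        def step(result=None, error=None):
            try:
                op = gen.throw(error) if error else gen.send(result)
            except StopIteration:
                return
            op.func(*op.args, callback=step)

        step()

    return wrapper


def fake_insert(document, callback):
    # Hypothetical async call following the (result, error) convention.
    if "x" in document:
        callback("inserted %r" % document, None)
    else:
        callback(None, ValueError("missing x"))


log = []


@engine
def handler():
    result = yield Op(fake_insert, {"x": 1})
    log.append(result)
    try:
        yield Op(fake_insert, {})
    except ValueError as exc:
        log.append("caught: %s" % exc)


handler()
print(log)
```

The generator's suspended frame stores the request state between the call and its callback, so the handler reads top to bottom and errors surface as ordinary exceptions.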

Slide 18

Slide 18 text

Demo: “Chirp”
https://github.com/ajdavis/chirp
• Using PyMongo
• Using Motor

Slide 19

Slide 19 text

Questions?
A. Jesse Jiryu Davis
[email protected]
emptysquare.net