Boredom comes to those who wait

rbanffy
September 29, 2011

An introduction to Google's ndb asynchronous datastore API for App Engine applications.


Transcript

  1. Boredom comes to those who wait

    Asynchronous calls to Datastore Plus
  2. What you should know

    • toplevels/tasklets
    • futures
    • yield (you are going to build lots of generators)
    • you'll use yield instead of return (most of the time)
    • don't think about threads
    • ..._async
    • the query API changes
    • an insanely great use for comparator overrides
  3. How it used to be done

        class Fact(model.Model):
            text = model.TextProperty()
            rating = model.FloatProperty(default=400.)
            random_index = model.ComputedProperty(
                lambda self: random.randint(0, sys.maxint))

        (...)

        for i in range(10):
            Fact(text='Fact %d' % i).put()
  4. Digression: _ah/stats

    • A good way to understand the performance of your apps
    • If you are doing something wrong, it'll become obvious

    Easy to enable on app.yaml:

        builtins:
        (...)
        - appstats: on
  5. The asynchronous way

        futures = []
        for i in range(10):
            futures.append(Fact(text='Fact %d' % i).put_async())
        [f.get_result() for f in futures]

    Gives the opportunity to aggregate puts into one large datastore operation (and we don't have to worry about it)
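The aggregation idea on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyFuture`, `ToyBatcher`, and `save_facts` are hypothetical names invented here, not part of ndb, and the "datastore" is just a counter of simulated round trips. It shows why collecting all futures before asking for any result leaves room for a single batched write.

```python
class ToyFuture(object):
    """Hypothetical future: its result is filled in when the batch is flushed."""

    def __init__(self, batcher):
        self.batcher = batcher
        self.done = False
        self.result = None

    def get_result(self):
        # Asking for a result forces any queued writes to be sent.
        if not self.done:
            self.batcher.flush()
        return self.result


class ToyBatcher(object):
    """Hypothetical stand-in for a datastore client that batches writes."""

    def __init__(self):
        self.pending = []     # (future, entity) pairs not yet written
        self.round_trips = 0  # number of simulated datastore calls
        self.next_id = 1

    def put_async(self, entity):
        # Queue the write and return immediately with a future.
        future = ToyFuture(self)
        self.pending.append((future, entity))
        return future

    def flush(self):
        if not self.pending:
            return
        self.round_trips += 1  # one simulated call for the whole batch
        for future, _entity in self.pending:
            future.result = self.next_id
            future.done = True
            self.next_id += 1
        self.pending = []


def save_facts(batcher, count):
    # Same shape as the slide: start every put first, then collect results.
    futures = [batcher.put_async('Fact %d' % i) for i in range(count)]
    return [f.get_result() for f in futures]
```

Calling `save_facts(ToyBatcher(), 10)` in this toy model performs ten puts with one simulated round trip. Calling `get_result()` right after each `put_async()` would instead flush ten times, which is the synchronous pattern from the earlier slide.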
  6. Better: toplevel/tasklet

        @context.toplevel
        (decorating something, usually a request handler, that will call the following tasklet)

        @tasklets.tasklet
        def init_facts():
            futures = []
            for i in range(10):
                futures.append(Fact(text='Fact %d' % i).put_async())
            yield futures

    Yield allows the toplevel event loop to manage other generators making asynchronous calls
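How a yield hands control back to a loop that manages other generators can be sketched in plain Python. This is a minimal illustration, not ndb's event loop: `run_toplevel` and `toy_tasklet` are hypothetical names, and the yielded strings stand in for the futures a real tasklet would yield.

```python
from collections import deque


def run_toplevel(*generators):
    """Round-robin over generators; every yield returns control to this loop."""
    log = []
    ready = deque(generators)
    while ready:
        gen = ready.popleft()
        try:
            step = next(gen)   # run the generator until its next yield
        except StopIteration:
            continue           # this generator is finished, drop it
        log.append(step)
        ready.append(gen)      # let the other generators run before resuming
    return log


def toy_tasklet(name, calls):
    for i in range(calls):
        # In ndb this would be a yield on the future of an ..._async call.
        yield '%s waits on call %d' % (name, i)
```

Running `run_toplevel(toy_tasklet('a', 2), toy_tasklet('b', 2))` interleaves the two generators: each one advances to its next yield and then waits while the other gets a turn, with no threads involved.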
  7. … even better

        @context.toplevel
        (decorating your handler)

        @tasklets.tasklet
        def init_facts():
            futures = []
            for i in range(10):
                futures.append(Fact(text='Fact %d' % i).put_async())
            raise tasklets.Return(futures)

    Because it's considered polite to raise things when a generator has nothing else to generate
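The reason for raising instead of returning is that a Python 2 generator cannot `return` a value, so the result has to leave the generator some other way. The sketch below shows the general mechanism with a hypothetical `Return` class and a hypothetical `run` driver. It is not ndb's implementation, and `Return` is an ordinary `Exception` subclass here so the example also runs on current Python versions.

```python
class Return(Exception):
    """Hypothetical stand-in for tasklets.Return; carries the final value."""

    def __init__(self, value):
        Exception.__init__(self, value)
        self.value = value


def run(gen):
    """Drive a generator, echoing each yielded value back in as its 'result'."""
    try:
        value = next(gen)
        while True:
            value = gen.send(value)
    except Return as finished:
        return finished.value  # the generator handed back a result
    except StopIteration:
        return None            # the generator ended without a result


def add_async(a, b):
    x = yield a  # pretend each yield waits on a future
    y = yield b
    raise Return(x + y)
```

With these definitions, `run(add_async(2, 3))` evaluates to 5: the driver catches the exception and unwraps the value, which is roughly the service a tasklet decorator provides to its caller.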
  8. ab -n 10000 -c 50 (synchronous)

        Connection Times (ms)
                      min  mean[+/-sd] median   max
        Connect:      140   159   69.1    145    976
        Processing:   338  7408 5452.2   6231  46247
        Waiting:      338  7407 5452.2   6230  46247
        Total:        482  7567 5442.4   6377  46401

        Percentage of the requests served within a certain time (ms)
          50%   6377
          66%   8540
          75%  10131
          80%  11068
          90%  13419
          95%  16077
          98%  23883
          99%  30173
         100%  46401 (longest request)
  9. ab -n 10000 -c 50 (asynchronous)

        Connection Times (ms)
                      min  mean[+/-sd] median   max
        Connect:      140   669 1375.6    151  21193
        Processing:   189   338  300.0    256  15320
        Waiting:      189   335  243.7    255   4143
        Total:        332  1007 1407.6    438  21450

        Percentage of the requests served within a certain time (ms)
          50%    438
          66%    565
          75%    732
          80%   1272
          90%   3372
          95%   3456
          98%   3762
          99%   9366
         100%  21450 (longest request)
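To put the two benchmark runs side by side, the ratios can be computed directly from the figures on the slides. The numbers below are copied from the two `ab` outputs; the dictionary keys and the `speedups` helper are names chosen here for illustration.

```python
# Figures in milliseconds, taken from the two ab runs above.
sync_ms = {
    'processing_mean': 7408,
    'total_mean': 7567,
    'total_median': 6377,
    'p99': 30173,
}
async_ms = {
    'processing_mean': 338,
    'total_mean': 1007,
    'total_median': 438,
    'p99': 9366,
}


def speedups(before, after):
    """Return before/after ratios, rounded to one decimal place."""
    return dict((key, round(before[key] / float(after[key]), 1))
                for key in before)


ratios = speedups(sync_ms, async_ms)
# ratios['processing_mean'] == 21.9
# ratios['total_mean'] == 7.5
# ratios['total_median'] == 14.6
# ratios['p99'] == 3.2
```

The mean total improves less than the mean processing time because the mean connect time rose from 159 ms to 669 ms in the asynchronous run.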
  10. Wrong

        for f in Fact.query():
            f.rating = random.normalvariate(400, 20)
            f.put()
  11. Right: map_async

        @tasklets.tasklet
        def randomize_rating(f):
            f.rating = random.normalvariate(400, 20)
            raise tasklets.Return(f.put_async())

        @context.toplevel
        def randomize_all():
            Fact.query().map_async(randomize_rating)
  12. What else should I know?

    • context and its event loop
    • caches
    • new datatypes
    • new names for old types
    • repeated = True
    • StructuredProperty, LocalStructuredProperty
    • compress
    • shorter response times and more efficient instance usage
  13. Where do I find it?

    • official builds
    • http://code.google.com/p/appengine-ndb-experiment/downloads/list
    • "Bleeding" edge
    • hg clone https://code.google.com/p/appengine-ndb-experiment/
    • version 0.7 released yesterday
  14. To know more

    • documentation: http://code.google.com/p/appengine-ndb-experiment/
    • Google group: http://groups.google.com/group/appengine-ndb-discuss/
  15. Thanks

    Thanks to the fine people who hang out on the appengine-ndb-discuss group, especially Guido and Vladimir, whose suggestions pointed me in the right direction.