ndb

spicyj

May 28, 2014

Transcript

  1. Why ndb?

     1. Less stupid by default
     2. More flexible queries
     3. Tasklets with autobatching
  2. Less stupid by default

     With db:

         class UserVideo(db.Model):
             user_id = db.StringProperty()
             video = db.ReferenceProperty(Video)

         user_video = UserVideo.get_for_video_and_user_data(
             video, user_data)
         return jsonify(user_video)  # slow
  3. Less stupid by default

     With ndb:

         class UserVideo(ndb.Model):
             user_id = ndb.StringProperty()
             video = ndb.KeyProperty(kind=Video)

         user_video = UserVideo.get_for_video_and_user_data(
             video, user_data)
         return jsonify(user_video)  # not slow!
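The slides do not say why the db version is slow. A likely reading is that db.ReferenceProperty fetches the referenced entity whenever the attribute is read, so serializing the model triggers an extra datastore read, while ndb.KeyProperty only holds the key. The toy model below sketches that difference in plain Python. FakeDatastore, LazyReference, to_dict and both UserVideo classes are hypothetical stand-ins, not the real db or ndb APIs.

```python
class FakeDatastore:
    """In-memory stand-in for the datastore that counts reads."""

    def __init__(self):
        self.entities = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.entities[key]


class LazyReference:
    """Descriptor that fetches the referenced entity on every attribute read."""

    def __init__(self, store):
        self.store = store

    def __set_name__(self, owner, name):
        self.attr = "_" + name + "_key"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self.store.get(getattr(obj, self.attr))

    def __set__(self, obj, key):
        setattr(obj, self.attr, key)


store = FakeDatastore()
store.entities["video:1"] = {"title": "Intro to algebra"}


class DbStyleUserVideo:
    video = LazyReference(store)  # reading .video hits the store

    def __init__(self, user_id, video_key):
        self.user_id = user_id
        self.video = video_key


class NdbStyleUserVideo:
    def __init__(self, user_id, video_key):
        self.user_id = user_id
        self.video = video_key  # just a key; any fetch must be explicit


def to_dict(obj, fields):
    """Stand-in for a serializer that reads every listed field."""
    return {name: getattr(obj, name) for name in fields}


to_dict(DbStyleUserVideo("u1", "video:1"), ["user_id", "video"])
db_reads = store.reads

store.reads = 0
to_dict(NdbStyleUserVideo("u1", "video:1"), ["user_id", "video"])
ndb_reads = store.reads

print(db_reads, ndb_reads)
```

In this model the db-style object costs one hidden read per serialization and the ndb-style object costs none.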
  4. More flexible queries

     ndb lets you build filters using ndb.AND and ndb.OR:

         questions = Feedback.query()
             .filter(Feedback.type == 'question')
             .filter(Feedback.target == video_key)
             .filter(ndb.OR(
                 Feedback.is_visible_to_public == True,
                 Feedback.author_user_id == current_id))
             .fetch(1000)

     Magic happens.
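One way to picture the "magic": a filter tree containing an OR can be expanded into several plain AND queries (disjunctive normal form) whose results are then merged and deduplicated. The sketch below shows that expansion on in-memory dicts. The tuple-based filter format, to_dnf, run_query and the sample data are all made up for illustration and are not ndb's internals.

```python
from itertools import product


def to_dnf(node):
    """Expand a filter tree into a list of AND-only subqueries.

    Each subquery is a list of (field, value) equality tests.
    """
    kind = node[0]
    if kind == "eq":
        return [[(node[1], node[2])]]
    if kind == "or":
        return [conj for child in node[1:] for conj in to_dnf(child)]
    if kind == "and":
        child_dnfs = [to_dnf(child) for child in node[1:]]
        return [sum(combo, []) for combo in product(*child_dnfs)]
    raise ValueError("unknown node type: %r" % (kind,))


def run_query(entities, node):
    """Run every subquery in turn and merge results, skipping duplicates."""
    seen = set()
    results = []
    for conjunction in to_dnf(node):
        for entity in entities:
            matches = all(entity.get(f) == v for f, v in conjunction)
            if matches and entity["key"] not in seen:
                seen.add(entity["key"])
                results.append(entity)
    return results


FEEDBACK = [
    {"key": 1, "type": "question", "target": "v1",
     "is_visible_to_public": True, "author_user_id": "alice"},
    {"key": 2, "type": "question", "target": "v1",
     "is_visible_to_public": False, "author_user_id": "bob"},
    {"key": 3, "type": "question", "target": "v1",
     "is_visible_to_public": False, "author_user_id": "alice"},
    {"key": 4, "type": "answer", "target": "v1",
     "is_visible_to_public": True, "author_user_id": "alice"},
    {"key": 5, "type": "question", "target": "v2",
     "is_visible_to_public": True, "author_user_id": "alice"},
]

query = ("and",
         ("eq", "type", "question"),
         ("eq", "target", "v1"),
         ("or",
          ("eq", "is_visible_to_public", True),
          ("eq", "author_user_id", "alice")))

subqueries = to_dnf(query)
visible_keys = [e["key"] for e in run_query(FEEDBACK, query)]
print(len(subqueries), visible_keys)
```

The single OR with two branches turns into two subqueries here. Entity 1 satisfies both branches but appears only once in the merged result.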
  5. Performance

     The datastore is slow. How can we speed things up?

     - Batch operations together
     - Do things in parallel
     - Avoid the datastore
  6. Tasklets and autobatching

         def get_user_exercise_cache(user_data):
             uec = UEC.get_for_user_data(user_data)
             if not uec:
                 user_exercises = UE.get_all(user_data)
                 uec = UEC.build(user_exercises)
             return uec

         def get_all_uecs(user_datas):
             return map(get_user_exercise_cache, user_datas)
  7. Tasklets and autobatching

         @ndb.tasklet
         def get_user_exercise_cache_async(user_data):
             uec = yield UEC.get_for_user_data_async(user_data)
             if not uec:
                 # A tasklet must yield a future, so this assumes an async
                 # variant of UE.get_all (the slide shows UE.get_all).
                 user_exercises = yield UE.get_all_async(user_data)
                 uec = UEC.build(user_exercises)
             raise ndb.Return(uec)

         @ndb.synctasklet
         def get_all_uecs(user_datas):
             uecs = yield map(get_user_exercise_cache_async, user_datas)
             raise ndb.Return(uecs)
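To make the autobatching idea concrete, here is a minimal sketch of a scheduler that runs several generator-based coroutines side by side and combines their pending reads into one batched call per round. It is a toy under stated assumptions: real ndb uses futures and an event loop, and FakeDatastore, run_all and the tuple keys below are invented for illustration. It also uses Python 3's return inside a generator where the slides use raise ndb.Return.

```python
class FakeDatastore:
    """In-memory stand-in that counts batched read calls."""

    def __init__(self, entities):
        self.entities = entities
        self.batch_calls = 0

    def get_multi(self, keys):
        self.batch_calls += 1
        return [self.entities.get(key) for key in keys]


def run_all(store, generators):
    """Drive every generator to completion, batching the keys they yield."""
    results = [None] * len(generators)
    waiting = []  # (index, generator, key it is waiting on)

    # Start each generator; it runs until its first yield or until it returns.
    for index, gen in enumerate(generators):
        try:
            waiting.append((index, gen, next(gen)))
        except StopIteration as done:
            results[index] = done.value

    while waiting:
        # One batched read serves every coroutine that is currently blocked.
        values = store.get_multi([key for _, _, key in waiting])
        still_waiting = []
        for (index, gen, _), value in zip(waiting, values):
            try:
                still_waiting.append((index, gen, gen.send(value)))
            except StopIteration as done:
                results[index] = done.value
        waiting = still_waiting
    return results


def get_user_exercise_cache(user_id):
    """Same shape as the slide's tasklet: try the cache, else rebuild it."""
    cache = yield ("cache", user_id)
    if cache is None:
        exercises = yield ("exercises", user_id)
        cache = "built from %d exercises" % len(exercises)
    return cache


store = FakeDatastore({
    ("cache", "alice"): "cached for alice",
    ("exercises", "bob"): ["e1", "e2"],
    ("exercises", "carol"): ["e3"],
})
user_ids = ["alice", "bob", "carol"]
caches = run_all(store, [get_user_exercise_cache(u) for u in user_ids])
print(caches, store.batch_calls)
```

Three users need five reads in total, but the scheduler issues only two batched calls: one for all three cache lookups and one for the two rebuilds.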
  8. Mysterious errors

     You heard from Marcia about this gem back in March:

         TypeError: '_BaseValue' object is not subscriptable
  9. Q: What's worse than code that doesn't work at all?

     A: Code that mostly works but breaks in subtle ways.
  10. Secret slowness #1

      Multi-queries, with IN and OR:

          answers = Feedback.query()
              .filter(Feedback.type == 'answer')
              .filter(Feedback.in_reply_to.IN(question_keys))
              .fetch(1000)

      Doesn't run in parallel!
  11. Secret slowness #1

      A not-horribly-slow multi-query:

          answers = Feedback.query()
              .filter(Feedback.type == 'answer')
              .filter(Feedback.in_reply_to.IN(question_keys))
              .order(Feedback.__key__)
              .fetch(1000)
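The deck does not explain why adding .order(Feedback.__key__) helps. One plausible reading, which is an assumption and not something the slides confirm, is that an IN filter becomes one subquery per value, and that a shared sort order lets the per-value result streams be consumed together and merged instead of drained one after another. The sketch below shows only the merge step, on in-memory data. The subquery and multi_query helpers and the sample entities are hypothetical.

```python
import heapq

ANSWERS = [
    {"key": 4, "in_reply_to": "q2"},
    {"key": 1, "in_reply_to": "q1"},
    {"key": 3, "in_reply_to": "q1"},
    {"key": 2, "in_reply_to": "q2"},
    {"key": 5, "in_reply_to": "q3"},
]


def subquery(entities, question_key):
    """One equality subquery, returning its matches sorted by key."""
    matches = [e for e in entities if e["in_reply_to"] == question_key]
    return sorted(matches, key=lambda e: e["key"])


def multi_query(entities, question_keys, limit):
    """Merge the sorted subquery streams and stop once `limit` is reached."""
    streams = [subquery(entities, k) for k in question_keys]
    merged = heapq.merge(*streams, key=lambda e: e["key"])
    results = []
    last_key = None
    for entity in merged:
        # Skip adjacent duplicates; they could appear if one entity
        # matched more than one subquery.
        if entity["key"] == last_key:
            continue
        results.append(entity)
        last_key = entity["key"]
        if len(results) >= limit:
            break
    return results


merged_keys = [e["key"] for e in multi_query(ANSWERS, ["q1", "q2"], limit=3)]
print(merged_keys)
```

Because every stream is already sorted by key, the merge can interleave them lazily and stop as soon as the limit is reached.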
  12. Secret slowness #2

      Query iterators:

          query = Feedback.query().filter(
              Feedback.topic_ids == 'algebra')
          questions = []
          for q in query.iter(batch_size=20):
              if q.is_visible_to(user_data):
                  questions.append(q)
                  if len(questions) >= 10:
                      break
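The transcript ends before saying what is slow about this loop. The sketch below models only the mechanics visible on the slide: results arrive in batches of batch_size, and the loop filters in application code until it has enough matches. It counts how many simulated fetches that takes. FakeQuery and first_visible are hypothetical, and whether ndb's real iterator prefetches or overlaps batches is not modeled.

```python
class FakeQuery:
    """Stand-in query whose iterator pulls results one batch at a time."""

    def __init__(self, entities):
        self.entities = entities
        self.round_trips = 0

    def iter(self, batch_size):
        for start in range(0, len(self.entities), batch_size):
            self.round_trips += 1  # one simulated fetch per batch
            for entity in self.entities[start:start + batch_size]:
                yield entity


def first_visible(query, is_visible, wanted, batch_size):
    """Collect the first `wanted` entities that pass an in-memory filter."""
    found = []
    for entity in query.iter(batch_size=batch_size):
        if is_visible(entity):
            found.append(entity)
            if len(found) >= wanted:
                break
    return found


def is_visible(entity):
    # Pretend only every fifth entity is visible to this user.
    return entity % 5 == 0


small_batches = FakeQuery(list(range(200)))
questions = first_visible(small_batches, is_visible, wanted=10, batch_size=20)

large_batches = FakeQuery(list(range(200)))
first_visible(large_batches, is_visible, wanted=10, batch_size=100)

print(small_batches.round_trips, large_batches.round_trips)
```

In this model the tenth visible entity sits at position 45, so batches of 20 need three fetches while a batch of 100 needs one, at the cost of loading entities that are never used.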