La historia de todo lo que pudo salir mal... pero salió bien

Jorge Bastida
November 24, 2013

PyconES talk about the many things we learned while migrating Streetlife from MongoDB to PostgreSQL and from Flask to Django.

Transcript

  1. La historia de todo lo que pudo salir mal... pero salió bien (The story of everything that could have gone wrong... but went well). Jorge Bastida, @jorgebastida
  2. Hello! Jorge Bastida, @jorgebastida

  3. None
  4. None
  5. None
  6. 100K+ active users

  7. 100K+ active users, 200K+ tasks/day

  8. 100K+ active users, 200K+ tasks/day, 2M+ emails/month

  9. 100K+ active users, 200K+ tasks/day, 2M+ emails/month, 2 backend

  10. 100K+ active users, 200K+ tasks/day, 2M+ emails/month, 2 backend, 1 frontend

  11. 100K+ active users, 200K+ tasks/day, 2M+ emails/month, 2 backend, 1 frontend, 0 ops
  12. 11 employees

  13. 10 months ago...

  14. Python 2.6, Flask, mongodb → Python 2.7, Django, PostgreSQL, Redis, RabbitMQ

  15. ⏳ 3 months

  16. 0 downtime

  17. What we have learned

  18. La historia de todo lo que pudo salir mal... pero salió bien (The story of everything that could have gone wrong... but went well). Jorge Bastida, @jorgebastida
  19. ℹ Why did we migrate?

  20. None
  21. Let’s go!

  22. 1. Planning

  23. Puppet, Start Project, Deploy, Views, Templates, Models, Tasks, Tests, Migration

  24. Gantt

  25. ✒ Proof of concept

  26. Re-plan

  27. 2. Puppet ⚒

  28. None
  29. ⚙ 3. staging

  30.  one line deploys

  31.  $ fab deploy -R staging

  32. fat-fab

          $ fab -l
          Available commands:

              deploy                  Deploy all the things!
              purge_varnish           Purge varnish cache.
              start_maintenance_mode  Start maintenance mode.
              stop_maintenance_mode   Stop maintenance mode.
              track_deploy            Track deploy.
              stop_workers            Stop celery workers.
              start_workers           Start celery workers.
              track_upgrade           Track upgrade.
              restart_webserver       Restart web server.
              restore_db              Restore database.
              run_puppet              Run Puppet.
              server_stats            Get server stats.
              get_celery_logs         Get celery logs from web servers.
              get_log                 Merge and dowload a log file.
              pg_create_db            Create a database.
              pg_create_role          Create a role.
              pg_drop_cluster         Drop Postgresqls cluster
              pg_drop_db              Drop a database.
              pg_grant_db_to_role     Grant db roles.
              start_webserver         Start web server.
              stop_webserver          Restart the web server.
              ...

  33. 4. Migration from day one

  34. 240 minutes

  35. celery canvas (group, chain, chord, map, chunks)

  36. [Diagram] celery canvas ftw. fab migrate <backup>: #1 #2 #3

          migrate_users     #100 #200 #300 #400 #500 #600 ... #n
          migrate_messages  #100 #200 #300 #400 #500 #600 ... #n
          migrate_pages     #100 #200 #300 #400 #500 #600 ... #n
          migrate_sales     #100 #200 #300 #400 #500 #600 ... #n
          migrate_friends   #100 #200 #300 #400 #500 #600 ... #n
          migrate_mail      #100 #200 #300 #400 #500 #600 ... #n
          migrate_ ...      #100 #200 #300 #400 #500 #600 ... #n

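
The chunked layout in slide 36 can be illustrated without Celery. This is a minimal sketch under stated assumptions, not the code from the talk: `migrate_users` is a hypothetical stand-in that only counts rows, the chunk size of 100 mirrors the #100, #200 labels, and a thread pool from the standard library takes the place of a celery group and its workers.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(total, size):
    """Split ids 0..total-1 into half-open (start, stop) ranges of at most `size`."""
    return [(start, min(start + size, total)) for start in range(0, total, size)]


def migrate_users(id_range):
    """Hypothetical stand-in for one migration subtask; returns rows handled."""
    start, stop = id_range
    return stop - start


def run_group(task, ranges, workers=4):
    """Run every chunk of one migration in parallel and collect the results,
    roughly what a celery group followed by a callback would give you."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, ranges))


if __name__ == "__main__":
    ranges = chunk_ranges(total=650, size=100)
    migrated = sum(run_group(migrate_users, ranges))
    print(len(ranges), "chunks,", migrated, "rows migrated")
```

Small chunks keep each task short, so a failed chunk can be retried on its own instead of restarting a whole migrate_* step.
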
  37. ✉ 5. e-mails

  38. django-mail-views + ♥ ✉

  39. BaseEmail

          class BaseEmail(TemplatedHTMLEmailMessageView):
              unsubscribable = False
              mail_list = None
              stats_group = 'other'
              bcc_ratio = 0.01

              @classmethod
              def delay(cls, **kwargs):
                  mailer.delay(cls.__module__, cls.__name__, **kwargs)

              def confirm(self, context):
                  return True

              ...

  40. UserBaseEmail

          class UserBaseEmail(BaseEmail):
              def destination(self, context):
                  return context['user'].email

              def confirm(self, context):
                  checks = [super(UserBaseEmail, self).confirm(context)]
                  checks.append(not context['user'].author.closed_account)
                  checks.append(context['user'].valid_email)
                  return all(checks)

          ActivationEmail.delay(user_id=user.id)

  41. None
  42. None
  43. None
  44. None
  45. None
  46. mailer task

          @celery.task(base=EmailTask)
          def mailer(module, name, **kwargs):
              email_cls = mailer.get_cls(module, name)
              try:
                  email = email_cls(**kwargs)
                  email.send()
              except BotoServerError as error:
                  ...
              except AbortedEmailError as error:
                  log.error(...)
                  graphite.incr('sl.emails.{0}.aborted'.format(email_cls.stats_group))
              else:
                  log.info(...)
                  graphite.incr('sl.emails.{0}.sent'.format(email_cls.stats_group))

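
The `confirm()` chain from slides 39 and 40 can be tried out without Django or Celery. The sketch below is a simplified stand-in written for Python 3, not the talk's implementation: users are plain dicts, `send()` appends to a list instead of sending mail, and returning `False` for an unconfirmed email is an assumption (the slides suggest the real code raises `AbortedEmailError`).

```python
class BaseEmail:
    """Simplified stand-in for the BaseEmail class in the slides (no Django)."""

    def __init__(self, **context):
        self.context = context

    def confirm(self, context):
        # Base rule: every email may be sent unless a subclass objects.
        return True

    def destination(self, context):
        raise NotImplementedError

    def send(self, outbox):
        """Append the destination to `outbox` only if confirm() allows it."""
        if not self.confirm(self.context):
            return False
        outbox.append(self.destination(self.context))
        return True


class UserBaseEmail(BaseEmail):
    def destination(self, context):
        return context["user"]["email"]

    def confirm(self, context):
        user = context["user"]
        checks = [
            super().confirm(context),    # keep the parent's checks
            not user["closed_account"],  # never mail closed accounts
            user["valid_email"],         # skip addresses known to be bad
        ]
        return all(checks)


if __name__ == "__main__":
    outbox = []
    active = {"email": "a@example.com", "closed_account": False, "valid_email": True}
    closed = {"email": "b@example.com", "closed_account": True, "valid_email": True}
    UserBaseEmail(user=active).send(outbox)
    UserBaseEmail(user=closed).send(outbox)
    print(outbox)
```

Because each subclass extends the list of checks returned by its parent, a new rule added to a base class applies to every email built on top of it.
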
  47. 6. Metrics

  48. [Diagram] graphite server: carbon, graphite, statsd (udp 8125), diamond, tcp 80

  49. [Diagram] graphite server (as in slide 48) ☁ db server: diamond, logs; web server: diamond, app logs

  50. [Diagram] graphite server (as in slide 48) ☁ db server: diamond, logs; web server: diamond, app logs; dev env: fab deploy
  51. statsd

          class BaseTask(Task):
              abstract = True

              def __call__(self, *args, **kwargs):
                  graphite.incr('sl.tasks.{0}.start'.format(self.__class__.__name__))
                  with graphite.timer('sl.tasks.{0}'.format(self.__class__.__name__)):
                      return super(BaseTask, self).__call__(*args, **kwargs)

              def on_failure(self, *args, **kwargs):
                  graphite.incr('sl.tasks.{0}.failed'.format(self.__class__.__name__))

              def on_success(self, *args, **kwargs):
                  graphite.incr('sl.tasks.{0}.succeeded'.format(self.__class__.__name__))

  54. varnish.hit varnish.miss

  55. web1.mem web2.mem agg(tasks.*) task1 task2

  56. deploy nginx.connections yesterday(nginx.connections) celery.errors 5XX 4XX

  57. agg(deploys)
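
The `graphite.incr` and `graphite.timer` calls in the slides belong to the team's own wrapper, which the transcript does not show. As a rough idea of what such a wrapper might do, here is a minimal sketch of a client that speaks the statsd line format (`name:value|c` for counters, `name:value|ms` for timers) over UDP port 8125. The class and function names are hypothetical.

```python
import socket
import time
from contextlib import contextmanager


def counter_line(name, value=1):
    """Format a statsd counter packet, e.g. 'sl.tasks.mailer.start:1|c'."""
    return "{0}:{1}|c".format(name, value)


def timer_line(name, millis):
    """Format a statsd timer packet with the duration in whole milliseconds."""
    return "{0}:{1}|ms".format(name, int(millis))


class StatsClient:
    """Hypothetical fire-and-forget client; metrics must never break the app."""

    def __init__(self, host="127.0.0.1", port=8125):
        self.addr = (host, port)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def _send(self, line):
        try:
            self.sock.sendto(line.encode("ascii"), self.addr)
        except OSError:
            pass  # best effort: drop the metric if the network is unavailable

    def incr(self, name, value=1):
        self._send(counter_line(name, value))

    @contextmanager
    def timer(self, name):
        start = time.monotonic()
        try:
            yield
        finally:
            elapsed_ms = (time.monotonic() - start) * 1000
            self._send(timer_line(name, elapsed_ms))


if __name__ == "__main__":
    stats = StatsClient()
    stats.incr("sl.tasks.example.start")
    with stats.timer("sl.tasks.example"):
        sum(range(1000))
```

UDP suits this use because a lost packet or an unreachable metrics server costs a data point, not a failed task.
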

  58. 7. Redis

  59. [Diagram] ✉ open.gif (open), click ☁ redis ☁ postgresql (30m)

  60. Stats (django Model)

          class UserStats(rediscache.CacheableBaseModel):
              user = models.OneToOneField('users.User', related_name='stats', primary_key=True)

              # Regenerable
              messages = models.IntegerField(default=0)
              comments = models.IntegerField(default=0)
              ...

              # Buffers
              last_email_click = models.DateTimeField(null=True)
              last_seen = models.DateTimeField(null=True)
              ...

  61. Stats, Cache (cache “Model”)

          class UserCache(rediscache.ModelCache):
              model = UserStats

              def regenerate(self, user):
                  self.messages = Message.objects.filter(...)
                  self.comments = Comments.objects.filter(...)
                  self.save()

  62. Stats, Cache, User

          class User(models.Model, rediscache.CachedModelMixin):
              cache_type = UserCache
              ...

  63. Stats, Cache, User

          >>> UserCache.regenerate_all()  # every day
          >>> user = User.objects.get(...)
          >>> user.cache
          <users.models.UserCache at 0x44946d0>
          >>> user.cache.messages
          23
          >>> UserCache.flush_all()  # every 30m

      [Diagram] .cache: #1 #2 #3 #4 #n, #1 #2 #3 #4 #n (only required ones).
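
The buffer idea behind `flush_all()` (write often to a fast store, persist periodically, and only for the users that changed) can be sketched with plain dictionaries. This is an illustration under assumptions, not the `rediscache` module from the talk: the class name is hypothetical, one dict stands in for Redis and another for the PostgreSQL table.

```python
class BufferedStats:
    """Keep hot per-user values in a fast store and persist only changed users."""

    def __init__(self):
        self.cache = {}     # stand-in for Redis: user_id -> {field: value}
        self.dirty = set()  # user ids touched since the last flush
        self.database = {}  # stand-in for the PostgreSQL stats table

    def set(self, user_id, field, value):
        """Cheap write path: update the cache and remember the user changed."""
        self.cache.setdefault(user_id, {})[field] = value
        self.dirty.add(user_id)

    def flush_all(self):
        """Persist only the dirty users; return how many rows were written."""
        flushed = 0
        for user_id in sorted(self.dirty):
            self.database.setdefault(user_id, {}).update(self.cache[user_id])
            flushed += 1
        self.dirty.clear()
        return flushed


if __name__ == "__main__":
    stats = BufferedStats()
    stats.set(1, "last_seen", "10:00")
    stats.set(1, "last_seen", "10:05")  # overwrites in the cache, no DB write yet
    stats.set(2, "last_email_click", "10:07")
    print(stats.flush_all(), "rows written")
```

Many updates to the same user between two flushes collapse into a single database write, which is the point of buffering fields such as `last_seen`.
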
  65. 8. Testing: How much? How?

  66. ✉ Notifications (Immediate, Daily, Weekly)

  67. Bots!

  68. [Chart] Messages, Comments, PM, Mentions by time of day (0:00, 4:00, 8:00, 12:00, 16:00, 20:00)

  69. ⚙ Staging, Graphite, Migration from day one

  70. ⚠ Life or Death github.com/sbook/lifeordeath

  71. None
  72. 9. Backups

  73. ▴▾ 9. WAL-E: backup-push, backup-fetch, wal-push, wal-fetch. Compression: lzop. Encryption: gpg.
  74. 10. Varnish: Cache all the things!

  75. [Diagram] Request / Response: Load Balancer (nginx), Varnish (hit, miss), App Server (gunicorn, Django)
  76. 0 downtime

  77. 1. Generate urls. 2. Feed Varnish. 3. Logout everybody. 4. Migrate allthethings!
  78. Planning, ⚙ Puppet, ⚙ Staging, Migration, ✉ E-mails, Metrics, Redis, Testing, Varnish, Backups

    Fabric, celery, django-mail-views, django-redis, lifeordeath, wal-e, graphite
  79. ℹ Thank you very much. Questions?