
PicCollage migration of its ‘likes’ feature


Rails Tuesday

October 21, 2014

Transcript

  1. Problem statement: 18 GB memory exceeded. Added another one, which was
    filled up again only 1.5 months later. Ziplist optimizations? Sharding? Key or range?
  2. Problem statement: Talked to people in the Redis discussion group; no
    good solution. twemproxy is only suitable for caching, Redis Cluster is not production ready, and service providers such as AWS ElastiCache do not provide distributed solutions either. Cost: USD $1 per 10 MB per month.
  3. Solution requirements: Support existing functionality (fortunately, “likes” are simple).
    Distributed (tired of sharding, migration, config updates). Preferably fully managed (minimize human intervention and errors). Cost competitive.
  4. Solution: AWS DynamoDB for likes. AWS Data Pipeline and Elastic
    MapReduce for migration.

    user_id   collage_id  created_at
    11778407  71641331    1412205880.492
    ...       ...         ...
  5. Solution: Redis -> S3 backups -> DynamoDB tables and global
    indices. Several Elastic MapReduce jobs and Hive queries run at the same time whenever the dependencies among them allow. Jobs run on clusters of different sizes, depending on data size, to save costs.
  6. Solution: Multiple sweeps of reads and writes on 100M+ rows
    in three different kinds of storage. Sustained 80,000 writes/sec for 1.5 hours, limited only by the AWS account max quota.
  7. Conclusion: We moved collage likes from Redis to DynamoDB because
    Redis is not the most space- and cost-efficient storage for this case. We used AWS Data Pipeline and AWS EMR to complete a heavy data migration with non-trivial step dependencies in a fast (parallelized), automated, and cost-effective manner.
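
As a rough illustration of the slides above, the sketch below renders the sample row from slide 4 in DynamoDB's attribute-value JSON (where numbers travel as strings) and works out how many writes the sustained rate on slide 6 adds up to. The function names `like_to_item` and `total_writes` are hypothetical, and typing all three columns as DynamoDB numbers is an assumption; the deck does not show the actual table definition or key schema.

```python
from decimal import Decimal


def like_to_item(user_id: int, collage_id: int, created_at: str) -> dict:
    """Render one like in DynamoDB attribute-value JSON.

    Assumption: all three columns are stored as the DynamoDB number type "N".
    DynamoDB's wire format carries numbers as strings, so the timestamp is
    parsed with Decimal to avoid float rounding before being re-serialized.
    """
    return {
        "user_id": {"N": str(user_id)},
        "collage_id": {"N": str(collage_id)},
        "created_at": {"N": str(Decimal(created_at))},
    }


def total_writes(writes_per_sec: int, hours: float) -> int:
    """Total writes performed at a sustained rate over the given duration."""
    return int(writes_per_sec * hours * 3600)


# Sample row taken from slide 4.
item = like_to_item(11778407, 71641331, "1412205880.492")

# Sustained rate and duration taken from slide 6.
writes = total_writes(80_000, 1.5)

print(item)
print(f"{writes:,} writes")
```

At 80,000 writes/sec for 1.5 hours this comes to 432 million writes, which is in line with several sweeps over 100M+ rows, although the deck does not break that total down by table or index.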