DevOps in the wild - Gogobot use case

A talk I gave at a DevOps conference in Israel.

Avi Tzurel

July 18, 2012

Transcript

  1. 9.

    CDN
    • Edge servers across the world (hundreds)
    • Caches static assets
    • CSS, JS, Images
    • Caches full pages (with smart expire API)
    9 Wednesday, July 18, 12
  2. 10.

    Reverse Proxy
    • Super fast connection from edges means users get a local experience, with minimal latency
    • Cache miss? Get from load balancer
    • Logged out traffic almost never hits or affects logged in traffic
    • Scale differently, control differently
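
The cache-miss path on this slide can be sketched roughly as follows. This is a toy in-memory model, not the actual proxy configuration: the `EdgeCache` class, the `origin` callable standing in for the load balancer, and the rule that logged-in requests bypass the shared cache are all assumptions made for illustration.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy stand-in for a reverse-proxy cache in front of a load balancer."""

    def __init__(self, origin: Callable[[str], str]) -> None:
        self._origin = origin            # hypothetical fetch from the load balancer
        self._store: Dict[str, str] = {}  # cached pages for logged-out traffic
        self.origin_hits = 0              # how many requests reached the origin

    def get(self, path: str, logged_in: bool) -> str:
        # Assumption: logged-in requests skip the shared cache, so the two
        # traffic classes do not affect each other.
        if logged_in:
            self.origin_hits += 1
            return self._origin(path)
        # Logged-out request: serve from cache, fetch from origin only on a miss.
        if path not in self._store:
            self.origin_hits += 1
            self._store[path] = self._origin(path)
        return self._store[path]
```

With this shape, repeated logged-out requests for the same path reach the origin once, which is one way the two kinds of traffic could be scaled and controlled separately.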
  3. 13.

    Front End
    • Serves user facing content
    • Communicates with cache layer and the DB
    • No heavy lifting, respond to the user as fast as possible
  4. 14.

    Back End
    • Serves realtime services (Facebook, Twitter)
    • Heavy lifting thrown at it from the Front End Machines
    • Hosts workers for the queue service
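
The front end / back end split can be illustrated with a minimal sketch: the request handler only enqueues work and returns, and a worker drains the queue later. The slides say the real queue service is hosted on Redis; here the standard library `queue.Queue` stands in for it, and the `sync_social_graph` task name and both function names are hypothetical.

```python
import queue
from typing import List

# Stand-in for the Redis-hosted queue service mentioned in the slides.
jobs: queue.Queue = queue.Queue()


def handle_request(user_id: str) -> str:
    """Front end: no heavy lifting, just enqueue and answer immediately."""
    jobs.put({"task": "sync_social_graph", "user_id": user_id})  # hypothetical task
    return f"ok:{user_id}"


def run_worker(q: queue.Queue) -> List[str]:
    """Back end: drain the queue and do the slow work outside the request path."""
    done: List[str] = []
    while True:
        try:
            job = q.get_nowait()
        except queue.Empty:
            return done
        # The real worker would call Facebook/Twitter here; we only record the job.
        done.append(f"{job['task']}:{job['user_id']}")
        q.task_done()
```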
  5. 17.

    Cache
    • Memcached cluster
    • Memcached 1.4+
    • 6+ Machines
    • 4.5 - 15K operations per second
    • Hosted on Amazon ElastiCache
    • Solved tons of problems with memcached dying
  6. 20.

    MySQL
    • Master + 2 Slaves
    • 64G memory for each machine with 400G storage
    • Hourly backups
    • Used as the main persistence layer for the site
    • Reads go to the slaves, writes go to the master
    • Logged out traffic never accesses the master
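
A minimal sketch of the read/write routing rule described on this slide. The `ConnectionRouter` class, the host names, the round-robin choice between slaves, and raising an error on a logged-out write are all assumptions for illustration; the slides do not say how the routing is implemented.

```python
import itertools
from typing import List


class ConnectionRouter:
    """Pick a database host: writes go to the master, reads rotate over slaves."""

    def __init__(self, master: str, slaves: List[str]) -> None:
        self._master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin (assumption)

    def pick(self, is_write: bool, logged_in: bool) -> str:
        if is_write:
            # Logged-out traffic must never reach the master.
            if not logged_in:
                raise PermissionError("logged-out traffic cannot write to the master")
            return self._master
        return next(self._slaves)
```

For example, `ConnectionRouter("master", ["slave-1", "slave-2"]).pick(is_write=False, logged_in=False)` returns a slave host, so anonymous page views never add load to the master.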
  7. 21.

    MySQL
    • EBS snapshots are used as backups
    • Multi region support for Amazon (1a, 1b, 1c)
  8. 22.

    MongoDB
    • 3 Shards
    • 3 Replica sets in each shard
    • 16 HD (100G) RAID on each machine
    • 64G memory for each machine, so indexes are served out of memory for all queries
    • Used for the Graph Engine + Scoring system
  9. 23.

    Redis
    • Used for key+value store
    • Cache tagging solution on top of memcached
    • Queue service is hosted on Redis
    • Master / Slave replica
    • Different Redis clusters for different things
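
The slides do not describe how the cache tagging works, so the following is only one common way to build it: keep a tag-to-keys index (which would live in Redis sets) beside the cached values (which would live in memcached), and invalidate by tag. Plain dictionaries stand in for both stores, and the `TaggedCache` name and its methods are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    """Toy cache with tag-based invalidation."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}                    # stand-in for memcached
        self._tags: Dict[str, Set[str]] = defaultdict(set)   # stand-in for Redis sets

    def set(self, key: str, value: str, tags: Iterable[str]) -> None:
        self._values[key] = value
        for tag in tags:
            self._tags[tag].add(key)  # remember which keys carry this tag

    def get(self, key: str) -> Optional[str]:
        return self._values.get(key)

    def invalidate_tag(self, tag: str) -> int:
        """Drop every cached value carrying `tag`; return how many keys were indexed."""
        keys = self._tags.pop(tag, set())
        for key in keys:
            self._values.pop(key, None)
        return len(keys)
```

The point of the design is that a single change, such as a user updating a profile, can expire every cached fragment that depends on it without knowing the individual cache keys.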
  10. 24.

    Redis
    • Different Redis clusters for each need
    • Indexer
    • Cache tagging
    • Realtime push with Node.js to the client
    • When one is down, the others behave normally
  11. 25.

    SOLR
    • Search index
    • NoSQL (Schema Less)
    • Master/Slave
    • Slave on each app machine, single master
    • Eventual consistency
  12. 26.

    Numbers
    • 400m+ graph users
    • 10K triggers per user in the scoring system
    • Grew ~170X in the last 18 months
    • Announced 1m registered users a month ago
    • Hit 2m registered users 2 weeks after
  13. 33.

    Embed it in your culture
    • Developers should support end users
    • Get Satisfaction for example
    • Bugs / Problems must be engaged early and often
    • Be completely and utterly open
  14. 34.

    Communication
    • Single and agreed line of communication for everything
    • Chatroom through the day
    • Pull requests for code review
    • Email for updates
    • SMS notification for urgent stuff
  15. 48.

    What can he do? (if you ask him nicely) Or sudo
  16. 49.

    Gbot
    • Deploy anything, anywhere, anytime
    • Tell Jokes
    • Remind us about bugs
    • Run custom builds
    • Alert on server issues
    • Cheer us up
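
A chat bot like Gbot typically maps chat messages to handlers. The sketch below shows how a command of the form `<keyword> tweet` might be dispatched. It is not Gbot's actual code: the `COMMANDS` table, both function names, and the placeholder `example.com` URL are hypothetical, and the real bot returns a link to an actual tweet.

```python
from typing import Callable, Dict, Optional
from urllib.parse import quote_plus


def tweet_link(keyword: str) -> str:
    # Placeholder URL shape (assumption); the real lookup is not described.
    return f"https://example.com/tweets?q={quote_plus(keyword)}"


# Hypothetical command table: trailing command word -> handler.
COMMANDS: Dict[str, Callable[[str], str]] = {"tweet": tweet_link}


def handle_message(message: str) -> Optional[str]:
    """Parse '<keyword> <command>' and run the matching handler, if any."""
    keyword, _, command = message.strip().rpartition(" ")
    handler = COMMANDS.get(command.lower())
    if handler is None or not keyword:
        return None  # not a command the bot understands
    return handler(keyword)
```

Adding a new ability, such as a deploy or a custom build, would then mean registering one more entry in the command table.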
  17. 51.

    <keyword> tweet - Returns a link to a tweet about <keyword>
  18. 53.

    Monitor
    • Monit
    • CPU, Memory, Server health
    • God
    • Process lifecycle
  19. 56.

    Measure
    • NewRelic
    • Measure performance of everything
    • Ruby
    • MySQL
    • Memcached
    • External Services
  20. 64.

    Sharing
    • Share your ideas
    • Talk about features, future plans
    • Plan together
    • Open source your brain