#OSS2014

10th International Conference on Open Source Systems

Chris Aniszczyk

May 07, 2014

Transcript

  1. Open Source and The Twitter Stack

    Chris Aniszczyk (@cra)
    http://aniszczyk.org
    10th International Conference on Open Source Systems #OSS2014
  2. What is Twitter?

    Twitter is a public real-time information network that connects you to what you find interesting
    The heart of Twitter: tweets
  3. Open Source Craft (operating principles)

    Use Open
    Assume Open
    Define Secret Sauce
    Measure Everything
    Default to GitHub
    Default to Permissive
    Acquire and Open
    Pay it Forward
  4. Use Open

    Use and benchmark open source software by default. When starting a new initiative, always evaluate open source options before reinventing the wheel. (e.g., if Redis doesn’t work for you, you better have solid evidence)
  5. Define Secret Sauce

    Don’t open source anything that represents a core business value. Define your secret sauce so there’s a shared understanding that can guide decisions. Embed this secret sauce within your culture and company via training.
  6. Assume Open

    Assume that what you are developing will be opened in the future. Pretend the whole world will be watching. Use reasonable third party dependencies to prevent pain down the road. (we mostly use Apache’s Third Party Guidelines as a starting point)
  7. Default to GitHub

    The GitHub community is the largest open source community, with over three million users. You would be stupid to ignore that fact. Embrace social coding tools to lower the barrier to contribution and participation.
  8. Foundations are Good*

    We just prefer not to default to them. We view them as a place for stable projects that grow into maturity, not to incubate new projects. Our goal is to gain traction first as fast as possible. If not, fail fast and carry on.
  9. Be Permissive

    For outbound open source software, we default to OSI permissive licenses (the ALv2 in the majority of cases). We do this to maximize adoption and participation, which we favor over control.
  10. Notes from Antirez (BSD)

    See http://antirez.com/news/48
    “First of all, open source for me is not a way to contribute to the free software movement, but to contribute to humanity. This means a lot of things, for instance I don't care about what people do with my code, nor if they'll release back their modifications. I simply want people to use my code in one way or the other. Especially I want people to have fun, learn new stuff, and make money with my code. For me other people making money out of something I wrote is not something that I lost, it is something that I gained.”
  11. Acquire and Open*

    Include open sourcing software in M&A discussions, especially if you’re mainly acquiring talent or shelving the product. There’s no need for software to go to waste.
  12. Measure Everything

    If you can’t measure what you’re doing, you have no idea what you’re doing. We measure everything inside of Twitter (affectionately called birdbrain) and make it accessible to everyone.
  13. Pay it Forward

    Support open source organizations and projects important to your business; it’s the right and smart thing to do. This can mean funding them or simply staffing projects that are strategic to you.
  14. Open Source Craft*

    Use Open
    Assume Open
    Define Secret Sauce
    Measure Everything
    Default to GitHub
    Default to Permissive
    Acquire and Open
    Pay it Forward
    Note: This fits in a tweet
  15. What was wrong?

    Fragile monolithic Rails code base: it handled everything from managing raw database and memcache connections to rendering the site and presenting the public APIs
    Throwing machines at the problem: instead of engineering solutions
    Trapped in an optimization corner: trading off readability and flexibility for performance
  16. Whale Hunting Expeditions

    We organized archeology digs and whale hunting expeditions to understand large scale failures
  17. Re-envision the system?

    We wanted big infra wins: in performance, reliability and efficiency (reduce machines to run Twitter by 10x)
    Failure is inevitable in distributed systems: we wanted to isolate failures across our infrastructure
    Cleaner boundaries with related logic in one place: desire for a loosely coupled services oriented model at the systems level
  18. Ruby VM Reflection

    Started to evaluate our front end server tier: CPU, RAM and network
    Rails machines were being pushed to the limit: CPU and RAM maxed but not network (200-300 requests/host)
    Twitter’s usage was growing: it was going to take a lot of machines to keep up with the growth curve
  19. JVM Experimentation

    We started to experiment with the JVM...
    Search (Java via Lucene): http://engineering.twitter.com/2010/10/twitters-new-search-architecture.html
    FlockDB: Social Graph (Scala): https://blog.twitter.com/2010/introducing-flockdb https://github.com/twitter/flockdb
    ...and we liked it, enamored by JVM performance!
    We weren’t the only ones either: http://www.slideshare.net/pcalcado/from-a-monolithic-ruby-on-rails-app-to-the-jvm
  20. The JVM Solution

    Level of trust with the JVM from previous experience
    JVM is a mature and world class platform
    Huge mature ecosystem of libraries
    Polyglot possibilities (Java, Scala, Clojure, etc)
  21. Decomposing the Monolith

    Created services based on our core nouns:
    Tweet service
    User service
    Timeline service
    DM service
    Social Graph service
    ....
  22. [Architecture diagram: Routing, Presentation, Logic, Storage]

    Routing: TFE (reverse proxy)
    Presentation: Monorail, API, Web, Search, Feature X, Feature Y
    Logic: Tweet Service, User Service, Timeline Service, SocialGraph Service, DM Service
    Storage: MySQL, Tweet Store, User Store, Flock, Cache (Memcached, Redis)
    Protocols shown between layers: HTTP, THRIFT, THRIFT*
  23. Services: Concurrency is Hard

    Decomposing the monolith: each team took slightly different approaches to concurrency
    Different failure semantics across teams: no consistent back pressure mechanism
    Failure domains informed us of the importance of having a unified client/server library: deal with failure strategies and load balancing
  24. Finagle Programming Model

    Takes care of: service discovery, load balancing, retrying, connection pooling, stats collection, distributed tracing
    Future[T]: modular, composable, async, non-blocking I/O
    http://twitter.github.io/effectivescala/#Concurrency
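
The composable, non-blocking style described on this slide can be sketched roughly as follows. This is not Finagle and not Scala: it is a minimal Python `asyncio` analogy, and the service stubs (`get_user`, `get_tweets`), the `with_retry` helper and all data are hypothetical names invented for illustration.

```python
import asyncio

# Hypothetical service stubs; names and return values are illustrative only.
async def get_user(user_id):
    await asyncio.sleep(0)  # stand-in for a non-blocking network call
    return {"id": user_id, "name": f"user{user_id}"}

async def get_tweets(user_id):
    await asyncio.sleep(0)
    return [f"tweet {n} from {user_id}" for n in range(2)]

async def with_retry(make_call, attempts=3):
    """Retry a failing call up to `attempts` times, re-raising the last error."""
    last_error = None
    for _ in range(attempts):
        try:
            return await make_call()
        except ConnectionError as err:
            last_error = err
    raise last_error

async def profile_page(user_id):
    # Issue both requests concurrently, then combine the two results.
    user, tweets = await asyncio.gather(
        with_retry(lambda: get_user(user_id)),
        with_retry(lambda: get_tweets(user_id)),
    )
    return {"user": user["name"], "tweets": tweets}

result = asyncio.run(profile_page(7))
```

The point of the sketch is that retrying and concurrency live in one shared helper layer rather than in each service, which is the role the slide assigns to a unified client/server library.
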
  25. Tracing with Zipkin

    Zipkin hooks into the transmission logic of Finagle and times each service operation; it gives you a visual representation of where most of the time to fulfill a request went.
    https://github.com/twitter/zipkin
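
To make "times each service operation" concrete, here is a minimal sketch of span-style timing. It does not use Zipkin's API; the `span` context manager, the `spans` list and the service names are assumptions made up for this example, and `time.sleep` stands in for real work.

```python
import time
from contextlib import contextmanager

spans = []   # finished span records (a real tracer would ship these to a collector)
_stack = []  # names of the currently open spans, innermost last

@contextmanager
def span(name):
    """Time the enclosed block and record it with the name of its parent span."""
    parent = _stack[-1] if _stack else None
    _stack.append(name)
    start = time.perf_counter()
    try:
        yield
    finally:
        duration = time.perf_counter() - start
        _stack.pop()
        spans.append({"name": name, "parent": parent, "seconds": duration})

# One hypothetical request that fans out to two downstream services.
with span("GET /timeline"):
    with span("timeline-service"):
        time.sleep(0.005)
    with span("tweet-service"):
        time.sleep(0.02)

# The child span that took longest is where most of the request time went.
slowest = max((s for s in spans if s["parent"] is not None),
              key=lambda s: s["seconds"])
```

Recording the parent of each span is what lets a tracing UI draw the request as a tree of nested timings.
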
  26. Hadoop with Scalding

    Services receive a ton of traffic and generate a ton of use log and debugging entries.
    @Scalding is an open source Scala library that makes it easy to specify MapReduce jobs with the benefits of functional programming!
    https://github.com/twitter/scalding
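
As an illustration of expressing a MapReduce job in a functional style, the sketch below counts log entries per service and level. It is plain Python, not Scalding, and the log format and the `map_phase` / `reduce_phase` names are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical log lines; the "service LEVEL message" format is invented here.
log_lines = [
    "tweet-service ERROR timeout",
    "user-service INFO ok",
    "tweet-service INFO ok",
    "tweet-service ERROR timeout",
]

def map_phase(line):
    """Emit a count of 1 keyed by (service, level) for one log line."""
    service, level, _message = line.split(" ", 2)
    return Counter({(service, level): 1})

def reduce_phase(left, right):
    """Merge two partial counts; associative, so it can run in any grouping."""
    return left + right

counts = reduce(reduce_phase, map(map_phase, log_lines), Counter())
```

Because the reduce step is associative, the same two functions could in principle be applied to partitions of the log on different machines and merged afterwards.
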
  27. Data Center Evils

    The evils of single tenancy and static partitioning
    Different jobs... different utilization profiles...
    Can we do better?
    [Figure: STATIC PARTITIONING of a DATACENTER; three panels, each with axis marks 0% and 33%]
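
A small worked example of why static partitioning wastes machines when jobs peak at different times. All numbers and job names below are hypothetical and are not taken from the slide's chart.

```python
# Hypothetical demand (machines needed) for three jobs over four time slots.
demand = {
    "web":   [90, 30, 20, 40],
    "batch": [10, 80, 30, 20],
    "cache": [20, 20, 70, 30],
}
slots = 4

# Static partitioning: each job owns enough machines for its own peak.
static_machines = sum(max(series) for series in demand.values())

# Shared cluster: only the peak of the combined demand must be provisioned.
shared_machines = max(sum(slot) for slot in zip(*demand.values()))

# Utilization = machine-slots actually used / machine-slots provisioned.
total_work = sum(sum(series) for series in demand.values())
static_utilization = total_work / (static_machines * slots)
shared_utilization = total_work / (shared_machines * slots)
```

With these made-up numbers the static layout needs 240 machines at roughly 48% utilization, while a shared pool needs 130 machines at roughly 88%.
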
  28. Borg and The Birth of Mesos

    Google was generations ahead with Borg/Omega
    “The Datacenter as a Computer” http://research.google.com/pubs/pub35290.html (2009)
    engineers focus on resources needed; mixed workloads possible
    Learn from Google and work w/ university research!
    http://wired.com/wiredenterprise/2013/03/google-borg-twitter-mesos
  29. Mesos, Linux and cgroups

    Apache Mesos: kernel of the data center
    obviates the need for virtual machines*
    isolation via Linux cgroups (CPU, RAM, network, FS)
    reshape clusters dynamically based on resources
    multiple frameworks; scalability to 10,000s of nodes
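
The idea of treating a cluster as one pool of CPU and RAM, with mixed workloads packed onto shared nodes, can be sketched as below. This is a simplified first-fit placement and not the Mesos API or its actual resource-offer protocol; `Node`, `Task`, `place` and all sizes are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    cpus: float    # free CPU cores
    mem_gb: float  # free RAM in GB

@dataclass
class Task:
    name: str
    cpus: float
    mem_gb: float

def place(tasks, nodes):
    """First-fit: put each task on the first node with enough free CPU and RAM."""
    placement = {}
    for task in tasks:
        for node in nodes:
            if node.cpus >= task.cpus and node.mem_gb >= task.mem_gb:
                node.cpus -= task.cpus      # reserve the resources on that node
                node.mem_gb -= task.mem_gb
                placement[task.name] = node.name
                break
        else:
            placement[task.name] = None     # no node can currently fit the task
    return placement

nodes = [Node("node-a", 8, 32), Node("node-b", 8, 32)]
tasks = [Task("web", 4, 8), Task("hadoop-map", 6, 16),
         Task("cache", 2, 20), Task("cron", 2, 4)]
placement = place(tasks, nodes)
```

In a real deployment the per-task limits would be enforced by an isolation mechanism such as cgroups; here they are only bookkeeping.
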
  30. Data Center Computing

    Reduce CapEx/OpEx via efficient utilization of HW
    http://mesos.apache.org
    [Figure: three statically partitioned panels (axis marks 0% and 33%) next to a single shared cluster (axis 0% to 100%)]
    reduces latency!
    reduces CapEx and OpEx!
  31. How did it all turn out?

    Not bad... not bad at all...
    Where did the fail whale go?
  32. Site Success Rate Today :)

    [Chart: site success rate over time; x-axis 2006, 2010, 2014; y-axis 99._% to 100%; annotations: "not a lot of traffic", "World Cup", "Off the monorail"]
  33. Growth Continues Today...

    2500+ Employees Worldwide
    50% Employees are Engineers
    255M+ Active Users
    500M+ Tweets per Day
    35+ Languages Supported
    76% Active Users are on Mobile
    100+ Open Source Projects
  34. Lesson #1

    Embrace open source
    best of breed solutions are open these days
    learn from your peers’ code and university research
    don’t only consume, give back to enrich ecosystem: http://opensource.twitter.com
  35. Lesson #2

    Incremental change always wins
    increase chance of success by making small changes
    small changes add up with minimized risk
    loosely coupled micro services work
  36. Lesson #3

    “Data center as a computer” is the future direction of infrastructure
    Efficient use of hardware saves money
    Better programming model (large cluster as single resource)
    Check out Apache Mesos: http://mesos.apache.org