
Surviving technology transitions: Adding and (more importantly) removing tools from an existing stack

Maggie Zhou
October 27, 2015

We share our lessons learned in removing and adding technologies, including stories from our Etsy experiences. Expect to come away with a better idea of the technical and political problems involved in these changes.

Your technology stack doesn’t remain static for the life of your company. As you need to scale, as you improve performance, as new tools become available, you will find yourself needing to add new tools to your stack, and hopefully remove their predecessors. The past choices made sense at the time, but now you need to evaluate new choices and make the effort to improve your infrastructure. Here’s what we’ve learned about the process.

The lifecycle of a tool: how we experiment with new infrastructure, upgrade existing things, and get rid of old infrastructure.

Transcript

  1. Surviving technology transitions: Adding and removing tools from an existing stack
  2. The “steady” state MIXTstudio.etsy.com

  3. Melissa Santos | @ansate Maggie Zhou | @zmagg Evaluating Change

    Have a process. It will probably suck at first. MareBearCrafts.etsy.com
  4. Evaluating Change

    Enumerate requirements and advantages of potential solutions
  5. Evaluating Change

    TheFreckledBerry.etsy.com
  6. Evaluating: Architecture Review

    CherryOrchardAttic.etsy.com
  7. Evaluating: Architecture Review

    EventPipe - a new ETL process for analytics logging, backed by Kafka
  8. EventPipe Architecture Review

    • The problem: old architecture was brittle. GET requests, logrotate
    • Wins: distributed system - near real time metrics into event health, sets us up for streaming analytics later
  9. Evaluating: Operability Review

    Understand:
    • how the system will break
    • how we will know
    • how we will react
  10. Operability Review: EventPipe

  11. Operability Review: EventPipe

    • Scenario: Franz dies on a single beacon box
    • Scenario: Apache dies on a single beacon box
    • Scenario: Franz dies on all beacon boxes
    • Scenario: Franz dies on *most* boxes
    • Scenario: One Kafka box shuts down cleanly
    • Scenario: Cut power to one Kafka box
    • Scenario: Cut power to one chassis (4 Kafka nodes)
    • Scenario: Most Kafka boxes are unreachable (see how many we can shut down and still stay up)
    • Scenario: One ZooKeeper dies
    • Scenario: Two ZooKeepers die
  12. Comfortable upgrades

    curlybracketdesign.etsy.com
  13. Comfortable upgrades

    Test, test, test! How do you gain confidence?
  14. Comfortable upgrades

    Case study: HHVM
  15. Comfortable upgrades: Testing

    • benchmarking
  16. Comfortable upgrades

    How do you gain confidence?
    • slow ramp-up
    • run an a/b experiment
    • ramp up the things that could go badly by themselves (high traffic pages, unique features)
  17. Upgrading: HHVM

  18. Removing old tech

    StageFortPress.etsy.com
  19. Removing old tech

    mygoodbabushka.etsy.com
  20. Removing old tech

    SophieLadyDeParis.etsy.com
  21. Removing old tech

    A story: Cascading.jruby
  22. 1,098 c.jr jobs

    Removing Cascading.jruby
  23. Removing Cascading.jruby

  24. Removing old tech: Celebrate! Celebrate!
  25. Evaluation Takeaways

    • It’s ok to choose NOT to do something
    • But if you do something, you gotta explain what problem it solves (and how)
    • Process helps people see how you made the decision
  26. Upgrading Takeaways

    • testestestestestest
    • Be clear about what’s different
    • Publicize and celebrate how it is better!
  27. Retiring Takeaways

    • It takes longer than you think it will
    • It’s gonna hurt
    • It’s going to feel so good once you’re done
  28. Thanks!

    title slide image from TheWoodChopShoppe.etsy.com
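
The "slow ramp-up" advice on slide 16 can be illustrated with a small sketch. This is not the tooling described in the talk; it is a minimal example, assuming each request carries a stable user ID and that ramp percentages live in a plain dictionary. The names `rollout_bucket`, `in_rollout`, `choose_stack`, and the `ramp` table are all hypothetical. Hashing the user ID together with the page name gives each user a fixed bucket, so raising a percentage only ever adds users to the new stack, and a risky page (high traffic, unique features) can be ramped on its own schedule.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> float:
    """Map a (feature, user) pair to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    # Take the first 8 bytes as an integer and reduce to hundredths of a percent.
    return (int.from_bytes(digest[:8], "big") % 10_000) / 100.0


def in_rollout(user_id: str, feature: str, percent: float) -> bool:
    """True if this user falls inside the current ramp percentage."""
    return rollout_bucket(user_id, feature) < percent


def choose_stack(user_id: str, page: str, ramp: dict[str, float]) -> str:
    """Pick "new" or "old" for a request, using a per-page ramp if one is set."""
    # Hypothetical convention: pages without their own entry use "default".
    percent = ramp.get(page, ramp.get("default", 0.0))
    return "new" if in_rollout(user_id, page, percent) else "old"


if __name__ == "__main__":
    # Assumed ramp table: most pages at 5%, a high-traffic page held at 1%.
    ramp = {"default": 5.0, "search": 1.0}
    users = [f"user-{i}" for i in range(10_000)]
    on_new = sum(choose_stack(u, "listing", ramp) == "new" for u in users)
    print(f"{on_new / len(users):.1%} of simulated users on the new stack")
```

Because assignment is deterministic, the same split can double as the two arms of an a/b experiment: the "new" and "old" groups stay stable between requests, so their metrics can be compared directly.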