Late, Over Budget, & Happy: Our Service Extraction Story

Nat Budin
November 20, 2019

A 3-month project stretches out to 9 months. It's widely viewed as over-engineered and difficult to work with. But months after deployment, it's considered successful. What happened?

In this talk, a principal engineer and a director of engineering recount the extraction of a social feeds GraphQL service from a decade-old Rails monolith. Along the way, we'll cover topics including selling a big project, complexity in remote work, complexity in deployments, and complexity in emotions. We'll tell you about the scars we acquired and the lessons we learned.

Transcript

  1. @amynewell / @natbudin rubyconf2019 Late, Over Budget, and Happy: Our service extraction story
  2. Hi, we’re Amy Newell and Nat Budin. We both used to work at PatientsLikeMe (PLM).
  3. About us
    • Amy Newell: formerly Director of Engineering. Now at Wistia (an excellent place to work!). Based in Boston.
    • Nat Budin: formerly Principal Software Eng. Now at ActBlue (another excellent place to work!). Based in Seattle.
    • This talk is about a project we did at PLM.
  4. Hindsight boxes
  5. Content Warning: Frank discussion of mental illness
  6. PatientsLikeMe
    • It’s a social network for people with chronic medical conditions
    • Lets you track your symptoms, treatments, progress of your condition
    • A major reason people come to the site is to find and connect with others
  7. 2013
  8. [Diagram, 2013] plm-website (aka “the monolith”) · StreamEvent · User-generated content · Medical metadata · Controllers and views · Query for feed content · PostgreSQL
  9. Problem: it’s slow
  10. [Diagram, 2014] plm-website (aka “the monolith”) · StreamEvent · User-generated content · Medical metadata · Controllers and views · Query Redis for feed content · PostgreSQL · Redis
  11. Problem: growth cap
  12. Urgent technical problem, with no obvious solution
  13. Summer 2016
    • Nat goes to DataLayer conference
    • “We tried X and it was great”
    • “…PS, our ops team hates it”
    • …and comes away with an unexpected idea
  14. [Diagram] Newswire (aka “the microservice”) · plm-website (aka “the monolith”) · StreamEvent · User-generated content · Querying API: Allows clients to query for feed data · GraphQL · PostgreSQL · Elasticsearch
  15. The answer to technical issues was sometimes in front of you all along
  16. Welcome to the Skunkworks
  17. ⏸
  18. Why service extraction?
  19. Big, old monorail
  20. Messy coupling
  21. #rubyfriends
  22. Lead with realistic optimism
  23. Pretend boundaries are easy to violate
  24. ▶
  25. “It’ll probably take about 3 months” (Nat, October 2016)
  26. Why skunkworks?
  27. Why skunkworks?
    • Autonomy
    • Leadership buy-in seemed hard
    • Seemed like less work
  28. Market your project and get buy-in
  29. Never start a project with a single engineer
  30. Get QA, infra, etc. involved early
  31. [no text on slide]
  32. Skunk Cost Fallacy
  33. Skunk bites can be nasty
  34. January 2017 (3 months in)
    • We get Product buy-in
    • Amy spins off a separate team and adds two more engineers
  35. March/April 2017 (6-7 months in)
    • Dark times for Amy and Nat
    • It emerges that we have to deploy this with zero downtime, and we bring on a devops expert to help
    • We find out we have to keep supporting the old, unmaintained iOS app
    • Patience running out for when this would ship, no actual end in sight
  36. Tie your choices to actual business needs, not professional orthodoxy
  37. Newswire ships to production! June 5, 2017 (9 months in). Amy begins ketamine infusions!
  38. Sell your successes
  39. Success often feels like failure when you’re going through it
  40. Life goes on
  41. Fall 2017
    • We start tweaking the feed algorithms to show stuff people actually want to read
    • PLM launches new mobile apps for iOS and Android, centering news feeds as the primary experience
  42. Winter 2018
    • Paying off a lot of tech debt
    • Major performance improvements
  43. Spring 2018: We figure out that it’s possible to build an entirely new type of discussion experience (“topics”) on top of Newswire
  44. Quotes from the team
    • “My second tour on Newswire when we built tons of shit super fast is one of my favorite PLM memories.” (A software engineer)
    • “It was great to see how the big upfront cost enabled us to quickly launch new beta features (“Topics”) without much technical lift.” (A product manager)
  45. Success
  46. Every big technical project is a bet
  47. “Everyone has their reasons.” (Merlin Mann)
  48. Projects and people are part of a system
  49. You don’t have to know what you’re doing
  50. Be kind to your past self. They were doing their best.
  51. The News Crew
    • Developers: Adia, Evan, Margaret, Morgan, Rachel, Rocco, Stephanie
    • Management: Amy, Andy, Brittney, Matt
    • QA: Jessica, Linley
    • Product: Colin, Katelyn, Rebecca
    • Devops: Bryan, Logan, Ryan, Stuart
    • Supporting cast: Ian, Jonathan, Michael, Paul, Scott, Thijs
  52. Thank you
    • Questions? Please talk to us after!
    • Find us on Twitter!
    • @amynewell
    • @natbudin