Keep Calm and Carry On: Scaling Your Org With Microservices

Ask people about their experience rolling out microservices, and one theme dominates: engineering is the easy part, people are super hard! Everybody knows about Conway's Law, everybody knows they need to make changes to their organization to support a different product model, but what are those changes? How do you know if you're succeeding or failing, if people are struggling and miserable or just experiencing the discomfort of learning new skills? We'll talk through real stories of pain and grief as people modernize their team and their stack.

Charity Majors

March 26, 2017

Transcript

  1. Keep Calm and Carry On:
    Scaling Your Org With Microservices
    Charity Majors, @mipsytipsy
    Bridget Kromhout, @bridgetkromhout

  3. @mipsytipsy
    engineer, cofounder, CEO
    @bridgetkromhout

    #opslife, storyteller

  5. What even is a
    microservice
    (No one knows)

  6. What are microservices?
    • Independently deployable, small modular services

    • Monorepo vs multiple repos

    • Decentralized governance

    • Small teams, up to maybe a dozen people (“two pizzas”)

    • Operating independently, interacting with other teams via APIs

  7. Microservices are about
    changes.

  9. Conway’s
    “Law”

  10. Conway’s Law, post-Jobs

  11. “Conway’s Law” is not a law
    (and the most important word in
    Conway’s Law is “communication”)

  12. Growing your microservices org
    • Interfaces and abstractions

    • Data is just another service.

    • Ops is just another software engineering skill.

    • Implicit communication channels matter too.

    • Observability must be democratized.

  13. i can haz microservices?
    • Team structure (Conway’s Law?)
    • Communication pathways
    • “Smarter Edges”: for individual contributors
    • “Dumb Pipes”: for managers
    • Transitions are hard

  14. YAS! Has microservices: just the good parts
    • Don’t get religious. It’s not all or nothing.

    • What are your team’s strengths? What are
    their weaknesses?

    • Account for the operational cost

  16. How many engineers do you have?
    How good are they at operations?
    ** you need to be REALLY GOOD at operations to do microservices.

  17. How many products/services do you really have?
    Use a big fat service if it helps, plus some smaller ones
    Don’t microservice your shared libs, storage, or registry

  18. Don’t reinvent too many wheels.
    new wheels have too many unknown-unknowns
    (“choose boring technology”: still applies)

  19. “Dear Twitter …”

  20. “Software deploys … that take days to run, when they run.”
    “I’m responsible for it, but I can’t log in to it.”
    Hard things are hard.

  22. Interfaces and abstractions

  23. Scaling considerations for services
    (also teams!!)
    • Scalability

    • Redundancy and resiliency

    • Consensus knowledge of processes and arch

    • Load balancing, early warning alerts, graceful
    degradation

    • Communication problems to debug, black-box
    debugging with remote hands
    Interfaces

  24. Your team is a service, your humans are nodes
    Interfaces

  25. Interfaces
    Ownership is super key.
    Every service must be owned by a
    human just like it must be served by
    a dedicated set of resources.
    we’ve all been on teams that spend more time circularly routing jobs
    around or blackholing them than actually fixing them
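
One way to make the ownership rule above checkable is to keep a service inventory and flag anything without a named human. This is a minimal sketch, not something from the talk: the `Service` record, the `unowned` helper, and the service and owner names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Service:
    name: str
    owner: Optional[str]  # a named human; None or blank means nobody owns it


def unowned(services: list[Service]) -> list[str]:
    """Return the names of services with no human owner recorded."""
    return [s.name for s in services if not (s.owner and s.owner.strip())]


# Hypothetical inventory, for illustration only.
inventory = [
    Service("billing-api", "dana"),
    Service("image-resizer", None),
    Service("search-indexer", "  "),
]

print(unowned(inventory))  # ['image-resizer', 'search-indexer']
```

Running a check like this in CI is one option for stopping a service from shipping with nobody to route its problems to.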

  26. Management role #1: define the
    mission. Repeat the mission. Bore
    everyone to death with the mission.
    Management role #2: routing, load
    balancing, health checking
    Interfaces

  27. You can chaos monkey your people!
    You should!! \o/
    Interfaces

  28. Interfaces

  29. (what the actual fuck? do it anyway.)
    Interfaces

  30. Interfaces

  31. Communication channels

  32. Implicit communication channels matter just as
    much as (more than?) explicit channels.
    Comms

  33. Understand your communication channels.
    You can’t debug something you can’t name, describe,
    or understand.
    Comms

  34. Examples of communication flows:
    Pull requests, code reviews. Reporting structures.
    Work schedules. WFH vs in-office. 1x1s. On-call
    rotations, escalation paths. Promotions, interviews,
    recruiting, hiring pipelines, mentorship. Gossip.
    Happy hours.
    Comms

  35. smart nodes, dumb pipes
    provision automatedly
    Managers’ job is primarily facilitating nodes
    Comms

  36. The more you map and understand these, the more
    power you have to effect change.
    Comms

  37. On call questions
    • Who is on call? Is it a necessary part of being an engineer?

    • How many rotations are there?

    • How often do people get woken up? *Who* gets woken up?

    • How do you know? Who keeps track?

    • Are there different rotations for stateful and stateless services,
    front-end and backend?

    • Is there an escalation path?
    Comms
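
The "How do you know? Who keeps track?" questions above can be answered from a page log. The following is a rough sketch under stated assumptions: the log format, the names, and the 22:00 to 07:00 definition of "woken up" are all hypothetical and should be swapped for whatever your paging tool exports.

```python
from collections import Counter
from datetime import datetime

# Hypothetical page log: (ISO timestamp, engineer paged, rotation).
pages = [
    ("2017-03-01T03:12:00", "alex", "stateful"),
    ("2017-03-01T14:40:00", "sam", "stateless"),
    ("2017-03-02T02:05:00", "alex", "stateful"),
    ("2017-03-04T23:30:00", "alex", "stateful"),
]


def is_night(ts: str, start: int = 22, end: int = 7) -> bool:
    """Assumption: a page between 22:00 and 07:00 counts as a wake-up."""
    hour = datetime.fromisoformat(ts).hour
    return hour >= start or hour < end


def wakeups_by_person(log) -> Counter:
    """Count night-time pages per engineer."""
    return Counter(who for ts, who, _ in log if is_night(ts))


def wakeups_by_rotation(log) -> Counter:
    """Count night-time pages per rotation."""
    return Counter(rot for ts, _, rot in log if is_night(ts))


print(wakeups_by_person(pages))
print(wakeups_by_rotation(pages))
```

In this made-up log every wake-up lands on one person in the stateful rotation, which is the kind of imbalance the questions are meant to surface.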

  38. Operations is just another software
    engineering skill

  39. You can’t be an effective SWE in a modern
    organization without ops skills.
    Ops

  40. Empowerment and responsibility go hand in hand
    … you can’t ask someone to care about something
    and fix it without also giving them the power
    Ops

  42. Do SWEs have to be on call?
    shrug. it helps.
    but it’s all about creating virtuous cause/effect loops
    Ops

  44. Snowflakes are enormously costly.
    The larger your org gets, the fewer snowflakes you
    are allowed to have.
    Ops

  46. networking: common theme

  47. Probe every software engineering candidate
    for their ops experience & attitude.
    … yep, even FE/mobile devs!

  48. you are signaling …
    “Operations is valued here.”

  49. Operations engineering is about making systems
    maintainable, reliable, and comprehensible.
    Senior software engineers should be reasonably good at these things.
    So if they are not, don’t promote them.

  50. Data ... is just another service.

  51. 1) it’s impossible to treat stateful services exactly like
    stateless services.
    Data

  52. YOU SHOULD TRY.
    The more you treat your stateful services like
    stateless ones, the more you win the future
    Data

  53. Common pattern: state is the last to microserviceify.
    Monolith db layer serves many small services.
    (because it’s hard, and usually is not the most evident source of pain)
    Data

  54. In the future, YOU are the DBA.**
    ** (Everything is going to be okay. Trust me.)
    Data

  55. Data

  56. and from a DBA at a different company …
    Data

  57. Observability …
    is the rock on which your castle is built

  58. Technical observability:
    debugging, monitoring, metrics, instrumentation
    Observability

  59. People observability:
    1x1s, email, asking questions.
    Looking at their face and seeing if they are ok.
    Observability

  60. If you’re doing microservices, you’re signing up for
    hard people problems.
    Observability

  61. Observability

  62. You have a responsibility to your team’s well-being
    whether you’re a manager or not.
    Observability

  63. #truestory
    Observability

  64. Observability

  66. Talk to people BEFORE you launch any grand
    initiatives. Get their buy-in.

  67. if you didn’t …
    #truestory

  68. seek feedback
    move forward <3
    change is the only constant

  69. Get buy-in from *all* stakeholders.

  70. Tech leads, senior ICs

  71. Most failures happen around
    transitions.
    • unpacking a monolith -> microservices
    • rewriting from node.js into golang
    • acquiring or being acquired
    • migrating from HDFS to a new datastore
    • becoming a manager, or moving back to IC
    • getting married or divorced, having a kid

  72. Choose the problems you are not
    going to solve, or they will choose you.

  73. Making decisions:
    Get ready to talk to people a lot more about microservices.
    Sorry!

  74. TL;DR:
    • Innovate only where you need to/where you'll gain (and yes, this
    includes microservices, function-as-a-service, and whatever's next)

    • Empower yourself; don't wait. Actively decentralize power and you'll
    decentralize points of failure.

    • Ask for permission strategically; move your org towards assume-yes.

    • Communication (implicit and explicit) is key to decentralizing &
    microservices

    • Look for the uncomfortable places. Be happy when you find them;
    that's where you and the org can grow.

  75. There is no fairy-tale answer
    Microservices give you flexibility; the rest is up to you, because hard
    things are still hard even when they're distributed and small.

  78. Operability / Teams.
    • The mission

    • Build a cult (j/k) (no really)

    • Let your team innovate.

  79. most outages are triggered by “events”,
    from humans. draw a line.

  80. Pair responsibility with
    empowerment.

  81. Have you considered … valuing non-generalist
    SWEs and their work?

  82. Deploys

    On-Call

    Pull requests, arch reviews

    Observability

    Communication channels

  83. Deploys

  84. Deploys must be:
    • Fast. Rolling. Roll-back-able.

    • Reliable. Breaks rarely.

    • Draws a tagged vertical line in graphs.

    • *Anyone* should be able to invoke deploy

    • For bonus points: canarying or automated
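
The deploy properties listed above (rolling, roll-back-able, marked on graphs) can be sketched in a few lines. This is an illustration under assumptions, not the speakers' tooling: `rolling_deploy` and its `install`, `healthy`, and `mark` callbacks are hypothetical stand-ins for whatever your deploy system and metrics store provide.

```python
from typing import Callable


def rolling_deploy(
    hosts: list[str],
    new_version: str,
    old_version: str,
    install: Callable[[str, str], None],
    healthy: Callable[[str], bool],
    mark: Callable[[str], None],
) -> bool:
    """Deploy host by host; roll back every updated host on the first failure."""
    mark(f"deploy start {new_version}")  # the tagged vertical line on graphs
    done: list[str] = []
    for host in hosts:
        install(host, new_version)
        done.append(host)
        if not healthy(host):
            # Undo in reverse order so the most recent change is reverted first.
            for h in reversed(done):
                install(h, old_version)
            mark(f"deploy rolled back {new_version}")
            return False
    mark(f"deploy finished {new_version}")
    return True


# In-memory fakes, for illustration only: web2 fails its health check.
state: dict[str, str] = {}
events: list[str] = []
ok = rolling_deploy(
    ["web1", "web2", "web3"], "v2", "v1",
    install=lambda h, v: state.__setitem__(h, v),
    healthy=lambda h: h != "web2",
    mark=events.append,
)
print(ok, state, events)
```

Because the marker callback fires on start, finish, and rollback, anyone reading a graph afterwards can line a change in behavior up against the deploy that caused it.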

  85. Revisit these tools regularly.
    part of every post mortem.

  86. On Call

  87. Haha, no.

  88. What should leaders know?
    Managers, tech leads, and engineers

  89. Things about leadership
    • Leadership is not a zero sum game. The best leaders try to empower literally
    everyone to perform a leadership role in at least some areas.

    • Create guard-rails, not walls.

    • Be conventional in the big things (salary, org), unconventional in the small.

    • If you give a shit about diversity, don’t wait 'til you’re “ready” to hire them … look
    for ways to support underrepresented groups now. Make friends. Help people.
    Diversify your friend groups and personal networks. Be creative.

  90. Management
    • Put the humans first, and the mission a close second

    • Be an enabler. Don’t starve your tech leads of growth opportunities by
    sucking all oxygen.

    • Reward intentionally.

    • Leadership is not zero-sum; encourage leadership everywhere

    • Managers, be friends with each other! Tolerance is not enough

  91. The most powerful weapon in your arsenal
    is always cause and effect.

  92. Engineers should be on call
    for their own services.

  93. Yes but ….
    Yes, microservices help you drift a little bit and innovate independently …
    BUT, not as much as you might think.
    You all still share a fabric, after all.
    Stateful still gonna ruin your party. (and IPC, sec discovery, caching, cd
    pipelines, databases etc.)

  94. • I don’t think anyone should approach management as a thing they move into
    permanently. It’s psychologically disfiguring.

    • Nor is the maturation process one way. New teams within the company
    should be springing up. Hackathons can be a great way, esp if it involves
    dogfooding. Empathy needs constant renewal.

    • Practice making mistakes together. Practice cheerful apologies, asking
    questions, giving awkward feedback. It gets easier.

  95. Charity Majors
    @mipsytipsy
    Bridget Kromhout
    @bridgetkromhout
