
The Future of Ops

Traditional Operations isn’t going away, it’s just retooling. The move from on-premise to cloud means Ops, in the classical sense, is largely being outsourced to cloud providers. What’s left is a thin but crucial slice between cloud providers and the products built by development teams, encompassing infrastructure and deployment automation, configuration management, log management, and monitoring and instrumentation—all through the lens of self-service.

Join me as I share my vision for the future of Operations as an organizational competency and how it relates to DevOps. We will discuss where industry practices are headed while sharing some real-world stories—the good and the bad—of applying these practices at Workiva. The intended outcome of this talk is to leave listeners with a better understanding of what an effective modern engineering organization looks like, including patterns and best practices, and the path to reaching it. The end goal is an organization which delivers value to customers reliably, efficiently, and continuously.

Ops is dead, long live Ops!

Tyler Treat

April 13, 2018

Transcript

  1. @tyler_treat Tyler Treat • DevOpsDays Des Moines • 4/13/18 The Future of Ops
  2. @tyler_treat Welcome to the world of tomorrow!

  3. @tyler_treat

  4. @tyler_treat Tyler Treat • Managing Partner @ Real Kinetic • Former infrastructure engineering manager @ Workiva • bravenewgeek.com
  5. @tyler_treat Data Center

  6. @tyler_treat Data Center Compute Network Storage

  7. @tyler_treat Data Center Compute Network Storage App Servers Security Backups/DR Monitoring

  8. @tyler_treat Data Center Compute Network Storage Help Desk Procurement Compliance App Servers Security Backups/DR Monitoring

  9. @tyler_treat Data Center Compute Network Storage Help Desk Procurement Compliance App Servers Security Backups/DR Monitoring App App App App App App App

  10. @tyler_treat Data Center Compute Network Storage Help Desk Procurement Compliance App Servers Security Backups/DR Monitoring

  11. @tyler_treat Data Center Compute Network Storage Help Desk Procurement Compliance App Servers Security Backups/DR Monitoring App App App App App App App Ops

  12. @tyler_treat Data Center Compute Network Storage Help Desk Procurement Compliance App Servers Security Backups/DR Monitoring App App App App App App App DevOps

  13. @tyler_treat App App App App App App App NoOps

  14. @tyler_treat App App App App App App App Infrastructure Automation Deployment Automation Configuration Management Log Management Monitoring NewOps
  15. @tyler_treat

  16. @tyler_treat DevOps is a journey, not a destination.

  17. @tyler_treat Manual Provisioning Self-Service The DevOps Scale of Automation

  18. @tyler_treat Manual Provisioning Self-Service Large Enterprise Small Startup

  19. @tyler_treat Scaling DevOps

  20. @tyler_treat Why do silos form?

  21. @tyler_treat Many companies start with a “DevOps” approach.

  22. @tyler_treat Manual Provisioning Self-Service Large Enterprise Small Startup DevOps by Necessity: Devs push to production, unstable, high-risk, minimal cost control

  23. @tyler_treat As the product scales, we specialize.

  24. @tyler_treat As the business scales, we add safety checks.

  25. @tyler_treat Developers write code.

  26. @tyler_treat Ops people run it.

  27. @tyler_treat QA gets blamed for defects.

  28. @tyler_treat Security blocks everything.

  29. @tyler_treat And management wonders why nothing gets shipped.

  30. @tyler_treat Manual Provisioning Self-Service Large Enterprise Small Startup Ops as Gatekeepers: Stable, cost-controlled, risk-averse, delivery and innovation bottleneck

  31. @tyler_treat Specialization is good!

  32. @tyler_treat Misalignment is not good.

  33. @tyler_treat How do we scale specialization?

  34. @tyler_treat Cross-functional teams?

  35. @tyler_treat DevOps encourages cooperation!

  36. @tyler_treat Just add an ops engineer to each team.

  37. @tyler_treat And maybe a reliability engineer.

  38. @tyler_treat Maybe a few extra for on-call backup.

  39. @tyler_treat And of course we need a QA engineer too.

  40. @tyler_treat Done!

  41. @tyler_treat Also, $$$

  42. @tyler_treat

  43. @tyler_treat How do we scale specialization?

  44. @tyler_treat Vision and Product

  45. @tyler_treat Vision: a mental image of what the future could be like.
  46. @tyler_treat Vision enables independent decision making and alignment.

  47. @tyler_treat But vision without execution is just hallucination…

  48. @tyler_treat Products are how we scale execution.

  49. @tyler_treat

  50. @tyler_treat The evolution of QA Test-focused Tools-focused

  51. @tyler_treat The evolution of QA QA SDET “Combined” Engineering

  52. @tyler_treat Production CD Pipeline CI

  53. @tyler_treat QA teams are shrinking, but what’s growing are the teams building the tools.
  54. @tyler_treat The same is becoming true of Ops.

  55. @tyler_treat build/release/deploy configuration management infrastructure automation logging & instrumentation monitoring

  56. @tyler_treat By productizing our infrastructure, we scaled.

  57. @tyler_treat We controlled costs.

  58. @tyler_treat We reduced risk.

  59. @tyler_treat We accelerated development.

  60. @tyler_treat We delivered value to customers faster…

  61. @tyler_treat from 3-4 releases per year to multiple releases per day.
  62. @tyler_treat Rethinking Ops

  63. @tyler_treat

  64. @tyler_treat

  65. @tyler_treat Data Center Compute Network Storage Help Desk Procurement Compliance App Servers Security Backups/DR Monitoring App App App App App App App Wake me up if anything goes wrong here. Ops as Masters of Production

  66. @tyler_treat Data Center Compute Network Storage Help Desk Procurement Compliance App Servers Security Backups/DR Monitoring App App App App App App App Jim Bob’s Frobulator service is out of memory. Ops as Masters of Production

  67. @tyler_treat Manual Provisioning Self-Service Large Enterprise Small Startup PaaS: Stable, cost-controlled, risk-averse, delivery enabler, innovation bottleneck
  68. @tyler_treat

  69. @tyler_treat The Vision: Enable developers to self-service through tooling and automation and empower them to deploy and operate their services…
  70. @tyler_treat “Here’s a CloudFormation template and access to production…”

  71. @tyler_treat Manual Provisioning Self-Service Large Enterprise Small Startup IaaS: Devs provision infrastructure as code, free-for-all, cost explosion, high-risk, delivery and innovation enabler

  72. @tyler_treat The Vision: Enable developers to self-service through tooling and automation and empower them to deploy and operate their services…

  73. @tyler_treat The Vision: Enable developers to self-service through tooling and automation and empower them to deploy and operate their services…with minimal Ops intervention.
  74. @tyler_treat

  75. @tyler_treat

  76. @tyler_treat The Vision: Enable developers to self-service through tooling and automation and empower them to deploy and operate their services…with minimal Ops intervention.

  77. @tyler_treat App App App App App App App Infrastructure Automation Deployment Automation Configuration Management Log Management Monitoring Ops as Product Team

  78. @tyler_treat App App App App App App App Infrastructure Automation Deployment Automation Configuration Management Log Management Monitoring Products Ops as Product Team

  79. @tyler_treat The Vision: Enable developers to self-service through tooling and automation and empower them to deploy and operate their services…with minimal Ops intervention.
  80. @tyler_treat Pain-Driven Development: making locally optimal decisions to minimize pain.

  81. @tyler_treat Silos promote pain displacement. Product Development QA Ops

  82. @tyler_treat Silos promote pain displacement. Product Development QA Ops pain of running software pain of testing software pain of building software
  83. @tyler_treat Misaligned incentives!

  84. @tyler_treat How do you expect devs to care about quality if they’re not on the hook?

  85. @tyler_treat How do you expect devs to care about operability if they’re not on the hook?

  86. @tyler_treat Devs won’t build truly reliable systems until they are on-call for them.
  87. @tyler_treat BUT!

  88. @tyler_treat Responsibility requires empowerment.

  89. @tyler_treat You can’t ask someone to care about something and fix it without also giving them the power to do so.

  90. @tyler_treat Most Ops teams simply haven’t done enough to empower and offload responsibility onto dev teams.
  91. @tyler_treat Products enable ownership.

  92. @tyler_treat App App App App App App App Infrastructure Automation Deployment Automation Configuration Management Log Management Monitoring Products Ops as Product Team

  93. @tyler_treat App App App App App App App Infrastructure Automation Deployment Automation Configuration Management Log Management Monitoring Products The Frobulator service is out of memory… Since you are the Frobulator expert, here are these tools to help you diagnose and resolve the problem autonomously. Ops as Product Team

  94. @tyler_treat The Vision: Enable developers to self-service through tooling and automation and empower them to deploy and operate their services…with minimal Ops intervention.
  95. @tyler_treat Products maintain control through enablement.

  96. @tyler_treat Enable teams to follow best practices.

  97. @tyler_treat Best practices for builds.

  98. @tyler_treat Best practices for testing.

  99. @tyler_treat Best practices for deploys.

  100. @tyler_treat Best practices for support.

  101. @tyler_treat Best practices for compliance.

  102. @tyler_treat Encode compliance and SDLC requirements into tooling and process.

  103. @tyler_treat Snowflakes kill… Use pain-driven development to your advantage by creating paths of least resistance.
  104. @tyler_treat Teams must make a case for going off-menu.

  105. @tyler_treat Products in Practice

  106. @tyler_treat Build Release Deploy Operate

  107. @tyler_treat Build Release Deploy Operate

  108. @tyler_treat Code Repository Dev Push change to branch Review by Peers Build QA & Compliance Continuous Integration
  109. @tyler_treat

  110. @tyler_treat

  111. @tyler_treat

  112. @tyler_treat

  113. @tyler_treat

  114. @tyler_treat

  115. @tyler_treat

  116. @tyler_treat Continuous Integration
    • Build plan part of the code, not baked into build tool
    • Dev teams fully control their builds
    • Deep integration with GitHub
    • Build controls into the process
  117. @tyler_treat Build Release Deploy Operate

  118. @tyler_treat Code Repository Dev Tag branch for release Build/QA Continuous Delivery Dev Artifact Repository Sign-Off Prod Artifact Repository Deploy

  119. @tyler_treat Continuous Delivery
    • Artifact build/tagging/promotion automation
    • Container/machine image auditing
    • Machine image and security patch automation
    • Streamlining sign-off from different parties
  120. @tyler_treat Build Release Deploy Operate

  121. @tyler_treat

  122. @tyler_treat

  123. @tyler_treat Continuous Deployment
    • Self-service deploys
    • Self-service configuration (with guard rails)
    • Infrastructure provisioning is automated
    • No ticket-driven development
  124. @tyler_treat Build Release Deploy Operate

  125. @tyler_treat Continuous Operations
    • Logging
      - Structured logging spec
      - Language libs implementing spec
      - Log pipeline (i.e. agent, collector, storage, search)
    • Telemetry, tracing, health checks, alerting
    • Canary deploys, A/B testing, traffic shadowing, etc.
  126. @tyler_treat Many off-the-shelf solutions just need “glued” together.

  127. @tyler_treat Most problems are cultural, not technical.

  128. @tyler_treat Technology will not fix your broken culture!

  129. @tyler_treat Solutions need to fit the company, its culture, and its architecture.
  130. @tyler_treat Get the workflow correct, start manual, then automate.

  131. @tyler_treat Wrapping Up

  132. @tyler_treat Specialization leads to misalignment and broken feedback loops.

  133. @tyler_treat But specialization is an important part of scaling a business.

  134. @tyler_treat The question is: how do we specialize?

  135. @tyler_treat The traditional Ops model does not scale.

  136. @tyler_treat DevOps is about tightening feedback loops and building empathy.

  137. @tyler_treat NewOps is about empowering teams and providing autonomy.

  138. @tyler_treat It’s not a replacement for DevOps, it’s an evolution of it.

  139. @tyler_treat It’s applying a product mindset to the traditional Ops model.
  140. @tyler_treat Ops teams should be redefining their vision:

  141. @tyler_treat from masters of production to enablers of production.

  142. @tyler_treat Ops capabilities should be embedded within dev teams…

  143. @tyler_treat but they need to be enabled!

  144. @tyler_treat

  145. @tyler_treat NewOps treats Ops like a product team whose product is infrastructure.
  146. @tyler_treat Creating guard rails, not walls.

  147. @tyler_treat Offloading responsibility helps correct and scale feedback loops.

  148. @tyler_treat Traditional Ops isn’t going away, it’s just getting a product manager.
  149. @tyler_treat Thanks! bravenewgeek.com realkinetic.com
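
Editor's sketch for slide 125 ("Structured logging spec" and "Language libs implementing spec"): the talk does not show what the spec or the libraries looked like, so the following is only a minimal illustration of the idea, assuming a hypothetical spec in which every log line is a single JSON object carrying `timestamp`, `level`, `service`, and `message`. The field names, `SpecFormatter`, and `get_logger` are invented for this example; only Python's standard `logging` and `json` modules are used.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical spec (an assumption, not from the talk): each log line is one
# JSON object that always carries these keys.
REQUIRED_FIELDS = ("timestamp", "level", "service", "message")


class SpecFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        # Start from optional structured context passed via `extra=...`,
        # then set the required fields so context cannot overwrite them.
        entry = dict(getattr(record, "context", {}))
        entry.update(
            timestamp=datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            level=record.levelname,
            service=self.service,
            message=record.getMessage(),
        )
        return json.dumps(entry, sort_keys=True)


def get_logger(service, stream=None):
    """Return a logger whose output follows the hypothetical spec."""
    handler = logging.StreamHandler(stream)  # defaults to stderr
    handler.setFormatter(SpecFormatter(service))
    logger = logging.getLogger(service)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A dev team would then call something like `get_logger("frobulator").warning("out of memory", extra={"context": {"rss_mb": 512}})`. Because every service emits the same shape, the agent, collector, storage, and search stages of the log pipeline can be built once and shared.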