
Compliance & Regulatory Standards Are NOT Incompatible With Modern Development Best Practices


Everybody knows that modern development practices include things like testing in production, continuous delivery, observability-driven development, and separating deploys from releases using feature flags. Yet far too many times I've heard engineers from highly regulated industries complain that they have to put up with a bunch of security theater due to regulations and standards.

This is categorically false: there is NOTHING in ANY regulation or standard to prevent you from using modern development best practices. Let's take a stroll through the regulatory landscape and talk about how to make your case (and who to make your case to). A massive competitive advantage will accrue to those teams who can figure out how to make regulatory compliance compatible with fast feedback loops, which means that this is a fight very much worth fighting.

Also check out this complementary deck full of case studies!! Case Studies: Modern Development Practices In Highly Regulated Environments

Charity Majors

August 26, 2023


Transcript

  1. @mipsytipsy
    Compliance & Regulatory
    Standards Are ✨Not✨
    Incompatible With Modern
    Development Best Practices


  2. @mipsytipsy


    engineer/cofounder/CTO


    https://charity.wtf


  3. “The Sociotechnical Path to
    High-Performing Teams”
    “It Is Time To Fulfill The Promise Of
    Continuous Delivery”
    “Debugging Is A Team Sport”
    “On Call Does Not Have To Suck”
    “Testing in Production”


  4. “Okay, could you talk about
    that stuff, but also explain
    how and why we can do
    these things in a heavily
    regulated environment?”
    YES I can!


  5. Modern software development practices
    1.Engineers owning their own code in production


    2.Practicing observability-driven development


    3.Testing in production


    4.Separating deploys from releases using feature flags


    5.Continuous deployment (or at least delivery)


  6. Modern software development practices
    are ✨ALL✨ about
    FAST FEEDBACK LOOPS
    Getting your code into production as fast
    as possible after writing it.


  7. These practices, which have gone mainstream
    just in the last five years, aren’t about being
    trendy or showing off on Twitter.
    They represent thousands of people-years of research
    and experimentation into how to build better software.
    How well your team performs can make the difference between
    loving your job or hating it; an exciting career or stagnation; happy
    users or angry users; even the success or failure of your company.


  8. Practice #1: Engineers owning their code
    in production
    • No dev/ops divide
    • You write it, you are on call for it
    • You kick off your own deploys
    • Systems are becoming too complex
    for anyone to operate systems they
    didn’t write, or write systems they
    don’t also operate.


  9. Practice #2: Observability-driven development
    • Instrument your code as you go
    • After you deploy it, you go and
    look at it in production
    • Is it doing what you expected?
    • Does anything else look…weird?
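
To make "instrument your code as you go" concrete, here is a minimal sketch using only the Python standard library. The helper `instrumented`, the event fields, and the `apply_discount` example are all hypothetical names chosen for illustration; a real team would more likely emit events through whatever observability SDK it already uses.

```python
import io
import json
import time
from contextlib import contextmanager


@contextmanager
def instrumented(name, sink, **fields):
    """Emit one structured event per unit of work (hypothetical helper)."""
    event = {"name": name, **fields}
    start = time.monotonic()
    try:
        # The caller can attach more fields to the event while it works.
        yield event
        event["error"] = None
    except Exception as exc:
        event["error"] = repr(exc)
        raise
    finally:
        event["duration_ms"] = round((time.monotonic() - start) * 1000, 3)
        sink.write(json.dumps(event) + "\n")


def apply_discount(cart_total, code, sink):
    """Example business function, instrumented as it is written."""
    with instrumented("apply_discount", sink, discount_code=code) as event:
        rate = {"SAVE10": 0.10}.get(code, 0.0)
        event["discount_rate"] = rate
        return round(cart_total * (1 - rate), 2)


# Assumption: in production the sink would be your telemetry pipeline.
sink = io.StringIO()
total = apply_discount(100.0, "SAVE10", sink)
emitted = json.loads(sink.getvalue())
```

After a deploy, the emitted events are what you go and look at: is `discount_rate` what you expected, and does `duration_ms` or `error` look weird?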


  10. Practice #3: Testing in production
    • Everybody tests in production…
    • …but only some of us admit it.
    • Instrument your code. Get used to
    looking at it.
    • And not just when things are broken.
    Know what good looks like.
    • Close the loop by looking at your
    code after you deploy it, every time.


  11. Practice #4: Separating deploys from releases
    using feature flags
    • The key to reliable software is shipping smaller
    diffs, more frequently.
    • Using feature flags is how you do this.
    • Deploy continuously and flexibly. Roll changes
    out to users gradually, by groups, opt-in, etc.
    • Get your diffs out swiftly, while honoring
    scheduled release dates for product features.
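
As a rough illustration of separating deploy from release, the sketch below ships both code paths and lets a flag decide who sees the new one. The flag name, the `is_enabled` helper, and the percentage-bucket scheme are assumptions made for this example, not any particular feature-flag product's API.

```python
import hashlib


def bucket(flag, user_id):
    """Map (flag, user) to a stable bucket in the range [0, 100)."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag, user_id, rollout_percent, opt_in=frozenset()):
    """Return True if the flagged code path should run for this user."""
    if user_id in opt_in:
        # Opt-in users get the feature regardless of rollout percentage.
        return True
    return bucket(flag, user_id) < rollout_percent


def checkout(user_id, rollout_percent):
    # Both paths are already deployed; the flag controls the release.
    if is_enabled("new-checkout", user_id, rollout_percent):
        return "new checkout flow"
    return "old checkout flow"
```

Because the bucket is derived from a hash, a given user stays in or out of the rollout between requests, and raising `rollout_percent` only ever adds users. A scheduled release date then becomes a flag change rather than a deploy.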


  12. Practice #5: Continuous Delivery (or even better,
    Continuous Deployment)
    • NO manual QA, Change Advisory Board,
    or approval gates
    • We have an ocean of evidence that these
    do nothing to make software better, and
    in fact make software worse.
    • Deploy as fast as possible,
    • As automated as possible.
    • If you haven’t read it, read it: —>


  13. Security: “Explain it to me like I’m five” (ELI5)
    Confidentiality, Integrity, Availability
    “You must protect customer data”
    “You must protect your code”
    You must demonstrate that you have policies, procedures, and safeguards
    in place to protect customer data, and supply evidence you are actually
    following those policies, procedures, and safeguards.


  14. ELI5
    ✅ Regulations: GDPR, CCPA, HIPAA, PCI/DSS, state banking regulations, etc
    ✅ Frameworks: SOC2, ISO 27001, NIST, FedRAMP, etc
    ✅ Written policies for how you are going to comply with regulations (security team)
    ✅ Contractual terms/DPAs for big customers (legal team)
    ❌ We are NOT fucking around with FedRAMP or state banking regulations in this talk.


  15. Frameworks are typically very loose on the specifics.
    Like, “You need to be scanning your code for known vulnerabilities before it goes live”
    E.g. “People should not be able to see private data unless they have a business need to do so.”
    (but the definition of “business need” is left up to us)
    None of them expressly forbid any modern development practices.
    However, modern practices may conflict with your own written policies,
    the ones that are being used to demonstrate compliance.
    They may also conflict with terms in your own customer contracts.


  16. But!
    Frameworks can be used to achieve
    compliance with regulations.
    Policies are living documents. They should be subject to regular
    review and reconsideration.
    Contracts should be negotiated, not blindly signed. Is your
    security team reviewing contracts before signing them? Are YOU?
    Are you giving your teams guidance on where to push back?


  17. Compliance standards exist for a reason.


    Our goal here is NOT to avoid or evade them.
    The problem is that elaborate security theater makes us slower and less
    competitive, while also making us no more (or even LESS!) secure.
    Always honor the spirit of the control, when devising a solution. As
    engineers, we may be best positioned to find the solution that is
    actually secure, not only theatrically secure.


  18. “We can’t have continuous delivery because …”
    Stated Reasons:
    1. We’re regulated
    2. We’re not building websites
    3. We have too much legacy
    4. Our people are too stupid
    Actual Reasons:
    • Our culture sucks
    • Our architecture sucks
    • We haven’t tried
    • We don’t care enough
    (borrowed from a Jez Humble slide circa 2017 👇)
    Jez Humble, “Continuous Delivery Sounds Great But It Won’t Work Here” DevOpsDays Seattle 2017


  19. 1. We’re regulated


    2. We’re not building websites


    3. We have too much legacy


    4. Our people are too stupid
    But this is a solved problem.


    This was a solved problem a decade ago!
    Etsy, since 2013


    Amazon


    Stripe


    HP firmware


    Branch Insurance

    Jack Henry


    Moov


    Honeycomb


    US gov (!!)


    Some of your competitors


    You can be, too.


  20. How Etsy did it (in 2013!):
    • Decouple the cardholder data and PCI/DSS regulations
    from the rest of the system


    • The systems that form the cardholder data environment
    (CDE) are separated from the rest of Etsy’s environments at
    the physical, network, source code, and logical infra levels


    • The CDE is built and operated by an xfn team that is solely
    responsible for the CDE. Again, this limits the scope of the
    PCI DSS regulations to just this team.
    https://queue.acm.org/detail.cfm?id=3190610


  21. How Branch Insurance does it:
    • Regulated by 36 states and DC, annual SOC2s


    • Production data and envs mostly isolated from most
    engineers; only TLs can analyze production telemetry for
    PII purposes (despite masking and filtering and tokenizing)


    • Every developer has their own AWS account, massive
    investment in testing. Trunk-based development.


    • Uses serverless extensively; pushes to trunk many times/
    day, pushes to prod many times/week, in under an hour
    end to end.
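
The "masking and filtering and tokenizing" mentioned above could look something like the following sketch, which replaces PII values with keyed tokens before telemetry leaves the service. The field list, the `tok_` prefix, and the key handling are assumptions for illustration only; this is not a description of Branch Insurance's actual pipeline.

```python
import hashlib
import hmac

# Hypothetical list of fields your own policy classifies as PII.
PII_FIELDS = {"email", "ssn", "phone"}


def tokenize(value, key):
    """Replace a value with a stable keyed token (HMAC-SHA256, truncated)."""
    mac = hmac.new(key, value.encode(), hashlib.sha256).hexdigest()
    return "tok_" + mac[:16]


def scrub(event, key):
    """Return a copy of a telemetry event with PII fields tokenized."""
    return {
        name: tokenize(str(value), key) if name in PII_FIELDS else value
        for name, value in event.items()
    }


# Assumption: the real key would come from a secrets manager, not source code.
key = b"example-only-key"
raw = {"email": "pat@example.com", "status": 200, "route": "/quote"}
safe = scrub(raw, key)
```

Because the same input always yields the same token, engineers can still group and count by customer in their observability tooling without ever seeing the underlying value.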


  22. How Honeycomb does it:
    • Certified SOC2 Type 2. Subject to GDPR, HIPAA, CCPA, state regs


    • Auto-deploys once an hour off trunk via a cron job. Extensive
    investment into tests. Takes about an hour for code to go live.


    • Practices trunk-based development, short-lived branches, code
    reviews


    • Access Management policy based on least privilege model.
    Access to PII/production data is limited to those who have a
    business need for it, i.e. need it to do their jobs.


  23. Stop blaming regulations and frameworks.
    It’s all about how we choose to
    interpret the standards.


  24. Interpretations vary based on risk tolerance.
    Far too often, the paperwork seems to matter more than the actual
    security of the implementation. ☹
    The difficulty here is that every product, company, and
    architecture is sui generis, so we can’t apply cookie-
    cutter solutions — we need to actually understand each
    use case before we can negotiate a solution.
    Also, we are terrible about sharing the
    solutions we do find.
    Every situation is ✨unique✨







  25. Architecture
    The biggest architectural obstacle to continuous
    delivery is when you want to ship a single line of
    code, but you have to deploy the whole world.
    Can you deploy the service you’re working on
    without having to deploy all the dependencies?
    Can you test the service you’re working on on your
    laptop, without needing an integrated environment?


  26. Architectural considerations:
    • Use a well-designed PaaS, if you can


    • Design for testability and deployability


    • Invest heavily in your test suite


    • If you need to unbundle a monolith, do not rip and
    replace; redesign iteratively into services.


    • Make sure services have their own databases!


    • Bring security in to the discussion from day one.


  27. In general, engineers shouldn’t need to be
    constantly thinking about compliance.
    Mostly just when setting up a new thing, or when gathering PII — does
    this matter, and where should I put it?
    Engineering performance and productivity, on the other hand, should
    ALWAYS be on our minds. Entropy is constantly eating away at our efficiency.


  28. One thing is exceptionally clear:
    If you want
    category-defining,
    competition-crushing
    engineering excellence,
    your engineering leadership
    will have to engage with
    security and legal as
    partners.


  29. We need engineering leaders who
    understand the existential urgency of a
    short cycle time, and will fight for it.
    Not just once or twice.
    Every day.


  30. “How well does your team perform?”
    != “how good are you at engineering”


  31. High-performing teams
    get to spend the majority of their time solving interesting,
    novel problems that move the business materially forward.
    Lower-performing teams
    spend almost all their time firefighting, waiting on code review, context switching, rolling back,
    rolling forwards, reproducing tricky bugs, solving problems they thought were fixed, responding
    to customer complaints, fixing flaky tests, running deploys by hand, fighting with their
    infrastructure, fighting with their tools, fighting with each other, debugging merge conflicts,
    triaging failed deploys, debugging and reproducing problems for each other when the rest of
    the team can’t use the debugging tools adequately, waiting on CI/CD to complete, waiting on
    tests to run, waiting on the queue to deploy, re-running tests because they aren’t sure if the one
    that failed is a real failure or not, paging in a different project to work on while your other project
    is stalled… basically everything BUT making progress on core business problems.


  32. 🔥1 — How frequently do you deploy?
    🔥2 — How long does it take for code to go live?
    🔥3 — How many of your deploys fail?
    🔥4 — How long does it take to recover from an outage?
    🔥5 — How often are you paged outside work hours?
    How high-performing is YOUR team?
    DORA metrics: https://dora.dev
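
The first four questions can be answered from deploy records you probably already have. The sketch below assumes a hypothetical `Deploy` record with merge, deploy, and restore timestamps; the field names and the sample data are invented for illustration and do not come from the DORA reports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    merged_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def summarize(deploys, period_days):
    """Compute four delivery metrics from a list of deploy records."""
    if not deploys:
        raise ValueError("need at least one deploy")
    lead_times = [
        (d.deployed_at - d.merged_at).total_seconds() / 60 for d in deploys
    ]
    failures = [d for d in deploys if d.failed]
    restores = [
        (d.restored_at - d.deployed_at).total_seconds() / 60
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "median_lead_time_min": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_restore_min": (
            sum(restores) / len(restores) if restores else None
        ),
    }


# Invented sample data covering two days.
t0 = datetime(2023, 8, 1, 9, 0)
sample = [
    Deploy(t0, t0 + timedelta(minutes=45)),
    Deploy(
        t0 + timedelta(hours=2),
        t0 + timedelta(hours=2, minutes=50),
        failed=True,
        restored_at=t0 + timedelta(hours=3, minutes=10),
    ),
    Deploy(t0 + timedelta(hours=5), t0 + timedelta(hours=5, minutes=40)),
    Deploy(t0 + timedelta(days=1), t0 + timedelta(days=1, minutes=55)),
]
report = summarize(sample, period_days=2)
```

The fifth question, how often you are paged outside work hours, would need data from your paging system and is left out of this sketch.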


  33. It really, really, really,


    really, really


    pays off


    to be on a


    high performing team.
    Like REALLY. 2019 numbers
    2021 numbers


  34. How do we build high-performing teams?
    “Hire the smartest people
    you can find. Recruit from
    the best schools. Aggressively
    poach as much talent from
    FAANG as you can.”


  35. Who is going to be a better engineer in two years?
    An engineer on an “Elite” team


    3000 deploys/year


    9 outages/year


    6 hours firefighting
    An engineer on a “Medium” team


    5 deploys/year


    65 outages/year


    firefighting: constant


  36. Q: What happens when an engineer from
    the “elite” yellow bubble joins a medium-
    performing team in the blue bubble?
    A: Your productivity tends
    to rise (or fall) to match that
    of the team you join.


  37. Great teams make great engineers. ❤


  38. Your ability to ship code swiftly and safely has less to do with
    your personal knowledge of algorithms and data structures,
    and more to do with the sociotechnical system you participate in.
    sociotechnical (n)
    “Technology is the sum of ways in which social groups construct the
    material objects of their civilizations. The things made are socially
    constructed just as much as technically constructed. The merging of
    these two things, construction and insight, is sociotechnology”
    (Wikipedia)


  39. Technical leadership should focus intensely on constructing
    and tightening the feedback loops at the heart of their system.
    The smallest unit of software delivery is the team.


  40. which brings us to…
    ✨CI / CD✨
    💜 Shipping is the heartbeat of your company. 💜


    Shipping new code should be as small, as common, as regular, as
    boring, as unremarkable as a heartbeat.
    and CI/CD is how we get there. Right?


    So … do YOU do CI/CD?!??


  41. “YES!


    We do CI/CD.”
    …but do you really?
    “Well, we have a


    Circle-CI account?”


  42. Most people are doing *CI*… sorta …
    But CI is only the prelude to the main course.

    The ENTIRE POINT of CI is to prepare the path for you to do CD.
    Continuous DELIVERY? At least.
    Better yet,
    Continuous Deployment


  43. If you aren’t going to hook CI up to production, honestly,


    why even bother with CI? Just run your tests continuously in a shell loop from your laptop.
    Same deal, less hassle. ¯\_(ツ)_/¯
    Once you merge your code to main, it should be automatically
    deployed by default. No manual gates. ✨One hour or less✨
    Continuous Deployment is what will change your life.
    Continuous Deployment is what will change your life.
    Continuous Deployment is what will change your life.
    Continuous Deployment is what will change your life


  44. P.S.: Fear of deploys is the single largest source of
    technical debt in most organizations.


  45. The “You Had One Job” of engineering
    leadership is tuning the feedback loops of
    our sociotechnical systems.
    The speed, coverage, and cadence of your
    CI/CD pipeline will set the high water mark
    for your team’s performance.
    It can’t get any better or faster than that,
    but it can definitely get slower and worse
    downstream.


  46. That precious interval of time between when you wrote the code and
    when the code has been deployed is everything.
    (diagram: wrote the code → deployed the code)
    This is the cornerstone of high performing teams.


  47. At that moment when you finish solving a problem, your mental state
    holds everything: your original intent, motivation, implementation
    details tried and tossed, tradeoffs, variable names, etc.
    This lasts for … minutes? hours? 😬


    Until you move on to the next problem, maybe.


  48. Which is why engineers can find upwards of 80% of all bugs in that
    magical, fleeting interval, so long as they 1) have good observability
    tooling, 2) instrument their code, and 3) go and look at it. Ask yourself:


    🌟 is it doing what I expected it to?


    🌟 and does anything else look … weird?
    A predictable interval of a few minutes lets you hook into the body’s
    own intrinsic reward systems. Muscle memory. Dopamine hits! 🥰


  49. https://deepsource.io/blog/exponential-cost-of-fixing-bugs/
    The cost of finding and fixing bugs goes up exponentially
    with time elapsed since development.


  50. If it takes you hours (or even days!) to get a single line of code out,
    welcome to the software development death spiral.


  51. a longer interval between when code is written & deployed leads to
    … larger diffs
    … longer turnaround time for code review
    … multiple changes getting batched up and deployed at once
    … makes it hard to identify whose code is at fault
    … which severs ownership of changes
    … and soon requires specialists to deploy, run, monitor, and debug
    … more and more engineering cycles are spent waiting on each other
    … now we need to hire more engineers, managers, TPMs, project managers
    … more people and teams incur more coordination costs
    … more time spent paging state in and out of your brain
    … which all costs MORE TIME …😱


  52. large diffs, long review turnaround, batched up changes in a single deploy, complicated outage recovery
    processes, bloated org, coordination costs, tool proliferation, too many teams, burnout, boredom,
    boilerplate, unhappy customers, competitive losses, too little time spent on core business problems…
    You can spend your life chasing symptoms and pathologies …
    Or you can fix it at the source.
    60 minutes or bust.


  53. A fast cycle time is an enormous
    competitive advantage.
    It is worth taking up this fight. ☺
    I have never known a company where engineers were happy and
    customers were unhappy, or vice versa.


    Users’ and engineers’ happiness tends to rise and fall in tandem.


  54. “We can’t do this because of regulations…”
    Bullshit.
    Engineers can be overly literal. You are interpreters between
    security, legal, and tech…not transcriptionists.
    YOU are the experts in your code. YOU are the experts in
    software development. YOU are responsible for resolving
    conflicting requirements from security, legal and dev.


  55. Again: there is NO LAW or regulatory
    framework preventing you from following
    modern software development best practices.
    None.


    Zero.


    Zip.


  56. We are all on the same side.


    This is about better security, not worse.
    Documentation is a HUGE part of what matters, so use this to
    your advantage. Document what you’re going to do up front, do
    what you say you’re going to do, then document that you did it.
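
One way to "document that you did it" without a manual approval gate is to have the pipeline write its own evidence. The sketch below is an assumption-heavy illustration: the `POLICY` contents, the record fields, and the hash chaining are all hypothetical, and whether an auditor accepts records like these depends on your own written policies.

```python
import hashlib
import json

# Hypothetical written policy: checks every deploy must pass.
POLICY = {"required_checks": ["code_review", "unit_tests", "vulnerability_scan"]}


def _digest(body):
    """Hash a record body in a stable, key-sorted JSON form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def evidence_record(commit, checks, prev_hash):
    """Record what was checked for a deploy, chained to the previous record."""
    missing = [c for c in POLICY["required_checks"] if not checks.get(c)]
    record = {
        "commit": commit,
        "checks": checks,
        "compliant": not missing,
        "missing": missing,
        "prev_hash": prev_hash,
    }
    record["hash"] = _digest(record)
    return record


def verify_chain(records, first_prev="genesis"):
    """Return True if no record was altered or removed after the fact."""
    prev = first_prev
    for record in records:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev or record["hash"] != _digest(body):
            return False
        prev = record["hash"]
    return True


passed = {"code_review": True, "unit_tests": True, "vulnerability_scan": True}
first = evidence_record("abc123", passed, "genesis")
second = evidence_record(
    "def456", {"code_review": True, "unit_tests": True}, first["hash"]
)
```

Here the policy is the up-front documentation, the pipeline run is doing what you said, and the chained records are the after-the-fact evidence.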


  57. How to drive change in your org:
    Start by… building relationships. Get to know your peers in security and legal.
    Understand the constraints they are working under. They are probably held
    responsible for a pile of nightmares that you have no idea even exists. ☠
    Come to understand their pain, develop empathy for them. Then help
    them understand your pain and develop some empathy for you.
    Start small. Look for ways to demonstrate what you’re talking about
    with small wins that benefit everyone.
    Get anyone and everyone you can to read “Accelerate”.
    This will take time…possibly years, at calcified organizations.
    And you won’t progress much without SOME cover from the top.


  58. P.S. Learn this phrase:


    “Compensating Controls”
    “I’m not following the letter of the law, but I have this other
    system that proves I’m following the spirit of the law”


  59. Instrument for observability.
    Engineers shouldn’t need full production access; you should be able to
    understand your software with just commit access and observability.
    Observability is what gives us the confidence to move swiftly, not blindly.


  60. Good SLOs actually check
    multiple boxes for us.
    Executive visibility into important numbers, monitoring, alerts, etc …
    instead of needing a different system for each one, SLOs cover many.
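
As a small worked example of why one SLO can serve several audiences, the sketch below turns a target and two counters into an error budget. The function name, the returned fields, and the numbers are assumptions for illustration.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Summarize how much of an availability SLO's error budget is left."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # The budget is the number of failures the SLO allows over the window.
    allowed = (1 - slo_target) * total_requests
    return {
        "allowed_failures": allowed,
        "budget_remaining": 1 - failed_requests / allowed,
        "exhausted": failed_requests > allowed,
    }


# Example: a 99.9% target over one million requests allows about 1,000 failures.
healthy = error_budget(0.999, 1_000_000, 250)
burned = error_budget(0.999, 1_000_000, 1_500)
```

The same three values can back an executive dashboard (`budget_remaining`), an alert (`exhausted`), and a compliance report on availability.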


  61. “How well does your team perform?”
    Your team’s performance is defined by your
    sociotechnical systems, and especially by
    the speed of your feedback loops.
    It isn’t just about the security or economic arguments…


  62. High-performing teams
    spend the majority of their time solving interesting, novel
    problems that move the business materially forward.
    Everybody wants to be on teams like these.


  63. This is a quality of life issue.
    This is an ethical issue.
    We must build high-performing teams that are low in toil and
    high in Autonomy, Mastery, and Meaning. This begins with
    keeping your intervals low and your feedback loops tight.


  64. The End ☺


  65. Charity Majors


    @mipsytipsy
