Slide 1

Slide 1 text

@mipsytipsy Modern Development Best Practices in Highly Regulated Environments: Case Studies “How We Did It”

Slide 2

Slide 2 text

Modern software development practices: 1. Engineers owning their own code in production 2. Practicing observability-driven development 3. Testing in production 4. Separating deploys from releases using feature flags 5. Continuous deployment (or at least delivery)
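
As a rough illustration of item 4, here is a minimal sketch of a feature flag that keeps a deployed code path dark until it is released to a percentage of users. The flag store, the flag name `new_checkout`, and the functions `is_enabled` and `checkout` are all hypothetical; they are not taken from any company in this deck or from any particular flag library.

```python
import hashlib

# Hypothetical in-memory flag store. A real system would read flags from a
# flag service or config so they can change without a deploy.
FLAGS = {"new_checkout": {"enabled": True, "rollout_percent": 10}}


def is_enabled(flag_name: str, user_id: str, flags: dict = FLAGS) -> bool:
    """Return True if the flag is released to this user."""
    flag = flags.get(flag_name)
    if flag is None or not flag["enabled"]:
        return False
    # Hash the flag name and user id into a stable bucket from 0 to 99, so
    # the same user always gets the same answer for a given flag.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < flag["rollout_percent"]


def checkout(user_id: str) -> str:
    # Both code paths are deployed; the flag decides which one is released.
    if is_enabled("new_checkout", user_id):
        return "new checkout flow"
    return "old checkout flow"
```

Raising `rollout_percent` releases the new path to more users without another deploy, and setting `enabled` to `False` turns it off again.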

Slide 3

Slide 3 text

Modern software development practices are ✨ALL✨ about FAST FEEDBACK LOOPS: getting your code into production as fast as possible after writing it.

Slide 4

Slide 4 text

“Explain it to me like I’m five”: Regulations: you are subject to these if you operate under their domain, e.g. GDPR, CCPA, HIPAA, PCI/DSS, etc ✨Security✨ Frameworks: you may be audited to ensure you conform to these, e.g. SOC2, ISO 27001, NIST, FedRAMP etc Your security team has written policies for compliance with these, and your legal team signs contracts with customers.

Slide 5

Slide 5 text

Frameworks & regulations are not prescriptive. None of them forbid any modern development practices. However, these practices MAY conflict with your own written policies. They might also conflict with terms in your own customer contracts.

Slide 6

Slide 6 text

Policies are living documents. They should be subject to regular review and reconsideration. But! Do your security and legal teams know when to push back or loop you in? Contracts should be negotiated, not just signed. Engineering should have a say.

Slide 7

Slide 7 text

How many times have you heard: “We’re a regulated industry. Therefore…” ❌ We can’t let developers deploy their own code due to segregation of duties ❌ All changes must be approved by a Change Advisory Board ❌ Trunk-based development is not allowed ❌ No testing in production, or developer access to production ❌ You cannot log anything ❌ You must log everything, and cannot delete anything ❌ You are not allowed to use any SaaS, or multi-tenant databases or compute ❌ You are not permitted to refactor your code ❌ Manual testing must occur before each deploy ❌ Auto-deploying your code is not permissible ❌ Auto-deploying is mandatory For more, see this thread ➡ ➡ ➡ https://twitter.com/mipsytipsy/status/1694163770753601887

Slide 8

Slide 8 text

All of that is ✨bullshit.✨ Stand by for proof: a long list of case studies of companies who are auto-deploying, developing off trunk, getting code into production in a matter of minutes, etc. All of them are subject to the same regulations you are. Some of them may be your competitors.

Slide 9

Slide 9 text

How Etsy did it (in 2013!!): • Decouple the cardholder data and PCI/DSS regulations from the rest of the system • The systems that form the cardholder data environment (CDE) are separated from the rest of Etsy’s environments at the physical, network, source code, and logical infra levels • The CDE is built and operated by an xfn team that is solely responsible for the CDE. Again, this limits the scope of the PCI DSS regulations to just this team. https://queue.acm.org/detail.cfm?id=3190610

Slide 10

Slide 10 text

How Honeycomb does it: • Subject to privacy laws such as GDPR, CCPA, HIPAA (BAA) • Security framework adapted to SOC2 trust services criteria (confidentiality and security) • Auto-deploys once an hour off trunk via a cron job. Extensive investment into tests. Takes about an hour for code to go live. • Practices trunk-based dev, short-lived branches, code reviews • Access Management policy based on least privilege model. Access to PII/prod data is limited to those with a business need.

Slide 11

Slide 11 text

How Branch Insurance does it: • Regulated by 36 states and DC, annual SOC2s • Production data and envs mostly isolated from most engineers; only TLs can analyze production telemetry for PII purposes (despite masking and filtering and tokenizing) • Every developer has their own AWS account, massive investment in testing. Trunk-based development. • Uses serverless extensively; pushes to trunk many times/day, pushes to prod many times/week, in under an hour end to end.

Slide 12

Slide 12 text

How Stytch does it: • Certified ISO27001 and SOC2 Type 2, subject to GDPR, CCPA • Auto-deploys on PR merge with an average of 13 min before code goes live, approximately 30 times/week • Trunk-based development with optional on-demand preview environments for PRs. Extensive integration testing before merge! • Data access granted to people who need it for their jobs, with data auditing and masking to further ensure user privacy

Slide 13

Slide 13 text

How Entrata does it: • Subject to A LOT of compliance audits, including PCI-DSS • Keeps PCI environment isolated on a separate private network, AWS account, GitHub org, etc. PCI codebase has no external deps, can be tested in isolation. Owned by a single eng team. • Can deploy a line of PCI-compliant code to production in 15 min • Code review before merging to main, then test on staging, cut a release to production branch, deploy to prod. Access to db, app servers is extremely limited. • 20 year old company; code originally written w/o unit tests

Slide 14

Slide 14 text

How Ocado Technology does it: • Certified SOC1, SOC2, PCI/DSS; also subject to GDPR • Hundreds of apps in production, owned by ~200 teams • On average, code gets deployed to production every 3 minutes • Takes ~1 hour for code to get to production after a merge. Practices canary + rolling deploys over the course of 4-5 days. • Data access granted to people who need it for their jobs, with data auditing and masking to further ensure user privacy https://handbook.ocado.tech/#/sw-development/technical-standards?id=encryption-of-personal-data https://handbook.ocado.tech/#/sw-development/hallmarks https://handbook.ocado.tech/#/sw-development/maturity-model

Slide 15

Slide 15 text

How ClarityAI does it: • Certified ISO27001 and SOC2 Type 2, practiced a joint audit strategy to streamline time and resources • Some teams practice Continuous Deployment and deploy several times per day using trunk-based development, TDD, and pairing • Other teams deploy at least once per day using short-lived branches https://medium.com/clarityai-engineering/iso27001-and-soc2-type-ii-from-greenfield-to-success-24ca99decb26

Slide 16

Slide 16 text

How Bankwest (Perth) does it: • Deploys to production within a few hours • Worked to get rid of the Change Advisory Board for most uses. First defined some types of changes as lower risk to avoid Change Approval processes, then worked hard to make almost every change fit those lower risk definitions. • Feature flags, separating deploys/releases, backwards compatible changes, API expand/rollout/contract, small releases deployed often, observability in production
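
One way to picture the "lower risk" definitions is as an automated check that decides whether a change can skip the Change Approval process. This is only a sketch: the `Change` fields, the `MAX_LOW_RISK_LINES` threshold, and the `needs_change_approval` function are assumptions made for illustration, loosely based on the practices listed above, and are not Bankwest's actual criteria.

```python
from dataclasses import dataclass

# Hypothetical threshold for what counts as a "small" change.
MAX_LOW_RISK_LINES = 200


@dataclass
class Change:
    behind_feature_flag: bool   # deploy is separated from release
    backwards_compatible: bool  # e.g. the "expand" step of an API change
    lines_changed: int


def needs_change_approval(change: Change) -> bool:
    """Return True if the change falls outside the low-risk definition."""
    low_risk = (
        change.behind_feature_flag
        and change.backwards_compatible
        and change.lines_changed <= MAX_LOW_RISK_LINES
    )
    return not low_risk
```

The engineering work is then to shape almost every change so that this kind of check returns `False`.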

Slide 17

Slide 17 text

How Cabify does it: • Practices Continuous Delivery, deploys 1-6 times a day, lead time for changes is 35 min • Certified PCI/DSS on the payments side, financial audit for the entire company • Feature flags, separating deploys/releases, backwards compatible changes, API expand/rollout/contract, small releases deployed often, observability in production

Slide 18

Slide 18 text

How Ping Identity does it: • Certified ISO27001, SOC2 Type 2 • Took about an hour to deploy • Auditors cared about what pipeline did, what gates there were, what controls we had. • Merge requests required approval from someone not the author, tests needed to run and pass, someone needed to approve before deployment
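
The gates described above can be sketched as a simple pre-deploy check. The `MergeRequest` shape and the `failed_gates` and `can_deploy` functions are hypothetical names for illustration; a real pipeline would enforce these controls in the CI/CD tool itself rather than in application code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class MergeRequest:
    author: str
    approvers: Set[str] = field(default_factory=set)
    tests_passed: bool = False
    deploy_approved_by: Optional[str] = None


def failed_gates(mr: MergeRequest) -> List[str]:
    """Return the gates this merge request has not yet cleared."""
    failures = []
    # The author's own approval does not count.
    if not (mr.approvers - {mr.author}):
        failures.append("needs approval from someone other than the author")
    if not mr.tests_passed:
        failures.append("tests must run and pass")
    if mr.deploy_approved_by is None:
        failures.append("deployment needs an approver")
    return failures


def can_deploy(mr: MergeRequest) -> bool:
    return not failed_gates(mr)
```

Returning the list of failed gates, rather than a bare boolean, gives a record of which control blocked a deploy.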

Slide 19

Slide 19 text

How SALTO does it: • Certified ISO27001, working on SOC2 Type 2 • Deploys several times a day • No one has access to raw data. If something must be checked against databases, it must be 1) requested, 2) approved by a manager, 3) run through a system that anonymizes data • Practices GitOps (TF, Flux2, k8s) to avoid manually writing to prod • Oncall and a few other people have read access to prod
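
The three-step data access flow (requested, approved by a manager, run through an anonymizing system) might look roughly like this. Everything here is an assumption for illustration: the `DataRequest` type, the `SENSITIVE_FIELDS` set, and the hashing approach are not SALTO's actual system, and hashing a value is strictly pseudonymization rather than full anonymization.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical set of column names that must never be returned raw.
SENSITIVE_FIELDS = {"email", "name"}


@dataclass
class DataRequest:
    requester: str                     # step 1: someone requests the check
    query: str
    approved_by: Optional[str] = None  # step 2: a manager approves it


def anonymize(row: Dict[str, object]) -> Dict[str, object]:
    """Replace sensitive values with a short, stable hash."""
    out = {}
    for key, value in row.items():
        if key in SENSITIVE_FIELDS:
            out[key] = hashlib.sha256(str(value).encode()).hexdigest()[:12]
        else:
            out[key] = value
    return out


def run_request(request: DataRequest, rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Step 3: return results only for approved requests, anonymized."""
    if request.approved_by is None:
        raise PermissionError("request has not been approved by a manager")
    return [anonymize(row) for row in rows]
```

Here `rows` stands in for the result of running `request.query` against the database; the sketch does not execute any query itself.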

Slide 20

Slide 20 text

How Duffel does it: • PCI L1 compliant • Can get a line of code into prod in 30 min (!!!) • Deploys from trunk, runs static analysis as part of CI/CD • Mandatory PR review approvals from an accepted PCI group, which turns into a merge commit after approval. • Merge commit SHA is the source of a container image • Uses a lot of Security Command Center premium features for threat detection, vulnerabilities, time to resolution.

Slide 21

Slide 21 text

How toplyne.io does it: • Certified SOC2 Type 2, subject to GDPR, HIPAA, and CCPA, working on ISO27001:2022 • Trunk-based deployment, manual PR reviews via GitStream • All teams deploy multiple times a day, and can deploy one line of code in <15 min • Platform engineering owns Security and Compliance • Multiple tests run for SAST and DAST in CI and during deploys

Slide 22

Slide 22 text

How AudioStack does it: • Certified SOC2 Type 1, subject to GDPR; working on ISO27001 and SOC2 Type 2 • All security checks run automatically with GitHub and other tools as part of CI/CD architecture • Deploy takes about 30 minutes • Deploys to prod at least daily, after tests pass and a merge request has been reviewed and approved • Restricts access to data, least privilege access

Slide 23

Slide 23 text

How Jack Henry does it: • Certified SOC1, SOC2, PCI/DSS; subject to FBA, state banking regs • 300+ different applications in K8s. 100+ deploys per day. • Column-level encryption on DBs allows devs to have read access to prod DBs (✨cool!!✨) • Code review before merging to main. Release gets cut and runs through user acceptance testing; approvals sent to stakeholders, deploy to production kicks off once approved • Takes about 30 min for code to get to production after a merge. All changes are canaried.

Slide 24

Slide 24 text

How up.com.au does it: • Certified SOC1, SOC2, PCI/DSS; subject to Aussie banking regs • Deploys to production around hourly • Massively parallel, fully automated test suite spins up a replica of production in seconds, uses Rspec and Appium to run thousands of tests on every change • Takes around 20 min to run the full test suite, then decommissions replica. Can turn around changes to prod in minutes • Two-speed architecture lets us deploy changes constantly on the customer-facing side, and deliberately on the banking side.

Slide 25

Slide 25 text

Stop blaming regulations and frameworks. This is all about how we decide to interpret the standards. ✨This is not their fault.✨

Slide 26

Slide 26 text

We are all on the same side. ❤ This is about better security, too.

Slide 27

Slide 27 text

We need engineers & leaders who understand the existential urgency of a short cycle time, and will fight for it. Not just once or twice. Every day.

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

Hey, you! ✨Hi!✨ Do YOU work at a company that is subject to regulations and standards, but uses modern development best practices (continuous deployment, observability-driven development, fast feedback loops, auto-deploys, etc)? Would you like to be on a slide? ☺ DM me on twitter @mipsytipsy or email me at [email protected] and let’s do this! 🥰 You don’t have to be “perfect” (no one is). Let’s show the world just how doable this is!! ❤🔥 P.S. This is also GREAT for recruiting…just sayin’.

Slide 30

Slide 30 text

For more, see my slides on “Why Compliance And Regulatory Standards Are Not Incompatible With Modern Development Best Practices” https://speakerdeck.com/charity/compliance-and-regulatory-standards-are-not-incompatible-with-modern-development-best-practices or just go to: https://speakerdeck.com/charity