
How We Harden Platform Security at Mercari

This is a slide for CloudNative Days Tokyo 2021 Keynote (https://event.cloudnativedays.jp/cndt2021/talks/1208).

At Mercari, we've been building an internal development platform on top of Kubernetes and the cloud-native ecosystem for more than three years. The history of building the platform is also a history of security hardening. In this session, I'll introduce the security hardening we've implemented, from basic k8s manifest security policy enforcement to supply chain integrity checking, IaC automation security, and zero-touch-based access automation.

taichi nakashima

November 04, 2021

Transcript

  1. Table of Contents
     • Microservices Platform Overview
     • 3 Cases of Hardening Platform Security
       ◦ Multi Tenant Security
       ◦ Production Operation Security
       ◦ Supply Chain Security
     • Lessons Learned
  2. Mercari and Merpay Microservices
     [Diagram: Service A, Service B, Service C, Service D, and Service E on Google Kubernetes Engine]
     • 200+ microservices
     • 4000+ Kubernetes Pods
     • 2 main businesses
  3. [Diagram labels: Service A Team / Service A; Service B Team / Service B; Service C Team / Service C; Mercari SRE and Merpay SRE (work closely or embedded); Platform Team / Platform]
  4. Multi-tenancy
     Multi-tenancy is an architecture pattern in which, instead of building a platform per business or service, you prepare an isolated tenant per service and host them together on a single platform. While multi-tenancy increases complexity, it lets you avoid reinventing the wheel across the organization, reduce operational costs, and apply improvements to all tenants at once.
  5. Principle: Least Privilege
     Least privilege means that a human user or workload must be able to access only the resources that are necessary for its legitimate purpose. In a multi-tenancy context, it's important to make sure that only a tenant's owners can access that tenant's resources.
  6. [Diagram: a Kubernetes Cluster with a Service A Namespace, a Service B Namespace, and a System Namespace, each holding containers and resources. With RBAC, the Service A Team can access the Service A Namespace (✅) but not the Service B Namespace or the System Namespace (🚫).]
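
The namespace isolation on slide 6 can be sketched as a small access check. This is only an illustrative model, not Mercari's configuration: `RoleBinding`, `is_allowed`, the team and namespace names, and the verb sets are all hypothetical stand-ins for namespace-scoped Kubernetes RBAC objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleBinding:
    """Hypothetical stand-in for a namespace-scoped Kubernetes RoleBinding."""
    team: str
    namespace: str
    verbs: frozenset


def is_allowed(bindings, team, namespace, verb):
    """Allow only if some binding grants this team this verb in this namespace.

    Anything without a matching binding is denied, which is the
    least-privilege default.
    """
    return any(
        b.team == team and b.namespace == namespace and verb in b.verbs
        for b in bindings
    )


# Assumed example data: each team is bound only to its own namespace.
BINDINGS = [
    RoleBinding("service-a-team", "service-a", frozenset({"get", "list", "update"})),
    RoleBinding("service-b-team", "service-b", frozenset({"get", "list", "update"})),
]

print(is_allowed(BINDINGS, "service-a-team", "service-a", "get"))  # own namespace
print(is_allowed(BINDINGS, "service-a-team", "service-b", "get"))  # other tenant
print(is_allowed(BINDINGS, "service-a-team", "system", "get"))     # system namespace
```

Because access is derived only from explicit bindings, a new namespace is inaccessible to every team until someone grants access to it.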
  7. [Diagram: the Service A Team, Service B Team, and Platform Team open PRs against the IaC Monorepo. In the Build System, a CI System per tenant configures the containers and resources of the Service A Tenant, the Service B Tenant, and the System Tenant.]
  8. [Diagram: identical to slide 7.]
  9. Infra as Code Monorepo
     /service-a
       module.tf
       google_spanner_database.tf
       google_storage_bucket.tf
       ...
     /service-b
       module.tf
       google_bigquery_dataset.tf
       google_pubsub_topic.tf
       ...
     /system
       module.tf
       google_container_cluster.tf
       google_compute_firewall.tf
       ...
     CODEOWNER: Service A Team ✅ /service-a, 🚫 /service-b, 🚫 /system
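
The per-directory ownership on slide 9 can be sketched as a merge gate. This is a simplified model under stated assumptions: the `OWNERS` mapping, the team names, the idea that the Platform Team owns `/system`, and the `can_merge` helper are all hypothetical, and a real CODEOWNERS file uses pattern matching rather than a plain top-level directory lookup.

```python
# Assumed mapping from top-level directory to owning team.
OWNERS = {
    "service-a": "service-a-team",
    "service-b": "service-b-team",
    "system": "platform-team",
}


def required_owner(changed_file):
    """Return the team that must approve a change to this file, or None."""
    top_level = changed_file.lstrip("/").split("/")[0]
    return OWNERS.get(top_level)


def can_merge(changed_files, approving_teams):
    """A PR can merge only if every changed file has an owner who approved."""
    for path in changed_files:
        owner = required_owner(path)
        if owner is None or owner not in approving_teams:
            return False
    return True


print(can_merge(["/service-a/google_storage_bucket.tf"], {"service-a-team"}))
print(can_merge(["/system/google_compute_firewall.tf"], {"service-a-team"}))
```

Files outside any owned directory are rejected rather than waved through, so an unowned path cannot be used to sidestep review.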
  10. [Diagram: identical to slide 7.]
  11. [Diagram: in the Build System, the CI Systems share one Build Account, which has IAM access to the containers and resources of the Service A Tenant, the Service B Tenant, and the System Tenant.]
  12. [Diagram: the Build Account (keyless) impersonates a separate keyless account per tenant: the Service A Account, the Service B Account, and the System Account. Each of these accounts has IAM access to its own tenant's containers and resources.]
  13. [Diagram: the Build Account (keyless) impersonates the Service A Account (keyless) and receives a short-lived token. Through IAM, the token works for the Service A Tenant (✅) but not for the Service B Tenant or the System Tenant (🚫).]
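
The impersonation flow on slides 12 and 13 can be modeled in a few lines. This is a toy sketch, not a call into any real IAM API: the account names, the `CAN_IMPERSONATE` table, the one-hour lifetime, and the `impersonate` and `can_access` helpers are all assumptions made for illustration.

```python
from dataclasses import dataclass

TOKEN_TTL_SECONDS = 3600.0  # assumed lifetime of a short-lived token

# Assumed (caller, target) pairs that are allowed to impersonate.
CAN_IMPERSONATE = {
    ("build-account", "service-a-account"),
    ("build-account", "service-b-account"),
    ("build-account", "system-account"),
}


@dataclass(frozen=True)
class Token:
    account: str       # the account this token acts as
    expires_at: float  # seconds on the same clock as `now`


def impersonate(caller, target, now):
    """Mint a short-lived token for `target` if `caller` may impersonate it."""
    if (caller, target) not in CAN_IMPERSONATE:
        raise PermissionError(f"{caller} may not impersonate {target}")
    return Token(account=target, expires_at=now + TOKEN_TTL_SECONDS)


def can_access(token, resource_owner, now):
    """A token works only for its own tenant and only before it expires."""
    return token.account == resource_owner and now < token.expires_at


token = impersonate("build-account", "service-a-account", now=0.0)
print(can_access(token, "service-a-account", now=10.0))    # own tenant
print(can_access(token, "service-b-account", now=10.0))    # other tenant
print(can_access(token, "service-a-account", now=7200.0))  # after expiry
```

No long-lived key appears anywhere in the model: a leaked token is useful for one tenant and only until it expires.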
  14. Goal: Zero Touch Production
 The specific goal of these interfaces—like

    Zero Touch Production (ZTP) ,..., is to make Google safer and reduce outages by removing direct human access to production roles. Instead, humans have indirect access to production through tooling and automation that make predictable and controlled changes to production infrastructure. - Building Secure and Reliable Systems, Chapter 5
  15. [Diagram: a Kubernetes Cluster with a Service A Namespace, a Service B Namespace, and a System Namespace. The Service A Team has direct View access to the containers and resources in the Service A Namespace; Edit goes through the IaC Repository + Build System.]
  16. [Diagram: same as slide 15, with a second Edit path added through a Temporary Role Grant.]
  17. [Diagram: same as slide 16, with a third Edit path added through Automated Workflows.]
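
The Temporary Role Grant on slides 16 and 17 can be sketched as a time-bounded elevation over a read-only default. This is a hypothetical model: `TemporaryGrant`, `effective_role`, the role names, and the user and namespace names are illustrative and do not describe the actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporaryGrant:
    user: str
    namespace: str
    expires_at: float  # seconds on the same clock as `now`


def effective_role(user, namespace, grants, now):
    """Return the role a team member holds in their own namespace.

    Assumes `user` belongs to the team that owns `namespace`. The
    default is "view"; an unexpired grant raises it to "edit".
    """
    for grant in grants:
        if grant.user == user and grant.namespace == namespace and now < grant.expires_at:
            return "edit"
    return "view"


grants = [TemporaryGrant("alice", "service-a", expires_at=1800.0)]
print(effective_role("alice", "service-a", grants, now=60.0))    # during the grant
print(effective_role("alice", "service-a", grants, now=1800.0))  # after expiry
print(effective_role("bob", "service-a", grants, now=60.0))      # no grant
```

Since the elevation is computed from the expiry time on every check, nobody has to remember to revoke it.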
  18. [Diagram: supply chain stages Source, Build, Dependency, Registry, Deploy, and Cluster, annotated with threats:]
     • Bypass code review
     • Compromise source control system
     • Alter code
     • Compromise build system
     • Inject bad/vulnerable dependency
     • Compromise artifact registry
     • Inject bad container image
     • Compromise deploy system
     • Use bad image
  19. Practice: Verify Artifacts, Not Just People
     The controls around the source, build, and test infrastructure have limited effect if adversaries can bypass them by deploying directly to production. It is not sufficient to verify who initiated a deployment, because that actor may make a mistake or may be intentionally deploying a malicious change. Instead, deployment environments should verify what is being deployed.
     - Building Secure and Reliable Systems, Chapter 14
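
One way to picture "verify what is being deployed" is a digest check at deploy time. This is a deliberately minimal sketch, assuming the trusted build system records the SHA-256 digest of every artifact it produces; the `digest` and `admit` helpers are hypothetical, and real deployments typically rely on signed attestations rather than a bare digest set.

```python
import hashlib


def digest(artifact: bytes) -> str:
    """Content digest of an artifact, in the usual 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(artifact).hexdigest()


def admit(artifact: bytes, trusted_digests: set) -> bool:
    """Admit an artifact only if the trusted build system produced it.

    The check looks at the artifact itself and ignores who is deploying.
    """
    return digest(artifact) in trusted_digests


# Assumed: the build system recorded this digest when it built the image.
built_image = b"image bytes produced by the build system"
trusted = {digest(built_image)}

print(admit(built_image, trusted))                 # unchanged artifact
print(admit(b"image pushed by someone", trusted))  # anything else
```

An artifact pushed straight to the registry, even by an authorized person, fails this check because it never went through the build.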
  20. Secure By Default
     Security hardening is a “migration”, and migrations take a lot of time and money. Build the security policy as an allowlist instead of a denylist!
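
The difference between the two policy styles shows up when something new appears. The sketch below uses container image registries as the example; the registry prefixes and both helper functions are hypothetical and chosen only to illustrate the point.

```python
# Assumed example policies for container image sources.
ALLOWED_PREFIXES = ("gcr.io/example-project/",)
DENIED_PREFIXES = ("docker.io/known-bad/",)


def allowlist_admits(image: str) -> bool:
    """Admit only images from explicitly approved registries."""
    return image.startswith(ALLOWED_PREFIXES)


def denylist_admits(image: str) -> bool:
    """Admit everything except images from explicitly banned registries."""
    return not image.startswith(DENIED_PREFIXES)


# An image source that neither policy author anticipated.
unknown = "quay.io/unknown/tool:1.0"

print(denylist_admits(unknown))   # admitted, because nobody banned it yet
print(allowlist_admits(unknown))  # rejected by default
print(allowlist_admits("gcr.io/example-project/app:1"))
```

With a denylist, every newly discovered risk means finding and migrating workloads that already depend on it. With an allowlist, new cases start out rejected and are approved deliberately.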
  21. Build Abstraction
     Hide infrastructure and security complexity from developers, and let experts control it centrally in the background. This makes future migrations easy!