
How We Harden Platform Security at Mercari


These are the slides for the CloudNative Days Tokyo 2021 keynote (https://event.cloudnativedays.jp/cndt2021/talks/1208).

At Mercari, we've been building an internal development platform on top of Kubernetes and the cloud-native ecosystem for more than 3 years. The history of building the platform is also a history of security hardening. In this session, I'm going to introduce the security hardening we've implemented, from basic k8s manifest security policy enforcement to supply chain integrity checking, IaC automation security, and zero-touch-based access automation.


Taichi Nakashima

November 04, 2021

Transcript

  1. How We Harden Platform Security CloudNative Days Tokyo 2021

  2. (image-only slide, no text)
  3. How We Harden Platform Security CloudNative Days Tokyo 2021

  4. Taichi Nakashima @deeeet / @tcnksm, Engineering Head of Developer Productivity Engineering
  5. https://e34.fm

  6. Table of Contents
    • Microservices Platform Overview
    • 3 Cases of Hardening Platform Security
      ◦ Multi-tenant Security
      ◦ Production Operation Security
      ◦ Supply Chain Security
    • Lessons Learned
  7. Microservices Platform Overview

  8. [Diagram: Services A to E running as Mercari and Merpay microservices on Google Kubernetes Engine]
    200+ microservices, 4000+ Kubernetes Pods, 2 main businesses
  9. [Diagram: Service A, B, and C teams each own their service; Mercari SRE and Merpay SRE work closely with, or are embedded in, the service teams; the Platform Team owns the Platform underneath]
  10. Harden Platform Security

  11. Base Principle: Shared Responsibility


  12. Base Principle: Defense in Depth


  13. Harden Platform Security
    • Multi-tenant Security
    • Production Operation Security
    • Supply Chain Security
  14. Multi-tenancy
    Multi-tenancy is an architecture pattern which, instead of building a platform per business or service, prepares an isolated tenant per service and hosts the tenants together on a single platform. While multi-tenancy increases complexity, it lets you avoid reinventing the wheel across the organization, reduce operational costs, and deliver improvements to every tenant at once.
  15. Principle: Least Privilege
    Least privilege means a human user or workload must be able to access only the resources that are necessary for its legitimate purpose. In a multi-tenancy context, it's important to make sure that only a tenant's owners are able to access that tenant's resources.
  16. Multi-tenant Least Privilege on
    • Kubernetes Cluster
    • Infrastructure as Code (IaC) Monorepo and Build System
  17. [Diagram: a Kubernetes cluster with a Service A namespace, a Service B namespace, and a System namespace, each holding its own containers and resources. RBAC lets the Service A team into the Service A namespace (✅) and blocks it from the Service B and System namespaces (🚫)]
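The namespace-per-tenant RBAC layout on slide 17 can be sketched as follows. This is a minimal illustration, not Mercari's actual tooling: the helper names, the group name, and the choice of the built-in `view` ClusterRole as the default are assumptions. The manifest fields follow the Kubernetes `rbac.authorization.k8s.io/v1` RoleBinding schema.

```python
# Sketch: one RoleBinding per tenant namespace, so a team's access is
# scoped to its own namespace. Helper and group names are hypothetical.

RBAC_GROUP = "rbac.authorization.k8s.io"


def tenant_role_binding(namespace: str, team_group: str, role: str = "view") -> dict:
    """Build a RoleBinding manifest that grants `role` inside one namespace only."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{namespace}-team-{role}", "namespace": namespace},
        "subjects": [{"kind": "Group", "name": team_group, "apiGroup": RBAC_GROUP}],
        # Binding a ClusterRole through a RoleBinding limits it to this namespace.
        "roleRef": {"kind": "ClusterRole", "name": role, "apiGroup": RBAC_GROUP},
    }


def can_access(bindings: list, team_group: str, namespace: str) -> bool:
    """Toy check: does any binding give this group access to this namespace?"""
    return any(
        b["metadata"]["namespace"] == namespace
        and any(s["name"] == team_group for s in b["subjects"])
        for b in bindings
    )


bindings = [tenant_role_binding("service-a", "service-a-team@example.com")]
```

With only this binding in place, `can_access(bindings, "service-a-team@example.com", "service-b")` is false, which mirrors the 🚫 marks on the slide.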
  18. Multi-tenant Least Privilege on
    • Kubernetes Cluster
    • Infrastructure as Code (IaC) Monorepo and Build System
  19. [Diagram: the Service A, Service B, and Platform teams send PRs to the IaC monorepo; the build system runs one CI system per tenant, and each CI system configures the containers and resources of its tenant (Service A tenant, Service B tenant, System tenant)]
  20. [Diagram: same IaC monorepo and build system flow as slide 19]
  21. [Diagram: Infra as Code Monorepo. CODEOWNER lets the Service A team approve changes under /service-a (✅) and blocks it from /service-b and /system (🚫)]
    /service-a: module.tf, google_spanner_database.tf, google_storage_bucket.tf, ...
    /service-b: module.tf, google_bigquery_dataset.tf, google_pubsub_topic.tf, ...
    /system: module.tf, google_container_cluster.tf, google_compute_firewall.tf, ...
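The per-directory ownership on slide 21 could be expressed as a CODEOWNERS file generated from a tenant list. The sketch below is an assumption-heavy illustration: the `@example-org/...` team handles and both helper functions are hypothetical, and `owner_of` uses plain longest-prefix matching rather than the full pattern rules a real CODEOWNERS file supports.

```python
# Sketch: generate CODEOWNERS entries for tenant directories in an IaC monorepo.
# Team handles are hypothetical placeholders.

TENANT_OWNERS = {
    "/service-a/": "@example-org/service-a-team",
    "/service-b/": "@example-org/service-b-team",
    "/system/": "@example-org/platform-team",
}


def render_codeowners(owners: dict) -> str:
    """Render one 'path owner' line per tenant directory."""
    return "".join(f"{path} {team}\n" for path, team in owners.items())


def owner_of(changed_file: str, owners: dict):
    """Simplified lookup: the owner of the longest matching directory prefix."""
    matches = [path for path in owners if changed_file.startswith(path)]
    if not matches:
        return None
    return owners[max(matches, key=len)]


codeowners_text = render_codeowners(TENANT_OWNERS)
```

A change to `/service-a/google_spanner_database.tf` then resolves to the Service A team, while files outside any tenant directory have no owner in this toy model.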
  22. [Diagram: same IaC monorepo and build system flow as slide 19]
  23. [Diagram: the build system runs every tenant's CI system under a single Build Account, whose IAM permissions reach the resources of the Service A, Service B, and System tenants]
  24. [Diagram: the keyless Build Account impersonates a keyless per-tenant account (Service A Account, Service B Account, System Account); IAM gives each tenant account access to its own tenant's containers and resources]
  25. [Diagram: for a Service A change, the keyless Build Account impersonates the keyless Service A Account and receives a short-lived token; IAM allows that token to reach the Service A tenant (✅) and blocks it from the Service B and System tenants (🚫)]
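The impersonation flow on slides 24 and 25 can be modeled in a few lines. This is a toy in-memory model, not a call into any cloud API: the account e-mail addresses, the directory-to-account mapping, the ten-minute lifetime, and every function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from a tenant directory to its keyless service account.
TENANT_ACCOUNTS = {
    "service-a": "service-a-ci@example-project.iam.gserviceaccount.com",
    "service-b": "service-b-ci@example-project.iam.gserviceaccount.com",
    "system": "system-ci@example-project.iam.gserviceaccount.com",
}


def account_for_change(changed_paths: list) -> str:
    """Pick the single tenant account a build may impersonate for this change."""
    tenants = {path.strip("/").split("/")[0] for path in changed_paths}
    if len(tenants) != 1:
        raise PermissionError("a build must touch exactly one tenant directory")
    tenant = tenants.pop()
    if tenant not in TENANT_ACCOUNTS:
        raise PermissionError(f"unknown tenant: {tenant}")
    return TENANT_ACCOUNTS[tenant]


def issue_short_lived_token(account: str, now: datetime,
                            lifetime: timedelta = timedelta(minutes=10)) -> dict:
    """Stand-in for impersonation: a token bound to one account and an expiry."""
    return {"account": account, "expires_at": now + lifetime}


def token_allows(token: dict, resource_owner: str, now: datetime) -> bool:
    """A token works only for its own tenant account and only until it expires."""
    return token["account"] == resource_owner and now < token["expires_at"]


now = datetime(2021, 11, 4, 12, 0, tzinfo=timezone.utc)
token = issue_short_lived_token(account_for_change(["service-a/module.tf"]), now)
```

Because no long-lived key exists in this model, a leaked token is useless for other tenants and stops working once `expires_at` passes.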
  26. Harden Platform Security
    • Multi-tenant Security
    • Production Operation Security
    • Supply Chain Security
  27. https://sre.google/books/building-secure-reliable-systems/

  28. Goal: Zero Touch Production
    "The specific goal of these interfaces, like Zero Touch Production (ZTP), ..., is to make Google safer and reduce outages by removing direct human access to production roles. Instead, humans have indirect access to production through tooling and automation that make predictable and controlled changes to production infrastructure."
    - Building Secure and Reliable Systems, Chapter 5
  29. [Diagram: the Service A team has View access to the Service A namespace (containers and resources) in the Kubernetes cluster; Edit goes through the IaC repository and build system. The Service B and System namespaces sit in the same cluster]
  30. [Diagram: as slide 29, plus a second Edit path for the Service A team through a Temporary Role Grant]
  31. [Diagram: as slide 30, plus a third Edit path through Automated Workflows, alongside the IaC repository and build system and the Temporary Role Grant]
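The "Temporary Role Grant" box on slides 30 and 31 suggests time-bound elevation from view to edit. The slides do not show how the grant is implemented, so the following is only a sketch of the idea: the `TemporaryGrant` record, the one-hour default, and the mandatory reason field are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TemporaryGrant:
    """Hypothetical record of a time-bound edit grant on one namespace."""
    user: str
    namespace: str
    expires_at: datetime
    reason: str


def grant_edit(user: str, namespace: str, reason: str, now: datetime,
               duration: timedelta = timedelta(hours=1)) -> TemporaryGrant:
    """Create a grant that expires on its own; a reason is required for audit."""
    if not reason.strip():
        raise ValueError("a reason is required for a temporary grant")
    return TemporaryGrant(user, namespace, now + duration, reason)


def has_edit(user: str, namespace: str, grants: list, now: datetime) -> bool:
    """Edit access exists only while a matching, unexpired grant exists."""
    return any(
        g.user == user and g.namespace == namespace and now < g.expires_at
        for g in grants
    )


now = datetime(2021, 11, 4, 12, 0, tzinfo=timezone.utc)
grants = [grant_edit("alice@example.com", "service-a", "investigate incident", now)]
```

The useful property is that nothing has to revoke the grant: once `expires_at` passes, `has_edit` returns false and the user is back to view-only access.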
  32. Harden Platform Security
    • Multi-tenant Security
    • Production Operation Security
    • Supply Chain Security
  33. [Diagram: software supply chain stages: Source, Dependency, Build, Registry, Deploy, Cluster]

  34. [Diagram: software supply chain stages (Source, Dependency, Build, Registry, Deploy, Cluster) annotated with threats]
    • Bypass code review
    • Alter code
    • Compromise source control system
    • Inject bad/vulnerable dependency
    • Compromise build system
    • Compromise artifact registry
    • Inject bad container image
    • Compromise deploy system
    • Use bad image
  35. Practice: Verify Artifacts, Not Just People
    "The controls around the source, build, and test infrastructure have limited effect if adversaries can bypass them by deploying directly to production. It is not sufficient to verify who initiated a deployment, because that actor may make a mistake or may be intentionally deploying a malicious change. Instead, deployment environments should verify what is being deployed."
    - Building Secure and Reliable Systems, Chapter 14
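  36. [Diagram: the supply chain (Source, Dependency, Build, Registry, Deploy, Cluster) with a Metadata store added; the artifact is signed (Sign) and Kritis checks the signature before the cluster accepts the image (Check ✅)]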
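The sign-then-check flow on slide 36 can be illustrated with a small model of "verify what is being deployed". This is not how Kritis works internally: an HMAC over the image digest stands in for a real asymmetric signature, and the function names, the in-memory attestation store, and the key handling are assumptions chosen to keep the sketch self-contained.

```python
import hashlib
import hmac


def image_digest(image_bytes: bytes) -> str:
    """Content-address an image, as container registries do with sha256 digests."""
    return "sha256:" + hashlib.sha256(image_bytes).hexdigest()


def sign_digest(digest: str, key: bytes) -> str:
    """Build-time step: attest to a digest (HMAC as a stand-in for a signature)."""
    return hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()


def admit(digest: str, attestations: dict, key: bytes) -> bool:
    """Deploy-time step: admit only digests that carry a valid attestation."""
    signature = attestations.get(digest)
    if signature is None:
        return False
    return hmac.compare_digest(signature, sign_digest(digest, key))


key = b"demo-key-held-by-the-trusted-build-system"
trusted = image_digest(b"image built by the trusted pipeline")
attestations = {trusted: sign_digest(trusted, key)}  # the "Metadata" store
```

An image pushed straight to the registry by anyone outside the build system has no entry in `attestations`, so `admit` rejects it regardless of who initiated the deployment.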
  37. Lessons Learned

  38. Secure By Default
    Security hardening after the fact is a "migration" that takes a lot of time and cost. Build the security policy as an allowlist, instead of a denylist!
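The difference between the two policy styles on slide 38 shows up as soon as something new appears that nobody listed. The sketch below compares both on container capabilities; the specific sets are illustrative assumptions, not Mercari's policy.

```python
# Sketch: the same request evaluated by a denylist and by an allowlist.
# Capability names are real Linux capabilities; the policy sets are examples.

DENIED_CAPABILITIES = {"SYS_ADMIN", "NET_ADMIN"}
ALLOWED_CAPABILITIES = {"NET_BIND_SERVICE"}


def denylist_violations(requested: list) -> list:
    """Denylist: reject only what someone remembered to list."""
    return sorted(set(requested) & DENIED_CAPABILITIES)


def allowlist_violations(requested: list) -> list:
    """Allowlist: reject everything that was not explicitly approved."""
    return sorted(set(requested) - ALLOWED_CAPABILITIES)


requested = ["NET_BIND_SERVICE", "SYS_PTRACE"]
missed_by_denylist = denylist_violations(requested)
caught_by_allowlist = allowlist_violations(requested)
```

Here the denylist reports no violations because `SYS_PTRACE` was never listed, while the allowlist flags it. With an allowlist, tightening the policy later does not require migrating workloads that silently depended on the gap.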
  39. Build Abstraction
    Hide infrastructure and security complexity from developers and let experts control it centrally in the background. This makes future migrations easy!
  40. Example abstraction built internally at Mercari with CUE
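The transcript does not include the CUE example from slide 40, so the following only illustrates the general idea of such an abstraction, written in Python rather than CUE. The `render_deployment` function, its parameters, and the chosen defaults are assumptions; the manifest fields themselves follow the Kubernetes `apps/v1` Deployment schema.

```python
# Sketch: developers provide a few fields; the platform expands them into a
# full Deployment with security settings applied centrally.


def render_deployment(name: str, image: str, replicas: int = 2) -> dict:
    """Expand a small service spec into a Deployment with secure defaults."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Security settings live here, not in each team's config.
                        "securityContext": {
                            "runAsNonRoot": True,
                            "allowPrivilegeEscalation": False,
                            "readOnlyRootFilesystem": True,
                        },
                    }],
                },
            },
        },
    }


manifest = render_deployment("service-a", "registry.example.com/service-a:1.0.0")
```

Since the security context is generated rather than hand-written, changing a default means editing one function instead of migrating every service's manifests.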

  41. Thank you!

  42. We are Hiring! https://careers.mercari.com/search-jobs/