Slide 1

Slide 1 text

Vault on GKE


Slide 2

Slide 2 text

Profile
 Takaaki Komazaki (@komattaka)
 SRE Team at Merpay

Slide 3

Slide 3 text

Agenda

Slide 4

Slide 4 text

Architecture


Slide 5

Slide 5 text

Microservice Architecture
 Diagram labels: Clients (API), Clients (Browser), Mercari App, CDN, Merpay API Gateway, API service (two instances), microservice A, microservice B, microservice C

Slide 6

Slide 6 text

Microservice Architecture & Organization
 ● A microservice architecture lets both the services and the organization scale
 
 ● Features and data are separated per service, so each team can develop and operate its service with OWNERSHIP of its own:
 ○ Code
 ○ Team
 ○ Database
 ○ Kubernetes Namespace
 ○ GCP Project


Slide 7

Slide 7 text

Terraform module
 “Microservices Starter Kit”


Slide 8

Slide 8 text

Microservices Starter Kit
 Diagram labels:
 ● Project for Centralized GKE Cluster: Kubernetes Engine (namespaces for Merpay API Gateway, API, microservice A, microservice B), Stackdriver Monitoring, Logging, BigQuery, Cloud Load Balancing, …
 ● Microservice resources for team A and for team B, each containing:
 ○ Developers
 ○ GitHub Team
 ○ GCP Project + resources
 ○ Vault
 ○ Datadog
 ○ PagerDuty
 ○ Sentry


Slide 9

Slide 9 text

Microservices Starter Kit Coverage
 Diagram labels: namespace microservice A, Developers, GitHub Team, GCP Project + resources, Vault, Datadog, PagerDuty, Sentry
 ● Kubernetes Namespace
 ● Kubernetes RBAC
 ○ view
 ○ edit
 ● Istio injection
 ● GCP
 ○ Project
 ○ Audit Logging
 ○ Folder
 ● Vault
 ○ Path
 ○ AuthN/AuthZ
 ● etc.
 ➸ Developers cannot freely modify the Terraform module, so besides making setup easier for developers it also lets us enforce certain settings on them.
 ➸ The Terraform module does not necessarily cover areas that are left to each team's decision.


Slide 10

Slide 10 text

Benefits of Terraform module for the Platform Managers
 ● The module can enforce some settings on the developers
 ○ Settings that follow the design doc
 (Kubernetes Namespace, Vault Policy, ...)
 ○ Security requirements
 (Audit logging, Container Scanning, Container Analysis, ...)
 ● Using a module does not prevent developers from using raw Terraform resources, so it won't be a blocker
 ○ Resources that depend on the decision of each microservice team
 ○ E.g. microservice A wants to use Cloud Spanner, while microservice B wants Cloud Datastore


Slide 11

Slide 11 text

Vault on Kubernetes


Slide 12

Slide 12 text

Why are we using HashiCorp Vault?
 ● In the past, we used GCP Cloud KMS
 ● As the volume of users' personal information to encrypt grew, Cloud KMS performance degraded noticeably.
 ● That’s why we reworked our in-house encryption mechanism.


Slide 13

Slide 13 text

Vault Architecture
 Diagram labels:
 ● Projects: GCP project for Vault, GCP project for centralized GKE cluster, GCP project for auditing
 ● Components: GKE, Vault Cluster (Vault, Vault, Vault), Cloud KMS, Cloud Storage, Logging, BigQuery, Service for internal, Service for external, microservice A, microservice B, GitHub Repository, Terraform
 ● People: Vault Admins, microservice teams
 ● Annotations:
 ○ Store encrypted data, system data, etc.
 ○ Send audit log
 ○ Log Sink to BQ
 ○ Request to encrypt the credential data via vault-client
 ○ Store the secret into the KV via Web UI

Slide 14

Slide 14 text

Vault and Kubernetes configuration
 Vault
 ● Enable Vault High Availability Mode
 Kubernetes
 ● requiredDuringSchedulingIgnoredDuringExecution
 ○ Use Pod anti-affinity to keep the Vault Pods from concentrating on the same Node
 ● PodDisruptionBudget
 ○ Keep at least a quorum of Pods available, avoiding split-brain
 ● Node Autoscaler is enabled on our GKE cluster, but it has caused no incidents so far
 Diagram labels: Kubernetes, Vault Cluster (Vault, Vault, Vault)
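
The two Kubernetes settings above can be sketched as plain data. This is only an illustration, not the chart used in the talk: it assumes a majority-quorum rule, a three-replica cluster as drawn in the diagram, and a hypothetical `app: vault` label; the helper names are made up for this example. The field names (`podAntiAffinity`, `requiredDuringSchedulingIgnoredDuringExecution`, `topologyKey`, `minAvailable`) are standard Kubernetes fields.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of `replicas` (for example 2 of 3, 3 of 5)."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def anti_affinity(app_label: str) -> dict:
    """Affinity fragment that forbids two Pods with the same label on one Node."""
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": {"app": app_label}},
                    # One Pod per distinct hostname, i.e. per Node.
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


def pod_disruption_budget(app_label: str, replicas: int) -> dict:
    """PodDisruptionBudget that keeps at least a quorum of Pods running."""
    return {
        # Assumption: a current cluster; older clusters used policy/v1beta1.
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{app_label}-pdb"},
        "spec": {
            "minAvailable": quorum(replicas),
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


# Hypothetical values: label "vault", three replicas.
vault_affinity = anti_affinity("vault")
vault_pdb = pod_disruption_budget("vault", 3)
```

With three replicas the budget allows only one Pod to be evicted voluntarily at a time, which is what lets node autoscaling drain a Node without taking the cluster below a majority.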

Slide 15

Slide 15 text

How we manage Vault on GKE
 ● We created our own Helm Chart for Vault
 ○ The official Helm Chart (https://github.com/hashicorp/vault-helm) did not exist back then.
 ○ The difference is that the in-house version has settings for Datadog, while the official one includes support for a Kubernetes bug (a bug that does not affect our cluster).
 ○ So for this presentation you can treat the two as equivalent.
 Diagram labels: Vault Cluster (Vault, Vault, Vault)

Slide 16

Slide 16 text

Vault for Microservices
 ● Microservices use the Key-Value store and the Transit Engine
 ○ Personal information etc. is encrypted and decrypted with the Transit Engine.
 ○ Secrets used by a microservice are stored in the Key-Value store and synchronized to a Kubernetes Secret.
 ● We are considering whether to use Dynamic Auth
 ○ Dynamic Auth is a better fit for multi-cloud, but GCP users already have Cloud IAM Conditions...
 Diagram labels: Kubernetes, Vault Cluster (Vault, Vault, Vault), Service for internal, microservice A Pods, microservice B Pods
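
To make the Transit Engine bullet concrete, here is a minimal sketch of what an encrypt call looks like at the HTTP level. It is not the vault-client mentioned in the architecture slide. It assumes the engine is mounted at `transit`, and the address, key name, and token are placeholders. The endpoint shape (`POST /v1/<mount>/encrypt/<key>` with a base64-encoded `plaintext` field and an `X-Vault-Token` header) follows Vault's Transit HTTP API. The request is only built, not sent.

```python
import base64
import json
import urllib.request


def build_transit_encrypt_request(
    vault_addr: str, mount: str, key_name: str, token: str, plaintext: bytes
) -> urllib.request.Request:
    """Build (but do not send) a Transit encrypt request."""
    url = f"{vault_addr.rstrip('/')}/v1/{mount}/encrypt/{key_name}"
    # Transit expects the plaintext to be base64-encoded inside the JSON body.
    body = json.dumps(
        {"plaintext": base64.b64encode(plaintext).decode("ascii")}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
    )


# Placeholder address, key name, and token, for illustration only.
request = build_transit_encrypt_request(
    "https://vault.example.internal:8200/",
    "transit",
    "personal-info",
    "placeholder-token",
    b"alice@example.com",
)
```

Sending the request with `urllib.request.urlopen(request)` against a real Vault would return JSON whose `data.ciphertext` field holds the encrypted value. The microservice stores that ciphertext and never handles the encryption key itself.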

Slide 17

Slide 17 text

Vault x Microservices Starter Kit
 ● The most important thing:
 How to control Path, Policy, and Namespace for Vault
 ○ Notes:
 ■ Vault has a concept similar to a directory path. We can enable Vault engines on a Path.
 ● E.g. /transit/hr-info for encrypting HR information with Transit
 ● E.g. /auth/gcp/project-a for the Auth Endpoint of project-A
 ■ Policies are how authorization is done in Vault
 ● E.g. Alice can create, read, update, and delete on the Path /kv/team-a/*, but Bob can only read on that Path
 ■ A Namespace is similar to a Kubernetes Namespace; it separates Vault virtually.
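
The Alice and Bob example can be written out as Vault policy documents. The sketch below renders them from Python. The `path "..." { capabilities = [...] }` syntax is Vault's policy format, while the helper name and the choice to generate policies from Python are assumptions made for illustration. The path is taken from the slide as-is; a KV version 2 mount would need a `data/` segment in the path.

```python
import json
from typing import List

# Capabilities accepted in Vault policy documents.
VALID_CAPABILITIES = {"create", "read", "update", "delete", "list", "sudo", "deny"}


def render_policy(path: str, capabilities: List[str]) -> str:
    """Render a single-path Vault policy document as text."""
    unknown = sorted(set(capabilities) - VALID_CAPABILITIES)
    if unknown:
        raise ValueError(f"unknown capabilities: {unknown}")
    # json.dumps yields a double-quoted list, which is also valid HCL.
    return f'path "{path}" {{\n  capabilities = {json.dumps(capabilities)}\n}}\n'


# Alice gets full access under the team path; Bob is read-only.
alice_policy = render_policy("kv/team-a/*", ["create", "read", "update", "delete"])
bob_policy = render_policy("kv/team-a/*", ["read"])
```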


Slide 18

Slide 18 text

Vault x Microservices Starter Kit
 ● Vault is integrated into the Microservices Starter Kit (Terraform module)
 ○ Benefits for the developers
 ■ All settings are done automatically by Terraform
 ○ Benefits for the platform managers
 ■ Namespaces and Paths are created according to the rules we laid down, so the Policies are easy to manage
 ■ It is easier to find which teams are using Vault
 ■ With Terraform, it is easy to restrict resources through Policy as Code tools such as Sentinel and conftest
 ● https://www.terraform.io/docs/cloud/sentinel/index.html 
 ● https://github.com/instrumenta/conftest
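
The kind of check that Policy as Code performs can be sketched in a few lines. Sentinel and conftest use their own policy languages, so this Python version is only a stand-in for the idea. The naming rule `<engine>/<team>`, the allowed engines, and the team names are all hypothetical, since the talk does not spell out the actual rule.

```python
import re
from typing import List, Set

# Hypothetical rule: every mount path is "<engine>/<team>".
ALLOWED_ENGINES = ("kv", "transit")
PATH_RULE = re.compile(r"^(?P<engine>[a-z]+)/(?P<team>[a-z][a-z0-9-]*)$")


def check_mount_path(path: str, known_teams: Set[str]) -> List[str]:
    """Return a list of rule violations for a planned Vault mount path."""
    match = PATH_RULE.match(path.strip("/"))
    if match is None:
        return [f"{path}: does not match <engine>/<team>"]
    violations = []
    if match["engine"] not in ALLOWED_ENGINES:
        violations.append(f"{path}: engine '{match['engine']}' is not allowed")
    if match["team"] not in known_teams:
        violations.append(f"{path}: team '{match['team']}' is not registered")
    return violations


# Hypothetical team registry and planned paths.
teams = {"team-a", "team-b"}
ok = check_mount_path("/kv/team-a", teams)
bad_engine = check_mount_path("/ssh/team-a", teams)
bad_team = check_mount_path("/kv/team-z", teams)
```

Because the starter kit derives paths from the module's inputs, a check like this can run against every plan before it is applied, which is where the "easy to manage" benefit comes from.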