Slide 1

Slide 1 text

Cloud Run CI/CD + QA at Souzoh (ソウゾウ)
Ryuzo Yamamoto
Cloud Run Casual Talk!

Slide 2

Slide 2 text

Self-Introduction
Ryuzo Yamamoto (山本 竜三), @dragon3
Software Engineer, Lead Architect / SRE at Souzoh in Fukuoka

Slide 3

Slide 3 text

Souzoh (ソウゾウ) / Mercari Shops (メルカリShops)

Slide 4

Slide 4 text

Agenda
● Architecture, Tech Stack
● CI/CD with GitHub Actions self-hosted runner
● Pull Request Environment
● Deploy & Canary Rollout

Slide 5

Slide 5 text

Architecture
[Diagram components: Cloud Load Balancing; Next.js (Cloud Run); GraphQL (Cloud Run); imgproxy (Cloud Run); microservices (Cloud Run, 70+ services); Cloud Storage; Cloud SQL; Memorystore]

Slide 6

Slide 6 text

Tech Stack
● Monorepo
  ○ Go, TypeScript, Python, Java
  ○ 70+ microservices
● Bazel, Turborepo
● GraphQL / gRPC
● Serverless (Cloud Run)
● PostgreSQL, Redis
● Cloud Pub/Sub, Tasks, Workflows, Scheduler, Vertex AI

Slide 7

Slide 7 text

Agenda
● Architecture, Tech Stack
● CI/CD with GitHub Actions self-hosted runner
● Pull Request Environment
● Deploy & Canary Rollout

Slide 8

Slide 8 text

CI/CD with GitHub Actions self-hosted runner
[Diagram labels: monorepo; self-hosted runners (runner, job); development: GCR (Push), Cloud Run Service (Deploy); production: "Same as development"; Bazel Remote Cache; NAT; External Services]

Slide 9

Slide 9 text

Agenda
● Architecture
● CI/CD with GitHub Actions self-hosted runner
● Pull Request Environment
● Deploy & Canary Rollout

Slide 10

Slide 10 text

Pull Request Environment
An environment where the changes in a Pull Request can be deployed before merging and tested end-to-end.
[Diagram labels: Stable Environment (main); Pull Request Environment #123; Pull Request Environment #465]

Slide 11

Slide 11 text

Pull Request Environment
[Diagram labels: incoming request with Host: example.com and X-PR-ENV: 123; Cloud Load Balancing; forwarded request with Host: pr123.nextjs.example.com; URL mask routing (..example.com); Next.js, GraphQL, and Service A on Cloud Run, each with stable and pr123 revisions]

Slide 12

Slide 12 text

Pull Request Environment
[Diagram labels: incoming request with Host: example.com and X-PR-ENV: 123; Cloud Load Balancing; request from Next.js with Path: /graphql and Host: pr123.graphql.example.com; URL mask routing (..example.com); Next.js, GraphQL, and Service A on Cloud Run, each with stable and pr123 revisions]
Cloud Run Tag URL: https://pr123---echo-XXXXXXXXXX-an.a.run.app
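
The routing on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the routing code used at Souzoh: the function names `pr_env_host` and `cloud_run_tag_url` are hypothetical, and the sketch assumes that a numeric `X-PR-ENV` header selects the `pr<number>` host, with everything else falling back to the stable environment. The tag URL pattern follows the example shown on the slide.

```python
from typing import Mapping, Optional


def pr_env_host(headers: Mapping[str, str], service: str,
                base_domain: str = "example.com") -> Optional[str]:
    """Return the PR-environment host for a request, or None for stable.

    Assumption: a numeric X-PR-ENV header selects the PR environment.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {k.lower(): v for k, v in headers.items()}
    pr_number = normalized.get("x-pr-env", "").strip()
    if not pr_number.isdigit():
        return None  # no valid PR number: route to the stable environment
    return f"pr{pr_number}.{service}.{base_domain}"


def cloud_run_tag_url(tag: str, service: str, project_hash: str,
                      region_code: str) -> str:
    """Build a Cloud Run tag URL shaped like the one on the slide."""
    return f"https://{tag}---{service}-{project_hash}-{region_code}.a.run.app"


if __name__ == "__main__":
    print(pr_env_host({"X-PR-ENV": "123"}, "graphql"))
    print(cloud_run_tag_url("pr123", "echo", "XXXXXXXXXX", "an"))
```

Returning `None` for the stable case keeps the decision of where stable traffic goes outside this helper, since the slides do not show a stable host name.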

Slide 13

Slide 13 text

Pull Request Environment
[Diagram labels: self-hosted runners (runner, job); GCR; development; Deploy; multiple Cloud Run services (Service A), each with a main (stable) revision and a pr123 revision]

Slide 14

Slide 14 text

Agenda
● Architecture
● CI/CD with GitHub Actions self-hosted runner
● Pull Request Environment
● Deploy & Canary Rollout

Slide 15

Slide 15 text

Deploy - Custom deploy tool

    # Deploy to the Production environment
    cli deploy main \
      --spec production/services.yaml \
      --service echo \
      --image-tag 1.0.0

    # Deploy to Pull Request environment #123
    cli deploy pr \
      --spec development/services.yaml \
      --service echo \
      --number 123

    # production/services.yaml
    project: awesome-project
    environment: production
    location: asia-northeast1
    services:
      - name: echo
        image: gcr.io/awesome-project/echo
        env:
          - name: LOG_LEVEL
            value: info
        auto_scaling:
          min: 0
          max: 100
        capacity:
          memory: 512Mi
          cpu: 1
        concurrency: 100
        request_timeout: 30
        connection:
          use_http2: true
        ...
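
The slides do not show how the custom deploy tool is implemented. As a rough sketch of the idea, the following maps a spec like the one above onto `gcloud run deploy` flags. Everything here is an assumption for illustration: the function name `build_deploy_args`, the placement of `concurrency` and `request_timeout` at the service level, and the choice to deploy a Pull Request revision with a `pr<number>` tag and no traffic.

```python
from typing import Any, Dict, List, Optional

# Spec mirroring the services.yaml on the slide, written as a Python dict
# to keep the sketch free of third-party YAML dependencies.
SPEC: Dict[str, Any] = {
    "project": "awesome-project",
    "location": "asia-northeast1",
    "services": [{
        "name": "echo",
        "image": "gcr.io/awesome-project/echo",
        "env": [{"name": "LOG_LEVEL", "value": "info"}],
        "auto_scaling": {"min": 0, "max": 100},
        "capacity": {"memory": "512Mi", "cpu": 1},
        "concurrency": 100,
        "request_timeout": 30,
        "connection": {"use_http2": True},
    }],
}


def build_deploy_args(spec: Dict[str, Any], service_name: str, image_tag: str,
                      pr_number: Optional[int] = None) -> List[str]:
    """Translate one service entry into a `gcloud run deploy` argument list."""
    svc = next(s for s in spec["services"] if s["name"] == service_name)
    env = ",".join(f"{e['name']}={e['value']}" for e in svc.get("env", []))
    args = [
        "gcloud", "run", "deploy", svc["name"],
        f"--project={spec['project']}",
        f"--region={spec['location']}",
        f"--image={svc['image']}:{image_tag}",
        f"--min-instances={svc['auto_scaling']['min']}",
        f"--max-instances={svc['auto_scaling']['max']}",
        f"--memory={svc['capacity']['memory']}",
        f"--cpu={svc['capacity']['cpu']}",
        f"--concurrency={svc['concurrency']}",
        f"--timeout={svc['request_timeout']}",
    ]
    if env:
        args.append(f"--set-env-vars={env}")
    if svc.get("connection", {}).get("use_http2"):
        args.append("--use-http2")
    if pr_number is not None:
        # Assumption: a PR revision gets its own tag and receives no
        # regular traffic, so it is reachable only through the tag URL.
        args += [f"--tag=pr{pr_number}", "--no-traffic"]
    return args


if __name__ == "__main__":
    print(" ".join(build_deploy_args(SPEC, "echo", "1.0.0")))
```

Building an argument list rather than a shell string avoids quoting problems if the list is later passed to `subprocess.run`.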

Slide 16

Slide 16 text

Canary Rollout
[Diagram labels: canaryrollout (Cloud Run) managing Service A and Service B (Cloud Run). Traffic states shown for each service: stable 50% / canary 50%; stable 100%; stable and canary with 100% / 0%. An "Error" label also appears]

Slide 17

Slide 17 text

Canary Rollout
[Diagram labels: Cloud Scheduler (Trigger every minute); canaryrollout (Cloud Run); Monitoring (Get metrics); Service A (Cloud Run) with stable and canary revisions (Update traffic); Custom deploy tool; labels: canaryrollout: enabled]
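
One iteration of such a periodically triggered loop might look like the sketch below. It is not the canaryrollout implementation, which the slides do not show. The function `decide_next_step` and the `Decision` type are hypothetical, the parameter names mirror the configuration keys shown on the canary rollout configuration slide, and the order of the checks (roll back on a high error rate once enough requests have been seen, otherwise wait or promote) is an assumption. The error rate is assumed to be in the same unit as `max_error_rate`, and `steps` is assumed to be non-empty and ascending.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Decision:
    action: str          # "wait", "promote", "rollback", or "done"
    canary_percent: int  # traffic percentage the canary should receive


def decide_next_step(
    current_percent: int,
    steps: Sequence[int],
    requests: int,
    error_rate: float,
    seconds_since_last_step: float,
    min_requests: int,
    max_error_rate: float,
    time_between_rollouts: float,
) -> Decision:
    """Decide what to do with the canary on one scheduler tick."""
    if current_percent >= steps[-1]:
        return Decision("done", current_percent)
    # Only trust the error rate once enough requests have been observed.
    if requests >= min_requests and error_rate > max_error_rate:
        return Decision("rollback", 0)
    if requests < min_requests or seconds_since_last_step < time_between_rollouts:
        return Decision("wait", current_percent)
    # Advance to the first configured step above the current percentage.
    next_percent = next(s for s in steps if s > current_percent)
    return Decision("promote", next_percent)


if __name__ == "__main__":
    print(decide_next_step(10, [10, 30, 60, 100], 80, 0.01, 400, 50, 0.1, 300))
```

Keeping the decision a pure function of the observed metrics makes it easy to test separately from the calls that fetch metrics and update traffic.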

Slide 18

Slide 18 text

Canary Rollout

    # production/services.yaml
    ...
    services:
      - name: echo
        ...
        canary_rollout:
          enable: true
          rollout_percent_steps: [10, 30, 60, 100]
          min_requests: 50
          metrics_provider_type: grpc
          time_between_rollouts: 300s
          max_error_rate: 0.1
        ...

[Arrow from services.yaml to the Cloud Run Service, labeled: Deploy]

    # Cloud Run Service
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: echo
      ...
      labels:
        canaryrollout: enabled
        ...
      annotations:
        canaryrollout.souzoh.com/rolloutPercentSteps: 10,30,60,100
        canaryrollout.souzoh.com/minRequests: '50'
        canaryrollout.souzoh.com/metricsProviderType: grpc
        canaryrollout.souzoh.com/timeBetweenRollouts: 300s
        canaryrollout.souzoh.com/maxErrorRate: '0.1'
        ...
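
Because annotation values are always strings, a reader of these annotations has to convert them back into typed settings. The sketch below shows one way that could be done. The `CanaryConfig` type and `parse_canary_annotations` function are hypothetical names, and the sketch assumes durations are always written in whole seconds with an `s` suffix, as in the example on the slide.

```python
from dataclasses import dataclass
from typing import List, Mapping

PREFIX = "canaryrollout.souzoh.com/"  # annotation prefix shown on the slide


@dataclass(frozen=True)
class CanaryConfig:
    rollout_percent_steps: List[int]
    min_requests: int
    metrics_provider_type: str
    time_between_rollouts_seconds: int
    max_error_rate: float


def parse_canary_annotations(annotations: Mapping[str, str]) -> CanaryConfig:
    """Convert string-valued service annotations into typed settings."""
    def get(key: str) -> str:
        return annotations[PREFIX + key]

    duration = get("timeBetweenRollouts")
    if not duration.endswith("s"):
        # Assumption: only the "<seconds>s" form is supported here.
        raise ValueError(f"unsupported duration: {duration!r}")

    return CanaryConfig(
        rollout_percent_steps=[
            int(p) for p in get("rolloutPercentSteps").split(",")
        ],
        min_requests=int(get("minRequests")),
        metrics_provider_type=get("metricsProviderType"),
        time_between_rollouts_seconds=int(duration[:-1]),
        max_error_rate=float(get("maxErrorRate")),
    )


if __name__ == "__main__":
    example = {
        PREFIX + "rolloutPercentSteps": "10,30,60,100",
        PREFIX + "minRequests": "50",
        PREFIX + "metricsProviderType": "grpc",
        PREFIX + "timeBetweenRollouts": "300s",
        PREFIX + "maxErrorRate": "0.1",
    }
    print(parse_canary_annotations(example))
```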

Slide 19

Slide 19 text

Wrap Up
● Architecture
● CI/CD with GitHub Actions self-hosted runner
● Pull Request Environment
● Deploy & Canary Rollout