Deployment flow of PayPay Apps

PayPay Corporation.

July 07, 2021

Transcript

  1. About me

    Masaya Ozawa
    - Joined PayPay in July 2018
      - 2018/07 ~ Backend engineer
      - 2019/05 ~ Infrastructure engineer
    - Work From Anywhere
      - I'm improving my home environment
  2. Previous deployment flow

    1. Pull Request
    2. Merge
    3. Deploy
    Diagram labels: K8s manifests repository, Non-production clusters, Production cluster
    Reference: Kubernetes use cases at PayPay (PayPayでのk8s活用事例), Kubernetes Meetup Tokyo #22 https://www.slideshare.net/PayPay_career/paypayk8s
  3. About Argo CD

    - One of the OSS projects in the Argo project
      - https://argoproj.github.io/argocd
    - Provides a CRD that keeps the state of the cluster aligned with the manifests on GitHub
    - Can deploy various middleware and applications, including Argo CD itself
  4. Previous deployment flow

    1. Pull Request
    2. Merge
    3. Deploy
    Diagram labels: K8s manifests repository, Non-production clusters, Production cluster
    Reference: Kubernetes use cases at PayPay (PayPayでのk8s活用事例), Kubernetes Meetup Tokyo #22 https://www.slideshare.net/PayPay_career/paypayk8s
  5. Previous deployment flow issues

    - As the number of applications increased, problems such as dependencies between deployments also increased.
      - For example, the deployment order when a feature is added across multiple services
    - These problems were handled through communication.
    - However, accidents still occurred, so we decided to consider a mechanism to prevent them.
    Diagram labels: 1. Pull Request, 2. Merge, 3. Deploy, K8s manifests repository, Non-production clusters, Production cluster
  6. Previous deployment flow issues

    Issue
    - When doing a production deployment, it can be difficult to see whether the version of each application deployed in production meets the requirements of the application being deployed.
    - The stg cluster has newer versions deployed than the production cluster because features are under active development.
    Solution
    - Build an environment equivalent to production and always run integration tests on it
  7. Canary Environment

    Diagram labels: Canary environment App, Dedicated Load Balancer, Canary Environment, Public App, Public Load Balancer, Production Environment
    - Run and verify automated tests on all releases
    - Operation check with a dedicated app
    - DB is shared with production
    - Safely verify operations equivalent to production
  8. Canary Environment Requirements

    - Automated testing in this environment is mandatory during production deployment.
    - The canary environment must always be kept in a production-like state
      - Image versions of applications, etc.
    Integrate into our deployment flow!!
  9. Workflow design

    In a production deployment, perform the following steps in 1 PR.
    1. Create a PR in the manifests repository
    2. Tech Lead Approval
    3. Deploy to Canary Environment
    4. Automatic integration test
    5. Manual testing of features not covered by automated testing
    6. Deploy to Production Environment
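The ordering constraint in the six steps above can be sketched as a small check. This is only an illustration, not the scripts PayPay uses: the `Step` enum and the `can_run` helper are hypothetical names, and the rule "a step may run only after every earlier step has completed" is an assumption read off the numbered list.

```python
from __future__ import annotations

from enum import IntEnum


class Step(IntEnum):
    """The six steps from the slide, numbered in the same order."""
    CREATE_PR = 1
    TECH_LEAD_APPROVAL = 2
    DEPLOY_CANARY = 3
    INTEGRATION_TEST = 4
    MANUAL_TEST = 5
    DEPLOY_PRODUCTION = 6


def can_run(step: Step, completed: set[Step]) -> bool:
    """Assumed rule: a step may run only when all earlier steps are completed."""
    return all(earlier in completed for earlier in Step if earlier < step)
```

For example, `can_run(Step.DEPLOY_PRODUCTION, {Step.CREATE_PR})` is false under this rule, because steps 2 to 5 have not completed.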
  10. Which technology to use

    What technology should we use to build a deployment flow??
    - Jenkins ??
    - GitHub Actions ??
    - Or other OSS ??
  11. Previous deployment flow (repost)

    1. Pull Request
    2. Merge
    3. Deploy
    Diagram labels: K8s manifests repository, Non-production clusters, Production cluster
  12. Which technology to use

    We decided to use GitHub Actions
    - High affinity with GitHub features, including Pull Requests
    - Common management mechanisms such as Organization Secrets have become available
    - You don't have to check other GUIs while deploying
  13. Workflow design

    Because all the processing is done within 1 PR, deployments cannot be triggered by a branch merge.
    So we decided to use GitHub tags as a means of tracking deployments.
    - https://argoproj.github.io/argo-cd/user-guide/tracking_strategies/#tag-tracking
    Create the following tags
    - <env>_release: for deployment
    - <env>_<YYYYMMDDhhmmss>: for history
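The two tag formats above can be sketched as follows. The function names are hypothetical, and the slide does not say which time zone the timestamp uses, so the caller passes the `datetime` explicitly rather than the code picking one.

```python
from datetime import datetime


def release_tag(env: str) -> str:
    """Tag tracked for deployment: <env>_release."""
    return f"{env}_release"


def history_tag(env: str, when: datetime) -> str:
    """Tag kept for history: <env>_<YYYYMMDDhhmmss>.

    The time zone of `when` is left to the caller (not stated on the slide).
    """
    return f"{env}_{when.strftime('%Y%m%d%H%M%S')}"
```

Because the timestamp is fixed-width, history tags for one environment sort chronologically when sorted as plain strings.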
  14. Workflow design

    About Tech-Lead Approval
    - Since the pull request itself has an approve function, we decided to use it.
    - We prepare a dedicated team for Tech Leads and check the approval in the following two places.
      - GitHub Actions
      - Argo CD PreSync
        - In case a GitHub tag is created outside the deployment flow
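The approval check could look roughly like this. It is a sketch under assumptions: the review records are simplified to `user` and `state` keys, listed in chronological order, and the rule "the latest review of at least one Tech Lead team member is an approval" is a guess at the intent, not the rule PayPay implements. `approved_by_tech_lead` is a hypothetical name.

```python
from __future__ import annotations


def approved_by_tech_lead(reviews: list[dict], tech_leads: set[str]) -> bool:
    """Return True if some Tech Lead's most recent review is an approval.

    `reviews` is an assumed, simplified shape: dicts with "user" and "state",
    ordered oldest first.
    """
    latest: dict[str, str] = {}
    for review in reviews:
        # Later reviews by the same user overwrite earlier ones.
        latest[review["user"]] = review["state"]
    return any(
        state == "APPROVED" and user in tech_leads
        for user, state in latest.items()
    )
```

Running the same check in both GitHub Actions and an Argo CD PreSync hook, as the slide describes, would catch a tag that was created without going through the PR.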
  15. Workflow design

    How to proceed with Workflow steps on 1 PR
    - Trigger each step with a comment starting with a slash, following the approach used by prow
    - Setting up prow just for this purpose would be over-engineering, so I decided to implement it with GitHub Actions.
    - How to do it with GitHub Actions
      - PR comments can be retrieved in the issue_comment event
      - However, detailed PR information cannot be obtained in the issue_comment event.
      - So, by adding a label, the comment is converted into the labeled event of pull_request.
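The comment-to-label conversion described above might be sketched like this. The `/canary_release` and `/prod_release` commands come from later slides; the label names, the `COMMAND_LABELS` table, and the `label_for_comment` helper are hypothetical, and actually adding the label to the PR is left out.

```python
from typing import Optional

# Slash commands from the slides mapped to labels (label names are assumptions).
COMMAND_LABELS = {
    "/canary_release": "canary_release",
    "/prod_release": "prod_release",
}


def label_for_comment(body: str) -> Optional[str]:
    """Return the label for a comment that starts with a known slash command."""
    tokens = body.strip().split()
    if not tokens:
        return None
    # Only the first token counts, so a command mentioned mid-comment is ignored.
    return COMMAND_LABELS.get(tokens[0])
```

A workflow handling issue_comment would call something like this and add the returned label, which in turn fires the labeled event of pull_request where full PR details are available.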
  16. Workflow design

    How should the processing of each step be handled?
    - All applications need to perform operations related to the deployment flow
    Manage the processing scripts in another repository
    - In the workflow job, use the checkout action to fetch and execute the scripts
    - A Personal Access Token is required to use them from GitHub Actions
    - However, the scripts can be managed centrally.
  17. Workflow design

    How to run automated tests?
    - All applications need to perform operations related to the deployment flow
    Test cases are also managed in another repository in the same way
    - Fetch the test cases managed in another repository in the workflow
    - Run them in parallel
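Running the fetched test cases in parallel can be sketched with the standard library. The slide does not say how the tests are actually parallelized (it could equally be parallel workflow jobs), so this is only one possible shape: each test case is assumed to be a callable returning `True` on success, and `run_in_parallel` is a hypothetical helper.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def _safe(case: Callable[[], bool]) -> bool:
    """Run one test case; treat any exception as a failure."""
    try:
        return bool(case())
    except Exception:
        return False


def run_in_parallel(
    test_cases: dict[str, Callable[[], bool]], max_workers: int = 4
) -> dict[str, bool]:
    """Run all test cases concurrently and return a pass/fail result per name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(_safe, case) for name, case in test_cases.items()}
        return {name: future.result() for name, future in futures.items()}
```

The deployment step would then proceed only if every value in the returned mapping is `True`.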
  18. Deployment flow

    1. Create a PR and get approval from Tech-Lead
    2. Deploy to Canary and integration test
    3. Manual testing
    4. Deploy to Production
    - As a result of the design, we decided on this deployment flow
    - Step 1 can be done in advance
    - The actual deployment flow is steps 2-4
  19. Deployment flow

    1. Create a PR and get approval from Tech-Lead
    2. Deploy to Canary and integration test
    3. Manual testing
    4. Deploy to Production
    Create a PR for deployment
    - We are using Kustomize for deployment
    - Use Kustomize's images feature to specify the image version of the application you want to deploy
    - Get approval from Tech-Lead for the created PR
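As an illustration of the images feature, the sketch below updates the `images` list of a kustomization that has already been parsed into a dictionary. The `name` and `newTag` keys are Kustomize's own; the helper name `set_image_tag` and the image name in the usage note are hypothetical, and reading or writing the YAML file is left out.

```python
def set_image_tag(kustomization: dict, image_name: str, new_tag: str) -> dict:
    """Return a copy of the kustomization with `image_name` pinned to `new_tag`."""
    # Copy each entry so the caller's data is not mutated.
    images = [dict(entry) for entry in kustomization.get("images", [])]
    for entry in images:
        if entry.get("name") == image_name:
            entry["newTag"] = new_tag
            break
    else:
        # No existing entry for this image: add one.
        images.append({"name": image_name, "newTag": new_tag})
    return {**kustomization, "images": images}
```

For instance, `set_image_tag(k, "example/payment-api", "1.1.0")` yields a kustomization whose `images` entry for that (made-up) image carries `newTag: "1.1.0"`, which is the kind of change the deployment PR would contain.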
  20. Deployment flow

    1. Create a PR and get approval from Tech-Lead
    2. Deploy to Canary and integration test
    3. Manual testing
    4. Deploy to Production
    The process is executed by the /canary_release comment.
    The following processing is executed
    - Create canary_release tag
    - Deploy to canary environment
    - Automatic QA test in Canary environment
  21. Deployment flow

    1. Create a PR and get approval from Tech-Lead
    2. Deploy to Canary and integration test
    3. Manual testing
    4. Deploy to Production
    If checks beyond those covered by automatic QA are required, perform a manual check.
    Register the check result with a comment.
  22. Deployment flow

    1. Create a PR and get approval from Tech-Lead
    2. Deploy to Canary and integration test
    3. Manual testing
    4. Deploy to Production
    The process is executed by the /prod_release comment.
    The following processing is executed
    - Create prod_release tag
    - Deploy to production environment
    We're also using argo-rollouts, so this deployment is a gradual release, making it even safer.
  23. Deployment flow (Rollback)

    1. Create a PR and get approval from Tech-Lead
    2. Deploy to Canary and integration test
    3. Manual testing
    4. Deploy to Production
    5. Production environment rollback
    If a rollback is required for some reason, you can roll back with a comment.
    This process does the following
    - Recreate prod_release tag with previous content
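One way to find the "previous content" is to look at the history tags of the form <env>_<YYYYMMDDhhmmss>. The sketch below assumes that the newest history tag corresponds to the current release, so the second newest is the rollback target; the slides do not confirm this, and `previous_history_tag` is a hypothetical helper. Moving the prod_release tag itself is left out.

```python
import re
from typing import List, Optional


def previous_history_tag(tags: List[str], env: str) -> Optional[str]:
    """Return the second-newest <env>_<YYYYMMDDhhmmss> tag, or None if absent."""
    pattern = re.compile(rf"^{re.escape(env)}_(\d{{14}})$")
    # Fixed-width timestamps sort chronologically as plain strings.
    history = sorted(tag for tag in tags if pattern.match(tag))
    # Assumption: the newest history tag is the release being rolled back.
    return history[-2] if len(history) >= 2 else None
```

The same lookup with `env="canary"` would serve the canary environment rollback that follows the production rollback.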
  24. Deployment flow (Rollback)

    1. Create a PR and get approval from Tech-Lead
    2. Deploy to Canary and integration test
    3. Manual testing
    4. Deploy to Production
    5. Production environment rollback
    6. Canary environment rollback
    After the production rollback, the canary environment also needs to be rolled back.
    This process does the following
    - Recreate canary_release tag with previous content
  25. Conclusion

    - A canary environment was prepared to solve the compatibility/dependency problems between services that the existing deployment flow had.
    Diagram labels: Canary environment App, Dedicated Load Balancer, Canary Environment, Public App, Public Load Balancer, Production Environment
  26. Conclusion

    - A deployment flow was established to make maintaining the environment and executing tests mandatory.
    Diagram steps: 1. Pull Request, 2. approve, 3. canary_release, 4. Canary deploy, 5. Integration test, 6. Manual test, 7. prod_release, 8. Prod deploy
    Diagram labels: developer, approver, K8s manifests repository, Canary cluster, Production cluster, Canary release, Integration test, Manual test result registration, Prod release
  27. Conclusion

    - In this way, a canary environment is kept ready just before the production deployment, and the final test is performed there to reduce the probability of an incident.
    - In the future, we plan to make the following improvements.
      - Reduce deployment time
      - Automatic rollback on fault detection
      - Etc.
  28. Icons used in this slide, etc.

    - Kubernetes:
      - https://github.com/kubernetes/kubernetes/tree/master/logo
      - https://github.com/kubernetes/community/tree/master/icons
    - GitHub: https://github.com/logos
    - Argo proj: https://cncf-branding.netlify.app/projects/argo/
    - Icon font
      - https://github.com/google/material-design-icons
      - https://fontawesome.com/license/free
  29. Deployment flow

    Diagram steps: 1. Pull Request, 2. approve, 3. canary_release, 4. Canary deploy, 5. Integration test, 6. Manual test, 7. prod_release, 8. Prod deploy
    Diagram labels: developer, approver, K8s manifests repository, Canary cluster, Production cluster, Canary release, Integration test, Manual test result registration, Prod release