Serving GitHub Actions Self-Hosted Runner as a Platform (Part 1: Introduction)

Keke
November 05, 2021

Merpay SRE Tech Talk 2021/11/05

Transcript

  1. Serving GitHub Actions Self-Hosted Runner as a Platform
    Merpay・Mercoin SRE @keke
    Merpay SRE Tech Talk 2021/11/05
  2. Agenda
    01 • Introduction to GitHub Actions Self-Hosted Runner (Part 1)
    02 • Serve as a Platform (Part 2)
    Today is part 1 😄
  3. 01 Introduction to GitHub Actions Self-Hosted Runner (Part 1)
  4. What’s GitHub Actions?
    GitHub's native continuous integration (CI) tool, triggered by GitHub events (e.g., pull request creation, a comment on an issue, etc.).
    A “Workflow” has one or more “Jobs”, and each job runs steps that use an “Action”, which is offered officially, by third parties, or written by yourself.
  5. Workflow
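  6. How does it work?
    Developers, GitHub, the GitHub Queue, and Runners that run the Workflow:
    ① Runners listen to a queue (HTTP long polling)
    ② Developers perform an operation on GitHub
    ③ GitHub dispatches the event to the GitHub Queue
    ④ A Runner gets the event and runs the Workflow
    ⑤ The Runner sends back results and updates the commit status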
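
The polling flow on this slide can be sketched in a few lines of Python. This is only an in-process simulation under stated assumptions: a `queue.Queue` stands in for the GitHub Queue, a short blocking `get` stands in for an HTTP long poll, and the names `poll_once`, `run_job`, and `runner_loop` are hypothetical. It is not the protocol that actions/runner actually speaks.

```python
import queue


def poll_once(job_queue, timeout_seconds):
    """One long-poll attempt: wait until a job arrives or the timeout expires."""
    try:
        return job_queue.get(timeout=timeout_seconds)
    except queue.Empty:
        return None  # nothing was dispatched during this poll


def run_job(job):
    """Stand-in for running the workflow; returns the result to report back."""
    return {"job": job["name"], "commit_status": "success"}


def runner_loop(job_queue, max_polls, timeout_seconds=0.01):
    """Poll the queue max_polls times and collect the results of any jobs."""
    results = []
    for _ in range(max_polls):
        job = poll_once(job_queue, timeout_seconds)
        if job is None:
            continue  # poll again, as a long-polling client would
        results.append(run_job(job))
    return results


if __name__ == "__main__":
    github_queue = queue.Queue()
    # Step 3 of the slide: an event is dispatched to the queue.
    github_queue.put({"name": "test", "event": "pull_request"})
    print(runner_loop(github_queue, max_polls=3))
```

The point of the sketch is the direction of the connection: the runner always calls out to the queue, so nothing has to reach into the runner's network.
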
  7. Types of Runners: Cloud Hosted and Self-Hosted
  8. Specify Runner types 👈
  9. Types of Runners: 1. Cloud Hosted (GitHub Managed) Runners
    Users don’t need to prepare anything (except money 💰).
    The Runners that run the Workflow are managed by GitHub:
    ① Runners listen to a queue (HTTP long polling)
    ② Developers perform an operation on GitHub
    ③ GitHub dispatches the event to the GitHub Queue
    ④ A Runner gets the event
    ⑤ The Runner sends back results and updates the commit status
  10. Major Limitations of Cloud Hosted Runners
    OS: the available images and the installed software are limited.
      • Windows Server ◦ 2022 ◦ 2019
      • macOS ◦ Big Sur 11 ◦ Catalina
      • Ubuntu ◦ 20.04 ◦ 18.04
    Network Egress: IP addresses are not static at the GitHub organization level and are shared with other organizations, so a strict firewall can’t be configured. A GitHub organization can have IP restrictions, in which case the runners can’t access GitHub.
      Ref: Access the GitHub Meta API for the IP addresses (there are many 😄)
      $ curl https://api.github.com/meta
    Hardware: only one machine type is offered.
      • 2 CPU cores
      • 7GB RAM
      • 14GB SSD
      Sometimes that is too much or too little, so it is not cost efficient. For example, big Go projects want CPU-specific machines while Next.js projects want memory-specific machines.
    References:
      • About GitHub-hosted runners, https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners
      • GitHub Actions Virtual Environments, https://github.com/actions/virtual-environments
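
To make the egress limitation concrete, here is a minimal sketch of checking whether an address falls inside the CIDR ranges that the Meta API publishes. It assumes the JSON response has already been fetched and parsed, and that the ranges for Actions sit under the `actions` key. The sample ranges below are made-up documentation addresses, not real GitHub ranges, and `ip_in_ranges` is a hypothetical helper.

```python
import ipaddress


def ip_in_ranges(ip, cidr_ranges):
    """Return True if ip belongs to at least one of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    for cidr in cidr_ranges:
        network = ipaddress.ip_network(cidr, strict=False)
        # An IPv4 address can never be inside an IPv6 network, and vice versa.
        if address.version == network.version and address in network:
            return True
    return False


if __name__ == "__main__":
    # Made-up sample shaped like a parsed Meta API response (assumption).
    sample_meta = {"actions": ["192.0.2.0/24", "2001:db8::/32"]}
    print(ip_in_ranges("192.0.2.17", sample_meta["actions"]))   # True
    print(ip_in_ranges("203.0.113.5", sample_meta["actions"]))  # False
```

Because the real list is long and shared by every GitHub customer, allowing all of it in an IP allow list effectively allows traffic from other organizations' workflows too.
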
  11. Types of Runners: 2. Self-Hosted Runners (actions/runner)
    GitHub only dispatches the event to the GitHub Queue.
    The Runners that run the Workflow are managed by users:
    ① Runners register themselves and listen to a queue
    ② Developers perform an operation on GitHub
    ③ GitHub dispatches the event to the GitHub Queue
    ④ A Runner gets the event
    ⑤ The Runner sends back results and updates the commit status
  12. 3 Types of Self-Hosted Runners
    • Repository Level
    • Organization Level
    • Enterprise Level
    (Diagram: each level illustrated with repositories Repo A, Repo B, Repo C, Repo Z, Repo W and organizations Org A, Org P.)
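
Each level registers through a different GitHub REST API path. As a rough sketch, the helper below (a hypothetical name, `registration_token_path`) maps a level to the path of the "create a registration token" endpoint. It only builds the path; the authenticated POST request and the runner's own configuration step are left out.

```python
def registration_token_path(scope, target):
    """Build the REST API path used to request a runner registration token.

    scope  -- "repository", "organization", or "enterprise"
    target -- "owner/repo" for a repository, otherwise the org or enterprise slug
    """
    if scope == "repository":
        owner, separator, repo = target.partition("/")
        if not (owner and separator and repo):
            raise ValueError("repository target must look like 'owner/repo'")
        return f"/repos/{owner}/{repo}/actions/runners/registration-token"
    if scope == "organization":
        return f"/orgs/{target}/actions/runners/registration-token"
    if scope == "enterprise":
        return f"/enterprises/{target}/actions/runners/registration-token"
    raise ValueError(f"unknown scope: {scope}")


if __name__ == "__main__":
    # "org-a" and "repo-a" are placeholder names taken from the diagram labels.
    print(registration_token_path("repository", "org-a/repo-a"))
    print(registration_token_path("organization", "org-a"))
```
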
  13. Major advantages of Self-Hosted Runners
    OS: Any 😄
    Network Egress: Send egress through NAT, VPN, etc.; it depends on the user 😄 (the IP can be made static)
    Hardware: Any 😄
  14. Major disadvantages of Self-Hosted Runners
    Scalability: Hard to manage efficiently.
    Provisioning (Platform): Two big problems:
      • Need to register the Runner to the organization, repository, etc.
      • Need to manage the Runner for each organization, repository, etc.
    Provisioning (Machines): Hard to provision the runners. Sometimes we want to offer many machine types:
      • Standard
      • CPU specific
      • Memory specific
      • “Big” machines like other cloud provider products (GCP Cloud Build, AWS EC2, etc.)
    Actual metrics
  15. How to manage it?
    Disclaimer: The following slides include things that have already been implemented or are planned to be implemented. As such, they may change later or may not be implemented yet.
  16. Provisioning Runner’s Machines
    Offer repository level runners (currently organization level) to isolate the runners by microservice.
    Run a dedicated GKE cluster with actions-runner-controller/actions-runner-controller, a Kubernetes controller to manage Self-Hosted Runners.
    The runners created by this controller’s CRD RunnerDeployment are backed by StatefulSets.
    On start, the runner accesses the GitHub API and registers itself to the repository.
  17. Kubernetes Manifest to provision runners
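
The manifest shown on this slide did not survive in the transcript. As a stand-in, the sketch below builds a RunnerDeployment-shaped manifest as a Python dict and prints it as JSON, which kubectl also accepts. The `actions.summerwind.dev/v1alpha1` API group and the `repository`, `labels`, and `resources` fields reflect actions-runner-controller as it was around the time of the talk and should be checked against the installed CRD version. The function name and every value are placeholders, not the author's configuration.

```python
import json


def runner_deployment(name, repository, replicas, labels, cpu, memory):
    """Return a RunnerDeployment-shaped manifest as a plain dict (a sketch)."""
    return {
        "apiVersion": "actions.summerwind.dev/v1alpha1",
        "kind": "RunnerDeployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "template": {
                "spec": {
                    # Repository level runner: "owner/repo".
                    "repository": repository,
                    # Extra labels that workflows can select on.
                    "labels": list(labels),
                    "resources": {"requests": {"cpu": cpu, "memory": memory}},
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = runner_deployment(
        name="example-high-cpu",                # placeholder
        repository="example-org/example-repo",  # placeholder
        replicas=2,
        labels=["high-cpu"],                    # placeholder label
        cpu="20",
        memory="20Gi",
    )
    print(json.dumps(manifest, indent=2))
```
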

  18. Provisioning Machines variations
    Cloud Hosted Runners only provide one machine type (2 CPU cores, 7GB RAM, 14GB SSD) while the Self-Hosted Runners can be any machine type by adding labels.
      • 1 CPU core, 100GB RAM 👉 Big Next.js App
      • 20 CPU cores, 20GB RAM 👉 Docker builds
      • 0.1 CPU core, 1GB RAM 👉 Closing stale PRs
    Labels shown on the slide: standard, memory, high, tiny
  19. Runners with labels 👈
  20. Specify Runner by labels
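
Label selection follows a simple rule: a job can only go to a runner that carries every label the job asks for. The sketch below illustrates that matching rule. The runner names and the `high-memory`, `high-cpu`, and `tiny` labels are hypothetical, since the exact label names on the slide are not recoverable from the transcript.

```python
def eligible_runners(runs_on, runners):
    """Return the names of runners that carry every label listed in runs_on.

    runs_on -- list of labels requested by a job
    runners -- mapping of runner name to the list of labels on that runner
    """
    wanted = set(runs_on)
    return [name for name, labels in runners.items() if wanted <= set(labels)]


if __name__ == "__main__":
    # Hypothetical fleet with one runner per machine variation.
    fleet = {
        "runner-memory": ["self-hosted", "high-memory"],
        "runner-cpu": ["self-hosted", "high-cpu"],
        "runner-tiny": ["self-hosted", "tiny"],
    }
    print(eligible_runners(["self-hosted", "high-cpu"], fleet))
    print(eligible_runners(["self-hosted"], fleet))
```

A job that lists only `self-hosted` matches the whole fleet, so each workflow should name the machine variation it actually needs.
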

  21. Scaling Runners
    actions-runner-controller/actions-runner-controller also offers a webhook server and a CRD HorizontalRunnerAutoscaler (HRA) that receive the webhook events and scale the runners (similar to the Kubernetes HorizontalPodAutoscaler (HPA)).
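  22. Scaling Runners
    Developers, GitHub, the Webhook Server, the GitHub Queue, and Runners:
    ① Developers perform an operation on GitHub
    ② GitHub sends a webhook (workflow_job) to the Webhook Server and dispatches the event to the GitHub Queue
    ③ The Webhook Server creates a Runner
    ④ The Runner registers itself and listens to the queue
    ⑤ The Runner gets the event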
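
The scaling decision can be sketched as a pure function over the webhook payload. This assumes the `workflow_job` event carries an `action` of `queued`, `in_progress`, or `completed` and a `workflow_job.labels` list. The one-up, one-down rule and the name `desired_replicas` are simplifications for illustration, not the controller's actual algorithm.

```python
def desired_replicas(current, event, min_replicas=0, max_replicas=10):
    """Return the new replica count after one workflow_job webhook event."""
    labels = event.get("workflow_job", {}).get("labels", [])
    if "self-hosted" not in labels:
        return current  # the job targets cloud hosted runners; ignore it
    action = event.get("action")
    if action == "queued":
        return min(current + 1, max_replicas)  # a job is waiting: add a runner
    if action == "completed":
        return max(current - 1, min_replicas)  # a job finished: remove a runner
    return current  # e.g. "in_progress" changes nothing


if __name__ == "__main__":
    queued = {"action": "queued", "workflow_job": {"labels": ["self-hosted"]}}
    completed = {"action": "completed", "workflow_job": {"labels": ["self-hosted"]}}
    replicas = desired_replicas(0, queued)
    replicas = desired_replicas(replicas, completed)
    print(replicas)
```

Reacting to webhooks rather than polling metrics lets the runner count follow the queue closely, which matters when the minimum is zero.
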
  23. Network Egress
    For security reasons, some GitHub Organizations have IP restrictions on the GitHub Organization.
    Cloud Hosted Runners have a wide IP range that is scoped to GitHub as a whole, not to the organization.
    Like normal “microservices”, the egress of Self-Hosted Runners uses a static public IP, made possible by Cloud NAT + Cloud Router.
  24. ✅ Provisioning Runners variations
    ✅ Scalability of Runners
    ✅ GitHub IP restrictions
  25. 02 Serve as a Platform (Part 2)
    Next Time 🌕 🚀
    Some topics
    ・How to manage the tenants on Kubernetes?
    ・How to automatically register the repository runners?
    ・Integration with Starter Kits?
  26. Thank you!