4 What’s GitHub Actions?
A GitHub-native continuous integration (CI) tool that is triggered by GitHub events (e.g., pull request creation, a comment on an issue, etc.).
A “Workflow” has one or more “Jobs”, and each job runs steps that use an “Action”: a reusable step offered officially, by a third party, or written by yourself.
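6 How does it work?
Actors: Developers, GitHub, GitHub Queue, Runners (which run the workflow)
① Runners listen to a queue (HTTP long polling)
② Developers perform an operation on GitHub
③ GitHub dispatches the event to the GitHub queue
④ A runner gets the event and runs the workflow
⑤ The runner sends the results and updates the commit status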
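The five steps above can be sketched as a tiny in-process simulation. This is only an illustration under loud assumptions: a `queue.Queue` stands in for the GitHub queue, a blocking `get` with a timeout stands in for HTTP long polling, and the event shape (`sha`, `run`) plus the names `runner` and `simulate` are hypothetical, not the actual actions/runner protocol.

```python
import queue
import threading


def runner(job_queue, results, poll_timeout=0.1):
    """Listen on the queue, run each job, and report a commit status."""
    while True:
        try:
            # Step 1: wait for an event; on timeout, simply poll again
            # (a stand-in for HTTP long polling).
            event = job_queue.get(timeout=poll_timeout)
        except queue.Empty:
            continue
        if event is None:  # sentinel used by this sketch to stop the runner
            break
        # Step 4: the runner got the event and runs the workflow.
        try:
            event["run"]()
            status = "success"
        except Exception:
            status = "failure"
        # Step 5: send the result back as a commit status.
        results.append({"sha": event["sha"], "status": status})


def simulate(events):
    """Steps 2 and 3: 'GitHub' dispatches each event to the queue."""
    job_queue = queue.Queue()
    results = []
    worker = threading.Thread(target=runner, args=(job_queue, results))
    worker.start()
    for event in events:
        job_queue.put(event)
    job_queue.put(None)
    worker.join()
    return results
```

Calling `simulate` with one passing and one failing `run` callable yields one status entry per event, in dispatch order.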
9 Types of Runners
1. Cloud-Hosted (GitHub-Managed) Runners
Users don’t need to prepare anything (except money 💰).
The flow is unchanged: developers operate on GitHub, GitHub dispatches the event to the GitHub queue, and runners listening on the queue (HTTP long polling) get the event, run the workflow, and report the results and commit status.
The runners are managed by GitHub.
10 Major Limitations of Cloud-Hosted Runners
OS
● Windows Server
  ○ 2022
  ○ 2019
● macOS
  ○ Big Sur 11
  ○ Catalina
● Ubuntu
  ○ 20.04
  ○ 18.04
Installed software is limited.
Network Egress
IP addresses are not static at the GitHub organization level and are shared with other organizations, so a strict firewall can’t be configured. A GitHub organization can have IP restrictions, in which case the runners can’t access GitHub.
Ref: Access the GitHub Meta API for the IP addresses (there are many 😄)
$ curl https://api.github.com/meta
Hardware
Only one machine type is offered:
● 2 CPU cores
● 7GB RAM
● 14GB SSD
Sometimes that is too much or too little, so it is not cost-efficient. For example, big Go projects want CPU-specific machines while Next.js projects want memory-specific machines.
References
● About GitHub-hosted runners, https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners
● GitHub Actions Virtual Environments, https://github.com/actions/virtual-environments
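To make the network egress point concrete, here is a minimal sketch of checking whether an address falls inside the ranges listed under the `actions` key of the Meta API response. The sample data uses reserved documentation ranges, not real GitHub ranges, and the helper names `actions_networks` and `is_hosted_runner_ip` are made up for this example.

```python
import ipaddress

# Hypothetical excerpt shaped like the Meta API response. The CIDR blocks
# are documentation-only ranges, NOT the ranges GitHub actually publishes.
SAMPLE_META = {
    "actions": ["192.0.2.0/24", "198.51.100.0/25", "2001:db8::/32"],
}


def actions_networks(meta):
    """Parse the CIDR strings under the "actions" key into network objects."""
    return [ipaddress.ip_network(cidr) for cidr in meta.get("actions", [])]


def is_hosted_runner_ip(address, networks):
    """Return True if the address belongs to any of the given networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)


networks = actions_networks(SAMPLE_META)
print(is_hosted_runner_ip("192.0.2.10", networks))
print(is_hosted_runner_ip("203.0.113.5", networks))
```

Because the real list is long and shared by every GitHub user, allowing all of it through a firewall says little about which organization a request comes from.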
11 Types of Runners
2. Self-Hosted Runners (actions/runner)
GitHub only dispatches events to the GitHub queue. The runners are managed by users.
① The runner registers itself and listens to the queue
② Developers perform an operation on GitHub
③ GitHub dispatches the event to the GitHub queue
④ The runner gets the event and runs the workflow
⑤ The runner sends the results and updates the commit status
13 Major Advantages of Self-Hosted Runners
OS
Any 😄
Network Egress
Egress can be sent through NAT, VPN, etc., depending on the user 😄 (the IP can be made static)
Hardware
Any 😄
14 Major Disadvantages of Self-Hosted Runners
Scalability
Hard to manage efficiently.
Provisioning (Platform)
Two big problems:
● Need to register the runner to each organization, repository, etc.
● Need to manage the runner for each organization, repository, etc.
Provisioning (Machines)
Hard to provision the runners. Sometimes we want to offer many machine types:
● Standard
● CPU-specific
● Memory-specific
● “Big” machines like other cloud providers’ products (GCP Cloud Build, AWS EC2, etc.)
Actual metrics
15 How to manage it?
Disclaimer: The following slides include things that have already been implemented or are planned to be implemented. As such, they may change later or may not be implemented yet.
16 Provisioning Runner’s Machines
Offer repository-level runners (currently organization-level) to isolate the runners by microservice.
We run a dedicated GKE cluster with actions-runner-controller/actions-runner-controller, a Kubernetes controller that manages Self-Hosted Runners.
The runners created by this controller’s CRD RunnerDeployment are backed by StatefulSets.
On start, each runner accesses the GitHub API and registers itself to the repository.
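As a rough illustration of the registration step, the sketch below builds the GitHub REST API URL for requesting a runner registration token (at repository or organization level) and the matching `config.sh` invocation of actions/runner. It makes no network calls. The helper names and the `acme`/`shop` values are hypothetical, and this is not a description of what the controller does internally.

```python
API_ROOT = "https://api.github.com"


def registration_token_url(owner, repo=None):
    """URL to POST to in order to obtain a runner registration token."""
    if repo is None:
        # Organization-level runner.
        return f"{API_ROOT}/orgs/{owner}/actions/runners/registration-token"
    # Repository-level runner.
    return f"{API_ROOT}/repos/{owner}/{repo}/actions/runners/registration-token"


def config_command(owner, repo, token, labels):
    """Arguments for the runner's config.sh script (run non-interactively)."""
    return [
        "./config.sh",
        "--url", f"https://github.com/{owner}/{repo}",
        "--token", token,
        "--labels", ",".join(labels),
        "--unattended",
    ]


print(registration_token_url("acme", "shop"))
print(config_command("acme", "shop", "<TOKEN>", ["self-hosted", "tiny"]))
```

Moving from organization-level to repository-level runners changes only which of the two endpoints is called and which URL is passed to `--url`.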
18 Provisioning Machine Variations
Cloud-Hosted Runners provide only one machine type (2 CPU cores, 7GB RAM, 14GB SSD), while Self-Hosted Runners can be any machine type, selected by adding labels.
● 1 CPU core, 100GB RAM (standard memory) 👉 Big Next.js app
● 20 CPU cores, 20GB RAM (high) 👉 Docker builds
● 0.1 CPU core, 1GB RAM (tiny) 👉 Closing stale PRs
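Label-based selection can be sketched as a subset check: a job can run on a runner only if the runner carries every label the job asks for. The runner names and label sets below are illustrative assumptions loosely based on the machine types above, and `pick_runner` is a hypothetical helper, not GitHub's scheduler.

```python
# Hypothetical runner pool: runner name -> labels attached to that runner.
RUNNERS = {
    "runner-memory": {"self-hosted", "memory"},
    "runner-high": {"self-hosted", "high"},
    "runner-tiny": {"self-hosted", "tiny"},
}


def pick_runner(runs_on, runners):
    """Return the first runner whose labels include every requested label."""
    wanted = set(runs_on)
    for name, labels in runners.items():
        if wanted <= labels:  # all requested labels are present
            return name
    return None  # no runner matches, so the job would stay queued


print(pick_runner(["self-hosted", "memory"], RUNNERS))
print(pick_runner(["self-hosted", "gpu"], RUNNERS))
```

In a workflow file, the requested labels correspond to the job's `runs-on` list.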
21 Scaling Runners
actions-runner-controller/actions-runner-controller also offers a webhook server and a CRD, HorizontalRunnerAutoscaler (HRA), which receive webhook events and scale the runners (similar to the Kubernetes HorizontalPodAutoscaler (HPA)).
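22 Scaling Runners
Actors: Developers, GitHub, Webhook Server, GitHub Queue, Runners
① Developers perform an operation on GitHub
② GitHub sends a webhook (workflow_job) to the webhook server and dispatches the event to the GitHub queue
③ The webhook server creates a runner
④ The runner registers itself and listens to the queue
⑤ The runner gets the event from the GitHub queue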
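The webhook-driven flow above can be sketched as a small counter that reacts to the `action` field of `workflow_job` events. This is a deliberately simplified model with hypothetical names (`RunnerPool`, `handle_workflow_job`) and made-up replica bounds; the real HorizontalRunnerAutoscaler logic is more involved.

```python
class RunnerPool:
    """Tracks the desired runner count between a minimum and a maximum."""

    def __init__(self, min_replicas=0, max_replicas=3):
        self.min_replicas = min_replicas
        self.max_replicas = max_replicas
        self.replicas = min_replicas

    def handle_workflow_job(self, payload):
        """Adjust the desired replica count from one webhook payload."""
        action = payload.get("action")
        if action == "queued":
            # A job is waiting: ask for one more runner, up to the cap.
            self.replicas = min(self.replicas + 1, self.max_replicas)
        elif action == "completed":
            # A job finished: release one runner, down to the floor.
            self.replicas = max(self.replicas - 1, self.min_replicas)
        # Other actions (e.g. "in_progress") leave the count unchanged.
        return self.replicas


pool = RunnerPool(min_replicas=0, max_replicas=2)
for action in ["queued", "queued", "queued", "in_progress", "completed"]:
    print(action, pool.handle_workflow_job({"action": action}))
```

Scaling on events rather than polling lets the pool sit at its minimum while no jobs are queued.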
23 Network Egress
For security reasons, some GitHub organizations restrict access by IP address.
Cloud-Hosted Runners use a wide IP range that is scoped to GitHub as a whole, not to the organization.
Like normal “microservices”, Self-Hosted Runners get a static public egress IP, made possible by Cloud NAT + Cloud Router.
25 02 Serve as a Platform (Part 2)
Next Time 🌕 🚀
Some topics
・How to manage the tenants on Kubernetes?
・How to automatically register the repository runners?
・Integration with Starter Kits?