Serving GitHub Actions Self-Hosted Runner as a Platform (Part 1: Introduction)

Keke
November 05, 2021

Merpay SRE Tech Talk 2021/11/05


Transcript

  1. 1
    Serving GitHub Actions Self-Hosted Runner
    as a Platform
    Merpay・Mercoin SRE
    @keke
    Merpay SRE Tech Talk 2021/11/05

  2. 2
    Agenda
    01 ● Introduction to GitHub Actions Self-Hosted Runner
    (Part 1)
    02 ● Serve as a Platform (Part 2)
    Today is
    part 1 😄

  3. 3
    01 Introduction to GitHub Actions Self-Hosted Runner
    (Part 1)

  4. 4
    What’s GitHub Actions?
    GitHub’s native continuous integration (CI) tool, triggered by
    GitHub events (e.g., pull request creation, a comment on an issue).
    A “Workflow” has one or more “Jobs”, and each job runs steps. A step can
    run an “Action”, which is offered officially, by third parties, or by
    yourself.
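
    The Workflow / Job / step / Action hierarchy can be sketched as plain data.
    This is a minimal Python model, not GitHub's workflow schema: the class
    names and the sample workflow (its name, job name and test command) are
    illustrative assumptions. `actions/checkout@v2` is a real official Action.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """One step of a job: it either runs a shell command or uses an Action."""
    name: str
    uses: Optional[str] = None  # reference to an Action, e.g. "actions/checkout@v2"
    run: Optional[str] = None   # inline shell command


@dataclass
class Job:
    runs_on: str                # which kind of runner executes the job
    steps: List[Step] = field(default_factory=list)


@dataclass
class Workflow:
    name: str
    on: List[str]               # GitHub events that trigger the workflow
    jobs: Dict[str, Job] = field(default_factory=dict)

    def actions_used(self) -> List[str]:
        """Collect every Action referenced by any step, in declaration order."""
        return [s.uses for j in self.jobs.values() for s in j.steps if s.uses]


# Hypothetical workflow: triggered on pull requests, one job, two steps.
workflow = Workflow(
    name="ci",
    on=["pull_request"],
    jobs={
        "test": Job(
            runs_on="ubuntu-20.04",
            steps=[
                Step(name="Check out code", uses="actions/checkout@v2"),
                Step(name="Run tests", run="go test ./..."),
            ],
        )
    },
)

if __name__ == "__main__":
    print(workflow.actions_used())
```

    Only the first step references an Action; the second one runs an inline
    command, which is why `actions_used()` skips it.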

  5. 5
    Workflow

  6. 6
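    How does it work?
    Actors: Developers, GitHub (with a queue), Runners. GitHub runs the
    Workflow on Runners.
    ① Runners listen to a queue (HTTP long polling)
    ② Developers perform an operation on GitHub
    ③ GitHub dispatches the event to the queue
    ④ A Runner gets the event
    ⑤ The Runner sends results and updates the commit status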
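
    The listen / dispatch / get-event cycle can be imitated in-process. The
    sketch below is a simulation under stated assumptions, not the
    actions/runner protocol: a `queue.Queue` stands in for the GitHub-side
    queue, a blocking `get` with a timeout stands in for HTTP long polling, and
    all function names are hypothetical.

```python
import queue
import threading
from typing import List, Optional


def long_poll(job_queue: "queue.Queue[str]", timeout_seconds: float) -> Optional[str]:
    """Block until a job arrives or the poll times out (listen, then get event)."""
    try:
        return job_queue.get(timeout=timeout_seconds)
    except queue.Empty:
        return None  # nothing was queued; the caller simply polls again


def runner_loop(job_queue: "queue.Queue[str]", results: List[str], max_jobs: int) -> None:
    """Poll repeatedly, 'run' each job, and record its result."""
    handled = 0
    while handled < max_jobs:
        job = long_poll(job_queue, timeout_seconds=0.5)
        if job is None:
            continue
        results.append(f"{job}: success")  # stands in for the commit status update
        handled += 1


def simulate(jobs: List[str]) -> List[str]:
    """Start one runner, dispatch the jobs to the queue, and collect the results."""
    job_queue: "queue.Queue[str]" = queue.Queue()
    results: List[str] = []
    runner = threading.Thread(target=runner_loop, args=(job_queue, results, len(jobs)))
    runner.start()              # the runner is already listening...
    for job in jobs:
        job_queue.put(job)      # ...when the events are dispatched
    runner.join(timeout=5)
    return results


if __name__ == "__main__":
    print(simulate(["build", "test"]))
```

    The point of the design is that the runner only makes outbound requests: it
    pulls work from the queue instead of GitHub pushing work to it.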

  7. 7
    Types of Runners
    ● Cloud Hosted
    ● Self-Hosted

  8. 8
    Specify Runner types
    👈

  9. 9
    Types of Runners
    1. Cloud Hosted (GitHub Managed) Runners
    Users don’t need to prepare anything (except money 💰).
    The Runners are managed by GitHub. GitHub runs the Workflow on Runners:
    ① Runners listen to a queue (HTTP long polling)
    ② Developers perform an operation on GitHub
    ③ GitHub dispatches the event to the queue
    ④ A Runner gets the event
    ⑤ The Runner sends results and updates the commit status

  10. 10
    Major Limitations of Cloud Hosted Runners

    OS
    ● Windows Server
    ○ 2022
    ○ 2019
    ● macOS
    ○ Big Sur 11
    ○ Catalina
    ● Ubuntu
    ○ 20.04
    ○ 18.04
    Installed software is limited.

    Network Egress
    IP addresses are not static at the GitHub organization level and are
    shared with other organizations, so you can’t configure a precise
    firewall. If a GitHub organization enforces IP restrictions, the
    runners can’t access GitHub.
    Ref: Access the GitHub Meta API for the IP addresses (there are many 😄)
    $ curl https://api.github.com/meta

    Hardware
    Only one machine type is offered:
    ● 2 CPU cores
    ● 7GB RAM
    ● 14GB SSD
    Sometimes that is too much or too little, which is not cost efficient.
    For example, big Go projects want CPU-specific machines while
    Next.js projects want memory-specific machines.

    ● About GitHub-hosted runners, https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners
    ● GitHub Actions Virtual Environments, https://github.com/actions/virtual-environments
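
    To get a feel for "there are many", the Meta API response can be
    summarized. The sketch below assumes the response is a JSON object with an
    `actions` key holding CIDR strings. The sample payload is hypothetical and
    uses documentation-only addresses, not real GitHub ranges, and the function
    name is illustrative.

```python
import ipaddress
from typing import Any, Dict


def summarize_actions_ranges(meta: Dict[str, Any]) -> Dict[str, int]:
    """Count the CIDR blocks and the IPv4 addresses listed under "actions"."""
    networks = [ipaddress.ip_network(cidr) for cidr in meta.get("actions", [])]
    ipv4 = [n for n in networks if n.version == 4]
    return {
        "cidr_blocks": len(networks),
        "ipv4_addresses": sum(n.num_addresses for n in ipv4),
    }


# Hypothetical excerpt shaped like the output of `curl https://api.github.com/meta`.
# 192.0.2.0/24, 198.51.100.0/25 and 2001:db8::/32 are reserved for documentation.
sample_meta = {
    "actions": ["192.0.2.0/24", "198.51.100.0/25", "2001:db8::/32"],
}

if __name__ == "__main__":
    print(summarize_actions_ranges(sample_meta))
```

    Run against the real response, the same function would show how large the
    shared range is, which is why allowing it in a firewall says little about
    which organization a request comes from.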

  11. 11
    Types of Runners
    2. Self-Hosted Runners (actions/runner)
    GitHub only dispatches events to the GitHub queue. The Runners are
    managed by users. GitHub runs the Workflow on Runners:
    ① Runners register themselves and listen to a queue
    ② Developers perform an operation on GitHub
    ③ GitHub dispatches the event to the queue
    ④ A Runner gets the event
    ⑤ The Runner sends results and updates the commit status

  12. 12
    3 Types of Self-Hosted Runners
    ● Repository Level
    ● Organization Level
    ● Enterprise Level
    (Diagram: example repositories Repo A, Repo B, Repo C, Repo W, Repo Z and
    organizations Org A, Org P illustrating the scope of each level)

  13. 13
    Major advantages of Self-Hosted Runners

    OS
    Any 😄

    Network Egress
    Depends on the user 😄
    Send egress through NAT, VPN, etc. (the IP can be made static)

    Hardware
    Any 😄

  14. 14
    Major disadvantages of Self-Hosted Runners

    Provisioning (Platform)
    Two big problems:
    ● Need to register the Runner to an organization, repository, etc.
    ● Need to manage the Runner for each organization, repository, etc.

    Provisioning (Machines)
    Hard to provision the runners. Sometimes we want to offer many
    machine types:
    ● Standard
    ● CPU specific
    ● Memory specific
    ● “Big” machines
    like other cloud provider products (GCP Cloud Build, AWS EC2, etc.)

    Scalability
    Hard to manage efficiently.
    Actual metrics


  15. 15
    How to manage it?
    Disclaimer:
    Following slides include things that have already been implemented or are planned to be
    implemented. As such, they may be changed later or not implemented yet.

  16. 16
    Provisioning Runner’s Machines
    Offer repository-level runners (currently organization-level) to
    isolate the runners by microservice.
    Run a dedicated GKE cluster with
    actions-runner-controller/actions-runner-controller, a Kubernetes
    controller that manages Self-Hosted Runners.
    The runners created by this controller’s RunnerDeployment CRD are
    backed by StatefulSets.
    On start, each runner accesses the GitHub API and registers itself
    to the repository.
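
    Registration starts from a short-lived registration token. The sketch below
    builds, without sending, the GitHub REST request that creates such a token.
    How the controller performs this step internally is not shown in the
    slides, so treat this as an illustration of the public endpoint only. The
    function name, the example repository and the `<TOKEN>` placeholder are
    assumptions.

```python
import urllib.request

API_ROOT = "https://api.github.com"


def registration_token_request(scope_path: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the request that creates a runner registration token.

    scope_path selects the runner level, e.g. "repos/OWNER/REPO" or "orgs/ORG".
    """
    url = f"{API_ROOT}/{scope_path}/actions/runners/registration-token"
    return urllib.request.Request(
        url,
        method="POST",
        headers={
            "Accept": "application/vnd.github.v3+json",
            "Authorization": f"token {api_token}",
        },
    )


# Hypothetical repository-level example; nothing is sent over the network here.
request = registration_token_request("repos/example-org/example-repo", "<TOKEN>")

if __name__ == "__main__":
    print(request.get_method(), request.full_url)
```

    Sending the request with a credential that may administer the repository
    would return a token that the runner passes to its configuration step.
    Switching `scope_path` to `orgs/ORG` targets organization-level runners.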

  17. 17
    Kubernetes Manifest to provision runners

  18. 18
    Provisioning Machines variations
    Cloud Hosted Runners provide only one machine type (2 CPU
    cores, 7GB RAM, 14GB SSD), while Self-Hosted Runners can be
    any machine type, selected by adding labels.
    Labels: standard, memory, high, tiny
    ● memory: 1 CPU core, 100GB RAM 👉 Big Next.js App
    ● high: 20 CPU cores, 20GB RAM 👉 Docker builds
    ● tiny: 0.1 CPU core, 1GB RAM 👉 Closing stale PRs
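
    One way to express these variations is to generate one RunnerDeployment
    manifest per label. The sketch below is an assumption-laden illustration,
    not the manifest from the talk: it pairs the labels memory, high and tiny
    with the three specs in slide order, writes GB as Gi, and uses a
    hypothetical repository name and `runner-<label>` naming scheme. The
    `actions.summerwind.dev/v1alpha1` API group and the `repository`, `labels`
    and `resources` fields come from actions-runner-controller.

```python
import json
from typing import Any, Dict

# CPU and memory per label, read from the slide (pairing assumed from slide order).
MACHINE_TYPES: Dict[str, Dict[str, str]] = {
    "memory": {"cpu": "1", "memory": "100Gi"},
    "high": {"cpu": "20", "memory": "20Gi"},
    "tiny": {"cpu": "100m", "memory": "1Gi"},
}


def runner_deployment(repository: str, label: str, replicas: int = 1) -> Dict[str, Any]:
    """Return a RunnerDeployment manifest as a dict for the given machine label."""
    resources = MACHINE_TYPES[label]
    return {
        "apiVersion": "actions.summerwind.dev/v1alpha1",
        "kind": "RunnerDeployment",
        "metadata": {"name": f"runner-{label}"},
        "spec": {
            "replicas": replicas,
            "template": {
                "spec": {
                    "repository": repository,
                    "labels": [label],  # jobs select this machine type via runs-on
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(runner_deployment("example-org/example-repo", "tiny"), indent=2))
```

    Setting requests equal to limits is a design choice made here so each
    runner gets a predictable amount of CPU and memory. The talk does not say
    which policy is actually used.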

  19. 19
    Runners with labels
    👈

  20. 20
    Specify Runner by labels
    Labels
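
    Label-based selection boils down to a subset check: a runner is eligible
    for a job only if it carries every label listed in the job's `runs-on`.
    The sketch below illustrates that rule. The runner names and label sets are
    hypothetical, and GitHub's actual scheduler is not shown in the slides.

```python
from typing import Dict, Iterable, List, Set


def can_run(runs_on: Iterable[str], runner_labels: Iterable[str]) -> bool:
    """A runner is eligible only if it carries every label the job asks for."""
    return set(runs_on) <= set(runner_labels)


def eligible_runners(runs_on: Iterable[str], runners: Dict[str, Set[str]]) -> List[str]:
    """Return the names of all eligible runners, sorted for stable output."""
    wanted = list(runs_on)
    return sorted(name for name, labels in runners.items() if can_run(wanted, labels))


# Hypothetical fleet: one runner per machine type from the previous slides.
RUNNERS: Dict[str, Set[str]] = {
    "runner-memory": {"self-hosted", "linux", "memory"},
    "runner-high": {"self-hosted", "linux", "high"},
    "runner-tiny": {"self-hosted", "linux", "tiny"},
}

if __name__ == "__main__":
    print(eligible_runners(["self-hosted", "high"], RUNNERS))
```

    A job that asks only for `self-hosted` matches every runner in the fleet,
    so adding a machine-type label is what pins it to one kind of machine.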

  21. 21
    Scaling Runners
    actions-runner-controller/actions-runner-controller also offers a
    webhook server and a HorizontalRunnerAutoscaler (HRA) CRD
    that receive webhook events and scale the runners (similar to the
    Kubernetes HorizontalPodAutoscaler (HPA)).
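
    The idea of webhook-driven scaling can be sketched as a small counter. This
    is a deliberate simplification and not the controller's actual algorithm:
    it tracks outstanding jobs from the `action` field of `workflow_job` events
    (`queued`, `in_progress`, `completed`) and clamps the result to a
    minimum and maximum. The function and parameter names are hypothetical.

```python
from typing import Iterable


def desired_runners(actions: Iterable[str], min_replicas: int, max_replicas: int) -> int:
    """Derive a runner count from a stream of workflow_job actions."""
    outstanding = 0
    for action in actions:
        if action == "queued":
            outstanding += 1                         # a new job needs a runner
        elif action == "completed":
            outstanding = max(0, outstanding - 1)    # a runner is free again
        # "in_progress" does not change how many runners are needed
    return max(min_replicas, min(max_replicas, outstanding))


if __name__ == "__main__":
    events = ["queued", "queued", "in_progress", "completed", "queued"]
    print(desired_runners(events, min_replicas=0, max_replicas=5))
```

    Reacting to events rather than polling metrics lets the runner count follow
    the job queue closely, at the cost of depending on webhook delivery.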

  22. 22
    Scaling Runners
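    Actors: Developers, GitHub (with a queue), Webhook Server, Runners.
    ① Developers perform an operation on GitHub
    ② GitHub sends a webhook (workflow_job) to the Webhook Server
    ② GitHub dispatches the event to the queue
    ③ The Webhook Server creates a Runner
    ④ The Runner registers and listens to the queue
    ⑤ The Runner gets the event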

  23. 23
    Network Egress
    For security reasons, some GitHub
    Organizations have IP restrictions
    on the GitHub Organization.
    Cloud Hosted Runners have a wide IP
    range that is scoped to GitHub as a whole,
    not to the organization.
    Like normal “microservices”,
    egress of Self-Hosted Runners gets a
    static public IP, made possible by
    Cloud NAT + Cloud Router.
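
    With a static egress IP, the organization's allow list can contain a single
    entry. The sketch below shows the check such an allow list performs. The
    NAT address and the "shared" address are hypothetical values from
    documentation-only ranges, and the function name is illustrative.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, allow_list: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any CIDR block of the allow list."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allow_list)


NAT_IP = "203.0.113.10"          # hypothetical static address reserved for Cloud NAT
ALLOW_LIST = [f"{NAT_IP}/32"]    # the organization allows exactly that address

if __name__ == "__main__":
    print(is_allowed(NAT_IP, ALLOW_LIST))          # runner behind the NAT
    print(is_allowed("192.0.2.55", ALLOW_LIST))    # some other, shared address
```

    Every runner pod behind the NAT leaves with the same source address, so the
    allow list does not need to change when runners scale up or down.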

  24. 24
    ✅ Provisioning Runners variations
    ✅ Scalability of Runners
    ✅ GitHub IP restrictions

  25. 25
    02 Serve as a Platform (Part 2)
    Next Time 🌕 🚀
    Some topics
    ・How to manage the tenants on Kubernetes?
    ・How to automatically register the repository runners?
    ・Integration with Starter Kits?

  26. 26
    Thank you!
