
Idobata on GKE - Moving an Ordinary Rails App

A talk about how moving to GKE from a PaaS built on AWS EC2 went.
ESM Real Lounge: https://blog.agile.esm.co.jp/entry/2019/08/14/173316
ESM Agile Div: https://agile.esm.co.jp/en/

Hibariya Hi

August 21, 2019

Transcript

  1. Idobata on GKE
    Moving an Ordinary Rails App
    2019/08/21 ESM Real Lounge

  2. Hi there
    ● Gentaro Terada (@hibariya)
    ● Works at ESM, Inc.
    ● A college student (engineering)
    ● Likes Ruby, Internet, and Programming
    ● https://hibariya.org

  3. Today we will talk about:
    How did moving a Rails app from EC2 to GKE go?
    ● Benefits
    ● Challenges

  4. Motivations / Why Kubernetes
    Moving from a PaaS built on AWS EC2...
    ● to terminate an environment that is reaching its EOL
    ● to scale quickly
    ● to take advantage of containers' portability
    ● to reduce running costs
    ● to make the staging env more similar to the production one
    ● to make the development env more similar to the production one

  5. Why GKE
    ● Well managed (monitoring, networking, logging, etc.)
    ● Plenty of information sources available
    ● Relatively newer K8s releases

  6. IaC: Terraform + Kubernetes
    Terraform manages the cloud resources that are not
    taken care of by K8s.

  7. The components of a Rails app (Idobata)
    ● Web (RoR)
    ● SSE Server (Go)
    ● Background Job (Que)
    ● PostgreSQL (Cloud SQL)
    ● Redis (Cloud Memorystore)
    ● Memcached

  8. The components of a Rails app (Idobata)
    [Architecture diagram: HTTP traffic enters through Cloud Load Balancing (L4)
    and reaches the NGINX Ingress Controller in the GKE cluster, which routes it
    to the Rails web pods and the Eventd SSE server; Que background job workers
    and Memcached run in the same cluster. PostgreSQL (Cloud SQL) and Redis
    (Cloud Memorystore) are connected via VPC network peering, and AWS CloudFront
    serves as the CDN.]

  9. The things we have done this time
    ☑ Terminate an environment that is reaching its EOL
    ☑ Scale quickly
    ☑ Take advantage of containers' portability
    ☑ Reduce running costs
    ☑ Make the staging env more similar to the production one
    ☐ Make the development env more similar to the production one (WIP)

  10. Scale Quickly

  11. HPA + Cluster Autoscaler
    Let's say we get a sudden traffic increase...
    1. The CPU usage of a service becomes high
    2. Then HPA increases the number of pods
    3. If the CPU usage does not settle down, HPA keeps
    increasing the number of pods and eventually exhausts
    the cluster's CPU/memory resources
    4. Then the cluster autoscaler increases the number of nodes

  12. HPA + Cluster Autoscaler
    All you have to do is declare it concisely. The cluster
    adds/removes nodes automatically to satisfy the pods'
    resource requests.
    (Slide shows two screenshots: the Horizontal Pod Autoscaler settings in K8s
    and the cluster autoscaling settings in Terraform.)
    https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
    https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
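
    As a rough sketch of what "concise" means here, an HPA for the Rails web
    Deployment could look like the following (the Deployment name and the
    thresholds are hypothetical, not taken from Idobata's actual manifests).
    On the Terraform side, this pairs with node-pool autoscaling settings
    (minimum/maximum node counts).

    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      name: web                          # hypothetical name
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: web                        # the Rails web Deployment
      minReplicas: 2                     # illustrative bounds
      maxReplicas: 10
      targetCPUUtilizationPercentage: 70 # scale out when average CPU exceeds 70%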

  13. Take Advantage of Containers' Portability

  14. Take advantage of containers' portability
    For example, when you are going to upgrade Ruby...
    ● you do not have to update Chef cookbooks
    ● you do not always have to stop the whole service
    ● you can run tests in a container that is very similar to
    production (or uses the same base image)

  15. Make Staging Env More Similar to
    Production One

  16. Stopped using Heroku as "staging"
    ● It was far from similar to our production env
    ○ Running on totally different infrastructure
    ○ Using different gems
    ● This time, a very similar and much cheaper environment
    could be prepared with preemptible machines
    ○ Of course, it is not free; there is a trade-off
    (Screenshot: the app uses different gems on Heroku.)

  17. Challenges

  18. Keep the Existing NGINX Rate-Limiting Spec

  19. Options
    ● Use NGINX Ingress
    ● Use an API gateway such as Ambassador
    ○ did not try this time
    ● Proxy traffic between GLBC and Rails with NGINX
    ○ not sure whether this is the right approach
    ○ did not try this time

  20. Using NGINX Ingress on GKE: Pros
    ● You can use NGINX features to control the behavior,
    e.g. rate limiting (see the sketch below)
    ● Applying configuration changes seems faster
    ● The development environment becomes more similar to
    production
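
    For example, rate limiting can be expressed with nginx-ingress annotations.
    A minimal sketch, assuming a hypothetical host and Service name and
    illustrative limits (not Idobata's actual settings):

    apiVersion: networking.k8s.io/v1beta1   # extensions/v1beta1 on older clusters
    kind: Ingress
    metadata:
      name: idobata
      annotations:
        kubernetes.io/ingress.class: nginx
        # allowed requests per second per client IP, plus a burst allowance
        nginx.ingress.kubernetes.io/limit-rps: "10"
        nginx.ingress.kubernetes.io/limit-burst-multiplier: "5"
    spec:
      rules:
        - host: idobata.example.com         # hypothetical host
          http:
            paths:
              - path: /
                backend:
                  serviceName: web          # hypothetical Service name
                  servicePort: 80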

  21. Using NGINX Ingress on GKE: Cons
    ● Cannot use Cloud CDN
    ○ Cloud CDN requires the L7 LB
    ○ NGINX Ingress uses an L4 LB
    ○ -> We made CloudFront handle traffic for /assets/* this time
    ● Cannot use Google-managed DV certificate renewal
    ○ Same reason as above
    ○ -> We set up cert-manager ourselves instead (see the sketch below)
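
    With cert-manager, renewal is described declaratively. A rough sketch,
    assuming a hypothetical host and an existing ClusterIssuer named
    letsencrypt (names and API version may differ from the actual setup):

    apiVersion: cert-manager.io/v1alpha2    # cert-manager API version circa 2019
    kind: Certificate
    metadata:
      name: idobata-tls
    spec:
      secretName: idobata-tls               # the Ingress references this Secret for TLS
      dnsNames:
        - idobata.example.com               # hypothetical host
      issuerRef:
        name: letsencrypt                   # hypothetical ClusterIssuer
        kind: ClusterIssuer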

  22. [Same architecture diagram as slide 8, shown again to point out the
    L4 Cloud Load Balancing in front of the NGINX Ingress Controller and
    AWS CloudFront acting as the CDN for HTTP traffic.]

  23. Managing Pod Memory Resource

  24. OOMKilled with 9 (SIGKILL)
    If a container exceeds its memory limit, it will be killed with
    SIGKILL immediately. HTTP requests might be aborted
    (broken pipe) due to this behavior.
    https://github.com/kubernetes/kubernetes/issues/40157
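
    For context, the limit in question is the per-container memory limit
    (values here are illustrative, not Idobata's):

    resources:
      requests:
        memory: "512Mi"
      limits:
        memory: "512Mi"   # exceeding this gets the container SIGKILLed by the OOM killer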

  25. How to terminate a container gradually?
    We disabled the pod OOM killer and created our own liveness
    probe in Go that marks a container as "unhealthy" when it
    exceeds its memory limit.
    (Illustration: https://github.com/golang-samples/gopher-vector)
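
    A sketch of how such a probe can be wired into a Deployment. The probe
    command and threshold are hypothetical (the real probe from the talk is a
    custom Go program), and dropping the hard memory limit is just one way to
    avoid the immediate SIGKILL:

    containers:
      - name: web
        image: idobata/web                 # hypothetical image
        resources:
          requests:
            memory: "512Mi"
          # no hard memory limit here, so the kernel OOM killer does not
          # SIGKILL the process the moment usage crosses the threshold
        livenessProbe:
          exec:
            # hypothetical helper: exits non-zero once RSS exceeds a soft limit,
            # so kubelet restarts the container gracefully instead
            command: ["/memcheck", "--limit=512Mi"]
          periodSeconds: 30
          failureThreshold: 3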

  26. Moving from Unicorn to Puma

  27. Motivation
    ● A single-process model seems to match Docker well
    ● Puma has been the default HTTP server since Rails 5.0
    ● Hopefully it reduces our app's request queueing time

  28. A problem: memory usage grew rapidly
    It reached roughly 1.5-2.0x the usage under Unicorn.
    This led us to try the following:
    ● Reduce the number of malloc arenas by setting
    MALLOC_ARENA_MAX
    ○ Worked well for our app
    ○ Not free, though: it is a space-time trade-off (AFAIK)
    ● Switch the memory allocator, e.g. to jemalloc
    ○ Worked like a charm; adopted (see the sketch below)
    ○ Memory growth seemed relatively slower
    ○ We chose jemalloc 3.6 this time
    https://www.speedshop.co/2017/12/04/malloc-doubles-ruby-memory.html
    https://bugs.ruby-lang.org/issues/14718
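
    In Deployment terms, both knobs are just environment variables on the
    Rails containers. A sketch, assuming jemalloc is installed in the image
    (paths and values are illustrative, not Idobata's actual settings):

    # Option 1: keep glibc malloc but cap the number of arenas
    env:
      - name: MALLOC_ARENA_MAX
        value: "2"

    # Option 2 (the one adopted): preload jemalloc 3.6 installed in the image
    env:
      - name: LD_PRELOAD
        value: /usr/lib/x86_64-linux-gnu/libjemalloc.so.1   # illustrative path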

  29. end

  30. Conclusion
    K8s on GKE made infrastructure management easier and
    made our application more robust. Although there were
    several challenges to struggle with, well-managed K8s
    seems like a good choice for ordinary Rails apps today.
