https://sorah.jp/ | GitHub @sorah | Twitter @sora_h Site Reliability Engineer at Cookpad Global Rubyist, Ruby committer Operating AS59128 Interests: Site Reliability, Networking, Distributed systems
Cookpad? • The world’s largest recipe sharing platform • 69 countries, 22 regions • Rapidly growing in the world • Code is different, unshared between Global and JP • But the domain name is shared!
Fighting with Latency • We chose Route53 Latency Based Routing at first • DNS returns IP addresses of the region closest to the resolver • If a requested service lives in another region, reverse-proxy to that region
Fighting with Latency • This is the minimum • 2 regions are not enough • We have to serve the Middle East, Europe, UK … • They’re too far away from both US East and Japan
Past CDN usage in Cookpad • User provided Images • Dynamic resizing provided by an internal service • Cache resized images • Static assets (CSS, JavaScript, Images) • No Dynamic Content
No Dynamic Content • No dynamic content • (Contents served via CDN are guaranteed static by their URL) • As an old-school CDN user: • Slow deployments, slow purging… • It’s too risky when we fail at something
The first use case • Images and Assets for Global Platform • Achieved higher hit rate, better performance than our previous CDN provider • Shielding <3 • Few and fast POPs
Fastly in Cookpad • And https://cookpad.com/ (except JP) • Existing traffic in JP is high, but most of it doesn’t need a CDN • We also have many legacy clients, so we judged it too risky given the TLSv1.0/1.1 and 3DES issues.
Managing Fastly services • In Cookpad, we codify (almost) everything • e.g. AWS Route 53, EC2 Security Groups, ELB, … • Codification makes reviewing and managing history very easy • Fastly isn’t an exception
sorah/codily • https://github.com/sorah/codily • Simple tool to manage Fastly services • (Terraform is now available as an alternative, but we’ve never used it)
sorah/codily • codily --apply --target my-awesome-service --activate • So easy • Useful… especially when testing VCLs (Edit in local editor → Run command → Activation done → Test → Edit → …)
Retry with Restart • “restart” allows us to restart processing from (almost) the beginning • What’s important here is that “some variables are kept after restarts” • We can do all sorts of magic here, such as intentionally dispatching multiple backend requests within a single user request, using “restart” • (The original use case is just retrying, I guess)
Retry with Restart • We replicate some production data to development servers • But images are not copied; instead, we implement “retry on the production backend” when the development backend returns 404
Retry with Restart • (I found this is now mentioned in the official docs) • https://docs.fastly.com/guides/performance-tuning/checking-multiple-backends-for-a-single-request
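The control flow behind “retry on the production backend” can be modelled outside VCL. The following is only a sketch in Python, not the VCL used at Cookpad: the image stores, `make_backend`, and `fetch_with_restart` are hypothetical names, and the `restarts` counter stands in for the kind of state that survives a “restart”.

```python
from typing import Callable, Dict, List, Tuple

Response = Tuple[int, bytes]
Backend = Callable[[str], Response]

# Hypothetical image stores: development holds only a subset of production.
DEV_IMAGES: Dict[str, bytes] = {"/img/a.png": b"dev-a"}
PROD_IMAGES: Dict[str, bytes] = {"/img/a.png": b"prod-a", "/img/b.png": b"prod-b"}


def make_backend(store: Dict[str, bytes]) -> Backend:
    """Build a fake backend that answers 200 for known paths and 404 otherwise."""
    def backend(path: str) -> Response:
        if path in store:
            return 200, store[path]
        return 404, b""
    return backend


# Order matters: the first backend is tried first, the next one after a restart.
BACKENDS: List[Tuple[str, Backend]] = [
    ("development", make_backend(DEV_IMAGES)),
    ("production", make_backend(PROD_IMAGES)),
]


def fetch_with_restart(path: str, backends: List[Tuple[str, Backend]]) -> Tuple[str, int, bytes]:
    """Serve one user request, restarting on 404 while another backend remains."""
    restarts = 0  # state kept across restarts; it selects the backend
    while True:
        name, backend = backends[restarts]
        status, body = backend(path)
        if status == 404 and restarts + 1 < len(backends):
            restarts += 1  # "restart": run the request again from the top
            continue
        return name, status, body
```

Calling `fetch_with_restart("/img/b.png", BACKENDS)` tries development first, gets a 404, and then answers from production, all within a single user request.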
Dealing with X-Forwarded-For • Source IP of requests is important. • Logging, Analyzing … • We have to deal with X-Forwarded-For header when requests come from CDN
Dealing with X-Forwarded-For • Most web application frameworks whitelist the “private IP range” for proxies included in the X-Forwarded-For header • Carefully implementing a whitelist of Fastly IPs in every app has a maintenance cost • And some load balancers (e.g. AWS ELB) overwrite the XFF header!
Dealing with X-Forwarded-For • What we did: • Send X-Forwarded-For under a different name (requires VCL) • Overwrite X-Forwarded-For with it on our-side reverse proxy… when the request comes from Fastly IPs
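The reverse-proxy side of this can be sketched as a small header rewrite. This is a minimal illustration in Python under several assumptions, not the configuration used at Cookpad: the header name `X-CDN-Forwarded-For` is hypothetical, `FASTLY_RANGES` is a placeholder documentation range rather than the real published Fastly list, and how the proxy learns `source_ip` (direct peer or via a load balancer) is left out. Dropping the custom header on non-Fastly requests is an extra precaution added in this sketch so that clients cannot spoof it.

```python
import ipaddress
from typing import Dict

# Placeholder range (assumption); a real deployment would load Fastly's published IP list.
FASTLY_RANGES = [ipaddress.ip_network("192.0.2.0/24")]

# Hypothetical name of the header the CDN uses to carry the original X-Forwarded-For.
CDN_HEADER = "X-CDN-Forwarded-For"


def rewrite_forwarded_for(source_ip: str, headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of headers with X-Forwarded-For restored for Fastly traffic.

    Header names are assumed to be already normalized to the spelling used here.
    """
    out = dict(headers)
    addr = ipaddress.ip_address(source_ip)
    from_cdn = any(addr in net for net in FASTLY_RANGES)

    if from_cdn and CDN_HEADER in out:
        # Trust the CDN-supplied value only when the request came from a CDN address.
        out["X-Forwarded-For"] = out[CDN_HEADER]

    # Never pass the custom header on to applications.
    out.pop(CDN_HEADER, None)
    return out
```

With this in place, applications keep their default private-range handling of X-Forwarded-For, and the list of CDN addresses is maintained in one place instead of in every app.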
Use as a reverse proxy • We’re so carefully putting dynamic content on CDN • Use return (fetch); only in specific conditions • Much safer • We do the same whitelisting on our Varnish on EC2
What we require of a CDN • Stable • I think it has really improved in recent years • Don’t obscure failures, incidents • status.fastly.com is always truthful • High cache efficiency