More bang for your buck: How Yelp autoscales Mesos & Marathon on AWS Spotfleet

Rob Johnson [email protected]

Yelp’s Mission: Connecting people with great local businesses.

• Mesos at Yelp
• Autoscaling
  ◦ Services
  ◦ Clusters
• AWS Spotfleet

Yelp’s Platform as a Service

Marathon

• Transitioned to a service-oriented architecture (SOA) over a 3-year period
• Monolith still exists, but is now deployed + monitored like any other service (just not so micro)

• 3 main production clusters
• ~900 Marathon Apps in biggest cluster
• ~5500 Mesos tasks
• ~600 Mesos Agents
• Spans metal DC and AWS

Autoscaling

Service Autoscaler

$ cat yelpsoa_configs/my_service/marathon-norcal-prod.yaml
main:
  cpu: 10
  mem: 500
  instances: 10
  cmd: dumb-init exec ./run-yelp

$ cat yelpsoa_configs/my_service/marathon-norcal-prod.yaml
main:
  cpu: 10
  mem: 500
  min_instances: 5
  max_instances: 20
  metrics_provider: cpu
  decision_policy: pid
  cmd: dumb-init exec ./run-yelp

*/10 * * * * sudo autoscale_all_services

service_autoscaler.py

target = get_target_utilization(service, instance)
real = get_real_utilization(service, instance)
required_instances = decision_policy.get_instances(target, real)
zk.set('/autoscaling/service/instance/instances', required_instances)
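The slide leaves `decision_policy.get_instances` abstract. As a rough illustration of what a proportional policy could compute, here is a minimal sketch. The function name, its parameters and the clamping to `min_instances`/`max_instances` are assumptions made for this example, not PaaSTA's actual code.

```python
import math


def proportional_instances(
    current_instances: int,
    target_utilization: float,
    real_utilization: float,
    min_instances: int,
    max_instances: int,
) -> int:
    """Hypothetical proportional policy: scale the count by real/target utilization."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    # If real utilization is above target we need proportionally more instances.
    wanted = math.ceil(current_instances * real_utilization / target_utilization)
    # Never leave the bounds configured for the service.
    return max(min_instances, min(wanted, max_instances))
```

For example, 10 instances running at 75% utilization against a 50% target would ask for 15 instances, provided 15 is within the configured bounds.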

deploy_daemon.py

zookeeper.watch('/autoscaling/', handle_instance_change)

def handle_instance_change(service, instance, new_val):
    current_val = get_instance_count(service, instance)
    if new_val >= current_val:
        # Scaling up (or no change): ask Marathon for the new count directly
        marathon.scale_app('service.instance', new_val)
    else:
        # Scaling down: drain instances before removing them
        drain_and_scale('service.instance')

Metrics Providers

- mesos_cpu
- uwsgi
- http

Decision Policies

- PID Controller
- Threshold
- Bespoke
- Proportional
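For readers unfamiliar with the first policy in the list, the following is a textbook PID controller sketch. The class name, the gains and the choice of error signal (real utilization minus target utilization) are assumptions for illustration. The deck does not show how Yelp's PID policy is implemented or tuned.

```python
class PIDController:
    """Textbook PID controller; gains are illustrative, not Yelp's values."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.integral = 0.0
        self.previous_error = None

    def update(self, error: float, dt: float) -> float:
        """Return the control output for the latest error sample."""
        if dt <= 0:
            raise ValueError("dt must be positive")
        # Integral term accumulates past error over time.
        self.integral += error * dt
        # Derivative term is zero on the first sample (no history yet).
        if self.previous_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

With the error defined as real minus target utilization, a positive output would suggest adding instances and a negative one removing them.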

Results

Cluster Autoscaler

• Single decision policy - proportional
• Runs every 20 minutes
• Aim for 80% utilization
• Errs on the side of defence - lots of checks to avoid accidentally killing too much capacity
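A minimal sketch of a proportional cluster decision with one defensive check might look like the following. The 80% target comes from the slide. The function name and the `max_scale_down_fraction` guard are assumptions, standing in for the unspecified "lots of checks".

```python
TARGET_UTILIZATION = 0.8  # "Aim for 80% utilization" (from the slide)


def next_capacity(
    current_capacity: float,
    utilization: float,
    max_scale_down_fraction: float = 0.1,  # hypothetical safety limit per run
) -> float:
    """Proportional target capacity, never shrinking too far in a single run."""
    desired = current_capacity * utilization / TARGET_UTILIZATION
    # Defensive check: cap how much capacity one run may remove.
    floor = current_capacity * (1.0 - max_scale_down_fraction)
    return max(desired, floor)
```

Scaling up is unrestricted here, while scaling down is limited per run, so a bad utilization reading cannot remove most of the cluster at once.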

• Give each host a ‘fitness’ score according to how much churn is caused by shutting it down.
• Data points = AWS events, number of tasks, Chronos batches.
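One way to picture the scoring is sketched below. The data points are the ones named on the slide. The `Host` type, the weights and the rule that a pending AWS event makes a host a preferred candidate are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    hostname: str
    running_tasks: int
    chronos_batches: int
    has_aws_event: bool  # e.g. a scheduled event already affecting the host


def churn_score(host: Host) -> int:
    """Lower score means less churn if this host is shut down (hypothetical weights)."""
    score = host.running_tasks + 5 * host.chronos_batches
    if host.has_aws_event:
        # Assumption: a host that AWS will disrupt anyway is the cheapest to lose.
        score -= 100
    return score


def pick_hosts_to_kill(hosts: List[Host], count: int) -> List[Host]:
    """Return the `count` hosts whose shutdown causes the least churn."""
    return sorted(hosts, key=churn_score)[:count]
```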

AWS Spot Fleet

- Users bid for Amazon’s spare capacity
- Lowest winning bid is the $$ paid

[Diagram, step 1: 3 slots used, 4 available. Bids: User A - $4 (x2), User B - $3, User C - $2 (x2), User D - $1 (x3)]
[Diagram, step 2: the 4 available slots go to User A (x2), User B and User C; each pays $2]
[Diagram, step 3: bids change to User A - $4 (x2), User B - $3 (x2), User C - $2 (x2), User D - $1 (x2); the 4 slots go to User A (x2) and User B (x2); each pays $3]
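The bidding example from the slides can be reproduced with a few lines of code. This is a sketch of the simplified model shown in the deck (highest bids win, everyone pays the lowest winning bid), not a description of how AWS actually computes spot prices. The function name `clear_market` is made up for this example.

```python
from typing import List, Optional, Tuple

Bid = Tuple[str, int]  # (user, bid in dollars)


def clear_market(bids: List[Bid], available: int) -> Tuple[List[str], Optional[int]]:
    """Fill `available` slots with the highest bids; all winners pay the lowest winning bid."""
    ranked = sorted(bids, key=lambda bid: bid[1], reverse=True)
    winners = ranked[:available]
    if not winners:
        return [], None
    price_paid = winners[-1][1]
    return [user for user, _ in winners], price_paid


first_round = [("A", 4), ("A", 4), ("B", 3), ("C", 2), ("C", 2), ("D", 1), ("D", 1), ("D", 1)]
second_round = [("A", 4), ("A", 4), ("B", 3), ("B", 3), ("C", 2), ("C", 2), ("D", 1), ("D", 1)]
```

With 4 available slots, `first_round` clears at $2 and `second_round` clears at $3, which pushes User C out of the market.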

Conditions of Sale

2 minute timer
http://169.254.169.254/latest/meta-data/spot/termination-time
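An agent on each host can poll this metadata URL to learn about the 2 minute notice. The sketch below assumes the endpoint answers 404 until the instance is marked for termination and then returns a timestamp, and that the metadata service accepts a plain unauthenticated GET. The function name and the injectable `opener` parameter are illustrative choices made so the code can be exercised off-instance.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional

TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"


def get_termination_time(
    url: str = TERMINATION_URL,
    opener: Callable = urllib.request.urlopen,
    timeout: float = 1.0,
) -> Optional[str]:
    """Return the termination timestamp, or None if no termination is scheduled."""
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8").strip()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Not marked for termination (assumed behaviour of the endpoint).
            return None
        raise
    return body or None
```

A caller would run this on a short interval and begin draining the host as soon as a timestamp comes back.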

Strategies for reducing risk

• High Bid Price
  ◦ Savings in low periods will outweigh expenditure in expensive periods
  ◦ We bid 2X instance price

• Diversify by AZ, Instance Type
  ◦ Ask Amazon to fulfill diversifying across instance types, rather than picking the cheapest selection (Allocation Strategy)

module "norcal-prod-uswest1a-highcpus6" {
  source = "git::ssh://[email protected]/terraform-modules/paasta_spot_cluster"
  cluster = "norcal-prod"
  region = "${var.region}"
  account = "${var.account}"
  ecosystem = "${var.ecosystem}"
  instances_data = "${file("instances_high_cpus_weighted.json")}"
  account_id = "${var.account_id}"
  valid_until = "2118-12-31T23:59:59Z"
  # One unit = 100 vCPU
  min_capacity = 7
  max_capacity = 70
  ami_type = "paasta-optimized"
  initial_target_capacity = 25
  spot_price = 0.154
  instance_profile = "paasta"
}

robj@xenialdev1-uswest1cdevc:~/terraform/paasta master % cat instances_high_cpus_weighted.json
{
  "instance_data": [
    { "type": "c4.4xlarge", "price": "2.098", "weight": "0.15" },
    { "type": "c4.8xlarge", "price": "4.196", "weight": "0.35" },
    { "type": "m4.4xlarge", "price": "2.234", "weight": "0.15" }
  ]
}

Slide callouts, in order: 16 vCPUs, 36 vCPUs, 14 vCPUs

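To see how the weights relate to the capacity numbers in the Terraform module (one unit = 100 vCPU), the sketch below adds up the weighted capacity of a mix of instances. The weights are copied from the JSON on the slide. The helper names and the example fleet composition are hypothetical.

```python
import json
from decimal import Decimal
from typing import Dict

INSTANCES_JSON = """
{"instance_data": [
  {"type": "c4.4xlarge", "price": "2.098", "weight": "0.15"},
  {"type": "c4.8xlarge", "price": "4.196", "weight": "0.35"},
  {"type": "m4.4xlarge", "price": "2.234", "weight": "0.15"}
]}
"""


def load_weights(raw: str) -> Dict[str, Decimal]:
    """Map instance type to its capacity weight; Decimal keeps the string values exact."""
    data = json.loads(raw)
    return {entry["type"]: Decimal(entry["weight"]) for entry in data["instance_data"]}


def fleet_capacity(counts: Dict[str, int], weights: Dict[str, Decimal]) -> Decimal:
    """Total capacity in units contributed by `counts` instances of each type."""
    return sum((weights[itype] * n for itype, n in counts.items()), Decimal(0))


weights = load_weights(INSTANCES_JSON)
# Hypothetical mix: 10 x c4.4xlarge + 10 x c4.8xlarge = 1.5 + 3.5 = 5 units
example_units = fleet_capacity({"c4.4xlarge": 10, "c4.8xlarge": 10}, weights)
```

Under these weights, reaching the module's `min_capacity` of 7 units would take, for instance, 20 c4.8xlarge instances.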
PaaSTA’s best effort shut down

http://localhost:5050/maintenance/schedule
client.scale_app('yelp-main.main', 202)
http://localhost:5050/master/reserve
client.kill_task('yelp-main.main-foo')
$updown down all
http://169.254.169.254/latest/meta-data/spot/termination-time

Is it worth it?

Type / Region     us-west-1   us-east-1   us-west-2
c3.4xlarge        29.00%      0.00%       0.00%
c3.8xlarge        27.00%      0.00%       42.00%
c4.4xlarge        52.00%      49.00%      78.00%
c4.8xlarge        49.00%      53.00%      81.00%
m4.10xlarge       65.00%      77.00%      65.00%
m4.16xlarge       47.00%      59.00%      58.00%
m4.4xlarge        60.00%      70.00%      62.00%
r3.4xlarge        32.00%      0.00%       34.00%
r3.8xlarge        41.00%      0.00%       48.00%
r4.16xlarge       71.00%      62.00%      61.00%
r4.4xlarge        45.00%      49.00%      35.00%
r4.8xlarge        48.00%      34.00%      42.00%
Weighted Total    47.00%      51.00%      60.00%

Future Plans

Predictive vs Reactive Autoscaling

Parallel Scaling

Deployment to more services

Conclusion

• Autoscaling provides Yelp business value as we save $$$ by reducing excess capacity
• Running at 80% efficiency means we can quickly scale up services.
• Spotfleet can further reduce our AWS bill, but comes with significant risk.
• Mesos maintenance primitives provide building blocks for us to reduce this risk.

Questions

www.yelp.com/careers/ We're Hiring!

@YelpEngineering fb.com/YelpEngineers engineeringblog.yelp.com github.com/yelp
