Slide 1

Slide 1 text

An Introduction
j.mp/to-matsya

Slide 2

Slide 2 text

Ashwanth Kumar
Dev/Ops @ Indix
Hindu Mythology Fan
OSS Contributor
ashwanthkumar.in

Slide 3

Slide 3 text

Pre Presentation Poll

Slide 4

Slide 4 text

Pre Presentation Poll
● How many are using / have used AWS?
● How many of those are using / have used Spot Instances?
● How many of those instances are used in Production (like) Environment(s)?

Slide 5

Slide 5 text

AWS Spot Primer
● AWS is a Cloud provider of IaaS and PaaS
● They lease unused hardware at a lower cost as Spot instances
● No guarantees on how long they’re available
● Spot Prices are highly volatile
● But highly cost effective if used right
● Spot’s “Demand vs Supply” is local to its Spot Market

Slide 6

Slide 6 text

AWS Layout + Spot Market Primer

Slide 7

Slide 7 text

AWS Layout Primer - Regions
[Diagram: regions US-East, US-West, Singapore and Tokyo]

Slide 8

Slide 8 text

AWS Layout Primer - Availability Zones
Region - US East (North Virginia)
[Diagram: availability zones us-east-1a, us-east-1b and us-east-1c, each containing instances I1, I2, I3 and I4]

Slide 9

Slide 9 text

AWS Layout Primer - Instances
[Diagram: three groups of instances, each labelled I1, I2, I3 and I4]

Slide 10

Slide 10 text

AWS Spot Markets
For a cluster, the spot markets can be viewed along the following dimensions
● Instance Types, Regions and Availability Zones
The number of spot markets is the product of the sizes of these dimensions.
Example - Requirement for 36 CPUs per instance
● Instances - [d2.8xlarge, c4.8xlarge]
● AZs - [us-east-1a, us-east-1b, us-east-1c, …]
● Region - [us-east, us-west, …] - 9 regions
● Total in US-EAST (alone) => 2 instance types * 1 region * 5 AZs = 10 spot markets
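The market count in the example above can be reproduced with a short Python sketch. The function name and the zone labels are hypothetical: the slide only states that US-East has 5 AZs, so placeholder labels stand in for the real zone names.

```python
from itertools import product


def spot_markets(instance_types, zones_by_region):
    """Enumerate (instance type, region, zone) triples, one per spot market."""
    return [
        (itype, region, zone)
        for region, zones in zones_by_region.items()
        for itype, zone in product(instance_types, zones)
    ]


# Instance types from the slide's 36-CPU example.
instance_types = ["d2.8xlarge", "c4.8xlarge"]
# Placeholder zone labels (assumption): only the count of 5 comes from the slide.
zones_by_region = {"us-east": [f"zone-{i}" for i in range(1, 6)]}

markets = spot_markets(instance_types, zones_by_region)
print(len(markets))  # 2 instance types * 1 region * 5 AZs = 10
```

Adding more regions to `zones_by_region` grows the list accordingly, which is why the number of markets to watch rises quickly for multi-region clusters.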

Slide 11

Slide 11 text

Deep Dive github.com/ashwanthkumar/matsya

Slide 12

Slide 12 text

Matsya - Motivation
● You run Spot clusters to save on costs
● Clusters span across AZs to protect against Spot price fluctuations
● This results in HUGE data transfer costs
● ASGs (Auto Scaling Groups) always try to distribute the machines evenly and don’t take cost into account

Slide 13

Slide 13 text

Matsya
● Goal - Always optimize for cost and keep the fleet running
● Scala app that monitors spot prices and moves the ASG to the cheapest AZ
● Meant to be run as a cron task
● Can fall back to On-Demand (OD) if required
● Posts notifications to Slack when migrating
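The "move the ASG to the cheapest AZ" idea can be illustrated with a minimal Python sketch. This is not Matsya's actual code (Matsya is a Scala app); the function names and the prices are made up for illustration, and a real run would fetch prices from AWS instead of a hard-coded map.

```python
def cheapest_zone(spot_prices):
    """Return the (zone, price) pair with the lowest current spot price."""
    zone = min(spot_prices, key=spot_prices.get)
    return zone, spot_prices[zone]


def plan_move(current_zone, spot_prices):
    """Return the zone to move to, or None if already in the cheapest zone."""
    target, _ = cheapest_zone(spot_prices)
    return None if target == current_zone else target


# Made-up spot prices in USD per hour (assumption, for illustration only).
prices = {"us-east-1a": 0.31, "us-east-1b": 0.12, "us-east-1c": 0.27}

print(plan_move("us-east-1a", prices))  # us-east-1b
print(plan_move("us-east-1b", prices))  # None
```

Running a check like this on a schedule is what makes the cron-style deployment natural: each run is stateless apart from the current zone and the latest prices.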

Slide 14

Slide 14 text

How Matsya Works?

Slide 15

Slide 15 text

How Matsya Works?
[Diagram: ASG]

Slide 16

Slide 16 text

How Matsya Works?
[Diagram: Spot ASG]

Slide 17

Slide 17 text

How Matsya Works?
[Diagram: Spot ASG; availability zones us-east-1a, us-east-1c, ...]

Slide 18

Slide 18 text

How Matsya Works?
[Diagram: Spot ASG; availability zones us-east-1a, us-east-1c, ...]

Slide 19

Slide 19 text

How Matsya Works?
[Diagram: Spot ASG and an optional OD ASG; availability zones us-east-1a, us-east-1c, ...]

Slide 20

Slide 20 text

Matsya - Configuration

matsya {
  working-dir = "local_run"
  slack-webhook = "http://hooks.slack.com/services/foo/bar/baz"
  clusters = [{
    name = "Staging Hadoop Cluster"
    spot-asg = "as-hadoop-staging-spot"
    od-asg = "as-hadoop-staging-od"
    machine-type = "c3.2xlarge"
    bid-price = 0.420
    od-price = 0.420
    max-threshold = 0.99
    nr-of-times = 3
    fallback-to-od = false
    subnets = {
      "us-east-1a" = "subnet-east-1a"
      "us-east-1b" = "subnet-east-1b"
      "us-east-1c" = "subnet-east-1c"
    }
  }]
}
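The slide does not spell out what max-threshold and nr-of-times mean. One plausible reading, stated here purely as an assumption, is that a zone counts as too expensive once its spot price exceeds max-threshold * bid-price on nr-of-times consecutive checks. The Python sketch below illustrates that reading only; the function name and the price histories are hypothetical.

```python
def should_migrate(recent_prices, bid_price, max_threshold, nr_of_times):
    """True if the last nr_of_times prices all exceed max_threshold * bid_price.

    Assumed semantics, not confirmed by the slides.
    """
    if nr_of_times <= 0 or len(recent_prices) < nr_of_times:
        return False
    limit = max_threshold * bid_price
    return all(price > limit for price in recent_prices[-nr_of_times:])


# Values taken from the sample configuration.
bid_price, max_threshold, nr_of_times = 0.420, 0.99, 3

# Made-up price histories, oldest first (assumption, for illustration only).
print(should_migrate([0.30, 0.42, 0.43, 0.45], bid_price, max_threshold, nr_of_times))  # True
print(should_migrate([0.45, 0.43, 0.30], bid_price, max_threshold, nr_of_times))  # False
```

Requiring several consecutive violations, rather than a single one, would keep a brief price spike from triggering a migration.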

Slide 21

Slide 21 text

Matsya at Indix
Deployed across the board for 2 months now
Along with Vamana, enabled us to achieve
● ~50% of AWS Infrastructure is on Spot
● 100% of Hadoop MR workloads are on Spot

Slide 22

Slide 22 text

Work In Progress
● Support for other Spot products - Spot Fleet and Spot Blocks
● More notification systems
● Multiple Region support
● Multiple Product support
● Minimum number of OD instances
Questions?
github.com/ashwanthkumar/matsya

Slide 23

Slide 23 text

References
http://serverfault.com/questions/448746/ec2-auto-scaling-with-spot-and-on-demand-instances
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html
https://aws.amazon.com/ec2/spot/bid-advisor/