Slide 1

Slide 1 text

Introducing Vamana j.mp/to-vamana

Slide 2

Slide 2 text

About Me
● Dev/Ops @ Indix
● Hindu Mythology Fan
● OSS Contributor
● ashwanthkumar.in

Slide 3

Slide 3 text

Auto scalars are init systems in your data center

Slide 4

Slide 4 text

Enter AWS

Slide 5

Slide 5 text

Amazon ASG - Good Parts
● Scales massively
● Has Cool Off capabilities to avoid scale storms
● Auto Rebalancing across AZs / Subnets
● Integration with ELB
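To illustrate the cool-off idea, here is a minimal sketch of a guard that suppresses scaling actions that arrive too soon after the previous one. It is not how AWS implements cooldowns; the class name `CooldownGuard` and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional


class CooldownGuard:
    """Hypothetical helper: allows a scaling action only outside the cool-off window."""

    def __init__(self, cooldown_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the guard can be tested without sleeping
        self.last_action_at: Optional[float] = None

    def try_scale(self) -> bool:
        """Return True and record the action if the cool-off period has elapsed."""
        now = self.clock()
        if (self.last_action_at is not None
                and now - self.last_action_at < self.cooldown_seconds):
            return False  # still cooling off, so the action is suppressed
        self.last_action_at = now
        return True
```

A burst of scale requests within the window results in a single action, which is what prevents a scale storm.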

Slide 6

Slide 6 text

Amazon ASG - Limitations
● Can actively support only 1 launch configuration
  ○ Single instance type
  ○ Single life cycle - Spot / OD
● Auto Scaling is tightly coupled with CloudWatch only
● Alarms can be triggered (automatically) but only based on a single metric
● Limited stat functions - avg, sum, min, max and # data samples
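As a sketch of what the last two limitations rule out, the snippet below makes a decision from two metrics at once, using a statistic that is not in the list above (a nearest-rank percentile). The function names and the 90th-percentile default are assumptions made for illustration only.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile; pct must be in (0, 100]."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def should_scale_up(demand: Sequence[float], supply: Sequence[float],
                    pct: float = 90) -> bool:
    """Hypothetical rule: scale up when demand's percentile exceeds supply's."""
    return percentile(demand, pct) > percentile(supply, pct)
```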

Slide 7

Slide 7 text

Enter Vamana http://github.com/ashwanthkumar/vamana2

Slide 8

Slide 8 text

Vamana Architecture
Vamana
- Collect Demand and Supply metrics from CloudWatch / Custom Metrics System
- Use the Scalar Implementation to compute the new desired capacity.
- Update the ASG with the new “Desired”.
Diagram labels: Push Demand and Supply Metrics / Get Demand and Supply Metrics / Set the computed “Desired” Value
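The three steps above can be sketched as one iteration of a control loop. This is not Vamana's actual code: `MetricStore`, `Scalar`, `AutoScalar`, `Metrics` and `run_once` are hypothetical names, and clamping to a min/max node count is an added assumption.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Metrics:
    demand: float
    supply: float


class MetricStore(Protocol):
    def fetch(self, cluster: str) -> Metrics: ...


class Scalar(Protocol):
    def desired_capacity(self, metrics: Metrics, current: int) -> int: ...


class AutoScalar(Protocol):
    def current_capacity(self, cluster: str) -> int: ...
    def set_desired(self, cluster: str, desired: int) -> None: ...


def run_once(cluster: str, store: MetricStore, scalar: Scalar,
             autoscalar: AutoScalar, min_nodes: int, max_nodes: int) -> int:
    """One pass: collect metrics, compute desired capacity, update the group."""
    metrics = store.fetch(cluster)                       # step 1: collect
    current = autoscalar.current_capacity(cluster)
    desired = scalar.desired_capacity(metrics, current)  # step 2: compute
    desired = max(min_nodes, min(max_nodes, desired))    # assumed safety bounds
    if desired != current:
        autoscalar.set_desired(cluster, desired)         # step 3: update
    return desired
```

Keeping the three collaborators behind small interfaces is what lets each one be swapped independently.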

Slide 9

Slide 9 text

Vamana for Hadoop1
Vamana
- Collect Demand and Supply metrics from CloudWatch / Custom Metrics System
- Use the Scalar Implementation to compute the new desired capacity.
- Update the ASG with the new “Desired”.
Diagram labels: Push MR Demand and Supply Metrics / Get MR Demand and Supply Metrics / Set the computed “Desired” Value

Slide 10

Slide 10 text

Demand vs Supply Metrics for Hadoop
● We collect supply metrics from the Cluster Summary table
  ○ map_supply
  ○ reduce_supply
● Demand metrics are collected as the cumulative sum of map & reduce tasks of all running jobs
  ○ map_demand
  ○ reduce_demand
https://github.com/ashwanthkumar/hadoop-as-publisher
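The demand calculation can be sketched as a sum over running jobs. The `Job` record and its fields are hypothetical stand-ins for whatever the publisher reads from Hadoop, and the slide does not say whether the task counts are total or pending; only the metric names `map_demand` and `reduce_demand` come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Job:
    state: str         # assumed values such as "RUNNING" or "SUCCEEDED"
    map_tasks: int     # map tasks attributed to this job
    reduce_tasks: int  # reduce tasks attributed to this job


def demand_metrics(jobs: Iterable[Job]) -> Dict[str, int]:
    """Sum map and reduce tasks across running jobs only."""
    running = [job for job in jobs if job.state == "RUNNING"]
    return {
        "map_demand": sum(job.map_tasks for job in running),
        "reduce_demand": sum(job.reduce_tasks for job in running),
    }
```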

Slide 11

Slide 11 text

Demand vs Supply Metrics for Hadoop

Slide 12

Slide 12 text

Vamana - Configuration

Slide 13

Slide 13 text

After Vamana - Demand vs Supply

Slide 14

Slide 14 text

After Vamana - Cost Reduction
Savings: ~$30 per day

Slide 15

Slide 15 text

Vamana
(Architecture diagram repeated from Slide 8)

Slide 16

Slide 16 text

Vamana - Pluggable App Scalar
(Architecture diagram repeated from Slide 8)

Slide 17

Slide 17 text

Vamana - Pluggable Metric Store
(Architecture diagram repeated from Slide 8)

Slide 18

Slide 18 text

Vamana - Pluggable Auto scalar
(Architecture diagram repeated from Slide 8)

Slide 19

Slide 19 text

Questions? Thank you! https://github.com/ashwanthkumar/vamana2

Slide 20

Slide 20 text

Meta
Implementation Details and other notes

Slide 21

Slide 21 text

Metric Collector

Slide 22

Slide 22 text

App Scalar

Slide 23

Slide 23 text

Hadoop Scalar
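One way a Hadoop scalar could turn map and reduce demand into a node count is sketched below. This is a hypothetical illustration, not the code in vamana2: the function name, the slots-per-node parameters and the min/max bounds are all assumptions.

```python
import math


def hadoop_desired_nodes(map_demand: int, reduce_demand: int,
                         map_slots_per_node: int, reduce_slots_per_node: int,
                         min_nodes: int, max_nodes: int) -> int:
    """Nodes needed to serve the larger of map and reduce demand, within bounds."""
    if map_slots_per_node <= 0 or reduce_slots_per_node <= 0:
        raise ValueError("slots per node must be positive")
    nodes_for_maps = math.ceil(map_demand / map_slots_per_node)
    nodes_for_reduces = math.ceil(reduce_demand / reduce_slots_per_node)
    needed = max(nodes_for_maps, nodes_for_reduces)
    return max(min_nodes, min(max_nodes, needed))  # clamp to the allowed range
```

Taking the larger of the two requirements assumes every node offers both map and reduce slots.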

Slide 24

Slide 24 text

Auto scalar

Slide 25

Slide 25 text

Hadoop @ Indix
● We have many data pipelines running on various versions of Hadoop
● Each of them has its own usage pattern
  ○ A staging cluster has workloads for only 3-4 hours a day
  ○ A production cluster has workloads 24x7
● We started adding Scale Up and Scale Down stages to our data pipelines
● Initially it helped, but it started breaking when
  ○ A pipeline failed before completion and the cluster was not scaled down
  ○ Every new pipeline had to have its own scale up and scale down stages
  ○ More than 1 pipeline started sharing the cluster