Slide 1

Slide 1 text

SCALING ON EC2 Shawn Stratton

Slide 2

Slide 2 text

WHY THE CLOUD? • Virtual infrastructure can be added quickly. • Control the costs. • Reduce the headaches.

Slide 3

Slide 3 text

WHY AMAZON? • Maturity - Amazon was the first real player in this space. • Additional Services Provided - RDS, SES, SNS, SQS. • Entire Ecosystem.

Slide 4

Slide 4 text

WHAT’S DIFFERENT ABOUT THE CLOUD? • Instances are NOT servers, they behave differently. • You’re bound by the provider’s SLA (Service Level Agreement). • You don’t have as many options.

Slide 5

Slide 5 text

SCALING

Slide 6

Slide 6 text

THE TWO FACES OF SCALING Vertical Horizontal

Slide 7

Slide 7 text

VERTICAL SCALING Instance

Slide 8

Slide 8 text

VERTICAL SCALING • Pros • Simple - No change in Code or Maintenance. • Cost Effective to a point. • Cons • Cutover is painful - to a point. • Limit to how far you can scale.

Slide 9

Slide 9 text

HORIZONTAL Instance

Slide 10

Slide 10 text

HORIZONTAL SCALING • Pros • Can scale up/down without downtime. • Smaller increments in capacity. • Cons • Code and Maintenance complexity.

Slide 11

Slide 11 text

LET’S WALK THROUGH SOME ARCHITECTURES US East 1A EC2 Instance Elastic IP Internet Static Assets S3 RDS

Slide 12

Slide 12 text

HORIZONTAL SCALING US East 1A EC2 Instance Internet Static Assets S3 RDS EC2 Instance

Slide 13

Slide 13 text

REDUNDANT HORIZONTAL US East 1A Internet Static Assets S3 App DB US East 1B App DB Master DB

Slide 14

Slide 14 text

INTRODUCING AUTO SCALING GROUPS • Automated scaling in the cloud! • Generally based on Cloud Watch Metrics • Self Healing based on Health Tests • Scales On: • CPU • Network • Disk Usage • Load Balancer Metrics
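The scaling and self-healing behaviour listed above can be sketched in a few lines of plain Python. This is only a simplified model, not Amazon's implementation: `ScalingPolicy`, `desired_capacity`, and `replace_unhealthy` are hypothetical names, and the metric is assumed to be a single number such as average CPU percent taken from CloudWatch.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical threshold policy: adjust capacity when a metric crosses a limit."""
    threshold: float   # metric value that triggers the policy
    adjustment: int    # instances to add (positive) or remove (negative)
    scale_up: bool     # True: fire when metric >= threshold; False: when metric <= threshold

    def triggered(self, metric: float) -> bool:
        return metric >= self.threshold if self.scale_up else metric <= self.threshold


def desired_capacity(current, metric, policies, min_size, max_size):
    """Apply the first triggered policy, then clamp to [min_size, max_size]."""
    for policy in policies:
        if policy.triggered(metric):
            current += policy.adjustment
            break
    return max(min_size, min(max_size, current))


def replace_unhealthy(instances, desired):
    """Self-healing: keep instances that pass their health test and
    report how many replacements are needed to get back to `desired`."""
    healthy = [name for name, ok in instances.items() if ok]
    return healthy, max(0, desired - len(healthy))
```

The min/max clamp is what keeps a runaway metric from launching an unbounded number of instances, or a quiet period from scaling the group to zero.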

Slide 15

Slide 15 text

GROUPS, LAUNCH CONFIGS, POLICIES, OH MY Auto Scaling Group CPU Up Policy CPU Down Policy Subnets Desired Size Max / Min Health Check Launch Configuration Instance Type Security Groups User Data (Init Scripts) Amazon Machine Image (AMI) Instances
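How the pieces in this diagram relate can be shown as a small data model. This is an illustrative sketch only: the class and field names mirror the labels on the slide, not the AWS API, and the AMI, subnet, and security group identifiers in the usage note are placeholders.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LaunchConfiguration:
    """What every new instance is built from."""
    ami: str                 # Amazon Machine Image ID
    instance_type: str
    security_groups: tuple
    user_data: str = ""      # init script run on first boot


@dataclass
class AutoScalingGroup:
    """Where instances run, how many, and which policies change that number."""
    launch_configuration: LaunchConfiguration
    subnets: list
    min_size: int
    max_size: int
    desired_size: int
    health_check: str = "EC2"
    policies: dict = field(default_factory=dict)  # policy name -> capacity adjustment

    def __post_init__(self):
        if not self.min_size <= self.desired_size <= self.max_size:
            raise ValueError("desired_size must lie between min_size and max_size")

    def apply_policy(self, name):
        """Move desired_size by the policy's adjustment, bounded by min/max."""
        target = self.desired_size + self.policies[name]
        self.desired_size = max(self.min_size, min(self.max_size, target))
        return self.desired_size
```

For example, a group built with `policies={"cpu_up": 1, "cpu_down": -1}` and `min_size=2, max_size=4, desired_size=2` moves to 3 after `apply_policy("cpu_up")`, and never drops below 2 no matter how often `cpu_down` fires.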

Slide 16

Slide 16 text

QUICK WORD ON DEPLOYMENTS

Slide 17

Slide 17 text

END GAME Internet Content Delivery Network (CloudFront, Akamai, etc) Elastic Load Balancer Elastic Load Balancer Application Servers Static Servers Database Servers Utility Servers Master Database Cron CMS / App Admin

Slide 18

Slide 18 text

LOAD BALANCING TOOLS

Slide 19

Slide 19 text

ELASTIC LOAD BALANCERS • Supports HTTP, HTTPS, TCP, SSL and “Custom” protocols. • Integrates with Auto Scaling Groups. • Simple Configuration. • Can be created to be Internal to a VPC only. • Can only be used as a CNAME or Route 53 Alias as IP addresses change.

Slide 20

Slide 20 text

HAPROXY • Lightweight HTTP and TCP proxy/load balancer. • Simple configuration, can pre-configure a class of servers. • Supports: • Round Robin, Least Connections, URI, URL parameter, HDR (header), and RDP cookie based balancing. • Configurable health checks and failover.
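The difference between a few of these balancing strategies is easiest to see side by side. The functions below are a rough Python sketch of the ideas, not HAProxy's code: the names are hypothetical, and the CRC32 hash is a stand-in for whatever hash HAProxy actually uses.

```python
import zlib
from itertools import cycle


def round_robin(servers):
    """Hand out servers in a fixed rotation, ignoring their current load."""
    return cycle(servers)


def least_connections(active):
    """Pick the server with the fewest active connections.
    `active` maps server name -> open connection count; ties go to the first listed."""
    return min(active, key=active.get)


def uri_hash(servers, uri):
    """Send the same URI to the same server every time, which helps cache hit rates."""
    return servers[zlib.crc32(uri.encode("utf-8")) % len(servers)]
```

Round robin suits short, uniform requests; least connections copes better with long-lived or uneven ones; URI hashing trades even load for cache locality.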

Slide 21

Slide 21 text

HTTP ACCELERATORS (REVERSE PROXIES)

Slide 22

Slide 22 text

VARNISH • C like configuration language. • Extendable. • Supports: • Edge Side Includes. • Stale-while-revalidate. • Redirecting, Rewriting, and URL Mapping.
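Stale-while-revalidate means a cached response that has just expired is still served immediately while a fresh copy is fetched in the background. The class below is a minimal Python sketch of that idea, not Varnish's implementation: the class name, the `pending` queue, and the explicit `revalidate()` call (standing in for a background fetch) are all assumptions made for illustration.

```python
class StaleWhileRevalidateCache:
    def __init__(self, fetch, ttl, stale_window, clock):
        self.fetch = fetch                # function: key -> fresh value from the backend
        self.ttl = ttl                    # seconds a response counts as fresh
        self.stale_window = stale_window  # extra seconds a stale response may be served
        self.clock = clock                # function returning the current time in seconds
        self.entries = {}                 # key -> (value, stored_at)
        self.pending = []                 # keys waiting for a background refresh

    def get(self, key):
        entry = self.entries.get(key)
        if entry is not None:
            value, stored_at = entry
            age = self.clock() - stored_at
            if age <= self.ttl:
                return value              # fresh hit
            if age <= self.ttl + self.stale_window:
                if key not in self.pending:
                    self.pending.append(key)
                return value              # stale hit: answer now, refresh later
        return self._store(key)           # miss or too stale: the client waits

    def _store(self, key):
        value = self.fetch(key)
        self.entries[key] = (value, self.clock())
        return value

    def revalidate(self):
        """Stand-in for the background fetch a real proxy would run."""
        while self.pending:
            self._store(self.pending.pop(0))
```

The payoff is that only the first request after a long idle period pays the backend latency; everyone else gets an answer from cache.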

Slide 23

Slide 23 text

APACHE TRAFFIC SERVER • Apache like configuration. • Extendable via plugins. • Supports: • Edge Side Includes (via plugin). • Stale-while-revalidate (via plugin). • Redirects, Rewrites, and URL Mapping.

Slide 24

Slide 24 text

CONFIGURATION MANAGEMENT

Slide 25

Slide 25 text

PUPPET • Ruby based. • Declarative format. • Uses Ruby templates. • Has orchestration tier - mcollective. • Large Open Source Community. • Puppet Forge.

Slide 26

Slide 26 text

CHEF • Ruby based. • Declarative recipes, a little more programmatic. • Uses Ruby templates. • Has orchestration tier in Enterprise. • Large Open Source Community. • Knife.

Slide 27

Slide 27 text

SALT • Python based. • Declarative format. • Started as an orchestration system. • Large Open Source Community.

Slide 28

Slide 28 text

AMAZON OPSWORKS

Slide 29

Slide 29 text

AMAZON OPSWORKS • Amazon service to manage “layers” of applications. • Based on Chef, adds AWS control. • Supports AutoScaling Groups. • Well documented in the Amazon Documentation.

Slide 30

Slide 30 text

MANAGEMENT TOOLS

Slide 31

Slide 31 text

MANAGEMENT CONSOLE

Slide 32

Slide 32 text

USING THE API • Can write your own API clients in any language you choose. • RESTful and SOAP APIs. • Amazon believes in “Dog Fooding”. • Popular SDKs out for many languages: • PHP (includes Zend Framework 2 integration for v2) • Java • Python • Ruby • Node.js • .NET • Android • iOS

Slide 33

Slide 33 text

CLOUD FORMATION • Uses JSON templates to build out infrastructure. • Can describe services to other services. • Supports: EC2 Instances & Security Groups, EBS Volumes, ELB, Elastic IPs, Auto Scaling Groups & Policies, RDS, DynamoDB, SimpleDB, SQS, SNS, Elastic Beanstalk, ElastiCache, CloudWatch alarms, CloudFront, S3, Identity & Access Management, Route 53 record management, VPC configuration including Subnets, Gateways, Route Tables, and ACLs.

Slide 34

Slide 34 text

CLOUD FORMATION TEMPLATE
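As a rough idea of what such a template contains, the Python snippet below builds a minimal one and serialises it to JSON. It is a sketch, not the template from the talk: the resource names `WebServer` and `WebServerAddress`, the `ImageId` parameter, and the instance type are placeholders chosen for illustration. The `Ref` entries are how one resource is described to another, here an Elastic IP pointed at an EC2 instance.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative sketch: one EC2 instance with an Elastic IP",
    "Parameters": {
        # The AMI is passed in at stack creation rather than hard-coded.
        "ImageId": {"Type": "String", "Description": "AMI to launch"}
    },
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "ImageId"},   # refers to the parameter above
                "InstanceType": "t2.micro",      # placeholder size
            },
        },
        "WebServerAddress": {
            "Type": "AWS::EC2::EIP",
            # Ref on an instance resolves to its instance ID.
            "Properties": {"InstanceId": {"Ref": "WebServer"}},
        },
    },
}

template_json = json.dumps(template, indent=2)
```

Because the template is plain JSON, it can be versioned alongside application code and reviewed like any other change.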

Slide 35

Slide 35 text

THIRD PARTY SOLUTIONS - RIGHTSCALE • Supports multiple vendors. • Uses Templates & RightScale images. • Basically replaces Amazon Console and Amazon specific services.

Slide 36

Slide 36 text

THIRD PARTY SOLUTIONS - OPEN SOURCE Scalr • Open Source. • Also available as SaaS via Scalr.com • Supports multiple cloud vendors. • GUI driven configuration. Asgard • Open Source. • Amazon specific. • Multi-Region Capable. • Uses Amazon concepts natively.

Slide 37

Slide 37 text

IMAGE CREATION

Slide 38

Slide 38 text

PACKER.IO • Creates images - AMI and others. • Uses JSON build file. • Supports multiple provisioners - Puppet, Chef, Salt.
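A JSON build file of this kind pairs a builder, which produces the image, with provisioners, which configure it. The Python snippet below assembles a small example and serialises it. Treat it as a sketch under assumptions: the region, source AMI, SSH user, image name, and the packages installed are placeholders, and a shell provisioner is used here only because it needs no extra setup.

```python
import json

packer_template = {
    "builders": [
        {
            "type": "amazon-ebs",               # build an EBS-backed AMI
            "region": "us-east-1",
            "source_ami": "ami-00000000",       # placeholder base image
            "instance_type": "t2.micro",
            "ssh_username": "ubuntu",
            "ami_name": "web-app-{{timestamp}}",  # Packer fills in the timestamp
        }
    ],
    "provisioners": [
        {
            "type": "shell",
            "inline": [
                "sudo apt-get update",
                "sudo apt-get install -y nginx",
            ],
        }
    ],
}

packer_json = json.dumps(packer_template, indent=2)
```

Swapping the shell provisioner for a Puppet, Chef, or Salt one lets the same configuration management code build images and manage running machines.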

Slide 39

Slide 39 text

AMINATOR • Only builds AMI. • Netflix backed. • Installs Packages, doesn’t necessarily “Build”.

Slide 40

Slide 40 text

MONITORING

Slide 41

Slide 41 text

POPULAR SERVICES NOT RECOMMENDED • Cacti. • Munin. • Ganglia. • Nagios/Icinga. These require configuration files to be altered for each machine.

Slide 42

Slide 42 text

CLOUD WATCH • Part of Management Console. • Stats available via API. • Default interval of 5 minutes, can be upgraded. • Can store custom metrics. • Data used by Auto Scaling Groups & Cloud Formation.

Slide 43

Slide 43 text

CLOUDWATCH

Slide 44

Slide 44 text

COLLECTD & GRAPHITE • Near real time stats. • Custom retention periods. • Various front-ends. • Endless ways to configure graphs. • No need to preconfigure stats, just send and it will record.
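"Just send and it will record" refers to Graphite's plaintext protocol: one line per data point, made up of a metric path, a value, and a Unix timestamp. The Python sketch below shows the shape of it. The function names and the metric path are made up for illustration, and the host and port are assumptions (2003 is the usual default for Carbon's plaintext listener).

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext line: '<metric path> <value> <unix timestamp>\\n'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="localhost", port=2003):
    """Open a TCP connection to the Carbon listener and send one line."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


line = format_metric("servers.web01.cpu.user", 12.5, 1400000000)
# line == "servers.web01.cpu.user 12.5 1400000000\n"
```

A newly launched instance can start reporting under its own path straight away, which is what makes this approach a good fit for auto scaled fleets.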

Slide 45

Slide 45 text

COLLECTD + GRAPHITE

Slide 46

Slide 46 text

COLLECTD + GRAPHITE

Slide 47

Slide 47 text

COLLECTD + GIRAFFE

Slide 48

Slide 48 text

STACKDRIVER • Attempts to be near real time. • Easy to configure & administer. • Fairly cheap considering alternatives. • Supports custom metrics.

Slide 49

Slide 49 text

STACKDRIVER

Slide 50

Slide 50 text

STACKDRIVER

Slide 51

Slide 51 text

HOW DO THESE LOOK IN PRODUCTION [Diagram: AWS Production Layout. A Virtual Private Cloud in us-east, duplicated across availability zones us-east-1c and us-east-1d. Components shown: Internet, Akamai, vat-elb, ext-ats, vat-ats, int-ats, haproxy, ext-dws, int-dws, static, vat-php, vat-python, ext-db, int-db, ext-solr, int-solr, int-memcache. Legend: Caching Server, Proxy / LB, Application Server, Data Source.]

Slide 52

Slide 52 text

THANK YOU