Latency's Worst Nightmare: Performance Tuning Tips and Tricks

Delivering high performance web applications, using Amazon Web Services.

Matt Wood

April 18, 2013

Transcript

  1. Latency’s Worst Nightmare: Performance Tuning Tips and Tricks. Matt Wood, Principal Data Scientist, @mza
  2. Hello.

  3. Let’s talk about performance...

  4. Productivity and response time. Figure 3: Interactive user productivity versus computer response time for human-intensive interactions for system A. [Chart: interactive user productivity (IUP) and the human-intensive component of IUP, measured data, plotted against computer response time in seconds.] A. J. Thadhani, IBM Systems Journal 20 (4), 1981
  5. Page load time and average daily searches per user. [Chart: percent change, from 0.00% down to -0.70%, for added delays of 50ms pre-header, 100ms pre-header, 200ms post-header, 200ms post-ads and 400ms post-header.] http://www.webperformancetoday.com/2013/04/10/cloud-connect-2013-web-acceleration-and-front-end-optimization-slides/
  6. Page load delay and business metrics. [Chart: percent change, from 0.00% down to -5.00%, against added delay (50, 200, 500, 1000, 2000) for queries per visitor, query refinement, revenue per visitor, any clicks and satisfaction.] http://www.webperformancetoday.com/2013/04/10/cloud-connect-2013-web-acceleration-and-front-end-optimization-slides/
  7. 2.2s reduction in page load time. 15.4% increase in conversion rate. https://blog.mozilla.org/metrics/2010/04/05/firefox-page-load-speed-%E2%80%93-part-ii/
  8. “Speed is more than a feature.” @fredwilson

  9. Let’s talk about a web request...

  10. Initial connection SSL negotiation Time to first byte Content download

    %CPU kbps
  11. Initial connection SSL negotiation Time to first byte Content download

    %CPU kbps Remote
  12. Initial connection SSL negotiation Time to first byte Content download

    %CPU kbps Remote Browser
  13. Initial connection SSL negotiation Time to first byte Content download

    %CPU kbps Remote Browser
  14. http://www.stevesouders.com/blog/2012/02/10/the-performance-golden-rule/

  15. None
  16. None
  17. Let’s talk about a web request...

  18. Web server Application logic Database

  19. Web server Application logic Database

  20. Web server Application logic Database

  21. Web server Application logic Database

  22. Web server Application logic Database

  23. Web server Application logic Database

  24. Web server Application logic Database

  25. Web server Application logic Database

  26. Web server Application logic Database 3 4 6 5 7

    2 1
  27. Web server Application logic Database 3 4 6 5 7

    2 1
  28. Initial connection SSL negotiation Time to first byte Content download

    %CPU kbps TTFB
  29. Initial connection SSL negotiation Time to first byte Content download

    %CPU kbps Network latency Download + negotiation time
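
The four phases named on the waterfall slides can be separated with plain timestamp arithmetic. The sketch below is illustrative only: `RequestTimings` and its field names are hypothetical, and the timestamps would have to come from whatever instrumentation is already in place.

```python
from dataclasses import dataclass


@dataclass
class RequestTimings:
    """Timestamps in milliseconds since the request started (hypothetical fields)."""
    connected: float   # TCP connection established
    ssl_done: float    # SSL negotiation finished
    first_byte: float  # first byte of the response received
    finished: float    # last byte of the response received


def phase_durations(t: RequestTimings) -> dict:
    """Split one request into the four phases shown on the slides."""
    return {
        "initial_connection": t.connected,
        "ssl_negotiation": t.ssl_done - t.connected,
        "time_to_first_byte": t.first_byte - t.ssl_done,
        "content_download": t.finished - t.first_byte,
    }


if __name__ == "__main__":
    # Made-up sample values, only to show the shape of the output.
    sample = RequestTimings(connected=40, ssl_done=110, first_byte=310, finished=450)
    for phase, ms in phase_durations(sample).items():
        print(f"{phase}: {ms:.0f} ms")
```

Comparing `time_to_first_byte` across client locations is one way to see how much of the wait is back-end work and how much is network latency.
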
  30. None
  31. Variable TTFB based on geographic location. NYC London Sydney

  32. Reduce internet induced latency. CloudFront content delivery network. Lower latency.

    Faster downloads.
  33. Reduce internet induced latency. CloudFront content delivery network. Lower latency.

    Faster downloads.
  34. Europe Amsterdam (2) Dublin Frankfurt (2) London (2) Madrid Milan

    Paris (2) Stockholm South America Sao Paulo North America Ashburn, VA (2) Dallas, TX (2) Hayward, CA Jacksonville, FL Los Angeles, CA (2) Miami, FL Newark, NJ New York, NY (3) Palo Alto, CA Seattle, WA San Jose, CA South Bend, IN St. Louis, MO CloudFront Edge Locations
  35. Static and dynamic content. Cache dynamic pages (search results). Use query strings or cookies for cache keys. Network and path optimizations accelerate even unique content.
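
To make the idea of query strings and cookies as cache keys concrete, here is a minimal sketch. It is not CloudFront's implementation; `cache_key`, `key_params` and `key_cookies` are hypothetical names, and the key format is an assumption chosen for readability.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url, cookies=None, key_params=(), key_cookies=()):
    """Build a cache key from the path plus selected query parameters and cookies.

    Only names listed in key_params / key_cookies take part in the key, so
    unrelated values (tracking tags, session ids) do not fragment the cache.
    """
    parts = urlsplit(url)
    params = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in key_params
    )
    chosen_cookies = sorted(
        (name, value)
        for name, value in (cookies or {}).items()
        if name in key_cookies
    )
    return "|".join([parts.path, urlencode(params), urlencode(chosen_cookies)])


if __name__ == "__main__":
    print(cache_key(
        "http://example.com/search?q=cats&utm=newsletter",
        cookies={"lang": "en", "sid": "abc123"},
        key_params=("q",),
        key_cookies=("lang",),
    ))
```

Two search requests that differ only in parameters outside the key map to the same cached page, which is what makes a dynamic page such as a search result cacheable at all.
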
  36. Web server Application logic Database 3 4 6 5 7

    2 1 Content delivery
  37. 7 Web server Application logic Database 3 4 6 5

    1 2 Content delivery
  38. Web server Application server Database

  39. Web server Application server Database

  40. Application server Database

  41. Application server Thread safe? Application state? Database

  42. Web server Application server Database

  43. Web server Application server Database Unit of scale

  44. Web server Application server Database Load balancer

  45. Web server Application server Database Web server Application server Web

    server Application server Load balancer
  46. Load balancer Web server Application server Database Web server Application

    server Web server Application server
  47. Load balancer Web server Application server Database Web server Application

    server Web server Application server
  48. Decouple your service tiers Separation of concerns. Easier to manage.

    Drives higher availability.
  49. Build for horizontal scale Decrease request contention. Reduce capacity planning

    headaches. Requires a stateless application architecture.
  50. Small things, loosely coupled. Do one thing, and do it well. The Unix Way. Take a look at the Unicorn and Rainbows approach. Asynchronous by default (where possible).
  51. Reduce response time. Concurrency limits can reappear quickly. Limit impact

    on performance through rapid scaling.
  52. Load balancer

  53. Load balancer

  54. Load balancer

  55. Fast booting with EBS-backed instances. Linux is faster to boot than Windows. EBS-backed instances boot faster than S3-backed instances.
  56. Pre-baked EBS-backed AMIs. Each code deployment creates a new AMI.

    AMI is the unit of deployment.
  57. Automate response with Auto Scaling. Set operational thresholds. Faster than

    manual response (especially at 3am).
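
A rough sketch of what "set operational thresholds" means in practice. This is not the Auto Scaling API: `desired_capacity` is a hypothetical function, and the 70%/30% CPU thresholds and the fleet limits are placeholder assumptions, not recommendations from the talk.

```python
def desired_capacity(current, cpu_percent, scale_out_at=70.0, scale_in_at=30.0,
                     minimum=2, maximum=20):
    """Return the new instance count after one evaluation of average CPU.

    Adds one instance above the upper threshold, removes one below the
    lower threshold, and always stays within [minimum, maximum].
    """
    if cpu_percent >= scale_out_at:
        target = current + 1
    elif cpu_percent <= scale_in_at:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


if __name__ == "__main__":
    for cpu in (85.0, 50.0, 20.0):
        print(cpu, "->", desired_capacity(4, cpu))
```

Keeping a gap between the two thresholds avoids flapping, where the fleet grows and shrinks on every evaluation.
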
  58. Time-based response with OpsWorks. Set operational times. ‘Follow the sun’

    response.
  59. Pre-emptive scaling. Contact your account manager, or get in touch

    via a Premium Support ticket.
  60. Reserve capacity. Lower costs. Guaranteed availability.

  61. 7 Database 3 4 6 5 1 2 Content delivery

    Horizontal scale
  62. 7 Database 4 6 5 1 2 Content delivery Horizontal

    scale 3
  63. A range of instance types Full spectrum of price/performance options.

  64. Choosing the right instance type. Application specific. Benchmark. CloudWatch metrics.

  65. Benchmark on business metrics. Relate application metrics to business metrics.

    Customers supported/instance. Photos processed/dollar.
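
Relating a benchmark to a business metric such as photos processed per dollar is simple arithmetic. In the sketch below the instance names, throughput figures and hourly prices are all made-up placeholders, not measurements or real prices.

```python
def units_per_dollar(units_per_hour, hourly_price):
    """Business metric: work completed for each dollar spent."""
    return units_per_hour / hourly_price


def rank_by_value(benchmarks):
    """benchmarks maps a name to (units_per_hour, hourly_price); best value first."""
    return sorted(
        benchmarks,
        key=lambda name: units_per_dollar(*benchmarks[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Placeholder numbers only: replace with your own benchmark results.
    photos = {
        "type_a": (1200, 0.25),
        "type_b": (2600, 0.50),
        "type_c": (4000, 1.00),
    }
    for name in rank_by_value(photos):
        print(name, units_per_dollar(*photos[name]))
```

With these placeholder figures the largest option processes the most photos per hour yet delivers the fewest per dollar, which is the kind of result a raw throughput benchmark would hide.
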
  66. The Canary in the Coal Mine Standardize on 64-bit AMIs.

    Deploy across instance types. Evaluate new instance types with real traffic.
  67. None
  68. None
  69. None
  70. 7 Database 4 6 5 1 2 Content delivery Horizontal

    scale 3 Instance selection
  71. 7 Database 6 5 1 2 Content delivery Horizontal scale

    3 Instance selection 4
  72. Interface with the data store. Faster if you don’t have

    to go to disk. Increased concurrency.
  73. Caching. Store query results in memory. Writes go to disk.

  74. Amazon ElastiCache. Deploy, operate and scale in-memory caches.

  75. Managing state. Transient data only. Web server state, high score

    tables, etc. Time consuming task results (many to many query results).
  76. Best practices. Assume cold cache latency in application architecture. Set appropriate time-to-live. Batch requests rather than issuing single requests sequentially. Architect for cache failure.
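
The best practices on the slide can be sketched as a cache-aside read path. `TTLCache` is a tiny in-memory stand-in, not the ElastiCache or memcached client API, and `fetch` and `load_from_db` are hypothetical names used for illustration.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache cluster, with a per-entry time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get_many(self, keys):
        """Return only the entries that exist and have not expired."""
        now = self._clock()
        return {k: self._data[k][0] for k in keys
                if k in self._data and self._data[k][1] > now}

    def set_many(self, items, ttl):
        expires = self._clock() + ttl
        for key, value in items.items():
            self._data[key] = (value, expires)


def fetch(keys, cache, load_from_db, ttl=60):
    """Cache-aside read: one batched cache lookup, one batched DB read for misses."""
    try:
        found = cache.get_many(keys)
    except Exception:
        # Architect for cache failure: an unreachable cache is treated as all misses.
        return load_from_db(keys)
    missing = [k for k in keys if k not in found]
    if missing:
        loaded = load_from_db(missing)
        found.update(loaded)
        try:
            cache.set_many(loaded, ttl)
        except Exception:
            pass  # a failed cache write must not fail the request
    return found
```

Because every miss falls through to `load_from_db`, the data store still has to be sized for the cold-cache case, for example just after a cache node is replaced.
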
  77. 7 Database 6 5 1 2 Content delivery Horizontal scale

    3 Instance selection 4 Caching
  78. 7 Database 6 1 2 Content delivery Horizontal scale 3

    Instance selection 4 Caching 5
  79. Accelerating reads. Vertical scale vs horizontal scale.

  80. Horizontal scale. Add additional DB resources for scale. Scale out

    for reads. Shard for reads and writes.
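
A minimal sketch of "scale out for reads, shard for reads and writes". The shard layout, the host names and the `ShardedRouter` class are hypothetical, and a plain hash-modulo scheme is assumed for simplicity; it does not handle resharding.

```python
import hashlib
import itertools


class ShardedRouter:
    """Send writes to a shard's primary and spread reads over its replicas."""

    def __init__(self, shards):
        # shards: list of {"primary": name, "replicas": [names]} (hypothetical layout)
        self._shards = shards
        # Fall back to the primary when a shard has no read replicas.
        self._replica_cycles = [
            itertools.cycle(s["replicas"] or [s["primary"]]) for s in shards
        ]

    def _shard_index(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self._shards)

    def for_write(self, key):
        return self._shards[self._shard_index(key)]["primary"]

    def for_read(self, key):
        # Round-robin over the replicas of the shard that owns this key.
        return next(self._replica_cycles[self._shard_index(key)])


if __name__ == "__main__":
    router = ShardedRouter([
        {"primary": "db0", "replicas": ["db0-r1", "db0-r2"]},
        {"primary": "db1", "replicas": ["db1-r1"]},
    ])
    print(router.for_write("user:42"), router.for_read("user:42"))
```

Adding replicas raises read capacity only; write capacity grows only by adding shards, which is why the slide separates the two.
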
  81. Vertical scale. More resources for a single DB engine. Add

    memory for DB caches. Add CPU for more intensive queries.
  82. Scaling IO. Provision throughput for your app. Reserved. Available. Elastic.

    Available on EBS, Amazon RDS and DynamoDB.
  83. Provisioned throughput is consistent. Consistent, predictable performance. Relational databases with

    RDS. NoSQL stores with DynamoDB. Relational & NoSQL with EC2 and EBS.
  84. Provisioned throughput with RDS. 12.5k IOPS on MySQL. 25k IOPS

    on Oracle. 10k IOPS on SQL Server. Provision up to 30k for reduced latency.
  85. Provisioned throughput and instance types. Optimized for provisioned IO storage:

    m1.large: 500 Mbps m1.xlarge, m2.xlarge, m2.4xlarge: 1000 Mbps
  86. Provisioned throughput and DynamoDB. Consistent performance with unlimited throughput and storage. Single-digit millisecond latency.
  87. Build for a uniform workload. Evenly distribute query patterns by

    key. Use a wide range of key values.
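
Why a wide range of key values matters can be shown with a toy model. The hash below is a stand-in for the service's internal partitioning, which is not described in the talk, and the partition count of 8 is an arbitrary assumption.

```python
import hashlib
from collections import Counter


def partition_for(key, partitions):
    """Map a key to a partition with a stable hash (stand-in for the real scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def load_by_partition(keys, partitions):
    """Count how many requests land on each partition."""
    counts = Counter(partition_for(k, partitions) for k in keys)
    return [counts.get(p, 0) for p in range(partitions)]


if __name__ == "__main__":
    # Narrow key space: two distinct values, so at most two partitions do all the work.
    narrow = ["active"] * 900 + ["inactive"] * 100
    # Wide key space: a thousand distinct values spread across every partition.
    wide = [f"user-{i}" for i in range(1000)]
    print("narrow:", load_by_partition(narrow, 8))
    print("wide:  ", load_by_partition(wide, 8))
```

With the narrow key the provisioned throughput of the idle partitions is wasted while the hot one is the bottleneck.
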
  88. 7 6 1 2 Content delivery Horizontal scale 3 Instance

    selection 4 Caching 5 Read capacity
  89. 7 1 2 Content delivery Horizontal scale 3 Instance selection

    4 Caching 5 Read capacity 6
  90. Standard EBS volumes. Moderate or bursty workloads. 100 IOPS, bursting

    to hundreds of IOPS. Bursting is good for boot volumes.
  91. Provisioned IOPS with EBS volumes. Predictable, high performance for IO intensive workloads. 2000 IOPS per volume. Stripe volumes for additional IO. Delivers within 10% of the provisioned performance, 99.9% of the time.
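
The striping arithmetic from the slide, as a quick sketch. It assumes that striped volumes add up linearly and ignores any per-instance bandwidth limit, both simplifications; the function names are hypothetical.

```python
import math

IOPS_PER_VOLUME = 2000  # per-volume figure quoted on the slide


def volumes_needed(target_iops, per_volume=IOPS_PER_VOLUME):
    """Number of volumes to stripe to reach a target IOPS figure."""
    return math.ceil(target_iops / per_volume)


def expected_floor(volumes, per_volume=IOPS_PER_VOLUME, tolerance=0.10):
    """Lower bound implied by 'within 10% of the provisioned performance'."""
    return volumes * per_volume * (1 - tolerance)


if __name__ == "__main__":
    count = volumes_needed(10000)
    print(count, "volumes, at least", expected_floor(count), "IOPS expected")
```
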
  92. EBS-Optimized instances. m1.large, m2.2xlarge, m3.xlarge: 500 Mbps m1.xlarge, m2.4xlarge, m3.2xlarge,

    c1.xlarge: 1000 Mbps
  93. High bandwidth networking cg1.4xlarge, cc2.8xlarge, hi1.4xlarge, hs1.8xlarge and cr1.8xlarge run

    on non-blocking, 10 gigabit networking. Not EBS-Optimized, but can be used with provisioned IOPS volumes.
  94. High I/O instances. Designed for high-throughput database workloads. 2 x 1TB SSDs. 2 GB/s for reads. 1.1 GB/s for writes.
  95. Para-virtual instances. 4 KB random reads: 120k IOPS. 4 KB random writes: 10k - 80k IOPS.
  96. HVM instances (including Windows). 4 KB random reads: 90k IOPS. 4 KB random writes: 9k - 75k IOPS.
  97. High Storage instances High sequential IO. 24 x 2TB drives.

    2.4 GB/s of 2MiB sequential reads. 2.6 GB/s for sequential writes.
  98. 7 1 2 Content delivery Horizontal scale 3 Instance selection

    4 Caching 5 Read capacity 6 Block store
  99. Initial connection SSL negotiation Time to first byte Content download

    %CPU kbps Back end Front end
  100. http://amzn.to/170S1VV http://amzn.to/Zod4PY

  101. 14 rules for faster loading web sites. 1. Make fewer HTTP requests. 2. Use a content delivery network. 3. Add an expires header. 4. GZIP components. 5. Put style sheets at the top. 6. Put scripts at the bottom. 7. Avoid CSS expressions. 8. Make JavaScript and CSS external. 9. Reduce DNS lookups. 10. Minify JavaScript. 11. Avoid redirects. 12. Remove duplicate scripts. 13. Configure ETags. 14. Make AJAX cacheable. http://stevesouders.com/hpws/rules.php
  102. 7 1 3 4 6 5 Content delivery Instance selection

    Caching Read capacity Front end optimization 2 Horizontal scale Block store
  103. One more thing...

  104. US East

  105. US East Sydney Dublin Route 53

  106. Thank you! matthew@amazon.com @mza