Slide 1

Slide 1 text

PERFORMANCE! / Michael Hamrah @mhamrah Software Engineer @ Getty Images

Slide 2

Slide 2 text

PERFORMANCE VS. SCALABILITY VS. RELIABILITY

Slide 3

Slide 3 text

TALKING POINTS
Networking and TCP
Web Servers
App Servers
Rails
Caching
Databases
Front End Code
IaaS (Amazon) vs. PaaS (Heroku)
Example Architectures

Slide 4

Slide 4 text

WHY IS THIS IMPORTANT?
Conversion Rates
Attention Span, Frustration Levels
Trust and Web Performance Lessons
Super Bowl Website Failures

Slide 5

Slide 5 text

IT'S NOT ALL ABOUT YOUR CODE!
How much time is spent in your code?
What percentage of the overall request time is that?

Slide 6

Slide 6 text

NETWORKING! IT'S MOSTLY ALL ABOUT NETWORKING
What happens when you type an address in your browser?
The TCP/IP stack has four layers: Link, Internet (Network), Transport, Application.
The OSI model covers the same ground in seven layers, adding Physical, Session, and Presentation.

Slide 7

Slide 7 text

TCP: HOW DATA FLOWS FROM A TO B
A node sends "packets"
Packets are limited in size (~1500 bytes is typical on Ethernet, 64 KB is the IP maximum)
Only a certain number of packets are allowed "in flight" at once
TCP guarantees reliable, in-order delivery (unlike UDP)
Every connection starts with a three-way handshake: SYN -> SYN-ACK -> ACK
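
The points above can be observed on a single machine. The sketch below is an illustration only: it opens a loopback connection, times the `connect()` call (where the three-way handshake happens), and checks that the bytes come back in the order they were sent. The function names `echo_once` and `tcp_round_trip` are made up for this example, and loopback timings say nothing about real network latency.

```python
import socket
import threading
import time
from typing import Tuple


def echo_once(server: socket.socket) -> None:
    """Accept one connection and echo everything back until the peer closes."""
    conn, _addr = server.accept()
    with conn:
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            conn.sendall(chunk)


def tcp_round_trip(payload: bytes) -> Tuple[float, bytes]:
    """Return (seconds spent in connect(), bytes echoed back)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    start = time.perf_counter()
    client.connect(("127.0.0.1", port))  # SYN -> SYN-ACK -> ACK happens here
    connect_seconds = time.perf_counter() - start

    client.sendall(payload)
    client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
    received = bytearray()
    while True:
        chunk = client.recv(4096)
        if not chunk:
            break
        received.extend(chunk)

    client.close()
    worker.join(timeout=5)
    server.close()
    return connect_seconds, bytes(received)
```

Keeping the payload small (a few kilobytes) avoids filling the socket buffers, since this client only starts reading after it has finished writing.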

Slide 8

Slide 8 text

HOW LONG DOES THIS TAKE?
NYC to SF: ~2530 mi, RTT in vacuum: 27 ms, RTT in fiber: 41.04 ms
IP and routes add delay. (Check out ping and traceroute)
Remember the three-way handshake? It costs a full round trip before any data flows.
Latency dominates. Bandwidth may not matter.
Connections are scarce, precious resources! (Why we have connection pooling)
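
The round-trip figures can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch: the refractive index of 1.5 for fiber is an assumption (the slide does not say which value it used), so the fiber result lands near, not exactly on, the slide's number.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
KM_PER_MILE = 1.609344


def round_trip_ms(distance_miles: float, refractive_index: float = 1.0) -> float:
    """Best-case round-trip time in milliseconds over a straight-line path."""
    distance_km = distance_miles * KM_PER_MILE
    speed_km_s = SPEED_OF_LIGHT_KM_S / refractive_index  # light is slower in glass
    return 2 * distance_km / speed_km_s * 1000


vacuum_ms = round_trip_ms(2530)                        # a little over 27 ms
fiber_ms = round_trip_ms(2530, refractive_index=1.5)   # roughly 41 ms (index assumed)
```

Real routes are longer than the straight-line distance and add router and queuing delay, so measured RTTs will be higher.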

Slide 9

Slide 9 text

HOW DOES HTTP FIT IN?
Application-level protocol
Defines how requests and responses sit on top of TCP
HTTP/1.1: keep-alive, pipelining
Browsers open 6-8 connections per server
Think of 1 MB of data split 1, 10, 100 ways
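
To make the "1 MB split 1, 10, 100 ways" thought experiment concrete, here is a deliberately crude model. Everything in it is an assumption for illustration: each request costs one round trip, requests run in waves across a fixed number of parallel connections, each connection pays one handshake round trip, and the link bandwidth is shared. It ignores TCP slow start, pipelining, and server time.

```python
import math


def transfer_time_ms(
    total_bytes: int,
    pieces: int,
    rtt_ms: float,
    bandwidth_bytes_per_s: float,
    parallel_connections: int = 6,
) -> float:
    """Toy estimate of the time to fetch total_bytes split into `pieces` requests."""
    waves = math.ceil(pieces / parallel_connections)  # sequential batches of requests
    handshake_ms = rtt_ms                 # connections open in parallel, once each
    request_ms = waves * rtt_ms           # one request/response round trip per wave
    payload_ms = total_bytes / bandwidth_bytes_per_s * 1000
    return handshake_ms + request_ms + payload_ms


ONE_MB = 1_000_000
timings = {
    pieces: transfer_time_ms(ONE_MB, pieces, rtt_ms=40, bandwidth_bytes_per_s=1_250_000)
    for pieces in (1, 10, 100)
}
```

In this model the payload time stays constant while the round-trip cost grows with the number of pieces, which is the point of the slide: more requests means more latency, not more bandwidth.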

Slide 10

Slide 10 text

Most client-server optimization is about tuning for network latency and managing TCP connections. Networking isn't just about a browser and a server. Intra- and inter- data center connections are important! Optimize for throughput: keep data small, reduce latency, reduce connections.

Slide 11

Slide 11 text

WEB SERVERS!
Web servers like nginx and Apache handle HTTP requests.
Application servers run programs that produce dynamic content. Think Thin, Unicorn, Puma, Node, etc.
Sometimes it's a blurry line; it depends on the architecture.

Slide 12

Slide 12 text

THREADED VS. EVENTED
How connections are mapped to requests
How threads are mapped to requests
Memory usage and context switching
How does your OS help?

Slide 13

Slide 13 text

ALL ABOUT ASYNC
Non-blocking I/O is essential in all cases
EventMachine, Node, Twisted: event loops
Heroku's big problem with Rails
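
A small event-loop sketch shows why non-blocking I/O matters. It uses Python's `asyncio` rather than any of the libraries named on the slide, and `asyncio.sleep` stands in for a network call; the function names are made up for this example. One thread handles 100 "requests" that each wait 50 ms, and the total time stays close to 50 ms instead of 5 seconds.

```python
import asyncio
import time


async def fake_io(request_id: int, delay_s: float) -> int:
    """Pretend to wait on the network without blocking the event loop."""
    await asyncio.sleep(delay_s)
    return request_id


async def handle_many(count: int, delay_s: float) -> list:
    """Start all waits at once; the loop resumes each one when it is ready."""
    tasks = [fake_io(i, delay_s) for i in range(count)]
    return await asyncio.gather(*tasks)


def timed_run(count: int = 100, delay_s: float = 0.05):
    """Return (results, elapsed seconds) for `count` concurrent waits."""
    start = time.perf_counter()
    results = asyncio.run(handle_many(count, delay_s))
    return results, time.perf_counter() - start
```

Replacing `asyncio.sleep` with a blocking `time.sleep` would serialize the waits, which is the failure mode a single-threaded, blocking app server runs into under load.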

Slide 14

Slide 14 text

A TYPICAL SCENARIO
LOAD BALANCER -> WEB SERVER -> APP SERVER
The load balancer distributes requests
Web servers (nginx/haproxy/varnish) handle connections and static content
Application servers or process pools respond to requests
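
As an illustration of "distributes requests", the simplest policy is round robin. This is a sketch of that one policy only; real load balancers also track health, weights, and open connections. The class name and backend names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in a fixed rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = itertools.cycle(backends)

    def pick(self) -> str:
        """Return the backend that should receive the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])  # hypothetical hosts
first_five = [balancer.pick() for _ in range(5)]
```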

Slide 15

Slide 15 text

SO WHAT DO WEB SERVERS ACTUALLY DO?
Very good at handling HTTP connections
Parse HTTP requests
Add filters like gzip, authentication
Buffering, caching
Reverse-proxy support

Slide 16

Slide 16 text

DIFFERENCES WITH APPLICATION SERVERS (AND RUNTIMES!)
Allow you to program dynamic content
Thin, Unicorn, Puma, Passenger
MRI, Rubinius, JRuby
Node vs. Rails vs. Django vs. Whatever?

Slide 17

Slide 17 text

RACK
A common interface between Ruby web servers and web frameworks.
Provides simple HTTP primitives
Supports middleware
Can mix Rails and Sinatra endpoints
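
Rack itself is Ruby: an app is an object whose `call(env)` returns `[status, headers, body]`. The sketch below imitates that shape in Python purely to show how the three ideas on the slide fit together (a simple primitive, middleware that wraps an app, and routing between endpoints). It is not Rack, and every function name here is made up for illustration.

```python
def hello_app(env):
    """Endpoint: takes a request environment, returns (status, headers, body)."""
    return 200, {"Content-Type": "text/plain"}, ["Hello from " + env["PATH_INFO"]]


def api_app(env):
    """A second endpoint, standing in for an app built with another framework."""
    return 200, {"Content-Type": "application/json"}, ['{"ok": true}']


def add_header(app, name, value):
    """Middleware: wrap any app and add one header to its response."""
    def wrapped(env):
        status, headers, body = app(env)
        return status, {**headers, name: value}, body
    return wrapped


def route(table, default):
    """Dispatch on path prefix so differently built endpoints can coexist."""
    def router(env):
        for prefix, app in table.items():
            if env["PATH_INFO"].startswith(prefix):
                return app(env)
        return default(env)
    return router


stack = add_header(route({"/api": api_app}, hello_app), "X-Wrapped", "yes")
```

Because middleware and endpoints share one calling convention, they compose freely, which is what lets a Rails app and a Sinatra app sit behind the same stack.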

Slide 18

Slide 18 text

DATABASES!
They're beasts!
SQL: ACID
NoSQL: BASE
Both have their own set of problems.

Slide 19

Slide 19 text

TUNING DATABASES
In both cases, it's about indexes.
Learn to find slow queries.
Avoid N+1 queries (go for eager loading).
Optimize for read/write patterns.
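
The N+1 problem is easiest to see by counting queries. The sketch below uses an in-memory SQLite database with a made-up `authors`/`posts` schema: the first function issues one query for the authors plus one per author, while the second loads the same data with a single join, which is roughly what eager loading in an ORM does for you.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create a tiny example schema (hypothetical tables) with an index."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        CREATE INDEX idx_posts_author ON posts(author_id);
        """
    )
    db.executemany("INSERT INTO authors VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")])
    db.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [(1, 1, "A1"), (2, 1, "A2"), (3, 2, "B1"), (4, 3, "C1")],
    )
    return db


def titles_n_plus_one(db):
    """One query for the authors, then one more query per author."""
    queries = 1
    result = {}
    for author_id, name in db.execute("SELECT id, name FROM authors ORDER BY id").fetchall():
        rows = db.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_eager(db):
    """Same data in a single query using a join."""
    rows = db.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1
```

With three authors the difference is 4 queries versus 1; with a page of 50 records it is 51 versus 1, each paying its own round trip to the database.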

Slide 20

Slide 20 text

SCALING DATABASES
How do you add more nodes?
Again, understand read/write patterns.
Understand latency.

Slide 21

Slide 21 text

POLYGLOT DATA LANDSCAPE
Postgres, MySQL
Cassandra, Riak, HBase
MongoDB
Redis
Memcached

Slide 22

Slide 22 text

CACHING!
HTML Caching
Fragment Caching
Object Caching
How easy is it to expire your cache?

Slide 23

Slide 23 text

HTML CACHING
Uses HTTP headers: ETag, Cache-Control, Expires
Anonymous vs. authenticated users
How are you expiring the cache?
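
A minimal sketch of how these headers interact, assuming a server that derives the ETag from a hash of the response body (one common choice, not the only one). The function names and the `authenticated` flag are made up for this example; the header names and the `304 Not Modified` status are standard HTTP.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a validator from the content; any change yields a new ETag."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict, authenticated: bool = False, max_age: int = 60):
    """Return (status, headers, body), answering 304 when the client copy is current."""
    etag = make_etag(body)
    # Shared caches may store public pages; per-user pages are marked private.
    scope = "private" if authenticated else "public"
    headers = {"ETag": etag, "Cache-Control": f"{scope}, max-age={max_age}"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # nothing to resend
    return 200, headers, body
```

Expiry here is implicit: once the page content changes, the ETag changes and the next conditional request gets a full `200` response again.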

Slide 24

Slide 24 text

FRAGMENT CACHING
Still HTML!
More surgical
ESI (Edge Side Includes) makes tools like Varnish quite powerful

Slide 25

Slide 25 text

OBJECT CACHING
Most common
Most flexible
Easy DB integration
Memcached for reads
Redis for structured data
Compress (the CPU cycles are available)
You're just denormalizing data
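
The usual pattern here is cache-aside: look in the cache, fall back to the database on a miss, then store the result. The sketch below uses a plain dictionary as a stand-in for Memcached or Redis and compresses values with `zlib`; the class, the key format, and the `load_from_db` callback are all hypothetical.

```python
import json
import zlib


class ObjectCache:
    """In-memory stand-in for a cache server; values are stored compressed."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        blob = self._store.get(key)
        return None if blob is None else json.loads(zlib.decompress(blob))

    def set(self, key, value):
        self._store[key] = zlib.compress(json.dumps(value).encode("utf-8"))

    def delete(self, key):
        """Expire an entry, for example after the underlying row changes."""
        self._store.pop(key, None)


def fetch_user(cache, user_id, load_from_db):
    """Cache-aside read: hit the database only on a cache miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)
    cache.set(key, value)
    return value
```

The `delete` call is where "you're just denormalizing data" bites: every write path that changes a user has to remember to expire or refresh the cached copy.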

Slide 26

Slide 26 text

QUEUES! THEY'LL BE YOUR BEST FRIEND.
Very async
Great for handling spikes
Offload low-priority requests
Great for elastic workers
Resque, IronWorker, Amazon SQS
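
The shape of a queue-backed worker pool can be sketched with the standard library alone. This is an in-process illustration, not how Resque or SQS are used: the web request would only enqueue the job, and a pool of workers (which can grow or shrink) drains the queue at its own pace. The function name and the `None` shutdown sentinel are choices made for this example.

```python
import queue
import threading


def run_workers(jobs, handler, worker_count=4):
    """Process jobs on a pool of worker threads and collect the results."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = work.get()
            try:
                if job is None:  # sentinel: no more work for this worker
                    return
                outcome = handler(job)
                with lock:
                    results.append(outcome)
            finally:
                work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:
        work.put(job)  # producers return immediately; spikes pile up in the queue
    for _ in threads:
        work.put(None)
    work.join()
    for thread in threads:
        thread.join()
    return results
```

Results arrive in completion order, not submission order, so callers that care about ordering need to carry an identifier with each job.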

Slide 27

Slide 27 text

CLIENT-SIDE TECHNIQUES
Sprites
Minification
Domain sharding
Cache-Control
Async JavaScript
Image management (WebP)
CDNs!

Slide 28

Slide 28 text

PAAS VS. IAAS
How do you want to run your apps?
What control do you want or need?
Other AWS services: ElastiCache, Route 53/ELB, ASG, Search, CloudFormation, OpsWorks, Multi-AZ, Multi-Region
Cloud services: most tech is offered as SaaS (New Relic, Loggly)

Slide 29

Slide 29 text

GOOGLE'S SPDY
One persistent SSL connection
True duplex, no waiting
Better window scaling
Compressed headers (cookies, etc.)
Content prioritization
Server push

Slide 30

Slide 30 text

EXAMPLES
How would you design a news site?
How would you design an e-commerce site?
How would you design a social site?

Slide 31

Slide 31 text

PROBLEM The server request/response time is very long.

Slide 32

Slide 32 text

PROBLEM The server request/response time is very fast, but as load increases, response times increase dramatically.

Slide 33

Slide 33 text

PROBLEM A page takes a long time to load, but the server time is very fast.

Slide 34

Slide 34 text

PROBLEM We are sending images to a bunch of customers. Sending images is very slow, but nothing is maxed out. The problem gets worse when we try to send more images.