Things to Know About Web Performance

An overview of performance related topics to make systems fast and scalable.

Michael Hamrah

October 12, 2013

Transcript

  1. PERFORMANCE! / Michael Hamrah @mhamrah Software Engineer @ Getty Images

  2. PERFORMANCE VS. SCALABILITY VS. RELIABILITY

  3. TALKING POINTS Networking and TCP; Web Servers; App Servers; Rails; Caching; Databases; Front End Code; IaaS (Amazon) vs. PaaS (Heroku); Example Architectures
  4. WHY IS THIS IMPORTANT? Conversion Rates; Attention Span, Frustration Levels; Trust and Web Performance Lessons; Super Bowl Website Failures
  5. IT'S NOT ALL ABOUT YOUR CODE! Time spent in your code? Overall percentage?
  6. NETWORKING! IT'S MOSTLY ALL ABOUT NETWORKING What happens when you type an address in your browser? The TCP/IP stack: Data Link, Network, Transport, Application. The OSI model adds Physical, Session, and Presentation layers, for seven in total.
  7. TCP: HOW DATA FLOWS FROM A TO B A node sends "packets". Packets are limited in size (~1500 bytes typical, 64k max). Only a certain number of packets are allowed "in flight". Delivery is guaranteed and in order (unlike UDP). Starts with a three-way handshake: SYN -> SYN-ACK -> ACK.
  8. HOW LONG DOES THIS TAKE? NYC to SF: ~2530 mi. RTT in vacuum: 27 ms; RTT in fiber: 41.04 ms. IP and routes add delay (check out ping and traceroute). Remember the three-way handshake? Latency. Bandwidth may not matter. Connections are scarce, precious resources! (This is why we have connection pooling.)
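The round-trip figures above can be reproduced with a short back-of-the-envelope calculation. This is only a sketch: the straight-line path and the fiber refractive index of 1.52 are assumptions chosen to land near the slide's numbers, not values stated in the talk.

```python
# Back-of-the-envelope propagation delay for the NYC-to-SF figures above.
MILES_TO_KM = 1.609344
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.52      # assumption: light travels ~1.5x slower in fiber


def rtt_ms(distance_miles: float, refractive_index: float = 1.0) -> float:
    """Round-trip time in milliseconds over a straight-line path."""
    distance_km = distance_miles * MILES_TO_KM
    one_way_s = distance_km * refractive_index / SPEED_OF_LIGHT_KM_S
    return 2 * one_way_s * 1000


rtt_vacuum = rtt_ms(2530)
rtt_fiber = rtt_ms(2530, FIBER_REFRACTIVE_INDEX)
print(f"vacuum: {rtt_vacuum:.1f} ms, fiber: {rtt_fiber:.1f} ms")
```

Real routes are longer than the straight line and pass through routers, so measured round trips are higher than either number.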
  9. HOW DOES HTTP FIT IN? Application-level protocol. Defines how requests and responses sit on top of TCP. HTTP/1.1: keep-alive, pipelining. Browsers open 6-8 connections per server. Think of 1 MB of data split 1, 10, 100 ways.
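One way to think through the "1 MB split 1, 10, 100 ways" question is a deliberately crude cost model. Everything here is an assumption for illustration: a 41 ms round trip, a 10 Mbit/s link, 6 parallel connections, and one round trip paid per wave of requests. It ignores handshakes, TCP slow start, and pipelining.

```python
import math


def transfer_time_ms(total_bytes, n_requests, rtt_ms=41.0,
                     bandwidth_bytes_per_s=1_250_000, max_connections=6):
    """Crude estimate: each wave of parallel requests pays one round trip,
    and all payload bytes share the same link."""
    waves = math.ceil(n_requests / max_connections)
    latency_ms = waves * rtt_ms
    payload_ms = total_bytes / bandwidth_bytes_per_s * 1000
    return latency_ms + payload_ms


ONE_MB = 1_000_000
estimates = {n: transfer_time_ms(ONE_MB, n) for n in (1, 10, 100)}
for n, ms in estimates.items():
    print(f"{n:>3} requests: ~{ms:.0f} ms")
```

Under these assumptions the payload time stays constant while the latency term grows with the number of requests, which is the point of the slide.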
  10. Most client-server optimization is about tuning for network latency and managing TCP connections. Networking isn't just about a browser and a server: intra- and inter-data-center connections are important! Optimize for throughput: keep data small, reduce latency, reduce connections.
  11. WEB SERVERS! Web servers like nginx and Apache handle HTTP requests. Application servers run programs which produce dynamic content: think Thin, Unicorn, Puma, Node, etc. Sometimes it's a blurry line; it depends on architecture.
  12. THREADED VS. EVENTED How connections are mapped to requests. How threads are mapped to requests. Memory usage and context switching. How does your OS help?
  13. ALL ABOUT ASYNC Non-blocking I/O is essential in all cases. EventMachine, Node, Twisted: event loops. Heroku's big problem with Rails.
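The event-loop idea can be sketched with Python's standard `asyncio`, used here as a stand-in for EventMachine, Node, or Twisted. The handler name and the 0.1 s delay are hypothetical; `asyncio.sleep` simulates a non-blocking database or network call.

```python
import asyncio
import time


async def handle_request(request_id: int, io_delay_s: float) -> str:
    # asyncio.sleep stands in for a non-blocking database or network call.
    await asyncio.sleep(io_delay_s)
    return f"response {request_id}"


async def serve_all(n_requests: int, io_delay_s: float) -> list:
    # One thread, one event loop: all requests wait on I/O concurrently.
    tasks = [handle_request(i, io_delay_s) for i in range(n_requests)]
    return await asyncio.gather(*tasks)


start = time.perf_counter()
responses = asyncio.run(serve_all(n_requests=10, io_delay_s=0.1))
elapsed = time.perf_counter() - start
print(f"{len(responses)} responses in {elapsed:.2f}s")
```

Ten requests that each wait 0.1 s finish in roughly the time of one wait, because the single thread is never blocked while I/O is pending. A blocking handler would serialize them instead.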
  14. A TYPICAL SCENARIO LOAD BALANCER -> WEB SERVER -> APP SERVER Load balancer distributes requests. Web servers (nginx/haproxy/varnish) handle connections and static content. Application servers or process pools respond to requests.
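As a minimal sketch of the "distributes requests" step, here is round-robin selection, one common strategy among several (the slide does not say which one is used). The class name and backend addresses are hypothetical.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends):
        self._backends = itertools.cycle(backends)

    def pick(self):
        return next(self._backends)


# Hypothetical app server addresses.
balancer = RoundRobinBalancer(["app-1:8000", "app-2:8000", "app-3:8000"])
assignments = [balancer.pick() for _ in range(9)]
counts = Counter(assignments)
print(counts)
```

Production load balancers also track backend health and in-flight connections, which this sketch leaves out.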
  15. SO WHAT DO WEB SERVERS ACTUALLY DO? Very good at handling HTTP connections. Parse HTTP requests. Add filters like gzip and authentication. Buffering, caching. Reverse-proxy support.
  16. DIFFERENCES WITH APPLICATION SERVERS (AND RUNTIMES!) Allow you to program dynamic content. Thin, Unicorn, Puma, Passenger. MRI, Rubinius, JRuby. Node vs. Rails vs. Django vs. whatever?
  17. RACK A common interface between Ruby web servers and web frameworks. Provides simple HTTP primitives. Supports middleware. Can mix Rails and Sinatra endpoints.
  18. DATABASES! They're beasts! SQL: ACID. NoSQL: BASE. Both have their own set of problems.
  19. TUNING DATABASES In both cases, it's about indexes. Learn to find slow queries. Avoid N+1 queries (go for eager loading). Optimize for read/write patterns.
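The N+1 pattern and its fix can be shown with Python's built-in `sqlite3`. The `authors`/`posts` schema and function names are invented for illustration, and a single JOIN stands in for what an ORM's eager loading would do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    CREATE INDEX idx_posts_author_id ON posts (author_id);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'tcp'), (2, 1, 'http'), (3, 2, 'cache');
""")


def titles_n_plus_one(conn):
    """One query for the authors, then one more query per author."""
    queries = 1
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_eager(conn):
    """A single JOIN fetches authors and their posts together."""
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1


slow_result, slow_queries = titles_n_plus_one(conn)
fast_result, fast_queries = titles_eager(conn)
print(slow_queries, fast_queries)
```

Both functions return the same data, but the first issues one query per author on top of the initial one, so its cost grows with the number of rows and with the round-trip latency to the database.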
  20. SCALING DATABASES How do you add more nodes? Again, understand read/write patterns. Understand latency.
  21. POLYGLOT DATA LANDSCAPE Postgres, MySQL; Cassandra, Riak, HBase; MongoDB; Redis; Memcache
  22. CACHING! HTML caching. Fragment caching. Object caching. How easy is it to expire your cache?
  23. HTML CACHING Uses HTTP headers: ETags, Cache-Control, Expires. Anonymous vs. authenticated users. How are you expiring cache?
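A minimal sketch of how an ETag-based conditional request works, independent of any framework. The `respond` function, the hash-based ETag, and the `max-age=60` value are illustrative assumptions; `public` caching as shown would only suit pages served to anonymous users.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption: derive the ETag from a hash of the response body.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict):
    """Return (status, headers, body), answering 304 when the client copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # client already has this version
    return 200, headers, body


page = b"<html>front page</html>"
status_first, headers_first, _ = respond(page, {})
revalidate = {"If-None-Match": headers_first["ETag"]}
status_repeat, _, body_repeat = respond(page, revalidate)
status_changed, _, _ = respond(b"<html>updated</html>", revalidate)
print(status_first, status_repeat, status_changed)
```

Because the ETag changes whenever the content changes, expiry is handled by the content itself rather than by a separate invalidation step.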
  24. FRAGMENT CACHING Still HTML! More surgical. ESI (Edge Side Includes) makes tools like Varnish quite powerful.
  25. OBJECT CACHING Most common. Most flexible. Easy DB integration. Memcache for reads. Redis for structured data. Compress (the CPU cycles are there). You're just denormalizing data.
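Object caching with compression might look like the following cache-aside sketch. A plain dict stands in for a Memcache client, and the function names and user record are hypothetical.

```python
import json
import zlib

cache = {}                # stand-in for a Memcache client
stats = {"db_reads": 0}   # counts how often the "database" is hit


def load_user_from_db(user_id):
    # Hypothetical slow database read.
    stats["db_reads"] += 1
    return {"id": user_id, "name": f"user-{user_id}", "bio": "x" * 500}


def get_user(user_id):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    blob = cache.get(key)
    if blob is None:
        user = load_user_from_db(user_id)
        # Spend CPU on compression to save memory and network bytes.
        cache[key] = zlib.compress(json.dumps(user).encode("utf-8"))
        return user
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def expire_user(user_id):
    """Call after a write so the next read refreshes the cached copy."""
    cache.pop(f"user:{user_id}", None)


first = get_user(7)
second = get_user(7)      # served from the cache
reads_before_expiry = stats["db_reads"]
expire_user(7)
third = get_user(7)       # goes back to the database
print(reads_before_expiry, stats["db_reads"])
```

The cached blob is a denormalized copy of the row, so every write path has to know which keys to expire.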
  26. QUEUES! THEY'LL BE YOUR BEST FRIEND. Very async. Great for handling spikes. Offload low-priority requests. Great for elastic workers. Resque, IronWorker, Amazon SQS.
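The offloading idea can be sketched with Python's standard `queue` and `threading` modules as an in-process stand-in for Resque, IronWorker, or SQS. The email job and the function names are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()
processed = []


def worker():
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        processed.append(f"sent email to {job}")  # placeholder for slow work
        jobs.task_done()


def enqueue(email):
    # The request handler returns right after this call.
    jobs.put(email)


thread = threading.Thread(target=worker, daemon=True)
thread.start()
for address in ["a@example.com", "b@example.com", "c@example.com"]:
    enqueue(address)
jobs.put(None)   # tell the worker to stop
jobs.join()      # wait until every queued item is marked done
thread.join(timeout=1)
print(processed)
```

With a real queue service the workers run in separate processes or machines, so their number can grow during a spike without touching the web tier.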
  27. CLIENT-SIDE TECHNIQUES Sprites; minification; domain sharding; Cache-Control; async JavaScript; image management (WebP); CDNs!
  28. PAAS VS. IAAS How do you want to run your apps? What control do you want or need? Other AWS services: ElastiCache, R53/ELB, ASG, Search, CloudFormation, OpsWorks, Multi-AZ, Multi-Region. Cloud services: most tech is offered as SaaS (New Relic, Loggly).
  29. GOOGLE'S SPDY One persistent SSL connection. True duplex, no waiting. Better window scaling. Compressed headers (cookies, etc.). Content prioritization. Server push.
  30. EXAMPLES How would you design a news site? How would you design an e-commerce site? How would you design a social site?
  31. PROBLEM The server request/response time is very long.

  32. PROBLEM The server request/response time is very fast, but as load increases, times increase dramatically.
  33. PROBLEM A page takes a long time to load, but the server time is very fast.
  34. PROBLEM We are sending images to a bunch of customers. Sending images is very slow, but nothing is maxed out. The problem gets worse when we try to send more images.