Scaling ASP.NET Core Applications

Hey, my app doesn't scale! ____ Framework sucks! Well, you can write a slow app in any language. This talk shows you why your app isn't scaling and gives you the DOs and DON'Ts of making big apps do big things in ASP.NET Core.

David Fowler

January 30, 2019

Transcript

  1. Scaling ASP.NET Core Applications David Fowler @davidfowl Damian Edwards @damianedwards

  2. Disclaimer
     • We don’t build real applications
     • We see *A LOT* of broken applications
     • We help customers solve their scalability issues
  3. What do we mean by “scale”?
     • Scale is a measure of users/requests/connections per scale unit (machine, container, etc.)
     • “If you do nothing, you can scale it infinitely” – Scott Hanselman
  4. Types of scaling
     • Horizontal scale (scaling out)
     • Adding more units of scale (machines/VMs/containers, etc.)
     • Vertical scale (scaling up)
     • Adding more capable resources to an existing scale unit (CPU, memory, bandwidth)
  5. Why doesn’t my application scale?
     • “Work” that doesn’t clean up after itself
     • Creating work faster than work is being executed
  6. What affects scale
     • CPU
     • Hot paths in your application
     • Contended locks
     • Memory
     • Memory leaks (work isn’t cleaning up properly when complete)
     • Inefficient memory usage (using more memory than expected for the work)
     • IO
     • Ephemeral port exhaustion
     • Running out of disk/storage space
     • Blocking
     • Bandwidth & latency
  7. What affects scale (CLR)
     • GC
     • Too many GC pauses
     • ThreadPool
     • Thread pool starvation
     • Timers
     • Too many timers
     • Exceptions
     • Locks
     • Highly contended locks
     • Synchronous IO
  8. Async Programming
     • Doing async right can increase scalability
     • Doing async wrong can severely decrease scalability
     • .NET has lots of async traps
     • The number one rule is DON’T BLOCK
  9. Load testing
     • Scale issues usually show up when it’s too late
     • It’s important to figure out how much load your application can handle
     • For a fixed RPS, monitor CPU usage and memory usage
     • Understand how much each scale unit in your deployment can handle (e.g. each VM can handle 1000 RPS)
  10. Load testing: Run load tests → Find bottleneck → Fix issues

  11. Scalability Checklist: CPU
     • Machine resources
     • CPU usage
     • CLR resources
     • ThreadPool (work-items and worker threads)
     • GC (Gen0, Gen1 and Gen2) collections
     • Locks
     • Application logic
     • Serialization
     • Chatty IO
  12. Scalability Checklist: Memory
     • Machine resources
     • Memory usage
     • Number of threads
     • CLR resources
     • Timers
     • GC (heap sizes for Gen0, Gen1 and Gen2)
     • Application logic
     • Strings
     • Reading everything into memory instead of streaming data
     • Disk IO
     • Network IO
     • Disposable objects not being disposed
     • AsyncLocal leaks
  13. Scalability Checklist: IO
     • Machine resources
     • Number of open files/handles/sockets (check ulimit)
     • CLR resources
     • IO threads
     • Application logic
     • HttpClient
     • DbConnection/SqlConnection
     • FileStream
     • Inefficient buffering (lots of small reads, writes, or packets)
  14. Sync over Async

  15. ThreadPool
     • Sync over async
     • APIs that masquerade as synchronous but are actually blocking on async methods
     • Uses 2 threads to complete a single operation
     • Blocking APIs are BAD
     • Avoid blocking APIs where possible, e.g. Task.Wait, Task.Result, Thread.Sleep, GetAwaiter().GetResult()
     • Excessive blocking on thread pool threads can cause starvation
     • Thread injection rate beyond the configured minimum is slow (2 per second)
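The contrast between blocking and non-blocking calls on slide 15 can be sketched roughly as follows. This is an illustration, not code from the talk: `ProfileLoader`, `FetchProfileAsync`, and the 50 ms delay are hypothetical stand-ins for a real asynchronous I/O call.

```csharp
using System;
using System.Threading.Tasks;

public static class ProfileLoader
{
    // Hypothetical stand-in for real asynchronous I/O (database, HTTP, file).
    public static async Task<string> FetchProfileAsync(int userId)
    {
        await Task.Delay(50); // simulates I/O latency without holding a thread
        return $"profile-{userId}";
    }

    // DON'T: sync over async. The calling thread is parked on .Result
    // while another thread is needed to run the continuation.
    public static string GetProfileBlocking(int userId)
    {
        return FetchProfileAsync(userId).Result;
    }

    // DO: async all the way. The calling thread returns to the pool
    // while the I/O is pending.
    public static async Task<string> GetProfileAsync(int userId)
    {
        return await FetchProfileAsync(userId);
    }
}
```

Both methods return the same value; the difference only shows under load, when many blocked callers can exhaust the thread pool.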
  16. Sync over Async

  17. Demo: Sync over async

  18. Cache Lookup

  19. Highly contended locks
     • Web applications are highly concurrent
     • Highly contended locks can be a death knell for scalable services
     • Lock contention is sometimes hard to see in basic profilers
     • Visual Studio Concurrency Visualizer
     • dotTrace timeline view
     • Prefer concurrent data structures
     • Understand which operations take locks and which operations are lock free
     • Know which BCL APIs take locks on your behalf
     • String.Intern
     • System.Drawing (GDI)
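As a sketch of "prefer concurrent data structures", the two caches below do the same job. The class names are hypothetical and this is not the demo code from the talk; it only illustrates the difference between one global lock and `ConcurrentDictionary`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Every lookup, hit or miss, queues behind a single lock.
public sealed class LockedCache
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, string> _items = new Dictionary<string, string>();

    public string GetOrAdd(string key, Func<string, string> factory)
    {
        lock (_gate)
        {
            if (!_items.TryGetValue(key, out var value))
            {
                value = factory(key);
                _items[key] = value;
            }
            return value;
        }
    }
}

// Lookups of existing keys are lock free; only writes take a fine-grained lock.
// Note: the factory runs outside any lock and may be invoked more than once
// for the same key under a race, but only one result is stored.
public sealed class ConcurrentCache
{
    private readonly ConcurrentDictionary<string, string> _items =
        new ConcurrentDictionary<string, string>();

    public string GetOrAdd(string key, Func<string, string> factory)
    {
        return _items.GetOrAdd(key, factory);
    }
}
```

If the factory is expensive or has side effects, the possibility of duplicate invocation needs to be handled separately.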
  20. Demo: Cache Lookup

  21. Parsing a JSON payload

  22. GC issues
     • Allocating memory is very cheap, collecting it isn’t
     • Allocating lots of memory can lead to GC pauses
     • Objects over 85KB in size are allocated on the LOH (large object heap)
     • The LOH is collected with Gen2 but not compacted (by default)
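One common way to keep request bodies off the LOH is to stream them through a small pooled buffer instead of reading everything into one large array. The sketch below assumes that approach; `PayloadReader` and the 16 KB chunk size are illustrative choices, not taken from the talk's demo.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading.Tasks;

public static class PayloadReader
{
    // Processes a stream in small chunks using a buffer rented from the shared
    // pool, so no per-request array is allocated and nothing reaches the LOH.
    // Here the "processing" is just counting bytes.
    public static async Task<long> CountBytesAsync(Stream input)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(16 * 1024);
        try
        {
            long total = 0;
            int read;
            // Rent may return a larger array than requested, so use buffer.Length.
            while ((read = await input.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                total += read;
            }
            return total;
        }
        finally
        {
            // Always return the buffer so later requests can reuse it.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```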
  23. Demo: Parsing a JSON payload

  24. Timeouts

  25. TimerQueue
     • There’s a TimerQueue per CPU core
     • Timers within a TimerQueue form a linked list
     • Timers are optimized for adding and removing
     • Timer callbacks are scheduled to the thread pool
     • Each TimerQueue is protected by a lock
     • Disposing the timer removes it from the queue
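Because disposing a timer removes it from its queue, timeout helpers should dispose whatever owns the timer as soon as the operation completes. A minimal sketch, assuming timeouts are implemented with `CancellationTokenSource` (which creates a timer internally when given a delay); the `Timeouts` helper is hypothetical:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Timeouts
{
    // Runs an operation with a timeout. The using block disposes the
    // CancellationTokenSource as soon as the operation finishes, which also
    // disposes its internal timer instead of leaving it queued until it fires.
    public static async Task<T> WithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> operation, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            return await operation(cts.Token);
        }
    }
}
```

Without the `using`, each call would leave a timer in the queue until the timeout elapses, which adds up quickly at high request rates with long timeouts.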
  26. TimerQueue [Diagram: several TimerQueue instances, each holding a linked list of TimerQueueTimer entries]
  27. Demo: Timer leak

  28. Demo: Timeout (fixed)

  29. .NET Async traps

  30. .NET Async traps

  31. .NET Async traps: ConcurrentDictionary

  32. .NET Async traps

  33. .NET Async traps

  34. .NET Async traps

  35. .NET Async traps

  36. .NET Async traps

  37. Diagnostics: Performance Traces
     • Types of issues
     • High CPU
     • Tools
     • Visual Studio
     • dotTrace
     • PerfView
     • dotnet-collect
  38. Diagnostics: Post Mortem Debugging
     • Types of issues
     • Crashes
     • Hangs (sync and async)
     • Memory leaks
     • Locks
     • Tools
     • Visual Studio
     • WinDbg
     • lldb
     • dotnet-analyze
     • dotMemory
  39. Future Enhancements
     • Improved documentation on how to scale web services
     • Improvements to the thread pool to better handle blocking workers
     • Analyzers to catch common mistakes with asynchronous programming
     • Tools to help better diagnose common issues
     • Async hangs
     • Thread pool starvation
     • More counters in .NET Core
     • Reduce the number of .NET async traps
     • IAsyncDisposable
     • FileStream (sync over async)
  40. In summary… coding is hard