Pete's Performance Paradise

We'll put on our X-ray specs to see why our fast app is running so slowly: how to identify system performance problems, test for success and make your boss happy. Bring along your sad-path apps and we'll try to find the performance bottlenecks.

Come prepared to try out some of Pete's suggestions.

Pete Campbell

March 24, 2012

Transcript

  1. Why Is My App Slow?
    Tools & Techniques To Identify System
    Performance Issues
    Pete Campbell
    [email protected]
    @sumirolabs
    github.com/campbell

  2. Why Is My App Slow?
    • You have a slow app!
    • Network congestion
    • Database load
    • External services*
    • Garbage collection*
    • Other users & processes

  3. Where is the problem?
    • Can't fix it (easily) if we can't find it
    • Need some system knowledge (e.g.
    architecture)
    • Need some system expertise (e.g. Linux
    sysadmin)
    • Lucky you! I have some tools for you...

  4. Typical Architectures
    (I’m assuming we’re using Apache, Passenger, MySQL)
    [Diagram: the Scary Internet Monsters sit outside a Firewall in two setups: a Web Server with a separate DB, and a combined Web Server & DB]

  5. Where to Start?
    • It's easy
    • It's great
    • It's free!*
    • Tell me more!
    NewRelic

  6. NewRelic = Awesomeness!
    Outage!
    Start here

  7. More $ = More Love

  8. More $$ = Love^2

  9. You Had Me At "Trace"...

  10. ...and "Web Transactions"

  11. But Why Was It Slow?
    • NewRelic can tell you where time was spent...
    • NewRelic can tell you how it was spent...
    • NewRelic can't tell you why it was spent

  12. To The Bat-Cave!
    Need to investigate at the system &
    architecture level.
    Is the problem...
    • cpu maxed out?
    • network bandwidth?
    • server needs more memory?
    • external (web services, file system)?
    • all of the above?

  13. Tools In Our Bat-Utility Belt
    • CPU - top, htop, mpstat*
    • Disk - iostat, nfsiostat*
    • Network - iftop, ntop*, etherape*
    • Memory - vmstat, free
    • Passenger - passenger-status,
    passenger-memory-stats
    • "Resources" - lsof, sar*
    • MySQL slow query log

  14. CPU - Old-School
    TOP - List of all running processes
    Stats per core?

  15. CPU - New Generation
    HTOP - TOP with more info, abilities
    Stats per core!
    (Removed so this can be placed on the internet)

  16. CPU + Disk
    IOSTAT - cpu & device utilization
    Test Started

  17. Home Movies!
    TOP & HTOP

  18. Network - Look Ma, Bandwidth!
    IFTOP - TOP for interfaces
    2s, 10s, 40s averages
    (Removed so this can be placed on the internet)

  19. Home Movies!
    IFTOP

  20. Memories...
    FREE - free & used memory
    VMSTAT - processes, memory, paging, cpu
    Test Started
    Watch for swapping
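
    "Watch for swapping" can be automated once vmstat output has been captured to text. The sketch below is only an illustration, not something from the talk: it assumes the usual Linux vmstat column header (with `si` and `so` for swap-in and swap-out), and the `SAMPLE` numbers and the function names are made up for the example.

```python
# Illustrative vmstat capture; the numbers are invented for this example.
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  40212 300124    0    0     5    10  100  200  5  1 94  0  0
 3  1  20480  10240   1024  51200   12  340   900  1200  450  800 20 10 40 30  0
"""


def swapping_samples(vmstat_output):
    """Return one (si, so) pair per data row found in vmstat text output."""
    rows = [line.split() for line in vmstat_output.strip().splitlines()]
    # The column header is the row that names both swap columns.
    header = next(cols for cols in rows if "si" in cols and "so" in cols)
    si_idx, so_idx = header.index("si"), header.index("so")
    samples = []
    for cols in rows:
        # Data rows have one all-numeric value per header column.
        if len(cols) == len(header) and all(c.isdigit() for c in cols):
            samples.append((int(cols[si_idx]), int(cols[so_idx])))
    return samples


def is_swapping(vmstat_output):
    """True if any sample shows pages moving to or from swap."""
    return any(si > 0 or so > 0 for si, so in swapping_samples(vmstat_output))


print(swapping_samples(SAMPLE))
print("swapping!" if is_swapping(SAMPLE) else "no swap activity")
```

    Note that the first row vmstat prints reports averages since boot, so for a test run it is usually the later rows that matter.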

  21. Passenger Status
    Passenger-status - threads, status, memory
    Watch if this grows

  22. Home Movies!
    Passenger-Status

  23. Passenger Memory Stats
    Passenger-memory-stats - check for memory leaks
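
    One rough way to read repeated memory readings for a leak is to ask whether a process only ever grows. The helper below is a hypothetical heuristic, not part of Passenger or of the talk: it assumes you have already noted a process's memory (in MB) at several points during a test, and the 50 MB default is an arbitrary choice.

```python
def steady_growth(samples_mb, min_growth_mb=50.0):
    """True if memory never drops between samples and grows by at least
    min_growth_mb from the first sample to the last."""
    if len(samples_mb) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(samples_mb, samples_mb[1:]))
    return never_drops and (samples_mb[-1] - samples_mb[0]) >= min_growth_mb


# Made-up readings taken a few minutes apart during a load test.
print(steady_growth([100, 120, 150, 180]))  # grows at every step
print(steady_growth([100, 140, 110, 150]))  # memory was released in between
```

    A real leak can be noisier than this (garbage collection may briefly free memory), so treat a True result as a prompt to look closer rather than as proof.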

  24. What's In Your Process?
    LSOF - list of open "files" (actually resources)
    - see what the process is using
    lsof -p
    (Removed so this can be placed on the internet)

  25. MySQL
    • Use system tools to look at CPU & memory usage
    • Turn on the slow-query log
    • Set slow_query_log & slow_query_log_file in my.cnf
    • Specify long_query_time, the minimum time a query must take to be logged (default = 10s)
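
    As a sketch of those settings, a my.cnf fragment might look like the following. The log file path and the 2-second threshold are assumptions chosen for illustration; pick values that suit your server.

```ini
[mysqld]
# Turn the slow-query log on.
slow_query_log = 1
# Where to write it (assumed path; any location writable by mysqld works).
slow_query_log_file = /var/log/mysql/mysql-slow.log
# Log queries that take longer than this many seconds (the default is 10).
long_query_time = 2
```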

  26. Testing
    • Now that you have the tools, you need to use
    them correctly
    • Goal is to methodically determine how your
    system behaves and how you can change this
    behavior (for good or evil)
    • So how do we successfully test our system?

  27. Testing Is Complicated
    Run A Test

  28. Testing Isn't Complicated
    Run A Test
    Change One Thing

  29. Testing Isn't Complicated
    • Be methodical & precise
    • Change one thing at a time to verify cause &
    effect
    • Make sure you can reproduce previous results
    (otherwise something else has changed!)
    • Add focused methods to your app to isolate &
    test just one thing (e.g. rendering, db...)
    • Bracket performance - min / max effect
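
    One possible reading of "focused methods", "bracket performance" and "reproduce previous results" is a small timing harness like the sketch below. It is an assumption about how you might apply the advice, not the speaker's code: `bracket`, `reproduces`, `render_page` and the 25% tolerance are all hypothetical.

```python
import statistics
import time


def bracket(fn, runs=5):
    """Time fn() several times and return (min, median, max) in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings), statistics.median(timings), max(timings)


def reproduces(baseline, rerun, tolerance=0.25):
    """True if the rerun's median is within `tolerance` (as a fraction)
    of the baseline's median, i.e. nothing else seems to have changed."""
    base_median, rerun_median = baseline[1], rerun[1]
    return abs(rerun_median - base_median) <= tolerance * base_median


def render_page():
    """Hypothetical focused method that exercises just one thing."""
    return sum(i * i for i in range(50_000))


baseline = bracket(render_page)
rerun = bracket(render_page)
print("baseline min/median/max:", baseline)
print("reproducible:", reproduces(baseline, rerun))
```

    Run it once before a change and once after: if an unchanged system no longer reproduces its own baseline, something other than your one change has moved.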

  30. "The site seems slow."
    Before:
    Boss: "The site seems slow today."
    You: "We haven't made any changes.
    Maybe it's your connection?"
    Boss: "Google is fast, the site is slow.
    Fix it!"
    You: #$@#())!~

  31. "The site seems slow."
    After:
    Boss: "The site seems slow today."
    You: "We haven't made any changes.
    The NewRelic APDEX score hasn't
    changed either."
    Boss: "Ah, hmm, ok, maybe it is my DSL
    line. Dang AOL!"
    You: :-)

  32. APDEX = You.happy!
    • APDEX is a 'user-experience metric', i.e. a way
    of measuring user satisfaction
    • "This is the one-number metric that senior
    management can easily understand and use to
    manage IT across many applications."
    http://apdex.org/index.php/about/apdex-faq/
    • Shows how all of your users are experiencing
    the site
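
    For a feel of how the one number comes about, here is a minimal sketch of the standard Apdex calculation: with a target time T, responses up to T count as satisfied, responses up to 4T count as tolerating (at half weight), and anything slower counts as frustrated. The function name, the sample response times and the 0.5s threshold are made up for illustration; this is not NewRelic's code.

```python
def apdex(response_times, threshold):
    """Apdex score in [0, 1] for response times and a target time T,
    both given in the same unit (seconds here)."""
    if not response_times:
        raise ValueError("need at least one response time")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    # Frustrated requests (slower than 4T) contribute nothing to the score.
    return (satisfied + tolerating / 2) / len(response_times)


# Two satisfied, one tolerating, one frustrated request with T = 0.5s.
print(apdex([0.2, 0.4, 1.0, 3.0], threshold=0.5))
```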

  33. NewRelic To The Rescue

  34. Remember This Stuff
    • Use NewRelic to find sad-paths
    • Use these Linux tools to see why the paths are
    so sad
    • Use focused tests to isolate & tune parts of the
    system
    • Change one thing, retest, & make sure you can
    reproduce earlier results

  35. Why Was My App Slow?
    Thanks to Dave Bock @codesherpas
    for help with the tools & tuning.
    Pete Campbell
    [email protected]
    @sumirolabs
    github.com/campbell
