High Performance Websites with Scalable Workers

There comes a point with every website when you eventually need to do something in the background. There are always cron jobs, but eventually those either don't scale well or are not responsive enough. Learn how to help your website scale efficiently by using workers. We'll discuss the fundamental theory behind workers and how to easily implement them. We'll learn about several technologies that help manage workers, such as Beanstalkd, Gearman, Supervisord, Redis, and others.

Justin Carmony

May 02, 2013

Transcript

  1. Text
    High Performance Websites
    with Scalable Workers
    @JustinCarmony

  2. Introduction
    About Presenter
    • Director of Development
    for Deseret Digital Media
    • President of Utah PHP Usergroup
    • PHP Developer for 8+ years
    • Goofy Dad

  3. Introduction
    About Presentation
    • There is a Vagrant demo project located at:
    https://github.com/JustinCarmonyDotCom/PHP-Workers-Tutorial
    aka http://bit.ly/10WYeUf
    • Feel free to clone it and follow along

  4. Introduction
    Normally I Have Really
    Nice Slides...

  5. Introduction
    ... but my work gave me
    this template (which is
    ugly) ...

  6. Introduction
    ... and this guy came
    along ....

  7. Introduction

  8. Introduction
    So No Fancy Slides
    this Time
    ( Sorry :p )

  9. Introduction
    So Let's Identify The
    Problem

  10. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Display Success Page]

  11. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Save to DB → Display Success Page]

  12. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Save to DB → Post to Twitter & FB → Display Success Page]

  13. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Save to DB → Post to Twitter & FB → Send Emails → Display Success Page]

  14. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Save to DB → Post to Twitter & FB → Send Emails → Warm Cache → Display Success Page]

  15. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Save to DB → Post to Twitter & FB → Send Emails → Warm Cache → Update Queues → Display Success Page]

  16. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Save to DB → Post to Twitter & FB → Send Emails → Warm Cache → Update Queues → Create Images → Display Success Page]

  17. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Save to DB → Post to Twitter & FB → Send Emails → Warm Cache → Update Queues → Create Images → Find Related Posts → Display Success Page]

  18. The Problem
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Save to DB → Post to Twitter & FB → Send Emails → Warm Cache → Update Queues → Create Images → Find Related Posts → Another Thing (×7) → Display Success Page]

  19. The Problem
    It’s SLOW!

  20. The Problem
    Problems With
    Linear Approach
    • Time from “Save” to “Success” gets longer
    and longer
    • What happens if you get an error? Will it
    block all other items afterwards?
    • What if you have lots of authors? Everyone
    stuck in this linear approach?

  21. The Solution - Workers
    The Solution:
    PHP Workers

  22. The Solution
    What are Workers?
    • Separate PHP Processes that typically run
    from the CLI
    • Each listens to a queue for “jobs”
    • When a worker receives a job, it does it
    • When it finishes, it listens for more jobs
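
The worker behaviour described on this slide can be sketched in plain PHP. This is only an illustration under stated assumptions: an in-memory SplQueue stands in for a real queue such as Beanstalkd, and the function name runWorker() and the job format (an array with 'type' and 'payload' keys) are hypothetical, not taken from the talk's demo project.

```php
<?php
// Minimal sketch: SplQueue stands in for a real job queue (assumption).

/**
 * Pull jobs off the queue until it is empty and run the matching handler.
 * Returns one result string per job processed.
 */
function runWorker(SplQueue $queue, array $handlers): array
{
    $results = [];
    while (!$queue->isEmpty()) {
        $job = $queue->dequeue();                 // receive a job
        $type = $job['type'];
        if (!isset($handlers[$type])) {
            $results[] = "skipped: unknown job type $type";
            continue;
        }
        $results[] = $handlers[$type]($job['payload']); // do the job
    }                                             // then look for more jobs
    return $results;
}

$queue = new SplQueue();
$queue->enqueue(['type' => 'email', 'payload' => 'author@example.com']);
$queue->enqueue(['type' => 'tweet', 'payload' => 'New blog post!']);

$handlers = [
    'email' => function ($to) { return "emailed $to"; },
    'tweet' => function ($text) { return "tweeted $text"; },
];

$results = runWorker($queue, $handlers);
```

Unlike this sketch, which returns once the queue is empty, a long-running worker would normally block while waiting for the next job.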

  23. The Solution
    The Workflow

  24. The Solution
    The Workflow
    [Diagram: Web Application]

  25. The Solution
    The Workflow
    [Diagram: Web Application → Job]

  26. The Solution
    The Workflow
    [Diagram: Web Application → Job → Queue of Jobs]

  27. The Solution
    The Workflow
    [Diagram: Web Application → Job → Queue of Jobs]

  28. The Solution
    The Workflow
    [Diagram: Web Application → Job → Queue of Jobs; Worker]

  29. The Solution
    The Workflow
    [Diagram: Web Application → Job → Queue of Jobs; Worker (Listen)]

  30. The Solution
    The Workflow
    [Diagram: Web Application → Job → Queue of Jobs; Worker (Listen)]

  31. The Solution
    The Workflow
    [Diagram: Web Application → Job → Queue of Jobs; Worker (Listen)]

  32. The Solution
    The Workflow
    [Diagram: Web Application → Job → Queue of Jobs; Worker (Listen, Result)]

  33. The Solution
    The Workflow
    [Diagram: Web Application → Job → Queue of Jobs; Worker (Listen, Result)]

  34. The Solution
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form]

  35. The Solution
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Queue Jobs]

  36. The Solution
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Queue Jobs (many Job items queued)]

  37. The Solution
    Example: Submit
    New Blog Post
    [Diagram: Fill Out Form → Queue Jobs (many Job items queued) → Display Success Page]

  38. The Solution
    Okay... the Theory is
    Pretty Straightforward...

  39. The Solution
    Let's see how you
    actually do it...

  40. Getting Started
    What We’ll Need
    • PHP 5.3
    • Queue -- Beanstalkd
    • Manage Queue - Pheanstalk Library
    • Manage Workers - Solo / Supervisord
    • Store Worker Status - Redis
    • Talk with Redis - Predis Library

  41. Getting Started
    Wait, What About
    Gearman?
    • It’s great to use!
    • Requires installing extra modules.
    • If you understand how to use Beanstalkd,
    it’s easy to use Gearman
    • There are lots of talks about Gearman; I’d
    like to give Beanstalkd some exposure

  42. Getting Started
    Install / Setup
    Beanstalkd
    • Debian/Ubuntu:
    apt-get install beanstalkd
    • Fedora/CentOS/RHEL:
    su -c 'yum install beanstalkd'
    • Mac Homebrew:
    brew install beanstalkd

  43. Getting Started
    Beanstalkd on
    Windows
    • ... use a VM. Check out Vagrant!

  44. Getting Started
    Pheanstalk for PHP
    • Most Tested PHP Beanstalkd Library
    • Installable via Composer
    • Located at:
    https://github.com/pda/pheanstalk

  45. Workers
    The Trick: Managing
    Your Workers
    • Putting stuff in a queue is easy!
    • Keeping workers up & running consistently
    is the hard part!
    • We’ll talk about PHP workers, but you can
    use any language to write your workers.

  46. Workers
    Features for our
    PHP Workers
    • Run for a long time
    • Self-restarting
    • Can be restarted
    • Reports its current status
    • Blocking, Not Polling
    • Monitors Its Own Performance

  47. Workers
    Timing For your
    Workers
    • Allowing them to run indefinitely:
    set_time_limit(0);
    • Set a max runtime for them to gracefully
    restart. Example:
    $time_limit = 60 * 60 * 1; // Minimum of 1 hour
    $time_limit += rand(0, 60 * 10); // Adding additional time
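
One way to apply the max-runtime idea is to check the clock between jobs and leave the loop once the limit has passed, so that a process manager such as Supervisord can start a fresh worker. The following is a minimal sketch, not the talk's demo code: shouldRestart() is a hypothetical helper, and the clock is simulated so the logic can be checked without waiting an hour. The comment about staggering restarts is an assumption about why the random extra time is added.

```php
<?php
// Sketch only: shouldRestart() is a hypothetical helper, not from the talk.

set_time_limit(0); // no PHP execution time limit for this process

$time_limit = 60 * 60 * 1;        // minimum of 1 hour
$time_limit += rand(0, 60 * 10);  // up to 10 extra minutes (assumed purpose: stagger restarts)

/**
 * True once the worker has been running for at least $limit seconds.
 */
function shouldRestart(int $startedAt, int $now, int $limit): bool
{
    return ($now - $startedAt) >= $limit;
}

$startedAt = time();

// In a real worker the loop body would reserve and process one job and
// $now would come from time(). Here a simulated clock advances 20 minutes
// per iteration instead.
$now = $startedAt;
$iterations = 0;
while (!shouldRestart($startedAt, $now, $time_limit)) {
    $now += 20 * 60;
    $iterations++;
}
// Falling out of the loop is the graceful exit point.
```

Checking the limit only between jobs means a job in progress is never cut off partway through.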

  48. The Queue
    Anatomy of a
    Beanstalkd Job
    • What makes up a job
    • Priority
    • Delay
    • Time-To-Run
    • Data
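
These four parts map onto the Beanstalkd protocol's put command, which takes the form `put <pri> <delay> <ttr> <bytes>` followed by the job body. The sketch below builds that command string by hand purely for illustration. buildPutCommand() and its default values are hypothetical, and a client library such as Pheanstalk would normally do this for you.

```php
<?php
// Sketch: buildPutCommand() is a hypothetical helper for illustration.

/**
 * Build the raw Beanstalkd "put" command for a job.
 *
 * @param string $data     job body (often JSON)
 * @param int    $priority 0 is most urgent; larger numbers are less urgent
 * @param int    $delay    seconds to wait before the job becomes ready
 * @param int    $ttr      time-to-run: seconds a worker may hold the job
 */
function buildPutCommand(string $data, int $priority = 1024, int $delay = 0, int $ttr = 60): string
{
    return sprintf("put %d %d %d %d\r\n%s\r\n", $priority, $delay, $ttr, strlen($data), $data);
}

$body = json_encode(['task' => 'send_email', 'post_id' => 42]);
$command = buildPutCommand($body, 1024, 5, 120);
```

Keeping the data small (an ID plus a task name, rather than a whole record) is a common choice, since the worker can load whatever else it needs.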

  49. The Queue
    Beanstalkd Actions
    • put (creates job)
    • put with delay
    • reserve (worker selects
    job)
    • release (puts job back
    into queue)
    • release with delay
    • delete (removes job)
    • bury (sets job aside
    until it is kicked)
    • kick (kick a buried job
    back into the queue)
    • touch (request more
    time on a job)

  50. The Queue
    Typical Job Life Cycle
    [Diagram: Client → put → Job → reserve → Worker → delete → Poof!]

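  51. The Queue
    Possible Job Flow
    • put → READY
    • put with delay → DELAYED
    • DELAYED → (time passes) → READY
    • READY → reserve → RESERVED
    • RESERVED → release → READY
    • RESERVED → release with delay → DELAYED
    • RESERVED → delete → Poof!
    • RESERVED → bury → BURIED
    • BURIED → kick → READY
    • BURIED → delete → Poof!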
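
The states and transitions on this slide can be modelled as a small lookup table. This is a minimal sketch that covers only the transitions drawn on the slide. The nextState() function, the action names, and the 'DELETED' marker (the slide's "Poof!") are names invented here for illustration.

```php
<?php
// Sketch: nextState() and the 'DELETED' marker are hypothetical names.

/**
 * Return the state a job moves to, or throw if the action is not one of
 * the transitions shown on the slide for the current state.
 */
function nextState(string $state, string $action): string
{
    $transitions = [
        'DELAYED'  => ['time_passes' => 'READY'],
        'READY'    => ['reserve' => 'RESERVED'],
        'RESERVED' => [
            'release'            => 'READY',
            'release_with_delay' => 'DELAYED',
            'bury'               => 'BURIED',
            'delete'             => 'DELETED',
        ],
        'BURIED'   => ['kick' => 'READY', 'delete' => 'DELETED'],
    ];

    if (!isset($transitions[$state][$action])) {
        throw new InvalidArgumentException("No '$action' transition from $state");
    }
    return $transitions[$state][$action];
}

// A job that is reserved, buried after a failure, kicked, then completed.
$state = 'READY'; // state after a plain "put"
foreach (['reserve', 'bury', 'kick', 'reserve', 'delete'] as $action) {
    $state = nextState($state, $action);
}
```

A table like this can be handy in tests for a worker, to check that its error handling only ever requests moves the queue allows.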

  52. Demo Time
    Let's look at the code!

  53. High Performance
    Okay, so... what about
    the High Performance
    Part?

  54. High Performance
    Using Workers to
    Increase Performance
    • Make Writing to Your “Queue” Extremely
    Fast (Beanstalkd, Redis, In-Memory)
    • Do only what you must in your request,
    Queue Everything Else
    • Find IO Operations & Queue Them Instead
    of Performing Them
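
As an illustration of "do only what you must in your request", the sketch below saves the post inline and queues everything else. It rests on several assumptions: handleNewPost(), the task names, and the array used as a database are hypothetical, and an in-memory SplQueue stands in for a fast queue such as Beanstalkd or Redis.

```php
<?php
// Sketch: handleNewPost() and the task names are hypothetical; SplQueue
// stands in for a fast queue such as Beanstalkd or Redis.

/**
 * Handle a "new blog post" request: do the one required write inline,
 * then queue the slower follow-up work for workers to pick up.
 */
function handleNewPost(array $post, array &$db, SplQueue $queue): string
{
    // The only step the user has to wait for.
    $db[] = $post;
    $postId = count($db);

    // Everything else becomes a small JSON job description.
    $tasks = ['post_to_social', 'send_emails', 'warm_cache', 'create_images'];
    foreach ($tasks as $task) {
        $queue->enqueue(json_encode(['task' => $task, 'post_id' => $postId]));
    }

    return "Success: post $postId saved";
}

$db = [];
$queue = new SplQueue();
$response = handleNewPost(['title' => 'Hello Workers'], $db, $queue);
```

The response is ready as soon as the four enqueue calls return, however long the queued tasks later take to run.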

  55. High Performance
    Caching is for Reads
    Queueing is for Writes

  56. High Performance
    Possible Things to
    Queue: Your Writes
    • Logging
    • Analytics
    • Some Queries
    • Communication with other APIs
    • Writing to your Data Sources

  57. High Performance
    Identifying What To
    Queue
    • Use XHGui/XHProf to find slow parts of
    your application
    • Use StatsD/Graphite to Gather Stats
    Across Requests
    • Does it communicate with something else?

  58. Questions
    Questions?

  59. The End
    Thank You!
    Twitter: @JustinCarmony
    Website: http://www.justincarmony.com/
    Email: [email protected]
