Northeast PHP 2012 - Gearman

Mike Willbanks

August 12, 2012

Transcript

  1. A Job Server to Scale
     By Mike Willbanks, Sr. Web Architect Manager, NOOK Developer
     Northeast PHP, August 12, 2012
  2. Housekeeping…
     • Talk
       ◦ Slides will be online later!
     • Me
       ◦ Sr. Web Architect Manager at NOOK Developer
       ◦ Former MNPHP Organizer
       ◦ Open Source Contributor (Zend Framework and various others)
       ◦ Where you can find me:
         – Twitter: mwillbanks
         – G+: Mike Willbanks
         – IRC (freenode): mwillbanks
         – Blog: http://blog.digitalstruct.com
         – GitHub: https://github.com/mwillbanks
  3. Agenda
     • What is Gearman
       ◦ A general introduction
     • Main Concepts
       ◦ Looking overall at how gearman works for you.
     • Quick Start
       ◦ Make it go do something.
     • Digging in
       ◦ A detailed look into gearman.
     • PHP Integration
       ◦ How you should work with it in PHP including use cases and samples.
     • Questions
       ◦ Although you can bring them up at anytime!
  4. Official Statement
     “Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages.”
  5. What it Means
     • Gearman consists of a daemon, client and worker
       ◦ At the core, they are simply small programs.
     • The daemon handles the negotiation of work
       ◦ Between workers and clients
     • The worker does the work
     • The client requests work to be done
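
One way to picture the three roles is a tiny in-process sketch in Python (standard library only). This is not Gearman and not its API: `ToyJobServer`, `register`, `submit` and `work_once` are hypothetical names, and in a real setup the daemon, workers and clients are separate processes talking over the network.

```python
from collections import deque


class ToyJobServer:
    """Stands in for the daemon: it only queues jobs and routes them to workers."""

    def __init__(self):
        self.workers = {}      # function name -> callable registered by a worker
        self.queue = deque()   # pending (function name, workload) pairs

    def register(self, function_name, handler):
        # Worker side: announce which function this worker can perform.
        self.workers[function_name] = handler

    def submit(self, function_name, workload):
        # Client side: hand a string workload to the server and return at once.
        self.queue.append((function_name, workload))

    def work_once(self):
        # Hand the oldest queued job to the worker registered for it.
        function_name, workload = self.queue.popleft()
        return self.workers[function_name](workload)


def count_lines(workload):
    # The worker's single task: count newline-terminated lines, like `wc -l`.
    return str(workload.count("\n"))


server = ToyJobServer()
server.register("wc", count_lines)
server.submit("wc", "root\ndaemon\nnobody\n")
result = server.work_once()
```

The client never calls `count_lines` directly; it only knows the function name `wc`, which is the point of putting a job server in the middle.
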
  6. Installation
     • Head to gearman.org
     • Click Download
     • Click on the LaunchPad download
     • Download the source archive
     • Unpack the archive
     • ./configure && make && make install
     • Bam! You’re off!
       ◦ For more advanced configuration see ./configure --help
     • Starting
       ◦ gearmand -d
  7. Gearmand Usage
     • gearmand
       -d            Run as background daemon
       -u [user]     Run as user
       -L [host]     Listen on host/ip
       -p [port]     Listen on port
       -t [threads]  Number of threads to use
       -v[vv]        Verbosity
  8. Simple Bash Example
     • Starting the Daemon
       ◦ gearmand -d
     • Worker – command line style
       ◦ nohup gearman -w -f wc -- wc -l &
         – Run the worker in the background.
     • Client – command line style
       ◦ gearman -f wc < /etc/passwd
         – Outputs the number of lines.
  9. Gearman Client Command Line Usage
     • gearman
       -w            Worker mode
       -f [function] Function name to use
       -h [host]     Job server host
       -p [port]     Job server port
       -t [timeout]  Timeout in milliseconds
       -H            Full options for both clients and workers.
  10. Persistence
      • Gearman by default is an in-memory queue
        ◦ Leaving this as the default is ideal; however, it does not work in all environments.
      • Persistent Queues
        ◦ Libdrizzle
        ◦ Libsqlite3
        ◦ Libmemcached
        ◦ Postgres
        ◦ TokyoCabinet
        ◦ MySQL
        ◦ Redis
  11. Getting Up and Running with Persistence
      • Persistent queues require specific configuration during the compilation of gearman.
      • Additionally, arguments to the gearman daemon need to be passed to talk to the specific persistence layer.
      • Each persistence layer is actually built as a plugin to gearmand
        ◦ http://bazaar.launchpad.net/~tangent-org/gearmand/trunk/files/head:/libgearman-server/plugins/queue/
  12. Clients
      • Clients send work to the gearmand server
        ◦ This is called the workload; it can be anything that can become a string.
        ◦ Utilize an open format; it will make life easier in the event you use multiple programming languages, are debugging or the like.
          – XML, JSON, etc.
          – Yes, you can serialize objects if you want to, but I recommend against this.
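
The advice about open formats can be sketched like this in Python. The helper names and the fields of the job (`image_id`, `width`, `height`) are made up for illustration; the only point is that the workload travels as a plain JSON string that a worker in any language can parse.

```python
import json


def encode_workload(payload):
    # Client side: turn plain data into a string any language can parse.
    return json.dumps(payload, sort_keys=True)


def decode_workload(workload):
    # Worker side: parse the string back into plain data.
    return json.loads(workload)


# Hypothetical job description for an image-resize function.
workload = encode_workload({"image_id": 42, "width": 200, "height": 150})
job = decode_workload(workload)
```

Because the workload is readable text, it can also be pasted into a log or a command line client while debugging, which is much harder with a language-specific serialized object.
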
  13. Workers
      • Workers are the dudes in the factory doing all the work
      • Generally they will run as a daemon in the background
      • Workers register a function that they perform
        ◦ They should ONLY be doing a single task.
        ◦ This makes them far easier to manage.
      • The worker does the work and “can” return results
        ◦ If you are doing the work asynchronously you generally do not return the result.
        ◦ With synchronous work you will return the result.
  14. Workers – special notes
      • Utilizing the Database
        ◦ If you keep a database connection
          – Must have the ability to reconnect to the database.
          – Watch for connection timeouts
      • Handling Memory Leaks
        ◦ Watch the amount of memory and detect leaks, then kill the worker.
      • Request-Based Languages
        ◦ PHP, for instance, sometimes slows down after hundreds of executions; kill the worker off if you know this will happen.
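
A minimal sketch of the "kill it off" idea, written in Python with the standard library only: the worker loop returns after a fixed number of jobs or once peak memory crosses a limit, and whatever supervises it starts a fresh process. `run_worker`, `fetch_job`, `handle_job` and the limits are hypothetical names and values; `resource` is Unix-only and the unit of `ru_maxrss` differs by platform.

```python
import resource


def peak_memory():
    # Peak resident set size of this process (Unix only).
    # Units are platform dependent: kilobytes on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def run_worker(fetch_job, handle_job, max_jobs=500, max_memory=None,
               read_memory=peak_memory):
    """Process jobs until a limit is hit, then return so a supervisor can restart us.

    Returns a (jobs_done, reason) tuple.
    """
    jobs_done = 0
    while True:
        if jobs_done >= max_jobs:
            return jobs_done, "max_jobs"
        if max_memory is not None and read_memory() > max_memory:
            return jobs_done, "max_memory"
        job = fetch_job()
        if job is None:          # nothing left to do in this sketch
            return jobs_done, "no_more_jobs"
        handle_job(job)
        jobs_done += 1


# Five jobs are waiting, but the worker gives up after three.
jobs = iter(["a", "b", "c", "d", "e"])
handled = []
outcome = run_worker(lambda: next(jobs, None), handled.append, max_jobs=3)
```

The limits are checked between jobs, so a job that has started is always finished before the worker exits.
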
  15. Keeping the Daemon Running
      • Workers sometimes have issues and die, or you need to boot them back up after a restart
        ◦ Utilizing a service to watch your workers and ensure they are always running is a GOOD thing.
      • Supervisord
        ◦ Can watch processes, restart them if they die or get killed
        ◦ Can manage multiple processes of the same program
        ◦ Can start and stop your workers.
        ◦ Running: supervisord -c myconfig.conf
      • When running workers, BE SURE to handle termination signals such as SIGTERM (SIGKILL cannot be caught).
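
Signal handling in a worker can be sketched as below (Python, standard library only). The handler only sets a flag, and the loop checks it between jobs so the job in progress is not cut off. `StopFlag` and `work_until_stopped` are hypothetical names, not part of any Gearman library.

```python
import signal


class StopFlag:
    """Remembers that a shutdown was requested so the loop can exit cleanly."""

    def __init__(self):
        self.stop_requested = False

    def handle(self, signum, frame):
        # Signal handler: only set a flag; the current job is allowed to finish.
        self.stop_requested = True

    def install(self):
        # SIGTERM is what supervisord sends by default to stop a process;
        # SIGINT covers Ctrl+C. SIGKILL cannot be caught, so it is not listed.
        for signum in (signal.SIGTERM, signal.SIGINT):
            signal.signal(signum, self.handle)


def work_until_stopped(stop, jobs, handle_job):
    # Check the flag between jobs, never in the middle of one.
    done = 0
    for job in jobs:
        if stop.stop_requested:
            break
        handle_job(job)
        done += 1
    return done
```

A worker would call `StopFlag().install()` once at startup and pass the flag into its loop; a process that ignores SIGTERM usually ends up being killed outright after the supervisor's stop timeout.
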
  16. Monitoring
      • Gearman Status
        ◦ telnet on port 4730
        ◦ Write “STATUS”
          – Gives you the registered functions, number of workers and items in the queue.
      • Gearman Monitor – PHP Project
        ◦ Basic monitoring; but works and it is open source so you can improve it!
        ◦ https://github.com/yugene/Gearman-Monitor
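
The telnet check can also be scripted. The sketch below (Python, standard library only) assumes the reply format of Gearman's administrative protocol as I understand it: one line per function with four tab-separated fields, ended by a line containing a single dot. `parse_status` and `fetch_status` are hypothetical helper names, and the sample reply is made up.

```python
import socket


def parse_status(text):
    """Parse the reply to the admin `status` command.

    Each line holds four tab-separated fields: function name, jobs in the
    queue, jobs currently running, and available workers. A line with a
    single "." ends the reply.
    """
    rows = []
    for line in text.splitlines():
        if line == ".":
            break
        if not line:
            continue
        name, total, running, workers = line.split("\t")
        rows.append({
            "function": name,
            "queued": int(total),
            "running": int(running),
            "workers": int(workers),
        })
    return rows


def fetch_status(host="127.0.0.1", port=4730, timeout=2.0):
    # Same thing as typing `status` into telnet; needs a running gearmand.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"status\n")
        data = b""
        while not data.endswith(b".\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_status(data.decode("utf-8"))


# Sample reply text, made up for illustration.
sample = "wc\t3\t1\t2\nresize_image\t0\t0\t4\n.\n"
status = parse_status(sample)
```

A function with a growing queue and zero workers is the usual thing to alert on.
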
  17. Usage
      • Two Options
        ◦ Net_Gearman (PEAR)
          – Implemented through sockets with PHP.
          – https://pear.php.net/package/Net_Gearman/
        ◦ Gearman Extension (PECL)
          – Implemented through the C API from libgearman
          – http://pecl.php.net/package/gearman
  18. Frameworks and Integration
      • GearmanManager - agnostic
        ◦ https://github.com/brianlmoon/GearmanManager/
      • Zend Framework 1: Zend_Gearman
        ◦ https://github.com/mwillbanks/Zend_Gearman
      • Zend Framework 2: mwGearman
        ◦ https://github.com/mwillbanks/mwGearman
      • Drupal
        ◦ http://drupal.org/node/783294
  19. Conditions
      • Watch for Memory Utilization
        ◦ Check peak usage then kill and restart the worker
      • Don’t execute too many times
        ◦ PHP is not great at unlimited loops
      • Keep your memory free
        ◦ Garbage collect when you can!
      • Databases
        ◦ Implement a callback to ensure that you do not time out; otherwise implement a reconnection.
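
The reconnection option for databases can be sketched as a thin wrapper (Python, no external driver). Everything here is hypothetical: `ReconnectingConnection`, the `query` method and the `FakeConnection` stand-in only illustrate the pattern of reconnecting once when a long-idle worker finds its connection gone.

```python
class ReconnectingConnection:
    """Wraps a connection factory and reconnects once when a query fails."""

    def __init__(self, connect, retry_on=(ConnectionError,)):
        self._connect = connect      # hypothetical factory returning a connection
        self._retry_on = retry_on    # exception types that mean "connection lost"
        self._conn = None

    def query(self, sql):
        if self._conn is None:
            self._conn = self._connect()
        try:
            return self._conn.query(sql)
        except self._retry_on:
            # The server dropped us while the worker sat idle: reconnect, retry once.
            self._conn = self._connect()
            return self._conn.query(sql)


class FakeConnection:
    # Stand-in for a real driver: the first connection has timed out.
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1
        self.stale = FakeConnection.opened == 1

    def query(self, sql):
        if self.stale:
            raise ConnectionError("server has gone away")
        return "ok: " + sql


db = ReconnectingConnection(FakeConnection)
answer = db.query("SELECT 1")
```

Retrying only once keeps a genuinely unreachable database from turning into an endless loop inside the worker; the second failure propagates and the job can be failed or requeued.
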
  20. Images
      • If you resize images on your web server:
        ◦ Web servers should serve, not process images.
        ◦ Images require a lot of memory AND processing power
          – They are best processed on their own!
      • Processing in the Background
        ◦ Generally will require a change to your workflow and checking the status with XHR to see if the job has been completed.
          – This allows you to process them as you have resources available.
          – Have enough workers to process them “quickly enough”
      • Or just do it synchronously
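
The workflow change for background processing can be sketched as a small status record that the page polls over XHR (Python, standard library only). `JobStatusStore` and its methods are hypothetical; a real setup would keep this state somewhere both the web server and the workers can reach, not in one process's memory.

```python
import json
import uuid


class JobStatusStore:
    """In-memory record of background jobs; a real setup would use shared storage."""

    def __init__(self):
        self._status = {}

    def create(self):
        # Called when the web request queues the job; the id goes back to the browser.
        job_id = uuid.uuid4().hex
        self._status[job_id] = "queued"
        return job_id

    def mark_done(self, job_id):
        # Called by the worker once the image has been processed.
        self._status[job_id] = "done"

    def status_response(self, job_id):
        # Body of the response a hypothetical XHR status endpoint would return.
        state = self._status.get(job_id, "unknown")
        return json.dumps({"job_id": job_id, "status": state})


store = JobStatusStore()
job_id = store.create()
before = json.loads(store.status_response(job_id))["status"]
store.mark_done(job_id)
after = json.loads(store.status_response(job_id))["status"]
```

The upload request returns as soon as the job is queued, and the browser keeps asking for the status until it reads `done`.
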
  21. Email
      • Sending email and/or generating templates and processing variables can take up time, time that is better spent getting the user to the next page.
      • The feedback on the mail doesn’t really make a difference so it is great to send it to the background.
  22. Log Analysis / Aggregation
      • Get all of your logs to a single place
      • Process the logs to produce analytical data
      • Impression / Click Tracking
      • Why run introspection over the log file itself?
        ◦ Near real-time analysis is possible!
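
As a rough illustration of impression and click tracking in a worker, the sketch below tallies events as they arrive (Python, standard library only). The two-field line format (`event ad_id`) and the name `tally_events` are assumptions made for this example; the slides do not describe the actual log format.

```python
from collections import Counter


def tally_events(log_lines):
    """Count impressions and clicks per ad from lines like 'impression ad-1'."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue              # skip lines that do not match the assumed format
        event, ad_id = parts
        if event in ("impression", "click"):
            counts[(ad_id, event)] += 1
    return counts


# Made-up sample lines; each one would arrive as a job workload.
lines = [
    "impression ad-1",
    "impression ad-1",
    "click ad-1",
    "impression ad-2",
    "garbage",
]
totals = tally_events(lines)
```

Counting per event as jobs arrive is what makes the near real-time view possible, as opposed to re-reading a whole log file later.
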
  23. Executable Processes
      • You need to run an executable process…
      • This process takes a given name and tells you how many processes are running on your worker machine.
        ◦ Purely for example purposes; however, you might want to run SaaS against a CMS or something to that degree.
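
The process-counting example could look roughly like this in Python (standard library only). `count_named` and `count_running` are hypothetical names; the sketch assumes a Unix `ps` that accepts `-A -o comm=`, and the exact form of the reported command name varies by platform.

```python
import subprocess


def count_named(process_names, name):
    # Count exact matches of `name` among command names, one per line.
    return sum(1 for line in process_names.splitlines() if line.strip() == name)


def count_running(name):
    # `ps -A -o comm=` lists the command name of every process, without a header.
    listing = subprocess.run(
        ["ps", "-A", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_named(listing, name)


# Made-up listing to show the counting step without calling ps.
sample_listing = "init\nphp\nnginx\nphp\nphp-fpm\n"
php_count = count_named(sample_listing, "php")
```

A worker would receive the process name as its workload, call `count_running`, and return the number as a string.
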
  24. Questions?
      These slides will be posted to SlideShare & SpeakerDeck.
      ◦ Slideshare: http://www.slideshare.net/mwillbanks
      ◦ SpeakerDeck: http://speakerdeck.com/u/mwillbanks
      ◦ Twitter: mwillbanks
      ◦ G+: Mike Willbanks
      ◦ IRC (freenode): mwillbanks
      ◦ Blog: http://blog.digitalstruct.com
      ◦ GitHub: https://github.com/mwillbanks