Distributed Systems with Gearman

Gabor Vizi
February 22, 2013

Transcript

  1. Distributed task system via Gearman. Gabor Vizi (@vgabor), PHPUK 2013 unconference
  2. Agenda
     • Job queues and distributed task systems
       – What they are and why we have them
       – Important aspects (workflow vs task, sync vs async)
     • Using Gearman with PHP
       – Gearman server and system architecture
       – Gearman PHP extension
     • Questions
  3. Keywords: - Scalability - Redundancy - Load balancing

  4. Scaling out by multiplication
     [Diagram: a single machine (one server behind the Internet) versus multiple machines (several servers, each behind the Internet)]
  5. Scaling out by functionality
     [Diagram: web servers and task servers behind the Internet]
  6. Distributed job queue system
     [Diagram: web nodes, job nodes, and two job servers; labels: Client, Queue, Worker]
  7. Important aspects
     • Response time: sync vs async
     • Resource location: data vs processing
     • Inter-dependency: workflow vs task
     • Task execution: failure tolerance, parallelization, concurrency
  8. Solutions
     • Gearman
       – Tasks
       – PHP extension
     • Amazon SWF
       – Workflows
       – PHP lib (Amazon SDK)
     • 0MQ
       – Your own implementation
  9. Gearman
     • The client creates a job and sends it to the server.
     • The server finds a worker and sends the task to it.
     • The worker does the job and reports back to the server.
     • [The server reports back to the client: sync only]
     [Diagram: Gearman stack]
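The four steps on the Gearman slide can be sketched as a toy, in-process model. This is not the Gearman API: `ToyJobServer` and all of its methods are hypothetical names invented for illustration, and everything runs in one PHP process, whereas a real setup has the client, server and worker in separate processes talking over the network.

```php
<?php
// Toy in-process model of the client -> server -> worker flow.
// NOT the Gearman API: ToyJobServer and its methods are hypothetical.

final class ToyJobServer
{
    /** @var array<string, callable> function name => worker callback */
    private array $workers = [];
    /** @var array<int, array<string, string>> queued jobs, oldest first */
    private array $queue = [];
    /** @var array<string, string> job handle => result */
    private array $results = [];
    private int $nextId = 1;

    // A worker announces which function it can perform.
    public function registerWorker(string $function, callable $callback): void
    {
        $this->workers[$function] = $callback;
    }

    // A client submits a job; the server queues it and returns a handle.
    public function submit(string $function, string $workload): string
    {
        $handle = 'H:toy:' . $this->nextId++;
        $this->queue[] = ['handle' => $handle, 'function' => $function, 'workload' => $workload];
        return $handle;
    }

    // The server hands the oldest queued job to a matching worker
    // and stores the result the worker reports back.
    public function dispatchNext(): bool
    {
        $job = array_shift($this->queue);
        if ($job === null) {
            return false; // nothing queued
        }
        if (!isset($this->workers[$job['function']])) {
            array_unshift($this->queue, $job); // no worker yet, keep it queued
            return false;
        }
        $this->results[$job['handle']] = ($this->workers[$job['function']])($job['workload']);
        return true;
    }

    // A synchronous client would receive this result from the server.
    public function result(string $handle): ?string
    {
        return $this->results[$handle] ?? null;
    }
}

$server = new ToyJobServer();
$server->registerWorker('reverse', 'strrev');   // worker side
$handle = $server->submit('reverse', '123456'); // client side
$server->dispatchNext();                        // server side
echo $server->result($handle), "\n";            // prints 654321
```

The model keeps the three roles apart on purpose: the client only knows a function name and a handle, which is what lets workers be added or moved to other machines without changing client code.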
  10. Gearman server
      • Install from source (latest C++ version: 1.1.5; old C version: 0.14); distribution packages are outdated
      • Dependencies: boost-devel, libevent-devel, curl-devel (for tests), plus whichever -devel packages the persistent queue needs (mysql, sqlite3, memcached, redis, etc.)
      • User / group / dir:
        `groupadd -r gearmand`
        `useradd -M -r -g gearmand -d /var/lib/gearmand -s /bin/false -c "Gearman Server" gearmand`
        `mkdir /var/lib/gearmand && chown -R gearmand:gearmand /var/lib/gearmand`
      • Config: support/gearmand.init (not installed by `make install`; edit before use): --pidfile, --log-file, --verbose [level]
  11. Gearman server
      • In-memory queue
      • Persistent queue
        – Memcached
        – Redis
        – SQLite
        – MySQL/Drizzle
        – PostgreSQL
        – TokyoCabinet
  12. Gearman server
      • Physical box:
        – CPU: low
        – IO: high
        – memory: depends (but more is better)
      • Workload size and queue type:
        – disk / network IO
        – the workload has to fit into the queue
      • IO model:
        – continuous connection between server and workers
        – the server pushes tasks to the workers
  13. Gearman PHP extension
      • Install from source, requirements:
        – 0.8.* versions: libgearman v0.14-
        – 1.0.* versions: libgearman v0.21-
        – 1.1.* versions: libgearman v1.1.0
      • Config /etc/php.d/gearman.ini: extension="gearman.so"
  14. Gearman PHP extension
      Version 1.1.1: addServer(string $host, int $port)
      • The documentation is wrong: both host and port are required.
      • Without host and port, neither the client nor the worker can connect to the server. The client at least throws an error, but the worker silently does nothing.
  15. PHP example: job execution

      Single:

          <?php
          $gmclient = new GearmanClient();
          $gmclient->addServer('127.0.0.1', 4730);

          // Queue one background job and keep its handle for later status checks.
          $job_handle = $gmclient->doBackground("reverse", "123456", "uniq1");
          save_somewhere("uniq1", $job_handle, '127.0.0.1', 4730);

      Parallel:

          <?php
          $gmclient = new GearmanClient();
          $gmclient->addServer('127.0.0.1', 4730);

          // Called once per task as soon as the server has queued it.
          $gmclient->setCreatedCallback(
              function ($task) {
                  save_somewhere(
                      $task->unique(), $task->jobHandle(), '127.0.0.1', 4730
                  );
              }
          );

          // Each task gets its own unique ID; jobs that share a function name
          // and unique ID may be merged into a single job by the server.
          $gmclient->addTaskBackground("reverse", "123456", null, "uniq1");
          $gmclient->addTaskBackground("reverse", "qwerty", null, "uniq2");
          $gmclient->addTaskBackground("reverse", "asdfgh", null, "uniq3");
          $gmclient->runTasks();
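The job execution slide only shows the client side. A minimal sketch of a matching worker for the "reverse" function could look like the following, assuming the same local server on port 4730. The names `reverse_fn` and `run_reverse_worker` are illustrative and do not come from the slides; the worker loop mirrors the one used in the progress check example.

```php
<?php
// Sketch of a worker for the "reverse" function used by the job execution clients.
// reverse_fn and run_reverse_worker are illustrative names, not from the slides.

// Job callback: $job is a GearmanJob; workload() returns the client's payload.
function reverse_fn($job)
{
    return strrev($job->workload());
}

// Connects to the job server and processes jobs until an error occurs.
// Requires the gearman PHP extension and a running gearmand.
function run_reverse_worker(string $host = '127.0.0.1', int $port = 4730): void
{
    $worker = new GearmanWorker();
    $worker->addServer($host, $port); // pass both host and port explicitly
    $worker->addFunction('reverse', 'reverse_fn');

    while ($worker->work()) {
        if ($worker->returnCode() != GEARMAN_SUCCESS) {
            echo "return_code: " . $worker->returnCode() . "\n";
            break;
        }
    }
}
```

To start the worker, call `run_reverse_worker();` at the end of the script and run it from the command line. For background jobs the returned string should not be expected to reach the submitting client, so a real worker would typically store its result somewhere the client can read it later.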
  16. PHP example: progress check

      Client:

          <?php
          $gmclient = new GearmanClient();
          $gmclient->addServer('127.0.0.1', 4730);

          // Queue the job and print its handle; pass the handle to the status script.
          $job_handle = $gmclient->doBackground("sheepcount", "0");
          print_r($job_handle);

      Status:

          <?php
          $gmclient = new GearmanClient();
          $gmclient->addServer('127.0.0.1', 4730);

          // $argv[1] is the job handle printed by the client script.
          $status = $gmclient->jobStatus($argv[1]);
          print_r($status);

      Worker:

          <?php
          echo "Starting\n";
          $gmworker = new GearmanWorker();
          $gmworker->addServer('127.0.0.1', 4730);
          $gmworker->addFunction("sheepcount", "sheepcount_fn");

          print "Waiting for job...\n";
          while ($gmworker->work()) {
              if ($gmworker->returnCode() != GEARMAN_SUCCESS) {
                  echo "return_code: " . $gmworker->returnCode() . "\n";
                  break;
              }
          }

          // Counts to $max, reporting progress to the server every 3 seconds.
          function sheepcount_fn($job)
          {
              echo "Received job: " . $job->handle() . "\n";
              for ($max = 1000, $x = 0; $x < $max; $x++) {
                  echo "Sending status: $x/$max sheep!\n";
                  $job->sendStatus($x, $max);
                  sleep(3);
              }
          }
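The status script prints the raw array from `jobStatus()`. As a rough sketch of how that array might be read, the helper below turns it into one line of text. It assumes the layout given in the PHP manual (known flag, running flag, numerator, denominator); `describe_job_status` is a hypothetical name, and the wording of the "unknown" case reflects the assumption that the server forgets a job once it has finished.

```php
<?php
// Helper that turns the array returned by GearmanClient::jobStatus()
// into a readable line. describe_job_status is a hypothetical name.
function describe_job_status(array $status): string
{
    // Documented layout: [known, running, numerator, denominator]
    [$known, $running, $numerator, $denominator] = $status;

    if (!$known) {
        return 'unknown job (finished, or never queued on this server)';
    }
    if (!$running) {
        return 'queued, waiting for a worker';
    }
    if ($denominator > 0) {
        $percent = round(100 * $numerator / $denominator, 1);
        return "running: $numerator/$denominator ($percent%)";
    }
    return 'running, no progress reported yet';
}

echo describe_job_status([true, true, 250, 1000]), "\n"; // running: 250/1000 (25%)
```

In the status script, `print_r($status);` could be replaced with `echo describe_job_status($status), "\n";` to get the same information in a form that is easier to scan while the sheepcount worker is running.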
  17. Information
      • Documentation: http://www.gearmand.org/
      • Server source: https://launchpad.net/gearmand
      • PHP extension: http://pecl.php.net/gearman
      • Slides: https://joind.in/8237
      • Twitter: @vgabor