Distributed Systems with Gearman

Gabor Vizi
February 22, 2013

Transcript

  1. Agenda
    • Job queues and distributed task systems
      – What they are and why we have them
      – Important aspects (workflow vs task, sync vs async)
    • Using Gearman with PHP
      – Gearman server and system architecture
      – Gearman PHP extension
    • Questions
  2. Scaling out by multiplication
    [Diagram labels: Single machine, Multiple machines, server, Internet]
  3. Scaling out by functionality
    [Diagram labels: Internet, web, task, server]
  4. Distributed job queue system
    [Diagram labels: web, job, Job Server; Client, Worker, Queue]
  5. Important aspects
    • Response time: sync vs async
    • Resource location: data vs processing
    • Inter-dependency: workflow vs task
    Task execution: failure tolerance, parallelization, concurrency
  6. Solutions
    • Gearman
      – Tasks
      – PHP extension
    • Amazon SWF
      – Workflows
      – PHP lib (Amazon SDK)
    • 0MQ
      – Your own implementation
  7. Gearman
    • The client creates a job and sends it to the server.
    • The server finds a worker and sends the task to it.
    • The worker does the job and reports back to the server.
    • [The server reports back to the client – sync only]
    [Diagram: Gearman stack]
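
The four steps on this slide can be sketched as a toy, in-process model. This is not the Gearman API: the `ToyJobServer` class, its methods and the handle format are hypothetical names made up for illustration, and a real job server talks to clients and workers over the network. The sketch assumes PHP 7.4 or later.

```php
<?php
// Toy model of client -> server -> worker. All names here are hypothetical;
// nothing in this block comes from the gearman extension.
final class ToyJobServer
{
    /** @var array<string, callable> workers, keyed by function name */
    private array $workers = [];
    /** @var array<int, array{0: string, 1: string}> queued background jobs */
    private array $queue = [];

    // A worker announces which function it can execute.
    public function register(string $function, callable $worker): void
    {
        $this->workers[$function] = $worker;
    }

    // Sync: the server hands the job to a worker and relays the result.
    public function submit(string $function, string $workload): string
    {
        if (!isset($this->workers[$function])) {
            throw new RuntimeException("no worker for '$function'");
        }
        return ($this->workers[$function])($workload);
    }

    // Async: the client only receives a handle; no result is reported back.
    public function submitBackground(string $function, string $workload): string
    {
        $this->queue[] = [$function, $workload];
        return 'H:toy:' . count($this->queue); // made-up handle format
    }

    // Drain the queue, standing in for workers running in the background.
    public function runQueued(): int
    {
        $done = 0;
        while ($item = array_shift($this->queue)) {
            [$function, $workload] = $item;
            $this->submit($function, $workload);
            $done++;
        }
        return $done;
    }
}

$server = new ToyJobServer();
$server->register('reverse', fn(string $w): string => strrev($w));

echo $server->submit('reverse', '123456'), "\n";           // sync: the result
echo $server->submitBackground('reverse', 'qwerty'), "\n"; // async: a handle
echo $server->runQueued(), " queued job(s) executed\n";
```

The synchronous call returns the worker's result to the caller, while the background call returns only a handle, which is the distinction the bracketed last step on the slide makes.
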
  8. Gearman server
    Install from source (latest C++ version: 1.1.5; old C version: 0.14); packages are outdated.
    Dependencies: boost-devel, libevent-devel, curl-devel (for tests), plus whatever -devel packages the persistent queue needs (mysql, sqlite3, memcached, redis, etc.)
    user / group / dir:
      `groupadd -r gearmand`
      `useradd -M -r -g gearmand -d /var/lib/gearmand -s /bin/false -c "Gearman Server" gearmand`
      `mkdir /var/lib/gearmand && chown -R gearmand:gearmand /var/lib/gearmand`
    Config: support/gearmand.init (not handled by `make install`; edit it before use): --pidfile, --log-file, --verbose [level]
  9. Gearman server
    • In-memory queue
    • Persistent queue
      – Memcached
      – Redis
      – SQLite
      – MySQL/Drizzle
      – PostgreSQL
      – TokyoCabinet
  10. Gearman server
    • Physical box:
      – CPU: low
      – IO: high
      – memory: depends... (but more is better)
    • Workload size and queue type:
      – disk / network IO
      – the workload has to fit into the queue
    • IO model:
      – continuous connection between server and workers
      – the server pushes tasks to the workers
  11. Gearman PHP extension
    • Install from source, requirements:
      – 0.8.* versions: libgearman v0.14
      – 1.0.* versions: libgearman v0.21
      – 1.1.* versions: libgearman v1.1.0
    • Config /etc/php.d/gearman.ini: extension="gearman.so"
  12. Gearman PHP extension
    Version 1.1.1: addServer(string $host, int $port)
    • The documentation is wrong: both host and port are required.
    • Without host and port, neither the client nor the worker can connect to the server. The client at least throws an error, but the worker silently does nothing.
  13. PHP example: job execution
    Single:
        <?php
        $gmclient = new GearmanClient();
        $gmclient->addServer('127.0.0.1', 4730);
        // Queue one background job; only the job handle comes back.
        $job_handle = $gmclient->doBackground("reverse", "123456", "uniq1");
        // save_somewhere() is a placeholder for your own storage code.
        save_somewhere("uniq1", $job_handle, '127.0.0.1', 4730);
    Parallel:
        <?php
        $gmclient = new GearmanClient();
        $gmclient->addServer('127.0.0.1', 4730);
        // Called once per task, as soon as the server has created the job.
        $gmclient->setCreatedCallback(
            function ($task) {
                save_somewhere($task->unique(), $task->jobHandle(), '127.0.0.1', 4730);
            }
        );
        // Distinct unique IDs: jobs that share a function name and a unique ID
        // are merged into one job by the server.
        $gmclient->addTaskBackground("reverse", "123456", null, "uniq1");
        $gmclient->addTaskBackground("reverse", "qwerty", null, "uniq2");
        $gmclient->addTaskBackground("reverse", "asdfgh", null, "uniq3");
        $gmclient->runTasks();
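
The client examples on this slide submit "reverse" jobs, but the slide shows no worker for them. A minimal worker might look like the sketch below, which reuses the worker loop from the progress-check slide. The names `reverse_fn` and `run_reverse_worker` and the `--work` flag are assumptions, not part of the talk. Running the worker needs the gearman extension and a gearmand listening on the given host and port.

```php
<?php
// Job callback: returns the reversed workload to the job server.
// $job is a GearmanJob in practice; any object with workload() works here.
function reverse_fn($job): string
{
    return strrev($job->workload());
}

// Connects to the job server and serves "reverse" jobs until an error occurs.
function run_reverse_worker(string $host, int $port): void
{
    $worker = new GearmanWorker();
    $worker->addServer($host, $port); // pass host and port explicitly
    $worker->addFunction('reverse', 'reverse_fn');

    while ($worker->work()) {
        if ($worker->returnCode() != GEARMAN_SUCCESS) {
            echo 'return_code: ' . $worker->returnCode() . "\n";
            break;
        }
    }
}

// Hypothetical guard: start the worker only when asked to, and only if the
// gearman extension is loaded, so the file can be included safely elsewhere.
if (extension_loaded('gearman') && in_array('--work', $argv ?? [], true)) {
    run_reverse_worker('127.0.0.1', 4730);
}
```

Saved as, say, `reverse_worker.php` (a hypothetical file name), it would be started with `php reverse_worker.php --work` before running either client script.
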
  14. PHP example: progress check
    Client:
        <?php
        $gmclient = new GearmanClient();
        $gmclient->addServer('127.0.0.1', 4730);
        // Start the background job and print its handle for the status script.
        $job_handle = $gmclient->doBackground("sheepcount", "0");
        print_r($job_handle);
    Status:
        <?php
        $gmclient = new GearmanClient();
        $gmclient->addServer('127.0.0.1', 4730);
        // $argv[1] is the job handle printed by the client script.
        $status = $gmclient->jobStatus($argv[1]);
        print_r($status);
    Worker:
        <?php
        echo "Starting\n";
        $gmworker = new GearmanWorker();
        $gmworker->addServer('127.0.0.1', 4730);
        $gmworker->addFunction("sheepcount", "sheepcount_fn");
        print "Waiting for job...\n";
        while ($gmworker->work()) {
            if ($gmworker->returnCode() != GEARMAN_SUCCESS) {
                echo "return_code: " . $gmworker->returnCode() . "\n";
                break;
            }
        }

        // Counts to $max and reports progress to the job server at every step.
        function sheepcount_fn($job) {
            echo "Received job: " . $job->handle() . "\n";
            for ($max = 1000, $x = 0; $x < $max; $x++) {
                echo "Sending status: $x/$max sheep!\n";
                $job->sendStatus($x, $max);
                sleep(3);
            }
        }
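
The status script on this slide prints the raw array. One possible way to interpret it is sketched below, assuming the layout the PHP manual gives for `GearmanClient::jobStatus()`: known, running, numerator, denominator. `describe_status` and `wait_for_job` are hypothetical helper names, and reading "known but not running" as "queued" is an interpretation rather than something the slides state.

```php
<?php
// Turns the four-element array from GearmanClient::jobStatus() into text.
// Assumed layout: [known, running, numerator, denominator].
function describe_status(array $status): string
{
    [$known, $running, $numerator, $denominator] = $status;
    if (!$known) {
        return 'unknown (finished, or never submitted to this server)';
    }
    if (!$running) {
        return 'queued';
    }
    $percent = $denominator > 0 ? (int) floor(100 * $numerator / $denominator) : 0;
    return "running: $numerator/$denominator ($percent%)";
}

// Polls the job server until it no longer knows the handle.
function wait_for_job(GearmanClient $client, string $handle, int $pauseSeconds = 3): void
{
    do {
        $status = $client->jobStatus($handle);
        echo describe_status($status), "\n";
        if ($status[0]) {
            sleep($pauseSeconds);
        }
    } while ($status[0]);
}

// Offline demonstration with a hand-written status array (no server needed).
echo describe_status([true, true, 250, 1000]), "\n"; // running: 250/1000 (25%)
```

In the status script, `wait_for_job($gmclient, $argv[1]);` could replace the single `jobStatus()` call to follow the sheep count until the job ends.
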
  15. Information
    • Documentation: http://www.gearmand.org/
    • Server source: https://launchpad.net/gearmand
    • PHP extension: http://pecl.php.net/gearman
    slides: https://joind.in/8237
    twitter: @vgabor