
Scalable Application Development with Apache Kafka

Slides used at PHP Conference 2017

yuuki takezawa

October 08, 2017


Transcript

  1. Simple application structure

         public function __invoke(Comment $comment)
         {
             $query = $comment->query();
             // The column list includes 'id' so that it matches the keys passed to values()
             $query->insert(['title', 'body', 'id'])
                 ->values([
                     'title' => 'First post',
                     'body' => 'Some body text',
                     'id' => $this->auth->getId(),
                 ])
                 ->execute();
         }

  2. An application growing more complex

         public function __invoke(Comment $comment, Movie $movie)
         {
             $query = $comment->query();
             $commentId = $query->insert(['title', 'body', 'comment_id'])
                 ->values([
                     'title' => 'First post',
                     'body' => 'Some body text',
                     'comment_id' => $this->auth->getId(),
                 ])
                 ->execute();

             $movieQuery = $movie->query();
             $movieQuery->insert(['movie_id', 'comment_id'])
                 ->values([
                     'movie_id' => $movie->getId(),
                     'comment_id' => $commentId,
                 ])
                 ->execute();
         }

  3. An application becoming bloated

         public function __invoke(
             Comment $comment,
             Movie $movie,
             Follow $follow,
             Company $company
         ) {
             $movieCommentId = false;
             if ($follow->isFollow()) {
                 $query = $comment->query();
                 $commentId = $query->insert(['title', 'body', 'comment_id'])
                     ->values([
                         'title' => 'First post',
                         'body' => 'Some body text',
                         'comment_id' => $this->auth->getId(),
                     ])
                     ->execute();

                 $movieQuery = $movie->query();
                 $movieCommentId = $movieQuery->insert(['movie_id', 'comment_id'])
                     ->values([
                         'movie_id' => $movie->getId(),
                         'comment_id' => $commentId,
                     ])
                     ->execute();
             }
             if ($movie->isCompany() && $movieCommentId) {
                 $company->appendMovie($movie->getId());
                 // send email
             }
         }

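  4. RabbitMQ
     • Advanced Message Queuing Protocol (AMQP)
     • Extensive track record in enterprise applications
     • Messages can be persisted and replicated
     • Uses transactions to process messages reliably
     • Lazy Queues, added in v3.6.0, allow large volumes of data to be stored
     • Failover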
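  5. Apache Kafka
     • High availability through clustering with Zookeeper
     • Messages can be persisted, replicated, and re-fetched
     • Built for big data
     • Fast, because it uses the file system with sequential access
     • Messaging middleware with stream support
     • Integrates well with surrounding systems through Kafka Connect
     (roughly equivalent to Amazon Kinesis)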
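  6. Apache Kafka overview
     • Producer
       Publishes messages.
       Uses the client library for each language.
     • Consumer
       Subscribes to messages.
       Consumed messages are not discarded; they are retained for a fixed period.
     • Broker
       Kafka itself: the queue between Producers and Consumers.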
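  7. Apache Kafka overview
     • Topic
       Messages from Producers are stored in a Topic.
       Messages are uniquely managed and processed FIFO (per Partition, described next).
     • Partition
       Used for load distribution.
       Multiple Consumers each read their own Partition and process it independently.
       The design of the processing flow allows many different usage patterns.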
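The Topic and Partition behaviour on slides 6 and 7 can be illustrated with a toy in-memory model. This is only a sketch, not how Kafka is implemented: `InMemoryTopic`, its methods, and the `crc32`-based key hashing are all assumptions made for illustration (Kafka's own clients use their own partitioners). It shows three properties: the same key always lands in the same partition, order is kept within a partition, and reading does not remove messages.

```php
<?php
declare(strict_types=1);

/**
 * Toy in-memory topic (hypothetical, for illustration only).
 * Each partition is an append-only list of payloads.
 */
final class InMemoryTopic
{
    /** @var array<int, string[]> partition id => payloads in arrival order */
    private $partitions = [];

    public function __construct(int $partitionCount)
    {
        for ($p = 0; $p < $partitionCount; $p++) {
            $this->partitions[$p] = [];
        }
    }

    /** Appends a payload and returns the partition chosen for its key. */
    public function produce(string $key, string $payload): int
    {
        // crc32 is a stand-in hash; it is not the partitioner Kafka clients use.
        $partition = crc32($key) % count($this->partitions);
        $this->partitions[$partition][] = $payload;

        return $partition;
    }

    /** Returns payloads from $offset onward without deleting anything. */
    public function read(int $partition, int $offset): array
    {
        return array_slice($this->partitions[$partition], $offset);
    }
}

$topic = new InMemoryTopic(2);
$pA = $topic->produce('user-1', 'first');
$pB = $topic->produce('user-1', 'second'); // same key, same partition
$topic->produce('user-2', 'other');

$firstRead  = $topic->read($pA, 0);
$secondRead = $topic->read($pA, 0); // reading again yields the same messages
```

Because a consumer only tracks an offset, re-fetching older messages amounts to reading again from a smaller offset, which is what makes the "re-fetch" point on slide 5 cheap.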
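  8. Kafka Connect
     • Kafka Connect supports two kinds of connectors: importing data from surrounding systems (Source) and sending data out to them (Sink)
     • For example, it can pull data from Amazon SQS or MongoDB into Kafka, or store messages as-is in Elasticsearch or an RDBMS
     • Connect is freely extensible, so you can implement your own connectors (Java, Scala)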
  9. Message delivery guarantees
     • At least once semantics
       Duplicates are allowed.
     • At most once semantics
       Loss is allowed.
     • Exactly once semantics
       Neither duplicates nor loss are allowed.

     See "Exactly-once Semantics are Possible: Here’s How Kafka Does it". Confluent APACHE KAFKA.
     https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
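The difference between the first two guarantees comes down to whether the consumer records its offset before or after doing the work. The following is a minimal simulation under stated assumptions: `consumeWithOneCrash` is a hypothetical helper, the "crash" is a thrown exception, and the final de-duplication step is a consumer-side idempotency trick, not the producer and transaction mechanism the Confluent article describes for Kafka's exactly-once support.

```php
<?php
declare(strict_types=1);

/**
 * Simulates a consumer that crashes once while handling offset $crashAt,
 * then restarts from the last committed offset. Hypothetical helper.
 *
 * @param string[] $log         messages in partition order
 * @param bool     $commitFirst true: commit before processing (at-most-once)
 *                              false: commit after processing (at-least-once)
 * @return string[] messages whose processing actually took effect
 */
function consumeWithOneCrash(array $log, bool $commitFirst, int $crashAt): array
{
    $committed = 0;   // offset to resume from after a restart
    $processed = [];
    $crashed = false;

    while ($committed < count($log)) {
        try {
            for ($offset = $committed; $offset < count($log); $offset++) {
                if ($commitFirst) {
                    $committed = $offset + 1; // offset stored before the work
                }
                if ($commitFirst && !$crashed && $offset === $crashAt) {
                    $crashed = true;
                    throw new RuntimeException('crash before processing');
                }
                $processed[] = $log[$offset]; // the side effect
                if (!$commitFirst && !$crashed && $offset === $crashAt) {
                    $crashed = true;
                    throw new RuntimeException('crash before commit');
                }
                if (!$commitFirst) {
                    $committed = $offset + 1; // offset stored after the work
                }
            }
        } catch (RuntimeException $e) {
            // Restart: the outer loop resumes from $committed.
        }
    }

    return $processed;
}

$atMostOnce  = consumeWithOneCrash(['a', 'b', 'c'], true, 1);  // 'b' is lost
$atLeastOnce = consumeWithOneCrash(['a', 'b', 'c'], false, 1); // 'b' is handled twice

// De-duplicating on the consumer side (here by payload, in practice by a
// message id) turns at-least-once delivery into effectively-once processing.
$deduplicated = array_values(array_unique($atLeastOnce));
```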
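  10. Adopting Apache Kafka
     • Stream support is easy when implementing analytics processing
     • Integration with data storage through Kafka Connect
     • Easy integration with Facebook Presto (data can be served joined with an RDBMS and other sources)
     • Easy to build a messaging system that scales, for decoupling dependencies
     • Re-fetching messages and recovering from failures is easy
     • A PHP extension is available (rdkafka)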
  11. Rdkafka
     • An extension that makes librdkafka (C/C++) usable from PHP
     • High Level API & Low Level API
     • Supports PHP 5 and 7
     • You need to understand the librdkafka and Kafka configuration settings
  12. Rdkafka Producer implementation

         $rk = new RdKafka\Producer();
         $rk->setLogLevel(LOG_DEBUG);
         $rk->addBrokers("127.0.0.1");

         $topic = $rk->newTopic("test");
         $topic->produce(
             RD_KAFKA_PARTITION_UA,
             0,
             json_encode(['message' => 'phpcon'])
         );

  13. Rdkafka Producer implementation (code repeated from slide 12, with a callout)
     • $rk->addBrokers("127.0.0.1"): for a cluster, specify the brokers as a comma-separated list
  14. Rdkafka Producer implementation (code repeated from slide 12, with callouts)
     • $rk->addBrokers("127.0.0.1"): for a cluster, specify the brokers as a comma-separated list
     • $rk->newTopic("test"): the Topic name
     • $topic->produce(RD_KAFKA_PARTITION_UA, ...): specifies the Partition
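The producer slides stop at `produce()`, which only places the message on librdkafka's internal queue. A short-lived PHP script can exit before the message is sent, so it is common to poll until the queue drains. The helper below is a sketch: `drainProducer` and its default timings are assumptions, while `getOutQLen()` and `poll()` are methods the rdkafka extension exposes on the producer. It needs the extension and a reachable broker to do anything useful.

```php
<?php
declare(strict_types=1);

/**
 * Polls until the outgoing queue is empty or the time budget is spent.
 * Hypothetical helper; the timing defaults are arbitrary.
 *
 * @return bool true when no messages remain queued
 */
function drainProducer(RdKafka\Producer $producer, int $maxWaitMs = 10000, int $pollMs = 50): bool
{
    $waited = 0;
    while ($producer->getOutQLen() > 0 && $waited < $maxWaitMs) {
        $producer->poll($pollMs); // serves callbacks while messages are sent
        $waited += $pollMs;
    }

    return $producer->getOutQLen() === 0;
}

// Usage, assuming $rk and $topic are set up as on slide 12:
//   $topic->produce(RD_KAFKA_PARTITION_UA, 0, json_encode(['message' => 'phpcon']));
//   if (!drainProducer($rk)) {
//       error_log('Some messages were still queued when the time budget ran out');
//   }
```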
  15. Rdkafka Consumer implementation

         $conf = new RdKafka\Conf();
         $conf->set('group.id', 'myConsumerGroup');

         $rk = new RdKafka\Consumer($conf);
         $rk->addBrokers("127.0.0.1");

         $topicConf = new RdKafka\TopicConf();
         $topicConf->set('auto.commit.interval.ms', 100);
         $topicConf->set('offset.store.method', 'file');
         $topicConf->set(
             'offset.store.path',
             sys_get_temp_dir()
         );
         $topicConf->set('auto.offset.reset', 'smallest');

  16. Rdkafka Consumer implementation (code repeated from slide 15, with a callout)
     • $conf->set('group.id', 'myConsumerGroup'): use the same name when several consumers process the same topic in parallel
  17. Rdkafka Consumer implementation (code repeated from slide 15, with callouts)
     • $conf->set('group.id', 'myConsumerGroup'): use the same name when several consumers process the same topic in parallel
     • $topicConf->set(...): specifies the Consumer settings
  18. Rdkafka Consumer implementation

         $topic = $rk->newTopic("test", $topicConf);
         $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);

         while (true) {
             $message = $topic->consume(0, 120*10000);
             switch ($message->err) {
                 case RD_KAFKA_RESP_ERR_NO_ERROR:
                     var_dump($message);
                     break;
                 case RD_KAFKA_RESP_ERR__PARTITION_EOF:
                     echo "No more messages; will wait for more\n";
                     break;
                 case RD_KAFKA_RESP_ERR__TIMED_OUT:
                     echo "Timed out\n";
                     break;
                 default:
                     throw new \Exception($message->errstr(), $message->err);
                     break;
             }
         }

  19. Rdkafka Consumer implementation (code repeated from slide 18, with a callout)
     • $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED): which Partition to use, and the position to start reading from
  20. Rdkafka Consumer implementation (code repeated from slide 18, with callouts)
     • $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED): which Partition to use, and the position to start reading from
     • case RD_KAFKA_RESP_ERR_NO_ERROR: run whatever processing you need when a message has been received successfully
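The consumer slides use the Low Level API, where the partition and start offset are chosen by hand. For comparison, here is a sketch of the High Level API mentioned on slide 11, in which the consumer joins a group and partitions are assigned to it. `createGroupConsumer` is a hypothetical wrapper; `RdKafka\KafkaConsumer`, `subscribe()`, and `consume()` are the extension's own names. The broker address, group name, and topic are placeholders, and running it requires the rdkafka extension and a broker.

```php
<?php
declare(strict_types=1);

/**
 * Builds a High Level API consumer subscribed to the given topics.
 * Hypothetical wrapper around the rdkafka extension's KafkaConsumer.
 *
 * @param string   $brokers comma-separated broker list
 * @param string   $groupId consumers sharing this id split the partitions
 * @param string[] $topics  topic names to subscribe to
 */
function createGroupConsumer(string $brokers, string $groupId, array $topics): RdKafka\KafkaConsumer
{
    $conf = new RdKafka\Conf();
    $conf->set('group.id', $groupId);
    $conf->set('metadata.broker.list', $brokers);

    $consumer = new RdKafka\KafkaConsumer($conf);
    $consumer->subscribe($topics); // partitions are assigned by the group

    return $consumer;
}

// Usage, with placeholder values:
//   $consumer = createGroupConsumer('127.0.0.1', 'myConsumerGroup', ['test']);
//   while (true) {
//       $message = $consumer->consume(120 * 1000); // timeout in milliseconds
//       if ($message->err === RD_KAFKA_RESP_ERR_NO_ERROR) {
//           var_dump($message->payload);
//       }
//   }
```

The error handling in the loop can follow the same `switch` on `$message->err` shown on slide 18.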
  21. Apache Kafka GUI
     • Cluster management
       https://github.com/yahoo/kafka-manager
     • Message management
       https://github.com/landoop/kafka-topics-ui
       https://github.com/ldaniels528/trifecta
       and others