Slide 1

Slide 1 text

Large-Scale Architecture Supported by PHP ver1.1 takezawa yuuki builderscon tokyo 2017

Slide 2

Slide 2 text

Notice What follows is an approach to so-called big data; it does not apply to typical applications

Slide 3

Slide 3 text

Lambda/Kappa Architecture

Slide 4

Slide 4 text

A typical web service • PHP • MySQL, PostgreSQL, Oracle, SQL Server • Apache, Nginx etc

Slide 5

Slide 5 text

Hmm? Search is getting slow. The indexes are properly in place, though...

Slide 6

Slide 6 text

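A growing web service • LIKE clauses issued from the front end against tens of millions of records • Full-text search and similar tools are added to improve performance and cover what an RDBMS handles poorly • Logs written at hundreds of rows per minute for page-view aggregation • Problems that never mattered in a small service start to surface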

Slide 7

Slide 7 text

Pattern 1 • The web application inserts into the database, then inserts into Elasticsearch or a similar store • The application stays in control, but its code bloats

Slide 8

Slide 8 text

Application Database Elasticsearch

Slide 9

Slide 9 text

/**
 * @Transactional
 *
 * @param ProductEntity $entity
 */
public function register(ProductEntity $entity)
{
    $this->productRepository->insert($entity);
    $this->elasticRepository->index([$entity]);
}

Slide 10

Slide 10 text

Part 2

Slide 11

Slide 11 text

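Pattern 2 • The web application inserts into the database, and a periodically scheduled batch inserts into the search index afterwards • The web application only inserts into the database • The batch checks how far it has already indexed and creates only the entries that are still missing • It is not real time, however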
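The "create only what is missing" step of Pattern 2 can be sketched as a checkpointed batch. This is a minimal illustration under assumptions, not code from the talk: plain arrays stand in for the database table and the search index, and `runIndexBatch` and the id-based checkpoint are hypothetical names.

```php
<?php
// Hypothetical stand-ins: $rows mimics a database table, $index mimics the
// search index, and the integer checkpoint is the highest id already indexed.

/**
 * Index only rows newer than the checkpoint and return the new checkpoint.
 *
 * @param array $rows  rows shaped like ['id' => int, 'name' => string]
 * @param array $index search index keyed by row id (modified in place)
 */
function runIndexBatch(array $rows, array &$index, int $checkpoint): int
{
    $newCheckpoint = $checkpoint;
    foreach ($rows as $row) {
        if ($row['id'] <= $checkpoint) {
            continue; // already indexed by an earlier run
        }
        $index[$row['id']] = $row['name'];
        $newCheckpoint = max($newCheckpoint, $row['id']);
    }
    return $newCheckpoint;
}

$rows = [
    ['id' => 1, 'name' => 'apple'],
    ['id' => 2, 'name' => 'banana'],
];
$index = [];
$checkpoint = runIndexBatch($rows, $index, 0);            // first run: ids 1 and 2

$rows[] = ['id' => 3, 'name' => 'cherry'];                // inserted later by the web app
$checkpoint = runIndexBatch($rows, $index, $checkpoint);  // second run: only id 3
```

The delay between the insert and the next batch run is exactly the "not real time" cost noted on the slide.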

Slide 12

Slide 12 text

Application Database Elasticsearch Batch

Slide 13

Slide 13 text

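Problems that arise • Fetching large numbers of records causes table locks • In a monolithic system, load concentrated on one database triggers cascading failures • Bulk-loading many records causes replication lag and outages • Disks fill up and cause outages • etc…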

Slide 14

Slide 14 text

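Pattern 3 • The web application inserts into the database, then publishes to a Message Queue through a Producer • The web application only inserts into the database • A Consumer reacts and creates the Elasticsearch index • Close to real time as long as no Message is lost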
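The Producer / Message Queue / Consumer flow can be sketched in memory. This is only an illustration under assumptions: `SplQueue` plays the message queue, a plain array plays the Elasticsearch index, and `produce`, `consume`, and the `product.created` message type are hypothetical names rather than anything shown in the talk.

```php
<?php
// In-memory stand-ins (assumptions): SplQueue is the message queue and
// $searchIndex is the search index. A real setup would use a broker client.

function produce(SplQueue $queue, array $product): void
{
    // The web application only announces that a row was inserted.
    $queue->enqueue(json_encode(['type' => 'product.created', 'payload' => $product]));
}

function consume(SplQueue $queue, array &$searchIndex): int
{
    $handled = 0;
    while (!$queue->isEmpty()) {
        $message = json_decode($queue->dequeue(), true);
        if ($message['type'] === 'product.created') {
            // The consumer, not the web application, updates the index.
            $searchIndex[$message['payload']['id']] = $message['payload'];
            $handled++;
        }
    }
    return $handled;
}

$queue = new SplQueue();
$searchIndex = [];
produce($queue, ['id' => 10, 'name' => 'keyboard']);
produce($queue, ['id' => 11, 'name' => 'mouse']);
$handled = consume($queue, $searchIndex);
```

Because indexing lives in the consumer, the web application code stays as small as in Pattern 2 while the index lags only by the time a message spends in the queue.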

Slide 15

Slide 15 text

Application Database Elasticsearch Message Queue Consumer Producer

Slide 16

Slide 16 text

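Problems we hit • Values required by one specific service were put into the message Definition, and the Queue clogged when that data was missing • Memory leaks in the Consumer • Database connections were never closed, leading to "connection is gone" errors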

Slide 17

Slide 17 text

Toward a larger application

Slide 18

Slide 18 text

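Toward a larger application as the business grows • We want to analyze user behavior • We want to use the search terms that many users enter for suggestions • We want to display content based on user behavior • We want to aggregate data from distributed services and offer new content • On to BigData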

Slide 19

Slide 19 text

Big Data + Fast Data

Slide 20

Slide 20 text

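Application challenges that come with BigData • Batch jobs run by each application no longer finish • Record counts balloon to hundreds of millions in a short period, and I/O becomes a harder limit than database indexes • Guaranteeing bounded replication lag is hard • Real-time analysis for tens of millions of users is hard • Normalized data definitely belongs in an RDBMS • How do we cope with distributed databases?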

Slide 21

Slide 21 text

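Approach to BigData • Aggregate the data itself • Prepare in advance any data that only needs to be aggregated up to the previous day • Message processing for data arriving in real time, plus data storage that can be distributed • Build the database + Elasticsearch combination described earlier at a larger scale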

Slide 22

Slide 22 text

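Adopting distributed storage • MongoDB and Couchbase were also considered; passed over because of driver instability and similar issues (still in use today) • Hadoop: HDFS as the distributed file system; MapReduce for distributed processing; sufficiently mature, with plenty of adoption cases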

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

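Lambda Architecture • Composed of a batch layer, a serving layer, and a speed layer • The batch layer handles aggregation of large data and analysis of high-volume data -> Hadoop (MapReduce), Spark • The serving layer serves the aggregated results of the batch layer: Hive, HBase, ElephantDB, Splout SQL, pipelineDB… • The speed layer serves the results of real-time processing: Spark, Storm, Kafka, Cassandra etc. • Values from both the serving layer and the speed layer are merged and returned; this is quite difficult... -> Move to a Kappa Architecture consolidated on Kafka and similar tools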
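The merge step is where much of that difficulty sits. A minimal sketch, assuming both layers expose page-view counts keyed by URL (the `mergeViews` helper and the sample data are hypothetical, not from the talk):

```php
<?php
/**
 * Combine the serving layer's precomputed totals with the speed layer's
 * counts for events that arrived after the last batch run.
 */
function mergeViews(array $batchView, array $realtimeView): array
{
    $merged = $batchView;
    foreach ($realtimeView as $key => $count) {
        $merged[$key] = ($merged[$key] ?? 0) + $count;
    }
    return $merged;
}

$batchView    = ['/home' => 1200, '/search' => 300]; // totals up to the last batch run
$realtimeView = ['/search' => 7, '/cart' => 2];      // events since the last batch run
$pageViews    = mergeViews($batchView, $realtimeView);
```

The sketch is only correct if the two views never count the same event twice, and keeping that boundary exact is part of what makes the Lambda Architecture hard to operate.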

Slide 25

Slide 25 text

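Stream processing • The goal of stream data processing is to process large volumes of data in real time • An approach for data that has no end and keeps arriving indefinitely • Data is processed in memory and then discarded • Commonly used for monitoring workloads • Also for applications that use sensors, etc.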

Slide 26

Slide 26 text

Spark • A distributed processing framework • Works with immutable collections called RDDs • Spark SQL • Spark Streaming • Spark MLlib

Slide 27

Slide 27 text

Kappa Architecture

Slide 28

Slide 28 text

Kappa Architecture

Slide 29

Slide 29 text

Built with OSS

Slide 30

Slide 30 text

Building it mainly in PHP is difficult...

Slide 31

Slide 31 text

Demo https://github.com/ytake/builderscon-example

Slide 32

Slide 32 text

Kappa Architecture(small) PHP ConsoleApp Kafka Spark Streaming PHP Consumer Cassandra PHP WebApp

Slide 33

Slide 33 text

Apache Cassandra

Slide 34

Slide 34 text

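Apache Cassandra • Chosen because it scales easily: we knew record counts would grow massively and planned to use the data for analysis • Usable from PHP (ext-cassandra) • Handles heavy write volumes • Simple transaction support • Clusters that span data centers • Availability and Partition Tolerance • SQL-like interface (CQL) • No single point of failure?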

Slide 35

Slide 35 text

Apache Cassandra Architecture

Slide 36

Slide 36 text

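Points to watch • It cannot be used well with an RDBMS mindset • Design carefully around the partition key • Ordering cannot be chosen per query condition • Use materialized views alongside • Analyze logs correctly when failures occur • Battle compaction (plan for 2x the capacity you actually use) • Articles found by searching mostly cover old versions, which differ completely from the current one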

Slide 37

Slide 37 text

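Table design • The Primary Key is the identifying key and also the partition key that decides which node stores the row • Searching across data on different nodes is a poor fit; if such a case arises, revisit the table design • The key must be included in updates and deletes • Secondary indexes are the most you can use • JOIN and LIKE do not exist, so handle complex queries in Spark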
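Why the partition key matters can be shown with a deliberately simplified placement function. This is an assumption-laden sketch: Cassandra itself places rows on a token ring (Murmur3Partitioner by default), while the hypothetical `nodeForPartitionKey` below just hashes the key with `crc32` and takes it modulo the node count.

```php
<?php
// Simplified model, NOT Cassandra's real algorithm. Assumes 64-bit PHP,
// where crc32() always returns a non-negative integer.

function nodeForPartitionKey(int $userId, array $nodes): string
{
    $token = crc32((string) $userId);
    return $nodes[$token % count($nodes)];
}

function groupByNode(array $rows, array $nodes): array
{
    $placement = [];
    foreach ($rows as $row) {
        // Every row with the same partition key lands on the same node.
        $placement[nodeForPartitionKey($row['user_id'], $nodes)][] = $row;
    }
    return $placement;
}

$nodes = ['node-a', 'node-b', 'node-c'];
$rows = [
    ['user_id' => 1, 'body' => 'first post'],
    ['user_id' => 1, 'body' => 'second post'],
    ['user_id' => 2, 'body' => 'hello'],
];
$placement = groupByNode($rows, $nodes);
```

A query for one `user_id` touches a single node, whereas a query that filters on any other column has to ask every node, which is why such cases call for a different table design.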

Slide 38

Slide 38 text

Table design
CREATE TABLE timeline.user_timeline (
    uuid uuid,
    user_id int,
    reference map,
    body text,
    is_read tinyint,
    published_at timestamp,
    PRIMARY KEY (user_id)
);

Slide 39

Slide 39 text

Determining record order
CREATE TABLE timeline.user_timeline (
    uuid uuid,
    user_id int,
    reference map,
    body text,
    is_read tinyint,
    published_at timestamp,
    -- published_at must be a clustering column for CLUSTERING ORDER BY to apply
    PRIMARY KEY (user_id, published_at)
) WITH CLUSTERING ORDER BY (published_at DESC);

Slide 40

Slide 40 text

MATERIALIZED VIEW
CREATE MATERIALIZED VIEW timeline.desc_user_timeline AS
    SELECT uuid, user_id, published_at, reference, body
    FROM timeline.user_timeline
    WHERE user_id IS NOT NULL
      AND published_at IS NOT NULL
      AND uuid IS NOT NULL
    PRIMARY KEY (user_id, published_at, uuid)
    WITH CLUSTERING ORDER BY (published_at DESC);

Slide 41

Slide 41 text

From PHP
$cluster = Cassandra::cluster()
    ->withContactPoints('10.0.1.24', 'localhost')
    ->withPort(9042)
    ->build();
// open a session; $keyspace is the keyspace that holds the users table
$session = $cluster->connect($keyspace);
$statement = $session->prepare(
    "UPDATE users SET age = ? WHERE user_name = ?"
);
$futures = array();
// execute all statements in background
foreach ($data as $arguments) {
    $futures[] = $session->executeAsync(
        $statement,
        ['arguments' => $arguments]
    );
}

Slide 42

Slide 42 text

PHP extension • Batch Statement: an interface optimized for batches • Can be used in parallel • Pagination is provided (using Generators) • Most Cassandra features are available, so it is fully usable without going through Java

Slide 43

Slide 43 text

Apache Kafka

Slide 44

Slide 44 text

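Apache Kafka • Stream support (indispensable in a Lambda Architecture) • Flexible clustering • A distributed system coordinated through ZooKeeper • Resilient to failures; messages can be fetched again • Adopted because it integrates easily with Spark and Storm • Messages are retained for a configured period even after delivery, so other query engines can read their contents • Usable from PHP (rdkafka)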

Slide 45

Slide 45 text

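Problems that occur with a Message Queue • Messages can be lost on the way from the Producer to the Broker • Cases where the Broker fails to receive • Cases where the Broker fails to send • Cases where a message is received more than once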
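For the duplicate-delivery case, one common mitigation (not something the slides prescribe) is to make the consumer idempotent by remembering which message ids it has already applied. The sketch below keeps that record in an array; `handleOnce` and the message shape are hypothetical, and a real consumer would persist the ids.

```php
<?php
/**
 * Apply a message at most once, keyed by its id.
 * Returns false when the message is a duplicate and was ignored.
 */
function handleOnce(array $message, array &$processedIds, array &$totals): bool
{
    if (isset($processedIds[$message['id']])) {
        return false; // duplicate delivery: skip the side effect
    }
    $processedIds[$message['id']] = true;
    $totals[$message['user']] = ($totals[$message['user']] ?? 0) + $message['amount'];
    return true;
}

$processedIds = [];
$totals = [];
$delivered = [
    ['id' => 'm1', 'user' => 'alice', 'amount' => 100],
    ['id' => 'm1', 'user' => 'alice', 'amount' => 100], // redelivered by the broker
    ['id' => 'm2', 'user' => 'bob', 'amount' => 50],
];
$applied = 0;
foreach ($delivered as $message) {
    if (handleOnce($message, $processedIds, $totals)) {
        $applied++;
    }
}
```

This covers duplicates only; messages lost before they reach the Broker still need acknowledgements and retries on the Producer side.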

Slide 46

Slide 46 text

0.11 • Exactly-once delivery and transactional messaging • Delivered exactly once, and reliably • Transactions for sending and receiving messages! • Even more robust

Slide 47

Slide 47 text

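Partition • Designed for parallel and distributed processing • A topic is split into Partitions, and Producers and Consumers can access any Partition • Freedom to subdivide work and make it more efficient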

Slide 48

Slide 48 text

Partition

Slide 49

Slide 49 text

BigData starts from PHP

Slide 50

Slide 50 text

Presto

Slide 51

Slide 51 text

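Data extraction problems at scale • We want to look at data stored in HDFS • We want it joined with data stored in an RDBMS • We want data aggregated and extracted from a business perspective • Distributed databases make this hard • We want to use it casually for analysis -> An unreasonable demand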

Slide 52

Slide 52 text

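Data mart • Collect the required data and consolidate it in a database • Because it is built by batch processing, getting data immediately is hard • The data mart itself needs maintenance and operation (feasible if that is someone's job) • Designing a data mart is difficult and the bar is high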

Slide 53

Slide 53 text

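What is Presto? • A query engine from Facebook that retrieves data interactively from large-scale datasets • Connecting to HDFS from a front-end application and visualizing data instantly is hard • Hive is meant for batch processing (MapReduce), so returning results in a few seconds is impossible • Solves needs such as "I want to connect directly to the RDBMS!"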

Slide 54

Slide 54 text

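What is Presto? • Provides a SQL interface • Cassandra, Hive, Kafka • MongoDB, MySQL, PostgreSQL, SQLServer • Redis, Thrift • Unsupported sources can be added to some extent just by implementing a driver in Java • Supports INSERT and more in addition to SELECT, so it also helps with database migration and with covering a distributed architecture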

Slide 55

Slide 55 text

What is Presto?

Slide 56

Slide 56 text

What is Presto? • JDBC support • From PHP: xtendsys-labs/php-presto-client, ytake/php-presto-client

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

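Summary • Applications grow more complex, and solving the resulting challenges is the real pleasure of growing large • Admin tools that support the business, built in PHP • A BigData + FastData architecture that starts from PHP • PHP can make a big contribution too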

Slide 59

Slide 59 text

PHP: supporting everything from web applications to BigData