Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Speaker Deck
PRO
Sign in
Sign up
for free
PHPで支える大規模アーキテクチャ
yuuki takezawa
August 05, 2017
Technology
1
5.4k
PHPで支える大規模アーキテクチャ
このスライドはPHPカンファレンス関西2017でおこなったものに、presto追加とシンプルなデモを追加したものです
yuuki takezawa
August 05, 2017
Tweet
Share
More Decks by yuuki takezawa
See All by yuuki takezawa
ytake
4
1.1k
ytake
0
5.4k
ytake
3
6.9k
ytake
2
2.6k
ytake
7
10k
ytake
3
270
ytake
1
1.7k
ytake
4
1.8k
ytake
1
3.4k
Other Decks in Technology
See All in Technology
mii3king
0
400
binarymist
0
1.2k
torisoup
10
4.9k
kanaugust
PRO
0
150
yamamuteki
2
160
miyakemito
1
520
norioikedo
0
210
gamella
3
1.4k
sansandsoc
1
380
fufuhu
3
130
you
0
270
whitefox_73
0
180
Featured
See All Featured
dotmariusz
94
5.5k
ddemaree
273
31k
bryan
100
11k
kneath
295
39k
dougneiner
119
7.9k
phodgson
87
3.9k
chrislema
173
14k
akmur
252
19k
shpigford
369
42k
caitiem20
308
17k
mongodb
23
3.9k
keathley
20
700
Transcript
PHPͰࢧ͑Δ େنΞʔΩςΫνϟ ver1.1 takezawa yuuki <ytake> builderscon tokyo 2017
Notice ͜Ε͔Β͓͢͠Δ༰ɺ ॴҦϏοάσʔλʹରͯ͠ͷ ΞϓϩʔνͰ ҰൠతͳΞϓϦέʔγϣϯʹ ͯ·Γ·ͤΜ
Lambda/Kappa Architecture
ҰൠతͳwebαʔϏε • PHP • MySQL, PostgreSQL, Oracle, SQL Server •
Apache, Nginx etc
͓ʁݕࡧ͕͘ͳ͖ͬͯͨͧ Index͖ͪΜͱ͋ΔΜ͚ͩͲɾ
େ͖͘ͳ͖ͬͯͨwebαʔϏε • ઍສϨίʔυʹରͯ͠ϑϩϯτ͔ΒLIKE۟ • ύϑΥʔϚϯεվળͷͨΊʹશจݕࡧͳͲΛՃ͠ RDBMSͷ͍͠ͱ͜ΖΛิ͏ • PVूܭͷͨΊʹຖඦߦ͕ॻ͖ࠐ·ΕΔϩά • খ͍͞αʔϏεͰʹͳΒͳ͔ͬͨͷ͕ݟ͑࢝
ΊΔ
ύλʔϯ1 • webΞϓϦέʔγϣϯଆͰɺσʔλϕʔεʹૠೖޙɺ ElasticseachͳͲʹૠೖ͢Δύλʔϯ • ΞϓϦέʔγϣϯଆͰίϯτϩʔϧͰ͖Δ͕ɺ ΞϓϦέʔγϣϯͷίʔυ͕ංେԽ
Application Database Elasticsearch
/** * @Transactional * * @param ProductEntity $entity */ public
function register(ProductEntity $entity) { $this->productRepository->insert($entity); $this->elasticRepository->index([$entity]); }
ͦͷ2
ύλʔϯ2 • webΞϓϦέʔγϣϯଆͰɺσʔλϕʔεʹૠೖޙɺ ఆظ࣮ߦ͞ΕΔόονͰૠೖ͢Δύλʔϯ • webΞϓϦέʔγϣϯଆͰσʔλϕʔεʹૠೖͷΈ • batchͰͲ͜·Ͱ࡞͔ͨ֬͠ೝ͠ͳ͕Βɺ ະ࡞ͷͷͷΈ࡞͢Δ •
ͨͩ͠ϦΞϧλΠϜͰͳ͍
Application Database Elasticsearch Batch
ൃੜ͢Δ • େྔͷϨίʔυΛऔಘ͢ΔͱςʔϒϧϩοΫ • ϞϊϦγοΫͳγεςϜͰɺಛఆͷσʔλϕʔεʹूத ͨ͠߹ʹɺҶͮΔࣜʹো • େྔϨίʔυҰׅೖͰϨϓϦέʔγϣϯԆͰো • σΟεΫᷓΕͰো
etc…
ύλʔϯ3 • webΞϓϦέʔγϣϯଆͰɺσʔλϕʔεʹૠೖޙɺ ProducerΛհͯ͠Message Queueૠೖ • webΞϓϦέʔγϣϯଆͰσʔλϕʔεʹૠೖͷΈ • Consumer͕Ԡ͠ɺElasticsearchͷindexΛ࡞ •
Message͕ফࣦ͠ͳ͍ݶΓϦΞϧλΠϜʹ͍ۙ
Application Database Elasticsearch Message Queue Consumer Producer
ൃੜͨ͠ • ಛఆͷαʔϏε͕ඞཁͱ͢ΔΛDefinitionʹೖΕͯ͠ ·͍ɺσʔλෆͰQueue٧·Γ • ConsumerͰϝϞϦϦʔΫ • σʔλϕʔείωΫγϣϯΫϩʔζͤͣʹ connection is
gone
ΑΓେ͖ͳΞϓϦέʔγϣϯ
ࣄۀʹΑΔେ͖ͳΞϓϦέʔγϣϯ • ϢʔβʔͷߦಈΛੳ͍ͨ͠ • ଟ͘ͷϢʔβʔʹར༻͞Ε͍ͯΔݕࡧจࣈΛαδΣετ ʹར༻͍ͨ͠ • Ϣʔβʔͷߦಈʹج͍ͮͨίϯςϯπΛද͍ࣔͤͨ͞ • ࢄͨ͠αʔϏεͷσʔλΛूͯ͠৽͍͠ίϯςϯ
πΛఏڙ͍ͨ͠ • BigData
Big Data + Fast Data
BigDataʹ͏ΞϓϦέʔγϣϯͷ՝ • ͦΕͧΕͷΞϓϦέʔγϣϯͰ࣮ߦ͍ͯͨ͠όονॲ ཧ͕ऴΘΒͳ͍ • Ϩίʔυ͍ظؒͰԯͱେʹͳΓɺ σʔλϕʔεͷindexΑΓI/O͕ݫ͍͠ • ϨϓϦέʔγϣϯԆ୲อ͕͍͠ •
ઍສϢʔβʔͷϦΞϧλΠϜͷੳΛ͢Δʹݫ͍͠ • ਖ਼نԽͨ͠σʔλઈରʹRDBMS • ࢄͨ͠σʔλϕʔεʹͲ͏ཱ͔ͪ͏͔
BigDataͷΞϓϩʔν • σʔλͦͷͷͷू • લ·Ͱʹूܭ͓͚ͯ͠ྑ͍σʔλΛ͋Β͔͡Ί༻ ҙ͢Δ • ϦΞϧλΠϜʹೖྗ͞ΕΔσʔλʹରͯ͠ͷ MessageॲཧͱɺࢄՄೳͳσʔλετϨʔδ •
લड़ͷdatabase, elasticsearchซ༻ͷऔΓΈΛɺ ΑΓେ͖ͳεέʔϧͰߏங͢Δ
ࢄετϨʔδ࠾༻ • MongoDBCouchbaseݕ౼ υϥΠόʔपΓͷෆ҆ఆ͞ͳͲͰݟૹΓ(ݱࡏར༻த) • Hadoop ࢄϑΝΠϧγεςϜͷHDFS ࢄॲཧͷͨΊͷMapReduce ेރΕ͍ͯΔɺ࠾༻ࣄྫे
None
ϥϜμΞʔΩςΫνϟ • όονɺαʔϏεɺεϐʔυͰߏ • όονɺେ͖ͳσʔλͷूܭɺେྔσʔλͷੳͳͲΛ୲ ͢Δ -> Hadoop(MapReduce), Spark •
αʔϏεόονͷू݁ՌΛఏڙ͢Δ Hive, HBase, ElephantDB, Splout SQL, pipelineDB… • εϐʔυϦΞϧλΠϜॲཧͷ݁ՌΛఏڙ͢Δ Spark, Storm, Kafka, Cassandra etc.. • αʔϏεͱεϐʔυͷ྆ํͷΛϚʔδͯ͠ฦ٫ қߴ͍ɾɾɾ -> KafkaͳͲʹूͤͨ͞Kappa Architecure
ετϦʔϜॲཧ • େྔͷσʔλΛϦΞϧλΠϜͰॲཧ͢Δͷ͕ɺ ετϦʔϜσʔλॲཧͷత • ऴΘΓ͕ͳ͘ɺແݶʹͬͯ͘ΔͷͷΞϓϩʔν • ϝϞϦͰॲཧ͞Εɺͦͷޙഁغ͞ΕΔ • ࢹܥͷॲཧΑ͘ར༻͞Ε͍ͯΔͷ
• ηϯαʔΛར༻ͨ͠ΞϓϦέʔγϣϯͳͲ
Spark • ࢄॲཧϑϨʔϜϫʔΫͷҰͭ • RDDͱݺΕΔΠϛϡʔλϒϧͳίϨΫγϣϯΛѻ͏ • Spark SQL • Spark
Streaming • Spark MLlib
KappaΞʔΩςΫνϟ
KappaΞʔΩςΫνϟ
OSSͰߏங
PHPϝΠϯͰ࡞Δ͜ͱ͍͠ɾɾ
σϞ https://github.com/ytake/ builderscon-example
Kappa Architecture(small) PHP ConsoleApp Kafka Spark Streaming PHP Consumer Cassandra
PHP WebApp
Apache Cassandra
Apache Cassandra • Ϩίʔυ͕େྔʹ૿͑Δ͜ͱ͕Θ͔͓ͬͯΓɺ ੳʹར༻͢Δ༧ఆͰ͋ͬͨͨΊɺ εέʔϧ͕༰қͱ͍͏Ͱ࠾༻ • PHP͔Βར༻Մೳ(ext-cassandra) • େྔσʔλͷॻ͖ࠐΈʹରԠ
• ؆୯ͳτϥϯβΫγϣϯαϙʔτ • σʔληϯλʔލ͗ͷΫϥελʔߏங • Availability ͱ Partition Tolerance • SQLΠϯλʔϑΣʔε • ୯Ұোͳ͠???????
Apache Cassandra Architecture
ؾΛ͚ͭΔ • RDBMSײ֮Ͱ͏·͘ར༻Ͱ͖·ͤΜ • ύʔςΟγϣϯΩʔͰ͏·͘ઃܭ͢Δ • ݅ʹΑΔΦʔμʔࢦఆͰ͖ͳ͍ • ϚςϦΞϥΠζυϏϡʔซ༻͢͠ •
ো࣌ͷϩάੳͨͩ͘͠ • ίϯύΫγϣϯͱઓ͏(࣮ࡍʹར༻͢Δ༰ྔ*2Ͱܭࢉ) • ݕࡧͰҾ͔͔ͬΔهࣄେମچόʔδϣϯͰɺ ݱߦͱશ͘ผ
ςʔϒϧઃܭ • Primary KeyࣝผΩʔͰ͋Γͳ͕Βɺ Ͳͷnodeʹ֨ೲ͢Δ͔Λܾఆ͢ΔύʔςΟγϣϯΩʔ • ҟͳΔnodeʹ͋Δͷͷݕࡧ͔ͳ͍ ඞཁͳέʔε͕ੜͨ͡߹ςʔϒϧઃܭΛݟ͢ • ߋ৽࣌ɺআ࣌ʹؚΊͳ͚ΕͳΒͳ͍
• ར༻ՄೳͳͷηΧϯμϦΠϯσοΫε·Ͱ • JOINLIKEଘࡏ͠ͳ͍ͨΊɺෳࡶͳͷSparkͰ
ςʔϒϧઃܭ CREATE TABLE timeline.user_timeline ( uuid uuid, user_id int, reference
map<text, text>, body text, is_read tinyint, published_at timestamp, PRIMARY KEY (user_id) );
Ϩίʔυͷॱ൪Λܾఆ͢Δ CREATE TABLE timeline.user_timeline ( uuid uuid, user_id int, reference
map<text, text>, body text, is_read tinyint, published_at timestamp, PRIMARY KEY (user_id) ) WITH CLUSTERING ORDER BY (published_at DESC);
MATERIALIZED VIEW CREATE MATERIALIZED VIEW timeline.desc_user_timeline AS SELECT uuid, user_id,
published_at, reference, body FROM timeline.user_timeline WHERE user_id IS NOT NULL AND published_at IS NOT NULL AND uuid IS NOT NULL PRIMARY KEY (user_id, published_at, uuid) WITH CLUSTERING ORDER BY (published_at DESC);
From PHP $cluster = Cassandra::cluster() ->withContactPoints('10.0.1.24', ‘localhost') ->withPort(9042) ->build(); $statement
= $session->prepare( "UPDATE users SET age = ? WHERE user_name = ?” ); $futures = array(); // execute all statements in background foreach ($data as $arguments) { $futures[] = $session->executeAsync( $statement, [ ‘arguments' => $arguments ]; }
PHP extension • Batchʹ࠷దԽ͞ΕͨI/F Batch Statement • ฒྻར༻Մೳ • Pagination͕༻ҙ͞Ε͍ͯΔ(Generatorར༻)
• Cassandraͷ΄ͱΜͲͷػೳ͕ར༻Ͱ͖ΔͷͰɺ Java͔Βར༻ͤͣͱे׆༻Ͱ͖Δ
Apache Kafka
Apache Kafka • Streamαϙʔτ(ϥϜμΞʔΩςΫνϟͰඞཁෆՄܽ) • ΫϥελϦϯά͕ࣗ༝ࣗࡏ • Zookeeperͱ࿈ܞͨ͠ࢄγεςϜ • োʹڧ͘ɺϝοηʔδͷ࠶औಘ͕Մೳ
• SparkͱStormͱ༰қʹ࿈ܞͰ͖Δ͜ͱ͔Β࠾༻ • ϝοηʔδૹ৴ޙͰࢦఆظؒอ࣋͠ɺ ଞͷΫΤϦΤϯδϯ͔Βϝοηʔδ༰औಘՄೳ • PHP͔Βར༻Մೳ(rdkafka)
Message QueueͰൃੜ͢Δ • Producer͔ΒBrokerૹ৴࣌ʹܽଛ͢Δ͜ͱ͕͋Δ • Broker͕ड৴Λࣦഊ͢Δέʔε • Brokerͷૹ৴͕ࣦഊ͢Δέʔε • ॏෳͯ͠ड৴ͯ͠͠·͏έʔε
0.11 • Exactly-once delivery and transactional messaging • ਖ਼֬ʹҰ͚ͩɺ࣮֬ʹಧ͚Δ •
ϝοηʔδૹ৴ͱड৴ʹτϥϯβΫγϣϯʂ • ΑΓڧݻʹ
Partition • ฒྻࢄॲཧ͕ઃܭ • topicΛPartitionͰׂ͠ɺProducer, Consumer͕ҙ ͷPartitionʹΞΫηε • ࡉԽͱޮԽ͕ࣗ༝ʹ
Partition
BigDataͷ࢝·ΓPHP͔Β
Presto
େنʹΑΔσʔλநग़ • hdfsʹ֨ೲ͞ΕͨσʔλΛΈ͍ͨ • RDBMSʹ֨ೲ͞Εͨσʔλͱ݁߹ͯ͠΄͍͠ • Ϗδωε؍Ͱͷσʔλूܭநग़Λͯ͠΄͍͠ • σʔλϕʔεࢄͰқ͕ߴ͍ •
खܰʹੳʹར༻͍ͨ͠ -> ແཧ
σʔλϚʔτ • ඞཁͳσʔλΛूΊͯσʔλϕʔεʹू͢Δ • όονॲཧͰ࣮ߦ͢ΔͨΊɺଈ࣌ʹσʔλΛऔಘ͢Δ ͜ͱ͍͠ • σʔλϚʔτࣗମͷอकӡ༻͕ඞཁͱͳΔ (ͦΕ͕ۀͰ͋ΕՄೳ) •
σʔλϚʔτΛઃܭ͢Δͷ͘͠ɺϋʔυϧ͕ߴ͍
Prestoͱ • facebookͰେنͳσʔλʹରͯ͠ɺ ΠϯλϥΫςΟϒʹσʔλऔಘͰ͖ΔΫΤϦΤϯδϯ • ϑϩϯτΞϓϦέʔγϣϯ͔Βhdfsʹଓ͠ɺ σʔλΛଈ࠲ʹՄࢹԽͤ͞Δͷ͍͠ • Hiveόονॲཧ༻్ͷͨΊɺඵͰฦ٫ෆՄೳ (MapReduce)
• RDBMSʹଓ͍ͨ͠ʂͳͲΛղܾ
Prestoͱ • SQLΠϯλʔϑΣʔεΛఏڙ • Cassandra, Hive, Kafka • MongoDB, MySQL,
PostgreSQL, SQLServer • Redis, Thrift • ରԠ͍ͯ͠ͳ͍ͷͰjavaͰυϥΠόΛ࣮͢Δ͚ͩ Ͱ͋Δఔ֦ுՄೳ • SELECTҎ֎ʹINSERTͳͲʹରԠ͓ͯ͠Γɺ σʔλϕʔεҠߦɺࢄΞʔΩςΫνϟͷΧόʔͳͲ ʹ
Prestoͱ
Prestoͱ • jdbcରԠ • PHP͔Βɺ xtendsys-labs/php-presto-client ytake/php-presto-client
None
·ͱΊ • ෳࡶԽ͢ΔΞϓϦέʔγϣϯɺ ՝ղܾେ͖͘ͳΔޣຯ • PHPͰϏδωεΛαϙʔτ͢Δཧπʔϧ • PHP͔Β࢝·ΔBigData + FastDataΞʔΩςΫνϟ
• PHPͰେ͖͘ߩݙ
webΞϓϦέʔγϣϯ͔Β BigData·Ͱࢧ͑ΔPHP