Slide 1

Slide 1 text

Microservices on Multi-Cloud Masahiro @kazeburo Nagano MANABIYA Teratail developer days 2018/03/23

Slide 2

Slide 2 text

Me • Masahiro Nagano • @kazeburo • Mercari, Inc., Principal Engineer, Site Reliability Engineering (SRE) team • BASE, Inc., technical advisor • Hobby: restoring DBs

Slide 3

Slide 3 text

Agenda • About Mercari • Mercari's Infrastructure History #1 - Multi-Cloud • Mercari's Infrastructure History #2 - Microservices on Multi-Cloud • Challenges of Microservices on Multi-Cloud

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

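Mercari • One of Japan's largest flea-market apps • List an item easily in 3 minutes: 1) take a photo 2) enter the item details 3) press the listing button • Safe and secure payments and transactions • Escrow (Mercari sits between buyer and seller for all money transfers) • Anonymous shipping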

Slide 6

Slide 6 text

Expansion to the US and UK JP UK US

Slide 7

Slide 7 text

KPI • Downloads: over 100 million (JP+US+UK) • GMV (gross merchandise volume): over 10 billion yen per month • Listings: over 1 million items per day

Slide 8

Slide 8 text

System overview [diagram; labels: Listing!, DB, Search, display, reflected in search, a few seconds to 30 seconds, high volume of requests, request/response, Purchase!, a few seconds or more, images, payment, AI] • Handles a large volume of transactions concurrently and at high speed

Slide 9

Slide 9 text

Infrastructure

Slide 10

Slide 10 text

Infrastructure in 2017 DNS: Amazon Route53 CDN: Akamai, CloudFront Storage: Amazon S3 Analysis: Google BigQuery / Monitoring: Mackerel JP UK US

Slide 11

Slide 11 text

Infrastructure in 2018 DNS: Amazon Route53 CDN: Akamai, Fastly, ImageFlux(JP) Storage: Amazon S3 Analysis: Google BigQuery / Monitoring: Mackerel, DataDog JP UK US + +

Slide 12

Slide 12 text

Infrastructure History #1 2013 - 2017 / Multi-Cloud

Slide 13

Slide 13 text

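Infrastructure History (1) • 2013/07 JP release • Started with web and DB together on a single "Sakura VPS" instance from SAKURA internet • With no dedicated infrastructure engineer, chose a platform the developers already knew • Two months after release, migrated to "Sakura Cloud" and "Sakura Dedicated Server"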

Slide 14

Slide 14 text

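"Sakura Dedicated Server" • Metal as a Service • Physical servers that can be handled like cloud instances • Performance that only physical servers can deliver • Network and hardware maintenance is handled by SAKURA internet • Can be connected to "Sakura Cloud" • Excellent cost performance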

Slide 15

Slide 15 text

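Infrastructure History (2) • 2014/09 US release • Service built on AWS (Oregon) • Some time had passed since the JP release, and more developers had AWS experience • Dedicated infrastructure engineers were still few, so the service was built on managed services such as RDS and ElastiCache • MaaS within the US was considered, but growth of the US service was hard to predict, so cloud flexibility was weighted more heavily than in JP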

Slide 16

Slide 16 text

Infrastructure History (3) • 2015/11 SRE team formed • Improves the JP/US architecture and works on service reliability and scalability • 2017/03 UK release • Service built on "GCP" as a new technology

Slide 17

Slide 17 text

Multi-Cloud in 2017/03 • JP: dedicated servers • US: EC2 • UK: GCE • Multi-Cloud (Hybrid Cloud) centered on IaaS • However, each individual service runs on a single cloud

Slide 18

Slide 18 text

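Multi-Cloud Operations • Adopt a common architecture wherever possible • Replace managed services that do not exist on the other clouds • Introduce Consul / local DNS • Standardize operations so a small team can run everything • A configuration proven at JP scale; it stably handled the traffic when the app ranked 3rd on the US App Store • Shared Ansible playbooks and DB migration procedures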

Slide 19

Slide 19 text

Architecture [diagram; JP: DNS-RR, nginx x3, App x6, MySQL, memcached, util; UK: GCE cloud load balancer, nginx x3, App x6, MySQL, memcached, util, all on GCE] • Simple three-tier architecture • Even in the cloud, built mainly on EC2/GCE (servers) • RDS is sometimes used for US-only services and small databases • UK uses Cloud Load Balancer

Slide 20

Slide 20 text

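Internal DNS [diagram; labels: App servers each running unbound, DNS servers, Consul DNS, *.consul, *.local] • unbound installed on every server • Better performance through local caching • More resilient to failures than resolv.conf alone • Secures service availability and flexibility • Applications use hostnames, not IP addresses • The configuration can change without changing application code • Consul is used heavily for redundancy and load balancing in place of an internal LB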
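A minimal sketch of what "hostnames instead of IP addresses" can look like from the application side, assuming the OS resolver points at a local caching resolver such as unbound. The hostname `memcached.service.consul` and the function names are hypothetical and only illustrate the idea; the slides do not show the actual client code.

```python
import itertools
import socket
from typing import Callable, Iterator, List


def system_resolver(hostname: str) -> List[str]:
    """Resolve a hostname to IPv4 addresses through the OS resolver."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def backends(
    hostname: str, resolve: Callable[[str], List[str]] = system_resolver
) -> Iterator[str]:
    """Return an endless round-robin iterator over the addresses behind a hostname."""
    addresses = resolve(hostname)
    if not addresses:
        raise LookupError(f"no address for {hostname}")
    return itertools.cycle(addresses)


# Hypothetical usage: the application only knows the name, never the IPs.
# pool = backends("memcached.service.consul")
# next_ip = next(pool)
```

Because the addresses come from DNS, servers can be added or removed behind the name without touching application code, which is the property the slide highlights.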

Slide 21

Slide 21 text

Infrastructure History #2 2018 - / Microservices on Multi-Cloud

Slide 22

Slide 22 text

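Microservices • Improve service resilience • Fine-grained scaling and fault isolation • Increase the scalability of teams and the organization • Aiming for an engineering organization of more than 1,000 people • To keep raising the speed of service development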

Slide 23

Slide 23 text

US Re-Architecture • Full renewal of the client to better fit the US market • API Gateway that routes to microservices, implemented in Golang • Wraps the Monolith API running on AWS • Enables a gradual migration [diagram; Client to API Gateway: Protocol Buffers over HTTPS; API Gateway to search, personalization, offer: gRPC; API Gateway to Monolith API: JSON over HTTPS]
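The gradual migration described on this slide can be sketched as a routing table with a fallback: requests whose path belongs to an extracted service go to that service, and everything else still goes to the monolith. This is only an illustration in Python (the real gateway is written in Golang and is not shown in the slides); the path prefixes and the `monolith-api` name are assumptions, while the service names come from the diagram.

```python
from typing import Dict

# Hypothetical routing table: path prefix -> backend service name.
ROUTES: Dict[str, str] = {
    "/search": "search",
    "/personalization": "personalization",
    "/offer": "offer",
}
MONOLITH = "monolith-api"  # fallback for everything not yet extracted


def route(path: str, routes: Dict[str, str] = ROUTES) -> str:
    """Return the backend for a request path, trying the longest prefix first."""
    for prefix in sorted(routes, key=len, reverse=True):
        # Match the prefix itself or a sub-path, but not "/searching".
        if path == prefix or path.startswith(prefix + "/"):
            return routes[prefix]
    return MONOLITH
```

Moving one more endpoint out of the monolith then only means adding one entry to the table, with no client change.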

Slide 24

Slide 24 text

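API Fork • The Monolith API code shared by the three regions was split into US/UK and JP • Keeps changes for one region from affecting the others, and cuts coordination and QA cost • Each country develops for its own local circumstances • Local hiring in the US and UK is also progressing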

Slide 25

Slide 25 text

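API Gateway in JP • Microservices called from the Monolith API are already in production • An API Gateway was introduced to push microservices further in JP as well • Also written in Golang, but a different implementation from the US one • No client changes; the protocol is kept as is • Adds features such as DNS cache and request buffering [diagram; Client to API Gateway: JSON over HTTPS; API Gateway to ServiceA, ServiceB, ServiceC: JSON over HTTPS]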
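The DNS cache mentioned for the JP gateway is not described further in the slides. As a rough illustration of the general technique, a time-based cache in front of a resolver could look like the following; the class name, the 5-second TTL, and the injectable clock are all assumptions made for this sketch.

```python
import time
from typing import Callable, Dict, List, Tuple


class DNSCache:
    """Cache resolver answers for a fixed TTL to avoid a lookup per request."""

    def __init__(
        self,
        resolve: Callable[[str], List[str]],
        ttl: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve
        self._ttl = ttl
        self._clock = clock
        # hostname -> (expiry time, addresses)
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def lookup(self, hostname: str) -> List[str]:
        now = self._clock()
        hit = self._entries.get(hostname)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no resolver call
        addresses = self._resolve(hostname)
        self._entries[hostname] = (now + self._ttl, addresses)
        return addresses
```

A short TTL keeps the gateway responsive to backend changes while still removing the lookup from the hot path.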

Slide 26

Slide 26 text

Infrastructure in 2018 JP UK US + + Microservices on Multi-Cloud, tailored to each region

Slide 27

Slide 27 text

Microservices Tech Stack • Container / Docker • Kubernetes • Spinnaker

Slide 28

Slide 28 text

Container / Docker • Container • Resource isolation and control • An OS environment lighter than a VM • Docker • Portability • Consistent image builds from a Dockerfile

Slide 29

Slide 29 text

Container use case [diagram; labels: GitHub PR, daily job, BigQuery (app-log), index, Container Registry, DEPLOY!!; container for keyword suggest service] • Build containers that include not only the application but also ML and recommendation data • Provide even complex middleware reliably

Slide 30

Slide 30 text

Kubernetes • Orchestration platform for containers • Automatic scaling and automatic healing • Lower container operation cost • Used mainly through GKE (Google Kubernetes Engine) • k8s is the key factor for microservices • Evaluating AWS EKS/Fargate • Considering and evaluating Sakura Cloud and k8s on Metal

Slide 31

Slide 31 text

Spinnaker • Continuous Delivery Platform • Developed by Netflix • Open-sourced, with cooperation from Google and others • Define deploy pipelines and run them automatically • Multi-Cloud support • k8s, ECS, OpenStack... • Blog post "Continuous Delivery with Spinnaker": http://tech.mercari.com/entry/2017/08/21/092743

Slide 32

Slide 32 text

Challenges of Microservices on Multi-Cloud

Slide 33

Slide 33 text

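Microservices on Multi-Cloud Pros/Cons • Pros: choose the runtime environment that suits each service • Quickly adopt new technology such as databases and ML services • Giving developers the right to choose technology strengthens ownership • Cons: efficiency of cross-cloud integration • Network cost • Distance between clouds • Cons: maintaining service availability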

Slide 34

Slide 34 text

Distance between clouds [diagram; Ishikari DC: dedicated servers, Monolith API infrastructure; Cloud Service: Microservices infrastructure; about 1,000 km apart]

Slide 35

Slide 35 text

Distance between clouds
Ishikari (dedicated server) ▶︎ Tokyo (Google Cloud Load Balancer): 18-20 ms
$ ping -c 3 example.mercari.jp
PING example.mercari.jp (x.x.x.x) 56(84) bytes of data.
64 bytes from x.bc.googleusercontent.com (x.x.x.x): icmp_seq=1 ttl=50 time=18.6 ms
64 bytes from x.bc.googleusercontent.com (x.x.x.x): icmp_seq=2 ttl=50 time=18.4 ms
64 bytes from x.bc.googleusercontent.com (x.x.x.x): icmp_seq=3 ttl=50 time=20.6 ms
Tokyo (Sakura Cloud) ▶︎ Tokyo (Google Cloud Load Balancer): 1 ms
$ ping -c 3 example.mercari.jp
PING example.mercari.jp (x.x.x.x) 56(84) bytes of data.
64 bytes from x.bc.googleusercontent.com (x.x.x.x): icmp_seq=1 ttl=56 time=1.09 ms
64 bytes from x.bc.googleusercontent.com (x.x.x.x): icmp_seq=2 ttl=56 time=1.08 ms
64 bytes from x.bc.googleusercontent.com (x.x.x.x): icmp_seq=3 ttl=56 time=1.14 ms
Within the same DC: 0.1 ms

Slide 36

Slide 36 text

Distance between clouds by HTTPS
$ ./httpstat.sh https://example.mercari.jp/hc
HTTP/1.1 200 OK
Server: nginx/1.13.3
Date: Wed, 11 Oct 2017 01:59:15 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 22
Expires: Wed, 11 Oct 2017 02:59:15 GMT
Cache-Control: max-age=3600
Cache-Control: public
Via: 1.1 google
Alt-Svc: clear
DNS Lookup | TCP Connection | SSL Handshake | Server Processing | Content Transfer
[ 1ms | 19ms | 165ms | 20ms | 0ms ]
namelookup:1ms
connect:20ms
pretransfer:185ms
starttransfer:205ms
total:205ms
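The per-phase numbers in this output are differences between the cumulative timestamps printed underneath. The small sketch below redoes that arithmetic with the values from the slide, to show how much of the 205 ms is connection setup rather than server work; the function and key names are made up for this example.

```python
from typing import Dict


def phases(
    namelookup: int, connect: int, pretransfer: int, starttransfer: int, total: int
) -> Dict[str, int]:
    """Turn cumulative timestamps (ms) into per-phase durations (ms)."""
    return {
        "dns_lookup": namelookup,
        "tcp_connection": connect - namelookup,
        "tls_handshake": pretransfer - connect,
        "server_processing": starttransfer - pretransfer,
        "content_transfer": total - starttransfer,
    }


# Values taken from the httpstat output on the slide.
cold = phases(namelookup=1, connect=20, pretransfer=185, starttransfer=205, total=205)

# TCP connect plus TLS handshake: paid on every new connection.
setup_ms = cold["tcp_connection"] + cold["tls_handshake"]  # 19 + 165 = 184
```

Here 184 of the 205 ms go to TCP and TLS setup, which is the cost that connection reuse avoids.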

Slide 37

Slide 37 text

How to overcome the distance • Avoid the TCP 3-way handshake, and avoid the TLS handshake too • Make use of KeepAlive in HTTP/1 and HTTP/2 • Connection aggregation with chocon

Slide 38

Slide 38 text

chocon • A simple proxy server implemented in Go • Released as OSS • github.com/kazeburo/chocon • In production for more than a year

Slide 39

Slide 39 text

chocon
% curl -H 'Host: example.com.ccnproxy-https' http://10.0.0.1/v1/foo
[diagram; Web Client to chocon: http over the private network; chocon to upstream: http or https with keepAlive; the request is proxied to https://example.com/]
With internal DNS, only the hostname in the URL has to change:
*.ccnproxy-https IN CNAME chocon.local.
% curl http://example.com.ccnproxy-https/v1/foo
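The hostname convention shown on this slide can be read as a small mapping from the Host header to an upstream URL. The sketch below illustrates that mapping in Python; it is not chocon's code (chocon is written in Go). Only the `.ccnproxy-https` suffix appears on the slide, and the plain `.ccnproxy` suffix for http upstreams is an assumption of this sketch.

```python
from typing import Tuple

# Suffix -> upstream scheme. ".ccnproxy-https" is taken from the slide;
# ".ccnproxy" -> http is an assumption made for this illustration.
SUFFIXES: Tuple[Tuple[str, str], ...] = (
    (".ccnproxy-https", "https"),
    (".ccnproxy", "http"),
)


def upstream_url(host_header: str, path: str) -> str:
    """Map a proxy-style Host header and path to the upstream URL."""
    host = host_header.lower()
    for suffix, scheme in SUFFIXES:
        if host.endswith(suffix) and len(host) > len(suffix):
            return f"{scheme}://{host[: -len(suffix)]}{path}"
    raise ValueError(f"not a proxy hostname: {host_header}")
```

For the request on the slide, `upstream_url("example.com.ccnproxy-https", "/v1/foo")` gives `https://example.com/v1/foo`, so the client keeps speaking plain HTTP while the proxy holds the long-lived HTTPS connection.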

Slide 40

Slide 40 text

After Chocon
$ ./httpstat.sh /dev/null https://microservice.example.com.ccnproxy-https/hc
HTTP/1.1 200 OK
Cache-Control: max-age=3600,public
Content-Length: 22
Content-Type: application/json; charset=utf-8
Date: Thu, 01 Jun 2017 00:43:49 GMT
Expires: Thu, 01 Jun 2017 01:43:49 GMT
Server: nginx/1.11.5
X-Chocon-Req: bSCzJrCMZ9wbRN8TYhZ3wV
Body stored in: /tmp/httpstat-body.390174181496278775
DNS Lookup | TCP Connection | Server Processing | Content Transfer
[ 1ms | 1ms | 19ms | 0ms ]
namelookup:1ms
connect:2ms
starttransfer:21ms
total:21ms
About the same speed as ping

Slide 41

Slide 41 text

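Durability, Availability • Multi-Cloud lowers availability • An outage in any one of the clouds affects service continuity • When using clouds with 99.99% and 99.95% availability, the combined availability is at best 99.95% (about 99.94% if the failures are independent) • With microservices, a failure of one service does not, and must not, affect the whole • Microservices whose impact can be contained run on a single cloud • Microservices that need high availability are deployed across multiple clouds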
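The availability argument on this slide can be checked with a little arithmetic. The sketch below assumes failures of the two clouds are independent, which is a simplification that the slides do not state; the function names are made up for this example.

```python
from math import prod


def serial(*availabilities: float) -> float:
    """Availability when every cloud must be up (a request crosses all of them)."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Availability when any one cloud is enough (the service runs on all of them)."""
    return 1 - prod(1 - a for a in availabilities)


both_required = serial(0.9999, 0.9995)      # roughly 0.9994, below either cloud alone
either_suffices = parallel(0.9999, 0.9995)  # roughly 0.99999995
```

Under this assumption, a request path that depends on both clouds is less available than either cloud alone, while a service deployed redundantly on both is more available than either. That matches the slide's split: keep low-impact services on one cloud and spread services that need high availability across several.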

Slide 42

Slide 42 text

Infrastructure in the near future? [diagram; labels: Security / DDoS mitigation, API Gateway, Service Mesh, Massive Computing Resource; services A, B, C, D, E, F, H, J, K, L, M spread across CloudA, CloudB, and CloudC (Monolith API)]

Slide 43

Slide 43 text

Achieving a flexible, highly reliable infrastructure with Microservices and Multi-Cloud

Slide 44

Slide 44 text

We’re Hiring! careers.mercari.com