A talk given in the startup special area at AWS Summit 2018, covering how ChatWork operates Kubernetes.
Kubernetes on AWS at ChatWork
SRE Ryo Sakamoto
© ChatWork
▸ A business chat service originating in Japan
▸ Supports task management and video calls
▸ Adopted by more than 174,000 companies (as of April 2018)
Agenda
▸ The Kubernetes environment and the applications running on it
▸ Tools used with Kubernetes
▸ Kubernetes monitoring and logging
▸ Kubernetes version upgrades
▸ Summary
Use of Kubernetes
▸ Replacement of the message processing component (December 2016)
▸ Backend apps run on Kubernetes
▸ The details were covered at last year's AWS Summit…
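The Kubernetes environment
▸ The environment is AWS
▸ We wanted to reuse existing resources such as those already on AWS
▸ The provisioning tool is kube-aws
▸ https://github.com/kubernetes-incubator/kube-aws
▸ The main maintainer is mumoshu (Kubernetes advisor to ChatWork)
▸ The whole cluster can be built with CloudFormation (and cloud-init)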
What runs on Kubernetes
▸ backend
▸ The message backend replacement described at last year's AWS Summit
▸ Everything since that project (webhook, oauth, etc.) runs on Kubernetes
▸ About 10 m4.2xlarge nodes
▸ CD environment (Concourse)
▸ Uses a node pool of spot instances
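Tools used with Kubernetes (1)
▸ cluster-autoscaler
▸ Auto-scales nodes, not pods
▸ Watches the scheduler and adjusts the ASG when a pod lacks resources
▸ Contributed greatly to reducing the number of nodes
▸ Also handles temporary node shortages smoothly, for example while pods are being swapped during a deploy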
How cluster-autoscaler works (scale out)
[Diagram: controller, node pool, api-server, scheduler, cluster-autoscaler, pod]
(1) watch
(2) "fails to be scheduled due to insufficient"
(3) set-desired-capacity
(4) scale out
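The scale-out steps above can be sketched in a few lines of Python. This is a deliberately simplified model, not the actual cluster-autoscaler algorithm: it only sums the CPU requests of unschedulable pods, whereas the real tool simulates scheduling. The `Pod` class, the function name, and the 8000-millicore node size are assumptions made for illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Pod:
    name: str
    cpu_millicores: int  # CPU request of the pod (hypothetical field)


def desired_capacity_for_scale_out(current_nodes, node_cpu_millicores, pending):
    """Return the ASG desired capacity needed to fit the pending pods.

    Simplification: assumes empty new nodes and perfect packing by CPU only.
    """
    if not pending:
        return current_nodes  # nothing failed to schedule, no change
    needed_cpu = sum(p.cpu_millicores for p in pending)
    extra_nodes = ceil(needed_cpu / node_cpu_millicores)
    return current_nodes + extra_nodes


if __name__ == "__main__":
    pending = [Pod("p1", 3000), Pod("p2", 3000), Pod("p3", 3000)]
    # The result would be passed to the ASG as its new desired capacity.
    print(desired_capacity_for_scale_out(10, 8000, pending))
```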
How cluster-autoscaler works (scale in)
[Diagram: controller, node pool, api-server, scheduler, cluster-autoscaler, pods a and b, node usage]
(1) watch node usage (via the API)
(2) evict
(3) set-desired-capacity
(4) scale in
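The scale-in flow can be modeled in the same spirit. The sketch below is an illustration under stated assumptions, not cluster-autoscaler's real logic: it looks only at CPU requests, uses a hypothetical utilization threshold of 0.5, and places evicted pods with a naive first-fit. The function name and data layout are invented for this example.

```python
def pick_node_to_remove(nodes, capacity, threshold=0.5):
    """Pick an underused node whose pods fit elsewhere.

    nodes: dict of node name -> list of pod CPU requests (millicores).
    Returns (node_name, [(cpu, target_node), ...]) or None.
    """
    for name in sorted(nodes, key=lambda n: sum(nodes[n])):
        pods = nodes[name]
        if sum(pods) / capacity >= threshold:
            return None  # remaining nodes are even busier (ascending order)
        # Free CPU on every other node.
        free = {n: capacity - sum(p) for n, p in nodes.items() if n != name}
        moves = []
        for cpu in pods:
            target = next((n for n, f in free.items() if f >= cpu), None)
            if target is None:
                break  # this pod cannot be rescheduled, keep the node
            free[target] -= cpu
            moves.append((cpu, target))
        else:
            # Every pod can be evicted and rescheduled; the node can go.
            return name, moves
    return None


if __name__ == "__main__":
    cluster = {"a": [3000, 3000], "b": [1000], "c": [2000]}
    print(pick_node_to_remove(cluster, capacity=8000))
```

After the pods are evicted, the desired capacity of the ASG would be lowered by one.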
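Tools used with Kubernetes (2)
▸ kube2iam
▸ Grants an IAM role per pod
▸ Normally pods use the role of the instance
▸ That attaches policies a pod does not need, and anything missing ends up being covered by handing out API keys
▸ Secrets are merely base64-encoded
▸ It is somewhat unstable, so we plan to replace it with kiam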
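kube2iam
▸ Specify the role in an annotation
▸ The role must trust the worker's role
▸ That is, the worker must be allowed to assume it
▸ When a pod tries to call an AWS API, it accesses the metadata endpoint
▸ iptables forwards the metadata access to kube2iam
▸ kube2iam issues credentials for the role named in the annotation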
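As a rough illustration of the two pieces of configuration described above, the sketch below builds a pod manifest carrying the role annotation and a trust policy that lets the worker role assume the pod role. `iam.amazonaws.com/role` is the annotation key kube2iam reads; the pod name, image, role names, and account ID are hypothetical placeholders.

```python
import json

ROLE_ANNOTATION = "iam.amazonaws.com/role"  # annotation key read by kube2iam


def pod_manifest(name, image, role_name):
    """Build a minimal pod manifest annotated with the IAM role to use."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "annotations": {ROLE_ANNOTATION: role_name}},
        "spec": {"containers": [{"name": name, "image": image}]},
    }


def trust_policy(worker_role_arn):
    """Trust policy for the pod role: the worker role may assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": worker_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # All names below are made up for the example.
    print(json.dumps(pod_manifest("webhook", "example/webhook:1.0", "webhook-role"), indent=2))
    print(json.dumps(trust_policy("arn:aws:iam::123456789012:role/k8s-worker"), indent=2))
```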
kube2iam
[Diagram: app pod, kube2iam pod]
1. Credential request (ec2-metadata)
2. iptables forwards the request to the kube2iam pod
3. Credentials are issued
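The request flow can be mimicked with a small, self-contained handler. This is a conceptual sketch, not kube2iam's implementation: the pod-IP-to-role lookup table and the `assume_role` callback are stand-ins for the Kubernetes API watch and the STS call, and the status codes are assumptions. Only the metadata path prefix is the one the EC2 metadata service actually uses.

```python
METADATA_PREFIX = "/latest/meta-data/iam/security-credentials/"


def handle_metadata_request(path, source_ip, pod_roles, assume_role):
    """Answer a metadata request that iptables redirected to this proxy.

    pod_roles: dict of pod IP -> role name taken from the pod annotation.
    assume_role: callable(role_name) -> credentials dict (stub for STS).
    Returns (status_code, body).
    """
    if not path.startswith(METADATA_PREFIX):
        return 404, None
    role = pod_roles.get(source_ip)
    if role is None:
        return 403, None  # pod has no role annotation
    requested = path[len(METADATA_PREFIX):]
    if requested == "":
        return 200, role  # listing: tell the SDK which role is available
    if requested != role:
        return 403, None  # pod asked for a role it was not annotated with
    return 200, assume_role(role)


if __name__ == "__main__":
    roles = {"10.0.0.5": "webhook-role"}  # hypothetical pod IP and role
    fake_sts = lambda role: {"RoleName": role, "AccessKeyId": "EXAMPLE"}
    print(handle_metadata_request(METADATA_PREFIX + "webhook-role", "10.0.0.5", roles, fake_sts))
```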
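Kubernetes monitoring
▸ datadog only
▸ Deployed as a daemonset
▸ We want monitoring unified with our non-k8s environments, and we do not want to manage prometheus
▸ Instead of exposing endpoints as with prometheus, metrics are sent to statsd on each host
▸ We use version 6
▸ With v5 some metrics were dropped (metric overflow); with v6 they no longer are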
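To make the push model concrete, here is a minimal sketch of sending one metric to a statsd listener on the host over UDP, using the DogStatsD datagram layout (`name:value|type|#tags`). The metric name, tags, and the host address are assumptions; in a pod the host IP would have to be supplied some other way, which this sketch does not cover.

```python
import socket


def format_metric(name, value, metric_type="g", tags=None):
    """Format one DogStatsD datagram, e.g. 'my.metric:1|c|#env:dev'."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send to the statsd port on the given host."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name and tag, for illustration only.
    send_metric(format_metric("example.queue.depth", 42, "g", ["env:dev"]))
```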
datadog live container monitoring
[Screenshot]
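Kubernetes logging
▸ fluentd + stackdriver
▸ fluentd is deployed as a daemonset
▸ Each container writes its stdout to a specific location on the host
▸ The audit log is also sent to stackdriver by fluentd
▸ S3 would have been fine too, but Athena had only just been released when we introduced Kubernetes
▸ stackdriver + bigquery is also used for some of the logs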
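As an illustration of what a log collector reads from the host, the sketch below parses one line of a container log file. It assumes the Docker `json-file` logging driver layout (`log`, `stream`, `time` keys); the slides do not say which format or path ChatWork uses, so treat both as assumptions.

```python
import json


def parse_container_log_line(line):
    """Turn one json-file log line into a flat record for forwarding."""
    record = json.loads(line)
    return {
        "message": record["log"].rstrip("\n"),  # the driver keeps the trailing newline
        "stream": record["stream"],             # "stdout" or "stderr"
        "time": record["time"],
    }


if __name__ == "__main__":
    sample = '{"log":"request handled\\n","stream":"stdout","time":"2018-05-30T00:00:00.000000000Z"}'
    print(parse_container_log_line(sample))
```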
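Kubernetes version upgrades
▸ Because the clusters are managed with kube-aws, a managed version upgrade is not available
▸ We combined the version upgrade with a cluster consolidation
▸ Upgrades were 1.7 -> 1.8 and 1.5 -> 1.8, and the clusters were merged at the time of the move to 1.8
▸ ChatWork does not use ingress yet; we use ELB + NodePort
▸ So for the upgrade we attach both the old and the new cluster to the ELB, then retire the old one
▸ For the current applications, the upgrade was completed online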
version up
[Diagram: chatwork web, NodePort, app pod on the old k8s cluster]
version up
[Diagram: chatwork web, NodePort on each cluster, app pods on both the old and the new k8s cluster]
version up
[Diagram: chatwork web, NodePort, app pod on the new k8s cluster]
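The three stages of the cutover can be written down as a tiny model of which instances are registered with the ELB at each step. It is only a sketch of the ordering (attach new, then detach old); the instance names are hypothetical and no AWS API is called.

```python
def cutover_steps(old_nodes, new_nodes):
    """Yield the instances registered with the ELB after each step."""
    registered = set(old_nodes)
    yield sorted(registered)          # 1. only the old cluster serves traffic
    registered |= set(new_nodes)
    yield sorted(registered)          # 2. old and new clusters both attached
    registered -= set(old_nodes)
    yield sorted(registered)          # 3. old cluster retired, new one remains


if __name__ == "__main__":
    for step in cutover_steps(["old-1", "old-2"], ["new-1", "new-2"]):
        print(step)
```

Because the new nodes are attached before the old ones are removed, the set of registered instances is never empty, which is what allows the upgrade to happen online.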
What we want to do next with Kubernetes (1)
▸ EKS
▸ Planned to be integrated into kube-aws
▸ We do want rolling updates after all
▸ service mesh
▸ Introducing envoy
▸ istio, linkerd
▸ grpc loadbalancer -> envoy, nginx-ingress-controller
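What we want to do next with Kubernetes (2)
▸ Introducing prometheus
▸ By default hpa only supports cpu, which makes it hard to use
▸ We want to scale in/out based on how backed up the kafka events are, not on cpu
▸ This is also possible through the datadog API, but we are uneasy about depending on the connection to datadog
▸ Turning it into a platform
▸ openFaaS and the like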
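Scaling on a backlog metric instead of CPU comes down to the ratio rule that the Horizontal Pod Autoscaler applies to any metric: desired = ceil(current replicas × current value / target value). The sketch below applies that rule to a per-pod Kafka lag figure. The function name, the lag numbers, and the replica bounds are assumptions, and details of the real HPA such as its tolerance band are left out.

```python
from math import ceil


def desired_replicas(current_replicas, current_lag_per_pod, target_lag_per_pod,
                     min_replicas=1, max_replicas=10):
    """Apply the HPA ratio rule to a lag metric and clamp to the bounds."""
    raw = ceil(current_replicas * current_lag_per_pod / target_lag_per_pod)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 consumers, each 1500 events behind, with a target of 1000 per pod.
    print(desired_replicas(4, 1500, 1000))
```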
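Expectations for EKS
▸ We were given access to the preview version
▸ Adding workers was, well…. They had to be registered via a configmap
▸ With that approach, cluster construction cannot be completed from outside Kubernetes
▸ We look forward to making use of AWS resources (IAM, VPC, etc.)
▸ Not having to think about nodes at all with fargate would be nice, but logging and monitoring…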
Summary
▸ This talk covered ChatWork's Kubernetes environment
▸ There are many things we still want to do
▸ We have high hopes for EKS
▸ The peace of mind of a managed control plane
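We are hiring engineers
http://corp.chatwork.com/ja/recruit/
▸ People who take ownership and act on their own initiative
▸ People who acknowledge and respect others
▸ People who gather information and share it
We welcome anyone who fits this description!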