Slide 1

Slide 1 text

Do containers work as expected, too? Hybrid parallelism on Kubernetes
JAWS-UG HPC #14, Nov 19, 2018
Ryo NAKAMARU, SUPINF Inc.

Slide 2

Slide 2 text

Well, it does work! But...

Slide 3

Slide 3 text

DEMO

Slide 4

Slide 4 text

Himeno benchmark on Docker

As long as Docker is installed, you need neither gcc nor OpenMPI. It's a one-liner!

    $ docker run --rm -it pottava/openmpi:4.0 \
        bash -c "apt-get install -y unzip lhasa make >/dev/null \
        && wget --quiet http://i.riken.jp/wp-content/uploads/2015/07/cc_himenobmtxp_mpi.zip \
        && unzip -q cc_himenobmtxp_mpi.zip && lha xqw=/opt/himeno cc_himenobmtxp_mpi.lzh \
        && cd /opt/himeno && mv Makefile.sample Makefile \
        && chmod +x ./paramset.sh && ./paramset.sh S 1 1 1 && make >/dev/null 2>&1 \
        && su -c 'mpirun -np 1 /opt/himeno/bmt' mpiuser"
    Sequential version array size
     mimax = 65 mjmax = 65 mkmax = 129
    ..
    MFLOPS measured : 3078.922728

Slide 5

Slide 5 text

Hybrid parallel processing on a single Mac

Launch the OpenMPI master and slave nodes as containers.

    // Start the slave processes as SSH servers
    $ docker run --name 02-node01 -d --cpuset-cpus 0,1 openmpi/samples:02-hybrid-parallel
    $ docker run --name 02-node02 -d --cpuset-cpus 2,3 openmpi/samples:02-hybrid-parallel
    // Start the master process
    $ docker run --rm -it -u mpiuser \
        --link 02-node01:node01 --link 02-node02:node02 \
        openmpi/samples:02-hybrid-parallel \
        mpirun -np 2 --host node01,node02 -x OMP_NUM_THREADS=2 ./hybrid
    Hello from thread 0 out of 2 from process 0 out of 2 on 5329fecf93f4
    Hello from thread 1 out of 2 from process 0 out of 2 on 5329fecf93f4
    Hello from thread 0 out of 2 from process 1 out of 2 on ca3d85c87284
    Hello from thread 1 out of 2 from process 1 out of 2 on ca3d85c87284
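The source of ./hybrid is not shown on the slide. Judging only from its output, it could be an MPI + OpenMP "hello world" along the following lines. This is a sketch under that assumption; the helper name write_hybrid_source and the C code are illustrative and are not taken from the talk's repository.

```shell
# Hypothetical helper: writes an MPI + OpenMP hello-world to the path in $1.
# The C source is an assumption inferred from the output shown on the slide.
write_hybrid_source() {
    cat > "$1" <<'EOF'
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int provided, rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    /* Only the main thread makes MPI calls, so FUNNELED is sufficient. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    /* Each MPI process spawns OMP_NUM_THREADS threads. */
    #pragma omp parallel
    printf("Hello from thread %d out of %d from process %d out of %d on %s\n",
           omp_get_thread_num(), omp_get_num_threads(), rank, size, host);

    MPI_Finalize();
    return 0;
}
EOF
}
```

Inside the image it could then be built with: write_hybrid_source hybrid.c && mpicc -fopenmp hybrid.c -o hybrid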

Slide 6

Slide 6 text

The code and run instructions used in the demo are here:
https://github.com/pottava/docker-openmpi

Slide 7

Slide 7 text

Topics
• Considerations for running HPC applications on Docker
• Getting a hybrid parallel app running on EC2
• A usage example and open issues on Kubernetes

Slide 8

Slide 8 text

Considerations for running HPC applications on Docker
HPC requirements / How Docker works

Slide 9

Slide 9 text

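Characteristics of HPC applications
• Use every last bit of the hardware resources
  ‣ Large-scale clusters, and nodes are meant to be used exclusively
  ‣ The environment is managed strictly, down to the hardware level
  ‣ Restrictions on device or network use are a problem
  ‣ Even the overhead of abstraction is a real concern
• Security designed for an "in-house computing environment"
  ‣ Strict yet flexible control over who runs computations, with looser rules inside the cluster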

Slide 10

Slide 10 text

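How Docker works & why it is awkward for HPC
• Isolation of the compute space with namespaces
  ‣ Well, we want exclusive use of the node anyway...
  ‣ For inter-process communication it is just dead weight, too
• Control of compute resources with cgroups
  ‣ No limits needed, thanks
  ‣ OOM kills? Please don't add more problems...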
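When a job owns the whole node anyway, much of this isolation can be switched off per container. The sketch below collects standard docker run options for that purpose. The helper name hpc_docker_flags is hypothetical, and which options are acceptable depends on the cluster's security policy.

```shell
# Hypothetical helper: prints docker-run options that relax isolation
# for a container that is meant to own its node.
hpc_docker_flags() {
    # --network host         : share the host's network namespace (no veth/NAT hop)
    # --ipc host             : share the host's IPC namespace (shared-memory transports)
    # --ulimit memlock=-1:-1 : lift the locked-memory limit
    # No --memory or --cpus options are passed, so Docker sets no cgroup limits.
    printf '%s\n' "--network host --ipc host --ulimit memlock=-1:-1"
}
```

For example: docker run --rm $(hpc_docker_flags) pottava/openmpi:4.0 hostname (the substitution is left unquoted so that the options are split into separate words).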

Slide 11

Slide 11 text

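Continued: why Docker is awkward for HPC
• The user a process runs as is handled casually
  ‣ Casually root
  ‣ Once file sharing enters the picture, you want to throw in the towel
• It depends on how each ISV responds...
  ‣ Heavy dependence on commercial software
  ‣ Access control to the license server: is it OK?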

Slide 12

Slide 12 text

That said, once it is a Docker image, portability goes up!
(Converting it to Singularity is quick, too.)
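The conversion can be a single command. The following is a sketch that assumes Singularity 3.x (or Apptainer) is installed and that the image can be pulled from a registry. The helper names sif_name and docker_to_sif are hypothetical.

```shell
# Derive a .sif file name from a Docker image reference,
# e.g. pottava/openmpi:4.0 -> pottava_openmpi_4.0.sif
sif_name() {
    printf '%s.sif\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# Build a Singularity image directly from the Docker image in a registry.
docker_to_sif() {
    singularity build "$(sif_name "$1")" "docker://$1"
}
```

For example: docker_to_sif pottava/openmpi:4.0, followed by singularity exec pottava_openmpi_4.0.sif mpirun --version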

Slide 13

Slide 13 text

Considerations for running HPC applications on Docker
How MPI works, and the Dockerfile

Slide 14

Slide 14 text

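What matters when you Dockerize
• Understand the application's specification and behavior
  ‣ How does it actually run? Get a grip on its dependencies
  ‣ How does it communicate?
• Decide how far to containerize
  ‣ Leave SSH to the host? Use the host's MPI as well?
  ‣ Put everything in the container?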

Slide 15

Slide 15 text

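OpenMPI
• Each node is reached over SSH
  ‣ One option is to have the compute containers wait for instructions as SSH servers
  ‣ The process can also be launched directly by the container's startup command
• How do you keep OpenMPI versions in sync?
  ‣ If SSH + OpenMPI are left to the host, the container has to match them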
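A version mismatch is easy to detect before a job is submitted. The sketch below compares what mpirun reports on the host and inside an image. It assumes the first line of mpirun --version looks like "mpirun (Open MPI) 4.0.0"; the helper names are hypothetical.

```shell
# Extract the version from `mpirun --version` output read on stdin.
# Assumes the first line looks like: "mpirun (Open MPI) 4.0.0"
ompi_version() {
    awk 'NR == 1 { print $NF }'
}

# Succeeds only when host and container report the same Open MPI version.
# $1: image name (e.g. pottava/openmpi:4.0)
check_ompi_match() {
    host_ver=$(mpirun --version | ompi_version)
    ctr_ver=$(docker run --rm "$1" mpirun --version | ompi_version)
    if [ "$host_ver" = "$ctr_ver" ]; then
        echo "OK: Open MPI $host_ver on both sides"
    else
        echo "MISMATCH: host=$host_ver container=$ctr_ver" >&2
        return 1
    fi
}
```

For example: check_ompi_match pottava/openmpi:4.0 || exit 1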

Slide 16

Slide 16 text

Here is how I built it
https://github.com/pottava/docker-openmpi/blob/master/versions/4.0/Dockerfile

    FROM debian:stretch-slim
    RUN apt-get update && apt-get install -y gcc ssh wget curl \
      && apt-get install -y openssh-server \
      ..
    ENV OPENMPI_VERSION=4.0.0
    RUN apt-get install -y build-essential \
      && repo="https://www.open-mpi.org/software/ompi/v4.0/downloads" \
      && curl --location --silent --show-error --output openmpi.tar.gz \
         "${repo}/openmpi-${OPENMPI_VERSION}.tar.gz" \
      ..
      && ./configure --prefix=/usr/local && make && make install

An SSH server is included as well.

Slide 17

Slide 17 text

Getting a hybrid parallel app running on EC2

Slide 18

Slide 18 text

Computing over the EC2 host network
The host's own SSH server is used as well ( --net=host ).
[Diagram: two EC2 instances, eth0 10.0.0.10 and eth0 10.0.0.12, each running an SSH server]
Specify 10.0.0.10 and 10.0.0.12 in the hostfile.
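An Open MPI hostfile is a plain list with one node per line and an optional slots count. A minimal sketch of generating one for the two addresses above follows. The helper name write_hostfile is hypothetical, and slots=1 assumes one MPI process per node with OpenMP threads filling the remaining cores.

```shell
# Write an Open MPI hostfile: one "<address> slots=<n>" line per node.
# $1: output path, $2: MPI processes per node, remaining args: node addresses
write_hostfile() {
    out=$1
    slots=$2
    shift 2
    : > "$out"   # create or truncate the file
    for addr in "$@"; do
        printf '%s slots=%s\n' "$addr" "$slots" >> "$out"
    done
}
```

For example: write_hostfile hostfile 1 10.0.0.10 10.0.0.12, followed by mpirun -np 2 --hostfile hostfile -x OMP_NUM_THREADS=2 ./hybrid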

Slide 19

Slide 19 text

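Computing over Docker's virtual network
The SSH server is launched as a container as well.
[Diagram: two EC2 instances (eth0 10.0.0.10 and eth0 10.0.0.12), each with a docker0 bridge and a veth pair; the containers' eth0 addresses are 172.17.0.2 and 172.17.0.4, each running SSH]
Specify 172.17.0.2 and 172.17.0.4 in the hostfile.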
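With the virtual network, the hostfile entries are whatever addresses Docker assigned to the containers, so they are best looked up rather than hard-coded. A sketch using docker inspect follows; the helper names are hypothetical, and slots=1 is again an assumption.

```shell
# Print the IP address Docker assigned to a running container.
container_ip() {
    docker inspect \
        --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Print one hostfile line per named container.
container_hostfile() {
    for name in "$@"; do
        printf '%s slots=1\n' "$(container_ip "$name")"
    done
}
```

For example, with the two containers from the single-Mac demo: container_hostfile 02-node01 02-node02 > hostfile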

Slide 20

Slide 20 text

A usage example and open issues on Kubernetes
Trying it out

Slide 21

Slide 21 text

Push the application to ECR beforehand
[Diagram: Build, then Push]
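The build and push step might look like the sketch below. It assumes the ECR repository already exists and that the installed AWS CLI supports aws ecr get-login-password (AWS CLI v2 or a recent v1). The helper names and all argument values are placeholders.

```shell
# Compose the ECR image URI: <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>
ecr_uri() {
    printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Build the image from the current directory and push it to ECR.
# $1: AWS account ID, $2: region, $3: repository name, $4: tag
push_to_ecr() {
    uri=$(ecr_uri "$1" "$2" "$3" "$4")
    # ${uri%%/*} strips everything from the first "/" on, leaving the registry host.
    aws ecr get-login-password --region "$2" \
        | docker login --username AWS --password-stdin "${uri%%/*}" \
        && docker build -t "$uri" . \
        && docker push "$uri"
}
```

For example: push_to_ecr 123456789012 ap-northeast-1 openmpi-hybrid latest (the account ID, region, and repository name here are made up).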

Slide 22

Slide 22 text

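Deploy the compute nodes first
Apply the YAML that defines the job.
• Start the pods as SSH servers
• Use node affinity
• For simplicity, a DaemonSet this time
[Diagram: EKS (management nodes), ECR, and the compute nodes (c5.large, c5.large, c5.large, …), each running SSH]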
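The manifest itself is not shown on the slide. A sketch of what a DaemonSet with node affinity could look like is given below, emitted from a shell helper so the image URI can be passed in. The resource name, the app label, the instance-type label key, and the sshd path are all assumptions rather than the talk's actual configuration.

```shell
# Print a DaemonSet manifest that runs one SSH-server pod per matching node.
# $1: image URI. Names, labels, and the sshd path are illustrative assumptions.
mpi_node_daemonset() {
    cat <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: mpi-node
spec:
  selector:
    matchLabels:
      app: mpi-node
  template:
    metadata:
      labels:
        app: mpi-node
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node.kubernetes.io/instance-type
                    operator: In
                    values: ["c5.large"]
      containers:
        - name: sshd
          image: $1
          command: ["/usr/sbin/sshd", "-D"]
          ports:
            - containerPort: 22
EOF
}
```

For example: mpi_node_daemonset "$IMAGE" | kubectl apply -f -, where IMAGE holds the URI of the image pushed to the registry. Older clusters may expose the instance type under beta.kubernetes.io/instance-type instead.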

Slide 23

Slide 23 text

Submit the master process as a Job
[Diagram: compute nodes, each with eth0, a docker0 bridge, and a veth pair; the pods' eth0 addresses are 172.17.0.2 and 172.17.0.4, each running SSH]
Specify 172.17.0.2 and 172.17.0.4 in the hostfile.
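A Job for the master process could be sketched as follows. Instead of a mounted hostfile, this version passes the pod addresses with --host, as the single-Mac demo did. The resource name, the ./hybrid path, and the thread count are assumptions, and the image is assumed to let its default user run mpirun and reach the compute pods over SSH.

```shell
# Print a Job manifest that runs mpirun against the given hosts.
# $1: image URI, $2: comma-separated pod IPs (one MPI process per host)
mpi_master_job() {
    # Number of MPI processes = number of comma-separated hosts.
    np=$(printf '%s\n' "$2" | awk -F, '{ print NF }')
    cat <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: mpi-master
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: mpirun
          image: $1
          command: ["mpirun", "-np", "$np", "--host", "$2",
                    "-x", "OMP_NUM_THREADS=2", "./hybrid"]
EOF
}
```

Assuming the compute pods carry a label such as app=mpi-node, the addresses can be collected with hosts=$(kubectl get pods -l app=mpi-node -o jsonpath='{.items[*].status.podIP}' | tr ' ' ','), and the Job submitted with mpi_master_job "$IMAGE" "$hosts" | kubectl apply -f -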

Slide 24

Slide 24 text

Presented by

Slide 25

Slide 25 text

Profile
Ryo Nakamaru (中丸 良) @pottava
• CTO at SUPINF Inc
• Solutions Architect at Rescale, Inc.
• AWS Certified SA / DevOps Engineer - Pro

Slide 26

Slide 26 text

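Containerize your app!
• Contract development, operations, and consulting built on cloud and container expertise
• Running Docker in production since 2015, with a wide range of CI / CD case studies
• It is pronounced "supinfu", by the way...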

Slide 27

Slide 27 text

Cloud HPC with
• Provides a cloud HPC simulation platform
• Founded in early 2011, with investment from Peter Thiel, Microsoft, and others
• Scalable simulation and machine learning!

Slide 28

Slide 28 text

Thank you for listening :)
References:
• Getting Started with Amazon EKS ( https://docs.aws.amazon.com/ja_jp/eks/latest/userguide/getting-started.html )