Slide 1

Slide 1 text

Alchemist and D/OX (developer / operation experience) ElixirConfJP2019 kokurajo Tsunenori Ohara

Slide 2

Slide 2 text

agenda ● about me ● about us ● goal of this talk Part 1. D/OX ● what is D/OX? ● development cycle ● how to improve D/OX? ● 4 automations ○ testing ○ scaling ○ deployment ☆ ○ logging/monitoring Part 2. Elixir Deployment ● modern web app infra ● elixir impedance mismatch ● hot code swap or CD? ● our deployment journey ● elixir deployment tao Part 3. Summary ● conclusion

Slide 3

Slide 3 text

about me ● Tsunenori Ohara ○ Twitter: @ohrdev ○ Github: ohr486 ● Community ○ beam-lang.tokyo, tokyo.ex, Erlang & Elixir Fest organizer ○ meguro.rb, meguro.es, etc ● Work ○ enza platform division, platform development, general manager ○ technology officer ○ architect, infra/app developer, SRE ○ using elixir since Elixir v1.0 in production ● Hobby ○ making buddha statue ○ log collection (wooden logs) ○ sutra copying

Slide 4

Slide 4 text

about us ● Drecom Co., Ltd. ● using Elixir since v1.0 was released (2014〜) ● we use elixir with ○ Framework: maru, phoenix ○ Infra/Cloud: AWS ○ Compute: EC2, kubernetes/EKS, lambda ○ Error Tracking: sentry ○ APM: newrelic, appsignal, prometheus ○ DB/KVS: DynamoDB, Redis, MySQL ○ Stream/Messaging: Kafka, Kinesis ● [AD] Tech Inside Drecom (our tech blog) ○ https://tech.drecom.co.jp

Slide 5

Slide 5 text

goal of this talk ● understand D/OX ● understand how to improve D/OX ● understand how to deploy elixir apps ● understand how to automate development work with elixir ● introduce our trial and error with elixir app deployment Attention! This talk contains little elixir source code.

Slide 6

Slide 6 text

part 1. D/OX

Slide 7

Slide 7 text

what is D/OX? ● Developer Experience ○ An indicator of how comfortably an engineer can develop the system ● Operation Experience ○ An indicator of how comfortably an engineer can operate the system ● Good D/OX ○ Development/Operation is fun! ○ Small technical debt ○ Automation ● Bad D/OX ○ Development/Operation is painful! ○ Big/Huge technical debt ○ Manual operations ○ D/OX will deteriorate if left unattended

Slide 8

Slide 8 text

[Diagram: development cycle] business (quality / cost / delivery) → development & operation cycle → product value. Good D/OX: afford/margin → KAIZEN behaviour (tao). Bad D/OX: anxiety/worry → bad behaviour → tech debt → refactor/redesign. Resources: engineers, money, time.

Slide 9

Slide 9 text

how to improve D/OX ● Good DX? ○ automate manual tasks ■ testing ■ code analysis ○ running on dev env == running on prod env ○ NOT legacy ● Good OX? ○ automate manual tasks ■ delivery ● deployment tactics ○ blue green, canary, hot code swap, etc ■ scaling ○ error handling / trouble shooting ○ easy middleware upgrades Automation and good architecture

Slide 10

Slide 10 text

4 automations ● testing ● scaling ● deployment ● log tracking

Slide 11

Slide 11 text

automation: testing ● adopt in the initial development phase ● run tests with CI ○ unit test: exunit ○ lint: credo/dogma ○ static analysis: dialyxir ○ coverage: excoveralls ● load test / stress test ○ set up a “stress” MIX_ENV ○ same level as the dev, test and prod envs ○ stress branch => hard to cherry-pick ○ stress MIX_ENV => easy to manage ● frequent load testing reduces performance problems ○ maintain load test scenarios like unit tests ○ easy to run load tests (see the sketch below)
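
A minimal sketch of how these tools are commonly wired into mix.exs; the package names are the usual Hex packages, versions are illustrative and the project name is hypothetical. A "stress" MIX_ENV then typically only needs a config/stress.exs (loaded via import_config) and MIX_ENV=stress when invoking the load-test tasks.

```elixir
# mix.exs -- sketch of wiring the CI tools above into a project (illustrative versions).
defmodule TestApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :test_app,
      version: "0.1.0",
      elixir: "~> 1.9",
      deps: deps(),
      # route coverage through excoveralls instead of the built-in tool
      test_coverage: [tool: ExCoveralls],
      preferred_cli_env: [coveralls: :test, "coveralls.html": :test]
    ]
  end

  defp deps do
    [
      {:credo, "~> 1.1", only: [:dev, :test], runtime: false},
      {:dialyxir, "~> 1.0", only: [:dev, :test], runtime: false},
      {:excoveralls, "~> 0.12", only: :test}
    ]
  end
end
```

CI can then run mix test, mix credo, mix dialyzer and mix coveralls on every push.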

Slide 12

Slide 12 text

automation: scaling ● auto scaling ○ cloud: Auto Scaling Group ○ k8s: horizontal pod autoscaling ● monitoring ○ APM (application performance monitoring): newrelic, prometheus, appsignal ● deployment ○ deployment methods that support auto scaling => Part 2.
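
Auto scaling also needs a way for the load balancer or a k8s readiness probe to know an instance is ready. A minimal sketch, assuming plug/plug_cowboy as the HTTP stack and hypothetical module names:

```elixir
# Hypothetical health-check endpoint; LB health checks / k8s probes poll this so
# freshly autoscaled instances only receive traffic once the app is actually up.
defmodule MyApp.HealthPlug do
  use Plug.Router

  plug :match
  plug :dispatch

  get "/healthz" do
    send_resp(conn, 200, "ok")
  end

  match _ do
    send_resp(conn, 404, "not found")
  end
end

# Started from the supervision tree (assuming plug_cowboy is a dependency):
#   {Plug.Cowboy, scheme: :http, plug: MyApp.HealthPlug, options: [port: 4001]}
```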

Slide 13

Slide 13 text

automation: deployment ● support auto scaling ● => part 2.

Slide 14

Slide 14 text

automation: logging ● don’t keep logs on the server ● use a log collector ○ to S3, GCS ● check the pricing of cloud services ○ cloudwatch logs ○ stackdriver ● easy to bulk search ○ AWS athena ○ bigquery ● monitoring ○ APM: newrelic, appsignal, prometheus ■ BEAM metrics (see the sketch below) ○ Error tracking: sentry, appsignal, newrelic
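
As a sketch of the BEAM-metrics point above, a small GenServer can periodically collect VM metrics and hand them to whichever APM or log collector is in use. The :erlang calls are standard; report/1 is a placeholder for the actual client:

```elixir
# Minimal periodic BEAM-metrics reporter; report/1 stands in for your APM client
# (newrelic, appsignal, a prometheus exporter, ...).
defmodule MyApp.BeamMetrics do
  use GenServer
  require Logger

  @interval :timer.seconds(30)

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    schedule()
    {:ok, %{}}
  end

  @impl true
  def handle_info(:collect, state) do
    report(%{
      memory_total_bytes: :erlang.memory(:total),
      process_count: :erlang.system_info(:process_count),
      run_queue: :erlang.statistics(:run_queue)
    })

    schedule()
    {:noreply, state}
  end

  defp schedule, do: Process.send_after(self(), :collect, @interval)

  # Replace with a push to your APM / log collector.
  defp report(metrics), do: Logger.info("beam_metrics #{inspect(metrics)}")
end
```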

Slide 15

Slide 15 text

part 2. Elixir Deployment

Slide 16

Slide 16 text

modern web app infra ● public cloud (AWS/GCP/Azure/etc) ○ servers stop ○ networks go down ○ replace a server when it breaks ■ AWS: EC2 gacha ● cloud native ○ devops ○ continuous delivery ○ containers ○ microservices ● running on docker/k8s ● serverless

Slide 17

Slide 17 text

elixir impedance mismatch ● elixir/erlang architecture ○ high cpu efficiency ○ hot code swap ○ umbrella apps on VM ● modern infra/k8s architecture ○ run many (small cpu/mem) pods ○ immutable infrastructure ○ microservices

Slide 18

Slide 18 text

[erlang vm architecture] [Diagram] A Linux server runs one Erlang VM with as many schedulers as there are CPUs; an erlang process != an OS process; the VM migrates processes between schedulers.
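
The diagram can be reproduced from iex; a quick sketch (the numbers depend on the machine):

```elixir
# Quick iex checks that mirror the diagram.
System.schedulers_online()               # => scheduler count, defaults to the CPU count
:erlang.system_info(:process_count)      # => current number of BEAM processes

# BEAM processes are not OS processes: spawning 100k of them is cheap.
pids = for _ <- 1..100_000, do: spawn(fn -> Process.sleep(:infinity) end)
length(pids)                             # => 100000
Enum.each(pids, &Process.exit(&1, :kill))
```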

Slide 19

Slide 19 text

elixir impedance mismatch [Diagram] On a single node (Linux server), one Erlang VM runs a scheduler per CPU and hosts many processes: high CPU efficiency. On a k8s cluster, each node runs many small (CPU/Mem) pods; k8s allocates CPU & Mem and the Erlang VM inside each container gets only a few schedulers: high infra cost efficiency.

Slide 20

Slide 20 text

[process architecture] [Diagram] Inside the Erlang VM: application controller → application masters → application processes, for elixir/iex/mix, user apps and lib apps.
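
This hierarchy is what OTP builds when an application boots. A minimal sketch of a user app's entry point, with hypothetical names; the application master ends up calling start/2 here:

```elixir
# Minimal OTP application entry point. The application controller starts an
# application master per app; that master invokes start/2 below.
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # child specs for the app's own processes go here, e.g.
      # MyApp.BeamMetrics,
      # {Plug.Cowboy, scheme: :http, plug: MyApp.HealthPlug, options: [port: 4001]}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```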

Slide 21

Slide 21 text

elixir impedance mismatch [Diagram] On Linux servers, one Erlang VM's application controller and masters host several applications side by side (app1, app2, app3, common libs). On a k8s cluster, each application runs in its own pod/container with its own Erlang VM (e.g. an app1 pod).

Slide 22

Slide 22 text

elixir impedance mismatch ● how much resource (CPU/Mem) should each VM get: large or small? ○ running on instances => large resources ○ running on k8s => small resources ● load test and monitoring ○ DON’T GUESS, MEASURE ● easy to (load) test ● monitoring BEAM metrics (see the sketch below)
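
For the small-pod case, the scheduler count can be inspected and capped with standard VM calls (the +S emulator flag does the same at boot time); measure the effect rather than guessing:

```elixir
# Inspect and cap online schedulers, e.g. when a pod gets fewer CPUs than the node has.
:erlang.system_info(:schedulers)          # schedulers created at VM start
:erlang.system_info(:schedulers_online)   # schedulers currently in use

# Cap to 2 online schedulers for a small pod; load test before and after.
:erlang.system_flag(:schedulers_online, 2)
```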

Slide 23

Slide 23 text

hot code swap or CD? ● running on instance ○ hot code swap ■ hard way ● auto scaling ● deployment task ○ CD / Immutable infrastructure is easy to apply ● running on k8s ○ CD / Immutable infrastructure

Slide 24

Slide 24 text

our deployment journey ● phase 0 : local PC / iex -S mix ● phase 1 : local PC / daemonize, mix release ● phase 2 : single server / ssh login, git pull, mix release ● phase 3 : single server / deploy with CI ● phase 4 : multi server / package on deploy server, use S3 for delivery ● phase 5 : multi server / deploy with auto scaling ● phase 6 : multi server / deploy with edeliver/distillery ● phase 7 : multi server / deploy with auto scaling ● phase 8 : k8s cluster

Slide 25

Slide 25 text

phase 0. local PC / iex -S mix local PC $ mix new xxx $ MIX_ENV=prod iex -S mix

Slide 26

Slide 26 text

phase 1. local PC / daemonize local PC $ mix new test_app $ # Elixir 1.9 $ MIX_ENV=prod mix release $ _build/prod/rel/test_app/bin/test_app daemon
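
With Elixir 1.9+ the release used above is declared under the releases: key of project/0 in mix.exs; a sketch showing only the release-related part (option values are illustrative):

```elixir
# mix.exs (Elixir >= 1.9) -- only the release-related part of project/0 is shown.
# After this, `MIX_ENV=prod mix release` builds _build/prod/rel/test_app/ as in the slide.
def project do
  [
    app: :test_app,
    version: "0.1.0",
    elixir: "~> 1.9",
    start_permanent: Mix.env() == :prod,
    releases: [
      test_app: [include_executables_for: [:unix]]
    ]
  ]
end
```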

Slide 27

Slide 27 text

phase 2. single server / ssh, git, mix release local PC Server $ git pull $ _build/prod/rel/test_app/bin/test_app stop $ MIX_ENV=prod mix release $ _build/prod/rel/test_app/bin/test_app start 1 git push 2 ssh login 3 git pull 4 release & restart

Slide 28

Slide 28 text

phase 3. single server / update with CI local PC Server $ git pull $ _build/prod/rel/test_app/bin/test_app stop $ MIX_ENV=prod mix release $ _build/prod/rel/test_app/bin/test_app start 1 git push 4 git pull 5 release & restart 2 trigger 3 ssh login

Slide 29

Slide 29 text

phase 4. multi server / upload release file to S3 local PC Build Server 1 git push 4 git pull 2 trigger 3 ssh login $ git pull $ MIX_ENV=prod mix release $ s3cmd put _build/prod/rel 5 release & upload Server 6 ssh login 7 download rel & restart $ _build/prod/rel/test_app/bin/test_app stop $ s3cmd get $ _build/prod/rel/test_app/bin/test_app start

Slide 30

Slide 30 text

phase 5. multi server / deploy with auto scaling local PC Build Server 1 git push 4 git pull 2 trigger 3 ssh login $ git pull $ MIX_ENV=prod mix release $ s3cmd put _build/prod/rel 5 release & upload Server 6 ssh login 7 download rel & restart $ _build/prod/rel/test_app/bin/test_app stop $ s3cmd get $ _build/prod/rel/test_app/bin/test_app start NEW Server! 8 autoscaling 9 [in init script] download rel & start

Slide 31

Slide 31 text

phase 6. multi server / deploy with edeliver/distillery local PC Build Server 1 git push 4 git pull 2 trigger 3 edeliver build $ git pull $ mix edeliver build $ mix edeliver deploy $ mix edeliver restart 5 release & upload Server 6 edeliver deploy 7 edeliver restart tar file 5 download tar tar file

Slide 32

Slide 32 text

phase 7. multi server / deploy with auto scaling local PC Build Server 1 git push 4 git pull 2 trigger 3 edeliver build $ git pull $ s3cmd put tar $ mix edeliver build $ mix edeliver deploy $ mix edeliver restart Server 7 edeliver deploy 8 edeliver restart tar file 5 download tar NEW Server! 9 autoscaling 6 upload tar tar file 10 [in init script] download tar & start $ s3cmd get $ /bin/test_app start

Slide 33

Slide 33 text

phase 8. k8s cluster local PC 1 git push 2 trigger $ git pull $ docker build --build-arg MIX_ENV=prod $ docker tag test_app xxx $ docker push xxx 3 docker build&tag 4 docker push 5 kubectl apply -f xxx.yaml 6 docker image pull 7 k8s update pods 8 scaling by HPA simple! $ MIX_ENV=prod mix release $ cp _build/prod/rel/test_app / $ /bin/test_app start
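
In a container/k8s setup, per-environment configuration usually comes from environment variables at boot, so the same docker image can run everywhere. Elixir 1.9 releases support this via config/releases.exs; a sketch with hypothetical variable names:

```elixir
# config/releases.exs -- evaluated when the release boots (Elixir 1.9+);
# the env var names below are hypothetical and set from the k8s manifest.
import Config

config :test_app,
  port: String.to_integer(System.fetch_env!("PORT")),
  database_url: System.fetch_env!("DATABASE_URL")
```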

Slide 34

Slide 34 text

elixir deployment tao ● use CI for deployment ● use convenient infrastructure ○ k8s hides deployment complexity ● don’t solve the problem with elixir alone ○ we have awesome cloud services

Slide 35

Slide 35 text

part 3. Summary

Slide 36

Slide 36 text

conclusion ● Good D/OX improves product quality ● Good D/OX == easy-to-use infrastructure / apps ● Automation! ● deployment is a big automation issue ● choose the deployment method that matches your architecture