Challenges for Global Service from a Perspective of SRE 2nd season

Cookpad Inc. Feb 27th, 2019 Takayuki Watanabe Technology Department SRE Group
Challenges for Global Service from a Perspective of SRE ~ 2nd season ~

About me
• Takayuki Watanabe
  - Twitter: takanabe_w / GitHub: takanabe
• Site Reliability Engineer (a.k.a. SRE)
  - Focus on Cookpad Global Projects

Today’s menu
• What is Cookpad Global?
• Role of Site Reliability Engineers
• Paving Roads for Autonomous Teams
  - Challenge 1: Organization Transformation for Greater Autonomy
  - Challenge 2: Feasible Self-service for Autonomous Teams

What is Cookpad Global?

What is Cookpad Global Service?
[Figure: Japanese Cookpad App and Global Cookpad Apps]
JP Service ≠ Global Service

Our users across the globe
[Figure: countries where the service is provided]
• 71 Countries
• 26 Languages
• 94 Million Monthly Average Users

# of Recipes for Global Service
[Figure: # of Recipes. Labels: "Over … million recipes", "… million recipes since …"]

Users and Developers across the globe
• Global Service and SRE?
• Empower a high-performing technology organization
Global headquarters: Bristol, UK
• 100 People
• 25 Nationalities
• The best people join from all over the world

Role of Site Reliability Engineers

A user living beyond a log

Our Product Developers

Missions for SREs in Cookpad
• Maximize user experience in terms of:
  - Service availability
  - Performance
  - Security
  - etc.
• Build a great platform to support a growing product
• A platform optimized for product development
• Software architects with comprehensive knowledge of the technology
Control service availability based on various factors

SRE technology scope in Cookpad
Areas: Service Platforms, Observability Engineering, Misc in-house Tooling, Release Engineering, Resilience Engineering
• Distributed Tracing
• Metrics Monitoring
• Logging System
• Alerts Management
• ML Based Anomaly Detection
• Data Analysis
• Team Health Visualization
• AWS Cost Optimization
• Developer Friendly Auth System
• etc.
• Deploy Pipeline
• Continuous Integration
• Continuous Delivery
• Deploy Strategy
• NW Fault Injection
• Spot Instance
• Circuit Breaker
• Throttling
• VM / Container Platform on AWS

Challenges in 2018
• Attacks from China
• GDPR
• Recipe data migration
• EKS based staging
• Recruitment in UK
• Observability
• Full containerization
• Spot instances
• Expense reduction
• Toil analysis automation

Paving Roads for Autonomous Teams
• Challenge 1: Organization Transformation for Greater Autonomy
• Challenge 2: Feasible Self-service for Autonomous Teams

Challenge 1: Organization Transformation for Greater Autonomy

Organization Transformation for Greater Autonomy
• Tipping Points for Autonomous Teams
• Organization Transformation: Chapter and Squad
• Development style change for new team structure
• Necessity of shared responsibility for service availability

Tipping Points for Autonomous Teams
• Cookpad employees in UK
  - 2016: 5 people
  - 2017: 50 people
  - 2018: 100 people
[Figure: Team structure in 2016, 2017: Web, iOS, Android, QA, SRE, PM, ML]

Tipping Points for Autonomous Teams
lines = n(n − 1) / 2
Communication cost ↑
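To make the formula concrete, here is a minimal sketch that applies it to the UK headcounts quoted on the earlier slide (5, 50 and 100 people). The method name `communication_lines` is illustrative and does not come from the slides.

```ruby
# Number of pairwise communication lines among n people: n(n - 1) / 2.
def communication_lines(n)
  n * (n - 1) / 2
end

# Headcounts quoted in the talk for the UK office.
HEADCOUNTS = { 2016 => 5, 2017 => 50, 2018 => 100 }.freeze

HEADCOUNTS.each do |year, people|
  puts "#{year}: #{people} people -> #{communication_lines(people)} lines"
end
```

Going from 5 to 100 people multiplies the headcount by 20 but the number of possible lines by roughly 500 (10 to 4950), which is the communication cost the slide points at.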

Organization Transformation: Chapter and Squad
[Figure: team structure before and after. Before: Web, iOS, Android, QA, SRE, PM, ML. After: Web, iOS, Android, QA and PM repeated across several groups, plus SRE and ML, with the labels "Chapter", "Product Squad" and "Cross-platform Squad".]
Conway's law … http://www.melconway.com/Home/Conways_Law.html

Development style change for new team structure
• Architecture of new feed
• New development styles

Architecture of new feed
[Figure: components shown are a Message broker, the Main API, the Feed API, caches and DBs. Labels: "GET /user_id/feed", "List of activity primary keys in order, paginated", "Complete feed json/html".]
New components developed by a squad

New development styles (Partial release in production)
• Feature toggle (application level control)
• Prototype environment (platform level control)

    # WIP code for new notification system
    # https://github.com/cookpad/xxxxxx-squad/issues/yyyyyy
    Rollout.add :notification_center, owner: "xxxxxx-squad" do
      # @developer_a, @developer_b, @developer_c, @developer_d, @developer_e
      Current.user&.id&.in?([AAAAAA, BBBBBB, CCCCCC, DDDDDD, EEEEEE])
    end

Only users know answers
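The slide shows only the call site of the feature toggle. As a rough illustration of the idea (application-level control by user ID), here is a self-contained sketch. `ToyRollout`, its methods and the user IDs are all hypothetical stand-ins, not Cookpad's actual `Rollout` implementation.

```ruby
# Hypothetical stand-in for a feature-toggle registry; not Cookpad's Rollout.
module ToyRollout
  Feature = Struct.new(:name, :owner, :condition)

  @features = {}

  class << self
    # Register a feature with an owning squad and a block that decides,
    # per user ID, whether the feature is switched on.
    def add(name, owner:, &condition)
      @features[name] = Feature.new(name, owner, condition)
    end

    # Features that were never registered are treated as off.
    def enabled?(name, user_id)
      feature = @features[name]
      return false unless feature

      feature.condition.call(user_id) ? true : false
    end
  end
end

# Made-up IDs standing in for the developers allowed to see the WIP feature.
ALLOWED_DEVELOPER_IDS = [101, 102, 103].freeze

ToyRollout.add(:notification_center, owner: "example-squad") do |user_id|
  ALLOWED_DEVELOPER_IDS.include?(user_id)
end

puts ToyRollout.enabled?(:notification_center, 101) # allow-listed developer
puts ToyRollout.enabled?(:notification_center, 999) # any other user
```

Keeping the owner next to the condition, as the slide does, makes it clear which squad to ask when a half-released feature misbehaves in production.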

Was feed a successful feature?
• Yes, feed was one of the most successful features in 2018
• New architecture
• New technology stack
• 100% release in production in a short time

Why was feed successful?
• A lot of trials, failures and improvements in a short term
• Developers had power and responsibility for feature development
• Feed was developed from scratch
• Developers could choose appropriate technology
• Introduce Streamy, Karafka (stream app frameworks)
• Test Kafka, RabbitMQ, SQS, Kinesis (message brokers)
Rapid prototyping was successful
On the other hand …

Necessity of shared responsibility for service availability
Too many errors SREs cannot understand…

Happiness Quadrant (release new feed)
[Figure: axes are Developers' happiness and SREs' happiness, each running from unhappy to happy. Labels: "new product", "tough experiences".]
Not sustainable …

Why did this situation happen?
• A lot of trials, failures and improvements in a short term
• Developers had power and responsibility for feature development
• Feed was developed from scratch
• Developers could choose appropriate technology
• Introduce Streamy, Karafka (stream app frameworks)
• Test Kafka, RabbitMQ, SQS, Kinesis (message brokers)
• No concept of shared responsibility for service availability

(WIP) Shared responsibility as Autonomous Teams
• Shared responsibility for organization sustainability
• Reach consensus on service availability for each feature
• Targets decided by product owners
• Higher quality in emergency notifications
• Alert handling by appropriate people
• Another organization transformation based on ideal tech & business architectures
Inverse Conway Maneuver …
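One way to make a per-feature availability target tangible when product owners and SREs negotiate it is to translate the percentage into allowed downtime. The sketch below does that arithmetic. The feature names and targets are invented for illustration, since the talk gives no actual numbers, and `allowed_downtime_minutes` is a hypothetical helper.

```ruby
# Allowed downtime (in minutes) implied by an availability target
# over a window of the given number of days.
def allowed_downtime_minutes(target, days: 30)
  total_minutes = days * 24 * 60
  (total_minutes * (1 - target)).round(1)
end

# Hypothetical per-feature targets; the talk does not state real values.
TARGETS = { feed: 0.999, search: 0.9995, notification_center: 0.99 }.freeze

TARGETS.each do |feature, target|
  minutes = allowed_downtime_minutes(target)
  puts "#{feature}: #{(target * 100).round(2)}% -> #{minutes} min per 30 days"
end
```

A target expressed as minutes per month is easier to agree on than a bare percentage, and it gives the people handling alerts a shared yardstick for deciding which alerts deserve an emergency notification.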

Challenge 2: Feasible Self-service for Autonomous Teams

Feasible Self-service for Autonomous Teams
• Four Important Keys for Successful Autonomous Teams
• Feasible Self-service for Developers
• Our focused scope
  - Full-containerization
  - No ssh debugging

Four Important Keys for Successful Autonomous Teams
• Discipline: Common rules in organization
  - Technology stack, team structure
• Freedom: Ownership for individual developments
  - Small team, technology selection, system design
• Responsibility: Commitments for whole software life cycle
  - Design, implementation, test, deploy, service availability monitoring
• Optimization: Best practices for product developments
  - Logging and monitoring system, deploy pipeline, feature toggle
Organization strategy matters
Strong leadership across tech and business is essential
SRE squad can contribute
Optimized self-service mechanisms providing company-wide best practices in SRE

Feasible Self-service for Developers
• Low learning cost
  - e.g. Are you sure that developers are happy to learn and maintain k8s YAML?
• Secure and painless operations in production
  - e.g. Are the experiences provided by SREs comfortable and secure for developers?

Our focused scope
• Full-containerization
• No ssh debugging

Full-containerization

Pros of Applications on Container Platform
• Developers can control software version upgrade timing
• SREs don’t want to maintain legacy VM based service platform
• Application of in-house tools and company-wide best practices
  - Auto Scaling
  - Cost optimization (spot fleets)
  - Container apps deployment tool (hako)
  - Centralized developer console (hako-console)
  - Easy service mesh integration
  - etc.
• Immutable infrastructure
  - Version controlled applications and infrastructures
  - No configuration drifts

Development Lead Time
[Figure: Dev and SRE]
Gaps between Devs & SREs …

Happiness Quadrant (Software Upgrade without container)
[Figure: axes are Developers' happiness and SREs' happiness, each running from unhappy to happy. Labels: "New software version", "Unproductive tasks", "Blocked time experience", "Total happiness".]
A great mechanism might put the vector onto the 1st quadrant …
Run all stateless applications on container clusters

Progress of Full-containerization in Global
17/18 apps are running on containers (94% complete)

[Figure: Dev. Labels: "The date Ruby … was released", "Dec …th"]
Developers can control Ruby versions without SREs' support

Happiness Quadrant (Software Upgrade with container)
[Figure: axes are Developers' happiness and SREs' happiness, each running from unhappy to happy. Labels: "New software version", "training cost", "learning cost", "operational cost reduction", "Total happiness".]
Win - Win
Plus, SREs can focus on the container platform (more best practices can be introduced)

No ssh debugging

Cons of Applications on Container Platform
• Additional complexities for developers
• Lack of tools causes chaos
ref: https://speakerdeck.com/takanabe/challenges-for-global-service-from-a-perspective-of-sre (slide …)
Already enough?
• No ssh debugging systems for Global team
  - Granular and chronological order metrics dashboard
  - Container optimized New Relic agent deployment
  - Short-term log collection
  - Safe rails console for container

Granular and chronological metrics dashboard
[Figure: Containers export metrics to a time series database layer (Short-term TDB, Long-term TDB, InfluxDB, Prometheus, with downsampling); the Developer views them through the user interface, Grafana.]
• Before
  - We could not dig into errors caused by spikes in resource saturation
• After
  - We can recognize errors caused by spikes in resource saturation
  - We can judge whether errors should be fixed soon or not
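The reason granularity matters for spikes can be shown with a few lines of arithmetic. The sketch below uses made-up CPU samples and a hypothetical `downsample` helper that averages fixed windows. It illustrates downsampling in general and is not a description of how the InfluxDB or Prometheus setup in the deck is configured.

```ruby
# Downsample by averaging consecutive windows of `window` samples.
def downsample(samples, window)
  samples.each_slice(window).map { |slice| slice.sum.to_f / slice.size }
end

# Made-up CPU utilization (%) sampled every second for one minute:
# mostly 20%, with a 3-second saturation spike.
per_second = Array.new(60, 20)
per_second[30, 3] = [100, 100, 100]

per_minute = downsample(per_second, 60)

puts "max at 1s granularity: #{per_second.max}%"
puts "1-minute average:      #{per_minute.first}%"
```

At one-second granularity the saturation is obvious, while the one-minute average stays at 24% and looks healthy. Keeping fine-grained data for a short period and downsampled data for the long term gives both views.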

Container optimized New Relic agent deployment
[Figure: a Container running the APP. Labels: "shared memory", "agent start flag", "http app _new_relic start", "rack new_relic starter", "hakopartiarelic", "exec consul lock".]
• Before
  - ECS could not deploy a New Relic agent to a specific container (We want to save)
  - Agents were gone when containers were killed accidentally
• After
  - ECS can deploy a New Relic agent to a container
  - Distributed locking via `consul lock` sidecar
  - Rack middleware that provides an endpoint to start the New Relic agent
  - Agents are launched in a container when the agent start flag exists on shared memory

Short-term log collection
[Figure: Containers export logs to short-term log search and long-term log search (S3, Athena, Elasticsearch); the Developer searches them through the user interface (hako console log search, Kibana).]
• Before
  - Developers had to wait a few minutes to search logs
• After
  - Developers can check logs in near real time

Safe rails console for container
[Figure: the Developer uses the trapdoor console (user interface), which talks to a trapdoor proxy and a trapdoor agent next to the APP in Containers. Labels: "Interactive communication via WebSocket", "Export audit logs", "Slack", "Manage access privileges", "Azure AD", "Bin exec", "data only container".]
• Before
  - Developers ssh to servers and run `rails c` (sometimes `rails c -s`)
  - Developers can run write queries in production (historical technical debt)
• After
  - Developers can use a REPL via web browser with safe options selected by SREs
  - Developers can only run read queries on a designated database instance

Safe rails console for container (before)

    takayuki-watanabe@ssh-accepatable-host-xxx:~$ date
    Thu Apr 19 10:53:28 UTC 2018
    takayuki-watanabe@ssh-accepatable-host-xxx:~$ htop
      PID USER    PRI NI VIRT RES  SHR   S CPU% MEM% TIME+    Command
    [snip]
    31817 cookpad  20  0 794M 129M   188 R 93.7  1.7 1923h    ruby bin/rails console production -s
     8773 cookpad  20  0 734M 165M   152 R 91.7  2.2 1800h    ruby bin/rails console production -s
     8107 cookpad  20  0 959M 734M 14228 R 83.7  9.8 40h01:04 ruby bin/rails c production
    [snip]

Safe rails console for container (after)
Feasible self-service makes product development reliable and autonomous!!

Recap
• What is Cookpad Global?
• Role of Site Reliability Engineers
• Paving Roads for Autonomous Teams
  - Challenge 1: Organization Transformation for Greater Autonomy
  - Challenge 2: Feasible Self-service for Autonomous Teams

Thank you!!
