Service Operation Centered Development
2019/04/09 DevOpsDays Tokyo 2019
Mitsuyuki Shiiba (@bufferings)
EC Incubation Development Dept.
Rakuten, Inc.
Slide 2
@bufferings #devops_b2
"It's their fault, not ours"
Photo by Alexander McFeron on Unsplash
Slide 3
After all, does it make the service better?
Slide 4
Service Operation
Development
Interaction
Slide 5
Mitsuyuki Shiiba
Web Application Engineer
@Rakuten Osaka from 2010
Slide 6
30 countries & regions with local operational presence
70+ services
Almost 1.3 B global members
¥15.4 T global gross transaction value
*FY2018
Figures are from https://global.rakuten.com/corp/ accessed on April 3rd, 2019
Service Operation
Maintenance
Monitoring
Troubleshooting
Inquiry handling
Analysis
Improvement
Alert handling
Version upgrade
etc…
Slide 10
Service Operation is
to keep the service stable
Slide 11
Service Operation is to the service
Photo by Ravi Roshan on Unsplash
Slide 12
"It's a lot of fun!"
Slide 13
The Team
doing both operation and development, including the releases
Slide 14
that's because
knowing != understanding
Slide 15
What makes Service Operation tough
Too many alerts
Non-automated tests
Long methods
Meaningless names
Manual deployment
Outdated documents
Big ball of mud
Dirty servers
etc…
Slide 16
then I understood in 2011
understanding != doing
Slide 17
Photo by Simon Rae on Unsplash
Slide 18
We should connect dev & ops in our heads
putting Service Operation at the center
Slide 19
Good culture, feature flags, chatops, and more.
Rakuma team wants you! -> (Wantedly) https://shiiba.page.link/rakuma
if current_time <= 7m
Slide 20
Development
Slide 21
1. Reply to all the alerts immediately
Slide 22
It's US who get woken up at midnight, not anyone else.
Photo by Andre Mouton on Unsplash
Slide 23
Is this really necessary?
• Can we check it the next morning?
• Can we retry it?
• Can we develop automatic recovery?
What's actually happened?
• What's the impact on the users?
• What should we do for it?
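The questions "Can we retry it?" and "Can we develop automatic recovery?" could translate into code roughly as follows. This is only a minimal sketch under stated assumptions: `RetryingTask`, `runWithRetry`, and `maxAttempts` are hypothetical names that do not come from the talk, and a production version would likely wait between attempts.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not from the talk): retry a flaky operation
// before anyone has to be woken up by an alert.
final class RetryingTask {
    private RetryingTask() {}

    /**
     * Runs the task up to maxAttempts times and returns the first successful result.
     * Only when every attempt fails is the last exception rethrown, which is the
     * point where sending an alert may actually be necessary.
     */
    static <T> T runWithRetry(Callable<T> task, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

A transient failure that disappears on the second or third attempt never reaches the on-call engineer; only a failure that survives all attempts does.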
Slide 24
LogMessageBuilder
so that everyone can think about the impact & the solution
message("What's happened?")
    .cause("What's the cause?")
    .impact("What's the user impact?")
    .solution("What do we have to do?")
    .build()
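The slide shows only how the builder is called. One way such a builder might be implemented is sketched below. The method names (`message`, `cause`, `impact`, `solution`, `build`) are from the slide; everything else is an assumption: the static factory, the `"unknown"` defaults, and the `key="value"` output format are invented for illustration.

```java
// Hypothetical sketch of the builder named on the slide.
// The output format and the "unknown" defaults are assumptions.
final class LogMessageBuilder {
    private final String message;
    private String cause = "unknown";
    private String impact = "unknown";
    private String solution = "unknown";

    private LogMessageBuilder(String message) {
        this.message = message;
    }

    // Entry point: what happened?
    static LogMessageBuilder message(String message) {
        return new LogMessageBuilder(message);
    }

    LogMessageBuilder cause(String cause) {
        this.cause = cause;
        return this;
    }

    LogMessageBuilder impact(String impact) {
        this.impact = impact;
        return this;
    }

    LogMessageBuilder solution(String solution) {
        this.solution = solution;
        return this;
    }

    // Renders one log line that always carries all four fields.
    String build() {
        return "message=\"" + message + "\""
            + " cause=\"" + cause + "\""
            + " impact=\"" + impact + "\""
            + " solution=\"" + solution + "\"";
    }
}
```

Because every log line has an `impact` and a `solution` slot, whoever writes the log statement is nudged to think about both, and a missing answer shows up as `unknown` instead of being silently absent.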
Slide 25
(Learned) Handle them at the controller layer
to know the actual impact on the users
Controller ← We can know the user impact
Application Service / Data Access ← We can't know the user impact
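The layering above might look like this in code. It is a sketch only, assuming a plain three-layer design with no web framework; all class names (`StockRepository`, `StockService`, `ItemPageController`, `StockUnavailableException`) and the in-memory `alerts` list are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception raised by the lower layers.
final class StockUnavailableException extends RuntimeException {
    StockUnavailableException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Data access layer: reports the failure but cannot tell who is affected.
interface StockRepository {
    int findStock(String itemId);
}

// Application service layer: passes the failure up without alerting.
final class StockService {
    private final StockRepository repository;

    StockService(StockRepository repository) {
        this.repository = repository;
    }

    boolean isInStock(String itemId) {
        return repository.findStock(itemId) > 0;
    }
}

// Controller layer: the only place that knows what the user actually sees.
final class ItemPageController {
    final List<String> alerts = new ArrayList<>(); // stand-in for a real alerting channel
    private final StockService service;

    ItemPageController(StockService service) {
        this.service = service;
    }

    String show(String itemId) {
        try {
            return service.isInStock(itemId) ? "In stock" : "Sold out";
        } catch (StockUnavailableException e) {
            // The page still renders, so the impact is limited and can be stated precisely.
            alerts.add("cause=" + e.getMessage()
                + " impact=item page shown without stock status");
            return "Stock status is temporarily unavailable";
        }
    }
}
```

The same repository failure could be harmless on one page and critical on another, which is why the decision about the alert is left to the controller.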
Slide 26
2. Safe by Design
Slide 27
Fool Proof
to use the tools at ease
Fail Safe
to keep consistency for every single line
Idempotence
so that we can send the same message multiple times
Eventual Consistency
for automatic recovery
Decoupled Architecture
to minimize the user impact
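Of the properties above, idempotence is the easiest to show in a few lines. The sketch below assumes each message carries a unique ID; `PointGrantHandler` and its methods are hypothetical names, and the in-memory collections stand in for a persistent store that a real service would need (the sketch is also not thread-safe).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical idempotent message handler: receiving the same message
// twice has the same effect as receiving it once.
final class PointGrantHandler {
    private final Set<String> processedMessageIds = new HashSet<>();
    private final Map<String, Integer> pointsByUser = new HashMap<>();

    /** Applies the grant once per messageId; returns false for a duplicate. */
    boolean handle(String messageId, String userId, int points) {
        if (!processedMessageIds.add(messageId)) {
            return false; // already processed: a resend changes nothing
        }
        pointsByUser.merge(userId, points, Integer::sum);
        return true;
    }

    int pointsOf(String userId) {
        return pointsByUser.getOrDefault(userId, 0);
    }
}
```

With a handler like this, the sender can simply resend a message whenever it is unsure about delivery, which in turn makes retries and automatic recovery safe.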
Slide 28
3. Services over Projects
Slide 29
Think about the service narrative,
including non-systematized areas
Slide 30
Service quality over project deadline
Slide 31
Delivered value over estimation accuracy
Slide 32
Reading time over writing time
Slide 33
Code/Tests/Docs
Readable
• having a reason for every single line
• meaningful names
• small methods
• overview → detail
Maintainable
• keep them updated
• just enough quality & quantity
Slide 34
Team's growth over resource efficiency
Slide 35
Learning Teams
• Pair Work, Mob Work
• Learning Sessions
• Daily Experiments
Slide 36
EC Start-up Group is mobbing every day!
-> (Rakuten Careers) https://shiiba.page.link/ecsu
if current_time <= 15m
Slide 37
Interaction
Slide 38
1. Feel the forces
Slide 39
"I don't know why they don't do this!"
Slide 40
Doing the right thing in your place doesn't work
Feel the forces in the org. Respect & trust other people.
Slide 41
Respect other people's expertise
We can only see what we know
Slide 42
Trust people, doubt the Ba (場: environment)
Trust that everyone enjoys making the service better.
Slide 43
Take advantage of the forces for the service
or change the forces together
Slide 44
2. See around the corner
Slide 45
Photo by dan carlson on Unsplash
Slide 46
Everything is changing rapidly
• business
• technology
• design
• organization
See what will come next for the service, and prepare for it
Slide 47
Summary
Service Operation
Development
1. Reply to all the alerts
2. Safe by Design
3. Services over Projects
Interaction
1. Feel the forces
2. See around the corner
Slide 48
After all, does it make the service better?
Slide 49
Appendix
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
https://conferences.oreilly.com/velocity/velocity2009/public/schedule/detail/7641