Beyond DevOps: Operability for Serverless and IoT
Matthew Skelton, Conflux | 22 March 2018, UCL
Focus on operability to get the best out of Serverless and IoT
About me
Matthew Skelton, Conflux
@matthewpskelton
matthewskelton.net
Leeds, UK
assemblyconf.com
What we’ll learn
● Beyond DevOps
● Operability as a key focus
● What is Serverless?
● What is an IoT platform?
● Operability for Serverless & IoT
Beyond DevOps
Beyond DevOps - Operability for Serverless and IoT
@ConfluxHQ
DevOps
● Infrastructure Automation
vs
● “Highly effective, daily collaboration between software developers and IT operations people to produce relevant, working systems”
https://skeltonthatcher.com/blog/a-useful-working-definition-of-devops/
Wall of Confusion
Andrew Clay Shafer / @littleidea
Feedback and Learning
Gene Kim - https://itrevolution.com/the-three-ways-principles-underpinning-devops/
Type 1: Dev and Ops Collaboration
Type 2: Fully Shared Ops Duties
“Infrastructure as Code”
● Programmable via APIs
● Scripts in version control
● Testable and Test-driven
● Software Engineering practices applied to web infrastructure
Platform Teams
● Not on-call for applications
● Responsible only for underlying platform infra
● Provide the platform “as a Service” to Product teams
Product-aligned Teams
● On-call for application software
● User Experience (UX)
● Product viability
● Software dev & Testing
● Operational concerns
Productivity/Supporting Team(s)
Productivity Teams
● Advise & enable Product teams
● Experienced engineers (software, testing, operations)
● Coordinate cross-team work
● “Heavy lifting”
Invest in code that differentiates your organisation
Infrastructure concerns
● SaaS: Software as a Service
● FaaS: Function as a Service
● PaaS: Platform as a Service
● CaaS: Containers as a Service
● IaaS: Infrastructure as a Service
● On-premises: traditional/manual
Capabilities and Automation
Every software system needs a focus on operability to be successful
What is Serverless?
Serverless: pay only to execute differentiating code
Serverless
Event Stream architecture
FaaS: Functions as a Service
AWS Lambda, Azure Functions, Google Cloud Functions
PaaS: Platform as a Service
What is an IoT platform?
IoT Platform: control and respond to billions of sometimes-connected devices
Operability: a key focus
operability: an ability to work well (in Production)
Operability
Deploy
Monitor
Diagnose
Debug
Query
Control
Inspect
Clear
Operability for
Serverless & IoT
Operability for Serverless/IoT
“Even if the cloud provider is doing everything, ... is my latency where my customers need it to be? The provider’s going to do the best they can to give me a great service, but if my customers don’t agree, then I have a problem.”
-- Kelsey Hightower, @kelseyhightower
https://read.acloud.guru/you-need-sre-skills-to-thrive-in-a-serverless-world-kelsey-hightower-340a002b3730
User Experience
● Latency
● “Service”
“How much does this organisation care about my end-to-end experience?”
Preventative approaches
● Billions of metrics
● Automatic correlation
● Automated anomaly detection
● Act as ‘sensing’ for Product
“It just works”
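As a rough illustration of automated anomaly detection on a single metric, the sketch below flags values that sit far outside a rolling baseline. It is a minimal example rather than a production detector: the function name, the window size and the threshold are all assumptions chosen for the illustration, and a real platform would correlate many metrics rather than one.

```python
from collections import deque
from statistics import mean, pstdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return (index, value) pairs that deviate sharply from the recent window.

    Hypothetical helper: window and threshold are illustrative defaults.
    """
    recent = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(samples):
        if len(recent) == window:
            baseline = mean(recent)
            spread = pstdev(recent)
            # A perfectly flat window has zero spread, so it is skipped
            # here to avoid dividing by zero.
            if spread > 0 and abs(value - baseline) / spread > threshold:
                anomalies.append((index, value))
        recent.append(value)
    return anomalies


# Example: latency samples in milliseconds with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 450, 101]
spikes = find_anomalies(latencies_ms)
```

A check like this could feed the 'sensing' role for Product teams: the spike is surfaced before a customer reports it.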
Security
● Perimeter explosion (FaaS, IoT)
● Auto-renewal for Certs
● Egress detection for data
● Dependency scanning
“I feel safe with these people”
Guy Podjarny / InfoQ https://www.infoq.com/articles/serverless-security
Audit & Compliance
● What data do you hold?
● Changes: where, when, who?
● Archive: secure, relevant
“Show me your approach to data removal for consumers”
Cost control
● Detect run-away executions
● Retire little-used functions
● Prevent DoS → $$$
● Rapid product decisions
“What is the cost of this feature?”
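Two of the cost-control points above (run-away executions and little-used functions) can be sketched from an invocation log alone. The example below assumes a hypothetical log of (timestamp in seconds, function name) pairs; the function names, the limits and the log shape are illustrative and do not correspond to any cloud provider's API.

```python
from collections import Counter


def runaway_functions(invocations, window_seconds=60, max_calls_per_window=1000):
    """Return names of functions that exceed the call limit in any time window."""
    calls_per_bucket = Counter(
        (name, int(timestamp // window_seconds)) for timestamp, name in invocations
    )
    return sorted(
        {name for (name, _), count in calls_per_bucket.items()
         if count > max_calls_per_window}
    )


def little_used_functions(invocations, deployed, min_calls=1):
    """Return deployed functions with fewer than min_calls invocations in the log."""
    totals = Counter(name for _, name in invocations)
    return sorted(name for name in deployed if totals[name] < min_calls)


# Example with hypothetical function names and a deliberately low limit.
log = [(t, "resize-image") for t in range(5)] + [(10, "send-email")]
noisy = runaway_functions(log, window_seconds=60, max_calls_per_window=3)
idle = little_used_functions(log, ["resize-image", "send-email", "legacy-export"])
```

Here `noisy` names the function to investigate and `idle` names a candidate for retirement.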
Operability for Serverless & IoT: design for operational transparency
Modern logging & tracing 1/
Distinct states (Event ID): {Delivered, InTransit, Arrived}
Modern logging & tracing 2/
Trace the path (Correlation ID): 612999958..
Modern logging & tracing 3/
● Dev teams and Ops teams collaborate on logging details:
○ Log messages
○ EventID
○ Correlation
● Invest in logging (time & tools)
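The three logging details listed above (log message, Event ID, Correlation ID) can be combined in one structured log line. The sketch below is one possible shape, not a prescribed format: the state names come from the earlier slide, while the numeric IDs, the `DeliveryEvent` and `log_event` names and the JSON field names are assumptions made for the example.

```python
import json
import uuid
from enum import IntEnum


class DeliveryEvent(IntEnum):
    # State names follow the slide; the numeric IDs are hypothetical.
    IN_TRANSIT = 1001
    ARRIVED = 1002
    DELIVERED = 1003


def new_correlation_id():
    """Create an ID once at the edge of the system and pass it along every hop."""
    return uuid.uuid4().hex


def log_event(event, correlation_id, message, **details):
    """Emit one JSON log line carrying the Event ID and the Correlation ID."""
    record = {
        "event_id": int(event),
        "event": event.name,
        "correlation_id": correlation_id,
        "message": message,
        **details,
    }
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


# Two functions handling the same request reuse the same correlation ID.
request_id = new_correlation_id()
first = log_event(DeliveryEvent.IN_TRANSIT, request_id, "parcel left depot")
second = log_event(DeliveryEvent.ARRIVED, request_id, "parcel at hub", hub="Leeds")
```

Searching the logs for one `correlation_id` then traces a request across functions, and filtering on `event_id` finds every occurrence of a given state.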
Run Book dialogue sheets 1/
runbooktemplate.info
Run Book dialogue sheets 2/
● Checklists for typical operational considerations
● Team-friendly exploration around a large table
● See runbooktemplate.info
Summary
Operability for Serverless/IoT
● Is infrastructure automation the best approach for your org?
● Invest in areas that differentiate your organisation
● Make systems operate well
Operability for Serverless & IoT: design for operational transparency
Further reading
Team Guide to Software Operability
Matthew Skelton & Rob Thatcher
skeltonthatcher.com/publications
Download a free sample chapter