Slide 1

Slide 1 text

Beyond DevOps: Operability for Serverless and IoT
Matthew Skelton, Conflux | 22 March 2018, UCL

Slide 2

Slide 2 text

Focus on operability to get the best out of Serverless and IoT

Slide 3

Slide 3 text

About me
Matthew Skelton, Conflux
@matthewpskelton
matthewskelton.net
Leeds, UK

Slide 4

Slide 4 text

About me
Matthew Skelton, Conflux
@matthewpskelton
matthewskelton.net
Leeds, UK
assemblyconf.com

Slide 5

Slide 5 text

What we’ll learn
● Beyond DevOps
● Operability as a key focus
● What is Serverless?
● What is an IoT platform?
● Operability for Serverless & IoT

Slide 6

Slide 6 text

Beyond DevOps
Beyond DevOps - Operability for Serverless and IoT @ConfluxHQ

Slide 7

Slide 7 text

DevOps
● Infrastructure Automation
vs
● “Highly effective, daily collaboration between software developers and IT operations people to produce relevant, working systems”
https://skeltonthatcher.com/blog/a-useful-working-definition-of-devops/

Slide 8

Slide 8 text

Before “Cloud”
● Costly physical infrastructure
● Lengthy deployments (months)
● Mostly manual configuration
● Slow feedback speed
● Optimised for cost

Slide 9

Slide 9 text

Leonardo Rizzi - https://www.flickr.com/photos/stars6/4381851322/ CC-SA

Slide 10

Slide 10 text

Build vs Operate
BUILD | OPERATE

Slide 11

Slide 11 text

Dev vs Ops
DEV | OPS

Slide 12

Slide 12 text

Wall of Confusion
Andrew Clay Shafer / @littleidea

Slide 13

Slide 13 text

Feedback and Learning
Gene Kim - https://itrevolution.com/the-three-ways-principles-underpinning-devops/

Slide 14

Slide 14 text

Feedback and Learning
Gene Kim - https://itrevolution.com/the-three-ways-principles-underpinning-devops/

Slide 15

Slide 15 text

Feedback and Learning
Gene Kim - https://itrevolution.com/the-three-ways-principles-underpinning-devops/

Slide 16

Slide 16 text

Type 1: Dev and Ops Collaboration

Slide 17

Slide 17 text

Type 2: Fully Shared Ops Duties

Slide 18

Slide 18 text

“Infrastructure as Code”
● Programmable via APIs
● Scripts in version control
● Testable and Test-driven
● Software Engineering practices applied to web infrastructure

Slide 19

Slide 19 text

CAMS
● Culture
● Automation
● Measurement
● Sharing
John Willis - https://blog.chef.io/2010/07/16/what-devops-means-to-me/

Slide 20

Slide 20 text

CAMS
● Culture
● Automation
● Measurement
● Sharing
John Willis - https://blog.chef.io/2010/07/16/what-devops-means-to-me/
25% !!!

Slide 21

Slide 21 text

Type 3: Ops as Platform

Slide 22

Slide 22 text

Platform Teams
● Not on-call for applications
● Responsible only for underlying platform infra
● Provide the platform “as a Service” to Product teams

Slide 23

Slide 23 text

Product-aligned Teams

Slide 24

Slide 24 text

Product-aligned Teams
● On-call for application software
● User Experience (UX)
● Product viability
● Software dev & Testing
● Operational concerns

Slide 25

Slide 25 text

Productivity/Supporting Team(s)

Slide 26

Slide 26 text

Productivity Teams
● Advise & enable Product teams
● Experienced engineers (software, testing, operations)
● Coordinate cross-team work
● “Heavy lifting”

Slide 27

Slide 27 text


Slide 28

Slide 28 text

Invest in code that differentiates your organisation

Slide 29

Slide 29 text

Infrastructure concerns
● SaaS: Software as a Service
● FaaS: Function as a Service
● PaaS: Platform as a Service
● CaaS: Containers as a Service
● IaaS: Infrastructure as a Service
● On-premise: traditional/manual

Slide 30

Slide 30 text

Infrastructure concerns
● SaaS: configuration

Slide 31

Slide 31 text

Infrastructure concerns
● SaaS: configuration
● FaaS: core business logic

Slide 32

Slide 32 text

Infrastructure concerns
● SaaS: configuration
● FaaS: core business logic
● PaaS: scaling, resource management

Slide 33

Slide 33 text

Infrastructure concerns
● SaaS: Software as a Service
● FaaS: Function as a Service
● PaaS: Platform as a Service
● CaaS: Containers as a Service
● IaaS: Infrastructure as a Service
● On-premise: traditional/manual

Slide 34

Slide 34 text

Capabilities and Automation

Slide 35

Slide 35 text

Every software system needs a focus on operability to be successful

Slide 36

Slide 36 text

What is Serverless?

Slide 37

Slide 37 text

Serverless: pay only to execute differentiating code

Slide 38

Slide 38 text

Serverless
Event Stream architecture

Slide 39

Slide 39 text

FaaS: Functions as a Service
AWS Lambda, Azure Functions, Google Cloud Functions

Slide 40

Slide 40 text

PaaS: Platform as a Service

Slide 41

Slide 41 text

What is an IoT platform?

Slide 42

Slide 42 text

IoT Platform: control and respond to billions of sometimes-connected devices

Slide 43

Slide 43 text

IoT Platform: consumer

Slide 44

Slide 44 text

IoT Platform: manufacturing
www.bosch-iot-suite.com

Slide 45

Slide 45 text

IoT Platform: exploratory

Slide 46

Slide 46 text

Operability: a key focus

Slide 47

Slide 47 text

operability: an ability to work well (in Production)

Slide 48

Slide 48 text

Operability
Deploy, Monitor, Diagnose, Debug, Query, Control, Inspect, Clear

Slide 49

Slide 49 text

Operability for Serverless & IoT

Slide 50

Slide 50 text

Operability for Serverless/IoT
“Even if the cloud provider is doing everything, ... is my latency where my customers need it to be? The provider’s going to do the best they can to give me a great service, but if my customers don’t agree, then I have a problem.”
-- Kelsey Hightower, @kelseyhightower
https://read.acloud.guru/you-need-sre-skills-to-thrive-in-a-serverless-world-kelsey-hightower-340a002b3730

Slide 51

Slide 51 text

Operability for Serverless/IoT
● UX: latency, “service”
● Prevent: pro-active, improve
● Security: perimeter explosion
● Audit: data, traceability, archive
● Compliance: PII, GDPR, SOX
● Cost control: pay per execution

Slide 52

Slide 52 text

User Experience
● Latency
● “Service”
“How much does this organisation care about my end-to-end experience?”

Slide 53

Slide 53 text

Preventative approaches
● Billions of metrics
● Automatic correlation
● Automated anomaly detection
● Act as ‘sensing’ for Product
“It just works”
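
As an illustration of the "automated anomaly detection" bullet, here is a minimal sketch that flags a metric sample when it sits far from the recent mean. The z-score rule, the threshold of 3.0 and the function name `is_anomalous` are assumptions made for this example; the talk does not prescribe a detection method, and production systems usually rely on a monitoring product instead of hand-rolled checks.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Return True when `latest` is more than `threshold` standard
    deviations away from the mean of `history` (a simple z-score rule).

    `history` is a list of recent samples of one metric, for example
    request latency in milliseconds. The threshold is an assumed default.
    """
    if len(history) < 2:
        # Not enough data to estimate spread, so do not raise an alert.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past samples were identical: any different value is unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


recent_latency_ms = [100, 102, 98, 101, 99]
print(is_anomalous(recent_latency_ms, 100))  # a typical sample
print(is_anomalous(recent_latency_ms, 500))  # a sudden spike
```

Run per metric, a check of this kind only covers the detection step; correlating alerts across billions of metrics needs dedicated tooling.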

Slide 54

Slide 54 text


Slide 55

Slide 55 text

Security
● Perimeter explosion (FaaS, IoT)
● Auto-renewal for Certs
● Egress detection for data
● Dependency scanning
“I feel safe with these people”

Slide 56

Slide 56 text

Guy Podjarny / InfoQ
https://www.infoq.com/articles/serverless-security

Slide 57

Slide 57 text

Audit & Compliance
● What data do you hold?
● Changes: where, when, who?
● Archive: secure, relevant
“Show me your approach to data removal for consumers”

Slide 58

Slide 58 text

Cost control
● Detect run-away executions
● Retire little-used functions
● Prevent DoS → $$$
● Rapid product decisions
“What is the cost of this feature?”
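
The first two cost-control bullets can be sketched as a simple report over per-function usage data. Everything named here is hypothetical: the `FunctionUsage` record, the flat per-invocation price, the budget and the usage threshold are assumptions for illustration, and real FaaS billing also depends on factors such as execution duration and memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionUsage:
    name: str
    invocations: int            # invocations in the reporting window
    cost_per_invocation: float  # assumed flat price, for illustration only


def window_cost(usage):
    """Estimated spend for one function over the reporting window."""
    return usage.invocations * usage.cost_per_invocation


def runaway_functions(usages, budget):
    """Names of functions whose estimated spend exceeds `budget`."""
    return [u.name for u in usages if window_cost(u) > budget]


def little_used_functions(usages, min_invocations):
    """Names of functions invoked fewer than `min_invocations` times."""
    return [u.name for u in usages if u.invocations < min_invocations]


usages = [
    FunctionUsage("resize-image", 50_000, 0.0001),
    FunctionUsage("retry-loop", 40_000_000, 0.0001),  # suspiciously busy
    FunctionUsage("legacy-export", 3, 0.0001),
]
print(runaway_functions(usages, budget=100.0))
print(little_used_functions(usages, min_invocations=10))
```

The same per-function cost figures also give a rough answer to “What is the cost of this feature?” when functions map cleanly onto product features.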

Slide 59

Slide 59 text

Operability for Serverless & IoT: design for operational transparency

Slide 60

Slide 60 text

Modern logging & tracing 1/
Distinct states (Event ID)
{Delivered, InTransit, Arrived}

Slide 61

Slide 61 text

Modern logging & tracing 2/
Trace the path (Correlation ID)
612999958..

Slide 62

Slide 62 text

Modern logging & tracing 3/
● Dev teams and Ops teams collaborate on logging details:
  ○ Log messages
  ○ EventID
  ○ Correlation
● Invest in logging (time & tools)
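
The two ideas from the logging slides, a distinct Event ID per state and a Correlation ID that follows one request along its path, can be sketched as follows. The state names come from the slides; the numeric values, the JSON field names, the logger name and the helper functions are assumptions made for this example, not a format the talk defines.

```python
import json
import logging
import uuid
from enum import IntEnum


class EventId(IntEnum):
    """One distinct ID per state. The numeric values are illustrative."""
    IN_TRANSIT = 1000
    ARRIVED = 1001
    DELIVERED = 1002


def new_correlation_id():
    """Create an ID once at the edge and pass it to every downstream call."""
    return uuid.uuid4().hex


def format_event(event_id, correlation_id, message):
    """Render one log entry as JSON, searchable by either ID."""
    return json.dumps({
        "event_id": int(event_id),
        "event_name": event_id.name,
        "correlation_id": correlation_id,
        "message": message,
    })


logger = logging.getLogger("parcel-tracking")  # hypothetical service name


def log_event(event_id, correlation_id, message):
    """Emit a structured entry through the standard logging module."""
    logger.info(format_event(event_id, correlation_id, message))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    cid = new_correlation_id()
    # The same correlation ID ties the three states into one traceable path.
    log_event(EventId.IN_TRANSIT, cid, "parcel left depot")
    log_event(EventId.ARRIVED, cid, "parcel reached local hub")
    log_event(EventId.DELIVERED, cid, "parcel handed to customer")
```

Keeping the Event ID list in one shared definition gives Dev and Ops teams a single artefact to agree on when they collaborate on logging details.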

Slide 63

Slide 63 text

Run Book dialogue sheets 1/
runbooktemplate.info

Slide 64

Slide 64 text

Run Book dialogue sheets 2/
● Checklists for typical operational considerations
● Team-friendly exploration around a large table
● See runbooktemplate.info

Slide 65

Slide 65 text

Summary

Slide 66

Slide 66 text

Operability for Serverless/IoT
● Is infrastructure automation the best approach for your org?
● Invest in areas that differentiate your organisation
● Make systems operate well

Slide 67

Slide 67 text

Operability for Serverless & IoT: design for operational transparency

Slide 68

Slide 68 text

Further reading
Team Guide to Software Operability
Matthew Skelton & Rob Thatcher
skeltonthatcher.com/publications
Download a free sample chapter

Slide 69

Slide 69 text

Questions?
matthew@confluxdigital.net
@matthewpskelton