Slide 1

Slide 1 text

Thinking with a "Rolling First" mindset
Considering priorities for the engineering and customer needs of today
October 06, 2020, The Internet

Slide 2

Slide 2 text

Copyright © SUSE 2020
Who am I?
openSUSE contributor since it began
SUSE employee since 2013
Passionate advocate of rolling releases
Linux Distribution Engineer in Future Technology Team focusing on two rolling distributions:
  openSUSE MicroOS – Single Purpose Self Administering OS
  openSUSE Kubic – MicroOS with Kubernetes & Containers

Slide 3

Slide 3 text

Agenda
01 Changing Upstream Expectations
02 Changing Customer Expectations
03 We Already Roll
04 Engineering Assumptions & Pitfalls
05 Engineering “Rolling First”
06 Releasing “Rolling First”

Slide 4

Slide 4 text

Upstreams

Slide 5

Slide 5 text

Preaching to the Choir

Slide 6

Slide 6 text

Upstreams are fast
Upstream software projects move very fast
Glibc – New version every 6 months
Linux Kernel – New version every 3 months
Kubernetes – New version every 3 months
SaltStack – New version every 3-6 months
Uyuni – New version every 1-2 months
Ceph – New version every 1-2 months
Podman/skopeo/buildah – New versions all the time
Cloud Foundry – New versions all the time

Slide 7

Slide 7 text

Short Upstream Support Periods
Upstreams rarely care for long support lifecycles
Kernel Stable Releases – 4 months
Kernel LTS Releases – 6-7 years
Kubernetes – 1 year (as of v1.19, was 9 months for earlier versions)
SaltStack – 1.5 years (feature freeze after 6 months)
Ceph – 2 years

Slide 8

Slide 8 text

API/ABI Compatibility
Upstreams rarely have long-lasting API/ABI compatibility
Linux Standard Base – is anyone LSB 5.0 certified?
Kubernetes – at least 12 months (admin-facing CLI interfaces 6 months)

Slide 9

Slide 9 text

More Upstreams
We’re not just a Linux company any more
SUSE Linux Enterprise (and family)
SUSE Storage
SUSE Manager
SUSE CaaSP
SUSE CAP
and more to come...

Slide 10

Slide 10 text


Slide 11

Slide 11 text

Bigger Upstreams - Kubernetes

Slide 12

Slide 12 text

Bigger Upstreams - Kubernetes

Slide 13

Slide 13 text

Bigger Upstreams - Kernel
Source: phoronix.com, “The Linux Kernel Enters 2020 At 27.8 Million Lines In Git But With Less Developers For 2019”

Slide 14

Slide 14 text

A More Efficient SUSE
SUSE has historically never grown as fast as our upstream codebases
SUSE is engaging with more new upstreams at a faster rate than ever before
Customers are increasingly aware of and interested in upstream developments
We need to rise to these emerging challenges

Slide 15

Slide 15 text

Customers

Slide 16

Slide 16 text

Market Share
[Chart: % of Stack Overflow questions per month, by year from 2009 to 2020, for the tags ubuntu, centos, and redhat]

Slide 17

Slide 17 text

Market Share

Slide 18

Slide 18 text

Market Share

Slide 19

Slide 19 text

Reminder
Red Hat Enterprise Linux – 10 year support (3 years less than SUSE Linux Enterprise)
Ubuntu – 5 year support (8 years less than SUSE Linux Enterprise)
Our more popular competitors support their releases for shorter periods than we do

Slide 20

Slide 20 text

Complex Products
Not only are our products more complex, but the way customers use them is more complex
It is increasingly rare that customers buy a single SUSE product as the solution to their problem
It is increasingly common that SUSE products are purchased as part of a wider solution, including external dependencies wholly outside our control, influence, and often knowledge
Interoperability is key

Slide 21

Slide 21 text

Interoperability in Complex Solutions
If SUSE products are just a cog in a much larger customer machinery, how likely is it that customers will choose all their other dependent technologies to match their choice of SUSE?
Adaptability to customers’ other choices is required for interoperability
We can’t adapt if the codebase is frozen

Slide 22

Slide 22 text

Adoption of New Technologies
There are 5 categories of innovation adopters
Innovators (2.5%) - most willing to take risks, most financially fluid
Early Adopters (13.5%)
Early Majority (34%)
Late Majority (34%)
Laggards (16%) - least willing to take risks, least financially fluid
(Rogers, Diffusion of Innovations 5th Ed.)

Slide 23

Slide 23 text

Adoption of New Technologies

Slide 24

Slide 24 text

Requests with “Very High Customer Interest”
jsc#PM-2107 “Haproxy 1.8 on SLES 12SP5 HA”
jsc#PM-1904 “Switch to Firefox 78 ESR release stream”
jsc#PM-1835 “Upgrade mod_nss to enable TLSv1.3”
jsc#PM-1804 “Integrate WireGuard into Kernel”
jsc#PM-155[5, 6, 7] Tensorflow1, Tensorflow2, Pytorch (packages and containers)
jsc#PM-1443 “Update protobuf to 3.6 or later”
jsc#PM-1386 “git for SLES 12 with SHA 256 support”
jsc#PM-1332 “Clustertools2 in SLESforSAP are outdated. Clients use the version from GitHub”
jsc#PM-1237 “Update SLURM to version 19.05.1-2 or later”
jsc#PM-1090 “Select and provide an updated kernel for 15 SP2”

Slide 25

Slide 25 text

Update SLURM to version 19.05.1-2 or later
“HPC customers often require the latest version of numerical libraries.”

Slide 26

Slide 26 text

We Already Roll

Slide 27

Slide 27 text

We Already Roll
SLE 15 GA - 3517 source packages
SLE 15 SP1 - 1163 source packages
SLE 15 SP2 - 1628 source packages
SLE 15 Updates - 7455 update sources (updating 1296 packages)
SLE 15 SP1 Updates - 2699 update sources (updating 482 packages)
SLE 15 SP2 Updates - 835 update sources (updating 252 packages)
PackageHub (SLE 15) – 8826 source packages, 1099 update sources
PackageHub (SLE 15 SP1) – 8179 source packages, 1145 update sources
PackageHub (SLE 15 SP2) – 10135 source packages, 237 update sources
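As a rough illustration of what these figures imply, the sketch below divides the number of update sources by the number of packages they update. The numbers are copied from the slide; reading the ratio as "average updates per updated package" is an assumption about how the figures were counted, not something the slide states.

```python
# Figures copied from the slide: (update sources, packages updated).
sle15_updates = {
    "SLE 15": (7455, 1296),
    "SLE 15 SP1": (2699, 482),
    "SLE 15 SP2": (835, 252),
}


def updates_per_package(update_sources, packages_updated):
    """Average number of update sources per updated package."""
    return update_sources / packages_updated


ratios = {
    release: updates_per_package(sources, packages)
    for release, (sources, packages) in sle15_updates.items()
}

for release, ratio in ratios.items():
    print(f"{release}: {ratio:.2f} update sources per updated package")
```

Under that reading, every updated package in the SLE 15 family has been updated several times on average, which is the sense in which the codebase is already moving after release.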

Slide 28

Slide 28 text

Version Updates in Service Packs
SLE 15 GA – 2579 RPM Version Changes (compared to 12 SP3)
SLE 15 SP1 – 2095 RPM Version Changes (compared to GA)
SLE 15 SP2 – 2855 RPM Version Changes (compared to SP1)
Source: http://xcdchk.suse.de

Slide 29

Slide 29 text

Version Updates Occur in Update Repos

Slide 30

Slide 30 text

We Don’t Do It Perfectly (yet)
jsc#PM-1876 “SLES parameter changes happen from time to time and having an impact to certain application”
jsc#PM-2220 “Usability issue with installation of packages from package hub”
jsc#PM-2216 Patch-level “Tagging” for (SUSE) MicroOS
jsc#PM-2194 “document lifecycles and support of various components in SLE better”

Slide 31

Slide 31 text

Engineering Assumptions & Pitfalls

Slide 32

Slide 32 text

Stable Base + Rolling Layer
“Customers want new stuff? Easy! Just keep the base stable and release faster stuff on top”

Slide 33

Slide 33 text

This is Theseus’ Paradox

Slide 34

Slide 34 text

Case Study – GregKH’s Tumbleweed (pre 2014)
Tumbleweed was originally started by Greg Kroah-Hartman in 2010
Rolling layer atop regular openSUSE releases
Focus on Kernel, KDE, GNOME and some desktop Apps
Would overwrite/supersede packages from the regular release
“Reset-to-zero” every regular release

Slide 35

Slide 35 text

Case Study – GregKH’s Tumbleweed (pre 2014)
“Partially Rolling” was painful for both users and engineers
Constant breakage over the growing chasm between the ‘stable’ base and rolling top
Ad-hoc tinkering/superseding of the ‘stable’ base stops it being stable
“Reset-to-zero” rebase to a new stable base every 8 months was brutally disruptive for all users

Slide 36

Slide 36 text

Lessons Learned – Modern Tumbleweed
Evolved out of efforts to stabilise openSUSE:Factory
Build all packages together, rebuild the dependency tree as new/updated packages are added (leveraging OBS)
Test all relevant use cases, focusing on the way users use them (leveraging openQA, LTP, and various release bots)
Sustainably engineered and usable for its target audience for 6 years running
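The "rebuild the dependency tree" idea can be sketched as a walk over reverse build dependencies: when a package changes, everything that builds against it, directly or indirectly, is rebuilt too. This is a minimal illustration only. The package graph below is invented, and the scheduling OBS actually performs is considerably more sophisticated.

```python
from collections import defaultdict, deque


def packages_to_rebuild(build_requires, changed):
    """Return the changed packages plus everything that transitively builds against them."""
    # Invert the graph: dependency -> packages that build against it.
    dependents = defaultdict(set)
    for package, dependencies in build_requires.items():
        for dependency in dependencies:
            dependents[dependency].add(package)

    # Breadth-first walk over the reverse dependencies.
    to_rebuild = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in to_rebuild:
                to_rebuild.add(dependent)
                queue.append(dependent)
    return to_rebuild


# Hypothetical build dependency graph: package -> packages it builds against.
build_requires = {
    "glibc": [],
    "openssl": ["glibc"],
    "curl": ["openssl", "glibc"],
    "git": ["curl"],
    "vim": ["glibc"],
}

print(sorted(packages_to_rebuild(build_requires, ["openssl"])))
```

In this toy graph an update to openssl triggers rebuilds of curl and git but not vim, while an update to glibc would rebuild everything.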

Slide 37

Slide 37 text

Case Study – Modules/Package Hub
jsc#PM-2220 “Usability issue with installation of packages from package hub”
Package Hub packages can require packages from ANY module
Customers have no way of knowing WHICH module is required when `zypper install` fails with a “nothing provides XXXX” error
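One way to picture the missing piece is a lookup from the unresolved capability to the module that provides it. The sketch below is purely hypothetical: the capability names are invented, the index would have to be generated from real repository metadata, and the shape of the error message is an assumption rather than zypper's guaranteed wording.

```python
import re

# Hypothetical index: capability name -> module that provides it.
# The capability names are invented for illustration.
CAPABILITY_TO_MODULE = {
    "libexample1": "Development Tools Module",
    "example-daemon": "Server Applications Module",
}

# Assumed shape of the solver message; the real wording may differ.
NOTHING_PROVIDES = re.compile(r"nothing provides\s+'?([^'\s]+)'?")


def suggest_module(error_text, index=CAPABILITY_TO_MODULE):
    """Return (capability, module or None) for the first unresolved capability, else None."""
    match = NOTHING_PROVIDES.search(error_text)
    if match is None:
        return None
    capability = match.group(1)
    return capability, index.get(capability)


message = "Problem: nothing provides libexample1 needed by example-tool-1.0"
print(suggest_module(message))
```

A module of `None` in the result marks the case the slide describes: the solver knows what is missing, but nothing tells the customer where to find it.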

Slide 38

Slide 38 text

Containerisation/Sandboxing
“Customers want new stuff? Easy! Just keep the base stable and release faster stuff in containers”

Slide 39

Slide 39 text

Case Study - AppImage
Portable software format, containing binaries and required libraries in an executable archive
Promises “Linux apps that run anywhere”
Used by various upstreams to distribute their own binaries

Slide 40

Slide 40 text

Case Study - AppImage
Does not run everywhere

Slide 41

Slide 41 text

Case Study - FlatPak
OSTree-based packaging and distribution format
Uses common ‘runtimes’ to aid portability
Promises to work on every Linux distribution

Slide 42

Slide 42 text

Case Study - FlatPak

Slide 43

Slide 43 text

Case Study - FlatPak
Works on every Linux distribution that packages FlatPak
‘Runtimes’ are just containerised distributions, some focused on a specific ecosystem (e.g. GTK, Qt), others with as broad a scope as a full-blown distribution
The base system still needs to have flatpak and all of its dependencies, plus the kernels/core libraries that provide all the functionality required

Slide 44

Slide 44 text

Case Study - Kubernetes
[Diagram: a Master running openSUSE Kubic with kubelet, container runtime, and control plane; three Nodes, each running openSUSE Kubic with kubelet, container runtime, and containers]

Slide 45

Slide 45 text

Case Study - Kubernetes
Control Plane containers must contain binaries equal to or newer than kubelet versions
kubelet & container runtime are traditional binaries on the host OS
Control Plane containers must be updated & cluster reconfigured before kubelet can be updated
kubelet and container runtime must be compatible versions
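The ordering constraint above can be expressed as a small check: a kubelet may only move to a version that the control plane has already reached. The sketch below is an illustration, not part of any Kubernetes or Kubic tooling; comparing at the major.minor level is an assumption made here for simplicity, and the function names are invented.

```python
def parse_minor(version):
    """Turn '1.19.2' or 'v1.19.2' into the tuple (1, 19)."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def kubelet_update_allowed(control_plane_version, target_kubelet_version):
    """Allow a kubelet update only if the control plane is at an equal or newer minor version."""
    return parse_minor(target_kubelet_version) <= parse_minor(control_plane_version)


# The control plane was updated first, so the kubelet may follow.
print(kubelet_update_allowed("1.19.2", "1.19.0"))
# The kubelet would overtake the control plane, so the update is refused.
print(kubelet_update_allowed("1.18.9", "1.19.2"))
```

A package manager that only sees RPM dependencies on the host has no way to evaluate this check, because one side of the comparison lives inside a container.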

Slide 46

Slide 46 text

Lessons Learned – openSUSE Kubic
Control Plane containers built as part of traditional Tumbleweed/Kubic snapshot
New containers, kubelet, and container runtimes released at the same time
kubelet packaged for multi-versioning (/usr/bin/kubelet1.xx)
All supported kubelet versions installed on all nodes
User-space tooling enables the correct kubelet binary for the deployed cluster version
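The selection step can be sketched as a mapping from the deployed cluster version to one of the side-by-side kubelet binaries. This is not the actual Kubic tooling. The function name is invented, and the only detail taken from the slide is the /usr/bin/kubelet1.xx naming scheme.

```python
from pathlib import PurePosixPath


def kubelet_binary_for(cluster_version, installed, bin_dir="/usr/bin"):
    """Pick the versioned kubelet binary that matches the cluster's major.minor version."""
    major, minor = cluster_version.lstrip("v").split(".")[:2]
    name = f"kubelet{major}.{minor}"
    if name not in installed:
        raise LookupError(f"no installed kubelet matches cluster version {cluster_version}")
    return str(PurePosixPath(bin_dir) / name)


# Hypothetical set of kubelet binaries present on a node.
installed_kubelets = {"kubelet1.18", "kubelet1.19"}

print(kubelet_binary_for("1.19.2", installed_kubelets))
```

Because every supported version is already on disk, switching the node to a newer kubelet becomes a matter of changing which binary is enabled, and it can wait until the control plane has been updated.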

Slide 47

Slide 47 text

Lessons Learned – openSUSE Kubic
Distribution-neutral/system-isolated containers are a myth
Building & releasing containers in alignment with traditional RPM packages is essential
Containers can impose ‘unfair’ dependencies on the host OS that traditional packaging cannot model or resolve

Slide 48

Slide 48 text

Engineering “Rolling First”

Slide 49

Slide 49 text

“Rolling First”
“Rolling First” Engineering is the concept of building our products with the following assumptions ingrained in the mindset & processes:
Binary & library versions will change over the lifespan of the product
Functionality will be added and removed over the lifespan of the product
Essential functionality and usability must be ensured throughout

Slide 50

Slide 50 text

“Rolling First”

Slide 51

Slide 51 text

The Rolling Engineering Axiom
“In order to move ANYTHING quickly, you need to be able to move EVERYTHING quickly”

Slide 52

Slide 52 text

Unsafe At Any Speed?
Full speed ahead is not the only speed
Tumbleweed has proven processes for releasing at the pace of upstream/contributions
“Rolling First” within SUSE needs to release at the pace of the market/customers

Slide 53

Slide 53 text

“Rolling First” - Benefits
Ability to respond to customer demands more rapidly
Ability to keep up with and more effectively contribute to our essential upstreams
Ability to distribute work more evenly throughout the calendar year
Reduce corrective work by teams dependent on other products
Does not necessarily need to impact how we release to customers

Slide 54

Slide 54 text

“Rolling First” - Risks
Not every customer wants to move at the same speed as every other
Not every upstream is aligned with their related and dependent codebases
We will never know every use case our customers have for our products, so we can never be sure our products are suitable for everything

Slide 55

Slide 55 text

Testing, Testing, Testing
Essential functionality for customers must be captured, modeled, tested and used as a release gate continuously
No chance that manual testing can keep up
openQA and other CI/CD tools and processes provide a good start, but investment will be needed from PM, QE, and Engineering
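A release gate of this kind can be sketched as a function that blocks a release unless every essential scenario has a passing result. The scenario names and the result format below are invented for illustration. This is not how openQA reports results, only the shape of the decision.

```python
def release_allowed(required_scenarios, results):
    """Return (allowed, blockers); a required scenario blocks if it failed or never ran."""
    blockers = sorted(
        name for name in required_scenarios if results.get(name) is not True
    )
    return (not blockers, blockers)


# Hypothetical set of scenarios that capture essential customer functionality.
required = {"install", "boot", "update"}

# Hypothetical results of one test run: scenario name -> passed?
results = {"install": True, "boot": True, "update": False, "desktop": False}

allowed, blockers = release_allowed(required, results)
print(allowed, blockers)
```

Treating a missing result the same as a failure is a deliberate choice in this sketch: a gate that passes when a test silently did not run gives no assurance at all. Non-essential scenarios such as the invented "desktop" one are reported but do not block.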

Slide 56

Slide 56 text

Scoping, Scoping, Scoping
openSUSE MicroOS and Kubic have shown that narrowing the scope of the OS and leveraging containers can mitigate some of the risks of this approach
Containers can ‘drift’ ahead or behind the base OS for brief periods without impacting compatibility
A more narrowly defined base OS has fewer changes, and fewer risks introduced by those changes

Slide 57

Slide 57 text

[Diagram labels: Core Next?; openSUSE Tumbleweed; Leap 15.2; SLE 15 SP2; Core 15.2; Leap/Jump 15.3; SLE 15 SP3; Core 15.3]

Slide 58

Slide 58 text

[Diagram labels: Leap/Jump Next?; CAP Next?; SES Next?; SUMA Next?; CaaSP Next?; SLE Next?; Core Next?; openSUSE Tumbleweed; Leap 15.2; SLE 15 SP2; Core 15.2; Leap/Jump 15.3; SLE 15 SP3; Core 15.3]

Slide 59

Slide 59 text

[Diagram labels: Leap/Jump Next?; CAP Next?; SES Next?; SUMA Next?; CaaSP Next?; SLE Next?; Core Next?; openSUSE Tumbleweed; Leap 15.2; SLE 15 SP2; Core 15.2; Leap/Jump 15.3; SLE 15 SP3; Core 15.3]

Slide 60

Slide 60 text

Releasing “Rolling First”

Slide 61

Slide 61 text

Changing Customer Expectations

Slide 62

Slide 62 text

Changing Customer Expectations
We have a history best aligned with late majority/laggard adopters
Our newer products & acquisitions and Track 2 initiatives give us an opportunity to target the early adopters and early majority
This can be done either by offering something rolling & faster in parallel to our current offerings, or by adapting our current offerings to be rolling & faster
Customers must be convinced, but the facts are on our side

Slide 63

Slide 63 text

Thank you.