Slide 1

Slide 1 text

Ask the Product Manager: Top 5 Bugs in September
Mike Barrett, Senior Director of Product Management
September 21, 2020

Slide 2

Slide 2 text

The Next 24 Months

Envoy | Layer7 Inside Apps | mTLS | Cert Lifecycle | Application Patterns
Quarkus | gRPC | Cloud Native Business Rules | Kafka | AI toolkits | Structureless Data
Density | Complex Scheduling | Vertical Scaling | Eviction and Limit Automations | Problem Detection | Groups V2 | KMS to Vaults
Self Compliance | Artifact Freshness | Tenant Level Observability | Storage to Backup Automations
KubeVirt | RHCOS | Katacontainers | Bare Metal | Edge Formfactor | High Performance Networking
Deeper Automations with Vendored Clouds: Service Access | Networking | Routing | Machine Scaling
Amazon Web Services | Microsoft Azure | Google Cloud | IBM Cloud | OpenStack
Serverless Code & Event Based Merger with Integration Services
Multi-Cluster & Multi-Vendor | Placement Policy | Configuration Enforcement | Governance | Compliance | Recovery
API Management
AWS OutPost | Azure Arc | Google Anthos | IBM Cloud | RHT Open Hybrid Cloud | VMware Tanzu
Extending IaaS via Network and Remote Control Points
Pipelines | GitOps | Builds | Workspaces
Autonomous Platform with Connected Intelligence

Slide 3

Slide 3 text

3 Supported Releases for Binary Fixing (Patching)

Kubernetes release | Gap | OpenShift release | Supported until
June 2018, Kubernetes 1.11 | 4 months | Oct 2018, OpenShift 3.11 | June 2022
Sept 2019, Kubernetes 1.16 | 4 months | Jan 2020, OpenShift 4.3 | 4.6
Dec 2019, Kubernetes 1.17 | 4 months | April 2020, OpenShift 4.4 | 4.7
March 2020, Kubernetes 1.18 | 4 months | July 2020, OpenShift 4.5 | 4.8
Aug 2020, Kubernetes 1.19 | 2 months | Oct 2020, OpenShift 4.6 | May 2022

https://kubernetes.io/docs/setup/release/version-skew-policy/
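The OpenShift 4.3 to 4.5 entries above suggest a simple rule: a 4.x minor stays patchable until the minor three versions later ships. The following is a minimal sketch of that arithmetic, assuming the rule applies uniformly (the slide itself lists date-based end points for 3.11 and 4.6). The names `WINDOW`, `end_of_support_trigger`, and `supported_minors` are illustrative and not part of any OpenShift tooling.

```python
from typing import List

# Assumption: three 4.x minors are patched at any one time, per the slide title.
WINDOW = 3


def end_of_support_trigger(minor: int, window: int = WINDOW) -> int:
    """Return the 4.x minor whose release ends patching for 4.<minor>."""
    return minor + window


def supported_minors(latest: int, window: int = WINDOW) -> List[int]:
    """List the 4.x minors still patched while 4.<latest> is the newest release."""
    oldest = max(latest - window + 1, 0)
    return list(range(oldest, latest + 1))


if __name__ == "__main__":
    # 4.3 is patched until 4.6 ships, matching the slide.
    print(end_of_support_trigger(3))
    # While 4.5 is the newest release, 4.3, 4.4 and 4.5 are patched.
    print(supported_minors(5))
```

Under this reading, the release of 4.6 drops 4.3 from the patched set, which is what "Until 4.6" is taken to mean here.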

Slide 4

Slide 4 text

What it takes to create an OpenShift Product Release

Kubernetes Release -> 1-4 months hardening -> OpenShift Release

Security fixes
100s of defect and performance fixes
200+ validated integrations
Middleware & Storage integrations (container images, storage, networking, cloud services, etc.)
Enterprise lifecycle management
Certified Kubernetes

https://bugzilla.redhat.com/
https://issues.redhat.com/

Slide 5

Slide 5 text

Why Trail the Upstream
The Sweet Spot is 1 Release Behind for Production Level Support

Slide 6

Slide 6 text

Sprint Releases

[Timeline diagram. Labels: Release Start | Sprint Start (repeated) | Deploy to dev-prev-prod (repeated) | Stage 1 Dependencies Due | Kube rebase delivered | Kube rebase #2 delivered (if needed) | Feature Complete | No New Features/Bug Burn Down | Code Freeze / Begin Final Regression | Begin RCM Process | OCP GA | Begin Next Release]

Note: Total number of sprints may vary by release

Slide 7

Slide 7 text

OpenShift Release and Example Fix

[Timeline diagram. Labels by month:]
Sept: Kube 1.19.2 | Kube 1.20 branch
Oct: OCP 4.6
Nov: Kube 1.20 | OKD 4.7 Nightlies
Dec: Kube 1.20 | Kube 1.21 branch
Jan: Kube 1.20.z
Feb: OCP 4.7
Mar: Kube 1.21 | Kube 1.22 branch
Apr: Kube 1.21.z
May: OCP 4.8

[Fix flow labels:]
Bug Hits 4.6!
Upstream Fix
Cherry Pick Back
Backport Fix Kube 1.19.z (if allowed)
Downstream Fix OKD 4.7 Nightlies
Backport & Ship fix OCP 4.6.z (z-stream, every 1 week)
OpenShift Dedicated on OCP 4.7

https://github.com/kubernetes/sig-release/blob/master/releases/patch-releases.md
https://github.com/kubernetes/community/blob/master/contributors/devel/sig-release/release.md

Slide 8

Slide 8 text

Bug 1: After installation infra and audit index pattern not available in Kibana

https://bugzilla.redhat.com/show_bug.cgi?id=1866619
https://bugzilla.redhat.com/show_bug.cgi?id=1877414
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#subjectaccessreviewspec-v1-authorization-k8s-io

$ oc adm groups new test quicklab
group.user.openshift.io/test created
$ oc get group test
NAME USERS
test quicklab
$ oc adm policy add-cluster-role-to-group cluster-admin test
clusterrole.rbac.authorization.k8s.io/cluster-admin added: "test"
$ oc whoami
quicklab
$ oc auth can-i get pods --subresource=log
yes
$ oc auth can-i get pods --subresource=log --as=quicklab
no
$ oc auth can-i get pods --subresource=log --as=quicklab --as-group=test
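In the transcript above, the cluster-admin role reaches `quicklab` only through the group `test`, and the access check returns `no` when the user is named without any group. The sketch below builds the corresponding SubjectAccessReview body with and without the optional `groups` field from the linked API reference. The helper names and the `toy_allowed` check are hypothetical: `toy_allowed` is a deliberately tiny stand-in for an authorizer that only looks at group bindings, so it illustrates why the `groups` field matters and is not a model of the cluster's real decision.

```python
import json
from typing import Dict, List, Optional


def build_log_access_review(user: str, groups: Optional[List[str]] = None) -> Dict:
    """Build a SubjectAccessReview asking: may `user` get the pods/log subresource?"""
    spec: Dict = {
        "user": user,
        "resourceAttributes": {
            "verb": "get",
            "resource": "pods",
            "subresource": "log",
        },
    }
    if groups:
        # Group membership is only considered if the request states it.
        spec["groups"] = list(groups)
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SubjectAccessReview",
        "spec": spec,
    }


def toy_allowed(review: Dict, group_bindings: Dict[str, List[str]]) -> bool:
    """Toy check (assumption): allow if any listed group is bound to cluster-admin."""
    groups = review["spec"].get("groups", [])
    return any("cluster-admin" in group_bindings.get(g, []) for g in groups)


if __name__ == "__main__":
    bindings = {"test": ["cluster-admin"]}  # mirrors the group binding in the transcript
    without_groups = build_log_access_review("quicklab")
    with_groups = build_log_access_review("quicklab", ["test"])
    print(json.dumps(with_groups, indent=2))
    print(toy_allowed(without_groups, bindings), toy_allowed(with_groups, bindings))
```

A component that submits such a review on a user's behalf has to supply the user's groups itself; if it omits them, permissions granted through a group are invisible to the check.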

Slide 9

Slide 9 text

Bug 2: KubeAPIErrorsHigh firing on daily base but at random times

https://bugzilla.redhat.com/show_bug.cgi?id=1748434
https://github.com/kubernetes/enhancements/pull/1878
https://bugzilla.redhat.com/show_bug.cgi?id=1877346
https://github.com/kubernetes/kubernetes/issues/91073

This had to do with the kubelet handling an API server reboot or a network outage better.

Slide 10

Slide 10 text

Bug 3: Machine Config Daemon DaemonSet does not set universal Toleration (and therefore gets booted if taints are set on a node)

https://bugzilla.redhat.com/show_bug.cgi?id=1780318

This had to do with remembering to place a toleration on the operand pods of your Kubernetes Operator, so that they keep running on tainted nodes.
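A universal toleration is one with operator `Exists` and no key or effect, which matches every taint. The sketch below is a simplified rendering of the Kubernetes matching rules, written to show why such a toleration keeps a pod on a tainted node while a narrower one does not. The function name `tolerates` and the example taint are hypothetical.

```python
from typing import Dict

# A toleration with operator "Exists" and no key or effect matches every taint.
UNIVERSAL_TOLERATION: Dict[str, str] = {"operator": "Exists"}


def tolerates(toleration: Dict[str, str], taint: Dict[str, str]) -> bool:
    """Simplified version of the Kubernetes toleration/taint matching rules."""
    # An empty effect on the toleration matches all taint effects.
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect", ""):
        return False
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")  # "Equal" is the default operator
    if operator == "Exists":
        # An empty key with "Exists" matches all keys and values.
        return key == "" or key == taint.get("key", "")
    if operator == "Equal":
        return (
            key == taint.get("key", "")
            and toleration.get("value", "") == taint.get("value", "")
        )
    return False


if __name__ == "__main__":
    # Hypothetical taint an administrator might place on a node.
    taint = {"key": "dedicated", "value": "infra", "effect": "NoExecute"}
    narrow = {
        "key": "node-role.kubernetes.io/master",
        "operator": "Exists",
        "effect": "NoSchedule",
    }
    print(tolerates(UNIVERSAL_TOLERATION, taint))
    print(tolerates(narrow, taint))
```

A DaemonSet that must run on every node would carry the universal form in its pod template's `tolerations` list.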

Slide 11

Slide 11 text

Bug 4: Etcd cluster "etcd": 100% of requests for Watch failed on etcd instance :2379. grpc_service="etcdserverpb.Watch"

https://bugzilla.redhat.com/show_bug.cgi?id=1677689
https://github.com/etcd-io/etcd/pull/11375
https://github.com/etcd-io/etcd/pull/12196

The fix: determine more conclusively that a leader has actually been lost before propagating an ErrGRPCNoLeader error.
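One generic way to be "more conclusive" about a lost leader is to require the leaderless state to persist across several consecutive checks before reporting it. The sketch below shows that idea only; it is not taken from the linked etcd pull requests, and the class name, the `threshold` parameter, and its default are assumptions made for illustration.

```python
class LeaderLossDetector:
    """Report leader loss only after `threshold` consecutive leaderless checks."""

    def __init__(self, threshold: int = 3) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.misses = 0  # consecutive checks that saw no leader

    def observe(self, has_leader: bool) -> bool:
        """Record one check; return True only once the loss counts as confirmed."""
        if has_leader:
            # Any sighting of a leader resets the count.
            self.misses = 0
            return False
        self.misses += 1
        return self.misses >= self.threshold


if __name__ == "__main__":
    detector = LeaderLossDetector(threshold=3)
    # A brief blip (two misses, then a leader) is not reported; three misses in a row are.
    checks = [False, False, True, False, False, False]
    print([detector.observe(seen) for seen in checks])
```

The trade-off is latency: a higher threshold suppresses more false alarms during brief elections but delays the error when the leader is really gone.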

Slide 12

Slide 12 text

Bug 5: Unable to provision vSphere volume

https://bugzilla.redhat.com/show_bug.cgi?id=1821280
https://github.com/kubernetes/kubernetes/pull/93971
https://github.com/kubernetes/kubernetes/pull/90836

When the Kubernetes secret used by the vSphere dynamic storage provider is updated, the provider does not pick up the new secret and fails to create the PV.
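The failure mode here is a stale credential cache: the secret changes, but the consumer keeps using what it read earlier. A common remedy is to compare the secret's `resourceVersion` on each use and reload when it differs. The sketch below illustrates that pattern under stated assumptions; it is not the code from the linked pull requests, and `CredentialCache`, the `fetch_secret` callback, and the flattened secret shape are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical flattened secret: {"resourceVersion": ..., "username": ..., "password": ...}
Secret = Dict[str, str]


class CredentialCache:
    """Cache credentials and reload them whenever the secret's version changes."""

    def __init__(self, fetch_secret: Callable[[], Secret]) -> None:
        self._fetch_secret = fetch_secret
        self._version: Optional[str] = None
        self._credentials: Optional[Tuple[str, str]] = None
        self.reloads = 0  # how many times the credentials were (re)loaded

    def credentials(self) -> Tuple[str, str]:
        """Return current credentials, refreshing them if the secret was updated."""
        secret = self._fetch_secret()
        if self._credentials is None or secret["resourceVersion"] != self._version:
            self._version = secret["resourceVersion"]
            self._credentials = (secret["username"], secret["password"])
            self.reloads += 1
        return self._credentials


if __name__ == "__main__":
    store: Secret = {"resourceVersion": "1", "username": "svc", "password": "old"}
    cache = CredentialCache(lambda: store)
    print(cache.credentials())
    # Simulate an administrator rotating the password in the secret.
    store.update({"resourceVersion": "2", "password": "new"})
    print(cache.credentials())
```

In a real controller the secret would more likely arrive through a watch or informer than through a fetch on every call; the version comparison is the part this sketch is meant to show.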

Slide 13

Slide 13 text

Thank you

linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHat