Ask the Product Manager: Top 5 Bugs in September with Mike Barrett, Senior Director of Product Management

Top 5 problems with Kubernetes and how we are fixing them with Mike Barrett

OpenShift is used by over 1,000 customers. Those customers call Red Hat support when they have questions. I'm going to take you through the 5 issues that come up most often.

Red Hat Livestreaming

September 21, 2020

Transcript

  1. Ask the Product Manager: Top 5 Bugs in September

    Mike Barrett, Senior Director of Product Management
    September 21, 2020
  2. Envoy | Layer7 Inside Apps | mTLS | Cert Lifecycle | Application Patterns

    Quarkus | gRPC | Cloud Native Business Rules | Kafka | AI toolkits
    Structureless Data Density | Complex Scheduling | Vertical Scaling | Eviction and Limit Automations | Problem Detection | Groups V2
    KMS to Vaults Self Compliance | Artifact Freshness | Tenant Level Observability | Storage to Backup Automations
    KubeVirt | RHCOS | Katacontainers | Bare Metal | Edge Formfactor | High Performance Networking
    Deeper Automations with Vendored Clouds
    Service Access | Networking | Routing | Machine Scaling
    Amazon Web Services, Microsoft Azure, Google Cloud, IBM Cloud, OpenStack
    Serverless Code & Event Based Merger with Integration Services
    Multi-Cluster & Multi-Vendor | Placement Policy | Configuration Enforcement | Governance | Compliance | Recovery
    API Management
    The Next 24 Months
    AWS OutPost, Azure Arc, Google Anthos, IBM Cloud, RHT Open Hybrid Cloud, VMware Tanzu
    Extending IaaS via Network and Remote Control Points
    Pipelines | GitOps | Builds | Workspaces
    Autonomous Platform with Connected Intelligence
  3. 3 Supported Releases for Binary Fixing (Patching)

    June, 2018 Kubernetes 1.11 | 4 months | Oct, 2018 OpenShift 3.11 (Until June, 2022)
    Sept, 2019 Kubernetes 1.16 | 4 months | Jan, 2020 OpenShift 4.3 (Until 4.6)
    Dec, 2019 Kubernetes 1.17 | 4 months | April, 2020 OpenShift 4.4 (Until 4.7)
    March, 2020 Kubernetes 1.18 | 4 months | July, 2020 OpenShift 4.5 (Until 4.8)
    Aug, 2020 Kubernetes 1.19 | 2 months | Oct, 2020 OpenShift 4.6 (Until May 2022)
    https://kubernetes.io/docs/setup/release/version-skew-policy/
  4. What it takes to create an OpenShift Product Release

    Kubernetes Release | 1-4 months hardening | OpenShift Release
    Security fixes
    100s of defect and performance fixes
    200+ validated integrations
    Middleware & Storage integrations (container images, storage, networking, cloud services, etc)
    Enterprise lifecycle management
    Certified Kubernetes
    https://bugzilla.redhat.com/
    https://issues.redhat.com/
  5. Why Trail the Upstream

    The Sweet Spot is 1 Release Behind for Production Level Support
  6. Sprint Releases

    [Timeline diagram. Labels shown: Release Start | Sprint Start (repeated for each sprint) | Deploy to dev-prev-prod (repeated for each sprint) | Stage 1 Dependencies Due | Kube rebase delivered | Kube rebase #2 delivered (if needed) | Feature Complete | No New Features/Bug Burn Down | Code Freeze / Begin Final Regression | Begin RCM Process | OCP GA | Begin Next Release]
    Note: Total number of sprints may vary by release
  7. OpenShift Release and Example Fix

    [Timeline diagram. Release labels: Sept Kube 1.19.2, Kube 1.20 branch | Oct OCP 4.6 | Nov Kube 1.20 | Nov OKD 4.7 Nightlies | Dec Kube 1.20, Kube 1.21 branch | Jan Kube 1.20.z | Feb OCP 4.7 | Mar Kube 1.21, Kube 1.22 branch | Apr Kube 1.21.z | May OCP 4.8 | OpenShift Dedicated on OCP 4.7]
    [Fix path labels, numbered 1 to 4 in the diagram: Bug Hits 4.6! | Upstream Fix | Downstream Fix OKD 4.7 Nightlies | Cherry Pick Back | Backport Fix Kube 1.19.z (If allowed) | Backport & Ship fix OCP 4.6.z z-stream, Every 1 week]
    https://github.com/kubernetes/sig-release/blob/master/releases/patch-releases.md
    https://github.com/kubernetes/community/blob/master/contributors/devel/sig-release/release.md
  8. Bug 1: After installation infra and audit index pattern not available in Kibana

    https://bugzilla.redhat.com/show_bug.cgi?id=1866619
    https://bugzilla.redhat.com/show_bug.cgi?id=1877414
    https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#subjectaccessreviewspec-v1-authorization-k8s-io

    $ oc adm groups new test quicklab
    group.user.openshift.io/test created
    $ oc get group test
    NAME USERS
    test quicklab
    $ oc adm policy add-cluster-role-to-group cluster-admin test
    clusterrole.rbac.authorization.k8s.io/cluster-admin added: "test"
    $ oc whoami
    quicklab
    $ oc auth can-i get pods --subresource=log
    yes
    $ oc auth can-i get pods --subresource=log --as=quicklab
    no
    $ oc auth can-i get pods --subresource=log --as=quicklab --as-group=test
  9. Bug 2: KubeAPIErrorsHigh firing on daily base but at random times

    https://bugzilla.redhat.com/show_bug.cgi?id=1748434
    https://github.com/kubernetes/enhancements/pull/1878
    https://bugzilla.redhat.com/show_bug.cgi?id=1877346
    https://github.com/kubernetes/kubernetes/issues/91073
    Had to do with handling an API server reboot or network outage better from a kubelet point of view.
  10. Bug 3: Machine Config Daemon Daemon Set does not set universal Toleration (and therefore gets booted if taints are set on a node)

    https://bugzilla.redhat.com/show_bug.cgi?id=1780318
    Had to do with remembering to place a toleration on your Kubernetes Operator’s operand pods, so they are not evicted from tainted nodes.
  11. Bug 4: Etcd cluster "etcd": 100% of requests for Watch failed on etcd instance <ETCD>:2379. grpc_service="etcdserverpb.Watch"

    https://bugzilla.redhat.com/show_bug.cgi?id=1677689
    https://github.com/etcd-io/etcd/pull/11375
    https://github.com/etcd-io/etcd/pull/12196
    More conclusively determine that a leader has actually been lost before propagating an ErrGRPCNoLeader error.
  12. Bug 5: Unable to provision vSphere volume

    https://bugzilla.redhat.com/show_bug.cgi?id=1821280
    https://github.com/kubernetes/kubernetes/pull/93971
    https://github.com/kubernetes/kubernetes/pull/90836
    When the Kubernetes secret used by the vSphere dynamic storage provider is updated, the provider doesn’t pick up the new secret and fails to create the PV.
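
The release table on slide 3 lists how many months each OpenShift release trailed the Kubernetes release it is based on. The sketch below recomputes that lag from the months shown on the slide. The day of the month is not given there, so day 1 is a placeholder, and the names `RELEASES`, `months_between` and `release_lags` are made up for this illustration.

```python
from datetime import date

# (Kubernetes version, Kubernetes release month, OpenShift version, OpenShift release month)
# Months come from slide 3; the day (1) is a placeholder because the slide gives none.
RELEASES = [
    ("1.11", date(2018, 6, 1), "3.11", date(2018, 10, 1)),
    ("1.16", date(2019, 9, 1), "4.3", date(2020, 1, 1)),
    ("1.17", date(2019, 12, 1), "4.4", date(2020, 4, 1)),
    ("1.18", date(2020, 3, 1), "4.5", date(2020, 7, 1)),
    ("1.19", date(2020, 8, 1), "4.6", date(2020, 10, 1)),
]


def months_between(start, end):
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def release_lags(releases=RELEASES):
    """Map each OpenShift version to its lag, in months, behind its Kubernetes base."""
    return {ocp: months_between(kube_date, ocp_date)
            for _kube, kube_date, ocp, ocp_date in releases}


if __name__ == "__main__":
    for ocp, lag in release_lags().items():
        print(f"OpenShift {ocp}: {lag} months after its Kubernetes base")
```

The computed values match the "4 months" and "2 months" figures printed on the slide.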
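
For Bug 1 on slide 8, the linked API reference describes the SubjectAccessReview spec, which carries the user and the groups as separate fields. The commands on the slide suggest that a check made as the user alone does not see a role granted only to the group. The sketch below builds the request body for both variants of the check so the difference is visible. It only constructs the JSON and sends nothing to a cluster, and `build_subject_access_review` is a hypothetical helper, not part of any client library.

```python
import json


def build_subject_access_review(user, groups, verb, resource,
                                subresource="", namespace=""):
    """Build a SubjectAccessReview body (authorization.k8s.io/v1) as a plain dict."""
    attributes = {"verb": verb, "resource": resource}
    if subresource:
        attributes["subresource"] = subresource
    if namespace:
        attributes["namespace"] = namespace

    spec = {"user": user, "resourceAttributes": attributes}
    if groups:
        # Groups are only part of the review when the caller lists them here.
        spec["groups"] = list(groups)

    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SubjectAccessReview",
        "spec": spec,
    }


if __name__ == "__main__":
    # User only, as in the check on the slide that answered "no".
    user_only = build_subject_access_review("quicklab", [], "get", "pods",
                                            subresource="log")
    # User plus the group that holds the cluster-admin role on the slide.
    with_group = build_subject_access_review("quicklab", ["test"], "get", "pods",
                                             subresource="log")
    print(json.dumps(user_only, indent=2))
    print(json.dumps(with_group, indent=2))
```

A component that issues such reviews on behalf of a user would need to pass the user's groups along explicitly for group-based role bindings to count.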
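
For Bug 3 on slide 10, a "universal" toleration is one with operator `Exists` and no key, which matches every taint. The sketch below is a simplified model of the taint and toleration matching rules from the Kubernetes documentation, written to show why a pod without that toleration is pushed off a tainted node. It is not the scheduler's or kubelet's actual code, and the function names and the example taint are assumptions made for illustration.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint (simplified model)."""
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect", ""):
        return False  # a non-empty effect must match the taint's effect

    key = toleration.get("key", "")
    if key and key != taint.get("key", ""):
        return False  # a non-empty key must match the taint's key

    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        return True  # value is ignored; an empty key matches every taint
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def untolerated_taints(tolerations, taints):
    """List the taints that none of the tolerations match."""
    return [taint for taint in taints
            if not any(tolerates(tol, taint) for tol in tolerations)]


# The universal toleration: no key, no effect, operator Exists.
UNIVERSAL_TOLERATION = {"operator": "Exists"}

if __name__ == "__main__":
    # Hypothetical taint an administrator might place on a node.
    node_taints = [{"key": "dedicated", "value": "infra", "effect": "NoExecute"}]
    print(untolerated_taints([], node_taints))                      # taint is not tolerated
    print(untolerated_taints([UNIVERSAL_TOLERATION], node_taints))  # nothing left untolerated
```

A daemon that must run on every node, such as the Machine Config Daemon named in the bug title, is the kind of operand that would carry the universal toleration.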