Ask an OpenShift Admin | Ep 132 | Multi-Cluster Observability

Slide 1

Slide 1 text

Ask an OpenShift Admin
MultiCluster Observability
Date: July 31st, 2024

Slide 2

Slide 2 text

Red Hat Advanced Cluster Management
What’s new in 2.11 - Highlights
▸ Expanded Kubernetes support matrix
▸ Automated ROSA import - DP
▸ Global Hub Search - TP
▸ Enhanced OLM operator integration - GA
▸ Policy violation debug (diff) in the UI
▸ OpenShift Virtualization enhancements
▸ Right-sizing - Enhanced DP
▸ Fine-grained RBAC for RHACM Observability - TP

Slide 3

Slide 3 text

OpenShift Virtualization enhancements
Observe containerized and virtualized workloads across the hybrid cloud
Red Hat Advanced Cluster Management for Kubernetes 2.11
Use enhanced Search to quickly see all virtual machines in your fleet. Gain deeper insights and visibility into your OpenShift Virtualization inventory with a ready-to-use dashboard.
Generally Available (GA)

Slide 4

Slide 4 text

RHACM Observability v2.11

Slide 5

Slide 5 text

Components/Architecture
Components:
▸ Hub side
  ･ MultiCluster observability operator
  ･ Thanos components
  ･ Grafana
  ･ AlertManager
▸ Managed cluster side
  ･ Endpoint operator
  ･ Metrics collector
  ･ Prometheus and exporters
(Architecture diagram)
Export metrics to 3rd party tools: Kafka, Victoria Metrics

Slide 6

Slide 6 text

Console Navigation
Click the “Grafana” link on the “Clusters” page to navigate to the Grafana console.

Slide 7

Slide 7 text

What’s New in this Release
▸ ACM-8879 - [Dev Preview] T-Shirt Sizing configuration for ACM Observability instances
▸ ACM-9865 - Observability Support for Microshift
▸ ACM-10865 - [Tech-preview] support fine-grain RBAC

Slide 8

Slide 8 text

ACM-8879 - [Dev Preview] T-Shirt Sizing configuration for ACM Observability instances
Control the scale of ACM Observability without AdvancedConfig!
With Instance Sizes, users can now configure a set of resource requests across all their Observability components, sized proportionally, using a single field in their MCO CR, InstanceSize (provided the Hub cluster has enough resources).
Supported sizes: default, minimal, small, medium, large, 2xlarge, 4xlarge
More details in: https://github.com/stolostron/stolostron/tree/main/dev-preview
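As a rough illustration of the single-field configuration described above, an MCO CR using an instance size might look like the sketch below. The exact field spelling and placement (`spec.instanceSize`), the API version, and the object storage secret name and key are assumptions here, not taken from the slide; check the dev-preview documentation linked above before relying on them.

```yaml
# Minimal sketch of a MultiClusterObservability (MCO) CR with an instance size.
# Assumptions: field name/placement `spec.instanceSize`, the v1beta2 API version,
# and the storage secret name/key are illustrative and not stated on the slide.
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  # One of the sizes listed on the slide:
  # default, minimal, small, medium, large, 2xlarge, 4xlarge
  instanceSize: medium
  observabilityAddonSpec: {}
  storageConfig:
    metricObjectStorage:
      name: thanos-object-storage   # assumed secret name
      key: thanos.yaml              # assumed key within the secret
```

The point of the design is that one value replaces per-component resource tuning in AdvancedConfig; the Hub cluster still needs enough capacity for the chosen size.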

Slide 9

Slide 9 text

ACM-9865 - Observability Support for Microshift
Feature Description/Why
MicroShift spokes were previously not supported due to various issues. They are now partially supported:
● All spoke components run as on any *KS spoke, including Prometheus.
● Some metrics, such as the etcd ones, are not scraped yet.
How to Enable/Use
Import the cluster like any other cluster.

Slide 10

Slide 10 text

ACM-10865 - [Tech-preview] support fine-grain RBAC
Feature Description/Why
- By default, RBAC is enforced per managed cluster (all or none).
- Large managed clusters are shared by multiple teams / applications in the organization, each of which wants metrics access restricted to just its own application's metrics.
Fine-grain RBAC provides namespace-level granularity by limiting access to specific namespaces.
How to Enable/Use
1. Define cluster roles for granular metrics access. In each rule:
   - apiGroups: cluster.open-cluster-management.io (always)
   - resources: managedclusters (always)
   - resourceNames: the managed clusters the rule applies to
   - verbs: metrics/ followed by a namespace name
   - metrics/* indicates all namespaces (special case; no other wildcards allowed)
2. Define cluster role bindings to bind users or groups to the cluster roles.
Example ClusterRole and ClusterRoleBinding:
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: ocm-metrics-access
    rules:
      - verbs:
          - metrics/open-cluster-management-agent
          - metrics/open-cluster-management-agent-addon
        apiGroups:
          - cluster.open-cluster-management.io
        resources:
          - managedclusters
        resourceNames:
          - devcluster1
          - devcluster2
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: ocm-metrics-access-binding
    subjects:
      - kind: User
        apiGroup: rbac.authorization.k8s.io
        name: user1
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: ocm-metrics-access
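The slide's example covers the per-namespace case. A sketch of the `metrics/*` special case, bound to a group rather than a single user, could look like the following. The role name, binding name, and group name are hypothetical and chosen only for illustration; the cluster name is reused from the slide's example.

```yaml
# Sketch only: grants access to metrics from ALL namespaces on one managed
# cluster, using the metrics/* special case described on the slide.
# The names ocm-metrics-access-all, ocm-metrics-access-all-binding and
# team-a-admins are hypothetical.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ocm-metrics-access-all
rules:
  - verbs:
      - "metrics/*"            # all namespaces; no other wildcards are allowed
    apiGroups:
      - cluster.open-cluster-management.io
    resources:
      - managedclusters
    resourceNames:
      - devcluster1
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ocm-metrics-access-all-binding
subjects:
  - kind: Group                # bind a group instead of an individual user
    apiGroup: rbac.authorization.k8s.io
    name: team-a-admins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ocm-metrics-access-all
```

Binding to a group keeps the role assignment stable as team membership changes, at the cost of depending on how groups are managed in the cluster's identity provider.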

Slide 11

Slide 11 text

Known Issues
▸ The cluster-monitoring-config config map has some options overwritten by ACM Observability (see https://issues.redhat.com/browse/ACM-11724 or https://github.com/stolostron/rhacm-docs/pull/6518 for more details).
  ･ Planned to be fixed in 2.12.

Slide 12

Slide 12 text

RHACM Observability Architecture

Slide 13

Slide 13 text

What is OpenShift Observability today?
An All-In-One Solution with an Opinionated Design
ACM Multi-Cluster Observability as of today
▸ Metrics and alerting only
▸ All-in-one solution provisioned by ACM
▸ User experience unified in Grafana
▸ One-click fleetwide monitoring stack configuration
▸ Integration with custom-tailored storage/forwarding systems

Slide 14

Slide 14 text

Future ACM Observability
“Observe a fleet like a single cluster”
▸ MCOA packaged as an additional addon with MCO
▸ Standalone community Thanos Operator
▸ Native CRD-driven configuration, distributed and managed via the addon framework

Slide 15

Slide 15 text

Multi-Cluster Observability Configuration
Learning From Our Own OCP/ACM Ecosystem
Key Objectives
● Customer flexibility
  ○ Selecting only the most relevant signals
  ○ Adapting to different infrastructure
● Cost efficiency
  ○ Maximizing single-cluster solutions
● Single-pane control
  ○ Providing a single pane of glass across the fleet

Slide 16

Slide 16 text

Thanos-Operator
Why standalone Thanos Operator?
▸ Kubebuilder architecture with full lifecycle control of Thanos components
▸ Built with a close focus on compatibility with the widely used Prometheus Operator (OCP CMO is based on it)
▸ Community-friendly, expecting more and more contributions from upstream users/adopters of Thanos
▸ No widely adopted upstream operator for Thanos exists yet
▸ In ACM, benefits from opinionated Thanos Operator CRDs and Red Hat guided customisation capabilities