Real User Monitoring (RUM)
web application performance
• Discover errors
• Track user behavior (sessions)
Source: Web Vitals, User-centric performance metrics, Grafana Faro OSS
Grafana Faro Web SDK
user monitoring that instruments browser frontend applications to capture observability signals.”
Key features:
• Monitor application performance
• Capture errors, logs, and user activity
• Instrument performance and observe the full stack
Source: Grafana Faro OSS | Web SDK for real user monitoring (RUM)
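As a rough illustration of the features above, a minimal Faro Web SDK setup might look like the following configuration fragment. The collector URL, app name, version, and the event and error payloads are placeholders assumed for this sketch; they are not taken from the talk.

```typescript
import { getWebInstrumentations, initializeFaro } from '@grafana/faro-web-sdk';
import { TracingInstrumentation } from '@grafana/faro-web-tracing';

// Hypothetical collector endpoint and app metadata; replace with your own values.
const faro = initializeFaro({
  url: 'https://faro-collector.example.com/collect',
  app: { name: 'sample-frontend', version: '1.0.0', environment: 'production' },
  instrumentations: [
    // Default web instrumentations (errors, console logs, Web Vitals, sessions).
    ...getWebInstrumentations(),
    // Optional browser tracing, so frontend requests can be linked to backend traces.
    new TracingInstrumentation(),
  ],
});

// Signals can also be pushed manually (illustrative payloads only).
faro.api.pushEvent('checkout_started', { plan: 'free' });
faro.api.pushError(new Error('payment widget failed to load'));
```

This must run in the browser, so a server-rendered app would need to guard the call (for example, by checking that `window` is defined) before initializing.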
to deploy the Alloy service automatically
• Control the traffic load sent from real users
Nice to have:
• Easy application for new tenants
• Slack workflow
• Sample code for SSR and CSR apps
• Next.js-based sample app
cluster?
• Unified traffic control by SRE
• Decouple business logic from telemetry traffic
• Easy deployment of Alloy
• Cut down the security review procedure
Challenges
and tuning for Contour and Alloy
• Three levels of protection
  1. Client-side sampling
  2. Contour rate limit
  3. Grafana Alloy rate limit
• Increasing load from Loki and Tempo
• Continuous tuning of Loki and Tempo
• Individual rate limit for each tenant
Load Test Report:
  Alloy: 1500 RPS (1 core, 1Gi)
  Envoy: 10000 connections (3 cores, 1Gi)
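The layered protection can be sketched in a few lines. The snippet below is an illustrative model only: it assumes the first layer is a per-session sampling decision in the browser and that the rate-limit layers behave like token buckets keyed by tenant. All names (`shouldSampleSession`, `TokenBucket`, `TenantRateLimiter`) are hypothetical and are not Faro, Contour, or Alloy APIs; the real limits would be set in each component's own configuration.

```typescript
// Layer 1 (client side): decide once per session whether to send telemetry.
// `random` is injectable so the decision can be tested deterministically.
function shouldSampleSession(samplingRate: number, random: () => number = Math.random): boolean {
  if (samplingRate <= 0) return false;
  if (samplingRate >= 1) return true;
  return random() < samplingRate;
}

// Layers 2 and 3 (proxy and collector): a token bucket, assumed here as the rate-limit model.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly ratePerSecond: number;
  private readonly burst: number;

  constructor(ratePerSecond: number, burst: number, nowMs: number) {
    this.ratePerSecond = ratePerSecond;
    this.burst = burst;
    this.tokens = burst; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Refill according to elapsed time, then try to take one token.
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedSeconds * this.ratePerSecond);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per tenant, so a noisy tenant cannot use up another tenant's budget.
class TenantRateLimiter {
  private readonly buckets = new Map<string, TokenBucket>();
  private readonly ratePerSecond: number;
  private readonly burst: number;

  constructor(ratePerSecond: number, burst: number) {
    this.ratePerSecond = ratePerSecond;
    this.burst = burst;
  }

  allow(tenantId: string, nowMs: number): boolean {
    let bucket = this.buckets.get(tenantId);
    if (bucket === undefined) {
      bucket = new TokenBucket(this.ratePerSecond, this.burst, nowMs);
      this.buckets.set(tenantId, bucket);
    }
    return bucket.tryConsume(nowMs);
  }
}
```

With a limiter created as `new TenantRateLimiter(2, 2)`, a tenant can send two requests immediately, is rejected on the third, and regains one request after half a second, while other tenants are unaffected.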
Challenges
• Adopt Loki Rulers to ingest Loki query results into Prometheus
• Faster loading for the real user monitoring dashboard
• Constrained trace propagation in the present architecture
• Upgrade or update the trace propagation in the intermediate
• Block trace propagation header from API gateway
• Add an allow list for trace context headers (e.g., TraceParent, Uber-Trace-Id)
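The allow-list idea for trace context headers can be illustrated with a small filter. This is a sketch under the assumption that the gateway forwards only headers named in an allow list; the helper names and the base header list are hypothetical, and a real gateway would express this in its own configuration rather than in application code.

```typescript
// Headers the hypothetical gateway already forwards (assumed for this sketch).
const BASE_ALLOWED_HEADERS: string[] = ["content-type", "authorization"];

// Trace context headers named on the slide, lowercased for matching.
const TRACE_CONTEXT_HEADERS: string[] = ["traceparent", "uber-trace-id"];

// Merge several header lists into one case-insensitive allow list.
function buildAllowList(...groups: string[][]): Set<string> {
  const allowList = new Set<string>();
  for (const group of groups) {
    for (const headerName of group) {
      allowList.add(headerName.toLowerCase());
    }
  }
  return allowList;
}

// Keep only allowed headers. HTTP header names are case-insensitive,
// so names are normalized to lowercase before the lookup.
function filterHeaders(
  incoming: Record<string, string>,
  allowList: ReadonlySet<string>,
): Record<string, string> {
  const forwarded: Record<string, string> = {};
  for (const headerName of Object.keys(incoming)) {
    const normalized = headerName.toLowerCase();
    if (allowList.has(normalized)) {
      forwarded[normalized] = incoming[headerName];
    }
  }
  return forwarded;
}
```

Without `TRACE_CONTEXT_HEADERS` in the allow list, a `TraceParent` header sent by the browser is dropped at this hop and the backend starts a new, unlinked trace; with it, the header passes through and the frontend and backend spans can share one trace.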
the issue of unbalanced requests to the OTel Collector
• Zero-code instrumentation with eBPF (e.g., Grafana Beyla)
• Continuous tuning of Tempo, Loki, and Alloy