How to achieve full-stack Observability with AWS

Takashi Kaga (SMS Co., Ltd.)
JAWS PANKRATION 2024
© SMS Co., Ltd.

Agenda
1. About Me
2. Observability Learning from CNCF
3. How to Achieve full-stack Observability Using Amazon CloudWatch
4. Summary

About Me
• Takashi Kaga (@TAKA_0411)
• SRE at SMS Co., Ltd.
• AWS Community Builder (Cloud Operations, since 2023)
• Core Member: Media-JAWS

About CNCF
Cloud Native Computing Foundation (CNCF): https://www.cncf.io/
- CNCF is part of the Linux Foundation and was founded in 2015.
- CNCF supports growing cloud-native projects.
- CNCF publishes an Observability Whitepaper.

CNCF and representative projects
Graduated and Incubating Projects: https://www.cncf.io/projects/
- Argo (Continuous Integration & Delivery)
- Fluentd (Observability)
- Istio (Service Mesh)
- Kubernetes (Scheduling & Orchestration)
- Prometheus (Observability)
- OpenTelemetry (Observability)

Observability as defined by CNCF
"It is a function of a system with which humans and machines can observe, understand and act on the state of said system."
Observability Whitepaper: What is Observability? https://github.com/cncf/tag-observability/blob/main/whitepaper.md

What does it mean to observe the state of the system?
Figure: Telemetry correlation for deeper insights (https://ubuntu.com/observability/what-is-observability)
Observe and analyze the various data a system outputs so that you can infer its internal state and act on it.

Observe the system: Monitoring and Observability
Monitoring
- Monitoring sets specific conditions and thresholds and periodically checks the status of the system.
- Monitoring reveals the "when" and "what" of system errors.
Observability
- Observability aims to understand the internal state of the system, prevent problems, and identify their causes.
- Observability reveals the "why" and "how" of system errors.
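
The threshold-based checking described under Monitoring can be sketched roughly as follows. This is only an illustrative sketch: the `ThresholdRule` class, the `evaluate` function, the metric name, and the sample values are all hypothetical and do not come from the deck or from any AWS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ThresholdRule:
    """Hypothetical monitoring rule: alert when a metric exceeds a limit."""
    metric: str
    threshold: float


def evaluate(rule: ThresholdRule, samples: List[float]) -> bool:
    """Return True (alert) when the latest sample is above the threshold."""
    if not samples:
        return False  # no data yet, so there is nothing to compare
    return samples[-1] > rule.threshold


# Made-up values for illustration only.
cpu_rule = ThresholdRule(metric="cpu_utilization_percent", threshold=80.0)
cpu_samples = [42.0, 55.5, 91.2]
alert = evaluate(cpu_rule, cpu_samples)
print(f"{cpu_rule.metric}: alert={alert}")
```

A check like this answers "when" and "what" (the CPU metric crossed 80% at the latest sample). It says nothing about "why", which is where the other telemetry signals come in.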

Observe the system: analyze data output by the system
Figure: Telemetry correlation for deeper insights (https://ubuntu.com/observability/what-is-observability)
Observe and analyze the various data a system outputs so that you can infer its internal state and act on it.

Analyze data: Telemetry Data
- Telemetry data is an important element in understanding system status.
- It is collected to detect and resolve system anomalies, optimize performance, etc.
- CNCF defines Metrics, Logs, and Traces as primary signals.
- In addition, signals such as Profiles and Dumps are also important.

Primary signals as defined by CNCF
Observability Whitepaper: Observability Signals https://github.com/cncf/tag-observability/blob/main/whitepaper.md

Telemetry Data: Metrics
- Quantified data on various activities.
  - Data that is already numeric: CPU utilization, memory utilization
  - Data converted into numeric values: number of requests
Observability Whitepaper: Metrics https://github.com/cncf/tag-observability/blob/main/whitepaper.md
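
As a small illustration of "data converted into numeric values", the sketch below turns individual request events into a request count per minute. The function name `requests_per_minute` and the sample timestamps (seconds since the start of an observation window) are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List


def requests_per_minute(timestamps: List[int]) -> Dict[int, int]:
    """Count requests per 60-second bucket, keyed by the bucket start time."""
    buckets = Counter(ts - ts % 60 for ts in timestamps)
    return dict(sorted(buckets.items()))


# Hypothetical request times, in seconds since the observation started.
request_times = [0, 10, 59, 60, 125]
counts = requests_per_minute(request_times)
print(counts)
```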

Telemetry Data: Logs
- Describe activities and operations that occur in an OS, application, server, etc.
  - System logs
  - Application logs
  - Security logs
  - Audit logs
Observability Whitepaper: Logs https://github.com/cncf/tag-observability/blob/main/whitepaper.md
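
An application log is easier to search and correlate when each entry is structured. The following is a minimal sketch using only Python's standard `logging` module. The `JsonFormatter` class, the field names, and the logger name `app` are choices made for this example, not something the deck prescribes.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(name: str) -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


app_log = build_logger("app")
app_log.info("order accepted: id=%s", "demo-123")  # hypothetical event
```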

Telemetry Data: Traces
- A description of what happened in a distributed transaction, such as a request initiated by an end user.
Observability Whitepaper: Traces https://github.com/cncf/tag-observability/blob/main/whitepaper.md

Example: Traces in Datadog
- Horizontal axis: time
- Vertical axis: call relationship
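
The two axes of a trace view can be reproduced with a toy data model. In this sketch each span records its parent, so nesting gives the vertical call relationship, and its start and end times place it on the horizontal time axis. The `Span` class, `render_waterfall`, and the sample spans are hypothetical and are not tied to Datadog or any tracing library.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str
    start_ms: int
    end_ms: int


def render_waterfall(spans: List[Span], ms_per_char: int = 10) -> List[str]:
    """Render spans as text: indentation = call depth, bar position = time."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            label = "  " * depth + span.name
            offset = " " * (span.start_ms // ms_per_char)
            width = max(1, (span.end_ms - span.start_ms) // ms_per_char)
            lines.append(f"{label:<16}|{offset}{'#' * width}")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# Made-up request: one root span with two child calls.
demo_spans = [
    Span("a", None, "GET /orders", 0, 100),
    Span("b", "a", "auth", 0, 20),
    Span("c", "a", "db.query", 30, 80),
]
waterfall = render_waterfall(demo_spans)
print("\n".join(waterfall))
```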

Telemetry Data: Profiles and Dumps
Profiles
- Sampling of stack traces at runtime
  - CPU profiler, heap profiler, I/O profiler, etc.
- Programming-language-specific profilers
  - pprof (Go), Xdebug (PHP)
Dumps
- Snapshot at a point in time
  - Core dump, etc.
Observability Whitepaper: Profiles, Dumps https://github.com/cncf/tag-observability/blob/main/whitepaper.md

Summary so far
- CNCF's Observability Whitepaper will help you understand the concept and practice of Observability.
- CNCF defines observability as "a function of a system with which humans and machines can observe, understand and act on the state of said system."
- Telemetry data is important for achieving observability: Metrics, Logs, Traces, Profiles, Dumps.

3. How to Achieve full-stack Observability Using Amazon CloudWatch

Definition of full-stack Observability
An observability solution that monitors the entire stack of services, including front-end and back-end as well as end-user experience and security. (*)
(*) Definitions vary among vendors that offer Observability SaaS.

full-stack Observability by Splunk
Cisco and Splunk Bring Full-Stack Observability to the Entire Enterprise https://www.splunk.com/en_us/blog/devops/cisco-and-splunk-bring-full-stack-observability-to-the-entire-enterprise.html

full-stack Observability by Dynatrace
What is full-stack observability? https://www.dynatrace.com/knowledge-base/full-stack-observability/

Observability Services in AWS
Amazon CloudWatch
- Collects key telemetry data from AWS services and other sources
  - CloudWatch Metrics
  - CloudWatch Logs
  - CloudWatch Application Signals
- Offers numerous other functions besides data collection

Amazon CloudWatch Feature List
Amazon CloudWatch Overview and Basics (Amazon CloudWatchの概要と基本): AWS Black Belt Online Seminar https://pages.awscloud.com/rs/112-TZM-766/images/AWS-Black-Belt_2023_AmazonCloudWatch_0330_v1.pdf
Application Signals (traces) does not appear in this list because the document dates from March 2023.

Topic: Simple Web Services
- front-end (user experience)
- back-end (infrastructure)
- Build / Deploy Pipelines
- Security
This talk focuses on the front-end and back-end.

1. front-end monitoring
- front-end (user experience)
- back-end (infrastructure)
- Build / Deploy Pipelines
- Security

1. front-end monitoring
- CloudWatch RUM's JavaScript runs in users' browsers and collects metrics, logs, etc.
- Metrics and real-time logs output by CloudFront are sent to CloudWatch.
- Headless browsers access endpoints (URLs) to obtain metrics, traces, screenshots, etc.
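
The idea behind the endpoint checks above (access a URL on a schedule, then record status and latency) can be sketched in plain Python. This is a simplified stand-in rather than a headless browser or a CloudWatch API call. `CheckResult`, `run_check`, `http_status`, the latency limit, and the example URL are all hypothetical.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    url: str
    status: int        # HTTP status code, or 0 if the request failed
    latency_ms: float
    ok: bool


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch a URL with the standard library and return its status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses still carry a status code


def run_check(
    url: str,
    fetch: Callable[[str], int] = http_status,
    max_latency_ms: float = 1000.0,
) -> CheckResult:
    """Run one synthetic check and judge it by status and latency."""
    started = time.perf_counter()
    try:
        status = fetch(url)
    except OSError:
        status = 0  # DNS failure, refused connection, timeout, etc.
    latency_ms = (time.perf_counter() - started) * 1000
    ok = 200 <= status < 400 and latency_ms <= max_latency_ms
    return CheckResult(url, status, latency_ms, ok)


# A fake fetcher keeps this demo offline; drop `fetch=` to hit a real URL.
result = run_check("https://example.com/health", fetch=lambda url: 200)
print(result.ok, result.status)
```

Passing the fetcher as a parameter keeps the check logic testable without network access.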

2. back-end monitoring
- front-end (user experience)
- back-end (infrastructure)
- Build / Deploy Pipelines
- Security

2. back-end monitoring
- Metrics output by ALB and Fargate are sent to CloudWatch.
- Trace data is sent to CloudWatch (automatic instrumentation with OpenTelemetry).
- Metrics output by data store services are sent to CloudWatch.
- Task logs and DB logs are sent to CloudWatch Logs.
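
For application-level metrics that should travel alongside task logs, one option is the CloudWatch Embedded Metric Format (EMF), a JSON log layout that carries metric definitions. The deck does not mention EMF, so treat this as an assumption about how a workload might be wired up. The `emf_record` helper, the namespace, the metric name, and the dimension are hypothetical. Whether metrics are actually extracted depends on how the logs reach CloudWatch Logs, so check the AWS documentation for your setup.

```python
import json
import time
from typing import Dict, Optional


def emf_record(
    namespace: str,
    metric: str,
    value: float,
    unit: str,
    dimensions: Dict[str, str],
    timestamp_ms: Optional[int] = None,
) -> str:
    """Build one log line in CloudWatch Embedded Metric Format."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    record = {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set that lists every dimension key.
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": metric, "Unit": unit}],
                }
            ],
        },
        # Metric and dimension values live at the top level of the record.
        metric: value,
        **dimensions,
    }
    return json.dumps(record)


# Hypothetical namespace, metric, and dimension for illustration.
line = emf_record(
    namespace="DemoApp",
    metric="RequestLatency",
    value=123.0,
    unit="Milliseconds",
    dimensions={"Service": "orders"},
    timestamp_ms=1700000000000,
)
print(line)  # in a container, stdout is typically shipped by the log driver
```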

About full-stack Observability with AWS
- First, read CNCF's Observability Whitepaper to understand the concept of Observability and telemetry data.
- If the workload is on AWS, use Amazon CloudWatch to collect and analyze telemetry data.
- Full-stack observability with Amazon CloudWatch is feasible, although collecting Profiles and Dumps remains a challenge.

Amazon CloudWatch or Observability SaaS

Introduction
- Amazon CloudWatch: Available as soon as the workload is on AWS.
- Observability SaaS: There is a slight time lag before it can be used, including account sign-up and initial setup.

Features
- Amazon CloudWatch: Features are being added, but not at the same pace as SaaS.
- Observability SaaS: Many features are available, covering front-end, mobile applications, and security visualization.

Integration with other services
- Amazon CloudWatch: If you need to integrate with SaaS or other services, you will need to set it up and build it yourself.
- Observability SaaS: It is possible to monitor a large amount of data by linking with various services.

Cost
- Amazon CloudWatch: It is an inexpensive way to collect key telemetry data, and it is excellent for cost analysis.
- Observability SaaS: Traces and Logs are useful but can spike costs; CloudWatch integration can increase AWS costs.

Support
- Amazon CloudWatch: You can expect extensive follow-up by AWS support.
- Observability SaaS: Support quality may vary depending on the case, e.g., English-only support.
