Observability for Modern Applications

In modern, microservices-based applications, it is important that the individual components and the communication between them can be fully monitored, so that problems can be identified and fixed quickly. The talk shows techniques and tools for collecting and using the necessary data to achieve end-to-end observability of the application. It covers monitoring, tracing, logging, and service mesh.

Dennis Kieselhorst

November 20, 2019
Transcript

  1. © 2019, Amazon Web Services, Inc. or its Affiliates. All

    rights reserved. Amazon Confidential and Trademark Dennis Kieselhorst Sr. Solutions Architect Observability for Modern Applications
  2. Innovation Flywheel: Listen, Iterate, Experiment. Experiments power the engine of rapid innovation.
  3. What changes do you need to make to adopt these best practices? Serverless: no provisioning/management, automatic scaling, pay-for-value billing, availability and resiliency. Microservices: componentization, business capabilities, products not projects, infrastructure automation. DevOps: cultural philosophies, cross-disciplinary teams, CI/CD, automation tools. Dimensions shown: architectural patterns, operational model, software delivery.
  4. Approaches to modern application development • Simplify environment management • Reduce the impact of code changes • Automate operations • Accelerate the delivery of new, high-quality services • Gain insight across resources and applications • Protect customers and the business
  5. Approaches to modern application development • Simplify environment management with serverless technologies • Reduce the impact of code changes with microservice architectures • Automate operations by modeling applications & infrastructure as code • Accelerate the delivery of new, high-quality services with CI/CD • Gain insight across resources and applications by enabling observability • Protect customers and the business with end-to-end security & compliance
  6. Approaches to modern application development • Simplify environment management with serverless technologies • Reduce the impact of code changes with microservice architectures • Automate operations by modeling applications & infrastructure as code • Accelerate the delivery of new, high-quality services with CI/CD • Gain insight across resources and applications by enabling observability • Protect customers and the business with end-to-end security & compliance
  7. Microservices increase release agility: Monolithic application, Microservices
  8. Monolith
  9. [Diagram: twelve services]
  10. [Diagram: services written in Rust, Go, Node.js, and Java, with separate databases]
  11. [Diagram: services running on containers, λ functions, VMs, and a managed service, with separate databases]
  12. Proactive operations help mitigate issues [Chart: latency (ms) over time, with degraded state and outage marked]
  13. Observability in Control Theory [Images of the first pages of two papers by R. E. Kalman: "On the General Theory of Control Systems" and "Mathematical Description of Linear Dynamical Systems", J.S.I.A.M. Control, Ser. A, Vol. 1, 1963]
  14. Observability: In control theory, observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. https://en.wikipedia.org/wiki/Observability
  15. Levels of Observability: Network, Machine (HW, OS), Application
  16. The Three Pillars of Observability (Distributed Systems Observability by Cindy Sridharan)
  17. The Three Pillars of Observability: Event Logs, Metrics, Tracing (Distributed Systems Observability by Cindy Sridharan)
  18. Using Observability: Event Logs, Metrics, Tracing; Log aggregation & analytics, Visualizations, Alerting
  19. Observability on AWS: CloudWatch Logs, CloudWatch Metrics, AWS X-Ray Traces; CloudWatch Insights, CloudWatch Dashboard, CloudWatch Alarms, AWS X-Ray ServiceGraph, CloudWatch Metric Filter
  20. https://www.youtube.com/watch?v=rgfww8tLM0A
  21. CloudWatch API PutMetricData (inputs: metric name, dimensions, timestamp, value, namespace): const AWS = require('aws-sdk'); const cloudWatch = new AWS.CloudWatch(); const metricData = await cloudWatch.putMetricData({ MetricData: [ { MetricName: 'My Business Metric', Dimensions: [ { Name: 'Location', Value: 'Paris' } ], Timestamp: new Date(), Value: 123.4 } ], Namespace: METRIC_NAMESPACE }).promise();
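The five inputs named on this slide can be assembled and checked before any call to AWS is made. The following is a minimal sketch only: `buildMetricParams` and the `'MyApp'` namespace are hypothetical names introduced for illustration, and nothing here talks to CloudWatch.

```javascript
// Sketch only: builds the request object for PutMetricData without calling AWS.
// buildMetricParams and the 'MyApp' namespace are hypothetical.
function buildMetricParams(namespace, metricName, value, dimensions = {}, timestamp = new Date()) {
  if (typeof value !== 'number' || Number.isNaN(value)) {
    throw new TypeError('value must be a number');
  }
  return {
    Namespace: namespace,
    MetricData: [
      {
        MetricName: metricName,
        // Dimensions are sent as a list of { Name, Value } pairs.
        Dimensions: Object.entries(dimensions).map(([Name, Value]) => ({ Name, Value: String(Value) })),
        Timestamp: timestamp,
        Value: value,
      },
    ],
  };
}

const params = buildMetricParams('MyApp', 'My Business Metric', 123.4, { Location: 'Paris' });
console.log(JSON.stringify(params, null, 2));
```

With the v2 SDK used on the slide, the resulting object would be passed as `cloudWatch.putMetricData(params).promise()`.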
  22. Add correlation IDs to logs – CloudWatch Logs + Insights
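One way to add correlation IDs is to emit structured JSON log lines that all carry the same ID for a given request. The sketch below assumes a hypothetical `x-correlation-id` request header and hypothetical field names; it writes to stdout only and does not call CloudWatch.

```javascript
const { randomUUID } = require('crypto');

// Reuse the caller's ID if one was passed in, otherwise start a new one.
// The header name 'x-correlation-id' is an assumption for this sketch.
function correlationIdFrom(headers = {}) {
  return headers['x-correlation-id'] || randomUUID();
}

// Returns a log function that stamps every line with the same correlation ID.
function createLogger(correlationId, service) {
  return function log(level, message, fields = {}) {
    const line = JSON.stringify({
      timestamp: new Date().toISOString(),
      level,
      service,
      correlationId,
      message,
      ...fields,
    });
    console.log(line);
    return line;
  };
}

const correlationId = correlationIdFrom({ 'x-correlation-id': 'req-42' });
const log = createLogger(correlationId, 'orders');
log('INFO', 'order received', { orderId: 'A-1' });
log('INFO', 'payment requested', { orderId: 'A-1' });
```

If such lines end up in CloudWatch Logs, a Logs Insights query along the lines of `fields @timestamp, service, message | filter correlationId = "req-42" | sort @timestamp asc` should collect one request's lines across services. Treat the query as a starting point, not a verified recipe.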
  23. CloudWatch Anomaly Detection (NEW) https://aws.amazon.com/blogs/aws/new-amazon-cloudwatch-anomaly-detection/
  24. CloudWatch Cross-Account Cross-Region Dashboards (NEW) https://aws.amazon.com/blogs/aws/cross-account-cross-region-dashboards-with-amazon-cloudwatch/
  25. AWS X-Ray concepts [Diagram: user request and response across Frontend, API, an Amazon DynamoDB table, and an Amazon SQS queue, annotated with trace, segment, and sub-segment]
  26. AWS X-Ray concepts • Trace: end-to-end data related to a single request across services • Segments: portions of the trace that correspond to a single service • Sub-segments: remote call or local compute sections within a service • Annotations: business data that can be used to filter traces • Metadata: business data that can be added to the trace but not used for filtering traces • Errors: normalized error message and stack trace • Sampling: percentage of requests to your application to capture as traces
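The relationship between these concepts can be sketched with plain objects. This is not the X-Ray SDK or its document format: every function and field name below is hypothetical, chosen only to show that annotations are the filterable part, metadata is carried along, and sampling decides which requests are recorded at all.

```javascript
// Sketch of the concepts as plain objects; NOT the X-Ray SDK or wire format.
function createSegment(name) {
  return { name, subsegments: [], annotations: {}, metadata: {} };
}

function addSubsegment(segment, name) {
  const sub = { name, annotations: {}, metadata: {} };
  segment.subsegments.push(sub);
  return sub;
}

// Only annotations are consulted when filtering; metadata is ignored.
function filterTraces(traces, key, value) {
  return traces.filter((trace) =>
    trace.segments.some((segment) => segment.annotations[key] === value));
}

// Sampling: record roughly `rate` (0..1) of requests; `random` is injectable.
function shouldSample(rate, random = Math.random) {
  return random() < rate;
}

const api = createSegment('API');
api.annotations.customerTier = 'gold';       // usable as a filter
api.metadata.cartContents = ['book', 'pen']; // kept for inspection only
addSubsegment(api, 'DynamoDB.PutItem');

const traces = [
  { id: 'trace-1', segments: [createSegment('Frontend'), api] },
  { id: 'trace-2', segments: [createSegment('Frontend')] },
];
console.log(filterTraces(traces, 'customerTier', 'gold').map((t) => t.id));
```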
  27. Demo
  28. End-to-end tracing – AWS X-Ray Service Map
  29. End-to-end tracing – AWS X-Ray Traces
  30. AWS X-Ray Key Concepts: Segments, Subsegments
  31. Enabling X-Ray tracing: AWS Lambda Console, Amazon API Gateway Console
  32. Enabling X-Ray tracing in your code (wrap the AWS SDK instead of requiring it directly): const AWSXRay = require('aws-xray-sdk'); const AWS = AWSXRay.captureAWS(require('aws-sdk'));
  33. Enabling X-Ray tracing in your code: const AWSXRay = require('aws-xray-sdk'); const express = require('express'); const app = express(); app.use(AWSXRay.express.openSegment('my-segment')); app.get('/send', function (req, res) { res.setHeader('Content-Type', 'application/json'); res.send('{"hello": "world"}'); }); app.use(AWSXRay.express.closeSegment());
  34. Understand performance… (Systems Performance by Brendan Gregg)
  35. Understand performance… and latency… (Systems Performance by Brendan Gregg)
  36. Understand performance… and latency… and percentiles! P50, P90, P99, P100
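To make the percentile labels concrete, here is a small sketch that computes them with the nearest-rank method. That method is one common definition and not necessarily the one a given monitoring tool uses, and the latency samples are invented for illustration.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Made-up request latencies in milliseconds, with one slow outlier.
const latenciesMs = [10, 12, 11, 13, 12, 11, 10, 14, 12, 900];
const mean = latenciesMs.reduce((sum, x) => sum + x, 0) / latenciesMs.length;
console.log('mean', mean);
for (const p of [50, 90, 99, 100]) {
  console.log(`P${p}`, percentile(latenciesMs, p));
}
```

For these samples the mean is 100.5 ms, a value no request actually experienced, while P50 is 12 ms, P90 is 14 ms, and P99 and P100 are both 900 ms, which exposes the outlier that the mean blurs.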
  37. Choose the right integration patterns • Message queue (Amazon Simple Queue Service, SQS): decouple and scale distributed systems • Pub/sub messaging (Amazon Simple Notification Service, SNS): decouple producers from subscribers • Workflows (AWS Step Functions): combine multiple tasks and manage distributed state
  38. What is needed: consistent communications management, complete visibility, failure isolation and protection, fine-grained deployment controls
  39. Client side traffic management: traffic shaping, service discovery, retries, timeouts, circuit breakers, health checks; routing controls, protocols support, header based, cookie based, path based, host based
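As an illustration of one item on this slide, the following is a minimal circuit breaker sketch for synchronous calls. The thresholds, the state names, and the injectable clock are illustrative choices and are not taken from any particular library or from a proxy implementation.

```javascript
// Minimal circuit breaker sketch (synchronous calls only).
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetTimeoutMs = 1000, now = () => Date.now() } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
    this.failures = 0;
    this.openedAt = null; // null means the circuit is closed
  }

  get state() {
    if (this.openedAt === null) return 'CLOSED';
    // After the reset timeout, one trial call is let through (half-open).
    return this.now() - this.openedAt >= this.resetTimeoutMs ? 'HALF_OPEN' : 'OPEN';
  }

  call(fn) {
    if (this.state === 'OPEN') {
      throw new Error('circuit open: call rejected without reaching the backend');
    }
    try {
      const result = fn();
      this.failures = 0;    // a success closes the circuit again
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) this.openedAt = this.now();
      throw err;
    }
  }
}

const breaker = new CircuitBreaker({ failureThreshold: 2 });
const failing = () => { throw new Error('backend timeout'); };
for (let i = 0; i < 3; i++) {
  try { breaker.call(failing); } catch (err) { console.log(breaker.state, '-', err.message); }
}
```

The point of the pattern is that once the failure threshold is reached, further calls fail fast instead of waiting on a backend that is already struggling.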
  40. Instrumentation options: Option 1: In-process (SDK) [Diagram: microservice container]
  41. Option 1: In-process SDK: Java, Scala, Node.js, Python, C++, Django, .NET, Go, …
  42. Instrumentation options: Option 1: In-process (SDK); Option 2: Out-of-process (sidecar proxy) [Diagram: a microservice container for each option, with a proxy next to it for Option 2]
  43. Option 2: Side-car proxy: Application Code, Microservice, Proxy; Monitoring, Routing, Discovery, Deployment
  44. Option 2: Side-car proxy. Proxy runs as a container. [Diagram: Task or Pod, external traffic, application code]
  45. Option 2: Proxy [Diagram: proxies]
  46. Why service mesh proxy vs. libraries or app code. Overall: migrate to microservices more safely and quickly • Reduce work required by developers • Follow best practices • Use any language or platform • Simplify visibility, troubleshooting, and deployments
  47. App Mesh configures every proxy https://www.youtube.com/watch?v=GVni3ruLSe0
  48. App Mesh uses Envoy proxy • OSS project • Wide community support, numerous integrations • Stable and production-proven • “Graduated Project” in Cloud Native Computing Foundation • Started at Lyft in 2016
  49. Application observability + others • Faster troubleshooting due to consistent data across services • Existing tools or dashboards with a lot more metrics, logs and traces • Distinguish between service and network issues
  50. Representing your sample app in App Mesh [Diagram: Elastic Load Balancing and microservices; App Mesh mesh [myapp] with Virtual Node A and Virtual Node B, each with service discovery, listener, and backends]
  51. Virtual Node • Virtual Node: logical representation of runtime services • Backends: set of destinations that this node will communicate with (hostnames) • Service Discovery: describes how its callers can locate this node (DNS hostname or AWS Cloud Map* namespace, service, and selectors) • Listeners: policies to handle incoming traffic, e.g. port, health check*, circuit breaker*, retries*
  52. Virtual routes: destination’s virtual router and route. Virtual router: B. HTTP routes: match prefix: /, action: targets B. Route B: virtual node destination + weight. Route name: B1 (match, action). Route name: B2. Other protocol routes.
  53. Update routes. Virtual router: B. HTTP route targets: prefix: /, targets B and B’. Route B: virtual node destination + weight. Route B’: new service or service version.
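The "virtual node destination + weight" idea can be sketched as weighted random selection between the current and the new version. This is an illustration of the technique only, not App Mesh or Envoy code: `pickTarget`, the target names, and the 90/10 split are hypothetical.

```javascript
// Sketch of weighted target selection, as a proxy might apply it per request.
function pickTarget(targets, random = Math.random) {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  if (total <= 0) throw new Error('at least one target needs a positive weight');
  let point = random() * total; // a position somewhere in [0, total)
  for (const target of targets) {
    if (point < target.weight) return target.virtualNode;
    point -= target.weight;
  }
  return targets[targets.length - 1].virtualNode; // guard against rounding
}

// Hypothetical split: 90% of requests to B, 10% to the new version B'.
const route = [
  { virtualNode: 'B', weight: 90 },
  { virtualNode: "B'", weight: 10 },
];
const counts = { B: 0, "B'": 0 };
for (let i = 0; i < 10000; i++) counts[pickTarget(route)] += 1;
console.log(counts);
```

Shifting traffic to the new version then amounts to changing the weights step by step, for example 90/10, then 50/50, then 0/100, without touching the calling services.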
  54. [Diagram: mesh [myapp] in which Virtual Node A (service discovery, backend, listener) calls a virtual router whose route (match: /) targets Virtual Node B (Service B) and Virtual Node B’ (Service B’), each with service discovery, listener, and backends]
  55. [Diagram: mesh with Service A, Service B, Service C, and Service D, Virtual Node B1, and three virtual routers]
  56. Takeaways 1. Build the instrumentation you need to understand what is happening in your (distributed) application 2. Use technical and business metrics together to get better insights 3. Use correlation IDs in log and tracing frameworks to understand distributed architectures (microservices) 4. Think at scale and plan for a service mesh control plane
  57. Thank you! Dennis Kieselhorst, Sr. Solutions Architect [email protected] Feedback form: https://amzn.to/35cfKWx