Slide 1

Slide 1 text

Observability for Serverless apps: What should you look at. Danilo Poccia, Principal Evangelist, Serverless, @danilop. © 2019, Amazon Web Services, Inc. or its Affiliates.

Slide 2

Slide 2 text

Monolithic Application Services Microservices

Slide 3

Slide 3 text

“Complexity arises when the dependencies among the elements become important.” – Scott E. Page, John H. Miller, Complex Adaptive Systems

Slide 4

Slide 4 text

How Amazon SQS works Front End Back End Metadata Amazon DynamoDB Load Manager

Slide 5

Slide 5 text

“A complex system that works is invariably found to have evolved from a simple system that worked.” – Gall’s Law

Slide 6

Slide 6 text

“A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.”

Slide 7

Slide 7 text

“Amazon S3 is intentionally built with a minimal feature set. The focus is on simplicity and robustness.” – Amazon S3 Press Release, March 14, 2006

Slide 8

Slide 8 text

Amazon S3: 8 → more than 200 microservices. Mai-Lan Tomsen Bukovec, VP and GM, Amazon S3

Slide 9

Slide 9 text

Monolith

Slide 10

Slide 10 text

Service Service Service Service Service Service Service Service Service Service Service Service

Slide 11

Slide 11 text

Rust Database DB Database Rust Go Node.js Java Node.js Node.js

Slide 12

Slide 12 text

Containers Database DB Database Containers λ Containers VMs Managed Service

Slide 13

Slide 13 text

Don’t build a network of connected “black boxes”. Observability is a developer responsibility.

Slide 14

Slide 14 text

Observability in Control Theory

On the General Theory of Control Systems
R. E. KALMAN

Introduction

In no small measure, the great technological progress in automatic control and communication systems during the past two decades has depended on advances and refinements in the mathematical study of such systems. Conversely, the growth of technology brought forth many new problems (such as those related to using digital computers in control, etc.) to challenge the ingenuity and competence of research workers concerned with theoretical questions.

Despite the appearance and effective resolution of many new problems, our understanding of fundamental aspects of control has remained superficial. The only basic advance so far appears to be the theory of information created by Shannon [1]. The chief significance of his work in our present interpretation is the discovery of general 'laws' underlying the process of information transmission, which are quite independent of the particular models being considered or even the methods used for the description and analysis of these models. These results could be compared with the 'laws' of physics, with the crucial difference that the 'laws' governing man-made objects cannot be discovered by straightforward experimentation but only by a purely abstract analysis guided by intuition gained in observing present-day examples of technology and economic organization. We may thus classify Shannon's result as belonging to the pure theory of communication and control, while everything else can be labelled as the applied theory; this terminology reflects the well-known distinctions between pure and applied physics or mathematics. For reasons pointed out above, in its methodology the pure theory of communication and control closely resembles mathematics, rather than physics; however, it is not a branch of mathematics because at present we cannot (yet?) disregard questions of physical realizability in the study of mathematical models.

This paper initiates study of the pure theory of control imitating the spirit of Shannon's investigations but using entirely different techniques. Our ultimate objective is to answer questions of the following type: What kind and how much information is needed to achieve a desired type of control? What intrinsic properties characterize a given unalterable plant as far as control is concerned?

At present only superficial answers are available to these questions, and even then only in special cases. Initial results presented in this Note are far from the degree of generality of Shannon's work. By contrast, however, only constructive methods are employed here, giving some hope of being able to avoid the well-known difficulty of Shannon's theory: methods of proof which are impractical for actually constructing practical solutions. In fact, this paper arose from the need for a better understanding of some recently discovered computation methods of control-system synthesis [2-5]. Another by-product of the paper is a new computation method for the solution of the classical Wiener filtering problem [7].

The organization of the paper is as follows:

In Section 3 we introduce the models for which a fairly complete theory is available: dynamic systems with a finite dimensional state space and linear transition functions (i.e. systems obeying linear differential or difference equations). The class of random processes considered consists of such dynamic systems excited by an uncorrelated gaussian random process. Other assumptions, such as stationarity, discretization, single input/single output, etc., are made only to facilitate the presentation and will be absent in detailed future accounts of the theory.

In Section 4 we define the concept of controllability and show that this is the 'natural' generalization of the so-called 'dead-beat' control scheme discovered by Oldenbourg and Sartorius [21] and later rederived independently by Tsypkin [22] and the author [17].

We then show in Section 5 that the general problem of optimal regulation is solvable if and only if the plant is completely controllable.

In Section 6 we introduce the concept of observability and solve the problem of reconstructing unmeasurable state variables from the measurable ones in the minimum possible length of time.

We formalize the similarities between controllability and observability in Section 7 by means of the Principle of Duality and show that the Wiener filtering problem is the natural dual of the problem of optimal regulation.

Section 8 is a brief discussion of possible generalizations and currently unsolved problems of the pure theory of control.

Notation and Terminology

The reader is assumed to be familiar with elements of linear algebra, as discussed, for instance, by Halmos [8].

Consider an n-dimensional real vector space X. A basis in X is a set of vectors $a_1, \ldots, a_n$ in X such that any vector x in X can be written uniquely as

$x = \sum_{i=1}^{n} x_i a_i$   (1)

the $x_i$ being real numbers, the components or coordinates of x. Vectors will be denoted throughout by small bold-face letters.

The set X* of all real-valued linear functions x* (= covectors) on X, with the 'natural' definition of addition and scalar multiplication, is an n-dimensional vector space. The value of a covector y* at any vector x is denoted by [y*, x]. We call this the inner product of y* by x. The vector space X* has a natural basis $a^*_1, \ldots, a^*_n$ associated with a given basis in X; it is defined by the requirement that

$[a^*_i, a_j] = \delta_{ij}$   (2)

Using the 'orthogonality relation' (2), we may write (1) in the form

$x = \sum_{j=1}^{n} [a^*_j, x]\, a_j$   (3)

which will be used frequently.

For purposes of numerical computation, a vector may be considered a matrix with one column and a covector a matrix

J.S.I.A.M. Control, Ser. A, Vol. 1, No. ... Printed in U.S.A., 1963

MATHEMATICAL DESCRIPTION OF LINEAR DYNAMICAL SYSTEMS*
R. E. KALMAN

Abstract. There are two different ways of describing dynamical systems: (i) by means of state variables and (ii) by input/output relations. The first method may be regarded as an axiomatization of Newton’s laws of mechanics and is taken to be the basic definition of a system. It is then shown (in the linear case) that the input/output relations determine only one part of a system, that which is completely observable and completely controllable. Using the theory of controllability and observability, methods are given for calculating irreducible realizations of a given impulse-response matrix. In particular, an explicit procedure is given to determine the minimal number of state variables necessary to realize a given transfer-function matrix. Difficulties arising from the use of reducible realizations are discussed briefly.

1. Introduction and summary. Recent developments in optimal control system theory are based on vector differential equations as models of physical systems. In the older literature on control theory, however, the same systems are modeled by transfer functions (i.e., by the Laplace transforms of the differential equations relating the inputs to the outputs). Two different languages have arisen, both of which purport to talk about the same problem. In the new approach, we talk about state variables, transition equations, etc., and make constant use of abstract linear algebra. In the old approach, the key words are frequency response, pole-zero patterns, etc., and the main mathematical tool is complex function theory.

Is there really a difference between the new and the old? Precisely what are the relations between (linear) vector differential equations and transfer-functions?

In the literature, this question is surrounded by confusion [1]. This is bad. Communication between research workers and engineers is impeded. Important results of the "old theory" are not yet fully integrated into the new theory.

In the writer’s view--which will be argued at length in this paper--the difficulty is due to insufficient appreciation of the concept of a dynamical system. Control theory is supposed to deal with physical systems, and not merely with mathematical objects such as a differential equation or a transfer function. We must therefore pay careful attention to the relationship between physical systems and their representation via differential equations, transfer functions, etc.

* Received by the editors July 7, 1962 and in revised form December 9, 1962. Presented at the Symposium on Multivariable System Theory, SIAM, November 1, 1962 at Cambridge, Massachusetts. This research was supported in part under U. S. Air Force Contracts AF 49 (638)-382 and AF 33(616)-6952 as well as NASA Contract NASr-103. Research Institute for Advanced Studies (RIAS), Baltimore 12, Maryland.

1961-62

Slide 15

Slide 15 text

Control Theory. PV: controlled process variable (actual value). SP: reference or set point (desired value). Error: SP - PV.
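
The error term on this slide (set point minus process variable) can be sketched as a tiny feedback loop. This is only an illustration: the proportional gain `kp`, the toy process that moves by `kp * error` at each step, and the function names are assumptions, not something shown in the talk.

```python
def control_error(set_point: float, process_variable: float) -> float:
    """Error as labeled on the slide: SP - PV."""
    return set_point - process_variable


def simulate(set_point: float, initial_pv: float, kp: float = 0.5, steps: int = 20) -> list[float]:
    """Toy proportional loop: each step moves PV by kp * error (hypothetical process)."""
    pv = initial_pv
    history = [pv]
    for _ in range(steps):
        pv += kp * control_error(set_point, pv)
        history.append(pv)
    return history


if __name__ == "__main__":
    trace = simulate(set_point=100.0, initial_pv=20.0)
    # The controller can only correct what it can observe: PV must be measurable.
    print([round(value, 2) for value in trace[:5]])
```

The loop only works because PV is measured at every step, which is the link to observability made on the next slide.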

Slide 16

Slide 16 text

Observability: “In control theory, observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.” https://en.wikipedia.org/wiki/Observability

Slide 17

Slide 17 text

Levels of Observability Machine (HW, OS) Application Network

Slide 18

Slide 18 text

Machine (HW, OS) Application Network The Three Pillars of Observability Distributed Systems Observability by Cindy Sridharan

Slide 19

Slide 19 text

Machine (HW, OS) Application Network The Three Pillars of Observability Logs Metrics Tracing Distributed Systems Observability by Cindy Sridharan

Slide 20

Slide 20 text

Metric Filters & Correlation IDs Logs Tracing Metric Filter Correlation ID Metrics
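
The two ideas named on this slide can be sketched locally: a correlation ID carried in every structured log line, and a metric filter that turns matching log lines into a count. The function names and the JSON field names (`correlation_id`, `service`, `level`) are assumptions for illustration; this is a local stand-in and not a call to any AWS API.

```python
import json
import uuid


def log_line(correlation_id: str, service: str, message: str, **fields) -> str:
    """Build one structured (JSON) log line that carries the correlation ID."""
    record = {"correlation_id": correlation_id, "service": service, "message": message, **fields}
    return json.dumps(record)


def metric_filter(lines: list[str], field: str, value) -> int:
    """Count log lines whose JSON field equals value (local stand-in for a metric filter)."""
    return sum(1 for line in lines if json.loads(line).get(field) == value)


def trace_request(lines: list[str], correlation_id: str) -> list[str]:
    """Return the services one request touched, in log order."""
    records = (json.loads(line) for line in lines)
    return [r["service"] for r in records if r["correlation_id"] == correlation_id]


if __name__ == "__main__":
    cid = str(uuid.uuid4())
    logs = [
        log_line(cid, "api", "order received", level="INFO"),
        log_line(cid, "payments", "card declined", level="ERROR"),
        log_line(str(uuid.uuid4()), "api", "order received", level="INFO"),
    ]
    print("error count:", metric_filter(logs, "level", "ERROR"))
    print("request path:", trace_request(logs, cid))
```

The same ID links log lines to traces, while the filter turns logs into a metric that can be graphed or alarmed on.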

Slide 21

Slide 21 text

Using Observability Logs Tracing Log aggregation & analytics Visualizations Alerting Metric Filter Correlation ID Metrics

Slide 22

Slide 22 text

Using Observability on AWS CloudWatch Logs AWS X-Ray Traces CloudWatch Insights CloudWatch Dashboard CloudWatch Alarms AWS X-Ray ServiceGraph Metric Filter CloudWatch Metrics

Slide 23

Slide 23 text

CloudWatch Anomaly Detection Open Preview
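
The general idea of anomaly detection on a metric can be sketched with a simple expected band around recent values. This is a deliberately naive stand-in (mean plus or minus a multiple of the standard deviation), not the model CloudWatch uses; the function names and the `width` parameter are assumptions.

```python
from statistics import mean, stdev


def anomaly_band(history: list[float], width: float = 2.0) -> tuple[float, float]:
    """Expected range for the next value: mean +/- width * standard deviation."""
    center = mean(history)
    spread = stdev(history)
    return center - width * spread, center + width * spread


def is_anomalous(value: float, history: list[float], width: float = 2.0) -> bool:
    """True when value falls outside the band computed from history."""
    low, high = anomaly_band(history, width)
    return value < low or value > high


if __name__ == "__main__":
    recent_latency_ms = [100, 102, 98, 101, 99]
    for observed in (101, 150):
        print(observed, "anomalous" if is_anomalous(observed, recent_latency_ms) else "normal")
```

A band derived from the metric's own history avoids hand-picking a static alarm threshold.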

Slide 24

Slide 24 text

Dive Deep with Tracing

Slide 25

Slide 25 text

Understand performance… Systems Performance by Brendan Gregg

Slide 26

Slide 26 text

Understand performance… and latency… Systems Performance by Brendan Gregg

Slide 27

Slide 27 text

Understand performance… and latency… and percentiles! P100 P99 P90 P50
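
The percentiles named on this slide (P50, P90, P99, P100) can be computed from raw latency samples. The sketch below uses the nearest-rank definition, which is one of several common percentile definitions; the sample data is invented for illustration.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Invented data: 90 fast requests, 9 slower ones, and 1 very slow outlier (ms).
    latencies_ms = [10] * 90 + [50] * 9 + [2000]
    print("mean:", sum(latencies_ms) / len(latencies_ms))
    for p in (50, 90, 99, 100):
        print(f"P{p}:", percentile(latencies_ms, p))
```

With this data the mean sits between the typical request and the outlier and describes neither, while the percentiles separate the common case (P50, P90) from the tail (P99, P100).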

Slide 28

Slide 28 text

Proactive operations help mitigate issues. Chart labels: Degraded state, Outage, Latency, Time (ms)

Slide 29

Slide 29 text


Slide 30

Slide 30 text

What is Serverless? • No infrastructure to manage • Automatic scaling • Pay for value • Highly available and secure

Slide 31

Slide 31 text

How does Serverless work?
Events: User uploads a picture, Customer data updated, Anomaly detected, API call, . . .
Functions: Your unique business logic
Fully-managed services: Storage, Databases, Analytics, Machine Learning, . . .

Slide 32

Slide 32 text

What is an “event”?
“something that happens”
Events tell us a fact
Immutable time series

Time                 What
2019 06 21 08 07 06  CustomerCreated
2019 06 21 08 07 09  OrderCreated
2019 06 21 08 07 13  PaymentSuccessful
2019 06 21 08 07 17  CustomerUpdated
. . .                . . .
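
The "immutable time series" on this slide can be sketched as an append-only log of frozen records. The class and method names are assumptions, and the timestamps are read as year, month, day, hour, minute, second, which is an interpretation of the slide's layout.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """A fact that happened; frozen so it cannot be modified after creation."""
    time: datetime
    what: str


class EventLog:
    """Append-only, time-ordered list of events (hypothetical helper)."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        if self._events and event.time < self._events[-1].time:
            raise ValueError("events must be appended in time order")
        self._events.append(event)

    def since(self, start: datetime) -> list[Event]:
        """Events at or after start, e.g. to replay them."""
        return [e for e in self._events if e.time >= start]


if __name__ == "__main__":
    log = EventLog()
    log.append(Event(datetime(2019, 6, 21, 8, 7, 6), "CustomerCreated"))
    log.append(Event(datetime(2019, 6, 21, 8, 7, 9), "OrderCreated"))
    log.append(Event(datetime(2019, 6, 21, 8, 7, 13), "PaymentSuccessful"))
    log.append(Event(datetime(2019, 6, 21, 8, 7, 17), "CustomerUpdated"))
    print([e.what for e in log.since(datetime(2019, 6, 21, 8, 7, 10))])
    try:
        log.since(datetime(2019, 6, 21, 8, 7, 6))[0].what = "Edited"
    except FrozenInstanceError:
        print("events cannot be changed after the fact")
```

Because events are never edited or reordered, the log can be stored, analyzed, and replayed later as a record of what the system actually did.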

Slide 33

Slide 33 text

Time is important: “Modelling events forces you to have a temporal focus on what’s going on in the system. Time becomes a crucial factor of the system.” – Greg Young, A Decade of DDD, CQRS, Event Sourcing, 2016

Slide 34

Slide 34 text

How to simplify event management? Photo by Adam Jang on Unsplash

Slide 35

Slide 35 text

Nested apps to simplify solving recurring problems

TweetSource:
  Type: AWS::Serverless::Application
  Properties:
    Location:
      ApplicationId: arn:aws:serverlessrepo:...
      SemanticVersion: 2.0.0
    Parameters:
      TweetProcessorFunctionName: !Ref MyFunction
      SearchText: '#serverless -filter:nativeretweets'

Diagram labels: Standard Component: aws-serverless-twitter-event-source app (Polling schedule (CloudWatch Events rule), trigger, TwitterSearchPoller, SearchCheckpoint, Twitter Search API). Custom Business Logic: TwitterProcessor.

Slide 36

Slide 36 text

AWS Event Fork Pipelines https://github.com/aws-samples/aws-serverless-event-fork-pipelines Amazon SNS topic Event storage & backup pipeline Event search & analytics pipeline Event replay pipeline Your event processing pipeline filtered events events to replay all events Standard Components Custom Business Logic

Slide 37

Slide 37 text

AWS Event Fork Pipelines – Event Storage & Backup Pipeline sns-fork-storage-backup app Amazon S3 backup bucket fan out filtered events Amazon SNS topic Amazon SQS queue AWS Lambda function

Slide 38

Slide 38 text

AWS Event Fork Pipelines – Event Search & Analytics Pipeline sns-fork-search-analytics app Amazon S3 dead-letter bucket fan out filtered events Amazon SNS topic Amazon SQS queue AWS Lambda function Kibana dashboard Store dead-letter events

Slide 39

Slide 39 text

AWS Event Fork Pipelines – Event Replay Pipeline sns-fork-message-replay app fan out filtered events Amazon SNS topic Amazon SQS replay queue AWS Lambda replay function Your regular event processing pipeline Amazon SQS processing queue enqueue events to replay Your operators enable/disable replay reprocess events…

Slide 40

Slide 40 text

AWS Event Fork Pipelines – E-Commerce Example

Slide 41

Slide 41 text

AWS Event Fork Pipelines in the Serverless Application Repository

Slide 42

Slide 42 text

Photo by J W on Unsplash Can we help more?

Slide 43

Slide 43 text

Amazon EventBridge (New)
A serverless event bus service for SaaS and AWS services
• Fully managed, pay-as-you-go
• Native integration with SaaS providers
• 15 target services
• Easily build event-driven architectures

Slide 44

Slide 44 text

Amazon EventBridge
Event source
Event buses: SaaS event bus, Custom event bus, Default event bus
Rules
Targets: AWS Lambda, Amazon Kinesis, AWS Step Functions, Additional targets

Slide 45

Slide 45 text

Amazon EventBridge
Event sources: AWS services, Custom events, SaaS apps
Event buses: SaaS event bus, Custom event bus, Default event bus
Rules
Targets: AWS Lambda, Amazon Kinesis, AWS Step Functions, Additional targets
Event (values not shown on the slide are left empty): "detail-type": "source": "aws.partner/example.com/123", "detail": "ticketId": "department": "creator":

Slide 46

Slide 46 text

Amazon EventBridge
Event sources: AWS services, Custom events, SaaS apps
Event buses: SaaS event bus, Custom event bus, Default event bus
Rules
Targets: AWS Lambda, Amazon Kinesis, AWS Step Functions, Additional targets
Event: "detail-type": "source": "aws.partner/example.com/123" "detail": "ticketId": "department": "creator":
Rule: "source":

Slide 47

Slide 47 text

Amazon EventBridge
Event sources: AWS services, Custom events, SaaS apps
Event buses: SaaS event bus, Custom event bus, Default event bus
Rules
Targets: AWS Lambda, Amazon Kinesis, AWS Step Functions, Additional targets
Event: "detail-type": "source": "aws.partner/example.com/123", "detail": "ticketId": "department": "billing" "creator":
Rule: "detail": "department": ["billing", "fulfillment"]

Slide 48

Slide 48 text

Amazon EventBridge
Event sources: AWS services, Custom events, SaaS apps
Event buses: SaaS event bus, Custom event bus, Default event bus
Rules
Targets: AWS Lambda, Amazon Kinesis, AWS Step Functions, Additional targets
Event: "detail-type": "Ticket Created" "source": "aws.partner/example.com/123", "detail": "ticketId": "department": "billing", "creator":
Rule: "detail-type": ["Ticket Resolved"]
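
How a rule selects events, as shown across the last few slides, can be sketched with a simplified matcher. It covers only the two forms that appear on the slides (a list of allowed values, and a nested object) and is not the service's implementation. The `ticketId` and `creator` values are hypothetical placeholders, since the slides do not show them.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Simplified rule matching: every pattern key must be present in the event.
    A list means "any of these values"; a dict means "match the nested object"."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    event = {
        "detail-type": "Ticket Created",
        "source": "aws.partner/example.com/123",
        "detail": {
            "ticketId": "example-ticket-id",  # hypothetical value
            "department": "billing",
            "creator": "example-user",  # hypothetical value
        },
    }
    rules = {
        "by source": {"source": ["aws.partner/example.com/123"]},
        "by department": {"detail": {"department": ["billing", "fulfillment"]}},
        "resolved only": {"detail-type": ["Ticket Resolved"]},
    }
    for name, pattern in rules.items():
        print(name, "->", "match" if matches(pattern, event) else "no match")
```

The "Ticket Created" event satisfies the source and department rules but not the "Ticket Resolved" rule, so only the targets of the first two rules would receive it.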

Slide 49

Slide 49 text

Amazon EventBridge integration partners

Slide 50

Slide 50 text

Common use cases

Slide 51

Slide 51 text

Common use cases

Slide 52

Slide 52 text

Takeaways
1. Build the instrumentation you need to understand what is happening inside your distributed application
2. Mix technical and business metrics together to get better insights
3. Use correlation IDs in log and tracing frameworks to understand the actual flow of data
4. Leverage anomaly detection to understand when you are not in a normal state
5. Store, analyze, and replay events: they can be the source of truth for understanding the behavior (and not just the structure) of your application

Slide 53

Slide 53 text

AWS Lambda monitoring partners

Slide 54

Slide 54 text

Thank you! @danilop