Software Observability

Bryan Cantrill
February 28, 2003

Presentation given on February 28, 2003 at Sun's internal Software Technology Conference at the Agnews Auditorium in Santa Clara. This presentation is one of the earliest that ties the work we were doing on DTrace to the more abstract idea of software observability.

Transcript

  1. Software Observability

    Sun Proprietary/Confidential: For Internal Use Only
    Bryan Cantrill, Solaris Kernel Development
    Software Technology Conference, February 27-28, 2003
    The Auditorium, Santa Clara Campus
  2. Agenda

    Why is software observability important?
    Why is observability important to Sun?
    What is required of any new software observability infrastructure?
    Introduction to one such infrastructure
    Observability demo (time permitting)
  3. Software Layering

    In the limit, software's cost tends to zero
    Zero cost makes software more amenable to layering than any other technology
    Building upon extant components has many advantages; system is:
      More feature-rich
      More reliable
      More scalable
      Delivered exponentially faster
  4. Complexity Arises

    Connections between software layers are fluid; layers are rife with interdependencies
    When systems exhibit non-fatal misbehavior, their sheer complexity often prevents understanding
    Complexity undermines the natural low costs of software
    Complexity inhibits componentization
  5. Mitigating complexity

    When layers are opaque, behavior can only be inferred by changing the system
    Changing the system is:
      Slow
      Labor intensive
      Error-prone
      Often not possible in production
    Must make layers and their interactions observable
  6. Observability

    When the interactions between layers can be observed, complexity can be mitigated
    Observability fights complexity
    When layers and their interactions can be observed, their inevitable inefficiencies can be quickly found and rectified
    Observability begets higher performance
  7. Observability Advantage

    Building a unified observability architecture leverages Sun's unique position as a total systems provider
    Unified observability of Java components/applications, C/C++-based applications and the operating system would be a tremendous competitive advantage
    The trick is to make the system observable in ways that are meaningful to the observer
  8. Meaningful Observability

    To be meaningful, observability must allow for a question-focused approach
    Questions should be allowed to be essentially arbitrary
    The answer itself must be concise; multiple data points should be coalesced into a single data point wherever possible
    The answer must be prompt; the questioner must be able to quickly iterate from an answer to the next question that the answer inevitably provokes
  9. Example of an Ideal Dialogue:

    What is causing all the cross calls?
      The X servers.
    What are the X servers doing to cause cross calls?
      They're mapping and unmapping “/dev/null.”
    Why are they doing that?
      They're creating and destroying Pixmaps.
    Who is asking them to do that?
      Several instances of a stock ticker application.
    How often is each stock ticker making this request?
      100 times per second.
    Why is the application doing that?
      It was written by 10,000 monkeys at 10,000 keyboards.
  10. Availability

    Observability infrastructure must be available in production environments!
    Giving customers instrumented binaries for use in production environments is extraordinarily error-prone
    Forcing customers to reproduce problems in non-production environments is prohibitively difficult and expensive, and can result in solving the wrong problem
  11. Implications

    Observability infrastructure must have zero probe effect when not explicitly enabled
    System must be dynamically instrumented to answer a specific question
    Dynamic instrumentation must extend no further than that required to answer a specific question
  12. Introducing DTrace

    New dynamic tracing infrastructure that will ship in Solaris 10
    Designed to provide concise answers to arbitrary questions
    Designed for rapid adoption of novel instrumentation methodologies
    Designed to have zero probe effect when not enabled; all facilities available in production
  13. DTrace: Providers

    Historically, tracing infrastructures have been tied to a single instrumentation methodology
    DTrace allows for multiple tracing providers, each with its own instrumentation methodology
    Allows new methodologies to leverage existing infrastructure
  14. DTrace: Providers, cont.

    Each provider has its own way of dynamically instrumenting the system
    For example, the “Function Boundary Tracing” provider knows how to dynamically instrument every function entry and return in the kernel
    Currently, there are eight different providers, together providing tens of thousands of different probes
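
The provider model on the two slides above can be sketched in Python. This is only an illustration of the idea (several instrumentation methodologies publishing probes into one shared framework), not DTrace's implementation: the `Framework` class, the `timer` provider, and the probe naming scheme are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Framework:
    """Toy shared infrastructure: owns the probe table and dispatch."""

    def __init__(self) -> None:
        self.probes: Dict[str, List[Handler]] = {}

    def publish(self, provider: str, names: List[str]) -> None:
        # A provider advertises the probes it knows how to instrument.
        for name in names:
            self.probes[f"{provider}:{name}"] = []

    def enable(self, probe: str, handler: Handler) -> None:
        self.probes[probe].append(handler)

    def fire(self, probe: str, args: dict) -> None:
        for handler in self.probes[probe]:
            handler(args)


def function_boundary_provider(fw: Framework, functions: List[str]) -> None:
    # One methodology: an entry and a return probe per function.
    names = [f"{fn}:{point}" for fn in functions for point in ("entry", "return")]
    fw.publish("fbt", names)


def timer_provider(fw: Framework, rates_hz: List[int]) -> None:
    # A different (hypothetical) methodology: time-based probes.
    fw.publish("timer", [f"{hz}hz" for hz in rates_hz])


fw = Framework()
function_boundary_provider(fw, ["open", "close"])
timer_provider(fw, [100])

seen: List[str] = []
fw.enable("fbt:open:entry", lambda args: seen.append(args["path"]))
fw.fire("fbt:open:entry", {"path": "/dev/null"})
fw.fire("fbt:close:return", {})  # published but not enabled: no handler runs
```

Adding a new methodology here means writing one more provider function; enabling and dispatch are reused unchanged, which is the leverage the slides describe.
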
  15. DTrace: Predicates and Actions

    DTrace allows probes to be enabled with an optional predicate
    Corresponding actions are taken if and only if the predicate evaluates to true
    Probes can be enabled by different consumers with different predicates and actions; DTrace handles the multiplexing
    Predicates and actions are formulated in a C-like language
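
A minimal Python sketch of the predicate/action model, assuming a toy `Probe` class that is not part of DTrace. Two hypothetical consumers enable the same probe with different predicates and actions, and the probe multiplexes between them. The process names `Xsun` and `ticker` are invented for illustration.

```python
from typing import Callable, List, Optional, Tuple

Predicate = Callable[[dict], bool]
Action = Callable[[dict], None]


class Probe:
    """Toy probe that multiplexes several consumers' enablings."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.enablings: List[Tuple[str, Optional[Predicate], Action]] = []

    def enable(self, consumer: str, action: Action,
               predicate: Optional[Predicate] = None) -> None:
        self.enablings.append((consumer, predicate, action))

    def fire(self, ctx: dict) -> None:
        # Each enabling is evaluated independently; the action runs
        # if and only if its predicate is absent or evaluates to true.
        for _consumer, predicate, action in self.enablings:
            if predicate is None or predicate(ctx):
                action(ctx)


probe = Probe("read:entry")
x_server_pids: List[int] = []
all_calls: List[str] = []

# Consumer A only cares about the X server; consumer B records everything.
probe.enable("consumer-a", lambda c: x_server_pids.append(c["pid"]),
             predicate=lambda c: c["execname"] == "Xsun")
probe.enable("consumer-b", lambda c: all_calls.append(c["execname"]))

probe.fire({"execname": "Xsun", "pid": 101})
probe.fire({"execname": "ticker", "pid": 202})
```

After the two firings, consumer A has recorded only the `Xsun` process while consumer B has recorded both, even though they share a single probe.
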
  16. DTrace: Introducing “D”

    Complete access to native kernel C types
    Complete access to statics and globals
    Complete support for all ANSI-C operators
    Support for strings as a first-class citizen
    Support for built-in variables (probe arguments, machine registers, etc.), thread-local variables, associative arrays
    Compiler provided as a library API
  17. DTrace: Implementing D

    [Diagram labels: dtrace(1M); libdtrace.so.1 with lexer, parser, codegen, assembler, disassembler, svc routines, and module cache; DIF; drv/dtrace with DIF engine; modules, each with symtab and CTF]
    Clients send D expressions to library for compilation
    Compiler stack produces D Intermediate Format (DIF) objects that can be bound to probe locations
    DTrace driver stores DIF in the kernel and executes program at probe firing time
  18. DTrace: DIF Safety

    All DIF objects are validated by the kernel:
      Opcodes, string references, variables, and registers are checked for validity
      Reserved bits must be zero
      Only forward branches are permitted
    DTrace runtime handles invalid loads, misaligned loads, and division by zero
    DTrace runtime prevents access to I/O space addresses
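
The load-time checks listed above can be illustrated with a toy validator in Python. The instruction format, opcode names, and register count below are invented for the example and are not the real DIF encoding. The sketch covers opcode validity, register validity, and the forward-branch rule only. With only forward branches, each instruction can execute at most once, so a validated program always terminates.

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    op: str
    reg: int = 0      # register operand
    target: int = 0   # branch target, as an instruction index


VALID_OPS = {"load", "add", "cmp", "branch", "ret"}  # hypothetical opcode set
NUM_REGS = 8                                         # hypothetical register count


def validate(program: List[Instr]) -> List[str]:
    """Return a list of validation errors; an empty list means accepted."""
    errors = []
    for pc, ins in enumerate(program):
        if ins.op not in VALID_OPS:
            errors.append(f"{pc}: unknown opcode {ins.op!r}")
        if not 0 <= ins.reg < NUM_REGS:
            errors.append(f"{pc}: bad register {ins.reg}")
        # A branch must land strictly after itself and inside the program.
        if ins.op == "branch" and not pc < ins.target < len(program):
            errors.append(f"{pc}: branch to {ins.target} is not a forward branch")
    return errors


good = [Instr("load", reg=1), Instr("cmp", reg=1),
        Instr("branch", target=3), Instr("ret")]
bad = [Instr("load", reg=9), Instr("branch", target=0), Instr("jump")]

good_errors = validate(good)
bad_errors = validate(bad)
```

The `good` program is accepted, while `bad` is rejected for an out-of-range register, a backward branch, and an unknown opcode.
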
  19. DTrace: Aggregations

    An aggregating function is a function f(x), where x is a set of data, for which there exists an aggregating function f'(x) such that:
      f'(f(x_0) ∪ f(x_1) ∪ ... ∪ f(x_n)) = f(x_0 ∪ x_1 ∪ ... ∪ x_n)
    E.g., count, mean, maximum, and minimum are aggregating functions; median and mode are not
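
The defining property can be checked numerically. The Python sketch below uses made-up data split into three parts and compares combining per-part results against computing over the whole set. For mean it assumes each partial result carries a (sum, count) pair, which is one way to make mean fit the definition; the slide does not say how DTrace represents it.

```python
from statistics import median

parts = [[3, 1, 4], [1, 5], [9, 2, 6]]
whole = [x for part in parts for x in part]

# count: f counts a part, f' sums the partial counts.
count_combined = sum(len(p) for p in parts)
count_direct = len(whole)

# maximum: f and f' are both max.
max_combined = max(max(p) for p in parts)
max_direct = max(whole)

# mean: each part is reduced to (sum, count); f' adds them componentwise.
partials = [(sum(p), len(p)) for p in parts]
total, n = map(sum, zip(*partials))
mean_combined = total / n
mean_direct = sum(whole) / len(whole)

# median: combining per-part medians (here by taking their median)
# does not recover the median of the whole data set.
median_combined = median(median(p) for p in parts)
median_direct = median(whole)
```

For count, maximum, and mean the combined and direct results agree. For median they differ on this data, so the full data set would have to be retained to compute it.
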
  20. DTrace: Aggregations, cont.

    An aggregation is an associative table keyed by an n-tuple where each value is the result of an aggregating function
    n-tuple consists of a list of D expressions
    Aggregating functions are provided by the DTrace framework
    Aggregations allow for data to be coalesced at the source
    Aggregations allow for concise answers to performance-related questions
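
As a rough Python analogue, an aggregation can be modeled as a dictionary keyed by a tuple, updated in place as each event arrives so that individual events never need to be stored. The `Aggregation` class, the event fields, and the process and system call names are assumptions for illustration, not DTrace's data structures.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]


class Aggregation:
    """Toy aggregation: tuple key -> running result of an aggregating function."""

    def __init__(self, combine: Callable[[int, int], int], initial: int) -> None:
        self.combine = combine
        self.initial = initial
        self.table: Dict[Key, int] = {}

    def update(self, key: Key, value: int) -> None:
        # Coalesce at the source: fold the new value into the running result.
        self.table[key] = self.combine(self.table.get(key, self.initial), value)


count = Aggregation(lambda acc, _value: acc + 1, 0)
longest = Aggregation(max, 0)

# Hypothetical events: (execname, syscall, latency in microseconds).
events: List[Tuple[str, str, int]] = [
    ("Xsun", "mmap", 40),
    ("Xsun", "munmap", 25),
    ("ticker", "write", 5),
    ("Xsun", "mmap", 65),
]

for execname, syscall, latency_us in events:
    key = (execname, syscall)
    count.update(key, latency_us)
    longest.update(key, latency_us)
```

Four events collapse into three table entries, and each entry is already the answer to a question such as "how many times did each process make each call?"
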
  21. DTrace and C/C++ Applications

    The “fasttrap” provider allows for DTrace probe creation at any user-level instruction of any process
    fasttrap probes have much less overhead than a traditional breakpoint
    DTrace processing occurs in the kernel
    Allows application instrumentation without recompilation or relinking
    Allows for unified user- and kernel-level tracing
  22. DTrace and Java Applications

    Java is a highly dynamic environment, presenting unique challenges
    DTrace team is currently working with Java engineers to support Java symbol translation as a DTrace action
    Only a first step, but will allow quantum leap in Java observability
    Future steps:
      Ability to add probes to Java bytecode
      Ability to formulate mixed D/Java expressions
  23. Bug 4770261

    Example of an Ideal Dialogue (slide annotation: "Actual"):

    What is causing all the cross calls?
      The X servers.
    What are the X servers doing to cause cross calls?
      They're mapping and unmapping “/dev/null.”
    Why are they doing that?
      They're creating and destroying Pixmaps.
    Who is asking them to do that?
      Several instances of a stock ticker application.
    How often is each stock ticker making this request?
      100 times per second.
    Why is the application doing that?
      It was written by 10,000 monkeys at 10,000 keyboards.
  24. DTrace: More information

    Prototype is available for internal use
    To date, DTrace has helped root-cause 31 bugs: http://dtrace.eng/foundbugs.html
    Much more information available at http://dtrace.eng
    Questions to [email protected]
    E-mail [email protected] to join the interest list
  25. Conclusions and Directions

    Observability is critical to Sun's future
    With DTrace:
      We have made a tremendous leap in kernel observability
      We have made great strides in C/C++ application observability
      We are making initial steps in Java observability
    We must continue up the software stack, adhering to the principles that have made DTrace successful