
The Observability Pipeline

The pervasiveness of cloud and containers has led to systems that are much more distributed and dynamic in nature. Highly elastic microservice and serverless architectures mean containers spin up on demand and scale to zero when that demand goes away. In this world, servers are very much cattle, not pets. This shift has exposed deficiencies in some of the tools and practices we used in the world of servers-as-pets. Specifically, there are questions around how we monitor and debug these types of systems at scale. And with the rise of DevOps and the product mindset, making data-driven decisions is becoming increasingly important for agile development teams.

In this talk, we discuss a new approach to system monitoring and data collection: the observability pipeline. For organizations that are heavily siloed, this approach can help empower teams when it comes to operating their software. The observability pipeline provides a layer of abstraction that allows you to get operational data such as logs and metrics everywhere it needs to be without impacting developers and the core system. Unlocking this data can also be a huge win for the business with things like auditability, business analytics, and pricing. Lastly, it allows you to change backing data systems easily or test several in parallel. With the amount of data and the number of tools modern systems demand these days, we'll see how the observability pipeline becomes just as essential to the operation of a service as the CI/CD pipeline.

Tyler Treat

April 29, 2019

Transcript

  1. @tyler_treat Toby likes to go on long walks, so sometimes we’ll take him offline for a bit. (usually just nights and weekends)
  2-6. [Diagram: App Servers, Reporting DB, Database Cluster (Nodes 1-5)]
  7-8. [Diagram: App Servers, Reporting DB, Database Cluster (Nodes 1-5), Cache Cluster (Nodes 1-5)]
  9-10. [Diagram: App Servers, Database Cluster (Nodes 1-5), Cache Cluster (Nodes 1-5), Data Pipeline, BI Data Cluster (Nodes 1-5), BI Servers]
  11-12. [Diagram: four Microservices, each with its own Database Cluster and Cache Cluster (five nodes each); Data Pipeline; BI Data Cluster (Nodes 1-5); BI Servers]
  13. [Diagram: two regions, North America and Asia Pacific, each with two BI Servers and four Microservices]
  14. [Diagram: four North America deployments, each with two BI Servers and four Microservices; CDN]
  15. [Diagram: four North America deployments, each with two BI Servers and four Microservices; CDN; Infrastructure: Load Balancers, Orchestrators, DNS, Configuration, ...]
  16-17. [Diagram: four North America deployments, each with two BI Servers and four Microservices; CDN; CI/CD: Repos, Builders, Artifacts, Deployers; Infrastructure: Load Balancers, Orchestrators, DNS, Configuration, ...]
  18-21. [Diagram: four North America deployments, each with two BI Servers and four Microservices; CDN; CI/CD: Repos, Builders, Artifacts, Deployers; Infrastructure: Load Balancers, Orchestrators, DNS, Configuration, ...; all labeled “DevOps”]
  22. Many companies rely on a separate operations team to monitor, triage, and even resolve issues.
  23. This shift in how we build systems has caused an explosion of new tools and terminology.
  24-31. Data Available vs. Understanding (quadrant built up over these slides)
    • Known Knowns (FACTS): things we are aware of and understand. “The system has a 1GB memory limit”
    • Known Unknowns (HYPOTHESES): things we are aware of but don’t understand. “The system exceeded its memory limit and crashed, causing an outage”
    • Unknown Knowns (ASSUMPTIONS): things we understand but are not aware of. “We implemented an orchestrator to ensure the system is always running”
    • Unknown Unknowns (DISCOVERIES): things we are neither aware of nor understand. “Instances churn because the orchestrator restarts the process when it approaches its memory limit, causing sporadic failures and slowdowns”
  32. Data Available vs. Understanding: Monitoring and Observability
    • Known Unknowns (HYPOTHESES, Monitoring): things we are aware of but don’t understand. “The system exceeded its memory limit and crashed, causing an outage”
    • Unknown Unknowns (DISCOVERIES, Observability): things we are neither aware of nor understand. “Instances churn because the orchestrator restarts the process when it approaches its memory limit, causing sporadic failures and slowdowns”
  33. Data Available vs. Understanding: Testing and Exploring
    • Known Unknowns (HYPOTHESES, Testing): things we are aware of but don’t understand. “The system exceeded its memory limit and crashed, causing an outage”
    • Unknown Unknowns (DISCOVERIES, Exploring): things we are neither aware of nor understand. “Instances churn because the orchestrator restarts the process when it approaches its memory limit, causing sporadic failures and slowdowns”
  34. Some challenges…
    Observability Data: application logs, system logs, audit logs, application metrics, distributed traces, events
    - Locked up inside a single vendor’s solution
    - Not readily available across the enterprise (or in some cases, too readily available)
    - Many tools and products needed for different data and use cases
    - Tool and data needs vary from team to team
    - Ever-changing landscape of tools, products, and services
    - Sheer volume of data can be overwhelming
  35-36. [Diagram: a System with a Splunk Universal Forwarder, Datadog Metrics Agent, Datadog APM Agent, Universal Analytics Client, Amazon Glacier S3 Client, …]
  37-39. [Diagram: many Systems, each running a Splunk Universal Forwarder, Datadog Metrics Agent, Datadog APM Agent, Universal Analytics Client, S3 Client, …; other Systems run only a Splunk Universal Forwarder and a Universal Analytics Client]
  40-41. [Diagram: the same fleet of Systems with the Splunk Universal Forwarder replaced by a Sumo Logic Collector; Datadog Metrics Agent, Datadog APM Agent, Universal Analytics Client, S3 Client, … unchanged]
  42-43. [Diagram: the same fleet of Systems, now running a Sumo Logic Collector, New Relic APM Agent, Universal Analytics Client, S3 Client, …; the Datadog agents no longer appear]
  44. [Diagram: the same fleet of Systems running a Sumo Logic Collector, New Relic APM Agent, Universal Analytics Client, S3 Client, …, with a Honeytail Agent added to each]
  45. How big of a lift is it for your organization to change tools?
  46. What data to send? Where to send it? How to send it?
    Data Sources: • VMs • Containers • Load balancers • Service meshes • Audit logs • VPC flow logs • Firewall logs • …
    Data Sinks: • Centralized logging • SIEM • Monitoring • APM • Alerting • Cold storage • BI • …
  47. What data to send? Where to send it? How to send it?
    Data Sources: • VMs • Containers • Load balancers • Service meshes • Audit logs • VPC flow logs • Firewall logs • …
    Observability Pipeline
    Data Sinks: • Centralized logging • SIEM • Monitoring • APM • Alerting • Cold storage • BI • …
  48. {
      "timestamp": "2019-04-05 13:26.42",
      "level": "ERROR",
      "event": "user_login_error",
      "user": "tylertreat",
      "email": "[email protected]",
      "error": "Invalid username or password",
      "message": "User login failed"
    }
  49-50. {
      "timestamp": "2019-04-05 13:26.42",
      "level": "ERROR",
      "event": "user_login_error",
      "context": {
        "id": "accfbb8315c44a52ad893ca6772e1caf",
        "http_method": "POST",
        "http_path": "/login",
        "user": "tylertreat",
        "email": "[email protected]"
      },
      "error": "Invalid username or password",
      "message": "User login failed"
    }
  51. {
      "timestamp": "2019-04-05 13:26.42",
      "level": "INFO",
      "event": "user_login",
      "context": {
        "id": "accfbb8315c44a52ad893ca6772e1caf",
        "http_method": "POST",
        "http_path": "/login",
        "user": "tylertreat",
        "user_id": "3bb12f6c63274abe87fd1ee4ee37f3d2",
        "license": "942e6543f0844be680e72003d5e060fd",
        "email": "[email protected]"
      }
    }
  52. We need libraries that implement the specs and make it easy for devs to instrument their systems.
  53. There are many existing libraries for structured logging.
    • Java: log4j • Go: logrus • Python: structlog • Ruby: ruby-cabin • .NET: serilog • JS: structured-log • etc.
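As a rough illustration of the structured log events shown in the preceding slides, here is a minimal sketch that uses only Python's standard `logging` and `json` modules rather than any of the libraries listed above. The `JsonFormatter` class, the `build_logger` helper, the logger name, and the convention of passing request fields through `extra={"context": ...}` are assumptions made for this example, not something the talk prescribes.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "event": record.getMessage(),
            # Assumed convention: request-scoped fields arrive via extra={"context": {...}}.
            "context": getattr(record, "context", {}),
        }
        return json.dumps(entry)


def build_logger(stream) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` (hypothetical helper)."""
    logger = logging.getLogger("observability_example")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # keep repeated calls from stacking handlers
    logger.propagate = False  # avoid duplicate output through the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info(
    "user_login",
    extra={"context": {"http_method": "POST", "http_path": "/login", "user": "tylertreat"}},
)
line = buffer.getvalue().strip()
print(line)
```

Because every event is a flat, machine-readable object with a shared `context`, downstream components can filter and route on fields such as `level` or `event` without parsing free-form text.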
  54. 3. Data Collector
    We need a lightweight agent that can collect data from hosts/containers.
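To make the collector idea concrete, the sketch below reads log lines from a file-like source outside the application process, tags each one with the host it came from, and hands it to a publish callback. The `collect` function, the `host` field, and the fallback of wrapping non-JSON lines in a `message` field are all assumptions for illustration; a real collector would also handle file rotation, batching, and retries.

```python
import io
import json
from typing import Callable, Iterable


def collect(lines: Iterable[str], publish: Callable[[dict], None], host: str) -> int:
    """Forward each non-empty log line to `publish`; return how many were sent."""
    forwarded = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        try:
            event = json.loads(raw)
            if not isinstance(event, dict):
                event = {"message": raw}  # valid JSON, but not an object
        except json.JSONDecodeError:
            event = {"message": raw}  # unstructured line: wrap it instead of dropping it
        event["host"] = host  # assumed metadata field added by the collector
        publish(event)
        forwarded += 1
    return forwarded


# Stand-ins for a log file and for the pipeline's publish call.
source = io.StringIO('{"level": "INFO", "event": "user_login"}\nplain text line\n\n')
published = []
count = collect(source, published.append, host="app-server-1")
print(count, published)
```

Keeping this logic out of process means the application only has to write structured lines; where they end up is decided elsewhere.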
  55. 4. Data Pipeline
    We need a scalable, fault-tolerant data stream to handle the firehose of observability data generated.
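One property that makes a data stream useful here is that each consumer reads at its own pace from the same append-only sequence of events. The sketch below shows only that property with an in-memory stand-in; the `EventLog` class, its `append` and `poll` methods, and the consumer names are hypothetical, and nothing in it provides the scalability or fault tolerance a production stream would need.

```python
from collections import defaultdict
from typing import Dict, List


class EventLog:
    """In-memory stand-in for an append-only event stream (illustration only)."""

    def __init__(self) -> None:
        self._events: List[dict] = []
        # Each consumer tracks its own read position, starting at 0.
        self._offsets: Dict[str, int] = defaultdict(int)

    def append(self, event: dict) -> int:
        """Add an event to the end of the log and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def poll(self, consumer: str, max_events: int = 10) -> List[dict]:
        """Return the next batch for `consumer` and advance only its offset."""
        start = self._offsets[consumer]
        batch = self._events[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch


stream = EventLog()
for name in ("user_login", "user_login_error", "user_logout"):
    stream.append({"event": name})

first_logging_batch = stream.poll("logging", max_events=2)
metrics_batch = stream.poll("metrics")       # unaffected by the logging consumer
second_logging_batch = stream.poll("logging")
print(len(first_logging_batch), len(metrics_batch), len(second_logging_batch))
```

Independent offsets are what allow a new backend to be added, or two backends to be compared side by side, without touching the producers.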
  56-57. [Diagram: a System with a Splunk Universal Forwarder, Datadog Metrics Agent, Datadog APM Agent, Universal Analytics Client, Amazon Glacier S3 Client, …]
  58. 5. Data Router
    We need a component to consume data from the pipeline, perform filtering, and write it to the appropriate backends.
  59. The router may perform transformations and processing of data, but heavy processing (e.g. alerting or aggregations) should be the responsibility of a backend system.
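The router described above can be sketched as a list of routes, each pairing a filter with a backend writer. Everything named here (`Route`, `route_events`, the three backend names, and the `cache_miss` event) is hypothetical; real backends would be written to through vendor SDKs or client libraries rather than Python lists.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Route:
    name: str                          # label for the backend
    matches: Callable[[dict], bool]    # filter: should this event go to the backend?
    write: Callable[[dict], None]      # stand-in for the backend's client call


def route_events(events: Iterable[dict], routes: List[Route]) -> Dict[str, int]:
    """Send each event to every route whose filter matches; return per-route counts."""
    delivered = {r.name: 0 for r in routes}
    for event in events:
        for r in routes:
            if r.matches(event):
                r.write(event)
                delivered[r.name] += 1
    return delivered


# Lists stand in for a logging system, an alerting system, and an audit store.
logging_sink: List[dict] = []
alerting_sink: List[dict] = []
audit_sink: List[dict] = []

routes = [
    Route("logging", lambda e: True, logging_sink.append),
    Route("alerting", lambda e: e.get("level") == "ERROR", alerting_sink.append),
    Route("audit", lambda e: e.get("event", "").startswith("user_"), audit_sink.append),
]

events = [
    {"level": "INFO", "event": "user_login"},
    {"level": "ERROR", "event": "user_login_error"},
    {"level": "DEBUG", "event": "cache_miss"},
]
counts = route_events(events, routes)
print(counts)
```

Swapping or adding a backend then means editing the route list, not redeploying the systems that emit the data.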
  60. Evolving to an Observability Pipeline
    • Adopt structured logging
    • Move log/data collection out of process
    • Use a centralized logging system
    • Introduce a streaming data solution
    • Start adding data consumers
  61. This maps to VMs and containers as well as it does to “serverless” models.
  62. [Diagram labels: Production Systems; Dev/Ops/SRE; Audit; Business Analytics; Pricing Decisions; Data-Driven Product Decisions; Threat Detection; Monitoring; Debugging & Operational Insights; ...]
  63. Benefits
    • The pattern can be adopted incrementally, with quick wins along the way
    • Maps better to elastic and serverless architectures
    • Empowers teams in siloed organizations and unlocks data for other parts of the business
    • Enables teams to use the tools best suited to their needs
    • Decoupling makes it easier to change tools or evaluate them side by side
    • Minimizes impact on developers and the core system
  64. Downsides
    • Moving away from the agent-based model means we have to handle data routing ourselves
    • Many of the Data Router components might need to be custom-made using various vendor SDKs or client libraries (assuming the vendors have APIs)
    • This also means we might lose some of the value-add features of certain agents
    • Unclear how well this maps to pull-based models (e.g. Prometheus)