Keynote: Kishore Gopalakrishna, StarTree - The Rise of Real-Time Analytics | RTA Summit 2023

In recent years, real-time streaming has revolutionized transactional data. Although legacy data warehouse processes have been replaced by data lakes and cloud-native solutions, traditional batch stacks are no longer adequate to meet the needs of today’s fast-paced world, where everyone is a decision maker.

Leading companies like LinkedIn, Uber, Stripe, and others worldwide have realized that analytics must keep up with the transformation that application architectures have undergone. Merely generating reports and dashboards for internal decision-makers is no longer sufficient. In today’s competitive environment, businesses require real-time, actionable insights that can guide their interactions with users on websites, mobile apps, and services. The rise of real-time analytics is imperative to empower businesses to make data-driven decisions in the moment, enabling them to stay ahead in the ever-evolving world where every user is a decision maker. Embracing this shift towards real-time analytics is essential to deliver exceptional user experiences and meet the dynamic demands of the modern business landscape.

StarTree

May 23, 2023

Transcript

  1. User-Facing Analytics Customers and users Everyone (billions) Everyone in the company (millions) Executives (thousands) Real-Time Analytics World is constantly changing
  2. Use Case: User-Facing + Real-Time Analytics Analysts Data Science Workbench Internal Users Operational Workbench Operators External Users Eater Rider Restaurant Deliver Order Pickup
  3. Internal Users Batch Rise of Real-Time Architecture Database ETL Datalake / DWH Hours/Days Real-Time Event Source Events Insights
  4. Internal Users Batch Rise of Real-Time Architecture Database ETL Datalake / DWH Hours/Days Real-Time Event Source Real-Time Processing Events Insights
  5. Internal Users Batch Rise of Real-Time Architecture Database ETL Datalake / DWH Hours/Days Batch Real-Time Event Source Real-Time Processing Events Insights
  6. Internal Users Batch Rise of Real-Time Architecture Database ETL Datalake / DWH Hours/Days Real-Time Event Source Real-Time Processing Batch Real-Time Database Missing piece Milliseconds/secs Events Insights
  7. Internal Users Batch Rise of Real-Time Architecture Database ETL Datalake / DWH Hours/Days Real-Time Event Source Real-Time Processing Batch Real-Time Database Missing piece External Users Milliseconds/secs Events Insights
  8. Apache Pinot Impact 1,000 Nodes 75 Nodes 45X Improvement in Efficiency • 5,000 queries/sec • ~5ms average latency • <100ms 95th percentile After Before Before Pinot After Pinot 1,000 queries/sec 5,000 queries/sec
  9. Dimensions for Real-Time Database Freshness Minutes Seconds Minutes Seconds Milliseconds 1 User 10’s 100’s - Millions Days Latency Concurrency Data Warehouse Real-Time Database
  10. The Power of Indexes. Other databases try and do the same work faster; Pinot works differently. Indexes: Star-Tree, Inverted, Sorted, JSON, GEO. Users can run a lot more queries with the same resources.
  11. Multiple Use Cases, Single System Time Value Raw Data Decreasing value over time for single event Aggregated Data Increasing value over time for aggregated event
  12. Data age Query frequency Local Storage Cloud Storage Ultra-low latency but tightly coupled Slight latency trade-off in decoupled StarTree: Cost, Performance Trade-Off 30 days and older
  13. Public SaaS Customer Network StarTree Network Control Plane Data Plane End Users Apps Systems End Users Private SaaS (Bring Your Own Cloud) Data Plane Apps Systems Control Plane StarTree Flexible Deployment
  14. StarTree Customers. “Pinot enables us to execute sub-second petabyte-scale aggregation queries over fresh financial events in our internal ledger. We choose Pinot because of its rich feature set and scalability, which has enabled better performance than our previous solution – at a lower cost.” (Stripe) “StarTree Cloud made it easy to get started with Pinot and real-time applications. We were able to ingest batch data and use real-time apps to significantly reduce Mean Time to Detect and Mean Time to Respond for key business metrics.” (Just Eat Takeaway)
  15. Make the leap to real-time. We are here to help. Everyone Random / Intuition Based Executives Data-Driven (Internal) Data-Driven (Users/Customers) Operators Engineers Everyone Data-Driven (Executives, CXOs)
  16. And prepare for the future. Everyone Random / Intuition Based Executives Data-Driven (Internal) Data-Driven (Users/Customers) Operators Engineers Everyone Data-Driven (Executives, CXOs) Data-Driven (Machines) Machines
  17. The power of indexes. Indexes: Star-Tree, Inverted, Sorted, JSON, GEO. Users can run a lot more queries with the same resources. Other databases try and do the same work faster; Pinot works differently.
  18. Dimensions for Real-Time Database Freshness Minutes Seconds Minutes Seconds Milliseconds 1 User 10’s 100’s - Millions Days Latency Concurrency Data Warehouse Real-Time Database
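
The index slides (10 and 17) say that indexes let users run more queries on the same resources by doing less work per query, rather than doing the same work faster. The following is a minimal sketch of that idea using a toy inverted index in plain Python. It is not Pinot's implementation or API: the rows, the column names (country, device, clicks), and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy rows; the column names are illustrative, not from the talk.
ROWS = [
    {"country": "US", "device": "ios", "clicks": 3},
    {"country": "DE", "device": "android", "clicks": 5},
    {"country": "US", "device": "android", "clicks": 2},
    {"country": "IN", "device": "ios", "clicks": 7},
    {"country": "US", "device": "ios", "clicks": 4},
]


def build_inverted_index(rows, column):
    """Map each distinct value of `column` to the list of row ids that hold it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[column]].append(row_id)
    return dict(index)


def sum_clicks_scan(rows, country):
    """Baseline: examine every row. Returns (total_clicks, rows_examined)."""
    total = 0
    for row in rows:
        if row["country"] == country:
            total += row["clicks"]
    return total, len(rows)


def sum_clicks_indexed(rows, index, country):
    """Indexed: examine only the rows the index points to.

    Returns (total_clicks, rows_examined).
    """
    row_ids = index.get(country, [])
    return sum(rows[i]["clicks"] for i in row_ids), len(row_ids)


COUNTRY_INDEX = build_inverted_index(ROWS, "country")

if __name__ == "__main__":
    print(sum_clicks_scan(ROWS, "US"))
    print(sum_clicks_indexed(ROWS, COUNTRY_INDEX, "US"))
```

Both functions return the same total, but the indexed version touches only the matching rows. The index costs extra memory and build time up front, which is the usual trade-off for cheaper queries.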