Slide 1

Slide 1 text

The Rise of Real-Time Analytics
Kishore Gopalakrishna

Slide 2

Slide 2 text

Java Database Real-Time

Slide 3

Slide 3 text

Decisions, Decisions, Decisions…
35K per day
1 billion in a lifetime

Slide 5

Slide 5 text

But the real question is: HOW are the decisions being made?

Slide 6

Slide 6 text

Evolution of Decision-Making
Intuition Based: Everyone
Data-Driven (Executives, CXOs): Executives
Data-Driven (Internal): Operators, Engineers

Slide 7

Slide 7 text

What about our users and customers? and data freshness?

Slide 8

Slide 8 text

User-Facing Analytics: executives (thousands), everyone in the company (millions), customers and users, everyone (billions)
Real-Time Analytics: the world is constantly changing

Slide 9

Slide 9 text

Use Case: User-Facing + Real-Time Analytics
Internal users: Analysts (Data Science Workbench), Operators (Operational Workbench)
External users: Eater, Rider, Restaurant
[Diagram labels: order, pickup, deliver]

Slide 10

Slide 10 text

Real-Time Analytics: Maximize Value
[Chart: value vs. time (milliseconds, seconds, minutes, days, months), contrasting real-time and batch; labels: accuracy, agility]

Slide 11

Slide 11 text

Defining Real-Time?
Maximize the value based on the use case.
[Chart: value vs. time, with the impact point marked]

Slide 12

Slide 12 text

User-Facing + Real-Time Analytics
[Quadrant chart: real-time vs. batch, internal vs. external]

Slide 13

Slide 13 text

Real-Time + User-Facing Batch + Internal

Slide 14

Slide 14 text

Real-Time != Micro-Batching
[Illustration labels: walk, bike, car, flight, rocket]

Slide 15

Slide 15 text

Existing Batch Architecture
[Diagram labels: events, database, ETL, data lake / DWH, batch, internal users, insights; latency: hours/days]

Slide 16

Slide 16 text

Rise of Real-Time Architecture
[Diagram labels: events, database, ETL, data lake / DWH, batch, internal users, insights; latency: hours/days; added: real-time event source]

Slide 17

Slide 17 text

Rise of Real-Time Architecture
[Diagram labels: events, database, ETL, data lake / DWH, batch, internal users, insights; latency: hours/days; real-time event source; added: real-time processing]

Slide 18

Slide 18 text

Rise of Real-Time Architecture
[Diagram labels: events, database, ETL, data lake / DWH, batch, internal users, insights; latency: hours/days; real-time event source; real-time processing; added: a second batch label]

Slide 19

Slide 19 text

Rise of Real-Time Architecture
[Diagram labels: events, database, ETL, data lake / DWH, batch, internal users, insights; latency: hours/days; real-time event source; real-time processing; added: real-time database (the missing piece), latency: milliseconds/seconds]

Slide 20

Slide 20 text

Rise of Real-Time Architecture
[Diagram labels: events, database, ETL, data lake / DWH, batch, internal users, insights; latency: hours/days; real-time event source; real-time processing; real-time database (the missing piece), latency: milliseconds/seconds; added: external users]

Slide 21

Slide 21 text

Which Database?
[Logos: IBM Db2, mSQL, Rocket M204]

Slide 22

Slide 22 text

Data size: 1 PB+
Queries/sec: 200K+
Query latency: < 100 ms

Slide 23

Slide 23 text

Apache Pinot Impact: 45X Improvement in Efficiency
Before Pinot: 1,000 nodes, 1,000 queries/sec
After Pinot: 75 nodes, 5,000 queries/sec, ~5 ms average latency, <100 ms 95th percentile

Slide 24

Slide 24 text

Powered by Apache Pinot
Retail, FinTech/Banking, Food/Logistics, Media/Comms, Cloud Native/SaaS, other industries

Slide 25

Slide 25 text

Community
Contributors: 300+
Slack members: 4,600+
Docker downloads: 5.6M+

Slide 26

Slide 26 text

Dimensions of Real-Time Analytics

Slide 27

Slide 27 text

Dimensions for Real-Time Database
[Chart comparing Data Warehouse and Real-Time Database on three axes. Freshness: days, minutes, seconds. Latency: minutes, seconds, milliseconds. Concurrency: 1 user, 10s, 100s to millions.]

Slide 28

Slide 28 text

The Power of Indexes
Other databases try to do the same work faster; Pinot works differently.
Indexes: Star-Tree, Inverted, Sorted, JSON, Geospatial
Users can run many more queries with the same resources.
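
As a rough illustration of why an index changes the amount of work per query, the sketch below contrasts a full scan with a lookup in an inverted index. It is a toy in-memory model, not Pinot's implementation; the rows, the column names, and the helpers `build_inverted_index`, `scan_filter`, and `index_filter` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows; the column names are made up for illustration.
ROWS = [
    {"city": "SF", "status": "delivered", "amount": 12},
    {"city": "NY", "status": "delivered", "amount": 30},
    {"city": "SF", "status": "cancelled", "amount": 8},
    {"city": "SF", "status": "delivered", "amount": 20},
]


def build_inverted_index(rows, column):
    """Map each distinct value of `column` to the list of row ids holding it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[column]].append(row_id)
    return dict(index)


def scan_filter(rows, column, value):
    """Baseline: inspect every row to find the matches."""
    return [i for i, row in enumerate(rows) if row[column] == value]


def index_filter(index, value):
    """Indexed: jump straight to the matching row ids."""
    return index.get(value, [])


city_index = build_inverted_index(ROWS, "city")
status_index = build_inverted_index(ROWS, "status")

# WHERE city = 'SF' AND status = 'delivered': intersect two posting lists,
# then aggregate only over the rows that matched.
matches = sorted(
    set(index_filter(city_index, "SF")) & set(index_filter(status_index, "delivered"))
)
total = sum(ROWS[i]["amount"] for i in matches)
```

The scan touches every row on every query, while the indexed path touches only the row ids listed for each value. That difference is the intuition behind running more queries on the same hardware.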

Slide 29

Slide 29 text

Multiple Use Cases, Single System
Raw data: decreasing value over time for a single event
Aggregated data: increasing value over time for aggregated events
[Chart: value vs. time]

Slide 30

Slide 30 text

StarTree: Cost, Performance Trade-Off
Local storage: ultra-low latency but tightly coupled
Cloud storage: slight latency trade-off, decoupled
[Chart: query frequency vs. data age; marker: 30 days and older]
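
One way to read this trade-off is as a placement rule keyed on data age. The sketch below is a hypothetical illustration only: the function `choose_tier`, the tier names, and the reading of "30 days and older" as the cut-over point to cloud storage are assumptions, not StarTree's actual configuration or API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: data at least this old is served from cloud storage.
CLOUD_TIER_AGE = timedelta(days=30)


def choose_tier(segment_end_time, now):
    """Return 'local' for recent data and 'cloud' for data 30 days and older."""
    age = now - segment_end_time
    return "cloud" if age >= CLOUD_TIER_AGE else "local"


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
recent = choose_tier(now - timedelta(days=2), now)     # frequently queried, kept local
old = choose_tier(now - timedelta(days=45), now)       # rarely queried, moved to cloud
boundary = choose_tier(now - timedelta(days=30), now)  # exactly 30 days counts as older
```

Under this reading, recent and frequently queried data stays on local storage for the lowest latency, while older and less frequently queried data accepts a small latency cost in exchange for cheaper, decoupled storage.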

Slide 31

Slide 31 text

StarTree: Cost, Performance Trade-Off
[Chart: query frequency vs. data age; local storage, cloud storage]

Slide 32

Slide 32 text

Apache Pinot™ as a Service

Slide 34

Slide 34 text

StarTree Flexible Deployment
[Diagram, Public SaaS: customer network (end users, apps, systems) and StarTree network (control plane, data plane)]
[Diagram, Private SaaS (Bring Your Own Cloud): data plane alongside end users, apps, and systems; control plane with StarTree]

Slide 35

Slide 35 text

StarTree Applications

Slide 36

Slide 36 text

StarTree Customers
“Pinot enables us to execute sub-second petabyte-scale aggregation queries over fresh financial events in our internal ledger. We chose Pinot because of its rich feature set and scalability, which has enabled better performance than our previous solution – at a lower cost.” Stripe
“StarTree Cloud made it easy to get started with Pinot and real-time applications. We were able to ingest batch data and use real-time apps to significantly reduce Mean Time to Detect and Mean Time to Respond for key business metrics.” Just Eat Takeaway

Slide 37

Slide 37 text

Make the leap to real-time. We are here to help.
Random / Intuition Based: Everyone
Data-Driven (Executives, CXOs): Executives
Data-Driven (Internal): Operators, Engineers
Data-Driven (Users/Customers): Everyone

Slide 38

Slide 38 text

And prepare for the future.
Random / Intuition Based: Everyone
Data-Driven (Executives, CXOs): Executives
Data-Driven (Internal): Operators, Engineers
Data-Driven (Users/Customers): Everyone
Data-Driven (Machines): Machines

Slide 39

Slide 39 text

Thank you.
