The Rise of Real-Time Analytics
Kishore Gopalakrishna
Slide 2
Java Database Real-Time
Slide 3
Decisions, Decisions, Decisions…
35K per day
1 billion in a lifetime
Slide 5
But the real question is:
HOW are the decisions being made?
Slide 6
Evolution of Decision-Making
Intuition based: everyone
Data-driven (executives, CXOs): executives
Data-driven (internal): operators, engineers
Slide 7
What about our users and customers?
And data freshness?
Slide 8
User-Facing Analytics
Customers and users: everyone (billions)
Everyone in the company (millions)
Executives (thousands)
Real-Time Analytics
World is constantly changing
Slide 9
Use Case: User-Facing + Real-Time Analytics
Internal users: analysts (data science workbench), operators (operational workbench)
External users: eater, rider, restaurant
[Diagram labels: order, pickup, deliver]
Slide 10
Real-Time Analytics
Maximize Value
[Chart: value versus time (milliseconds, seconds, minutes, days, months), comparing real-time and batch; labels: accuracy, agility]
Slide 11
Defining Real-Time?
Maximize the value based on use case
[Chart: value versus time, with the impact point marked]
Data size: 1 PB+
Queries/sec: 200K+
Query latency: < 100 ms
Slide 23
Apache Pinot Impact
45X improvement in efficiency
Before Pinot: 1,000 nodes, 1,000 queries/sec
After Pinot: 75 nodes, 5,000 queries/sec
● ~5 ms average latency
● < 100 ms 95th percentile
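The slide reports both an average latency and a 95th-percentile latency. As a rough illustration of how the two figures differ, here is a minimal Python sketch that computes both from a list of samples. The sample values are hypothetical and are not measurements from the deck; the function name `latency_summary` and the nearest-rank percentile method are assumptions made for this example.

```python
import statistics

def latency_summary(latencies_ms):
    """Return (mean, p95) for a list of query latencies in milliseconds."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    # Nearest-rank 95th percentile: rank = ceil(0.95 * n), in integer arithmetic.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    return statistics.fmean(ordered), p95

# Hypothetical samples: mostly fast queries plus a few slow outliers.
samples = [2.0] * 90 + [10.0] * 6 + [80.0] * 4
mean_ms, p95_ms = latency_summary(samples)
print(f"mean={mean_ms:.1f} ms, p95={p95_ms:.1f} ms")
```

A handful of slow queries raises the mean noticeably while leaving the 95th percentile almost unchanged, which is one reason to report both.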
Slide 24
Powered by Apache Pinot
Retail, FinTech/Banking, Food/Logistics, Media/Comms, Cloud Native/SaaS, other industries
Slide 25
Community
Contributors: 300+
Slack members: 4,600+
Docker downloads: 5.6M+
Slide 26
Dimensions of Real-Time Analytics
Slide 27
Dimensions for Real-Time Database
[Spectrum from Data Warehouse to Real-Time Database]
Freshness: days, minutes, seconds
Latency: minutes, seconds, milliseconds
Concurrency: 1 user, 10s, 100s to millions
Slide 28
The Power of Indexes
Other databases try to do the same work faster; Pinot works differently
Indexes: star-tree, inverted, sorted, JSON, geospatial
Users can run a lot more queries with the same resources
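To illustrate the idea that an index reduces the work done per query rather than doing the same work faster, here is a toy Python sketch of an inverted index. It is not Pinot's implementation; the rows, column names, and function names are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical rows; the column names are illustrative only.
rows = [
    {"country": "US", "device": "ios", "clicks": 3},
    {"country": "IN", "device": "android", "clicks": 5},
    {"country": "US", "device": "android", "clicks": 2},
    {"country": "DE", "device": "ios", "clicks": 7},
]

def build_inverted_index(rows, column):
    """Map each distinct value of `column` to the list of row ids holding it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[column]].append(row_id)
    return index

def sum_clicks_scan(rows, country):
    """Full scan: examines every row to answer the query."""
    return sum(r["clicks"] for r in rows if r["country"] == country)

def sum_clicks_indexed(rows, index, country):
    """Index lookup: touches only the rows that match the filter."""
    return sum(rows[i]["clicks"] for i in index.get(country, []))

country_index = build_inverted_index(rows, "country")
print(sum_clicks_scan(rows, "US"), sum_clicks_indexed(rows, country_index, "US"))
```

Both functions return the same answer, but the indexed version reads two rows instead of four. On large tables with selective filters, that gap is what lets the same hardware serve more queries.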
Slide 29
Multiple Use Cases, Single System
[Chart: value versus time]
Raw data: decreasing value over time for a single event
Aggregated data: increasing value over time for aggregated events
Slide 30
StarTree: Cost, Performance Trade-Off
[Chart: query frequency versus data age]
Local storage: ultra-low latency but tightly coupled
Cloud storage: slight latency trade-off, decoupled
30 days and older
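The trade-off above can be sketched as a simple age-based routing rule. This is a minimal Python illustration, assuming that data 30 days and older is served from cloud storage and newer data from local storage. The tier names, the constant, and the function `storage_tier` are hypothetical and do not correspond to StarTree or Pinot configuration.

```python
from datetime import date, timedelta

# Assumption: the 30-day boundary is taken from the slide; everything else is illustrative.
LOCAL_RETENTION_DAYS = 30

def storage_tier(segment_end, today):
    """Pick a storage tier for a data segment based on its age in days."""
    age_days = (today - segment_end).days
    # Older, rarely queried data goes to cheaper decoupled storage.
    return "cloud" if age_days >= LOCAL_RETENTION_DAYS else "local"

today = date(2024, 1, 31)
for days_old in (1, 29, 30, 90):
    print(days_old, storage_tier(today - timedelta(days=days_old), today))
```

Recent data, which is queried most often, stays on the low-latency tier, while older data accepts a small latency penalty in exchange for lower cost.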
Slide 32
Apache Pinot™ as a Service
Slide 34
StarTree Flexible Deployment
Public SaaS: control plane and data plane in the StarTree network; end users, apps, and systems in the customer network
Private SaaS (Bring Your Own Cloud): data plane in the customer network with end users, apps, and systems; control plane in the StarTree network
Slide 35
StarTree Applications
Slide 36
StarTree Customers
“Pinot enables us to execute sub-second petabyte-scale
aggregation queries over fresh financial events in our internal
ledger. We choose Pinot because of its rich feature set and
scalability, which has enabled better performance than our
previous solution – at a lower cost”
Stripe
“StarTree Cloud made it easy to get started with Pinot and
real-time applications. We were able to ingest batch data and
use real-time apps to significantly reduce Mean Time to Detect
and Mean Time to Respond for key business metrics”
Just Eat Takeaway
Slide 37
Make the leap to real-time. We are here to help.
Random / intuition based: everyone
Data-driven (executives, CXOs): executives
Data-driven (internal): operators, engineers
Data-driven (users/customers): everyone
Slide 38
And prepare for the future.
Random / intuition based: everyone
Data-driven (executives, CXOs): executives
Data-driven (internal): operators, engineers
Data-driven (users/customers): everyone
Data-driven (machines): machines
Slide 39
Thank you.