
Inside Apache Druid: Designed for Performance

Imply
May 13, 2019


Apache Druid is a modern analytical database that implements a memory-mappable storage format, indexes, compression, late tuple materialization, and a query engine that can operate directly on compressed data. A patch adding vectorized processing is also out, and we can expect it to land in a future release. This talk goes into detail on how Druid's query processing layer works and how each component contributes to achieving top performance for analytical queries.


Transcript

  1. Who am I? Gian Merlino • Committer & PMC member on Apache Druid • Cofounder at Imply • 10 years working on scalable systems
  2. Agenda • Why does Apache Druid exist? • The problem • How does it work? • Do try this at home!
  3. Where Druid fits in [architecture diagram: data sources feed stream processors and a data lake / stream hub; Druid sits between the stream hub / data lake and the apps, serving real-time analytics; stream processing and ETL handle ingestion, and older data is archived to the data lake]
  4. “Druid is an open source data store designed for real-time exploratory analytics on large data sets. Druid was originally developed to power a slice-and-dice analytical UI built on top of large event streams. The original use case for Druid targeted ingest rates of millions of records/sec, retention of over a year of data, and query latencies of sub-second to a few seconds.” Source: https://wiki.apache.org/incubator/DruidProposal
  5. Challenges • Scale: when data is large, we need a lot of servers • Speed: aiming for sub-second response time • Complexity: too much fine grain to precompute • High dimensionality: 10s or 100s of dimensions • Concurrency: many users and tenants • Freshness: load from streams
  6. What is Druid? • “high performance”: bread-and-butter fast scan rates + ‘tricks’ • “real-time”: streaming ingestion, interactive query speeds • “analytics”: counting, ranking, groupBy, time trend • “database”: the cluster stores a copy of your data and helps you manage it
  7. Key features • Column oriented • High concurrency • Scalable to 100s of servers, millions of messages/sec • Continuous, real-time ingest • Query through SQL • Target query latency sub-second to a few seconds
  8. Use cases • Clickstreams, user behavior • Digital advertising • Application performance management • Network flows • IoT
  9. Powered by Druid. From Yahoo: “The performance is great ... some of the tables that we have internally in Druid have billions and billions of events in them, and we’re scanning them in under a second.” Source: https://www.infoworld.com/article/2949168/hadoop/yahoo-struts-its-hadoop-stuff.html
  10. Why this works • Computers are fast these days • Indexes help save work and cost • But don’t be afraid to scan tables: it can be done efficiently
  11. “Bad programmers worry about the code. Good programmers worry about data structures and their relationships.” ― Linus Torvalds
  12. Tricks of the trade • Druid stores data in immutable segments • Global time index • Secondary indexes on individual columns • Memory-mappable compressed columns • Late tuple materialization • Query engines operate directly on compressed data • Vectorization (coming soon) • Rollup (partial aggregation)
  13. Druid segments. Each segment covers one time interval, which enables a global time index.

      timestamp             page           city  added  deleted
      -- Segment 2011-01-01T00/2011-01-01T01 --
      2011-01-01T00:01:35Z  Justin Bieber  SF    10     5
      2011-01-01T00:03:45Z  Justin Bieber  LA    25     37
      2011-01-01T00:05:62Z  Justin Bieber  SF    15     19
      -- Segment 2011-01-01T01/2011-01-01T02 --
      2011-01-01T01:06:33Z  Ke$ha          LA    30     45
      2011-01-01T01:08:51Z  Ke$ha          LA    16     8
      2011-01-01T01:09:17Z  Miley Cyrus    DC    75     10
      -- Segment 2011-01-01T02/2011-01-01T03 --
      2011-01-01T02:23:30Z  Miley Cyrus    DC    22     12
      2011-01-01T02:49:33Z  Miley Cyrus    DC    90     41
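The hourly partitioning above can be sketched as follows. This is a minimal illustration, not Druid's actual API: `intervalFor` is a hypothetical helper, and real Druid lets you configure segment granularity (hour, day, etc.).

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class SegmentInterval {
    // Truncate an event timestamp to the hourly segment interval it belongs to.
    static String intervalFor(String isoTimestamp) {
        Instant t = Instant.parse(isoTimestamp);
        Instant start = t.truncatedTo(ChronoUnit.HOURS);
        Instant end = start.plus(1, ChronoUnit.HOURS);
        return start + "/" + end;
    }

    public static void main(String[] args) {
        // Rows from hour 00 and hour 01 land in different segments.
        System.out.println(intervalFor("2011-01-01T00:01:35Z"));
        System.out.println(intervalFor("2011-01-01T01:06:33Z"));
    }
}
```

Because every segment is pinned to an interval like this, a query with a time filter can skip whole segments without reading them.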
  14. Anatomy of a Druid segment (physical storage format)

      __time (LONG):  1293840000000 (×8, one per row)
      page (STRING):
        DATA:  0 0 0 1 1 2 2 2                                   (dict encoded)
        DICT:  Justin = 0, Ke$ha = 1, Miley = 2                  (sorted)
        INDEX: [0,1,2] (11100000), [3,4] (00011000), [5,6,7] (00000111)
      city (STRING):
        DATA:  2 1 2 1 1 0 0 0
        DICT:  DC = 0, LA = 1, SF = 2
        INDEX: [0,2] (10100000), [1,3,4] (01011000), [5,6,7] (00000111)
      added (LONG):   1800 2912 1953 3194 5690 1100 8423 9080
      removed (LONG): 25 42 17 170 112 67 53 94

      Bitmap indexes are stored compressed.
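A toy sketch of the dictionary-plus-bitmap layout above. `buildIndex` is an illustrative helper, not Druid code, and it uses an uncompressed `java.util.BitSet` where Druid uses compressed (Concise/Roaring) bitmaps.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

public class DictBitmap {
    // Build one bitmap per distinct value of a string column:
    // bit N is set in a value's bitmap iff row N holds that value.
    static Map<String, BitSet> buildIndex(String[] column) {
        Map<String, BitSet> bitmaps = new TreeMap<>(); // sorted keys, like Druid's dictionary
        for (int row = 0; row < column.length; row++) {
            bitmaps.computeIfAbsent(column[row], v -> new BitSet()).set(row);
        }
        return bitmaps;
    }

    public static void main(String[] args) {
        // The 'city' column from the slide, decoded: 2 1 2 1 1 0 0 0 with DC=0, LA=1, SF=2.
        String[] city = {"SF", "LA", "SF", "LA", "LA", "DC", "DC", "DC"};
        // Expect DC -> {5,6,7}, LA -> {1,3,4}, SF -> {0,2}, matching the slide's bitmaps.
        buildIndex(city).forEach((value, rows) -> System.out.println(value + " -> " + rows));
    }
}
```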
  15. Filtering with indexes

      timestamp             page
      2011-01-01T00:01:35Z  Justin Bieber
      2011-01-01T00:03:45Z  Justin Bieber
      2011-01-01T00:05:62Z  Justin Bieber
      2011-01-01T00:06:33Z  Ke$ha
      2011-01-01T00:08:51Z  Ke$ha

      Justin Bieber  [1 1 1 0 0]
      Ke$ha          [0 0 0 1 1]
      JB or KS       [1 1 1 1 1]
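The "JB or KS" row above is just a bitwise OR of the two value bitmaps. A minimal sketch with `java.util.BitSet` (an assumption for illustration; Druid's own bitmaps are compressed):

```java
import java.util.BitSet;

public class BitmapFilter {
    // Evaluate "page = 'Justin Bieber' OR page = 'Ke$ha'" as a union of bitmaps.
    static BitSet union(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    public static void main(String[] args) {
        BitSet justinBieber = BitSet.valueOf(new long[]{0b00111}); // rows 0,1,2
        BitSet kesha        = BitSet.valueOf(new long[]{0b11000}); // rows 3,4
        BitSet matching     = union(justinBieber, kesha);
        System.out.println("matching rows: " + matching); // all five rows, as on the slide
    }
}
```

Only the rows whose bits survive the combined filter ever need their other columns read.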
  16. Late materialization. Query pipeline: Scan → Filter → Aggregate. The scan itself is a no-op: the filter step loads only the column data it needs (and caches it in case later operators use it too), and the aggregation step loads only the column data needed for aggregation.
  17. Operating on compressed data • Recall that columnar data is dictionary-encoded as a form of compression • Aggregation operators read dictionary codes and use them as array slots or keys in a hashtable • Only when merging inter-segment results (different dictionaries) is a dictionary lookup done
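A minimal sketch of aggregating directly on dictionary codes, assuming the decoded 'page' column from the earlier segment example; `sumByCode` is a hypothetical helper, not Druid's engine.

```java
public class CodeAggregation {
    // Sum a metric per page using dictionary codes as array slots;
    // no string is touched until the final result is materialized.
    static long[] sumByCode(int[] pageCodes, long[] added, int dictSize) {
        long[] sums = new long[dictSize];
        for (int row = 0; row < pageCodes.length; row++) {
            sums[pageCodes[row]] += added[row];
        }
        return sums;
    }

    public static void main(String[] args) {
        String[] dict = {"Justin Bieber", "Ke$ha", "Miley Cyrus"}; // sorted dictionary
        int[] pageCodes = {0, 0, 0, 1, 1, 2, 2, 2};                // compressed column data
        long[] added    = {10, 25, 15, 30, 16, 75, 22, 90};
        long[] sums = sumByCode(pageCodes, added, dict.length);
        for (int code = 0; code < dict.length; code++) {
            // Dictionary lookup happens only here, at output time.
            System.out.println(dict[code] + " = " + sums[code]);
        }
    }
}
```

When results from segments with different dictionaries are merged, codes are translated back to values first, which is the one place a lookup is unavoidable.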
  18. Vectorization

      public interface SumAggregator {
        void aggregate(int input);
        int get();
      }

      vs.

      public interface VectorAggregator {
        void aggregate(int[] input);
        int get();
      }
  19. Vectorization

      public interface SumAggregator {
        void aggregate(int input);
        int get();
      }

      vs.

      public interface VectorAggregator {
        void aggregate(int[] input);
        int get();
      }

      • Minimize # of JVM function calls and related overhead • Improve CPU cache locality • Opens up possibility of using vectorized CPU instructions
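To make the contrast concrete, here is one toy implementation of each interface from the slide. The implementations are illustrative assumptions, not Druid's actual aggregators.

```java
public class VectorizationDemo {
    // The two interfaces shown on the slide.
    interface SumAggregator { void aggregate(int input); int get(); }
    interface VectorAggregator { void aggregate(int[] input); int get(); }

    static class ScalarSum implements SumAggregator {
        private int sum;
        public void aggregate(int input) { sum += input; } // one call per row
        public int get() { return sum; }
    }

    static class VectorSum implements VectorAggregator {
        public void aggregate(int[] input) {               // one call per batch
            for (int v : input) { sum += v; }              // tight loop: cache friendly,
        }                                                  // and JIT-friendly for SIMD
        public int get() { return sum; }
        private int sum;
    }

    public static void main(String[] args) {
        int[] rows = {10, 25, 15, 30, 16, 75, 22, 90};
        ScalarSum scalar = new ScalarSum();
        for (int v : rows) { scalar.aggregate(v); }        // 8 virtual calls
        VectorSum vector = new VectorSum();
        vector.aggregate(rows);                            // 1 virtual call
        System.out.println(scalar.get() + " == " + vector.get());
    }
}
```

Same answer either way; the vectorized form simply amortizes call overhead over a whole batch of rows.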
  20. Rollup

      Raw rows:
      timestamp             page           city  added  deleted
      2011-01-01T00:01:35Z  Justin Bieber  SF    10     5
      2011-01-01T00:03:45Z  Justin Bieber  SF    25     37
      2011-01-01T00:05:62Z  Justin Bieber  SF    15     19
      2011-01-01T00:06:33Z  Ke$ha          LA    30     45
      2011-01-01T00:08:51Z  Ke$ha          LA    16     8
      2011-01-01T00:09:17Z  Miley Cyrus    DC    75     10
      2011-01-01T00:11:25Z  Miley Cyrus    DC    11     25
      2011-01-01T00:23:30Z  Miley Cyrus    DC    22     12
      2011-01-01T00:49:33Z  Miley Cyrus    DC    90     41

      Rolled up:
      timestamp             page           city  count  sum_added  sum_deleted
      2011-01-01T00:00:00Z  Justin Bieber  SF    3      50         61
      2011-01-01T00:00:00Z  Ke$ha          LA    2      46         53
      2011-01-01T00:00:00Z  Miley Cyrus    DC    4      198        88
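The rollup above is a partial aggregation done at ingest time: truncate the timestamp, group by the remaining dimensions, and keep only the aggregates. A minimal sketch (hypothetical `rollup` helper, not Druid code) that tracks count and sum_added:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RollupDemo {
    // Group rows of {timestamp, page, city, added} by (hour, page, city),
    // accumulating [count, sum_added] per group.
    static Map<String, long[]> rollup(String[][] rows) {
        Map<String, long[]> grouped = new LinkedHashMap<>();
        for (String[] row : rows) {
            String hour = row[0].substring(0, 13) + ":00:00Z";      // truncate to the hour
            String key = hour + "|" + row[1] + "|" + row[2];
            long[] agg = grouped.computeIfAbsent(key, k -> new long[2]);
            agg[0] += 1;                                            // count
            agg[1] += Long.parseLong(row[3]);                       // sum_added
        }
        return grouped;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"2011-01-01T00:01:35Z", "Justin Bieber", "SF", "10"},
            {"2011-01-01T00:03:45Z", "Justin Bieber", "SF", "25"},
            {"2011-01-01T00:06:33Z", "Ke$ha", "LA", "30"},
            {"2011-01-01T00:08:51Z", "Ke$ha", "LA", "16"},
        };
        rollup(rows).forEach((key, agg) ->
            System.out.println(key + " count=" + agg[0] + " sum_added=" + agg[1]));
    }
}
```

Because count and sum are themselves mergeable, rolled-up rows from different segments can still be combined correctly at query time.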
  21. Roll-up vs no roll-up

      Do roll-up when:
      • No need to retain high cardinality dimensions (like user id, precise location information).
      • All queries are some form of “GROUP BY”.

      Don’t roll-up when:
      • Need the ability to retrieve individual events.
      • May need to group or filter on any column.